Year: 2014 PMID: 25403556 PMCID: PMC4350431 DOI: 10.1002/ar.23101
Source DB: PubMed Journal: Anat Rec (Hoboken) ISSN: 1932-8486 Impact factor: 2.064
Results from attempted replication of the curve fit of Figure 6 from Erickson et al. 2009
| Data set | Origin | Regression source | Version | a | b |
|---|---|---|---|---|---|
| Unknown | Unknown | Erickson et al. 2009 | 2.8.1 | 37.38 | 0.55 |
| Full data set | Table | Mathematica | 9.01 | 88.44 | 0.345 |
| | | Microsoft Excel 2010 | 14.0.7116.5000 | 88.45 | 0.345 |
| | | Matlab | R2013 | 88.44 | 0.345 |
| | | R | 2.15.2 | 88.44 | 0.345 |
| Full data set | Duplicate points | Mathematica | 9.01 | 79.17 | 0.349 |
| | | Microsoft Excel 2010 | 14.0.7116.5000 | 79.17 | 0.349 |
| | | Matlab | R2013 | 79.17 | 0.349 |
| | | R | 2.15.2 | 79.17 | 0.349 |
| Histologically aged | Labeled in Table | Mathematica | 9.01 | 86.57 | 0.346 |
| | | Microsoft Excel 2010 | 14.0.7116.5000 | 86.58 | 0.346 |
| | | Matlab | R2013 | 86.57 | 0.346 |
| | | R | 2.15.2 | 86.57 | 0.346 |
| Histologically aged | Duplicate points | Mathematica | 9.01 | 77.93 | 0.351 |
| | | Microsoft Excel 2010 | 14.0.7116.5000 | 77.93 | 0.351 |
| | | Matlab | R2013 | 77.92 | 0.351 |
| | | R | 2.15.2 | 77.92 | 0.351 |
| As-plotted data set | Recovered from digital | Mathematica | 9.01 | 73.29 | 0.353 |
| | | Microsoft Excel 2010 | 14.0.7116.5000 | 73.26 | 0.353 |
| | | Matlab | R2013 | 73.26 | 0.353 |
| | | R | 2.15.2 | 73.26 | 0.353 |
The first row is the original fit to a two-parameter (a, b) logistic function published in the caption of Figure 6 of Erickson et al. 2009. The following rows are the regression results from different software packages with the full data set (Table 1 of Erickson et al. 2009) or variations on that data set described in the text. In no case do they match the first row. Each case uses the two-parameter logistic function specified in Erickson et al. 2009. In the case of Mathematica, nine different optimization algorithms for computing the least squares regression were tried: conjugate gradient, gradient, Newton, quasi-Newton, Levenberg–Marquardt, Nelder–Mead, differential evolution, simulated annealing, and random search. All nine gave the same result to within four significant digits using double precision arithmetic (64 bit) and a maximum of 1,000 iterations. Using extended precision (50 digit or 166 bit) arithmetic and up to 100,000 iterations did not change the results to four significant digits.
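The statement that no regression matches the first row can be checked with simple arithmetic on the tabulated values of a. The sketch below is illustrative only: the dictionary keys and helper names are hypothetical, and the numbers are transcribed from the table above in row order. For each data set it reports how much the software packages disagree with one another and how far the closest refit lies from the published value.

```python
# Published value of `a` from the Figure 6 caption of Erickson et al. 2009
# (first row of the table above).
PUBLISHED_A = 37.38

# Fitted `a` values transcribed from the table above, four rows per data set,
# in table order. The dictionary keys are informal labels (hypothetical names).
fitted_a = {
    "full": [88.44, 88.45, 88.44, 88.44],
    "full, duplicate points": [79.17, 79.17, 79.17, 79.17],
    "histologically aged": [86.57, 86.58, 86.57, 86.57],
    "histologically aged, duplicate points": [77.93, 77.93, 77.92, 77.92],
    "as plotted": [73.29, 73.26, 73.26, 73.26],
}


def relative_spread(values):
    """Largest disagreement among the refits, relative to the smallest value."""
    return (max(values) - min(values)) / min(values)


def relative_gap_to_published(values, published=PUBLISHED_A):
    """Distance from the published value to the closest refit, relative to it."""
    return min(abs(v - published) for v in values) / published


# Map each data set to (spread among packages, gap to the published value).
summary = {
    name: (relative_spread(vals), relative_gap_to_published(vals))
    for name, vals in fitted_a.items()
}

for name, (spread, gap) in summary.items():
    print(f"{name}: spread among refits {spread:.3%}, gap to published {gap:.0%}")
```

With these inputs the refits differ from one another by well under 0.1 percent, while even the closest refit differs from the published a by more than 90 percent.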
Figure 1. A plot of Erickson et al.'s best fit equation, given in the caption of Figure 6 in Erickson et al. 2009, as well as the published and recovered data sets and two of my regression attempts, overlaid on top of the suspect figure. Erickson et al.'s best fit curve according to the caption published with Figure 6 matches neither the plotted best fit curve nor any of my attempted regressions. The recovered data set which I used in my reanalysis, plotted as yellow points, shows close correspondence with the data set as plotted. Comparing the published data set with the recovered data set, it is clear that many of the original data points were not plotted in the published figure, and others appear to have been plotted inaccurately. The source of the data set as published, the best fit regression curve, and the regression curve as plotted are unknown.
Sum of squared residuals for Erickson's and two Myhrvold (2013) fits to the originally published P. lujiatunensis data set (Table 1 of Erickson et al. 2009). The Myhrvold (2013) results were obtained using Mathematica. The "Best fit" rows are a two-parameter fit to the original logistic function in Erickson et al. 2009. The "Best fit setting" rows show the results of fixing the parameter a to match the value used by Erickson et al. 2009 (37.38) and then performing a one-dimensional regression on the parameter b. Even when one parameter is fixed, one cannot recover the Erickson et al. value for the remaining parameter. In every case, the Myhrvold fits have a sum of squared residuals more than 10-fold lower than that of the Erickson equation, demonstrating that they are much better fits.
| Data set | Regression source | a | b | Sum of squared residuals |
|---|---|---|---|---|
| Full data set | Erickson | 37.38 | 0.55 | 2197.57 |
| | Best fit | 88.44 | 0.345 | 133.52 |
| | Best fit setting | 37.38 | 0.388 | 181.102 |
| Full data set, unique points only | Erickson | 37.38 | 0.55 | 1928.01 |
| | Best fit | 79.17 | 0.349 | 81.64 |
| | Best fit setting | 37.38 | 0.390 | 121.39 |
| Histologically aged subset | Erickson | 37.38 | 0.55 | 1414.23 |
| | Best fit | 86.57 | 0.346 | 74.33 |
| | Best fit setting | 37.38 | 0.392 | 114.77 |
| Histologically aged subset, unique points only | Erickson | 37.38 | 0.55 | 1237.67 |
| | Best fit | 77.92 | 0.351 | 64.12 |
| | Best fit setting | 37.38 | 0.395 | 98.16 |
| As-plotted in Figure 6 of Erickson | Erickson | 37.38 | 0.55 | 1235.51 |
| | Best fit | 73.29 | 0.353 | 65.80 |
| | Best fit setting | 37.38 | 0.395 | 96.22 |
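The 10-fold comparison in the caption can be reproduced directly from the sums of squared residuals tabulated above. The following is a minimal sketch: the dictionary keys (`erickson`, `best_fit`, `a_fixed`) are hypothetical labels for the three rows of each data set, and the numbers are copied from the table rather than recomputed from the underlying data.

```python
# Sums of squared residuals transcribed from the table above.
# Keys are informal labels (hypothetical names): "erickson" is the published
# equation, "best_fit" the two-parameter refit, and "a_fixed" the refit with
# a held at 37.38.
ssr = {
    "full": {"erickson": 2197.57, "best_fit": 133.52, "a_fixed": 181.102},
    "full, unique": {"erickson": 1928.01, "best_fit": 81.64, "a_fixed": 121.39},
    "histological": {"erickson": 1414.23, "best_fit": 74.33, "a_fixed": 114.77},
    "histological, unique": {"erickson": 1237.67, "best_fit": 64.12, "a_fixed": 98.16},
    "as plotted": {"erickson": 1235.51, "best_fit": 65.80, "a_fixed": 96.22},
}

# Ratio of the published equation's SSR to each refit's SSR, per data set.
ratios = {
    name: {
        "best_fit": row["erickson"] / row["best_fit"],
        "a_fixed": row["erickson"] / row["a_fixed"],
    }
    for name, row in ssr.items()
}

# Smallest improvement factor across all data sets and both kinds of refit.
smallest_ratio = min(r for row in ratios.values() for r in row.values())

for name, row in ratios.items():
    print(f"{name}: best fit {row['best_fit']:.1f}x, a fixed {row['a_fixed']:.1f}x")
print(f"smallest improvement factor: {smallest_ratio:.1f}x")
```

The smallest factor comes from the full data set with a held fixed (2197.57 / 181.102, roughly 12), so every refit in the table clears the 10-fold threshold.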