T Alex Dececchi, Paula M Mabee, David C Blackburn.
Abstract
Databases of organismal traits that aggregate information from one or multiple sources can be leveraged for large-scale analyses in biology. Yet the differences among these data streams and how well they capture trait diversity have never been explored. We present the first analysis of the differences between phenotypes captured in free text of descriptive publications ('monographs') and those used in phylogenetic analyses ('matrices'). We focus our analysis on osteological phenotypes of the limbs of four extinct vertebrate taxa critical to our understanding of the fin-to-limb transition. We find that there is low overlap between the anatomical entities used in these two sources of phenotype data, indicating that phenotypes represented in matrices are not simply a subset of those found in monographic descriptions. Perhaps as expected, compared to characters found in matrices, phenotypes in monographs tend to emphasize descriptive and positional morphology, be somewhat more complex, and relate to fewer additional taxa. While based on a small set of focal taxa, these qualitative and quantitative data suggest that either source of phenotypes alone will result in incomplete knowledge of variation for a given taxon. As a broader community develops to use and expand databases characterizing organismal trait diversity, it is important to recognize the limitations of the data sources and develop strategies to more fully characterize variation both within species and across the tree of life.
Year: 2016 PMID: 27191170 PMCID: PMC4871461 DOI: 10.1371/journal.pone.0155680
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
List of monographic and matrix publications used in this analysis along with anatomical focus of the study and the number of fin or limb and girdle EQs (phenotypes) associated with each taxon.
| Taxon | Monograph | Anatomical focus | # Monograph EQ | Matrix | # Matrix EQ |
|---|---|---|---|---|---|
| | Coates 1996 | Whole body | 341 | Carroll 2007, Clack et al. 2012, Daeschler et al. 2006, Ruta 2011, Swartz 2012, Vallin and Laurin 2004 | 425 |
| | Garvey et al. 2005 | Pectoral fin | 103 | Ruta 2011, Swartz 2012 | 118 |
| | Boisvert 2005, Boisvert et al. 2008, Boisvert 2009 | Pelvic fin and girdle, Pectoral fin and girdle, Humerus | 52, 51, 103 (total = 206) | Clack et al. 2012, Daeschler et al. 2006, Ruta 2011, Swartz 2012, Vallin and Laurin 2004 | 287 |
| | Shubin et al. 2006, Shubin et al. 2014 | Pectoral limb and girdle | 117, 58 (total = 175) | Clack et al. 2012, Daeschler et al. 2006, Ruta 2011, Swartz 2012 | 226 |
Fig 1. Boxplots showing comparison of mean EQ (A) and E (B) size per annotation between monographs and matrices.
See Table 2 for details. Quality (Q) size is not compared because mean quality complexity did not differ significantly between the two sources of data (t-test, p = 0.18). A breakdown of types per publication is shown in S2 Appendix.
Breakdown per monographic publication of the number of phenotype statements, percentage of comparative statements with the absolute number of statements given in parentheses, number of limb/fin and girdle EQ annotations, number of taxa referenced in the limb/fin and girdle section of the monograph, and the average EQ complexity (see Methods).
| Publication | Phenotype statements | No. Comp. statements | EQs | Taxa | EQ complexity (mean, med., max.) | E (mean, max.) | Q (mean, max.) |
|---|---|---|---|---|---|---|---|
| Boisvert 2005 | 19 | 2 | 52 | 3 | 3.1, 3, 7 | 2.0, 6 | 1.0, 3 |
| Boisvert et al. 2008 | 22 | 5 | 51 | 4 | 3.6, 3, 9 | 2.4, 7 | 1.1, 3 |
| Boisvert 2009 | 42 | 20 | 103 | 11 | 3.3, 3, 7 | 2.0, 6 | 1.3, 5 |
| Coates 1996 | 158 | 15 | 341 | 18 | 3.4, 3, 11 | 2.4, 10 | 1.0, 5 |
| Garvey et al. 2005 | 54 | 10 | 113 | 10 | 3.7, 3, 15 | 2.7, 14 | 1.1, 3 |
| Shubin et al. 2006 | 46 | 8 | 117 | 9 | 3.5, 3, 11 | 2.3, 10 | 1.1, 3 |
| Shubin et al. 2014 | 24 | 7 | 58 | 6 | 3.2, 3, 7 | 2.1, 6 | 1.1, 3 |
Breakdown per matrix publication of the number of characters and states, limb/fin and girdle EQ annotations, taxa referenced in the limb/fin and girdle section of the monograph, and the average EQ complexity (see Methods).
Char. = Character; Char. States = Character States.
| Publication | Char. | Char. States | EQs | Taxa | EQ complexity (mean, med., max.) | E (mean, max.) | Q (mean, max.) |
|---|---|---|---|---|---|---|---|
| Carroll 2007 | 49 | 199 | 422 | 22 | 3.1, 2, 11 | 2.1, 10 | 1.0, 3 |
| Clack et al. 2012 | 19 | 43 | 69 | 22 | 2.6, 2, 6 | 1.6, 5 | 1.0, 1 |
| Daeschler et al. 2006 | 32 | 67 | 85 | 9 | 2.9, 2, 7 | 1.9, 6 | 1.0, 1 |
| Ruta 2011 | 155 | 393 | 500 | 44 | 3.3, 2, 10 | 2.1, 9 | 1.2, 7 |
| Swartz 2012 | 46 | 96 | 123 | 47 | 2.9, 2, 6 | 1.8, 5 | 1.1, 5 |
| Vallin and Laurin 2004 | 36 | 89 | 96 | 49 | 2.9, 2, 6 | 1.9, 5 | 1.0, 1 |
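
The per-publication mean EQ complexities listed in the two tables above can be compared directly. The following is a minimal Python sketch and not the authors' analysis: it assumes Welch's unequal-variance t statistic computed on the seven monograph means and the six matrix means, whereas the comparison summarized in Fig 1 is reported per annotation. The function name `welch_t` is hypothetical, and only the test statistic and degrees of freedom are computed, not a p-value.

```python
from math import sqrt
from statistics import mean, variance

# Mean EQ complexity per publication, copied from the tables above.
monograph_means = [3.1, 3.6, 3.3, 3.4, 3.7, 3.5, 3.2]
matrix_means = [3.1, 2.6, 2.9, 3.3, 2.9, 2.9]


def welch_t(a, b):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Hypothetical helper for illustration; uses sample variances (n - 1).
    """
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


t_stat, dof = welch_t(monograph_means, matrix_means)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Treating each publication as one observation discards the within-publication spread, so this should be read only as a rough check on the direction of the difference (monographs slightly more complex than matrices).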
Total number of girdle and limb anatomical entities, unique and shared, described in monographs vs. matrices for each of the study taxa.
| Taxon | Total Entities monograph | Total Entities matrix | Unique Entities in monograph | Unique Entities in matrix | Number shared E and total E |
|---|---|---|---|---|---|
| | 117 | 145 | 31/117 (26%) | 59/145 (41%) | 86/176 (49%) |
| | 30 | 53 | 9/30 (30%) | 32/53 (60%) | 21/62 (34%) |
| | 67 | 86 | 38/67 (57%) | 57/86 (67%) | 29/124 (23%) |
| | 42 | 68 | 21/42 (50%) | 47/68 (68%) | 21/89 (24%) |
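
The unique and shared counts in the table above follow from ordinary set arithmetic on the anatomical entities named in each source. The sketch below illustrates that arithmetic in Python under stated assumptions: the entity names are placeholders rather than the annotated terms from the study, and the helper names `overlap_summary` and `union_from_counts` are hypothetical.

```python
def overlap_summary(monograph, matrix):
    """Summarize unique and shared entities between two sets of entity names."""
    shared = monograph & matrix  # entities named in both sources
    union = monograph | matrix   # every entity named in either source
    return {
        "unique_monograph": len(monograph - matrix),
        "unique_matrix": len(matrix - monograph),
        "shared": len(shared),
        "total": len(union),
        "shared_pct": 100 * len(shared) / len(union),
    }


def union_from_counts(total_monograph, total_matrix, shared):
    """Size of the union when only the per-source totals and overlap are known."""
    return total_monograph + total_matrix - shared


# Placeholder entity names, for illustration only.
monograph_entities = {"humerus", "radius", "ulna", "intermedium"}
matrix_entities = {"humerus", "radius", "cleithrum"}
summary = overlap_summary(monograph_entities, matrix_entities)
print(summary)

# First data row of the table: 117 monograph entities, 145 matrix entities, 86 shared.
print(union_from_counts(117, 145, 86))
```

The denominator in the last column of the table is the union of both sources, which is why it is smaller than the sum of the two per-source totals.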
Number of anatomical entities, unique and shared, described in monographs vs. matrices for Tiktaalik.
| Publication | Total Entities Monograph | Total Entities Matrix | Unique Entities in monograph | Unique Entities in matrix | Intersection of Entities |
|---|---|---|---|---|---|
| Daeschler et al. 2006 | - | 22 | - | 11 (50%) | 11 |
| Shubin et al. 2006 | 42 | - | 31 (74%) | - | 11 |
Average percent of character quality types (range in parentheses) for matrix publications (combined) and monographs (combined).
t-test *p<0.05; **p<0.001. Breakdown of types per publication shown in S2 Appendix.
| Character quality type | Matrix | Monograph |
|---|---|---|
| Morphology | 36.8 (17.8–48.4) | 54.3* (30.2–70.4) |
| Neomorphic | 37.2 (24.1–46.5) | 12.4** (0.0–16.7) |
| Position | 16.0 (9.9–26.4) | 30.5* (13.0–51.9) |
| Number | 10 (0.0–28.0) | 2.8 (0.0–7.4) |