Hao-Yang Wu, Yan-Hui Wang, Qiang Xie, Yun-Ling Ke, Wen-Jun Bu.
Abstract
With the great development of sequencing technologies and systematic methods, our understanding of evolutionary relationships at deeper levels within the tree of life has greatly improved over the last decade. However, the current taxonomic methodology is insufficient to describe the growing levels of diversity in both a standardised and general way due to the limitations of using only morphological traits to describe clades. Herein, we propose the idea of a molecular classification based on hierarchical and discrete amino acid characters. Clades are classified based on the results of phylogenetic analyses and described using amino acids with group specificity in phylograms. Practices based on the recently published phylogenomic datasets of insects together with 15 de novo sequenced transcriptomes in this study demonstrate that such a methodology can accommodate various higher ranks of taxonomy. Such an approach has the advantage of describing organisms in a standard and discrete way within a phylogenetic framework, thereby facilitating the recognition of clades from the view of the whole lineage, as indicated by PhyloCode. By combining identification keys and phylogenies, the molecular classification based on hierarchical and discrete characters may greatly boost the progress of integrative taxonomy.
Year: 2016 PMID: 27312960 PMCID: PMC4911608 DOI: 10.1038/srep28308
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Summary of the strategy for selecting apomorphies.
The black dot indicates the group of organisms that forms the goal clade. A cross indicates a scenario that should be rejected, while a tick indicates a scenario that can be accepted. Double ticks indicate acceptance with high priority. (a) Preference of apomorphies based on the extent of overlapping. (b) Preference of apomorphies based on the data coverage at a site. (c) Preference of apomorphies based on the extent of uniqueness. (d) Preference of apomorphies based on the rarity of the amino acid substitution.
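As a rough illustration of criteria (b) and (c), the sketch below scans a toy alignment for sites where every sampled member of the goal clade shares one residue that no other taxon in the dataset carries, then ranks those sites by data coverage. The function name `candidate_apomorphies`, the taxon names, the toy sequences and the set of missing-data symbols are all assumptions made for this example; the overlap and substitution-rarity criteria of panels (a) and (d) are not modelled, and this is not the authors' implementation.

```python
from typing import Dict, List, Set, Tuple

# Symbols treated as missing data (an assumption for this sketch).
MISSING = {"X", "?", "-"}


def candidate_apomorphies(
    alignment: Dict[str, str], clade: Set[str]
) -> List[Tuple[int, str, float]]:
    """Return (site, residue, coverage) for sites where all observed clade
    members share one residue that is absent from every taxon outside the clade.
    Sites are 0-based; coverage is the fraction of taxa with a known residue."""
    length = len(next(iter(alignment.values())))
    hits = []
    for site in range(length):
        inside = {alignment[t][site] for t in clade} - MISSING
        outside = {
            seq[site] for taxon, seq in alignment.items() if taxon not in clade
        } - MISSING
        if len(inside) != 1:
            continue  # clade is polymorphic or entirely missing at this site
        residue = next(iter(inside))
        if residue in outside:
            continue  # state is not unique to the clade within this dataset
        known = sum(1 for seq in alignment.values() if seq[site] not in MISSING)
        hits.append((site, residue, known / len(alignment)))
    # Prefer sites with higher data coverage, then lower site index.
    hits.sort(key=lambda h: (-h[2], h[0]))
    return hits


# Hypothetical toy alignment: two members of the goal clade, two outgroup taxa.
toy = {
    "fly_a": "MKTAG",
    "fly_b": "MKTXG",
    "bee": "MRSAG",
    "moth": "MR?AG",
}
clade = {"fly_a", "fly_b"}
print(candidate_apomorphies(toy, clade))
# [(1, 'K', 1.0), (2, 'T', 0.75)]
```

Uniqueness here is only relative to the taxa present in the alignment, so a site accepted by this check could still be shared with organisms that were not sampled.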
Figure 2. Molecular descriptions of clades in Hexapoda based on apomorphic amino acids.
The apomorphies of amino acids are coloured based on the respective biochemical attributes. States shown in rounded rectangles indicate plesiomorphic states, while states shown in rectangles indicate apomorphic states. The diagonal indicates a binary apomorphic state. (a) Tree-like descriptions for clades in Hexapoda. (b) Combined description for Diptera.
Figure 3. Molecular descriptions of clades in Anophelinae based on apomorphic amino acids.
The apomorphies of amino acids are coloured based on the respective biochemical attributes. States shown as a rounded rectangle indicate plesiomorphic states, while states shown as a rectangle indicate apomorphic states. The diagonal indicates a binary apomorphic state. (a) Tree-like descriptions for clades in Anophelinae. (b) Combined description for Anopheles gambiae complex.
Figure 4. Sequential descriptions of clades based on apomorphic amino acids shown in a two-dimensional table.
(a) Sequential descriptions of clades in Hexapoda. (b) Sequential descriptions of clades in Anophelinae. The number above each column is a numerical symbol. Apomorphies that are confirmed to be unique by comparing all of the organisms in the dataset are shown in white text. Non-apomorphic characters are shown in grey text. Each description for a lineage consists of two parts. The substantial part used for identification comprises apomorphic codes arranged in a strict hierarchical order (corresponding to the bars of a discrete symbol). The subordinate and trivial part comprises the non-apomorphic characters, which play only a structurally appurtenant role and contain no information for description or diagnosis (corresponding to the additional space of discrete symbologies). For simplicity, minor variations in non-apomorphic characters are not shown; they make up only a very small proportion.
Results of the query test of the 51 unknown transcriptomes.
| Clade | Query description | Matched reference description | Result | |
|---|---|---|---|---|
| Pterygota | T00I02X03E06R08X09X0B | T00I02G03E06R08S09C0B | False-negative | |
| Odonata | X00I02X03E06R08X09C0B | T00I02G03E06R08S09C0B | Positive | |
| Ephemeroptera | T00I02X03X06R08X09Q0A | T00I02G03E06R08S09Q0A | Positive | |
| Ephemeroptera | T00I02X03E06R08S09Q0A | T00I02G03E06R08S09Q0A | Positive | |
| Dermaptera | X00I02?03E06R08X0CQ0DH0EF0F | T00I02G03E06R08S0CQ0DH0EF0F | Positive | |
| Plecoptera | X00I02X03?06R08X0CQ0DS0G | T00I02G03E06R08S0CQ0DS0G | Positive | |
| Orthoptera | T00I02X03E06R08S0CQ0DM0H | T00I02G03E06R08S0CQ0DM0H | Positive | |
| Orthoptera | ?00I02X03E06R08S0CQ0DM0H | T00I02G03E06R08S0CQ0DM0H | Positive | |
| Mantodea | X00I02X03E06R08S0CQ0DM0IH0PC0R | T00I02G03E06R08S0CQ0DM0IH0PC0R | Positive | |
| Phasmida | T00I02X03X06R08X0CQ0DM0IX0JL0MV0O | T00I02G03E06R08S0CQ0DM0IK0JL0MV0O | Positive | |
| Blattodea | T00I02X03E06R08S0CQ0DM0IH0PT0S | T00I02G03E06R08S0CQ0DM0IH0PT0S | Positive | |
| Blattodea | T00I02X03E06R08S0CQ0DM0IX0PT0S | T00I02G03E06R08S0CQ0DM0IH0PT0S | Positive | |
| Blattodea | T00I02X03E06R08X0CQ0DM0IX0PT0S | T00I02G03E06R08S0CQ0DM0IH0PT0S | Positive | |
| Archaeognatha | T00I02X03N05 | T00I02G03N05 | Positive | |
| Zygentoma | T00I02X03E06T07 | T00I02G03E06T07 | Positive | |
| Coleopterodea | T00I02X03?06R08S0CX0VN0XN0ZR10H16X18 | T00I02G03E06R08S0CI0VN0XN0ZR10H16K18 | False-negative | |
| Lepidoptera | T00I02G03X06R08S0CX0VN0XN0ZN19L1AE1C | T00I02G03E06R08S0CI0VN0XN0ZN19L1AE1C | Positive | |
| Coleoptera | ?00I02X03E06R08X0CX0VN0XX0ZR10H16K18 | T00I02G03E06R08S0CI0VN0XN0ZR10H16K18 | Positive | |
| Diptera | ?00I02G03E06R08S0CI0VN0XN0ZN19A1DY1H | T00I02G03E06R08S0CI0VN0XN0ZN19A1DY1H | Positive | |
| Diptera | T00I02?03E06R08S0CI0V?0XN0ZN19A1DY1H | T00I02G03E06R08S0CI0VN0XN0ZN19A1DY1H | Positive | |
| Coleoptera | T00I02G03E06R08S0CX0VN0XN0ZR10X16K18 | T00I02G03E06R08S0CI0VN0XN0ZR10H16K18 | Positive | |
| Neuroptera | T00I02G03E06R08S0CX0VN0XN0ZR10M11X13C15 | T00I02G03E06R08S0CI0VN0XN0ZR10M11M13C15 | Positive | |
| Coleoptera | T00I02G03E06R08S0CI0VN0XN0ZR10H16K18 | T00I02G03E06R08S0CI0VN0XN0ZR10H16K18 | Positive | |
| Megaloptera | T00I02X03?06R08X0CX0VX0XN0ZR10X11X13R14 | T00I02G03E06R08S0CI0VN0XN0ZR10M11M13R14 | Positive | |
| Hymenoptera | T00I02G03E06R08S0CX0VN0XL0Y | T00I02G03E06R08S0CI0VN0XL0Y | Positive | |
| Diptera | T00I02G03E06R08?0CI0VN0XN0ZN19A1DY1H | T00I02G03E06R08S0CI0VN0XN0ZN19A1DY1H | Positive | |
| Coleoptera | T00I02X03E06R08S0CX0VN0XN0ZR10H16K18 | T00I02G03E06R08S0CI0VN0XN0ZR10H16K18 | Positive | |
| Diptera | T00I02?03E06R08S0CI0VN0XN0ZN19A1DY1H | T00I02G03E06R08S0CI0VN0XN0ZN19A1DY1H | Positive | |
| Hymenoptera | T00X02G03E06?08X0CI0VN0XL0Y | T00I02G03E06R08S0CI0VN0XL0Y | Positive | |
| Coleoptera | T00I02?03E06?08?0C?0VN0X?0ZR10H16K18 | T00I02G03E06R08S0CI0VN0XN0ZR10H16K18 | Positive | |
| Coleoptera | T00X02?03?06R08?0CX0V?0XX0ZR10X16K18 | T00I02G03E06R08S0CI0VN0XN0ZR10H16K18 | Positive | |
| Lepidoptera | T00I02?03X06R08X0CX0VN0XX0ZN19L1AE1C | T00I02G03E06R08S0CI0VN0XN0ZN19L1AE1C | Positive | |
| Diptera | T00I02G03E06R08S0CX0VX0XN0ZN19A1DY1H | T00I02G03E06R08S0CI0VN0XN0ZN19A1DY1H | Positive | |
| Neuroptera | ?00?02?03X06?08?0CX0V?0X?0ZR10?11X13C15 | T00I02G03E06R08S0CI0VN0XN0ZR10M11M13C15 | Positive | |
| Coleoptera | T00I02G03E06R08S0CI0VN0XN0ZR10H16K18 | T00I02G03E06R08S0CI0VN0XN0ZR10H16K18 | Positive * | |
| Siphonaptera | T00I02G03E06R08S0CI0VN0XN0ZN19A1DG1E | T00I02G03E06R08S0CI0VN0XN0ZN19A1DG1E | Positive * | |
| Hymenoptera | T00I02G03E06R08X0CX0VN0XL0Y | T00I02G03E06R08S0CI0VN0XL0Y | Positive | |
| Hymenoptera | T00I02G03E06R08S0CI0VX0XL0Y | T00I02G03E06R08S0CI0VN0XL0Y | Positive | |
| Raphidioptera | T00X02G03?06R08X0CX0V?0XX0Z?10X11N12 | T00I02G03E06R08S0CI0VN0XN0ZR10M11N12 | Positive | |
| Lepidoptera | T00I02G03E06R08S0CX0VN0XN0ZN19L1AE1C | T00I02G03E06R08S0CI0VN0XN0ZN19L1AE1C | Positive | |
| Lepidoptera | T00I02G03X06R08S0CX0VN0XX0ZN19L1AE1C | T00I02G03E06R08S0CI0VN0XN0ZN19L1AE1C | Positive | |
| Diptera | ?00I02G03?06R08S0CI0VN0XN0ZN19?1DY1H | T00I02G03E06R08S0CI0VN0XN0ZN19A1DY1H | Positive | |
| Diptera | T00I02G03E06R08S0CI0VN0XN0ZN19A1DY1H | T00I02G03E06R08S0CI0VN0XN0ZN19A1DY1H | Positive * | |
| Lepidoptera | ?00?02G03X06R08X0CX0V?0X?0ZN19?1AE1C | T00I02G03E06R08S0CI0VN0XN0ZN19L1AE1C | Positive | |
| Hymenoptera | T00X02?03E06R08S0CX0VN0XL0Y | T00I02G03E06R08S0CI0VN0XL0Y | Positive | |
| Diptera | T00I02G03E06R08S0CI0VN0XN0ZN19A1DY1H | T00I02G03E06R08S0CI0VN0XN0ZN19A1DY1H | Positive * | |
| Hymenoptera | T00I02?03E06R08S0CX0VN0XL0Y | T00I02G03E06R08S0CI0VN0XL0Y | Positive | |
| Lepidoptera | ?00?02?03?06R08?0C?0VX0X?0ZN19XAE1C | T00I02G03E06R08S0CI0VN0XN0ZN19L1AE1C | Positive | |
| Diptera | T00I02G03E06R08S0CI0VN0XN0ZN19A1DY1H | T00I02G03E06R08S0CI0VN0XN0ZN19A1DY1H | Positive * | |
| Holometabola | ?00?02?03X06X08?0CX0VN0X?0Y | T00I02G03E06R08S0CI0VN0XL0Y | False-negative | |
| Lepidoptera | X00I02X03X06R08S0CX0VN0XX0ZN19L1AE1C | T00I02G03E06R08S0CI0VN0XN0ZN19L1AE1C | Positive | |
“X” represents a missing amino acid residue, while “?” represents a missing gene. “*” marks a complete match.
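The comparison between a query description and a reference description in the table can be sketched as follows. The sketch assumes, from the rows shown here rather than from any stated specification, that each description is a run of three-character codes: one residue letter (or “X”/“?” for missing data) followed by a two-character numerical symbol. The names `parse_description`, `compare` and `Comparison` are hypothetical, and the tally does not attempt to reproduce the "Result" column.

```python
from typing import List, NamedTuple, Tuple

# "X" = missing residue, "?" = missing gene (as defined in the table note).
WILDCARDS = {"X", "?"}


class Comparison(NamedTuple):
    matched: int    # observed residues that agree with the reference
    missing: int    # positions with no observed residue in the query
    conflicts: int  # observed residues that disagree with the reference


def parse_description(code: str) -> List[Tuple[str, str]]:
    """Split a description into (residue, symbol) pairs.
    Assumes fixed-width codes of 1 residue character + 2 symbol characters."""
    if len(code) % 3 != 0:
        raise ValueError(f"description length is not a multiple of 3: {code!r}")
    return [(code[i], code[i + 1:i + 3]) for i in range(0, len(code), 3)]


def compare(query: str, reference: str) -> Comparison:
    """Tally agreement between a query and a reference description."""
    q, r = parse_description(query), parse_description(reference)
    if [sym for _, sym in q] != [sym for _, sym in r]:
        raise ValueError("descriptions do not follow the same series of symbols")
    matched = missing = conflicts = 0
    for (q_res, _), (r_res, _) in zip(q, r):
        if q_res in WILDCARDS:
            missing += 1
        elif q_res == r_res:
            matched += 1
        else:
            conflicts += 1
    return Comparison(matched, missing, conflicts)


# Odonata row from the table: three residues are missing, none conflict.
print(compare("X00I02X03E06R08X09C0B", "T00I02G03E06R08S09C0B"))
# Comparison(matched=4, missing=3, conflicts=0)
```

Treating missing positions as wildcards lets an incomplete transcriptome still be compared against a lineage description, while any non-zero `conflicts` count flags a query that contradicts the reference at an observed site.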
Figure 5. Comparison of DNA barcoding methods and the hierarchical molecular apomorphy-based classification system.
Complete definitions of known known, known unknown, unknown known and unknown unknown can be found in the study by Collins and Cruickshank (ref. 59). (a) General workflow of database construction in previous barcoding methods. (b) General workflow of database construction in the hierarchical molecular apomorphy-based system. (c) General workflow of identification in previous barcoding methods. (d) General workflow of identification in the hierarchical molecular apomorphy-based classification system.