Liliya Doronina, Olga Reising, Hiram Clawson, Gennady Churakov, Jürgen Schmitz.
Abstract
Euarchontoglires, once described as Supraprimates, comprise primates, colugos, tree shrews, rodents, and lagomorphs in a clade that evolved about 90 million years ago (mya) from a shared ancestor with Laurasiatheria. The rapid speciation of groups within Euarchontoglires, and the resulting incomplete fixation of markers in ancestral lineages, has hampered attempts at phylogenetic reconstruction, particularly regarding the phylogenetic position of tree shrews. To resolve this conundrum, we sampled genome-wide presence/absence patterns of transposed elements (TEs) from all representatives of Euarchontoglires. This marker system has the advantage that phylogenetically diagnostic characters can be extracted genome-wide from reference genomes in a nearly unbiased fashion, and TE insertions are virtually free of homoplasy. We simultaneously employed two computational tools, the genome presence/absence compiler (GPAC) and 2-n-way, to maximize the number of diagnostic insertions recovered from more than 3 million TE positions. Of the 361 extracted diagnostic TEs, 132 provide significant support for the current resolution of Primatomorpha (Primates plus Dermoptera), 94 support the union of Euarchonta (Primates, Dermoptera, plus Scandentia), and 135 marker insertion patterns support a variety of alternative phylogenetic scenarios. Thus, whole genome-level analysis and a virtually homoplasy-free marker system offer an opportunity to finally resolve the notorious phylogenetic challenges that nature produces in rapidly diversifying groups.
Keywords: 2-n-way; 4-lineage statistical test; Euarchontoglires; GPAC; ancestral incomplete lineage sorting; presence/absence; retrophylogenomics; transposed elements (TEs)
Year: 2022 PMID: 35627160 PMCID: PMC9141288 DOI: 10.3390/genes13050774
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.141
Figure 1. Procedure to screen for phylogenetically diagnostic TE presence/absence patterns. Reference species are shown on the left. We applied multi-way and combinations of multiple 2-way genome screenings for diagnostic TEs. All investigated genomes were previously repeat masked. Repeat coordinates were sorted in the genome presence/absence compiler (GPAC) (multi-way genome alignments) and 2-n-way (combinations of 2-way alignments) for their presence (+) or absence (−). In addition to presence/absence patterns, 2-n-way also provided the necessary sequence alignments that were carefully checked manually to verify orthology and remove duplicated loci. For GPAC, we compiled and verified alignments previously retrieved from genomic coordinates. Additional species needed for verifying the consistent presence of elements in each group were added via manual blast screening. The final steps of analysis involve tree reconstruction and 4-lineage statistics (4-LIN) to determine the significance of the trees.
Figure 2. Structure of an example alignment for a MER57F#LTR/ERV1 TE (marker Euarch187). The presence of the TE is shown in green and indicated by (+). The absence state is displayed in yellow and indicated by (−). TSD indicates the TE-flanking tandem sequence duplications that appear during the insertion process of elements and are hallmarks of orthology. The stringencies of selecting a diagnostic position are given below the sequences.
Figure 3. Phylogenetic reconstruction derived from TE insertion presence/absence patterns and statistically analyzed using the 4-LIN tool. Primatomorpha received the most presence/absence support (132 TE insertions), followed by Euarchonta (94 diagnostic TEs). All markers supporting conflicting tree topologies (135 in total) are indicated as grey balls with their respective numbers and relationships. Branches marked by black arrowheads share the presence of the insertions represented by a grey ball (orthologous insertions); all other species lack them. For example, nine TE insertions were present in Dermoptera and Scandentia and absent in Primates and Glires. Lagomorpha accounted for only 299 (including ten rodent-lagomorph conflicting patterns) of the 361 TE markers shown here (for details see text). The complete presence/absence matrix is given as Supplementary Table S1. We did not screen for monophyly markers of Glires and Euarchontoglires, monophyly markers of euarchontogliran orders, or phylogenetic signals within them.
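The way presence/absence patterns translate into support counts for competing groupings can be illustrated with a minimal sketch. It is not the GPAC, 2-n-way, or 4-LIN implementation, and it performs no significance testing. The marker names and +/− calls are invented toy data, and the `classify` and `tally` helpers are hypothetical names introduced only for illustration. The sketch assumes four scored lineage groups and treats "?" as missing data.

```python
from collections import Counter

# Lineage groups scored in this toy example (labels follow the figure caption).
LINEAGES = ("Primates", "Dermoptera", "Scandentia", "Glires")

# The two groupings named in the caption; any other shared-presence
# pattern is reported as a conflicting signal.
CLADES = {
    frozenset({"Primates", "Dermoptera"}): "Primatomorpha",
    frozenset({"Primates", "Dermoptera", "Scandentia"}): "Euarchonta",
}


def classify(pattern):
    """Return the grouping supported by one marker's +/- pattern."""
    present = frozenset(l for l in LINEAGES if pattern.get(l) == "+")
    absent = [l for l in LINEAGES if pattern.get(l) == "-"]
    # Assumption: a marker is informative only if at least two lineages
    # share the insertion and at least one lineage clearly lacks it.
    if len(present) < 2 or not absent:
        return "uninformative"
    return CLADES.get(present, "conflict: " + "+".join(sorted(present)))


def tally(markers):
    """Count how many markers support each grouping."""
    return Counter(classify(p) for p in markers.values())


# Invented toy markers, NOT taken from Supplementary Table S1.
toy_markers = {
    "toy1": {"Primates": "+", "Dermoptera": "+", "Scandentia": "-", "Glires": "-"},
    "toy2": {"Primates": "+", "Dermoptera": "+", "Scandentia": "+", "Glires": "-"},
    "toy3": {"Primates": "-", "Dermoptera": "+", "Scandentia": "+", "Glires": "-"},
    "toy4": {"Primates": "+", "Dermoptera": "+", "Scandentia": "-", "Glires": "?"},
}

if __name__ == "__main__":
    for grouping, count in tally(toy_markers).items():
        print(grouping, count)
```

In this toy set, `toy3` mirrors the caption's example of an insertion shared by Dermoptera and Scandentia but absent in Primates and Glires, so it is counted as a conflicting pattern rather than as support for either named grouping.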
Figure 4. ASTRAL_BP quartet-based species tree for the 361 retrotransposon markers. Branch labels are posterior probabilities and bootstrap values, respectively. Branch lengths in coalescent units are indicated by a scale bar.