Roman A Gershgorin, Konstantin Yu Gorbunov, Oleg A Zverkov, Lev I Rubanov, Alexandr V Seliverstov, Vassily A Lyubetsky.
Abstract
Recent phylogenetic analyses increasingly incorporate ultraconserved elements (UCEs) and highly conserved elements (HCEs). Models of the evolution of genome structure and of HCEs initially faced considerable algorithmic challenges, which led to (often unnatural) constraints on these models, even for conceptually simple tasks such as calculating the distance between two structures or identifying UCEs. In our recent work, we have addressed these constraints with fast and efficient solutions that impose no constraints on the underlying models. These approaches led us to an unexpected result: for some organelles and taxa, the genome structure and the HCE set, although each contains relatively little information, still adequately resolve the evolution of species. We also used HCE identification to search for promoters and regulatory elements that characterize the functional evolution of the genome.
Keywords: Ciliophora; chromosome structure; evolution; highly conserved elements; mitochondria; proteins clustering
Year: 2017 PMID: 28264444 PMCID: PMC5370409 DOI: 10.3390/life7010009
Source DB: PubMed Journal: Life (Basel) ISSN: 2075-1729
Figure 1. The tree of mitochondrial evolution generated using 393 HCEs identified by our algorithm. The tree was generated by the RAxML program based on a matrix with 12 rows and 393 columns, with the matrix cells containing 1 or 0 to indicate the presence or absence of a given HCE in the mitochondrial genome of a given species, respectively.
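The presence/absence matrix described in the Figure 1 caption can be illustrated with a minimal sketch. The species labels, HCE identifiers, and the helper names `presence_matrix` and `to_phylip` are hypothetical, and the PHYLIP-style text output is an assumption about a convenient input format, not a description of the authors' pipeline.

```python
def presence_matrix(hces_by_species, hce_ids):
    """Return {species: [0/1, ...]} with one column per HCE identifier."""
    return {
        species: [1 if hce in found else 0 for hce in hce_ids]
        for species, found in hces_by_species.items()
    }


def to_phylip(matrix):
    """Serialise the matrix as PHYLIP-style text: a header line, then one row per species."""
    n_cols = len(next(iter(matrix.values())))
    lines = [f"{len(matrix)} {n_cols}"]
    for species, row in matrix.items():
        lines.append(f"{species} {''.join(str(v) for v in row)}")
    return "\n".join(lines) + "\n"


# Toy input (hypothetical): three species and four HCE identifiers.
toy_hces = {
    "species_A": {"hce1", "hce2", "hce4"},
    "species_B": {"hce1", "hce3"},
    "species_C": {"hce2", "hce3", "hce4"},
}
toy_ids = ["hce1", "hce2", "hce3", "hce4"]

toy_matrix = presence_matrix(toy_hces, toy_ids)
phylip_text = to_phylip(toy_matrix)
print(phylip_text)
```

For the data in the caption, the same construction would yield 12 rows and 393 columns.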
Six highly conserved elements (HCEs) represented in the class Oligohymenophorea.
| Species | 1st Position | Sequence Fragment |
|---|---|---|
| | 2984 | AATTTAAATACTTGCATTAAGACTAATCGTGG |
| | 2988 | AATTTAAATACTTGCATTAAGACTAATCGTGG |
| | 2988 | AATTTAAAAGCTTGCATTAATACTAATCTTGG |
| | 2943 | AATTTAAACACTTGCATTAAAACTAATCTTGG |
| | 10523 | GACACAC |
| | 10558 | GATAAAC |
| | 10589 | GATAGAC |
| | 10500 | GATAGAC |
| | 4810 | ATAAAATAAGTTCTAAAAATG |
| | 5270 | ATAAAATAAGTTCTTAATATA |
| | 4811 | ATAAAATATGTTCTAAAAATA |
| | 4839 | ATAAAATAAGTTCTAAAAATA |
| | 4788 | TTTTTTTAAATATCTAAAAGTAATAAAATAAGTTCTAAA |
| | 5248 | TTTTTTTAAATATCTAAATGTTATAAAATAAGTTCTTAA |
| | 4789 | TTTTTTAAAATATCTAAAAGTTATAAAATATGTTCTAAA |
| | 4817 | TTTTTTGATATATCTAAAAGTGATAAAATAAGTTCTAAA |
| | 4756 | TTTTTTTAAATATCTAAAAGTAATAAAATAAGTTCTAAA |
| | 1364 | TTTAGGTGCAGCTAT |
| | 47702 | TATAGCTGCACCTAAAAAAAAAAAA |
| | 27009 | AATAGCCGCACCTAAAAGAAAAAAATCTA |
| | 26884 | AATAGCTGCTCCAAAAAGAAAAAAATCAA |
| | 26364 | AATAGCCGCACCTAAAAGAAAAAAATCCA |
| | 26770 | AATGGCCGCACCTAAAAGAAAAAAATCAA |
| | 27061 | AATAGCCGCACCTAAAAGAAAAAAATCTA |
| | 26891 | ATAA |
| | 26211 | TCAA |
| | 26678 | TTAA |
| | 26921 | TTAA |
Figure 2. Alignment of 5′-leader sequences upstream of the cob gene.
Figure 3. Tree of NADH dehydrogenase subunit 9 (Nad9) family according to our clustering. The tree was generated by PhyloBayes.
Figure 4. Size distribution of the clusters. The bar height shows the number of clusters including proteins from the number of species indicated on the abscissa.
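The quantity plotted in Figure 4 can be sketched as a simple tally: for each cluster, count the distinct species that contribute at least one protein, then count how many clusters share each value. The cluster contents below are invented toy data, and the function names are hypothetical; this is an illustration of the tally, not the authors' clustering code.

```python
from collections import Counter


def species_per_cluster(clusters):
    """For each cluster, count the distinct species that contribute a protein."""
    return [len({species for species, _protein in cluster}) for cluster in clusters]


def size_distribution(clusters):
    """Map 'number of species in a cluster' to 'number of such clusters'."""
    return Counter(species_per_cluster(clusters))


# Toy clusters (hypothetical): each member is a (species, protein_id) pair.
toy_clusters = [
    [("sp_A", "p1"), ("sp_B", "p7"), ("sp_C", "p3")],
    [("sp_A", "p2"), ("sp_B", "p9")],
    [("sp_A", "p4"), ("sp_A", "p5"), ("sp_C", "p6")],  # two proteins from sp_A count as one species
    [("sp_B", "p8")],  # a singleton
]

toy_distribution = size_distribution(toy_clusters)
for n_species in sorted(toy_distribution):
    print(n_species, toy_distribution[n_species])
```

Counting species rather than proteins means that paralogs within one genome do not inflate the value on the abscissa.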
Distribution of proteins in clusters and singletons. Three columns on the right specify the numbers of proteins encoded in the mitochondrion, nontrivial clusters, and singletons for each species.
| Locus | Species | Proteins | Clusters | Singletons |
|---|---|---|---|---|
| NC_015981.1 | | 41 | 39 | 2 |
| GQ903131.1 | | 29 | 25 | 4 |
| GQ903130.1 | | 36 | 30 | 6 |
| GU057832.1 | | 35 | 13 | 22 |
| JN383843.1 | | 99 | 31 | 68 |
| NC_001324.1 | | 46 | 41 | 5 |
| NC_014262.1 | | 42 | 41 | 1 |
| NC_008337.1 | | 45 | 44 | 0 |
| NC_008338.1 | | 44 | 43 | 1 |
| NC_008339.1 | | 44 | 44 | 0 |
| NC_000862.1 | | 44 | 44 | 0 |
| NC_003029.1 | | 45 | 44 | 0 |
Figure 5. Evolutionary tree of mitochondria generated by PhyloBayes using the identified protein families. All nodes have the maximum support values.
Figure 6. Evolutionary tree of mitochondrial chromosome structures. The tree was generated by the neighbor-joining method using distances between chromosome structures calculated as described in [19].
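To make the neighbor-joining step in the Figure 6 caption concrete, here is a minimal sketch of the standard algorithm applied to a small distance matrix. The taxon labels and distances are invented toy values, not chromosome-structure distances from [19], and the `neighbor_joining` function and its edge-list output are illustrative choices rather than the software the authors used.

```python
from itertools import combinations


def neighbor_joining(names, matrix):
    """Return an unrooted tree as a list of (node, node, branch_length) edges."""
    d = {a: {b: float(matrix[i][j]) for j, b in enumerate(names)}
         for i, a in enumerate(names)}
    active = list(names)
    edges = []
    next_id = 0
    while len(active) > 2:
        n = len(active)
        total = {a: sum(d[a][b] for b in active) for a in active}
        # Join the pair that minimises Q(a, b) = (n - 2) d(a, b) - total(a) - total(b).
        a, b = min(combinations(active, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - total[p[0]] - total[p[1]])
        u = f"node{next_id}"
        next_id += 1
        # Branch lengths from the new internal node u to the joined pair.
        len_a = 0.5 * d[a][b] + (total[a] - total[b]) / (2 * (n - 2))
        len_b = d[a][b] - len_a
        edges += [(u, a, len_a), (u, b, len_b)]
        # Distances from u to every remaining node.
        d[u] = {u: 0.0}
        for c in active:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        active = [c for c in active if c not in (a, b)] + [u]
    a, b = active
    edges.append((a, b, d[a][b]))
    return edges


# Toy additive distances (hypothetical) for four taxa.
toy_names = ["A", "B", "C", "D"]
toy_dist = [
    [0, 3, 9, 10],
    [3, 0, 10, 11],
    [9, 10, 0, 7],
    [10, 11, 7, 0],
]
toy_edges = neighbor_joining(toy_names, toy_dist)
for edge in toy_edges:
    print(edge)
```

Because the toy distances are additive, the sketch recovers the generating tree exactly: A and B join one internal node, C and D join the other.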