Huai-Chun Wang, Edward Susko, Andrew J Roger.
Abstract
BACKGROUND: The covarion hypothesis of molecular evolution holds that selective pressures on a given amino acid or nucleotide site are dependent on the identity of other sites in the molecule that change throughout time, resulting in changes of evolutionary rates of sites along the branches of a phylogenetic tree. At the sequence level, covarion-like evolution at a site manifests as conservation of nucleotide or amino acid states among some homologs where the states are not conserved in other homologs (or groups of homologs). Covarion-like evolution has been shown to relate to changes in functions at sites in different clades, and, if ignored, can adversely affect the accuracy of phylogenetic inference.
Year: 2009 PMID: 19737395 PMCID: PMC2758850 DOI: 10.1186/1471-2148-9-225
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Estimated parameters under SPR-based full tree search for datasets of 250 amino acid sites simulated on a 17-taxon chaperonin tree with JTT + the listed models.
| Model | Parameter | Simulated value | Estimated value | Log-likelihood | | Run time |
| RAS | α | 0.50 | 0.50 | -4245.78 | 698 | 17 min |
| Tuffley-Steel | s01 | 1.875 | 1.66 | -4752.23 | 648 | 14 min |
| | s10 | 1.25 | 0.80 | | | |
| Galtier | α | 0.5 | 0.41 | -4188.25 | 673 | 6 hr 52 min |
| | s11 | 1.5 | 2.29 | | | |
| | π | 0.6 | 0.57 | | | |
| Huelsenbeck | α | 0.5 | 0.46 | -4070.79 | 648 | 6 hr 8 min |
| | s01 | 1.875 | 1.89 | | | |
| | s10 | 1.25 | 0.83 | | | |
| General | α | 0.5 | 0.50 | -4156.24 | 672 | 6 hr 26 min |
| | s01 | 1.5 | 1.00 | | | |
| | s10 | 2 | 1.35 | | | |
| | s11 | 2.5 | 5.35 | | | |
| | π | 0.6 | 0.62 | | | |
Figure 1. The true tree length of the 17-taxon CPN60 tree used for simulating the datasets, and the estimates under the different models. Tree lengths (Y axis) are shown separately as the sum of the external branches (External), the sum of the internal branches (Internal), and the total tree length (Total). The data were simulated under JTT + the general covarion model and estimated under the other models.
Log-likelihoods of the two competing trees of the Microsporidia data [42] calculated with PROCOV under the general covarion model (GCM) and the RAS model, respectively, and with QmmRAxML under the class frequency mixture model (cF) [47].
| Tree | GCM (PROCOV) | RAS (PROCOV) | cF (QmmRAxML) |
| Microsporidia-fungi-clade | -737,304.13 | -742,093.43 | -731,758.97 |
| Microsporidia-archaea-clan | -741,721.00 | -741,895.93 | -731,780.03 |
| Log likelihood difference between the two trees | 4416.87 | -197.50 | 21.06 |
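The last row of the table is plain arithmetic on the two rows above it: the fungi-clade log-likelihood minus the archaea-clan log-likelihood for each model. A minimal Python sketch that recomputes it follows. The dictionary layout and the function name `loglik_differences` are illustrative assumptions, not part of PROCOV or QmmRAxML.

```python
# Log-likelihoods copied from the table above (thousands separators removed).
# The data layout and names are illustrative, not taken from any program.
LOG_LIK = {
    "Microsporidia-fungi-clade": {"GCM": -737304.13, "RAS": -742093.43, "cF": -731758.97},
    "Microsporidia-archaea-clan": {"GCM": -741721.00, "RAS": -741895.93, "cF": -731780.03},
}


def loglik_differences(table, tree_a, tree_b):
    """Return lnL(tree_a) - lnL(tree_b) per model, rounded to two decimals."""
    return {
        model: round(table[tree_a][model] - table[tree_b][model], 2)
        for model in table[tree_a]
    }


diffs = loglik_differences(
    LOG_LIK, "Microsporidia-fungi-clade", "Microsporidia-archaea-clan"
)
for model, diff in diffs.items():
    # A positive difference means the fungi-clade tree has the higher likelihood.
    better = "fungi-clade" if diff > 0 else "archaea-clan"
    print(f"{model}: difference = {diff:.2f} (higher likelihood: {better})")
```

Read this way, the sign of each difference shows which tree a model prefers: positive under GCM and cF, negative under RAS.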
Figure 2. A phylogenetic tree of the bacterial EF-Tu and eukaryotic EF-1α inferred with PROCOV under WAG + the general covarion model. Brackets show the amino acids of the two groups at position 256, a site illustrating a non-typical covarion pattern in which both the eukaryotic (EF-1α) and bacterial (EF-Tu) sequences are highly variable. Changes in EF-1α are more radical (i.e. substitutions between amino acids with different physicochemical properties), whereas those in EF-Tu are structurally more conservative.
Figure 3. The distribution of the difference between covarion log-likelihood and RAS log-likelihood at sites for the EF data analysed with the general covarion model.
Figure 4. A frequency density distribution of Λ, the difference between covarion log-likelihood and RAS log-likelihood at sites, estimated under the general covarion model for a dataset (10,000 sites) simulated under the RAS model based on the EF tree (shown in Figure 2).
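A null distribution of Λ such as the one described in this caption could be used to judge how extreme an observed site value is. The sketch below shows one generic way to do that with an empirical upper-tail p-value. It is an illustration only: it assumes the simulated Λ values are already available as a list of floats, the function name is hypothetical, and the null values shown are made-up placeholders, not output from PROCOV or results from the paper.

```python
from bisect import bisect_left


def empirical_p_value(observed, sorted_null):
    """Upper-tail empirical p-value of `observed` against a sorted null sample.

    Uses the (count + 1) / (n + 1) correction so the result is never exactly 0.
    """
    n = len(sorted_null)
    # Number of null values greater than or equal to the observed value.
    n_at_least = n - bisect_left(sorted_null, observed)
    return (n_at_least + 1) / (n + 1)


# Placeholder null sample (NOT real data); a real analysis would use the
# Λ values from the sites simulated under the RAS model.
null_lambda = sorted([-0.5, -0.2, 0.0, 0.1, 0.3, 0.4, 0.8, 1.2, 1.9])

p_moderate = empirical_p_value(1.0, null_lambda)
p_extreme = empirical_p_value(5.0, null_lambda)
print(p_moderate, p_extreme)
```

With only a handful of null values the smallest attainable p-value is coarse; a larger simulated sample gives finer resolution in the tail.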
The forty-three sequence positions in the EF data with the highest differences between covarion site log-likelihood and RAS site log-likelihood.
| Rank | Sequence position* | Λ** | Site class*** | P-value**** |
| 1 | 34 | 7.280 | c | <0.001 |
| 2 | 36 | 6.503 | c | <0.001 |
| 3 | 325 | 6.246 | c | <0.001 |
| 4 | 305 | 5.652 | c | <0.001 |
| 5 | 138 | 5.458 | c | <0.001 |
| 6 | 336 | 4.873 | c | <0.001 |
| 7 | 329 | 4.790 | c | <0.001 |
| 8 | 153 | 4.702 | c | <0.001 |
| 9 | 327 | 4.632 | c | 0.014 |
| 10 | 35 | 4.595 | c | 0.022 |
| 11 | 123 | 4.438 | c | 0.004 |
| 12 | 311 | 4.199 | c | 0.003 |
| 13 | 189 | 4.034 | c | 0.007 |
| 14 | 103 | 3.906 | c | 0.001 |
| 15 | 69 | 3.726 | c | 0.004 |
| 16 | 131 | 3.430 | c | 0.002 |
| 17 | 256 | 3.256 | c2 | 0.122 |
| 18 | 351 | 3.202 | c | 0.027 |
| 19 | 38 | 3.120 | c1 | 0.043 |
| 20 | 51 | 3.073 | c1 | 0.013 |
| 21 | 42 | 3.057 | c1 | 0.029 |
| 22 | 106 | 2.896 | c1 | 0.064 |
| 23 | 67 | 2.866 | c | 0.120 |
| 24 | 271 | 2.793 | c1 | 0.064 |
| 25 | 133 | 2.789 | c | 0.045 |
| 26 | 144 | 2.700 | c1 | 0.002 |
| 27 | 163 | 2.604 | c | 0.006 |
| 28 | 263 | 2.588 | c | 0.033 |
| 29 | 31 | 2.557 | c1 | 0.012 |
| 30 | 39 | 2.442 | c1 | 0.065 |
| 31 | 160 | 2.274 | c | 0.073 |
| 32 | 64 | 2.23 | c1 | 0.038 |
| 33 | 32 | 2.143 | c | 0.147 |
| 34 | 82 | 2.116 | c1 | 0.029 |
| 35 | 96 | 2.092 | c1 | 0.080 |
| 36 | 326 | 2.060 | c | 0.021 |
| 37 | 178 | 2.047 | c1 | 0.341 |
| 38 | 37 | 1.935 | c1 | 0.081 |
| 39 | 40 | 1.921 | c1 | 0.023 |
| 40 | 350 | 1.875 | c1 | 0.071 |
| 41 | 355 | 1.818 | c1 | 0.039 |
| 42 | 288 | 1.755656 | c1 | 0.043 |
| 43 | 356 | 1.694422 | c1 | 0.111 |
* Sequence position is based on the EF alignment [24].
** Λ = ln(l_cov) - ln(l_ras).
*** c: sites found to be covarion sites of functional or structural significance in [24]; c1: sites having a typical covarion site pattern but missed in [24]; c2: site showing a non-typical covarion site pattern.
**** P-values from a Bivar analysis of the EF data.
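The statistic Λ defined in the footnote is the per-site difference between the covarion and RAS log-likelihoods, and the table ranks sites by it. A minimal sketch of that computation follows, assuming the per-site log-likelihoods under both models are available as equal-length lists. The function names and the numbers are hypothetical placeholders, not values from the EF data or output from PROCOV.

```python
def site_lambda(lnl_cov, lnl_ras):
    """Per-site Λ = ln(l_cov) - ln(l_ras) for two equal-length lists."""
    if len(lnl_cov) != len(lnl_ras):
        raise ValueError("both models must report the same number of sites")
    return [cov - ras for cov, ras in zip(lnl_cov, lnl_ras)]


def top_sites(lambdas, k):
    """Return the k (position, Λ) pairs with the largest Λ; positions are 1-based."""
    ranked = sorted(
        enumerate(lambdas, start=1), key=lambda pair: pair[1], reverse=True
    )
    return ranked[:k]


# Placeholder per-site log-likelihoods for a four-site toy alignment.
lnl_cov = [-10.0, -12.5, -8.0, -20.0]
lnl_ras = [-10.5, -12.0, -15.0, -21.0]

lambdas = site_lambda(lnl_cov, lnl_ras)
best = top_sites(lambdas, k=2)
print(best)
```

A large positive Λ marks a site that the covarion model explains much better than the RAS model, which is the property the ranking in the table is based on.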
Figure 5. The forty-three sites detected by PROCOV as covarion sites in the EF dataset (the upper part of the alignment is the bacterial EF-Tu and the lower part is the eukaryotic EF-1α). The positions are 31, 32, 34-40, 42, 51, 64, 67, 69, 82, 96, 103, 106, 123, 131, 133, 138, 144, 153, 160, 163, 178, 189, 256, 263, 271, 288, 305, 311, 325-327, 329, 336, 350, 351, 355 and 356. Site 256 is shown in red.
Figure 6. Tertiary structure of E. coli EF-Tu (PDB ID: 1EFC) [57], which has two identical polymer chains (A and B). The covarion residues are mapped on the A chain (the top polymer). The red arrow points to the purple strip of a loop region of 10 nearly consecutive covarion sites (sites 37, 38, 40-46, 48 in 1EFC), which correspond to sites 31, 32, 34-40, 42 of the EF alignment listed in Figure 5. The loop region is connected to two helices, one at each end.
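The correspondence between alignment positions and 1EFC residue numbers given in this caption can be checked mechanically. The sketch below expands the two site lists and confirms that, for these ten loop sites, the PDB numbering is the alignment numbering shifted by a constant. The helper `expand_ranges` is a hypothetical name, and the constant shift is inferred only from the sites listed here; it should not be assumed to hold elsewhere in the alignment, where gaps may change it.

```python
def expand_ranges(text):
    """Expand a string such as '31, 32, 34-40, 42' into a list of integers."""
    sites = []
    for token in text.split(","):
        token = token.strip()
        if "-" in token:
            # A range like '34-40' is inclusive at both ends.
            start, end = (int(part) for part in token.split("-"))
            sites.extend(range(start, end + 1))
        elif token:
            sites.append(int(token))
    return sites


alignment_loop = expand_ranges("31, 32, 34-40, 42")  # EF alignment positions
pdb_loop = expand_ranges("37, 38, 40-46, 48")        # residue numbers in 1EFC

# Set of distinct shifts between paired positions; one element means a constant shift.
offsets = {pdb - aln for aln, pdb in zip(alignment_loop, pdb_loop)}
print(len(alignment_loop), offsets)
```

Both lists expand to ten sites, and every pair differs by the same amount, which matches the "10 nearly consecutive covarion sites" described in the caption.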