Hugo Martins, Palle Villesen.
Abstract
BACKGROUND: Endogenous retroviruses (ERVs) are genetic fossils of ancient retroviral integrations that remain in the genomes of many organisms. Most loci have been rendered non-functional by mutations, but several intact retroviral genes are known in mammalian genomes. Some have been adopted by the host species, while the beneficial roles of others remain unclear. Besides the possible immunogenic impact of transcribing intact viral genes, endogenous retroviruses have also become a useful tool for studying phylogenetic relationships. Integration times of these viruses have been determined on the assumption that the 5' and 3' Long Terminal Repeat (LTR) sequences are identical at the time of integration but evolve separately afterwards. Similar approaches have used either a constant evolutionary rate or a range of rates for these viral loci, and only single-species data. Here we show the advantages of using different approaches.
Year: 2011 PMID: 21394200 PMCID: PMC3048862 DOI: 10.1371/journal.pone.0014745
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Two species estimation of integration time.
Estimation of insertion times based on multi-species phylogenetic data. LTR insertion date can be estimated through phylogenetic data by adding known speciation times of species pairs where those LTRs are known to be present (T1). Calculating the substitution rates for each separate species along T1 (bold blue line, bold green line) and LTR along T2 (dashed lines) as shown in the figure will allow the implementation of a final corrected estimation date (bottom formula).
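The formula referred to in the caption is part of the figure and is not reproduced in the text. The following is a minimal sketch of one plausible reading of the caption, not the authors' exact formula: a rate is estimated from the branches that evolved after the speciation event (duration T1), and that rate is then used to convert the older branch segment (the dashed lines) into a time T2, giving T1 + T2 as the integration time. The function name, argument names, and all numeric values are hypothetical.

```python
def two_stage_integration_time(t1_my, post_speciation_branches, pre_speciation_branch):
    """Sketch of a two-stage estimate: integration time = T1 + T2.

    t1_my                    -- assumed speciation time T1, in million years
    post_speciation_branches -- branch lengths (substitutions/site) from the
                                speciation node to each species' LTR copy
    pre_speciation_branch    -- branch length (substitutions/site) from the
                                integration node to the speciation node

    Assumption: the mean rate observed after speciation also applies to the
    older segment of the tree.
    """
    if t1_my <= 0:
        raise ValueError("speciation time must be positive")
    if not post_speciation_branches:
        raise ValueError("at least one post-speciation branch is required")

    # Rate in substitutions per site per million years, averaged over species.
    mean_branch = sum(post_speciation_branches) / len(post_speciation_branches)
    if mean_branch <= 0:
        raise ValueError("post-speciation branches must have positive length")
    rate = mean_branch / t1_my

    # Time represented by the older segment, under the same rate.
    t2_my = pre_speciation_branch / rate
    return t1_my + t2_my


# Hypothetical values: a 6 My speciation time, two species branches,
# and an older branch twice as long as their mean.
estimate = two_stage_integration_time(6.0, [0.005, 0.007], 0.012)
print(f"estimated integration time: {estimate:.1f} My")
```

With these illustrative numbers the older segment accounts for twice the speciation time, so the estimate lands at three times T1.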
Integration times estimated from independent LTR substitution rates and phylogenies, compared with MCMC estimations.
| File | HKY | GTR | MCMC median | MCMC mean | MCMC 95% interval |
|---|---|---|---|---|---|
|  | 42.44 | 42.13 | 35.62 | 36.05 | 29.52–44.89 |
|  | 24.19 | 24.07 | 31.77 | 32.09 | 27.11–38.84 |
|  | 53.58 | 54.74 | 56.58 | 58.19 | 39.51–85.82 |
|  | 19.68 | 19.64 | 18.27 | 18.54 | 14.97–23.76 |
|  | 165.6 | 166.5 | 105.8 | 106.9 | 96.27–123.2 |
|  | 24.49 | 24.63 | 32.32 | 33.11 | 23.69–47.30 |
|  | 58.89 | 59.11 | 48.17 | 48.69 | 39.18–61.12 |
|  | 9.270 | 9.269 | 9.131 | 9.025 | 7.155–10.37 |
|  | 6.373 | 6.375 | 7.008 | 7.065 | 5.577–8.878 |
|  | 38.98 | 39.43 | 40.13 | 40.69 | 32.44–52.04 |
Integration time estimates in million years ago (Mya) for the 10 LTR loci, using independent LTR rates and several phylogenetic inference methods. For the HKY and GTR substitution models, the genetic distance between LTRs was calculated using the Maximum Composite Likelihood method in MEGA 4.0. The MCMC 95% confidence interval of the first node age was calculated from a 25000-sample analysis after an initial 50000-sample stabilization run; every 100th sample from the analysis was used to build the Bayesian estimate. Three separate runs were made and all values for node ages were congruent (data not shown).
Integration times from traditional LTR divergence analysis.
| R = 0.002 (1) | R = 0.0026 (2) | R = 0.0014 (3) | R = 0.0013 (4) | 0.0023 < R < 0.005 (5) | 0.0025 < R < 0.0045 (6) |
|---|---|---|---|---|---|
| 26.11 | 20.08 | 37.30 | 40.17 | 10.44–22.70 | 11.60–20.89 |
| 16.85 | 13.18 | 24.07 | 25.93 | 6.741–14.65 | 7.490–13.48 |
| 22.75 | 17.50 | 32.50 | 35.00 | 9.101–19.78 | 10.11–18.20 |
| 15.15 | 11.66 | 21.65 | 23.31 | 6.062–13.18 | 6.735–12.12 |
| 98.32 | 75.63 | 140.5 | 151.3 | 39.33–85.49 | 43.70–78.66 |
| 32.73 | 25.18 | 46.76 | 50.36 | 13.09–28.46 | 14.55–26.19 |
| 39.11 | 30.09 | 55.87 | 60.17 | 15.65–34.01 | 17.38–31.14 |
| 18.49 | 14.23 | 26.42 | 28.45 | 7.397–16.08 | 8.219–14.80 |
| 16.88 | 12.99 | 24.12 | 25.98 | 6.754–14.68 | 7.504–13.51 |
| 32.95 | 25.35 | 47.07 | 50.69 | 13.18–28.65 | 14.65–26.36 |
Integration time estimates (Mya) calculated using fixed global rates R, in substitutions per site per million years. Rate sources: (1) Andersen et al. (1997), (2) Lavrentieva et al. (1998), (3) Lebedev et al. (2000), (4) Majer and Freeman (1995), (5) Wang et al. (2007), (6) Johnson and Coffin (1999). Human 5′-3′ pairwise distances were calculated on HKY model phylogenetic trees. Genetic distances were calculated in MEGA 4.0 using the maximum composite likelihood model.
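The fixed-rate columns are consistent with the standard LTR divergence relation T = K / (2R), where K is the 5′-3′ pairwise distance and R the assumed rate; the factor 2 reflects both LTRs accumulating substitutions independently. The short sketch below illustrates that arithmetic. The distance K = 0.10444 is not reported in the source: it is back-calculated here from the first table row (26.11 Mya at R = 0.002) purely for illustration, and the function name is hypothetical.

```python
def integration_time_my(ltr_distance, rate_per_site_per_my):
    """Traditional LTR divergence estimate, T = K / (2R), in million years."""
    if rate_per_site_per_my <= 0:
        raise ValueError("rate must be positive")
    return ltr_distance / (2.0 * rate_per_site_per_my)


# Assumption: K back-calculated from the first row of the table above.
K = 0.10444

# Single fixed rates, columns (1) to (4).
for rate in (0.002, 0.0026, 0.0014, 0.0013):
    print(f"R = {rate}: {integration_time_my(K, rate):.2f} Mya")

# Rate ranges, columns (5) and (6): the fastest rate gives the youngest age.
for low, high in ((0.0023, 0.005), (0.0025, 0.0045)):
    youngest = integration_time_my(K, high)
    oldest = integration_time_my(K, low)
    print(f"{low} < R < {high}: {youngest:.2f} to {oldest:.2f} Mya")
```

Because the estimate scales inversely with R, the choice of published rate alone moves the same locus from roughly 20 to 40 Mya in the first row.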
Variation of 3′ LTR and 5′ LTR substitution rates.
| 5′ LTR | 3′ LTR | Avg | 5′ LTR | 3′ LTR | Avg | 5′ LTR | 3′ LTR | Avg |
|---|---|---|---|---|---|---|---|---|
| 0.862 | 1.615 | 1.238 | 0.819 | 1.303 | 1.061 | 1.429 | 1.342 | 1.385 |
| 1.718 | 1.254 | 1.486 | 1.358 | 1.647 | 1.502 | 1.271 | 1.352 | 1.311 |
| 0.377 | 0.754 | 0.565 | 0.952 | 0.849 | 0.901 | 1.301 | 0.659 | 0.980 |
| 1.306 | 1.454 | 1.380 | 1.488 | 1.863 | 1.675 | - | - | - |
| 0.276 | 1.591 | 0.933 | 1.116 | 1.462 | 1.289 | 1.254 | 1.411 | 1.332 |
| 3.626 | 3.418 | 3.522 | 1.736 | 1.791 | 1.763 | - | - | - |
| 1.417 | 2.411 | 1.914 | 1.116 | 1.662 | 1.389 | 1.326 | 2.362 | 1.844 |
| 4.177 | 3.803 | 3.990 | - | - | - | - | - | - |
| 4.665 | 5.933 | 5.299 | - | - | - | - | - | - |
| 1.748 | 2.063 | 1.905 | 1.055 | 1.847 | 1.451 | 1.511 | 1.920 | 1.715 |
Substitution rate estimates along HKY tree branches, in number of substitutions per site per 10³ million years. Rates are given for each species pair in the tree, calculated from genetic distance between species pairs and assumed speciation times. These rates were used to calculate integration time estimations. For the GTR tree (data not shown), corresponding rates were calculated using the same methodology.
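As a rough illustration of how a rate in these units could be derived, the sketch below divides a between-species distance by twice the speciation time and rescales from "per million years" to "per 10³ million years". Dividing by 2T assumes the pairwise distance spans both lineages since the split; the source does not state whether the authors used this form or a single-branch length divided by T. The function names and the distance and speciation time in the first example are hypothetical; the second example uses the 5′ and 3′ values from the first table row.

```python
def rate_per_1000_my(pairwise_distance, speciation_time_my):
    """Rate in substitutions per site per 10^3 million years.

    Assumption: the distance separates two lineages that diverged
    speciation_time_my ago, so it covers 2 * T of evolution.
    """
    if speciation_time_my <= 0:
        raise ValueError("speciation time must be positive")
    per_my = pairwise_distance / (2.0 * speciation_time_my)
    return per_my * 1000.0


def mean_ltr_rate(rate_5p, rate_3p):
    """Average of the 5' and 3' LTR rates, as in the 'Avg' columns."""
    return (rate_5p + rate_3p) / 2.0


# Hypothetical species pair: distance 0.012, assumed split 6 My ago.
print(f"{rate_per_1000_my(0.012, 6.0):.3f} substitutions/site per 10^3 My")

# First table row, first species pair: 5' = 0.862, 3' = 1.615.
print(f"average LTR rate: {mean_ltr_rate(0.862, 1.615):.4f}")
```

A rate of 1.0 in these units corresponds to 0.001 substitutions per site per million years, which makes the table directly comparable with the fixed global rates used in the traditional analysis.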
Figure 2. LTR substitution rates.
Comparison of substitution rates between 5′ and 3′ LTRs. In blue, pairwise rates for loci estimated to be less than 25 million years old; in red, rates for loci estimated to be more than 25 million years old. The black dashed line represents identical rates between 5′ and 3′ LTRs.