| Literature DB >> 28025273 |
Ville N Pimenoff1,2, Cristina Mendes de Oliveira3,4, Ignacio G Bravo3,5.
Abstract
Every human suffers through life a number of papillomaviruses (PVs) infections, most of them asymptomatic. A notable exception are persistent infections by Human papillomavirus 16 (HPV16), the most oncogenic infectious agent for humans and responsible for most infection-driven anogenital cancers. Oncogenic potential is not homogeneous among HPV16 lineages, and genetic variation within HPV16 exhibits some geographic structure. However, an in-depth analysis of the HPV16 evolutionary history was still wanting. We have analyzed extant HPV16 diversity and compared the evolutionary and phylogeographical patterns of humans and of HPV16. We show that codivergence with modern humans explains at most 30% of the present viral geographical distribution. The most explanatory scenario suggests that ancestral HPV16 already infected ancestral human populations and that viral lineages co-diverged with the hosts in parallel with the split between archaic Neanderthal-Denisovans and ancestral modern human populations, generating the ancestral HPV16A and HPV16BCD viral lineages, respectively. We propose that after out-of-Africa migration of modern human ancestors, sexual transmission between human populations introduced HPV16A into modern human ancestor populations. We hypothesize that differential coevolution of HPV16 lineages with different but closely related ancestral human populations and subsequent host-switch events in parallel with introgression of archaic alleles into the genomes of modern human ancestors may be largely responsible for the present-day differential prevalence and association with cancers for HPV16 variants.Entities:
Keywords: Hominin evolution; divergence; evolutionary medicine; host-switch; human papillomavirus; infection and cancer; sexually transmitted infection; variant; virus-host coevolution
Mesh:
Year: 2016 PMID: 28025273 PMCID: PMC5854117 DOI: 10.1093/molbev/msw214
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
FPhylogenetic tree of the 118 HPV16 complete genomes. (A) Unrooted ML phylogenetic tree inferred using the complete coding region alignment after excluding a total of 26 positively selected and 115 negatively selected codons. The final alignment encompassed 118 sequences, 6,093 bp and 316 distinct alignment patterns. Bootstrap support after 500 replicates is given for branches leading to the A1–3, A4, B, C, and D variant lineages. Scale bar in substitutions per nucleotide site. (B) Maximum clade credibility tree inferred for selection-filtered HPV16 genome coding region alignment under the HHS scenario, 9.6 × 10−9 [5.5 × 10−9–13.6 × 10−9] subs/site/yr as substitution rate prior and enforcing two time point calibrations: archaic Hominin divergence at 500 kya (95% CI, 400–600 kya) and recent modern human migration out-of-Africa at 90 kya (95% CI, 60–120 kya). Scale bar in millions of years ago. Node bars indicate the 95% probability intervals for the corresponding node age.
FPhylogeographic distribution of the 1,680 HPV16 sequences encompassing the LCR, E6 and L2 genome loci. Each sequence was assigned to a specific HPV16 variant lineage (see the color coding for A1–3, A4, B, C, and D variants). The size of the pie charts is proportional to the number of sequences from the corresponding geographic region (supplementary table S3, supplementary material online).
FTimeline of divergence for archaic and modern human ancestors, and for HPV16. (A) Dated and classification-confirmed Hominin fossil taxa (light green represents uncertain classification, as reviewed by Scally and Durbin 2012). (B) Phylogenetic relationship of modern human, Neanderthal and Denisovan populations. Vertical arrows indicate the proposed interbreeding and subsequent gene flow from the Neanderthal and Denisovan populations to the ancestors of present-day modern humans. The evolutionary relationships between Neanderthals and Denisovans are still unresolved (Sawyer et al. 2015; Stringer and Barnes 2015). (C) Phylogenetic relationships for 118 selection-filtered HPV16 coding genome sequences using maximum clade credibility tree under the Hominin host-switch scenario. Blue bars indicate the 95% probability intervals for the corresponding node age. Arrows indicate the nodes used for calibration.
Mantel’s Permutation Test for Similarity of FST Distance Matrices between HPV16 and Human Genome Sequence Data Sets.
| Selection-filtered | Including positions under selection | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Taxa | Loci | Region/SNPs | Correlation (%) | FDR | Correlation (%) | FDR | |||
| mtDNA | 1 | 16 kb genome | 875 | 12.4 | 0.037 | 0.017 | 15.8 | 0.026 | 0.018 |
| NRY | 1 | 5,000 kb | 551 | 8.5 | 0.062 | 0.028 | 11.8 | 0.044 | 0.036 |
| Europe | chr1 | 6,353 | 773 | 8.8 | 0.062 | 0.029 | 12.7 | 0.033 | 0.026 |
| Europe | chr2 | 5,918 | 773 | 6.3 | 0.091 | 0.040 | 9.6 | 0.054 | 0.040 |
| Europe | chr3 | 5,325 | 773 | 8.8 | 0.063 | 0.032 | 12.7 | 0.037 | 0.030 |
| Europe | chr4 | 4,211 | 773 | 8.9 | 0.063 | 0.030 | 12.7 | 0.036 | 0.029 |
| Europe | chr5 | 4,743 | 773 | 5.6 | 0.104 | 0.047 | 8.7 | 0.061 | 0.047 |
| Europe | chr6 | 4,736 | 773 | 6.3 | 0.095 | 0.041 | 9.5 | 0.055 | 0.042 |
| Europe | chr7 | 4,162 | 773 | 5.2 | 0.112 | 0.048 | 8.1 | 0.062 | 0.048 |
| Europe | chr8 | 3,750 | 773 | 8.0 | 0.070 | 0.036 | 11.6 | 0.045 | 0.037 |
| Europe | chr9 | 3,222 | 773 | 8.8 | 0.064 | 0.033 | 12.7 | 0.038 | 0.033 |
| Europe | chr10 | 4,223 | 773 | 9.2 | 0.059 | 0.026 | 13.1 | 0.035 | 0.028 |
| Europe | chr11 | 3,850 | 773 | 5.9 | 0.099 | 0.046 | 9.2 | 0.058 | 0.046 |
| Europe | chr12 | 3,339 | 773 | 5.1 | 0.116 | 0.049 | 8.0 | 0.066 | 0.049 |
| Europe | chr13 | 2,234 | 773 | 5.8 | 0.098 | 0.045 | 9.2 | 0.057 | 0.045 |
| Europe | chr14 | 2,307 | 773 | 5.9 | 0.097 | 0.043 | 9.3 | 0.056 | 0.043 |
| Europe | chr15 | 2,346 | 773 | 6.2 | 0.096 | 0.042 | 9.3 | 0.054 | 0.041 |
| Europe | chr16 | 2,636 | 773 | 7.5 | 0.078 | 0.038 | 11.1 | 0.045 | 0.038 |
| Europe | chr17 | 2,109 | 773 | 6.8 | 0.084 | 0.039 | 10.1 | 0.045 | 0.039 |
| Europe | chr18 | 2,243 | 773 | 9.8 | 0.056 | 0.025 | 13.9 | 0.033 | 0.025 |
| Europe | chr19 | 1,175 | 773 | 3.9 | 0.139 | 0.050 | 6.5 | 0.085 | 0.050 |
| Europe | chr20 | 2,067 | 773 | 9.9 | 0.055 | 0.023 | 13.8 | 0.031 | 0.021 |
| Europe | chr21 | 973 | 773 | 10.0 | 0.055 | 0.024 | 14.1 | 0.031 | 0.022 |
| Europe | chr22 | 1,141 | 773 | 7.7 | 0.073 | 0.037 | 11.6 | 0.042 | 0.034 |
Note.—HPV16 sequence data (1,192 bp) encompassing LCR (735 bp) and E6 (457 bp) regions for 1,101 worldwide isolates were used in the correlations. mitochondrial DNA (mtDNA); nonrecombining part of the Y chosomosome (NRY); autosomal SNP data stratified by chromosome and ascertained using a European (Europe) ancestry data (autosomal SNP data ascertained using a sub-Saharan ancestry data is provided in supplementary table S8, supplementary material online). For both datasets nine geographic metapopulations were defined: North/East/West/Southern Africa, South Asia, East Asia, Europe, Central and South America (supplementary table S2 and S5. supplementary material online).
R2 of the correlation coefficient.
Mantel-test after 10,000 permutations.
False discovery rate controlling procedure (q = 0.05).
Substitution Rate and Divergence Time Estimates with Highest Probability Density (HPD) in Thousands of Years Ago (kya) Inferred for HPV16 Complete (Full) or Selection-Filtered Genome Alignment (Neutral), under the HHS or the ROOA Scenarios, and Considering Different Priors.
| Model | Calibration | Rate uniform prior | Data | ×10−9 (subs/site/yr) | Log marginal | 2lnBF | tmrca ABCD (kya) | tmrca A1–4 (kya) | tmrca BCD (kya) | tmrca CD (kya) | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ×10−9 (subs/site/yr) | µ | (95% HPD) | likelihood | mean | (95% HPD) | mean | (95% HPD) | mean | (95% HPD) | mean | (95% HPD) | ||||
| HHS | HHS 2 calib | PVs estímate [13.2–24.7] | Neutral | 18.4 | (14.5–22.1) | − | 461 | (365–561) | 87 | (48–133) | 197 | (121–293) | 105 | (82–129) | |
| HHS | HHS 2 calib | Human mtDNA [17.6–32.3] | Neutral | 20.3 | (16–25.1) | −12480.34 | 6.8 | 456 | (358–553) | 78 | (42–123) | 184 | (109–273) | 101 | (76–124) |
| HHS | HHS 2 calib | PVs estimate [5.5–13.6] | Neutral | 12.8 | (11.1–14.3) | −12484.91 | 15.9 | 484 | (396–580) | 130 | (80–190) | 250 | (158–360) | 122 | (100–144) |
| HHS | HHS 2 calib | Mammal genome [2.0–2.4] | Neutral | 3.6 | (3.1–4.1) | −12567.19 | 180.5 | 571 | (488–655) | 423 | (284–578) | 408 | (267–575) | 180 | (158–201) |
| HHS | HHS 2 calib | 4.5-4.5E–11 | Neutral | 20.2 | (14.0–26.8) | −12482.18 | 10.5 | 457 | (357–5559) | 79 | (40–128) | 185 | (105–280) | 101 | (75–128) |
| HHS | HHS 1 calib 500 kya | 4.5-4.5E–11 | Neutral | 13.6 | (8.1–20.3) | −12482.41 | 10.9 | 490 | (389–587) | 125 | (55–206) | 311 | (157–471) | 197 | (95–312) |
| HHS | HHS 1 calib 90 kya | 4.5-4.5E–11 | Neutral | 32.1 | (17.0–49.6) | −12483.21 | 12.5 | 210 | (92–346) | 54 | (23–91) | 134 | (72–203) | 85 | (53–115) |
| ROOA | 90 kya | 4.5-4.5E–11 | Neutral | 80.5 | (40.3–128) | −12481.41 | 8.9 | 85 | (54–115) | 22 | (9–38) | 54 | (24–87) | 34 | (14–56) |
| HHS | HHS 2 calib | PVs estímate [13.2–24.7] | Full | 22.9 | (20.3–25.3) | −19313.69 | 4.7 | 476 | (387–568) | 147 | (101–200) | 246 | (164–343) | 127 | (106–148) |
| HHS | HHS 2 calib | Human mtDNA [17.6–32.3] | Full | 27.5 | (23.5–31.0) | − | 459 | (364–551) | 120 | (79–166) | 216 | (140–300) | 116 | (95–138) | |
| HHS | HHS 2 calib | PVs estimate [5.5–13.6] | Full | 14.4 | (13.1–15.8) | −19333.81 | 44.9 | 519 | (431–607) | 240 | (167–324) | 330 | (212–455) | 152 | (8131–173) |
| HHS | HHS 2 calib | Mammal genome [2.0–2.4] | Full | 4.6 | (4.0–5.2) | −19502.91 | 383.1 | 650 | (571–727) | 616 | (518–705) | 411 | (320–498) | 214 | (8189–232) |
| HHS | HHS 2 calib | 4.5-4.5E–11 | Full | 33.9 | (24.4–43.4) | −19314.23 | 5.8 | 446 | (346–544) | 95 | (53–142) | 186 | (110–271) | 105 | (80–132) |
| HHS | HHS 1 calib 500 kya | 4.5-4.5E–11 | Full | 22.0 | (13.2–31.2) | −19311.87 | 1.1 | 490 | (390–586) | 156 | (78–240) | 320 | (176–473) | 209 | (111–319) |
| HHS | HHS 1 calib 90 kya | 4.5-4.5E–11 | Full | 55.0 | (31.8–82.9) | −19311.82 | 1 | 199 | (99–322) | 63 | (31–100) | 130 | (73–193) | 84 | (54–115) |
| ROOA | 90 kya | 4.5-4.5E–11 | Full | 130.6 | (71.8–206.0) | −19311.94 | 1.2 | 84 | (54–115) | 27 | (12–44) | 55 | (27–88) | 36 | (17–58) |
Note.—Likelihood values in bold are the best-supported models, taken as reference (Ref) for the path sampling Bayes factor (2lnBF) comparison.
Rector et al. (2007).
Fu et al. (2014).
Shah et al. (2010).
Kumar and Subramanian (2002).
Substitution Rate and Divergence Time Estimates with Highest Probability Density (HPD) in Thousands of Years Ago (kya) Inferred for HPV16 Complete (Full) or Selection-Filtered Genome Alignment (Neutral) Either under the HHS Scenario or without Imposing Any Prior on Tree Topology, and without Using Any Time Point Calibrations, and Using Instead Only Substitution Rate Priors.
| Data | Model | Calibration | Uniform prior | ×10−9 (subs/site/yr) | tmrca ABCD (kya) | tmrca A1-4 (kya) | tmrca BCD (kya) | tmrca CD (kya) | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ×10−9 (subs/site/yr) | µ | (95% HPD) | Mean | (95% HPD) | Mean | (95% HPD) | Mean | (95% HPD) | Mean | (95% HPD) | |||
| Full | No topology | No calib | PVs estimate [13.2–24.7] | 18.0 | (12.5–23.8) | 581 | (298–911) | 188 | (109–278) | 391 | (221–589) | 261 | (146–392) |
| Neutral | No topology | No calib | 18.2 | (12.4–24.4) | 340 | (168–573) | 90 | (46–141) | 231 | (121–355) | 156 | (83–247) | |
| Neutral | HHS | No calib | 18.2 | (12.4–24.2) | 361 | (178–592) | 92 | (50–147) | 228 | (123–353) | 145 | (81–222) | |
| Full | No topology | No calib | Human mtDNA [17.6–32.3] | 23.6 | (16.4–31.1) | 440 | (223–693) | 142 | (85–214) | 296 | (167–442) | 198 | (112–296) |
| Neutral | No topology | No calib | 24.0 | (16.4–31.7) | 258 | (124–420) | 68 | (37–107) | 175 | (94–269) | 118 | (61–186) | |
| Neutral | HHS | No calib | 23.9 | (16.4–31.6) | 273 | (136–444) | 70 | (37–108) | 173 | (95–271) | 110 | (60–167) | |
| Full | No topology | No calib | PVs estimate [5.5–13.6] | 9.0 | (5.2–12.8) | 1181 | (532–2037) | 384 | (201–627) | 803 | (400–1330) | 536 | (267–870) |
| Neutral | No topology | No calib | 9.1 | (5.2–13.2) | 698 | (296–1228) | 185 | (85–313) | 185 | (86–312) | 317 | (145–539) | |
| Neutral | HHS | No calib | 9.1 | (5.2–13.1) | 742 | (324–1297) | 189 | (90–320) | 469 | (220–791) | 298 | (141–493) | |
| Full | No topology | No calib | Mammal genome [2.0–2.4] | 2.1 | (1.8–2.4) | 4819 | (2914–7086) | 1564 | (1077–2133) | 3249 | (2108–4458) | 2177 | (1466–3015) |
| Neutral | No topology | No calib | 2.1 | (1.8–2.5) | 2863 | (1576–4415) | 755 | (457–1086) | 1919 | (1206–2741) | 1301 | (784–1916) | |
| Neutral | HHS | No calib | 2.1 | (1.8–2.4) | 3010 | (1686–4582) | 767 | (475–1123) | 1900 | (1187–2738) | 1208 | (761–1691) | |
Rector et al. (2007).
Fu et al. (2014).
Shah et al. (2010).
Kumar and Subramanian (2002).
Summary Statistics of the HPV16 E6 Gene Sequences (n = 1,123) with Geographic Origin of the Sample.
| Metapopulation | bp | S | h | Hd | (95% CI)a | (95% CI)b | MPD | (95% CI)a | (95% CI)b | WIMP | Tajima’s D | (95% CI)b | Fu’s-Li’s D* | (95% CI)b | R2 | (95% CI)b | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sub-Saharan Africa | 213 | 456 | 34 | 32 | 0.754 | (0.712 –0.796) | (0.439 –0.892) | 3.049 | (2.739 –3.359) | (0.815 –7.389) | 1.459 | –1.427 | >0.1 | (–1.642 to 2.053) | 0.047 | –3.459 | (–2.178 to 1.523) | 0.003 | 0.044 | (0.033 –0.148) | 0.100 |
| North Africa | 221 | 456 | 30 | 26 | 0.861 | (0.839 –0.883) | (0.653 –0.918) | 4.985 | (4.52 –5.45) | (1.543 –12.592) | 0.969 | –0.196 | >0.1 | (–1.598 to 2.097) | 0.519 | –3.288 | (–1.959 to 1.311) | 0.466 | 0.081 | (0.035 –0.14) | 0.533 |
| Europe | 161 | 456 | 31 | 25 | 0.734 | (0.688 –0.78) | (0.16 –0.838) | 1.62 | (1.401 –1.839) | (0.194 –4.219) | 1.187 | –2.043 | <0.05 | (–1.62 to 2.05) | 0.001 | –3.825 | (–1.958 to 1.447) | 0.002 | 0.026 | (0.028 –0.164) | 0.017 |
| South Asia | 139 | 456 | 8 | 6 | 0.523 | (0.455 –0.591) | (0.056 –0.797) | 0.998 | (0.831 –1.165) | (0.043 –2.744) | 0.464 | –0.724 | >0.1 | (–1.509 to 2.026) | 0.253 | –0.454 | (–2.302 to 1.297) | 0.281 | 0.062 | (0.026 –0.193) | 0.302 |
| East Asia | 207 | 456 | 31 | 37 | 0.643 | (0.567 –0.719) | (0.495 –0.894) | 3.425 | (3.079 –3.771) | (0.853 –8.014) | 1.712 | 1.211 | >0.1 | (–1.593 to 2.11) | 0.091 | –2.848 | (–2.415 to 1.606) | 0.014 | 0.054 | (0.032 –0.146) | 0.180 |
| Central America | 50 | 456 | 21 | 14 | 0.889 | (0.851 –0.927) | (0.589 –0.925) | 4.551 | (3.641 –5.461) | (1.197 –10.979) | 1.803 | –0.237 | >0.1 | (–1.683 to 2.033) | 0.475 | –1.027 | (–2.568 to 1.511) | 0.167 | 0.103 | (0.052 –0.177) | 0.503 |
| South America | 132 | 402 | 18 | 17 | 0.78 | (0.744 –0.816) | (0.486 –0.893) | 3.184 | (2.774 –3.594) | (0.772 –7.859) | 0.947 | –0.096 | >0.1 | (–1.613 to 1.896) | 0.526 | –0.850 | (–2.371 to 1.543) | 0.185 | 0.087 | (0.035 –0.156) | 0.542 |
Note.—Number of samples (n), base pair (bp), Segregating sites (S), haplotypes (h), haplotype diversity (Hd), mean pairwise difference (MPD), weighted intralineage mean pairwise difference (WIMP) (Hurles et al. 2002), Tajima’s D, Fu’s and Li’s D* and R2 statistics were calculated using DnaSP (Librado and Rozas 2009).
Calculcated from empirical distribution.
Calculated from a coalescent simulation with 1,000 replications and no recombination.
Calculated assuming beta distribution.
HPV16 E6 Gene Major Haplotype Frequencies within Each Lineage and Geographical Background.
| Origin | A1–3 | A1–3 | A4 | B | C | D | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| % | % | % | % | % | % | % | ||||||||
| America | 50 | 19.5 | 40 | 19.0 | 2 | 1.5 | 1 | 1.2 | 4 | 2.6 | 45 | 57.7 | 8 | 21.1 |
| Eurasia | 177 | 68.9 | 118 | 56.2 | 127 | 98.5 | 1 | 1.2 | 7 | 5.0 | 26 | 33.3 | 7 | 18.4 |
| North Africa | 27 | 10.5 | 46 | 21.9 | — | — | 4 | 4.7 | 53 | 37.6 | 4 | 5.0 | 22 | 57.9 |
| Sub-Saharan Africa | 3 | 1.2 | 6 | 2.9 | — | — | 79 | 92.9 | 77 | 54.6 | 3 | 3.8 | 1 | 2.6 |
Major haplotype including HPV16 E6 350G allele.
Major haplotype including HPV16 E6 350T allele.
FCartoon timeline depicting interbreeding and subsequent gene flow of archaic alleles from Neanderthals and Denisovans into modern humans and the proposed sexual transmission of HPV16A lineage to the ancestors of modern human populations in Eurasia. Viral lineages HPV16A, B, C, and D are labeled at the bottom.
Global Prevalence of HPV16A Lineages (Upper Panel) and Proportion of Nonrandomly Introgressed Archaic Alleles into Modern Human Ancestors (Lower Panel).
| Human papillomavirus 16A lineages | Geography | Ref. | ||
|---|---|---|---|---|
| Sub-Saharan Africa | West Eurasia | East Eurasia | ||
| HPV16A1-3 | <5% | 93% | 25-91% | This study |
| HPV16A4 | Absent | ∼1% | 60% | This study |
| Neanderthal ancestry loci | Absent | 64% | 62% | |
| Loci involved in keratinocyte differentiation | Absent | 40–70% | 40–66% | |
| HLA I loci (innate immunity) | <7% | 52–59% | 72–82% | |
| Toll-like receptor loci (innate immunity) | Absent | 15–39% | 17–51% | |
| APOBEC3A deletion (innate immunity) | <1% | 7% | 14-93% | |
Note.—A selection of host’s genome loci is listed, directly involved in the Papillomavirus life-cycle through keratinocyte differentiation and innate immunity.