Louise Zanella, Ismael Riquelme, Kurt Buchegger, Michel Abanto, Carmen Ili, Priscilla Brebi.
Abstract
The Epstein-Barr virus (EBV) infects more than 90% of the human population and plays a key role in the origin and progression of malignant and non-malignant diseases. Many attempts have been made to classify EBV according to clinical or epidemiological information; however, these classifications show frequent incongruences. For instance, they use a small subset of genes for sorting strains but fail to consider the enormous genomic variability and abundant recombinant regions present in the EBV genome. These limitations could lead to overestimation of diversity, alter the tree topology and cause viral types to be misinterpreted during classification. Therefore, a reliable EBV phylogenetic classification that minimizes recombination signals is needed. Recombination events occur 2.5 times more often than mutation events, suggesting that recombination has a much stronger impact than mutation on EBV genomic diversity, detected within common ancestral node positions. The Hierarchical Bayesian Analysis of Population Structure (hierBAPS) differentiated 12 EBV populations, of which seven were monophyletic and five paraphyletic. The populations identified were related to geographic location, and three of them (EBV-p1/Asia/GC, EBV-p2/Asia II/Tumors and EBV-p4/China/NPC) were related to tumor development. We therefore propose a new, consistent and non-simplistic EBV classification that is beneficial for minimizing the recombination signal in phylogeny reconstruction, investigating geographic relationships and even inferring associations with human diseases. These EBV classifications could also be useful in developing diagnostic applications or defining which strains need epidemiological surveillance.
Year: 2019 PMID: 31285478 PMCID: PMC6614506 DOI: 10.1038/s41598-019-45986-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Map of the EBV genome. Circles are shown from inner to outer: the innermost circle displays the coordinates of the reference genome NC_007605; the second circle indicates the regions remaining after removal of poorly aligned sites; the third circle highlights the recombination sites in red; and the outermost circle displays a gene map, in which genes shown outside the circle are transcribed clockwise and those shown inside are transcribed counterclockwise. The genes used in the ML analyses are highlighted by red labels. This figure was generated using GenomeVx (http://wolfe.ucd.ie/GenomeVx/).
Figure 2. Comparison of phylogenetic trees according to the EBV-1 and EBV-2 classification. Phylogenetic ML trees based on sequences of the entire genome (~85000 bp), BLLF1 (2228 bp), EBNA3A (2659 bp), EBNA3B (2286 bp), EBNA3C (2577 bp), EBNA2 (1095 bp), BZLF1 (801 bp), EBNA1 (1019 bp), LMP1 (1088 bp) and LMP2 (1418 bp). The isolates classified as EBV-2 are shown in green and those classified as EBV-1 in yellow. The five trees on top could not be used to distinguish between EBV type 1 and type 2. The red triangles indicate bootstrap values above 80%.
Figure 3. Comparison of phylogenetic trees from unmasked and masked genomes. ML trees based on sequences from the original genome alignment (with putative recombination) and from the masked alignment (without recombination). The isolates classified as EBV-2 are shown in green and those classified as EBV-1 in yellow. The red triangles indicate bootstrap values above 80%. The traditional classification was recovered only in the unmasked tree; masking the recombinant regions in the entire-genome tree made it possible to re-examine the previously identified EBV-1 and EBV-2 clusters.
Figure 4. Comparison of isolate positions in the unmasked and masked entire-genome phylogenies. ML trees of (a) the unmasked genome alignment (sequences with putative recombination sites) and (b) the masked genome alignment (with recombinant regions masked). The blue lines connect identical isolates according to their location in each tree. The isolates classified as EBV-2 by the traditional classification are shown in red. Some isolates showed a slight change in position between the two trees that did not affect their phylogenetic relationship with other isolates. However, other isolates showed radical changes that deeply affected their association with other isolates.
Figure 5. Overlap between population and phylogenetic groups of masked genomes. Branches are colored according to the 12 EBV phylopopulations identified. The red triangles indicate bootstrap values above 80%. BL - Burkitt’s lymphoma, GC - gastric cancer, HL - Hodgkin’s lymphoma, IM - infectious mononucleosis, LC - lung carcinoma, NPC - nasopharyngeal carcinoma and PTLD - post-transplant lymphoproliferative disease.
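The masking step referred to in the captions of Figures 3 and 4 can be illustrated with a minimal Python sketch. This record does not state how the masking was actually performed, so the code is only an assumption about the general idea: alignment columns that fall inside putative recombinant regions are replaced by an ambiguity character before the tree is built. The function name `mask_regions`, the toy sequences and the toy coordinates are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple


def mask_regions(
    alignment: Dict[str, str],
    regions: List[Tuple[int, int]],
    mask_char: str = "N",
) -> Dict[str, str]:
    """Return a copy of the alignment with the given columns masked.

    `regions` holds 1-based, inclusive (start, end) column ranges.
    Hypothetical helper for illustration only, not the authors' pipeline.
    """
    masked = {}
    for name, seq in alignment.items():
        chars = list(seq)
        for start, end in regions:
            # Convert to 0-based indices and clamp to the sequence length.
            first = max(start, 1) - 1
            last = min(end, len(chars))
            for i in range(first, last):
                chars[i] = mask_char
        masked[name] = "".join(chars)
    return masked


# Toy data (assumed, not from the paper): the two isolates differ only
# at column 5, which lies inside the first masked region.
toy_alignment = {
    "isolate_A": "ACGTACGTAC",
    "isolate_B": "ACGTTCGTAC",
}
toy_regions = [(3, 5), (9, 10)]

masked = mask_regions(toy_alignment, toy_regions)
for name, seq in masked.items():
    print(name, seq)
```

In this toy case the only difference between the two isolates lies in a masked region, so the masked sequences become identical. This is the kind of effect that can move an isolate between the unmasked and masked trees.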