Matteo Chiara, Caterina Manzari, Claudia Lionetti, Rosella Mechelli, Eleni Anastasiadou, Maria Chiara Buscarinu, Giovanni Ristori, Marco Salvetti, Ernesto Picardi, Anna Maria D'Erchia, Graziano Pesole, David S Horner.
Abstract
Epstein-Barr virus (EBV) latently infects the majority of the human population and is implicated as a causal or contributory factor in numerous diseases. We sequenced 27 complete EBV genomes from a cohort of Multiple Sclerosis (MS) patients and healthy controls from Italy; no variants showed a statistically significant association with MS. Taking advantage of the availability of ∼130 EBV genomes with known geographical origins, we reveal a striking geographic distribution of EBV sub-populations with distinct allele frequency distributions. We discuss mechanisms that potentially explain these observations, and their implications for understanding the association of EBV with human disease.
Keywords: Epstein-Barr virus; comparative genomics; genome sequence; population structure
Year: 2016 PMID: 27635051 PMCID: PMC5203774 DOI: 10.1093/gbe/evw226
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Accession numbers of the novel EBV genomes assembled in this study and basic statistics concerning the number of variants detected
| Genome | Acc no | Mean coverage (X) | Common variants | Private variants |
|---|---|---|---|---|
| CAR | ERS1100719 | 2,048.89 | 852 | 59 |
| PP | ERS1100731 | 2,227.2 | 741 | 126 |
| MV | ERS1100733 | 1,947.52 | 856 | 67 |
| BL | ERS1100735 | 2,116.65 | 904 | 2 |
| VL | ERS1100730 | 1,815.83 | 911 | 9 |
| NM | ERS1100715 | 1,975.69 | 900 | 0 |
| MC | ERS1100718 | 1,680.85 | 883 | 37 |
| MFA | ERS1100723 | 1,968 | 1,031 | 27 |
| GV | ERS1100726 | 2,059.6 | 949 | 97 |
| GIOVS | ERS1100717 | 2,159.79 | 1,033 | 29 |
| CS | ERS1100724 | 2,167.5 | 903 | 0 |
| GR | ERS1100714 | 2,001.78 | 971 | 110 |
| PT | ERS1100716 | 2,107.06 | 997 | 78 |
| BA | ERS1100728 | 1,735.74 | 1,134 | 7 |
| MM | ERS1100734 | 2,331.46 | 1,056 | 69 |
| LOL | ERS1100725 | 1,956.2 | 899 | 0 |
| IM | ERS1100727 | 2,078.37 | 874 | 108 |
| GF | ERS1100729 | 1,974.4 | 909 | 28 |
| BR | ERS1100721 | 1,952.53 | 1,130 | 2 |
| CM | ERS1100732 | 1,742.88 | 1,043 | 8 |
| LUL | ERS1100722 | 1,883.03 | 894 | 0 |
| TM | ERS1100713 | 2,018.29 | 891 | 7 |
| SC | ERS1100710 | 1,882.56 | 1,131 | 194 |
| SA | ERS1100711 | 2,027.31 | 803 | 0 |
| RT | ERS1100712 | 2,199.21 | 1,013 | 61 |
| MST | ERS1100720 | 2,031.73 | 897 | 43 |
| CAS | ERS1100709 | 1,849.45 | 960 | 70 |
Fig. 1. Population structure, PCA and phenetic clustering of EBV genome sequences. (A) Barplot displaying the probability of provenance as inferred by Structure for all 127 EBV genomes considered in this study; geographic origins are shown for each isolate. (B) Scatterplot of the PCA of 75 “pure” genomes. (C) Phenetic tree of the “pure” genomes. Different groups are indicated by colors and the root position is arbitrary. “Pure” genomes are defined as those to which Structure assigned a ≥ 90% probability of provenance from a single population. Colors are consistent between panels A, B and C.
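The "pure" genome criterion in the caption above can be illustrated with a minimal sketch. It assumes the per-genome membership probabilities have already been parsed into a dictionary; the function name `pure_genomes`, the isolate labels, and all probability values are hypothetical and are not data from the study. Only the 90% threshold comes from the caption.

```python
from typing import Dict, List

# Threshold taken from the figure caption: >= 90% probability of
# provenance from a single population.
PURE_THRESHOLD = 0.90


def pure_genomes(
    membership: Dict[str, List[float]],
    threshold: float = PURE_THRESHOLD,
) -> Dict[str, int]:
    """Map each "pure" genome to the index of its dominant population.

    A genome counts as pure when its largest membership probability
    is at least `threshold`; all other genomes are treated as admixed
    and left out of the result.
    """
    pure = {}
    for genome, probs in membership.items():
        top = max(probs)
        if top >= threshold:
            pure[genome] = probs.index(top)
    return pure


# Hypothetical membership probabilities (illustrative only).
example = {
    "isolate_A": [0.97, 0.02, 0.01],  # pure, population 0
    "isolate_B": [0.55, 0.40, 0.05],  # admixed, excluded
    "isolate_C": [0.05, 0.90, 0.05],  # exactly at the threshold, kept
}

print(pure_genomes(example))
```

Under this reading, only the genomes returned by such a filter would be carried forward to the PCA and phenetic clustering shown in panels B and C.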
Fig. 2. NeighborNet and population allele frequency relationships. (A) NeighborNet analysis of 69 non-admixed representatives of inferred EBV sub-populations. NeighborNet resolves the same clusters of non-admixed genomes as Structure and phenetic clustering, highlighting conflicts that correspond to allele types shared between sub-populations. (B) Allele frequency bootstrap tree of possible relationships between inferred EBV sub-populations. The tree was estimated using Treemix (Pickrell and Pritchard 2012) with default parameters (no migration, no linkage disequilibrium).