Hsiao-Mei Liao, Hebing Liu, Pei-Ju Chin, Bingjie Li, Guo-Chiuan Hung, Shien Tsai, Isaac Otim, Ismail D Legason, Martin D Ogwang, Steven J Reynolds, Patrick Kerchan, Constance N Tenge, Pamela A Were, Robert T Kuremu, Walter N Wekesa, Nestory Masalu, Esther Kawira, Leona W Ayers, Ruth M Pfeiffer, Kishor Bhatia, James J Goedert, Shyh-Ching Lo, Sam M Mbulaiteye.
Abstract
Epstein-Barr virus (EBV) is associated with endemic Burkitt lymphoma (eBL), but the contribution of EBV variants is ill-defined. Studies of EBV whole genome sequences (WGS) have identified phylogroups that appear to be distinct for Asian versus non-Asian EBV, but samples from BL or from Africa, where EBV was first discovered, are under-represented. We conducted a phylogenetic analysis of EBV WGS and LMP-1 sequences obtained primarily from BL patients in Africa, together with representative non-African EBV from other conditions or regions, using data from GenBank, the Sequence Read Archive, or the Genomic Data Commons for the Burkitt Lymphoma Genome Sequencing Project (BLGSP). The aim was to generate data to support the use of a simpler biomarker of geographic or phenotypic associations. We also investigated LMP-1 patterns in 414 eBL cases and 414 geographically matched controls in the Epidemiology of Burkitt Lymphoma in East African children and minors (EMBLEM) study using LMP-1 PCR and Sanger sequencing. Phylogenetic analysis revealed distinct genetic patterns of African versus Asian EBV sequences. We identified 281 single nucleotide variations (SNVs) in the LMP-1 promoter and coding region, which formed 12 unique patterns (A to L). Nine patterns (A, AB, C, D, F, I, J, K and L) predominated in African EBV, of which four (A, AB, D, and F) were found in 92% of BL samples. The predominant patterns were B and G in Asia and H in Europe. EBV positivity in peripheral blood was detected in 95.6% of EMBLEM eBL cases versus 79.2% of the healthy controls (odds ratio [OR] = 3.83; 95% confidence interval 2.06-7.14). LMP-1 was successfully sequenced in 66.7% of the EBV DNA positive cases but in only 29.6% of the EBV DNA positive controls (ORs ranging from 5 to 11 for different patterns). Four LMP-1 patterns (A, AB, D, and K) were detected in 63.1% of the cases versus 27.1% of the controls (OR range: 5.58-11.4). Dual strain EBV infections were identified in both WGS and PCR-Sanger data. In conclusion, EBV from Africa is phylogenetically separate from EBV in Asia. Genetic diversity in LMP-1 formed 12 patterns, which showed promising geographic and phenotypic associations. The presence of multiple strain infection should be considered in efforts to refine or improve EBV markers of ancestry or phenotype.

Lay Summary: Epstein-Barr virus (EBV) infection, a ubiquitous infection, contributes to the etiology of both Burkitt Lymphoma (BL) and nasopharyngeal carcinoma, yet the global distributions of these two cancers vary geographically with no overlap. Genomic variation in EBV is suspected to play a role in the geographical patterns of these EBV-associated cancers, but relatively few EBV samples from BL have been comprehensively studied. We sought to compare phylogenetic patterns of EBV genomes obtained from BL samples in Africa and from tumor and non-tumor samples from elsewhere. We concluded that EBV obtained from BL in Africa is genetically separate from EBV in Asia. Through comprehensive analysis of nucleotide variations in EBV's LMP-1 gene, we describe 12 LMP-1 patterns, two of which (B and G) were found mostly in Asia. Four LMP-1 patterns (A, AB, D, and F) accounted for 92% of EBVs sequenced from BL in Africa. Our results identified extensive diversity of EBV, but BL in Africa was associated with a limited number of variants, which were different from those identified in Asia. Further research is needed to optimize the use of PCR and sequencing to study LMP-1 diversity for classification of EBV variants and for use in epidemiologic studies to characterize geographic and/or phenotypic associations of EBV variants with EBV-associated malignancies, including eBL.
Keywords: Burkitt lymphoma; EBV variants; East Africa; Epstein-Barr virus; LMP-1 patterns; childhood cancer; epidemiology
Year: 2022 PMID: 35340265 PMCID: PMC8948429 DOI: 10.3389/fonc.2022.812224
Source DB: PubMed Journal: Front Oncol ISSN: 2234-943X Impact factor: 6.244
Figure 1. The workflow of EBV genomic variation identification and pattern analysis.
Figure 2. Data source files, data processing, and analysis workflow.
Characteristics of whole EBV genomes analyzed for phylogenetic pattern and EBV patterns.
| Characteristic | Selected genomes N=219 | Total genome set N=730 |
|---|---|---|
| | | |
| | 191 (87.2%) | 605 (82.9%) |
| | 21 (9.6%) | 63 (8.6%) |
| | 7 (3.2%) | 62 (8.5%) |
| | | |
| | 74 (33.8%) | 77 (10.5%) |
| | 145 (66.2%) | 654 (89.5%) |
| | | |
| | 3 (1.4%) | 4 (0.5%) |
| | 128 (58.4%) | 176 (24.1%) |
| | 2 (0.9%) | 2 (0.3%) |
| | 1 (0.5%) | 14 (1.9%) |
| | 19 (8.7%) | 26 (3.6%) |
| | 7 (3.2%) | 15 (2.0%) |
| | 15 (6.8%) | 162 (22.2%) |
| | 10 (4.6%) | 82 (11.2%) |
| | | 26 (3.6%) |
| | 5 (2.3%) | 25 (3.4%) |
| | 4 (1.8%) | 33 (4.5%) |
| | 14 (6.4%) | 115 (15.7%) |
| | 6 (2.7%) | 15 (2.0%) |
| | 4 (1.8%) | 26 (3.6%) |
| | 1 (0.5%) | 2 (0.3%) |
| | | 25 (3.4%) |
| | | |
| | 46 (21.0%) | 85 (11.6%) |
| | 20 (9.1%) | 26 (3.6%) |
| | 79 (36.1%) | 398 (54.5%) |
| | 0 (0%) | 7 (0.9%) |
| | 36 (16.4%) | 98 (13.4%) |
| | 23 (10.5%) | 28 (3.8%) |
| | 0 (0%) | 8 (1.1%) |
| | 5 (2.3%) | 12 (1.6%) |
| | 5 (2.3%) | 5 (0.7%) |
| | 1 (0.5%) | 6 (0.8%) |
| | 4 (1.8%) | 57 (7.8%) |
Of the 730 EBV genomic samples identified, 431 were high-quality, defined as a genome size >170 kbp and fewer than 2,000 ambiguous nucleotides. Because we were specifically interested in genomic patterns of samples from Africa, and because computational capacity limited the alignment of whole genome sequences for a large dataset, we selected 219 EBV samples: all qualifying samples from BL or Africa, plus approximately 35% of the sequences from all other qualifying samples (see ).
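The selection rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`length_bp`, `ambiguous_nt`, `is_bl`, `region`) are hypothetical names, and drawing the roughly 35% subset by seeded random sampling is an assumption, since the text does not say how those sequences were chosen.

```python
import random


def select_genomes(records, fraction=0.35, seed=0):
    """Apply the quality filter, keep all BL or African genomes,
    and add a subset of the remaining qualifying genomes.

    `records` is a list of dicts with hypothetical keys:
    length_bp, ambiguous_nt, is_bl, region.
    """
    # Quality filter from the text: >170 kbp and <2000 ambiguous nucleotides.
    qualifying = [
        r for r in records
        if r["length_bp"] > 170_000 and r["ambiguous_nt"] < 2000
    ]

    def is_priority(r):
        return r["is_bl"] or r["region"] == "Africa"

    priority = [r for r in qualifying if is_priority(r)]
    others = [r for r in qualifying if not is_priority(r)]

    # Assumption: the ~35% subset is drawn at random (not stated in the source).
    k = round(fraction * len(others))
    sampled = random.Random(seed).sample(others, k)
    return priority + sampled
```

Fixing the seed keeps the subset reproducible, which matters when the same selection has to be reused for the alignment and for the tree annotation.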
Figure 3. Phylogenetic tree of whole EBV genomes from samples with various conditions and from different geographic areas. (A) 219 whole EBV genomes; (B) 9 tumor-normal pairs from the BLGSP dataset. The sample conditions are color-coded. The rings, from the innermost to the outermost, annotate the geographic area, EBV type, LMP-1 pattern, and phenotype of each sample. Missing data are shown in tan. The black dots indicate the distance of each sample from the root (center). Scale bar values for distance: (A) 0.006, (B) 0.009. The dominant LMP-1 pattern of each clade is annotated in the inner circle. The color of the extension line of each sample matches the color of its geographic area. Three genomic sequences of EBV type 1 obtained from GenBank, comprising the original NC_007605 derived from the B95-8 cell line and genomic sequencing datasets of the same cell line from 2 other labs, were used as references for analytic classification in panel (B).
Associations of LMP-1 viremia with endemic Burkitt lymphoma in the EMBLEM Study.
| Characteristic | eBL cases (%) | Controls (%) | Crude OR (95% CI) | Adjusted OR (95% CI)* |
|---|---|---|---|---|
| | | | | |
| | 18 (4.4%) | 86 (20.8%) | Ref. | Ref. |
| | 396 (95.6%) | 328 (79.2%) | 5.76 (3.40-9.79) | 3.83 (2.06-7.14) |
| | | | | |
| | 132 (33.3%) | 231 (70.4%) | Ref. | Ref. |
| | 264 (66.7%) | 97 (29.6%) | 4.76 (3.47-6.53) | 8.28 (5.27-13.0) |
| | | | | |
| | 132 (33.3%) | 231 (70.4%) | Ref. | Ref. |
| | 73 (18.4%) | 18 (5.5%) | 7.09 (4.06-12.4) | 11.4 (5.89-22.0) |
| | 43 (10.9%) | 19 (5.8%) | 3.96 (2.21-7.07) | 5.58 (2.62-11.9) |
| | 82 (20.7%) | 28 (8.5%) | 5.12 (3.17-8.27) | 7.68 (4.09-14.4) |
| | 52 (13.1%) | 24 (7.3%) | 3.79 (2.23-6.43) | 7.90 (3.98-15.7) |
| | 14 (3.5%) | 8 (2.4%) | 3.06 (1.25-7.47) | 7.98 (2.43-26.3) |
*Adjusted for sex, age group (5 categories), P. falciparum infection, country, village characteristics (rural and proximity to surface water), and anemia (hemoglobin <11.6 g/dl).
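As a worked check on the crude odds ratios in the table above, the sketch below computes an odds ratio and a Wald (Woolf, log-scale) 95% confidence interval from the four cell counts. The Wald interval is an assumption made for illustration; the source does not state which method produced the crude estimates, and the adjusted estimates come from a multivariable model that cannot be reproduced from the counts alone. The function name and argument names are hypothetical.

```python
import math


def crude_odds_ratio(exposed_cases, unexposed_cases,
                     exposed_controls, unexposed_controls, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table,
    using the Woolf standard error of the log odds ratio."""
    odds_ratio = (exposed_cases * unexposed_controls) / (
        unexposed_cases * exposed_controls
    )
    se_log_or = math.sqrt(
        1 / exposed_cases + 1 / unexposed_cases
        + 1 / exposed_controls + 1 / unexposed_controls
    )
    log_or = math.log(odds_ratio)
    return (
        odds_ratio,
        math.exp(log_or - z * se_log_or),
        math.exp(log_or + z * se_log_or),
    )


# EBV DNA in peripheral blood: 396 positive and 18 negative cases,
# 328 positive and 86 negative controls (counts taken from the table).
estimate, lower, upper = crude_odds_ratio(396, 18, 328, 86)
print(f"{estimate:.2f} ({lower:.2f}-{upper:.2f})")
```

With these counts the result is about 5.77 (3.40-9.79), close to the reported crude value of 5.76 (3.40-9.79). The same call with the sequencing counts (264, 132, 97, 231) gives an odds ratio of about 4.76.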
Figure 4. Phylogenetic tree of LMP-1 sequences from samples with various conditions and from different geographic areas. (A) 668 available LMP-1 sequences; (B) a subset of 360 LMP-1 sequences, from 194 African samples and 166 non-African samples, shown to reduce graphic density for better visualization. The rings, from the innermost to the outermost, annotate the geographic area, EBV type, LMP-1 pattern, and phenotype of each sample. The black dots indicate the distance of each sample from the center. Scale bar values for distance: (A) 0.022, (B) 0.035. The dominant LMP-1 pattern of each clade is annotated in the inner circle. The color of the extension line of each sample matches the color of its geographic area.
LMP-1 promoter and coding pattern variations in the EBV genomes in 114 primary tumor or cell lines with WGS data.
| Characteristic | Number | Percentage |
|---|---|---|
| | 38 | 33.3% |
| | 2 | 1.8% |
| | 18 | 15.7% |
| | 28 | 24.6% |
| | 21 | 18.4% |
| | 4 | 3.5% |
| | 1 | 0.8% |
| | 2 | 1.8% |
| | 114 | 100% |
Details of samples included in this analysis are in .
EBV LMP-1 patterns in 23 subjects with whole genome sequence extracted from BLGSP, including 14 with samples in EMBLEM who were also genotyped in this study using the Sanger method.
| Case ID | WGS depth (Tumor) | WGS depth (Normal) | EBV type | LMP-1 pattern, WGS (Tumor) | LMP-1 pattern, WGS (Normal) | LMP-1 pattern, Sanger (Normal) |
|---|---|---|---|---|---|---|
| | | | | | | |
| | 1525.09 | 10.20 | 1 | F | F | – |
| | 549.83 | 569.85 | 1 | D | D | – |
| | 773.10 | 724.81 | 2 | AI | AI | – |
| | 1129.61 | 144.91 | 1 | AIII | AIII | – |
| | 1275.68 | 240.18 | 1 | D | D | – |
| | 655.55 | 27.22 | 1 | ABIII | ABIII | – |
| | 1708.80 | 3848.28 | 1/2 | AIII | AVI | – |
| | 2972.55 | 358.95 | 1 | AIII | AIII | – |
| | 3965.62 | 15.03 | 1 | F | n/a | – |
| | | | | | | |
| | 1492.4 | 1.1 | 2 | D | n/a | D |
| | 645.5 | 1.0 | 2 | D | n/a | D |
| | 1055.9 | 1.6 | 1 | D | n/a | AB7 |
| | 1766.1 | 1.3 | 1 | D | n/a | A3 |
| | 1424.7 | 4.3 | n/a | ABII | n/a | ABII |
| | 3.6 | 1.8 | n/a | NA | n/a | n/a |
| | 2705.2 | 15.2 | 1 | AIII | AIII | n/a |
| | 3053.0 | 32.6 | 1 | I | I | K |
| | 2914.7 | 3.6 | 1 | I | n/a | n/a |
| | 1526.9 | 2.9 | 1 | D | n/a | D |
| | 2818.9 | 2.5 | 1 | ABII | n/a | AB7 |
| | 1116.6 | 3.4 | 1 | D | n/a | n/a |
| | 3558.9 | 2.9 | 1 | I | n/a | I |
| | 1776.2 | 3.1 | 1 | D | n/a | K |
*The last three digits of the samples used to annotate the phylogenetic tree in .
#These samples could not be classified for EBV type due to insufficient sequence coverage in EBNA-2.
n/a, Not applicable because sequence data was insufficient to genotype samples for EBV LMP-1 patterns.
Demographic and clinical characteristics of the participants in the EMBLEM Study population.
| Characteristic | Cases, n (%) | Controls, n (%) | P value* |
|---|---|---|---|
| | 414 (50.0%) | 414 (50.0%) | |
| | | | 0.050 |
| | 168 (40.6%) | 196 (47.3%) | |
| | 246 (59.4%) | 218 (52.7%) | |
| | 7.24 (3.66) | 7.73 (3.33) | 0.043 |
| | | | 0.073 |
| | 38 (9.2%) | 23 (5.6%) | |
| | 107 (25.8%) | 90 (21.7%) | |
| | 125 (30.2%) | 131 (31.6%) | |
| | 77 (18.6%) | 101 (24.4%) | |
| | 67 (16.2%) | 69 (16.7%) | |
| | | | 0.52 |
| | 213 (51.5%) | 214 (51.7%) | |
| | 89 (21.5%) | 100 (24.1%) | |
| | 112 (27.0%) | 100 (24.1%) | |
*Computation of percentages includes categories with missing information, but computation of p-values excludes those subjects.