Weiling Li, Lei Yang, Robert S Harris, Lin Lin, Thomas L Olson, Cait E Hamele, David J Feith, Thomas P Loughran, Mary Poss.
Abstract
BACKGROUND: Large granular lymphocyte (LGL) leukemia is an uncommon cancer characterized by sustained clonal proliferation of LGL cells. Antibodies reactive to retroviruses have been documented in the serum of patients with LGL leukemia. To date, neither culture nor molecular approaches have succeeded in identifying a retrovirus.
Keywords: Genomic insertion; HERV-K; Large granular lymphocyte leukemia; Retrovirus; Visualization tool
Year: 2019 PMID: 31208405 PMCID: PMC6580525 DOI: 10.1186/s12920-019-0549-9
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 Utilizing long insert mate pair reads to localize retrovirus integrations. The reference human genome is shown as a blue line; the location of a novel retrovirus inserted in a patient sequence (orange) is indicated by a dotted vertical orange line. Long insert mate pair reads are linked by gray dotted lines: the read derived from the new retrovirus, which will not map, is shown in orange, and its mate, which maps to the human reference genome, is shown in blue. Depending on the length of the retrovirus, which is typically 6–10 kbp, some mate pairs may span the entire inserted virus, so that both mates originate from the host (light blue); such pairs map at a distance shorter than the expected insert distribution of 5–12 kbp. A retrovirus insertion site is suggested by a combination of several features of mate pair mapping, including short insert intervals and discordant or broken mate pairs. The insert length and depth of mapped reads are key signals in our retrovirus insertion pipeline (see Additional file 1: Supplementary Methods; Figure S1). The unmapped reads (orange in the figure) from discordant mate pairs at each called insertion site are assembled and used to determine the sequence of a candidate retrovirus
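The mapping signals described in the Fig. 1 caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline (which is described in Additional file 1). The `MatePair` record, the category labels, and the use of the 5–12 kbp range as hard thresholds are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Expected insert range taken from the caption (5-12 kbp). Treating it as a
# pair of hard cutoffs is an assumption of this sketch.
EXPECTED_MIN_BP = 5_000
EXPECTED_MAX_BP = 12_000


@dataclass
class MatePair:
    """Hypothetical record for one mate pair; None means the read did not map."""
    chrom1: Optional[str]
    pos1: Optional[int]
    chrom2: Optional[str]
    pos2: Optional[int]


def classify(pair: MatePair) -> str:
    """Assign one mate pair to an illustrative mapping category."""
    mapped1 = pair.chrom1 is not None
    mapped2 = pair.chrom2 is not None
    if not mapped1 and not mapped2:
        return "unmapped"
    if mapped1 != mapped2:
        # One mate maps to the host, the other does not
        # (for example, a read derived from a novel retrovirus).
        return "broken"
    if pair.chrom1 != pair.chrom2:
        return "discordant"
    span = abs(pair.pos2 - pair.pos1)
    if span < EXPECTED_MIN_BP:
        # Both mates map to the host but closer together than expected,
        # as when a pair spans an entire inserted virus.
        return "short"
    if span > EXPECTED_MAX_BP:
        return "long"
    return "concordant"


def summarize(pairs) -> Counter:
    """Count mate pairs per category within a candidate window."""
    return Counter(classify(p) for p in pairs)
```

A window with an excess of `short` and `broken` pairs relative to `concordant` pairs would be a candidate for closer inspection. Any real threshold on those counts would have to come from the data.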
Table 1 Prevalence (%) of polymorphic HERV-K proviruses in LGL patients and in individuals from the five super-populations represented in the KGP data
| HERV-K (hg19 coordinate) | LGL | ELGL | KGP | AFR | AMR | EAS | EUR | SAS |
|---|---|---|---|---|---|---|---|---|
| chr1:75842771–75849143 | 70.6 | 70.0 | 42.9 | 26.8 | 56.5 | 6.0 | 68.9 | 66.8 |
| chr3:148281477–148285396 | 49.0 | 52.5 | 41.9 | 38.9 | 42.6 | 45.0 | 46.5 | 37.4 |
| chr12:58721242–58730698 | 94.1 | 97.5 | 70.7 | 58.9 | 78.4 | 60.0 | 87.3 | 75.5 |
| chr19:21841536–21841542 | 5.9 | 7.5 | 27.0 | 39.2 | 11.9 | 32.2 | 10.7 | 32.4 |
| chr19:22414379–22414380 | 51.0 | 47.5 | 67.8 | 89.2 | 60.8 | 56.9 | 55.8 | 67.2 |
AFR African, AMR Admixed American, EAS East Asian, EUR European, SAS South Asian, LGL All LGL patients in this study (51 total), ELGL T-LGL patients with European ancestry (T-LGL-EUR, 40 total)
Fig. 2 Linear discriminant analysis based on HERV-K status of T-LGL-EUR patients and KGP super populations. Linear discriminant analysis (LDA) was conducted on data generated by a comprehensive analysis of polymorphic HERV-Ks in an individual genome [18]. Data are based on a the three HERV-K states of ‘absence’, ‘solo LTR’, and ‘provirus’, or b the n/T ratio of each known HERV-K provirus, for T-LGL leukemia patients of European ancestry and the 28 individuals from KGP super populations with high coverage data. The ratio indicates the proportion of k-mers derived from a person’s WGS dataset (n) that are a 100% match to a set of unique k-mers (T) characterizing each HERV-K provirus. The improved resolution of T-LGL-EUR patients from other individuals using n/T likely reflects that alleles of HERV-K contribute to population differentiation. The symbols and colors for each KGP population and for T-LGL-EUR leukemia patients are indicated in the key on the right
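The n/T ratio in the Fig. 2 caption can be sketched in a few lines of Python. This is one reading of the caption, not the published tool: T is taken as the number of unique k-mers that characterize a provirus, and n as how many of those occur with an exact match in a person's reads. The function names are hypothetical, and reverse complements, sequencing errors, and coverage are ignored.

```python
def kmers(seq: str, k: int) -> set:
    """Return the set of all k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def n_over_t(reads, unique_kmers: set, k: int) -> float:
    """Fraction of a provirus's unique k-mers (T) seen exactly in the reads (n).

    reads: iterable of read sequences from one individual (toy input here).
    unique_kmers: k-mers assumed to be specific to one HERV-K provirus.
    """
    if not unique_kmers:
        return 0.0
    observed = set()
    for read in reads:
        # Only exact (100%) matches count, mirroring the caption.
        observed |= kmers(read, k) & unique_kmers
    return len(observed) / len(unique_kmers)


# Toy example with made-up sequences, not data from the study.
toy_unique = {"ACGT", "CGTA", "GTAC", "TTTT"}
toy_ratio = n_over_t(["ACGTA"], toy_unique, k=4)
```

Under this reading, a ratio near 1 suggests the provirus is present. A ratio near 0 is what a solo LTR or an empty site would be expected to give, and intermediate values might reflect allelic variation.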
Fig. 3 Histogram of the number of polymorphic HERV-K proviruses identified in LGL leukemia patients compared to individuals of European origin from KGP. Data are shown for 51 LGL patients (blue) and for the subset of 40 patients with T-LGL leukemia of European ancestry (T-LGL-EUR, orange). Data for the 505 EUR individuals (gray) from the KGP data are from Li et al. [18]
Fig. 4 The prevalence of combinations of polymorphic HERV-K proviruses in KGP populations and T-LGL-EUR leukemia patients. The combinations of polymorphic HERV-K proviruses evaluated are indicated at the top right of each panel. a The prevalence of three polymorphic HERV-K proviruses that include chr12:58721242–58730698 in KGP individuals and T-LGL-EUR patients. b The prevalence of three polymorphic HERV-K proviruses, excluding chr12:58721242–58730698, in KGP individuals and T-LGL-EUR leukemia patients. Coordinates are referenced to hg19. Bubble size is proportional to the number of individuals in the population, and the color gradient represents prevalence from 0 to 100%. Absolute values are given in the text on the right for each population. KGP population abbreviations are given in Table 1, and additional information can be found at http://www.internationalgenome.org/category/population/
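The prevalence of a combination, as plotted in Fig. 4, can be computed from per-individual carrier calls. The sketch below assumes that "prevalence of a combination" means the percentage of individuals carrying every provirus in that combination; the caption does not spell this out. The data layout and names are hypothetical, and the example values are toy data, not results from the study.

```python
def combination_prevalence(carriers: dict, combination) -> float:
    """Percentage of individuals carrying all proviruses in `combination`.

    carriers: maps an individual ID to the set of provirus IDs called as
              present in that individual (hypothetical layout).
    combination: iterable of provirus IDs that must all be present.
    """
    if not carriers:
        return 0.0
    wanted = set(combination)
    hits = sum(1 for sites in carriers.values() if wanted <= sites)
    return 100.0 * hits / len(carriers)


# Toy population of four individuals with made-up calls.
toy_population = {
    "ind1": {"chr1", "chr3", "chr12"},
    "ind2": {"chr1", "chr12"},
    "ind3": {"chr1", "chr3", "chr12", "chr19a"},
    "ind4": set(),
}
toy_prevalence = combination_prevalence(toy_population, ["chr1", "chr3", "chr12"])
```

Running the same function once per population gives the values a bubble plot of this kind would encode as color, with population size setting the bubble area.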