Aarif M N Batcha, Stefanos A Bamopoulos, Paul Kerbs, Ashwini Kumar, Vindi Jurinovic, Maja Rothenberg-Thurley, Bianka Ksienzyk, Julia Philippou-Massier, Stefan Krebs, Helmut Blum, Stephanie Schneider, Nikola Konstandin, Stefan K Bohlander, Caroline Heckman, Mika Kontro, Wolfgang Hiddemann, Karsten Spiekermann, Jan Braess, Klaus H Metzeler, Philipp A Greif, Ulrich Mansmann, Tobias Herold.
Abstract
The patho-mechanism of somatic driver mutations in cancer usually involves transcription, but the proportion of mutations and wild-type alleles transcribed from DNA to RNA is largely unknown. We systematically compared the variant allele frequencies of recurrently mutated genes in DNA and RNA sequencing data of 246 acute myeloid leukaemia (AML) patients. We observed that 95% of all detected variants were transcribed while the rest were not detectable in RNA sequencing with a minimum read-depth cut-off (10x). Our analysis focusing on 11 genes harbouring recurring mutations demonstrated allelic imbalance (AI) in most patients. GATA2, RUNX1, TET2, SRSF2, IDH2, PTPN11, WT1, NPM1 and CEBPA showed significant AIs. While the effect size was small in general, GATA2 exhibited the largest allelic imbalance. By pooling heterogeneous data from three independent AML cohorts with paired DNA and RNA sequencing (N = 253), we could validate the preferential transcription of GATA2-mutated alleles. Differential expression analysis of the genes with significant AI showed no significant differential gene and isoform expression for the mutated genes, between mutated and wild-type patients. In conclusion, our analyses identified AI in nine out of eleven recurrently mutated genes. AI might be a common phenomenon in AML which potentially contributes to leukaemogenesis.
Year: 2019 PMID: 31409822 PMCID: PMC6692371 DOI: 10.1038/s41598-019-48167-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Flow diagram of primary and validation cohorts. The dotted blue boxes indicate the general criteria applied to exclude genes and samples. The 11 genes included in the analyses were PTPN11, U2AF1, IDH2, FLT3, SRSF2, TET2, RUNX1, GATA2, CEBPA, WT1 and NPM1.
Figure 2. RNA-Seq read depths of all detected variants. (a) RNA-Seq read depths grouped by variant class. (c) RNA-Seq read depth of transcribed variants (variants detected in both DNA and RNA) grouped by variant genotype. (b,d) Read depth distribution based on variant groups.
Figure 3. Variant allele frequency (VAF) differences of transcribed and DNA-exclusive variants (2,606) including recurrent mutations (284) for SNVs (a) and INDELs (b). Expected and observed RNA variant read depths of SNVs (c) and INDELs (d). The diagonal lines represent the expected DNA vs. RNA trend in terms of VAFs (a,b) and RNA variant read depths (c,d). The genotype conversions AB → AA and AB → BB represent allele-specific transcript abundance of the wild-type and mutant allele, respectively. The BB → AB genotype changes observed might be artefacts arising from the arbitrary definition of homozygous and heterozygous variants. We excluded regions with DNA VAF < 2% and regions with a BB → AA genotype change.
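The genotype comparison described in this caption can be illustrated with a minimal Python sketch. It uses the 10x RNA read-depth cut-off from the abstract and the two exclusion criteria stated in the caption. The genotype thresholds (`HOM_REF_MAX`, `HOM_ALT_MIN`) and the function names are hypothetical, since the source describes the homozygous/heterozygous definition as arbitrary and does not give the values used. Treating variants below the depth cut-off as "not assessable" is likewise an assumption.

```python
MIN_RNA_DEPTH = 10   # minimum RNA read depth (10x), as stated in the abstract
MIN_DNA_VAF = 0.02   # regions with DNA VAF < 2% are excluded (caption)

# Hypothetical genotype thresholds; the source does not state the values used.
HOM_REF_MAX = 0.02   # VAF below this is called AA (wild-type only)
HOM_ALT_MIN = 0.90   # VAF at or above this is called BB (mutant only)


def vaf(alt_reads: int, total_reads: int) -> float:
    """Variant allele frequency: mutant reads divided by all reads at the site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def genotype(v: float) -> str:
    """Map a VAF to a genotype label using the hypothetical thresholds."""
    if v < HOM_REF_MAX:
        return "AA"
    if v >= HOM_ALT_MIN:
        return "BB"
    return "AB"


def classify(dna_alt: int, dna_depth: int, rna_alt: int, rna_depth: int) -> str:
    """Return the DNA->RNA genotype change, or the reason for exclusion."""
    dna_vaf = vaf(dna_alt, dna_depth)
    if dna_vaf < MIN_DNA_VAF:
        return "excluded: DNA VAF < 2%"
    if rna_depth < MIN_RNA_DEPTH:
        return "not assessable: RNA depth < 10x"
    change = f"{genotype(dna_vaf)}->{genotype(vaf(rna_alt, rna_depth))}"
    if change == "BB->AA":
        return "excluded: BB->AA"
    return change
```

For example, `classify(50, 100, 0, 40)` returns `"AB->AA"`: a heterozygous DNA variant for which only the wild-type allele is seen in RNA.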
Figure 4. Weighted allelic imbalance (WAI) of recurrent mutations per gene in the AMLCG cohort for SNVs (a) and INDELs (b). WAI of recurrent mutations per mutation type in the AMLCG cohort for SNVs (c) and INDELs (d). The dotted vertical line at a WAI of 1 indicates no allelic imbalance between the variants in DNA and RNA. WAI > 1 indicates preferential mutant transcript abundance and WAI < 1 indicates preferential wild-type transcript abundance.
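How a ratio-type imbalance score centred on 1 can be read is shown in the short Python sketch below. This record does not give the WAI formula, so the depth-weighted mean of RNA-to-DNA VAF ratios used here is a hypothetical stand-in and not the authors' definition. The function names and the tolerance parameter are assumptions made for illustration.

```python
from typing import Iterable, Tuple


def weighted_imbalance(variants: Iterable[Tuple[float, float, int]]) -> float:
    """Depth-weighted mean of RNA VAF / DNA VAF ratios.

    Each variant is (dna_vaf, rna_vaf, rna_depth). This is an illustrative
    stand-in, not the WAI formula of the paper, which is not given here.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for dna_vaf, rna_vaf, rna_depth in variants:
        if dna_vaf <= 0 or rna_depth <= 0:
            continue  # a ratio is undefined without DNA signal or RNA coverage
        weighted_sum += rna_depth * (rna_vaf / dna_vaf)
        total_weight += rna_depth
    if total_weight == 0:
        raise ValueError("no usable variants")
    return weighted_sum / total_weight


def interpret(score: float, tol: float = 1e-9) -> str:
    """Read a score relative to 1, following the caption's convention."""
    if abs(score - 1.0) <= tol:
        return "no allelic imbalance"
    if score > 1.0:
        return "preferential mutant transcript abundance"
    return "preferential wild-type transcript abundance"
```

Weighting by RNA read depth lets well-covered variants contribute more than sparsely covered ones. Other weighting schemes are equally plausible.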
Figure 5. Weighted allelic imbalance of recurrent mutations per gene among the pooled DKTK, TCGA and HELSINKI cohorts for SNVs (a) and INDELs (b).
Figure 6. Weighted allelic imbalance of common SNPs in the AMLCG cohort without recurrent mutations in the respective genes.
Figure 7. Gene-level and transcript-level differential expression calculated with limma after precision-weighting with voom for all recurrently mutated genes with a significant WAI in the AMLCG cohort. The green boxes indicate gene fold change and the black boxes indicate the different transcript isoforms. Dots below or above the bars represent recurrent mutations present within the transcripts. Crosses represent significant fold change differences (adjusted p value < 0.05). The plot is intended as a visual representation of significant fold change differences and of the location of mutations within the transcripts; transcript identifiers were therefore removed.