Xushen Xiong, Xiaoyu Li, Kun Wang, Chengqi Yi.
Abstract
N1-methyladenosine (m1A) was recently reported to be a chemical modification in mRNA. However, while we identified hundreds of m1A sites in the human transcriptome in a previous work, others have detected only nine sites in cytosolic and mitochondrial mRNAs. Herein, we provide additional evidence that hundreds of m1A sites are present in the human transcriptome. Moreover, we show that both improper bioinformatic tools and poor-quality sequencing data in a previous study led to the failure to identify the majority of m1A sites. Our analysis hence provides an explanation for the discrepancy in the reported prevalence of this newly discovered mRNA mark.
Year: 2018 PMID: 30131401 PMCID: PMC6191714 DOI: 10.1261/rna.067694.118
Source DB: PubMed Journal: RNA ISSN: 1355-8382 Impact factor: 4.942
FIGURE 1. Reanalyses of sequencing data by Li et al. (2017) and Safra et al. (2017) provide clear evidence of the presence of m1A in mRNA. (A) IGV views of the known m1A site at position 9 of a cytosolic and a mitochondrial tRNA, showing enriched misincorporations at the first two positions of sequencing reads. (B) Barplots depicting the distribution of misincorporation events along the first 50 positions of the sequencing reads. Misincorporations at m1A9 sites (in blue) are highly skewed toward the first two positions, whereas misincorporations at mt-rRNA sites (in red) are relatively uniformly distributed. (C) Mismatch rates of the first five positions in sequencing reads of tRNA (left panel) and mRNA (right panel) in the input sample. (D) The mismatch rate of the first base in all reads of the “IP” and “IP + demethylation” samples, plotted along the mRNA transcripts. Upon demethylase treatment, a clear decrease in mutation rate is observed specifically for A but not for T/C/G. Bin size is 10 nt. (E) Distribution of the A/C/G/T sites that pass the detection threshold by Li et al. (2017). Bin size is 10 nt. (F) A “reverse calculation” to evaluate the potential false positives caused by m1A-independent mutations. Mutation rates in the “IP” and “IP + demethylation” samples are artificially exchanged and the modification-calling procedure is performed using the same criteria. Very few sites are detected in this reverse calculation. (G) A known m1A site at position 9 in an isoform of tRNAAsp(GUC) can be identified by the end-to-end alignment mode, but not by the soft-clipping mode. Data taken from Li et al. (2017). (H) IGV view of an m1A site at the 5′ end of the ANKRD13A transcript. While the m1A-induced mismatch rate is decreased upon demethylase treatment (“A” site within the black lines), the mutation likely arising from nontemplated additions remains the same (“C” site 5′ to the m1A site).
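The “reverse calculation” in panel (F) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the `Site` record, the `call_sites` and `swap_samples` helpers, the toy mismatch rates, and the two thresholds (`min_rate`, `min_fold`) are hypothetical and are not the criteria used by Li et al. (2017). The sketch only shows the logic of the control: the same calling rule is applied once to the real data and once after exchanging the “IP” and “IP + demethylation” rates, and the number of calls in the swapped run gives a rough estimate of m1A-independent false positives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    ip_rate: float      # mismatch rate in the "IP" sample
    demeth_rate: float  # mismatch rate in the "IP + demethylation" sample


def call_sites(sites, min_rate=0.05, min_fold=2.0):
    """Call a site when its IP mismatch rate is high enough and is at least
    min_fold times the rate after demethylation (hypothetical thresholds)."""
    return [
        s.name
        for s in sites
        if s.ip_rate >= min_rate and s.ip_rate >= min_fold * s.demeth_rate
    ]


def swap_samples(sites):
    """Exchange the two samples' rates, as in the reverse calculation."""
    return [Site(s.name, ip_rate=s.demeth_rate, demeth_rate=s.ip_rate) for s in sites]


# Toy data, invented for illustration only.
sites = [
    Site("site_a", ip_rate=0.30, demeth_rate=0.05),  # demethylase-sensitive
    Site("site_b", ip_rate=0.02, demeth_rate=0.02),  # background level
    Site("site_c", ip_rate=0.04, demeth_rate=0.12),  # higher after treatment
]

forward_calls = call_sites(sites)                # calls on the real data
reverse_calls = call_sites(swap_samples(sites))  # calls after the exchange

print("forward:", forward_calls)
print("reverse:", reverse_calls)
```

In this toy input the forward run calls the demethylase-sensitive site, while the reverse run calls only the site whose mismatch rate happens to be higher after treatment; a small reverse count relative to the forward count is what the caption describes.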
FIGURE 2. Reanalyses of sequencing data by Safra et al. (2017) provide an explanation as to why they failed to identify the majority of m1A in mRNA. (A) Read coverage for the eight TRMT6/61A sites (detected in both studies) in different sequencing data sets. The inset box in each panel provides a zoomed-in view of read coverage by Safra et al. (2017). (B) Read coverage for the 44 TRMT6/61A sites and 196 “reclassified” TSS sites in different sequencing data sets. All these sites are missed in the study by Safra et al. (2017). The inset box in each panel provides a zoomed-in view of read coverage by Safra et al. (2017). (C) Read duplication levels and rRNA contamination levels of different sequencing data sets. (D) Mismatch rates in the “IP” and “IP + demethylation” samples for m1A1322 of 28S rRNA and m1A947 of 16S mt-rRNA. (E) Mismatch rate in the “IP” and “IP + demethylation” samples for a novel m1A575 site of 12S mt-rRNA. This site is biochemically validated by Li et al. (2017) but missed by Safra et al. (2017). (F) IGV views of the raw sequencing data and misincorporations for the m1A575 site of 12S mt-rRNA. (G) Reanalysis of the SSIII data by Safra et al. (2017) showing poor data quality that does not even support their own TGIRT data. The analysis was performed by aligning the reads to the genome reference using TopHat2. (H) IGV views of one modification site that is claimed by Schwartz (2018) to be a false positive due to its coincidence with a SNP. (I) IGV views of one modification site that is claimed to be a false positive due to its location within a polyC stretch.