Fabio Cp Navarro1,2, Jacob Hoops1,2, Lauren Bellfy3, Eliza Cerveira3, Qihui Zhu3, Chengsheng Zhang3, Charles Lee3,4, Mark B Gerstein1,2,5.
Abstract
Long interspersed nuclear element 1 (LINE-1) is a primary source of genetic variation in humans and other mammals. Despite its importance, LINE-1 activity remains difficult to study because of its highly repetitive nature. Here, we developed and validated a method called TeXP to gauge LINE-1 activity accurately. TeXP builds mappability signatures from LINE-1 subfamilies to deconvolve the effect of pervasive transcription from autonomous LINE-1 activity. In particular, it apportions the multiple reads aligned to the many LINE-1 instances in the genome into these two categories. Using our method, we evaluated well-established cell lines, cell-line compartments and healthy tissues and found that the vast majority (91.7%) of transcriptome reads overlapping LINE-1 derive from pervasive transcription. We validated TeXP by independently estimating the levels of LINE-1 autonomous transcription using ddPCR, finding high concordance. Next, we applied our method to comprehensively measure LINE-1 activity across healthy somatic cells, while backing out the effect of pervasive transcription. Unexpectedly, we found that LINE-1 activity is present in many normal somatic cells. This finding contrasts with earlier studies showing that LINE-1 has limited activity in healthy somatic tissues, except for neuroprogenitor cells. Interestingly, we found that the amount of LINE-1 activity was associated with the amount of cell turnover, with tissues with low cell turnover rates (e.g. the adult central nervous system) showing lower LINE-1 activity. Altogether, our results show how accounting for pervasive transcription is critical to accurately quantify the activity of highly repetitive regions of the human genome.
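The deconvolution idea in the abstract can be illustrated with a minimal sketch. It assumes a simple linear mixture model: the read counts observed per LINE-1 subfamily are a non-negative combination of one "pervasive" signature and one "autonomous L1Hs" signature. The signature values, the read counts, and the function names are hypothetical, and the solver (projected gradient descent for non-negative least squares) is an illustrative stand-in, not necessarily the estimator TeXP itself uses.

```python
# Hypothetical mappability signatures (each column sums to 1).
# Rows are subfamilies: L1Hs, L1PA2, L1PA3, L1PA4.
# Column 0: where reads from pervasive transcription are expected to map.
# Column 1: where reads from autonomous L1Hs transcripts are expected to map.
SIGNATURES = [
    [0.05, 0.70],
    [0.15, 0.20],
    [0.30, 0.07],
    [0.50, 0.03],
]

# Hypothetical observed read counts per subfamily (same row order).
OBSERVED = [115.0, 155.0, 277.0, 453.0]


def deconvolve(signatures, observed, steps=500):
    """Estimate non-negative read counts per source by projected gradient descent.

    Minimizes 0.5 * ||S w - y||^2 subject to w >= 0.
    """
    n_rows, n_cols = len(signatures), len(signatures[0])
    weights = [0.0] * n_cols
    # The squared Frobenius norm bounds the largest eigenvalue of S^T S,
    # so its inverse is a safe step size.
    step = 1.0 / sum(v * v for row in signatures for v in row)
    for _ in range(steps):
        residual = [
            sum(signatures[i][j] * weights[j] for j in range(n_cols)) - observed[i]
            for i in range(n_rows)
        ]
        gradient = [
            sum(signatures[i][j] * residual[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        weights = [max(0.0, weights[j] - step * gradient[j]) for j in range(n_cols)]
    return weights


def pervasive_fraction(weights):
    """Share of reads attributed to the pervasive signature (column 0)."""
    return weights[0] / sum(weights)


estimated = deconvolve(SIGNATURES, OBSERVED)
print("reads per source (pervasive, autonomous L1Hs):", estimated)
print("pervasive fraction:", pervasive_fraction(estimated))
```

In this toy setup most reads are attributed to the pervasive signature even though L1Hs shows a non-trivial raw count, which is the kind of correction the abstract describes.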
Year: 2019 PMID: 31425522 PMCID: PMC6715295 DOI: 10.1371/journal.pcbi.1007293
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. As pervasive transcription is a major factor leading to reads mapping to L1 instances, TeXP functions as an approach to decouple pervasive transcription from autonomous transcription.
(A) The number of reads mapped to LINE-1 subfamilies is proportional to the number of bases annotated as the subfamily for most RNA sequencing experiments. Point colors represent the subfamily average identity to LINE-1 consensus. (B) Healthy human tissues show varied distributions of the genomic-transcriptomic correlation. (C) Pipeline chart describes the TeXP approach.
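The proportionality described in panel (A) can be checked with a simple correlation between annotated bases and mapped reads per subfamily. The sketch below uses a plain Pearson correlation on made-up numbers; the values and the function name are hypothetical and are not taken from the paper's data.

```python
import math

# Hypothetical per-subfamily values (same order in both lists).
ANNOTATED_BASES = [1.0e6, 2.5e6, 4.0e6, 8.0e6]   # bases annotated as each subfamily
MAPPED_READS = [1100.0, 2400.0, 4300.0, 7900.0]  # RNA-seq reads mapped to each subfamily


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson(ANNOTATED_BASES, MAPPED_READS)
print("genomic-transcriptomic correlation:", r)
```

A correlation close to 1 would be consistent with reads arising mostly from pervasive transcription, since read counts would then simply track how much of the genome each subfamily occupies.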
Fig 2. Quantification and validation of L1Hs autonomous transcription in human cell lines.
(A) The proportions of reads emanating from pervasive transcription and from the L1P1, L1PA2, L1PA3, L1PA4, and L1Hs subfamilies in MCF-7 RNA sequencing experiments are shown for the different cell compartments and transcript fractions prior to (left) and after (right) TeXP processing. (B) The absolute number of reads emanating from pervasive transcription and LINE-1 subfamilies is shown across the distinct cell and transcript fractions of the human-derived cell lines GM12878, K-562, and MCF-7. (C-D) The quantification of autonomous and pervasive transcripts of L1Hs in the cell lines is shown using ddPCR. (C) The ratio of L1Hs 5’ and 3’ transcripts shows the enrichment of the 3’ end of L1Hs for all cell lines. (D) The absolute quantification of autonomous and pervasive transcripts reveals higher expression of pervasive compared to autonomous transcripts in all cell lines except MCF-7. All data were run in duplicate. All error bars are mean ± SEM. These data represent two independent experiments. (E) L1Hs autonomous transcription landscape of healthy human primary tissues. Each point is an RNA sequencing experiment, separated by tissue of origin.