Steven W Criscione, Nicholas Theodosakis, Goran Micevic, Toby C Cornish, Kathleen H Burns, Nicola Neretti, Nemanja Rodić.
Abstract
BACKGROUND: Long INterspersed Element-1 (LINE-1 or L1) is the only autonomously active transposable element in the human genome. L1 sequences comprise approximately 17% of the human genome, but only the evolutionarily recent, human-specific subfamily is retrotransposition competent. The L1 promoter has a bidirectional orientation containing a sense promoter that drives the transcription of two proteins required for retrotransposition and an antisense promoter. The L1 antisense promoter can drive transcription of chimeric transcripts: 5' L1 antisense sequences spliced to the exons of neighboring genes.
Keywords: Antisense promoter; Chimeric transcript; EST; L1; LINE-1; PacBio; Retrotransposon; Transposon; YY1
Year: 2016 PMID: 27301971 PMCID: PMC4908685 DOI: 10.1186/s12864-016-2800-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Identification of novel L1 antisense promoter (ASP) transcripts using a computational pipeline. a Schematic representation of our method to identify L1 ASP transcripts. The coordinates of human spliced ESTs were intersected with L1s and gene exons to identify ESTs with a TSS antisense to an L1 and overlapping an exon of a cognate gene. L1 ASP transcripts that displayed an independent EST supporting L1 exonization were removed. b Total number of L1 ASP transcripts identified in the current report (n = 988) cross-referenced to L1 ASP transcripts found in published reports, including recently identified L1 ORF0 gene fusions and the C-GATE database. c Histogram of the total number of ESTs identified per gene for L1 ASP transcripts (n = 2015 ESTs, some overlapping multiple genes)
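The intersection step in panel a can be pictured with a minimal sketch. It is not the authors' pipeline: the `Interval` type, the 0-based half-open coordinates, and every function name are assumptions made for illustration, and the filter that removes transcripts with independent EST evidence of L1 exonization is left out.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """Hypothetical genomic interval: 0-based start, exclusive end."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def tss(blocks: List[Interval]) -> Interval:
    """Return the 1-bp transcription start site of a spliced EST alignment."""
    strand = blocks[0].strand
    if strand == "+":
        first = min(blocks, key=lambda b: b.start)
        return Interval(first.chrom, first.start, first.start + 1, strand)
    last = max(blocks, key=lambda b: b.end)
    return Interval(last.chrom, last.end - 1, last.end, strand)


def is_l1_asp_candidate(est_blocks: List[Interval],
                        l1_elements: List[Interval],
                        exons: List[Interval]) -> bool:
    """EST whose TSS lies inside an L1 on the opposite strand and that
    also overlaps at least one annotated gene exon."""
    site = tss(est_blocks)
    starts_antisense_in_l1 = any(
        overlaps(site, l1) and l1.strand != site.strand for l1 in l1_elements
    )
    hits_exon = any(overlaps(b, ex) for b in est_blocks for ex in exons)
    return starts_antisense_in_l1 and hits_exon
```

In this sketch, an EST with aligned blocks at chr1:5900-6000 and chr1:9000-9200 on the plus strand would count as a candidate if a minus-strand L1 covers position 5900 and an exon overlaps the second block. The exon strand is deliberately not checked, since the captions report both sense and antisense overlaps with cognate genes.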
Fig. 2 L1 subfamily and tissue distributions of L1 ASP transcripts. a Evolutionary age of L1 subfamilies indicates a role for most L1 evolutionary subtypes in genesis of L1 ASP transcripts. b Characterization of the source material for the EST cDNAs that support L1 ASP transcripts. c Tissue source for ESTs supporting L1 ASP transcripts identified in hyper-proliferative or cancerous samples. d Normal tissue sources for ESTs used to identify L1 ASP transcripts
Fig. 3 Most L1 ASP transcripts are sense to cognate genes and possess protein coding potential. a L1 ASP transcripts tend to be in sense strandedness relative to the cognate gene, typically serving as an alternative to the canonical gene TSS (left panel). Rarely, ESTs supporting L1 ASP transcripts overlap antisense to the cognate gene (right panel). The bias towards sense-strandedness of L1 ASP transcripts suggested the possibility of protein coding potential. ESTs supporting L1 ASP transcripts that had multiple alignments in the database (~2.3% of all ESTs) were excluded. b Independent analysis of long-read PacBio RNA-seq data identified transcript isoforms supporting 124/988 L1 ASP transcripts. We validated an instance where the L1 ASP transcript was sense to the cognate gene (L1-PPP1R1C, left panel) and an instance where it was on the opposite strand (L1-ABCA9, right panel). In the genome browser view, red indicates the positive strand and blue indicates the negative strand. c Coding potential of L1 ASP transcripts was assessed by the ability to encode an open reading frame (ORF) of at least 100 amino acids (aa) that begins with a start codon. By these criteria, 27.1% of ESTs had coding potential
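The coding-potential criterion in panel c (an ORF of at least 100 aa that begins with a start codon) can be sketched as follows. This is an illustration, not the authors' code: it assumes only the three forward frames of the EST are scanned, that the initiating ATG counts toward the length, and that an ORF running to the end of the sequence without a stop codon still counts. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_aa(seq: str) -> int:
    """Length in codons (ATG included, stop excluded) of the longest
    ATG-initiated ORF in the three forward reading frames."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        in_orf = False
        length = 0
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if not in_orf:
                if codon == "ATG":  # open a new ORF at the first ATG
                    in_orf = True
                    length = 1
            elif codon in STOP_CODONS:  # close the ORF at an in-frame stop
                best = max(best, length)
                in_orf = False
            else:
                length += 1
        if in_orf:  # assumption: an ORF left open at the 3' end still counts
            best = max(best, length)
    return best


def has_coding_potential(seq: str, min_aa: int = 100) -> bool:
    """Apply the caption's threshold of at least 100 amino acids."""
    return longest_orf_aa(seq) >= min_aa
```

Because ESTs are often partial reads, whether open-ended ORFs should count is a real design choice; requiring an in-frame stop codon would be the stricter alternative.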
Fig. 4 Features of the L1 antisense promoter revealed from ESTs, ENCODE ChIP-seq, and GRO-seq data. a ESTs supporting alternative L1 ASP transcripts for L1HS, L1PA2-8, ancient primate L1, and ancient mammalian L1 subfamilies were aligned to full-length L1 consensus sequences. The plot revealed that the evolutionarily recent L1HS and L1PA2-8 subfamilies possess ASP activity in the 5’ UTR, whereas ancient primate and ancient mammalian L1 subfamilies display minor ASP activity in the 5’ UTR, with the majority at the 3’ end of the element overlapping the end of ORF2. b YY1 enrichment profile in the 5’ UTR of the L1HS consensus sequence for ENCODE YY1 ChIP-seq in various cell lines. The schematic above is a reference point to the L1HS consensus position. c The TSS enrichment profile for YY1, H3K9me2, and H3K9me3 ChIP-seq for L1HS and L1PA2-8 AS ESTs in units of input-subtracted reads per million mapped reads (RPM) using the K562 cell line. The bottom panel displays the GRO-seq TSS enrichment profile in RPM enrichment units. The schematic above is a reference point to the L1HS consensus position
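The unit used in panel c, input-subtracted RPM, can be illustrated with a short sketch. It assumes per-position read counts for the ChIP and input libraries together with each library's total mapped reads; the function name and arguments are hypothetical, and whether negative values were clipped in the original analysis is not stated, so none are clipped here.

```python
from typing import List, Sequence


def input_subtracted_rpm(chip_counts: Sequence[int], chip_total: int,
                         input_counts: Sequence[int], input_total: int) -> List[float]:
    """Scale each library to reads per million mapped reads, then subtract
    the input signal from the ChIP signal position by position."""
    if len(chip_counts) != len(input_counts):
        raise ValueError("ChIP and input profiles must have the same length")
    chip_scale = chip_total / 1e6
    input_scale = input_total / 1e6
    return [c / chip_scale - i / input_scale
            for c, i in zip(chip_counts, input_counts)]
```

For example, 50 ChIP reads at a position in a library of 10 million mapped reads give 5 RPM; if the input has 20 reads there out of 20 million (1 RPM), the enrichment is 4 RPM.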
Fig. 5 Validation of selected L1 ASP transcripts by combined PCR and Sanger-sequencing methods. a Upper panel: Genome browser view of KIAA1324L genomic locus displaying the ESTs supporting L1-KIAA1324L. Middle panel: Representative PCR product obtained by PCR amplification with L1-specific primer and KIAA1324L-specific primer. Lower panel: selected Sanger sequencing read across L1 to KIAA1324L exon 1 boundary; blue letters denote L1 AS promoter sequence, orange letters denote KIAA1324L exon 1 sequence. b Empirical validation of L1-UVRAG, with the same orientation as Fig. 5a
Fig. 6 L1-driven transcripts are expressed in many normal human tissues. a-b Tissue-specific expression levels of L1 ASP transcripts L1-NF1 and L1-SEC22B relative to wild-type cognate genes, respectively. Upper panel: Genome browser view of genomic locus displaying the ESTs supporting L1 ASP transcripts. Lower panel: Quantitative RT-PCR results of L1 ASP transcripts and cognate wild-type gene in 9 tissues