Takafumi Chishima, Junichi Iwakiri, Michiaki Hamada.
Abstract
It has recently been suggested that transposable elements (TEs) are re-used as functional elements of long non-coding RNAs (lncRNAs). This is supported by examples such as the human endogenous retrovirus subfamily H (HERVH) elements, which are contained within lncRNAs, are expressed specifically in human embryonic stem cells (hESCs), and are required to maintain hESC identity. At least two questions remain unanswered for lncRNAs as a whole. How many TEs are re-used within lncRNAs? Are there any other TEs that affect tissue specificity of lncRNA expression? To answer these questions, we comprehensively identified TEs that are significantly related to tissue-specific expression levels of lncRNAs. We downloaded lncRNA expression data corresponding to normal human tissue from the Expression Atlas and transformed the data into tissue specificity estimates. Then, Fisher's exact tests were performed to test whether the presence or absence of TE-derived sequences influences the tissue specificity of lncRNA expression. Many TE-tissue pairs associated with tissue-specific expression of lncRNAs were detected, indicating that multiple TE families can be re-used as functional domains or regulatory sequences of lncRNAs. In particular, we found that the antisense promoter region of L1PA2, a LINE-1 subfamily, appears to act as a promoter for lncRNAs with placenta-specific expression.
Keywords: long non-coding RNA; tissue-specific expression; transposable element
Year: 2018 PMID: 29315213 PMCID: PMC5793176 DOI: 10.3390/genes9010023
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
Table 1. Gene expression datasets obtained from Expression Atlas.
| ID | Expression Atlas ID | Data Provider | # Tissues | # Samples | Reference |
|---|---|---|---|---|---|
| 1 | E-MTAB-513 | Illumina Body Map | 16 | 19 | [ |
| 2 | E-MTAB-2836 | Human Protein Atlas | 32 | 122 | [ |
| 3 | E-MTAB-2919 | Genotype-Tissue Expression (GTEx) | 53 | 3282 | [ |
ID is used to refer to each of the three datasets in Table 2, Table 3, Table 4 and Table 5. # Tissues and # Samples indicate the number of tissues and the number of samples in each dataset, respectively.
Figure 1. The flow of the analysis. (a) First, expression levels were converted into tissue specificity by ROKU. (b) Then, the results for each tissue were aggregated separately for long non-coding RNAs (lncRNAs) containing a specific transposable element (TE; shown as TE-lncRNA) and for lncRNAs not containing that specific TE (shown as dTE-lncRNA), and the significance of the difference between the lncRNAs in these two categories was determined using Fisher’s exact tests.
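The two steps in Figure 1 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the table layout, and the use of the median as the centre are assumptions made for illustration; ROKU itself uses a Tukey biweight centre and an additional outlier-detection step that are not reproduced here. Step (a) is approximated by a Shannon entropy over the re-centred expression profile, and step (b) by a two-sided Fisher's exact test on a 2×2 table whose rows are TE-lncRNAs and lncRNAs without the TE, and whose columns are "specific to the tissue" and "not specific".

```python
from math import comb, log2
from statistics import median


def entropy_specificity(expression):
    """Shannon entropy (bits) of one gene's expression profile across tissues.

    Simplified stand-in for ROKU (assumption): values are re-centred as
    |x - median| instead of using a Tukey biweight centre.
    Low entropy suggests expression concentrated in few tissues.
    """
    center = median(expression)
    shifted = [abs(x - center) for x in expression]
    total = sum(shifted)
    if total == 0:
        # Flat profile: treat as maximally unspecific.
        return log2(len(expression))
    probs = [v / total for v in shifted if v > 0]
    return -sum(p * log2(p) for p in probs)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Hypothetical layout: a, b = TE-lncRNAs that are / are not specific to
    the tissue; c, d = the same counts for lncRNAs lacking the TE.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of seeing k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)
```

In practice a multiple-testing correction across all TE-tissue pairs would be needed on top of the raw p-values; the sketch leaves that out.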
Table 2. TE families significantly related to tissue-specific expression of long non-coding RNAs (lncRNAs).
| No. | TE Family | Tissue | Strand | Effect | Data ID |
|---|---|---|---|---|---|
| 1 | LINE.L1 | Brain | − | more specific | 1 |
| 2 | LINE.L1 | Cerebral_cortex | − | more specific | 2 |
| 3 | LTR.ERV1 | Leukocyte | + | more specific | 1 |
| 4 | LTR.ERV1 | Placenta | + | more specific | 2 |
| 5 | LTR.ERV1 | Testis | + | more specific | 1 |
| 6 | LTR.ERV1 | Testis | + | more specific | 2 |
| 7 | LTR.ERV1 | Testis | + | more specific | 3 |
| 8 | LTR.ERVL | Bone_marrow | + | less specific | 2 |
| 9 | LTR.ERVL.MaLR | Bone_marrow | +/− | less specific | 2 |
| 10 | SINE.Alu | Adrenal | +/− | more specific | 1 |
| 11 | SINE.Alu | Bone_marrow | + | more specific | 2 |
| 12 | SINE.Alu | Brain | + | more specific | 3 |
| 13 | SINE.Alu | Lymph_node | − | more specific | 1 |
| 14 | SINE.Alu | Skin | +/− | more specific | 2 |
| 15 | SINE.Alu | Testis | +/− | less specific | 1 |
| 16 | SINE.Alu | Testis | +/− | less specific | 2 |
| 17 | SINE.Alu | Testis | +/− | less specific | 3 |
A list of transposable element (TE) families related to tissue specificity of lncRNAs is shown. Strand indicates the orientation of the TE relative to the lncRNAs: +, relations were detected only when TEs were sense relative to lncRNAs; −, relations were detected only when TEs were antisense relative to lncRNAs; +/−, relations were detected when TEs were in either the sense or the antisense orientation relative to lncRNAs. Effect indicates whether lncRNAs including TEs (i.e., TE-lncRNAs) tended to be expressed specifically in that tissue: more specific, TE-lncRNAs were more likely to be expressed specifically in that tissue; less specific, TE-lncRNAs were less likely to be expressed specifically in that tissue. Data ID refers to the dataset IDs provided in Table 1.
Figure 2. Coverage of (a) ERV1 elements and (b) Alu elements around the transcription start sites (TSSs) of long non-coding RNAs (lncRNAs). Only ERV1 and Alu elements in the same orientation as their corresponding lncRNAs are considered. In each panel, the horizontal axis shows the position relative to the lncRNA TSSs (where 0 indicates the TSSs), and the vertical axis shows the coverage of the transposable element.
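A coverage profile of this kind can be sketched in Python as follows. This is an illustrative sketch, not the authors' code. The tuple layouts, the window size, and the choice to report coverage as the fraction of lncRNAs covered at each relative position are assumptions. Coordinates are flipped for minus-strand lncRNAs so that negative positions are always upstream of the TSS.

```python
def tss_coverage(tss_list, te_intervals, upstream=2000, downstream=2000):
    """Fraction of lncRNAs covered by a same-strand TE at each relative position.

    tss_list: list of (chrom, tss_position, strand), strand is '+' or '-'.
    te_intervals: list of (chrom, start, end, strand), half-open [start, end).
    Returns a dict mapping relative position -> fraction of lncRNAs covered.
    """
    counts = {rel: 0 for rel in range(-upstream, downstream + 1)}
    for chrom, tss, strand in tss_list:
        covered = set()
        for te_chrom, start, end, te_strand in te_intervals:
            # Keep only TEs on the same chromosome and strand as the lncRNA.
            if te_chrom != chrom or te_strand != strand:
                continue
            for pos in range(start, end):
                # Flip the axis for minus-strand transcripts.
                rel = pos - tss if strand == '+' else tss - pos
                if -upstream <= rel <= downstream:
                    covered.add(rel)
        # Count each lncRNA at most once per relative position.
        for rel in covered:
            counts[rel] += 1
    n = len(tss_list)
    if n == 0:
        return {rel: 0.0 for rel in counts}
    return {rel: c / n for rel, c in counts.items()}
```

The nested loops are written for clarity only; at genome scale an interval index or a dedicated genomics library would be the more realistic choice.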
Table 3. TE subfamilies significantly related to tissue-specific expression of long non-coding RNAs (lncRNAs).
| No. | TE Subfamily | Tissue | Strand | Effect | Data ID |
|---|---|---|---|---|---|
| 1 | AluJb | Adrenal | − | more specific | 1 |
| 2 | AluSc | Adrenal | +/− | more specific | 1 |
| 3 | AluSg | Adrenal | − | more specific | 1 |
| 4 | AluSp | Adrenal | +/− | more specific | 1 |
| 5 | AluSq2 | Adrenal | − | more specific | 1 |
| 6 | AluSx | Adrenal | +/− | more specific | 1 |
| 7 | AluSx | Testis | +/− | less specific | 1 |
| 8 | AluSx1 | Adrenal | +/− | more specific | 1 |
| 9 | AluSx1 | Testis | + | less specific | 1 |
| 10 | AluSz | Adrenal | +/− | more specific | 1 |
| 11 | AluY | Adrenal | + | more specific | 1 |
| 12 | L1PA2 | Placenta | − | more specific | 2 |
A list of transposable element (TE) subfamilies related to tissue specificity of lncRNAs is shown. For a detailed explanation of each column, see the caption for Table 2. Placenta samples were included only in the Human Protein Atlas (data ID: 2).
Table 4. TE families significantly related to tissue-specific expression of mRNAs.
| No. | TE Family | Tissue | Strand | Effect | Data ID |
|---|---|---|---|---|---|
| 1 | DNA | Brain | − | more specific | 1 |
| 2 | DNA.TcMar.Tigger | Testis | − | less specific | 3 |
| 3 | DNA.hAT.Blackjack | Lung | − | more specific | 3 |
| 4 | DNA.hAT.Charlie | Brain | − | more specific | 1 |
| 5 | DNA.hAT.Charlie | Testis | + | less specific | 1 |
| 6 | DNA.hAT.Charlie | Testis | + | less specific | 2 |
| 7 | DNA.hAT.Charlie | Thyroid | + | more specific | 1 |
| 8 | LINE.CR1 | Brain | +/− | more specific | 1 |
| 9 | LINE.CR1 | Cerebral_cortex | +/− | more specific | 2 |
| 10 | LINE.CR1 | Kidney | + | more specific | 1 |
| 11 | LINE.L2 | Brain | − | more specific | 1 |
| 12 | LINE.L2 | Gall_bladder | + | more specific | 2 |
| 13 | LINE.L2 | Ovary | + | more specific | 1 |
| 14 | LTR.ERV1 | Testis | + | less specific | 1 |
| 15 | LTR.ERVK | Liver | − | more specific | 1 |
| 16 | LTR.ERVL | Skeletal_muscle | + | less specific | 1 |
| 17 | LTR.Gypsy | Brain | + | more specific | 1 |
| 18 | RC..Helitron. | Heart | + | more specific | 1 |
| 19 | SINE.Alu | Esophagus | − | less specific | 2 |
| 20 | SINE.Alu | Lung | +/− | less specific | 1 |
| 21 | SINE.Alu | Lymph_node | − | less specific | 1 |
| 22 | SINE.Alu | Minor_salivary_gland | + | less specific | 3 |
| 23 | SINE.Alu | Salivary_gland | + | less specific | 2 |
| 24 | SINE.Alu | Stomach | + | less specific | 2 |
| 25 | SINE.Alu | Testis | +/− | less specific | 1 |
| 26 | SINE.Alu | Testis | +/− | less specific | 2 |
| 27 | SINE.MIR | Brain | +/− | more specific | 1 |
| 28 | SINE.MIR | Brain | +/− | more specific | 3 |
| 29 | SINE.MIR | Cerebral_cortex | +/− | more specific | 2 |
| 30 | SINE.MIR | Ovary | +/− | more specific | 1 |
| 31 | SINE.MIR | Prostate | + | more specific | 1 |
| 32 | SINE.MIR | Testis | +/− | less specific | 1 |
| 33 | SINE.MIR | Testis | − | less specific | 2 |
A list of TE families related to tissue specificity of mRNA expression is shown. For a detailed explanation of each column, see the caption for Table 2.
Table 5. TE subfamilies significantly related to tissue-specific expression of mRNAs.
| No. | TE Subfamily | Tissue | Strand | Effect | Data ID |
|---|---|---|---|---|---|
| 1 | MIR3 | Brain | +/− | more specific | 1 |
| 2 | MIR3 | Testis | − | less specific | 2 |
| 3 | MIRc | Brain | +/− | more specific | 1 |
| 4 | MIRc | Cerebral_cortex | − | more specific | 2 |
| 5 | MIRc | Ovary | − | more specific | 1 |
| 6 | MamGyp.int | Brain | + | more specific | 1 |
A list of TE subfamilies related to tissue specificity of mRNA expression is shown. For a detailed explanation of each column, see the caption for Table 2.
Figure 3. Coverage of L1PA2 elements around the transcription start sites (TSSs) of long non-coding RNAs (lncRNAs). (a) L1PA2 elements in the same orientation as the lncRNAs are considered. (b) L1PA2 elements in the opposite orientation relative to the lncRNAs are considered. (c) The region around the TSSs in (a) is enlarged (showing greater detail between positions −2000 and 500 in (a)). In each panel, the horizontal axis shows the position relative to the lncRNA TSSs (where 0 indicates the TSSs), and the vertical axis shows the coverage of the transposable element.
Figure 4. The H3K4me3 histone modification level in the 5’ regions of L1PA2 elements (positions 0–1000). Each row represents an L1PA2 element, and each column represents a sample. Only L1PA2 elements overlapping peaks in one or more tissues are shown. Samples in which no L1PA2 element overlapped a peak were excluded from the figure. The intensity of the color of each cell indicates the maximum value of the peak score (−10log(Q-value)) within the 5’ region of an L1PA2 element. (If there are no peaks in the region, the score is 0.) (a) Only L1PA2 elements overlapping with long non-coding RNA (lncRNA) TSSs in the opposite orientation were considered. (b) L1PA2 elements in the human genome, including those not overlapping with any lncRNAs, were considered.
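The per-cell value described in this caption (the maximum peak score within the 5’ region of an element, or 0 when no peak overlaps) can be sketched as below. This is a hedged illustration rather than the authors' code. The tuple layouts and the function name are hypothetical, and it is assumed that positions 0–1000 are measured from the 5’ end of the element, so the region sits at the high-coordinate end for minus-strand elements.

```python
def five_prime_peak_score(element, peaks, region_length=1000):
    """Maximum peak score within the first region_length bases of an element.

    element: (chrom, start, end, strand), half-open [start, end).
    peaks: list of (chrom, start, end, score), half-open intervals.
    Returns 0.0 when no peak overlaps the 5' region.
    """
    chrom, start, end, strand = element
    if strand == '+':
        r_start, r_end = start, min(end, start + region_length)
    else:
        # On the minus strand the 5' end is at the larger coordinate.
        r_start, r_end = max(start, end - region_length), end
    scores = [score for p_chrom, p_start, p_end, score in peaks
              if p_chrom == chrom and p_start < r_end and p_end > r_start]
    return max(scores, default=0.0)
```

Applying this function to every element and every sample would give a matrix with one row per element and one column per sample, matching the layout described in the caption.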