Lam C Tsoi, Matthew K Iyer, Philip E Stuart, William R Swindell, Johann E Gudjonsson, Trilokraj Tejasvi, Mrinal K Sarkar, Bingshan Li, Jun Ding, John J Voorhees, Hyun M Kang, Rajan P Nair, Arul M Chinnaiyan, Goncalo R Abecasis, James T Elder.
Abstract
BACKGROUND: Although analysis pipelines have been developed to use RNA-seq to identify long non-coding RNAs (lncRNAs), inference of their biological and pathological relevance remains a challenge. As a result, most transcriptome studies of autoimmune disease have only assessed protein-coding transcripts.
Year: 2015 PMID: 25723451 PMCID: PMC4311508 DOI: 10.1186/s13059-014-0570-4
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Overview of the analysis pipeline. We first performed Tophat alignment and identified uniquely mapped reads for each RNA-seq sample; we then assembled the transcripts for each sample using Cufflinks. We used a computational approach to nominate potential novel transcripts (Prensner JR et al., [13]) by comparing with the Ensembl gene set. We removed potential novel transcripts that were close (that is, <2 kb) to any exon of any annotated transcript, located in regions with lower mappability/alignability, or less than 200 bp in length. We quantified gene expression using read counts. We then normalized the values across the samples and performed differential expression analysis using DESeq. We inferred the properties and biological functions of the lncRNAs by comparing results with other RNA-seq experiments and by using co-expression analysis.
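The distance and length filters described in the caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Interval` and `Candidate` types, the `keep` function, and the precomputed `low_mappability` flag are hypothetical names introduced here, and only the 2 kb and 200 bp thresholds come from the text.

```python
from dataclasses import dataclass

MIN_EXON_DISTANCE_BP = 2_000  # from the caption: drop candidates <2 kb from annotated exons
MIN_LENGTH_BP = 200           # from the caption: drop candidates shorter than 200 bp


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


@dataclass(frozen=True)
class Candidate:
    name: str
    span: Interval
    length_bp: int          # transcript length in bp
    low_mappability: bool   # hypothetical flag, assumed to be computed upstream


def gap(a: Interval, b: Interval) -> float:
    """Distance in bp between two intervals: 0 if they touch or overlap,
    infinity if they lie on different chromosomes."""
    if a.chrom != b.chrom:
        return float("inf")
    return max(0, a.start - b.end, b.start - a.end)


def keep(candidate: Candidate, annotated_exons: list[Interval]) -> bool:
    """Return True if the candidate passes the distance, mappability, and length filters."""
    nearest = min(
        (gap(candidate.span, exon) for exon in annotated_exons),
        default=float("inf"),
    )
    return (
        nearest >= MIN_EXON_DISTANCE_BP
        and not candidate.low_mappability
        and candidate.length_bp >= MIN_LENGTH_BP
    )
```

A linear scan over all exons is used here for clarity; for genome-scale annotation an interval index would be the more practical choice.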
Transcripts remaining after application of various filtering steps
| Filtering step | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| Raw | 16,246 | 2,902 | 3,670 | 4,593 | 823 | 1,825 | 142 | 113 |
| ≥5% (11) samples | 16,225 | 2,897 | 3,650 | 4,585 | 822 | 1,820 | 141 | 112 |
| Distance (≥2 kb) | 16,225 | 2,897 | 3,650 | 4,585 | 336 | 1,269 | 22 | 46 |
| Mappability | 14,468 | 2,338 | 1,698 | 3,646 | 249 | 993 | 18 | 36 |
| Length (≥200 bp) | 14,461 | 2,336 | 1,693 | 3,642 | 244 | 984 | 18 | 36 |
| ≥216 mapped reads | 14,011 | 2,294 | 1,476 | 2,942 | 196 | 840 | 15 | 29 |
Figure 2. Genomic map of lncRNAs expressed in skin tissues across the genome. The number of lncRNAs identified in this study (y-axis) per megabase across the genome (x-axis).
Figure 3. Expression behaviors for different gene categories. The mean gene expression in RPKM is shown in (a) and coefficient of variation in the normal skin samples for different transcript categories is shown in (b).
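For readers unfamiliar with the two quantities plotted in this figure, the sketch below uses the standard definitions of RPKM (reads per kilobase of transcript per million mapped reads) and the coefficient of variation (standard deviation divided by the mean). The function names are illustrative, and the use of the sample standard deviation is an assumption, since the caption does not say which estimator was used.

```python
from statistics import mean, stdev


def rpkm(read_count: float, transcript_length_bp: float, total_mapped_reads: float) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def coefficient_of_variation(values: list[float]) -> float:
    """Sample standard deviation divided by the mean (assumed estimator).

    Returns NaN when the mean is zero, where the ratio is undefined.
    """
    m = mean(values)
    if m == 0:
        return float("nan")
    return stdev(values) / m
```

For example, a 2,000 bp transcript with 500 reads in a library of 10 million mapped reads has `rpkm(500, 2000, 10_000_000)` equal to 25.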
Numbers and proportions (in percentage) of differentially expressed genes for different gene categories under three different comparisons: normal vs. lesional psoriatic skin (NN vs. PP), uninvolved vs. lesional psoriatic skin (PN vs. PP), and normal vs. uninvolved skin (NN vs. PN)
| Comparison | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| NN vs. PP | 2,342 | 408 | 138 | 709 | 505 | 81 | 396 | 9 | 19 |
| | (17%) | (18%) | (9%) | (24%) | (47%) | (41%) | (47%) | (60%) | (66%) |
| PN vs. PP | 2,146 | 369 | 161 | 613 | 436 | 76 | 337 | 8 | 15 |
| | (15%) | (16%) | (11%) | (21%) | (40%) | (39%) | (40%) | (53%) | (52%) |
| NN vs. PN | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| | (0.03%) | (0%) | (0%) | (0%) | (0%) | (0%) | (0%) | (0%) | (0%) |
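The percentages for the two comparisons against lesional skin appear consistent with dividing each differentially expressed count by the corresponding count in the last row of the filtering table above. The sketch below checks this arithmetic under two assumptions that the page itself does not state: that the columns of the two tables are in the same order, and that the fifth column of this table pools the last four categories (its counts equal their sums, for example 81 + 396 + 9 + 19 = 505).

```python
# Counts in the final row of the filtering table, in column order.
final_counts = [14011, 2294, 1476, 2942, 196, 840, 15, 29]

# Assumption: the fifth column of the DE table pools the last four categories.
denominators = final_counts[:4] + [sum(final_counts[4:])] + final_counts[4:]

# Differentially expressed counts copied from the table.
de_counts = {
    "NN vs. PP": [2342, 408, 138, 709, 505, 81, 396, 9, 19],
    "PN vs. PP": [2146, 369, 161, 613, 436, 76, 337, 8, 15],
}


def percentages(counts: list[int], totals: list[int]) -> list[int]:
    """Percentage of each category that is differentially expressed, rounded to an integer."""
    return [round(100 * c / t) for c, t in zip(counts, totals)]


for comparison, counts in de_counts.items():
    print(comparison, percentages(counts, denominators))
```

Under these assumptions the computed values reproduce the percentages listed for NN vs. PP and PN vs. PP.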
Figure 4. Tissue specificity analysis for different gene categories. (a) Heatmap showing the proportion of genes from each category expressed in different tissue types. (b) Tissue specificity (T ) for different gene categories in skin when comparing with 16 other tissue types.
Figure 5. Relative distance to enhancers (a) and promoters (b) for different transcript classes. The means and error bars depict the relative distance (D /D ) to the enhancer (a) and promoter (b) elements for genes in each category in these two ectodermally derived cell types (HMEC and NHEK). D is the closest distance to the enhancer (or promoter) in NHEK (or HMEC), and D is the average closest distance to the enhancer (or promoter) to the other cell types.
Enriched (FDR ≤ 0.1) inferred functions among all the differentially expressed lncRNAs (DE lncRNAs) and differentially expressed novel lncRNAs (DE novel lncRNAs) in psoriatic skin
| | | | | | | |
|---|---|---|---|---|---|---|
| DE lncRNAs | | 42 | 41 | 5.28E-12 | 1.91 | 3.72E-08 |
| | | 69 | 60 | 4.96E-11 | 1.70 | 3.50E-07 |
| | | 87 | 72 | 7.77E-11 | 1.62 | 5.48E-07 |
| | | 68 | 56 | 1.78E-08 | 1.61 | 1.25E-04 |
| | | 42 | 37 | 1.50E-07 | 1.72 | 1.05E-03 |
| | | 87 | 66 | 4.21E-07 | 1.48 | 2.96E-03 |
| | | 46 | 39 | 6.41E-07 | 1.66 | 4.51E-03 |
| | | 45 | 38 | 1.11E-06 | 1.65 | 7.80E-03 |
| | | 23 | 22 | 1.90E-06 | 1.87 | 1.34E-02 |
| | | 34 | 30 | 2.18E-06 | 1.73 | 1.54E-02 |
| DE novel lncRNAs | | 69 | 30 | 3.99E-07 | 2.33 | 1.32E-03 |
| | | 87 | 34 | 1.42E-06 | 2.09 | 4.72E-03 |
| | | 42 | 20 | 6.80E-06 | 2.55 | 2.26E-02 |
| | | 68 | 27 | 1.34E-05 | 2.13 | 4.46E-02 |
| | | 87 | 32 | 1.42E-05 | 1.97 | 4.72E-02 |
| | | 6 | 6 | 1.97E-05 | 5.36 | 6.54E-02 |
| | | 48 | 21 | 2.14E-05 | 2.34 | 7.09E-02 |
FC refers to the observed-to-expected ratio for the enrichment. For illustration purposes, only inferred functions annotated with at most 100 genes are shown.