Raoul J P Bonnal, Valeria Ranzani, Alberto Arrigoni, Serena Curti, Ilaria Panzeri, Paola Gruarin, Sergio Abrignani, Grazisa Rossetti, Massimiliano Pagani.
Abstract
To help better understand the role of long noncoding RNAs in the human immune system, we recently generated a comprehensive RNA-seq data set using 63 RNA samples from 13 subsets of T (CD4(+) naive, CD4(+) TH1, CD4(+) TH2, CD4(+) TH17, CD4(+) Treg, CD4(+) TCM, CD4(+) TEM, CD8(+) TCM, CD8(+) TEM, CD8(+) naive) and B (B naive, B memory, B CD5(+)) lymphocytes. Each subset had five biological replicates, except the CD8(+) TCM and B CD5(+) populations, which had four. RNA-Seq data were generated on an Illumina HiScanSQ sequencer using the TruSeq v3 Cluster kit. A total of 2.192 billion paired-end reads (2×100 bp) were sequenced and, after filtering, about 1.7 billion reads were mapped. Using different de novo transcriptome reconstruction techniques, over 500 previously unknown lincRNAs were identified. The current data set could be exploited to drive the functional characterization of lincRNAs and to identify novel genes and regulatory networks associated with specific cell subsets of the human immune system.
Year: 2015 PMID: 26451251 PMCID: PMC4587370 DOI: 10.1038/sdata.2015.51
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Figure 1. Description of the study: cellular subsets hierarchy and bioinformatics pipeline.
(a) Hierarchical representation of the different cell subsets originating from hematopoietic stem cells. In this study 13 human primary lymphocyte subsets were profiled: CD4+ naive; CD4+ Th1; CD4+ Th2; CD4+ Th17; CD4+ Treg; CD4+ TEM; CD4+ TCM; CD8+ TCM; CD8+ TEM; CD8+ naive; B naive; B memory; B CD5+. The number of biological replicates and the number of expressed genes (FPKM>0.21) are indicated for each population. The total number of samples profiled in this study is 63. (b) General overview of the bioinformatic steps and approaches used for the identification of novel lincRNAs.
Purification and RNA-Seq of human primary lymphocyte subsets
Purity achieved by sorting 13 human lymphocyte subsets (isolated from peripheral blood lymphocytes of four to five different donors per subset) using the surface marker combinations listed under Sorting phenotype. Treg, regulatory T cells; TCM, central memory T cells; TEM, effector memory T cells; B, B cells. Data are representative of at least four experiments (purity shown as mean±s.d.).

| Subset | Purity (%) | Sorting phenotype | Biological replicates |
|---|---|---|---|
| CD4+ naive | 99.8±0.1 | CD4+ CCR7+ CD45RA+ CD45RO− | 5 |
| CD4+ TH1 | 99.9±0.05 | CD4+ CXCR3+ | 5 |
| CD4+ TH2 | 99.7±0.3 | CD4+ CRTH2+ CXCR3− | 5 |
| CD4+ TH17 | 99.1±1 | CD4+ CCR6+ CD161+ CXCR3− | 5 |
| CD4+ Treg | 99.0±0.8 | CD4+ CD127− CD25+ | 5 |
| CD4+ TCM | 98.4±2.8 | CD4+ CCR7+ CD45RA− CD45RO+ | 5 |
| CD4+ TEM | 95.4±5.5 | CD4+ CCR7− CD45RA− CD45RO+ | 5 |
| CD8+ TCM | 98.3±0.8 | CD8+ CCR7+ CD45RA− CD45RO+ | 4 |
| CD8+ TEM | 96.8±0.9 | CD8+ CCR7− CD45RA− CD45RO+ | 5 |
| CD8+ naive | 99.3±0.2 | CD8+ CCR7+ CD45RA+ CD45RO− | 5 |
| B naive | 99.9±0.1 | CD19+ CD5− CD27− | 5 |
| B memory | 99.1±0.8 | CD19+ CD5− CD27+ | 5 |
| B CD5+ | 99.1±0.8 | CD19+ CD5+ | 4 |
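
As a quick consistency check, the replicate counts in this table can be tallied against the 63 samples stated in the abstract. The following is a minimal Python sketch, not part of the original study: the dictionary is hand-transcribed from the table, and the helper name `parse_purity` is a hypothetical one introduced here for illustration. The parser accepts either a decimal comma or a decimal point.

```python
# Purity (mean±s.d., %) and number of biological replicates per subset,
# hand-transcribed from the sorting table (illustrative only).
SUBSETS = {
    "CD4+ naive": ("99.8±0.1", 5),
    "CD4+ TH1": ("99.9±0.05", 5),
    "CD4+ TH2": ("99.7±0.3", 5),
    "CD4+ TH17": ("99.1±1", 5),
    "CD4+ Treg": ("99.0±0.8", 5),
    "CD4+ TCM": ("98.4±2.8", 5),
    "CD4+ TEM": ("95.4±5.5", 5),
    "CD8+ TCM": ("98.3±0.8", 4),
    "CD8+ TEM": ("96.8±0.9", 5),
    "CD8+ naive": ("99.3±0.2", 5),
    "B naive": ("99.9±0.1", 5),
    "B memory": ("99.1±0.8", 5),
    "B CD5+": ("99.1±0.8", 4),
}


def parse_purity(text):
    """Split 'mean±s.d.' into two floats; ',' or '.' may be the decimal mark."""
    mean, sd = text.replace(",", ".").split("±")
    return float(mean), float(sd)


# Total number of samples is the sum of replicates over all subsets.
total_samples = sum(replicates for _, replicates in SUBSETS.values())

# Mean purity per subset, and the subset with the lowest mean purity.
mean_purity = {name: parse_purity(p)[0] for name, (p, _) in SUBSETS.items()}
lowest = min(mean_purity, key=mean_purity.get)

print(len(SUBSETS), "subsets,", total_samples, "samples")
print("lowest mean purity:", lowest, mean_purity[lowest])
```

Eleven subsets with five replicates and two with four give 63 samples, in agreement with the abstract.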
Overall read depth and coverage information
Number of raw reads, trimmed reads and mapped reads (TopHat and STAR), aggregated by population, for all 13 lymphocyte populations. Reads were mapped on the Ensembl human sequence, version 67 (May 2012). Values are in millions of reads.

| Population | Raw reads (millions) | Trimmed reads (millions) | Mapped reads, TopHat (millions) | Mapped reads, STAR (millions) |
|---|---|---|---|---|
| CD4+ naive | 237 | 232 | 185 | 210 |
| CD4+ TH1 | 129 | 123 | 104 | 104 |
| CD4+ TH2 | 126 | 120 | 107 | 106 |
| CD4+ TH17 | 121 | 112 | 87 | 86 |
| CD4+ Treg | 148 | 140 | 125 | 124 |
| CD4+ TCM | 185 | 145 | 125 | 127 |
| CD4+ TEM | 148 | 145 | 151 | 127 |
| CD8+ TCM | 147 | 120 | 103 | 105 |
| CD8+ TEM | 187 | 154 | 136 | 138 |
| CD8+ naive | 185 | 150 | 129 | 130 |
| B naive | 172 | 137 | 121 | 123 |
| B memory | 261 | 249 | 220 | 223 |
| B CD5+ | 146 | 118 | 106 | 108 |
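
The per-population counts can be summed to compare against the totals quoted in the abstract (2.192 billion raw reads, about 1.7 billion mapped). The sketch below is illustrative and assumes the table values are in millions of reads, which is the reading under which the raw-read column adds up to 2,192. The column order (raw, trimmed, TopHat, STAR) and the helper names are assumptions made here, not taken from the source.

```python
# Reads per population, assumed to be in millions:
# (raw, trimmed, mapped with TopHat, mapped with STAR).
READS = {
    "CD4+ naive": (237, 232, 185, 210),
    "CD4+ TH1": (129, 123, 104, 104),
    "CD4+ TH2": (126, 120, 107, 106),
    "CD4+ TH17": (121, 112, 87, 86),
    "CD4+ Treg": (148, 140, 125, 124),
    "CD4+ TCM": (185, 145, 125, 127),
    "CD4+ TEM": (148, 145, 151, 127),
    "CD8+ TCM": (147, 120, 103, 105),
    "CD8+ TEM": (187, 154, 136, 138),
    "CD8+ naive": (185, 150, 129, 130),
    "B naive": (172, 137, 121, 123),
    "B memory": (261, 249, 220, 223),
    "B CD5+": (146, 118, 106, 108),
}
COLUMNS = ("raw", "trimmed", "tophat", "star")

# Column totals across all 13 populations.
totals = {
    col: sum(row[i] for row in READS.values())
    for i, col in enumerate(COLUMNS)
}

# Fraction of trimmed reads that were mapped, per aligner.
mapping_rate = {
    name: {"tophat": tophat / trimmed, "star": star / trimmed}
    for name, (_, trimmed, tophat, star) in READS.items()
}

# Rows where a mapped count exceeds the trimmed count are worth
# re-checking against the original table.
suspicious = [
    name
    for name, (_, trimmed, tophat, star) in READS.items()
    if max(tophat, star) > trimmed
]

print(totals)
print("rows to re-check:", suspicious)
```

Under this assumption the raw column sums to 2,192 million and the two mapped columns to roughly 1.7 billion each, consistent with the abstract. As transcribed here, the CD4+ TEM row reports more TopHat-mapped reads than trimmed reads, so that entry may deserve verification against the published table.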
Figure 2. Quality control assessments.
Average per-base Phred quality score distribution over all reads across all samples (a) before and (b) after trimming. (c) %GC content before and after trimming. (d) Detailed overview of the human lymphocyte subsets profiled: raw reads (black) and reads trimmed and filtered by quality (blue). (e) Comparison of the mapped reads using TopHat (light green) and STAR (light orange).
Figure 3. Analysis of intra-population consistency: Principal Component Analysis and hierarchical clustering.
(a) Principal Component Analysis (PCA) performed using DESeq2 rlog-normalized RNA-seq data. Loadings for principal components 1 (PC1) and 2 (PC2) are reported on the x- and y-axes of the graph. (b) Hierarchical clustering analysis performed using DESeq2 rlog-normalized RNA-seq data. The color code (from white to dark blue) refers to the distance metric used for clustering (dark blue corresponds to the maximum correlation value). (c) Violin plot of the normalized FPKM values for the newly identified lincRNAs, previously annotated lincRNAs and transcription factor genes. The black line represents the normalized FPKM threshold (0.21 FPKM).
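
To make the 0.21 FPKM threshold concrete, the sketch below applies the standard FPKM definition (fragments per kilobase of transcript per million mapped fragments) and keeps genes strictly above the cutoff. It is a minimal illustration only: the source does not say which tool produced the FPKM values or how they were normalized, and the gene names and counts are hypothetical toy values.

```python
EXPRESSION_THRESHOLD = 0.21  # FPKM cutoff quoted in the figure legends


def fpkm(fragments, length_bp, total_mapped_fragments):
    """Standard FPKM: fragments per kilobase per million mapped fragments."""
    return 1e9 * fragments / (length_bp * total_mapped_fragments)


def expressed(fpkm_by_gene, threshold=EXPRESSION_THRESHOLD):
    """Return the set of genes whose FPKM is strictly above the threshold."""
    return {gene for gene, value in fpkm_by_gene.items() if value > threshold}


# Hypothetical toy input: (fragments, transcript length in bp) per gene,
# in a library with 10 million mapped fragments.
TOTAL_MAPPED = 10_000_000
toy_counts = {"geneA": (100, 2000), "geneB": (2, 5000)}

toy_fpkm = {
    gene: fpkm(fragments, length, TOTAL_MAPPED)
    for gene, (fragments, length) in toy_counts.items()
}

print(toy_fpkm)
print("expressed:", expressed(toy_fpkm))
```

In this toy case geneA (5.0 FPKM) passes the threshold and geneB (0.04 FPKM) does not. The normalization applied to FPKM values in the study is not reproduced here.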