Matthew K Iyer, Yashar S Niknafs, Rohit Malik, Udit Singhal, Anirban Sahu, Yasuyuki Hosono, Terrence R Barrette, John R Prensner, Joseph R Evans, Shuang Zhao, Anton Poliakov, Xuhong Cao, Saravana M Dhanasekaran, Yi-Mi Wu, Dan R Robinson, David G Beer, Felix Y Feng, Hariharan K Iyer, Arul M Chinnaiyan.
Abstract
Long noncoding RNAs (lncRNAs) are emerging as important regulators of tissue physiology and disease processes including cancer. To delineate genome-wide lncRNA expression, we curated 7,256 RNA sequencing (RNA-seq) libraries from tumors, normal tissues and cell lines comprising over 43 Tb of sequence from 25 independent studies. We applied ab initio assembly methodology to this data set, yielding a consensus human transcriptome of 91,013 expressed genes. Over 68% (58,648) of genes were classified as lncRNAs, of which 79% were previously unannotated. About 1% (597) of the lncRNAs harbored ultraconserved elements, and 7% (3,900) overlapped disease-associated SNPs. To prioritize lineage-specific, disease-associated lncRNA expression, we employed non-parametric differential expression testing and nominated 7,942 lineage- or cancer-associated lncRNA genes. The lncRNA landscape characterized here may shed light on normal biology and cancer pathogenesis and may be valuable for future biomarker development.
Year: 2015 PMID: 25599403 PMCID: PMC4417758 DOI: 10.1038/ng.3192
Source DB: PubMed Journal: Nat Genet ISSN: 1061-4036 Impact factor: 38.330