Alayne L Brunner, Andrew H Beck, Badreddin Edris, Robert T Sweeney, Shirley X Zhu, Rui Li, Kelli Montgomery, Sushama Varma, Thea Gilks, Xiangqian Guo, Joseph W Foley, Daniela M Witten, Craig P Giacomini, Ryan A Flynn, Jonathan R Pollack, Robert Tibshirani, Howard Y Chang, Matt van de Rijn, Robert B West.
Abstract
BACKGROUND: Molecular characterization of tumors has been critical for identifying important genes in cancer biology and for improving tumor classification and diagnosis. Long non-coding RNAs, as a new, relatively unstudied class of transcripts, provide a rich opportunity to identify both functional drivers and cancer-type-specific biomarkers. However, despite the potential importance of long non-coding RNAs to the cancer field, no comprehensive survey of long non-coding RNA expression across various cancers has been reported.
Year: 2012 PMID: 22929540 PMCID: PMC4053743 DOI: 10.1186/gb-2012-13-8-r75
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Distribution and mean expression of 3SEQ peaks. (a) The distribution plot shows a tight cluster of exonic peaks approximately 275 bp upstream of the 3' end of known genes (n = 29,024 peaks in known exons; distances are based on genomic coordinates and not the spliced transcriptome). (b) Boxplots show the distribution of mean expression levels for each peak by peak category. Raw sequence count data were normalized by dividing each value by the sample mean and then taking the square root. Boxes range from the first to the third quartiles. Median expression is marked with a line. Mean values are 0.799, 0.407 and 0.324 for coding, lncRNA and novel transcripts, respectively. Plots are truncated to show mean expression values less than 2. Outlier peaks show expression as high as 17.9.
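The normalization described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes "sample mean" refers to the mean raw count across all peaks within one library, and the function name `normalize_counts` and the example counts are hypothetical.

```python
import math

def normalize_counts(counts):
    """Divide each raw count by the sample mean, then take the square root.

    Assumption: `counts` holds the raw read counts of every peak in ONE
    library, so the mean is taken across peaks within that sample.
    """
    mean = sum(counts) / len(counts)
    if mean == 0:
        raise ValueError("sample contains no reads")
    return [math.sqrt(c / mean) for c in counts]

# Hypothetical raw peak counts for a single library (not data from the paper).
sample = [0, 4, 16, 100]
normalized = normalize_counts(sample)
```

Under this reading, the squared normalized values of any library average to 1, which puts libraries of different sequencing depth on a comparable scale, while the square root compresses the range of highly expressed peaks.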
Figure 2. Variably expressed lncRNAs and novel intergenic transcripts. Heatmaps illustrating the 368 lncRNAs (left) and 297 novel transcripts (right) with variable expression, defined as a standard deviation >0.25 across 66 cancer samples. Transcripts with differential expression in at least one of the 17 two-class SAM analyses (top) were clustered separately from those transcripts not significantly differentially expressed (bottom). Normalized read data were median centered, hierarchically clustered and plotted on a low (green) to high (red) heatmap. Samples are grouped by cancer type; the number in parentheses indicates the number of libraries for each cancer type. Red and pink are used for libraries made from adenocarcinomas of breast, lung, colon and prostate, as well as normal breast, lung and colon. Orange and yellow show squamous cell carcinomas of the head and neck, skin and lung, and other carcinomas: papillary urothelial carcinoma and nasopharyngeal carcinoma. Green indicates sarcomas with known translocations: endometrial stromal sarcoma, Ewing's sarcoma, extraskeletal myxoid chondrosarcoma, synovial sarcoma and myxoid liposarcoma. Blue shows other sarcomas: gastrointestinal stromal tumor, leiomyosarcoma and dedifferentiated liposarcoma. Normal samples and cancer samples were combined for hierarchical clustering, but are displayed separately for clarity. Samples are ordered according to Table S1 in Additional file 1.
Breast, breast invasive ductal carcinoma; colon, colon adenocarcinoma; DDLPS, dedifferentiated liposarcoma; EMC, extraskeletal myxoid chondrosarcoma; ESS, endometrial stromal sarcoma; EWS, Ewing's sarcoma; GIST, gastrointestinal stromal tumor; HN SCC, head and neck squamous cell carcinoma; LMS, leiomyosarcoma; Lung, lung adenocarcinoma; Lung SCC, lung squamous cell carcinoma; MLS, myxoid liposarcoma; NPC, nasopharyngeal carcinoma; prostate, prostate adenocarcinoma; PUC, papillary urothelial carcinoma; Skin SCC, skin squamous cell carcinoma; SS, synovial sarcoma.
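The variability filter and median centering mentioned in the Figure 2 caption can be illustrated with a short sketch. It assumes the sample standard deviation (n - 1 denominator) and per-transcript centering; the caption does not specify either choice. The peak names and values are hypothetical.

```python
from statistics import median, stdev

def variable_transcripts(expr, threshold=0.25):
    """Keep transcripts whose standard deviation across samples exceeds threshold.

    Assumption: sample standard deviation (n - 1 denominator) is used.
    """
    return {name: vals for name, vals in expr.items() if stdev(vals) > threshold}

def median_center(vals):
    """Subtract the transcript's median so values are centered on zero."""
    m = median(vals)
    return [v - m for v in vals]

# Hypothetical normalized expression values across four samples.
expr = {
    "peak_A": [0.1, 0.1, 0.2, 0.1],  # nearly flat, filtered out
    "peak_B": [0.0, 0.2, 1.5, 0.9],  # variable, retained
}
kept = variable_transcripts(expr)
centered = {name: median_center(vals) for name, vals in kept.items()}
```

Only the transcripts that pass the filter would then go on to hierarchical clustering and heatmap plotting.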
Figure 3. Significant lncRNAs and novel transcripts in breast, lung and colon cancer. LncRNAs and novel transcripts significantly differentially expressed in (a) breast (n = 13), (b) colon (n = 10) and (c) lung (n = 6) cancers. Normalized, uncentered read data for cancer and normal samples were hierarchically clustered and plotted on low (black) to high (red) heatmaps.
Figure 4. A case study of novel, breast-specific peak 13741. (a) Browser shot showing expression for a breast cancer sample in the region downstream of ANKRD30A on chromosome 10. The first two tracks show the known genes and RNAs in this locus. The third track shows the peaks identified in this study, including three highly expressed peaks: novel 13741, lncRNA 13742 and novel 13743. The fourth track shows the raw 3SEQ reads (transcript abundance levels) on the forward strand (blue) and reverse strand (red). The final tracks show the longest transcripts that overlap peak 13742, a Scripture-assembled transcript produced using normal breast RNAseq reads from the Illumina BodyMap data set and GENCODE lncRNA ENSG00000235687. (b) Zoom-in browser shot of peak 13741 on chromosome 10 shows the location of the RNA in situ hybridization probe (top track) as well as the raw sequence reads for one breast cancer sample (bottom track). This peak illustrates the shape of a typical 3SEQ peak from a high-expressing transcript. (c) ER staining on an ER+ breast cancer (top left) and an ER- breast cancer (top right). RNA in situ hybridization for peak 13741 performed on the same ER+ breast cancer specimen (bottom left) and the same ER- breast cancer (bottom right). Specimens were matched, but the ER and 13741 stains used different tissue slices. All images are at 400× magnification. 3SEQ, 3'-end sequencing for expression quantification; chr10, chromosome 10; ER, estrogen receptor; lncRNA, long non-coding RNA.
Novel, breast-specific peak 13741 is associated with estrogen-receptor-positive and progesterone-receptor-positive cells and Grade 1 breast cancer
| | ER+ | ER- | PR+ | PR- | Grade 1 | Grade 3 |
|---|---|---|---|---|---|---|
| 13741+ | 122 | 5 | 112 | 15 | 37 | 12 |
| 13741- | 24 | 28 | 17 | 35 | 5 | 29 |
RNA in situ hybridization results for analyses correlating the expression of peak 13741. Numbers of samples are reported. Each test was significant by the Kruskal-Wallis rank sum test. +, staining in >30% of cells; -, staining in <10% of cells; ER, estrogen receptor; PR, progesterone receptor.
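As an illustration of how a Kruskal-Wallis rank sum test applies to counts like these, the sketch below runs a tie-corrected test on the ER columns of the table. The setup is an assumption, since the source does not describe how the test was arranged: staining is coded as 1 (13741+) or 0 (13741-), and samples are grouped by ER status. The function name `kruskal_wallis` is hypothetical, and the p-value shortcut holds only for two groups (one degree of freedom).

```python
import math

def kruskal_wallis(groups):
    """Return (H, df) for the tie-corrected Kruskal-Wallis statistic."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    rank_of = {}      # distinct value -> average rank of its tied block
    tie_sizes = []
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_sizes.append(j - i)
        i = j
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    # Tie correction; undefined (zero) if every observation is identical.
    correction = 1 - sum(t**3 - t for t in tie_sizes) / (n**3 - n)
    return h / correction, len(groups) - 1

# ER columns of the table: 1 = 13741+ staining, 0 = 13741- staining.
er_pos = [1] * 122 + [0] * 24
er_neg = [1] * 5 + [0] * 28
h, df = kruskal_wallis([er_pos, er_neg])

# With df = 1 the chi-square tail probability reduces to erfc(sqrt(H / 2)).
p_value = math.erfc(math.sqrt(h / 2))
```

The same call could be repeated for the PR and grade columns. With binary outcomes and two groups, the statistic is closely related to the Pearson chi-square on the corresponding 2×2 table, so the choice of test here mainly reflects the rank-based framing reported in the table note.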