Dominick Sinicropi, Kunbin Qu, Francois Collin, Michael Crager, Mei-Lan Liu, Robert J Pelham, Mylan Pho, Andrew Dei Rossi, Jennie Jeong, Aaron Scott, Ranjana Ambannavar, Christina Zheng, Raul Mena, Jose Esteban, James Stephans, John Morlan, Joffre Baker.
Abstract
RNA biomarkers discovered by RT-PCR-based gene expression profiling of archival formalin-fixed paraffin-embedded (FFPE) tissue form the basis for widely used clinical diagnostic tests; however, RT-PCR is practically constrained in the number of transcripts that can be interrogated. We have developed and optimized RNA-Seq library chemistry as well as bioinformatics and biostatistical methods for whole transcriptome profiling from FFPE tissue. The chemistry accommodates low RNA inputs and sample multiplexing. These methods enable rediscovery of RNA biomarkers for disease recurrence risk that were previously identified by RT-PCR analysis of a cohort of 136 patients, and they also identify a high percentage of recurrence risk markers that were previously discovered using DNA microarrays in a separate cohort of patients. This is evidence that the RNA-Seq technology has sufficient precision and sensitivity for biomarker discovery. More than two thousand RNAs are strongly associated with breast cancer recurrence risk in the 136-patient cohort (FDR <10%). Many of these are intronic RNAs for which the corresponding exons are not also associated with disease recurrence. A number of the RNAs associated with recurrence risk belong to novel RNA networks. It will be important to test the validity of these novel associations in whole transcriptome RNA-Seq screens of other breast cancer cohorts.
Year: 2012 PMID: 22808097 PMCID: PMC3396611 DOI: 10.1371/journal.pone.0040092
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Scatter plot of recurrence risk hazard ratios of RNA sequences.
RT-PCR results versus RNA-Seq results. Each point represents a distinct RNA. Genes in the 21-gene breast cancer recurrence risk assay are marked with alphanumeric symbols.
Figure 2. Relationship of increased RNA expression to risk of breast cancer recurrence in 136 breast cancer patients.
Each point represents a distinct RNA. The magnitude of the effect is given by the hazard ratio from Cox proportional hazards analysis, and statistical significance by the P-value. Genes in the 21-gene breast cancer recurrence risk assay are marked with alphanumeric symbols. A. Analysis of 192 genes measured by RT-PCR. B. Analysis of assembled RefSeq transcripts as measured by whole transcriptome RNA-Seq.
Figure 3. Box plots of normalized expression values of RNAs in breast cancer patients, stratified by recurrence status.
Each point represents a patient tumor. The bottom and top of the box are the 25th and 75th percentiles, and the band within the box is the 50th percentile (the median) of the points in the group. The ends of the vertical lines represent the lowest datum still within 1.5 times the interquartile range of the lower quartile, and the highest datum still within 1.5 times the interquartile range of the upper quartile. Values from RNA-Seq (left panel) and RT-PCR (right panel) are shown: A. BCL2. B. GSTM1. C. AURKA. D. MKI67.
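The box and whisker definitions in the Figure 3 caption can be sketched in a few lines of Python. This is only an illustration: the helper names `percentile` and `box_stats`, the linear-interpolation percentile rule, and the example values are assumptions, not taken from the paper, and the authors' plotting software may compute quartiles slightly differently.

```python
def percentile(sorted_vals, p):
    """Percentile (p in [0, 100]) of a pre-sorted list, by linear interpolation.

    The interpolation rule is an assumption; the paper does not state one.
    """
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)


def box_stats(values):
    """Return the box edges, median, and whisker ends described in the caption."""
    vals = sorted(values)
    q1 = percentile(vals, 25)
    med = percentile(vals, 50)
    q3 = percentile(vals, 75)
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points that are still inside the fences.
    inside = [v for v in vals if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
    }


# Hypothetical normalized expression values; 20.0 falls outside the upper fence.
example = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 20.0]
stats = box_stats(example)
print(stats)
```

In this toy example the upper whisker stops at 8.0, because 20.0 lies more than 1.5 times the interquartile range above the upper quartile and would be drawn as an isolated point.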
Relation of median RNA count with frequency of RNA identification at FDR<10%.
| Median RNA Count | Number of RNAs* | Number of RNAs Identified at FDR<10% | Percent of RNAs Identified at FDR<10% |
| <10 | 5817 | 286 | 4.9% |
| 10–99 | 6245 | 399 | 6.4% |
| 100–999 | 7657 | 551 | 7.2% |
| ≥1000 | 743 | 71 | 9.6% |
| Total | 20462 | 1307 | 6.4% |
*Number across entire 136 patient population.
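As a quick consistency check, the percentage column and the totals row can be recomputed from the counts in the table. The sketch below uses only the numbers shown above; the variable and function names are illustrative.

```python
# (median RNA count bin, number of RNAs, number identified at FDR<10%)
rows = [
    ("<10", 5817, 286),
    ("10-99", 6245, 399),
    ("100-999", 7657, 551),
    (">=1000", 743, 71),
]


def percent_identified(n_total, n_hit):
    """Percentage of RNAs in a bin identified at FDR<10%, to one decimal place."""
    return round(100.0 * n_hit / n_total, 1)


percents = {label: percent_identified(n, hit) for label, n, hit in rows}
total_n = sum(n for _, n, _ in rows)
total_hits = sum(hit for _, _, hit in rows)
total_percent = percent_identified(total_n, total_hits)

for label, n, hit in rows:
    print(f"{label:>8}: {hit}/{n} = {percents[label]}%")
print(f"   Total: {total_hits}/{total_n} = {total_percent}%")
```

The recomputed values match the table and show the trend it reports: the fraction of RNAs identified rises with median RNA count, from 4.9% in the lowest bin to 9.6% in the highest.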
Figure 4. Multiple RefSeq RNA networks with common biological themes.
Among the set of 1307 identified RefSeq RNAs, a subset was selected that contains all RNAs that co-express (at R>0.6) with at least one other RNA in the set. The networks were visualized with Cytoscape 2.8 [31], [32]. The degree of node interaction (total number of co-expression interactions) is mapped to node size and color (indicated by scale). Biological annotation of genes that are highly represented in identified networks is indicated by letter labels. A: cell cycle; B: co-expression with ESR1; C: genes mapping to Chr17q23-24; D: genes mapping to Chr8q21-24; E: genes mapping to Chr9q22; F: olfactory signaling, glucose metabolism, glucuronidation.
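The selection rule in the Figure 4 caption (keep every RNA that co-expresses at R>0.6 with at least one other RNA, and use the number of such partners as the node degree) might be sketched as follows. The caption does not say which correlation coefficient R is, so Pearson correlation is an assumption here, and the RNA names and expression values are invented for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (undefined for constant input)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_degrees(expr, threshold=0.6):
    """Map each RNA with at least one partner at R > threshold to its partner count."""
    degree = {name: 0 for name in expr}
    for a, b in combinations(expr, 2):
        if pearson(expr[a], expr[b]) > threshold:
            degree[a] += 1
            degree[b] += 1
    # RNAs with no co-expression partner are left out of the network.
    return {name: d for name, d in degree.items() if d > 0}


# Hypothetical expression profiles across five tumors.
toy = {
    "rna_a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "rna_b": [2.0, 4.1, 5.9, 8.2, 9.8],
    "rna_c": [1.1, 1.9, 3.2, 3.9, 5.1],
    "rna_d": [5.0, 1.0, 4.0, 2.0, 3.0],
}
degrees = coexpression_degrees(toy)
print(degrees)
```

Here `rna_d` correlates negatively with the other three profiles, so it is excluded, while the remaining three each have degree 2. A tool such as Cytoscape would then map that degree to node size and color.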