Milica Vukmirovic, Jose D Herazo-Maya, John Blackmon, Vesna Skodric-Trifunovic, Dragana Jovanovic, Sonja Pavlovic, Jelena Stojsic, Vesna Zeljkovic, Xiting Yan, Robert Homer, Branko Stefanovic, Naftali Kaminski.
Abstract
BACKGROUND: Idiopathic Pulmonary Fibrosis (IPF) is a lethal lung disease of unknown etiology. A major limitation in transcriptomic profiling of lung tissue in IPF has been a dependence on snap-frozen fresh tissues (FF). In this project we sought to determine whether genome-scale transcript profiling using RNA Sequencing (RNA-Seq) could be applied to archived Formalin-Fixed Paraffin-Embedded (FFPE) IPF tissues.
Keywords: DEGs; FFPE; Idiopathic Pulmonary Fibrosis; MMP7; Microarray; NanoString nCounter®; Network; Pathways; RNA-Seq; Validation
Year: 2017 PMID: 28081703 PMCID: PMC5228096 DOI: 10.1186/s12890-016-0356-4
Source DB: PubMed Journal: BMC Pulm Med ISSN: 1471-2466 Impact factor: 3.317
Fig. 1 Study design. Summary of the study cohorts, sequencing approaches and data analysis. Arrows show how experiments were performed for each cohort and how the data sets were compared. The microarray data are a publicly available dataset (GSE47460)
Summary of RNA-Seq (FFPE) gene mapping
| FFPE sample | RNA-Seq ID | Original reads | QC-failed reads | Unmapped reads | Mapped reads | Hits | Proper hits | Mapping rate | Hits rate | Proper hits rate |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | SL32670 | 109112224 | 1585386 | 55426188 | 52100650 | 61090707 | 49057850 | 0.4775 | 0.5599 | 0.4496 |
| 2 | SL32671 | 112024858 | 1901325 | 65686848 | 44436685 | 52541559 | 40745436 | 0.3967 | 0.469 | 0.3637 |
| 3 | SL32674 | 137705852 | 1225400 | 58878848 | 77601604 | 93224772 | 74057010 | 0.5635 | 0.677 | 0.5378 |
| 4 | SL32677 | 114532074 | 3536592 | 65895279 | 45100203 | 52390752 | 36239592 | 0.3938 | 0.4574 | 0.3164 |
| 5 | SL32679 | 94880502 | 1404145 | 41223487 | 52252870 | 60762016 | 46203712 | 0.5507 | 0.6404 | 0.487 |
| 6 | SL32680 | 98383774 | 3001192 | 45024767 | 50357815 | 58405722 | 44600578 | 0.5119 | 0.5937 | 0.4533 |
| 7 | SL32681 | 76149832 | 2406659 | 57793748 | 15949425 | 17505889 | 8309682 | 0.2094 | 0.2299 | 0.1091 |
| 1C | SL32683 | 191953246 | 1025564 | 44983418 | 145944264 | 1.72E+08 | 135808266 | 0.7603 | 0.8955 | 0.7075 |
| 2C | SL71047 | 126656896 | 4828921 | 72935707 | 48892268 | 59052405 | 40219390 | 0.386 | 0.4662 | 0.3175 |
| 3C | SL71048 | 92915968 | 5523822 | 45167979 | 42224167 | 51054887 | 38389264 | 0.4544 | 0.5495 | 0.4132 |
| 4C | SL71049 | 130059384 | 8904855 | 55527773 | 65626756 | 82086047 | 60494026 | 0.5046 | 0.6311 | 0.4651 |
| 5C | SL71050 | 121812940 | 7765072 | 43474914 | 70572954 | 83920938 | 60927332 | 0.5794 | 0.6889 | 0.5002 |
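The three rate columns appear to be each count divided by the original read count, and the original reads appear to equal the sum of QC-failed, unmapped and mapped reads. The following is a minimal sketch that checks this reading against three rows copied from the table; the dictionary keys and function names are illustrative assumptions, not part of the authors' pipeline.

```python
# Counts copied from three rows of the mapping summary table.
# Key names are hypothetical labels chosen for this sketch.
ROWS = {
    "SL32670": dict(orig=109112224, qc_failed=1585386, unmapped=55426188,
                    mapped=52100650, hits=61090707, proper_hits=49057850),
    "SL32674": dict(orig=137705852, qc_failed=1225400, unmapped=58878848,
                    mapped=77601604, hits=93224772, proper_hits=74057010),
    "SL32681": dict(orig=76149832, qc_failed=2406659, unmapped=57793748,
                    mapped=15949425, hits=17505889, proper_hits=8309682),
}


def summarize(row):
    """Return (mapping rate, hits rate, proper hits rate), each as a
    fraction of the original read count, rounded to 4 decimals."""
    orig = row["orig"]
    return tuple(round(row[key] / orig, 4)
                 for key in ("mapped", "hits", "proper_hits"))


def reads_balance(row):
    """True if QC-failed + unmapped + mapped reads equal the original reads."""
    return row["qc_failed"] + row["unmapped"] + row["mapped"] == row["orig"]


for library, row in ROWS.items():
    print(library, summarize(row), reads_balance(row))
```

For these three libraries the computed rates match the tabulated values to four decimals and the read counts balance, which supports reading every rate as a fraction of the original reads rather than of the mapped reads.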
Fig. 2 MDS analysis based on gene expression demonstrates a clear separation between the IPF and control FFPE samples. The top three MDS dimensions, based on the top 5,000 genes differentially expressed between IPF and control, were plotted using the edgeR package for data visualization. Each dot is one sample. Blue represents IPF and black represents control
Fig. 3 Direct comparison of gene expression between RNA-Seq (FFPE) and microarray data (FF). a Microarray Log2(FC) for IPF vs control is plotted on the x axis and RNA-Seq Log2(FC) for IPF vs control on the y axis. Yellow dots indicate genes increased in both datasets, purple dots indicate genes decreased in both datasets, grey dots indicate genes with discordant patterns of differential expression, and white dots indicate genes that are not significant in either dataset or are significant in only one of them. b The yellow Venn diagram shows the overlap between genes increased in RNA-Seq and in microarrays (FDR adjusted p < 0.05): 760 genes are increased in both, 940 are increased in RNA-Seq only, and 1,546 are increased in microarrays only. The purple Venn diagram shows the overlap between genes decreased in the two datasets (FDR adjusted p < 0.05) and is read the same way
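The colour scheme in panel a amounts to a per-gene decision rule on the sign of Log2(FC) and the FDR-adjusted p value in each dataset. The sketch below is one possible way to express that rule, assuming the 0.05 cutoff from the caption; the function name, the gene labels and all input values are made up for illustration and are not taken from the study data.

```python
ALPHA = 0.05  # FDR cutoff stated in the caption


def classify(fc_array, fdr_array, fc_seq, fdr_seq, alpha=ALPHA):
    """Assign one gene to a Fig. 3a colour category."""
    if not (fdr_array < alpha and fdr_seq < alpha):
        return "white"   # not significant in both datasets
    if fc_array > 0 and fc_seq > 0:
        return "yellow"  # increased in both datasets
    if fc_array < 0 and fc_seq < 0:
        return "purple"  # decreased in both datasets
    return "grey"        # significant in both, opposite directions


# Hypothetical genes with invented values:
# (microarray log2FC, microarray FDR, RNA-Seq log2FC, RNA-Seq FDR)
toy_genes = {
    "gene_A": (1.8, 0.001, 2.3, 0.004),
    "gene_B": (-1.1, 0.010, -0.9, 0.020),
    "gene_C": (0.7, 0.030, -0.6, 0.040),
    "gene_D": (1.2, 0.200, 1.5, 0.010),
}

for name, values in toy_genes.items():
    print(name, classify(*values))

# Venn arithmetic for increased genes, using the counts in the caption.
common, seq_only, array_only = 760, 940, 1546
increased_in_seq = common + seq_only      # 1700
increased_in_array = common + array_only  # 2306
print(increased_in_seq, increased_in_array)
```

The last lines only restate the panel b counts: adding the shared genes to each dataset-specific count gives the total number of increased genes per platform.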
Fig. 4 Heat map of top-scoring signaling pathways enriched in commonly increased and decreased genes from RNA-Seq (FFPE) and microarrays (FF). Every row represents a gene and every column represents a signaling pathway. The most significant signaling pathways for commonly increased genes are shown in yellow and those for commonly decreased genes are shown in purple. Pathway enrichment analysis was done in MetaCore, and the full list of pathways can be found in Additional file 5: Table S2 and Additional file 6: Table S3. Only genes with a fold change above one are shown in the heat map
Fig. 5 MMP7 network analysis of RNA-Seq (FFPE) and microarray (FF) data, performed independently. All differentially expressed genes from RNA-Seq (4,131) and from microarrays (5,859) were submitted to MetaCore to build and draw the network around the MMP7 gene, a network gene common to both datasets. Increased genes are marked in yellow and decreased genes in purple for both datasets (RNA-Seq (a) and microarrays (b)). Gene homologues with mixed expression values are marked with yellow/purple*. The remaining genes, which are present in the network but were not detected in our datasets, belong to a pre-built network for MMP7 in the MetaCore database. Canonical pathways identified in the network are marked in light blue. Red arrows represent an inhibitory effect between two genes in the network and green arrows represent an activating effect. The function of each network gene is indicated by a different shape and explained in the figure legend. *Please note that the yellow/purple colors were added manually, instead of the red/blue originally proposed by MetaCore, to keep gene expression visualization consistent throughout the manuscript
Fig. 6 Validation of gene expression in fresh frozen and FFPE tissues using NanoString nCounter®. a Microarray Log2(FC) for IPF vs control (FF) is plotted on the x axis and RNA-Seq Log2(FC) for IPF vs control (FFPE) on the y axis. b NanoString Log2(FC) for IPF vs control (FF) is plotted on the x axis and NanoString Log2(FC) for IPF vs control (FFPE) on the y axis. 15 concordant genes, 10 discordant genes, and 10 dataset-specific genes were analyzed. Gene names and categories are labeled