Yong Huang, Shwu-Fan Ma, Rekha Vij, Justin M Oldham, Jose Herazo-Maya, Steven M Broderick, Mary E Strek, Steven R White, D Kyle Hogarth, Nathan K Sandbo, Yves A Lussier, Kevin F Gibson, Naftali Kaminski, Joe G N Garcia, Imre Noth.
Abstract
BACKGROUND: The course of disease for patients with idiopathic pulmonary fibrosis (IPF) is highly heterogeneous. Prognostic models rely on demographic and clinical characteristics and are not reproducible. Integrating data from genomic analyses may identify novel prognostic models and provide mechanistic insights into IPF.
Year: 2015 PMID: 26589497 PMCID: PMC4654815 DOI: 10.1186/s12890-015-0142-8
Source DB: PubMed Journal: BMC Pulm Med ISSN: 1471-2466 Impact factor: 3.317
Demographic and Clinical Characterizations among Study Cohorts
| Characteristic | Training cohort | UCV cohort | UPV cohort | p-value |
|---|---|---|---|---|
| Age, mean (±SD) | 67.1 (8.2) | 68.9 (8.2) | 68.5 (7.8) | 0.48 |
| Male gender, n (%) | 40 (90) | 15 (71.4) | 52 (69.3) | 0.05 |
| White race, n (%) | 37 (82.2) | 18 (81.8) | 73 (97.3) | 0.004 |
| Follow-up months, mean (±SD) | 18.8 (11.9) | 43.8 (29.4) | 23.5 (12.7) | <0.001 |
| Months to death, mean (±SD) | 12.7 (10.9) | 26.8 (20.1) | 14.2 (10.6) | 0.02 |
| FVC % predicted, mean (±SD) | 60.6 (14.3) | 64.7 (12.7) | 65.4 (16.7) | 0.25 |
| DLCO % predicted, mean (±SD) | 43.4 (17.7) | 43.2 (15.6) | 48.9 (18.6) | 0.19 |
| CPI, mean (±SD) | 55.6 (13) | 54.7 (10.7) | 50.7 (13.7) | 0.11 |
| Lung transplantation, n (%) | 1 (2.2) | 2 (9.5) | 15 (20) | 0.009 |
Fig. 1 Correlation of gene co-expression modules with clinical traits in the training cohort (n = 45). Gene co-expression modules were constructed using the R package WGCNA (see Methods and Additional file 1 for details) and are denoted by different colors. The parameters for topological overlap matrix generation and unsupervised gene clustering are displayed in Additional file 3: Figure S2A & S2B. The number of genes in each module is labeled on the left. The module eigengene is the first principal component of each gene module computed across all samples. The correlation of each module eigengene with each clinical trait was determined by Pearson’s correlation and is displayed in the corresponding box (coefficient on top, p-value in parentheses below). The color of each box represents the direction of correlation (red) or anti-correlation (green), and the degree of correlation is scaled by the bar on the right. Traits significantly associated with specific modules are highlighted with a purple frame. FVC % predicted = forced vital capacity percent predicted; DLCO % predicted = diffusion capacity of carbon monoxide percent predicted; CPI = composite physiologic index
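The module-trait correlation described in this caption can be sketched in a few lines. This is a minimal illustration, not the WGCNA implementation: it assumes the module eigengene has already been computed and is passed in as one value per patient, and the function name `pearson_r` and the example numbers are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical values: one eigengene value and one trait value per patient.
eigengene = [-0.30, -0.10, 0.05, 0.15, 0.20]
fvc_percent_predicted = [48.0, 55.0, 61.0, 66.0, 72.0]
r = pearson_r(eigengene, fvc_percent_predicted)
```

A positive `r` would correspond to a red box in the heat map and a negative `r` to a green box; the p-value shown in each box would come from a separate significance test that is not reproduced here.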
Fig. 2 Compilation and functional characterization of the IPF prognostic predictor gene set. a A flowchart illustrates the procedures and approaches used to compile the IPF prognostic predictor gene set. Left panel: Array data processing. Affymetrix Exon 1.0 ST Array data were normalized, probe sets were mapped to the U133 plus 2 Array, and probe sets were filtered based on redundancy, intensity, and coefficient of variation across all samples. Middle panel: IPF prognostic predictor gene set compilation. Three approaches were used to compile the gene set: co-expressed gene modules correlated with pulmonary function, identified by WGCNA; differentially expressed genes between “good” and “poor” prognosis patients, identified by SAM (fold change > 1.5 & FDR < 2.5 %); and survival-correlated genes, identified by Cox regression (p < 0.005). Right panel: Genomic model for IPF prognosis prediction. The IPF prognostic predictor gene set was used to construct a genomic model; a Prognostic Index (PI) score was calculated for each patient in the training cohort; prediction specificity was assessed by 10-fold cross-validation; and the genomic model was validated in two independent cohorts using the PI weights calculated from the training cohort. b Venn diagram illustrating the selection criteria for IPF prognostic predictor genes. A total of 118 genes were compiled for downstream data analyses. c Canonical pathways enriched among the IPF prognostic predictor genes, identified with Ingenuity Pathway Analysis software. Pathways were considered significant at q-value < 0.05 (i.e. -log (q-value) > 1.3) using a one-tailed Fisher’s exact test. The X-axis represents -log (q-value)
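The numeric cutoffs quoted in this caption can be written out as simple checks. The sketch below is illustrative only: the function names are hypothetical, the fold-change cutoff is assumed to apply to the magnitude of a signed fold change, and the way the three approaches are combined in the Venn diagram is not modeled because the caption does not state it.

```python
import math

FOLD_CHANGE_CUTOFF = 1.5   # SAM: fold change > 1.5
FDR_CUTOFF_PERCENT = 2.5   # SAM: FDR < 2.5 %
COX_P_CUTOFF = 0.005       # Cox regression: p < 0.005
Q_VALUE_CUTOFF = 0.05      # pathway enrichment: q-value < 0.05


def neg_log10(value: float) -> float:
    """Return -log10(value), the scale used on the X-axis of panel c."""
    return -math.log10(value)


def passes_sam(fold_change: float, fdr_percent: float) -> bool:
    # Assumption: the cutoff is applied to the magnitude of a signed fold change.
    return abs(fold_change) > FOLD_CHANGE_CUTOFF and fdr_percent < FDR_CUTOFF_PERCENT


def passes_cox(p_value: float) -> bool:
    return p_value < COX_P_CUTOFF


def pathway_is_significant(q_value: float) -> bool:
    # q < 0.05 is the same condition as -log10(q) > -log10(0.05), about 1.3.
    return q_value < Q_VALUE_CUTOFF


threshold_on_log_scale = neg_log10(Q_VALUE_CUTOFF)
```

Here `threshold_on_log_scale` evaluates to roughly 1.3, which is where the "-log (q-value) > 1.3" criterion in the caption comes from.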
List of 118 IPF prognostic predictor genes within the red, black and turquoise gene modules
| Gene | FC | Gene | FC | Gene | FC | Gene | FC |
|---|---|---|---|---|---|---|---|
| IL1R2¥ | 2.0 | PPWD1 | −1.7 | ASF1A | −1.6 | ABCD2 | −1.5 |
| ERAF§ | 2.0 | CETN3 | −1.6 | LMO7 | −1.6 | GZMK | −1.5 |
| CEACAM8¥ | 1.8 | SH2D1A | −1.6 | GCET2 | −1.6 | TRIM52 | −1.5 |
| ARG1¥ | 1.6 | SLC39A10 | −1.6 | PAQR8 | −1.6 | C8orf15 | −1.5 |
| FOXO3§ | 1.5 | SHPRH | −1.6 | BIRC3 | −1.6 | ITK | −1.5 |
| TNS1§ | 1.5 | WDR75 | −1.6 | CAMK4 | −1.6 | ICOS | −1.5 |
| CYP4F2¥ | 1.5 | C14orf64 | −1.6 | ZC3H6 | −1.6 | FHIT | −1.5 |
| CYP4F3¥ | 1.5 | KPNA5 | −1.6 | CD28 | −1.6 | TSEPA | −1.5 |
| ARHGAP5 | −1.8 | NOP58 | −1.6 | GTPBP10 | −1.6 | NPCDR1 | −1.5 |
| ORC3L | −1.8 | PARP15 | −1.6 | C5orf51 | −1.6 | OXNAD1 | −1.5 |
| ZNF100 | −1.8 | PRO0471 | −1.6 | TRBC1 | −1.6 | IL7R | −1.5 |
| UTP15 | −1.8 | RCAN3 | −1.6 | CAMK2D | −1.5 | HLA-DQA1 | −1.5 |
| ANKRD36B | −1.8 | C7orf64 | −1.6 | PPM1K | −1.5 | TMEM156 | −1.5 |
| LOC399753 | −1.8 | ANKRD36 | −1.6 | CCDC76 | −1.5 | HLA-DQA1 | −1.5 |
| KCNA3 | −1.8 | GPR174 | −1.6 | CASD1 | −1.5 | LOC401397 | −1.5 |
| RHOH | −1.8 | NDUFAF4 | −1.6 | P2RY10 | −1.5 | CDK6 | −1.5 |
| LCK | −1.8 | CCDC141 | −1.6 | DPP4 | −1.5 | GCNT4 | −1.5 |
| C16orf52 | −1.7 | GPR18 | −1.6 | S1PR1 | −1.5 | NELL2 | −1.5 |
| TC2N | −1.7 | DDX60 | −1.6 | ITGA6 | −1.5 | FLJ33630 | −1.5 |
| HIVEp | −1.7 | TMEM209 | −1.6 | GBP4 | −1.5 | TRAT1 | −1.5 |
| KIF3A | −1.7 | GVIN1 | −1.6 | ABCE1 | −1.5 | LEF1 | −1.5 |
| IFT80 | −1.7 | TMEM161B | −1.6 | TXK | −1.5 | FCRL3 | −1.5 |
| TIA1 | −1.7 | USP53 | −1.6 | TRAF5 | −1.5 | GUSBL2 | −1.5 |
| ZNF83 | −1.7 | TRAJ17 | −1.6 | SLAMF6 | −1.5 | SEPSECS | −1.5 |
| SETDB2 | −1.7 | MRPL1 | −1.6 | CD96 | −1.5 | BTLA | −1.5 |
| WDR36 | −1.7 | SNORD116 | −1.6 | PRKACB | −1.5 | | |
| ZNF141 | −1.7 | GPR171 | −1.6 | ALG10B | −1.5 | | |
| TRBC1 | −1.7 | MGC40069 | −1.6 | NBPF10 | −1.5 | | |
| FAM69A | −1.7 | LOC439949 | −1.6 | MGAT4A | −1.5 | | |
| C1GALT1 | −1.7 | CCR7 | −1.6 | INPP4B | −1.5 | | |
| GIMAP5 | −1.7 | NUP43 | −1.6 | STAT4 | −1.5 | | |
FC = fold change; ¥ denotes genes in the red module; § denotes genes in the black module; the remaining genes are in the turquoise module
Fig. 3 Genomic model and 10-fold cross-validation results. a A genomic model was constructed from the 118 IPF prognostic predictor genes using the “Survival risk group prediction” algorithm implemented in BRB-ArrayTools (see Additional file 1), followed by 10-fold cross-validation (CV) to calculate the misclassification rate. Formula of the genomic model: prognostic index $\mathrm{PI} = \sum_i W_i X_i + 13.5$, where $W_i$ and $X_i$ represent the weight and log-intensity of the i-th gene in the IPF prognostic predictor gene set compiled from the training cohort, respectively. The misclassification rate (20 %) was determined by 10-fold CV and computed as (k + n)/total cases, where k is the number of patients predicted as high risk but observed to have a good prognosis, and n is the number of patients predicted as low risk but observed to have a poor prognosis. b IPF patients with predicted low (dotted line) and high risk (dashed line), stratified by the prognostic index (PI) derived for each patient in the training cohort from the genomic model. The red line denotes 50 % probability of survival. PI independently predicted survival in univariate competing-risk Cox regression (sub-hazard ratio (SHR) 2.7; 95 % CI 1.9-3.9; p < 0.001) and in multivariate competing-risk Cox regression after adjustment for baseline CPI (SHR 2.3; 95 % CI 1.5-3.4; p < 0.001)
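The two quantities defined in this caption, the prognostic index and the misclassification rate, can be sketched as follows. This is a minimal illustration rather than the BRB-ArrayTools implementation: the function names, the two-gene weights, and the five-patient labels are hypothetical, and only the constant 13.5 and the (k + n)/total formula come from the caption.

```python
from typing import Sequence

INTERCEPT = 13.5  # constant term quoted in the Fig. 3 formula


def prognostic_index(weights: Sequence[float],
                     log_intensities: Sequence[float],
                     intercept: float = INTERCEPT) -> float:
    """PI = sum_i W_i * X_i + intercept for one patient."""
    if len(weights) != len(log_intensities):
        raise ValueError("one weight is needed per gene")
    return sum(w * x for w, x in zip(weights, log_intensities)) + intercept


def misclassification_rate(predicted_high_risk: Sequence[bool],
                           observed_poor_prognosis: Sequence[bool]) -> float:
    """(k + n) / total cases, as defined in the caption."""
    if len(predicted_high_risk) != len(observed_poor_prognosis):
        raise ValueError("one prediction is needed per patient")
    pairs = list(zip(predicted_high_risk, observed_poor_prognosis))
    # k: predicted high risk but observed good prognosis.
    k = sum(1 for pred, obs in pairs if pred and not obs)
    # n: predicted low risk but observed poor prognosis.
    n = sum(1 for pred, obs in pairs if not pred and obs)
    return (k + n) / len(pairs)


# Hypothetical two-gene example (the real model uses 118 genes).
pi = prognostic_index([0.5, -0.25], [8.0, 6.0])

# Hypothetical five-patient example.
rate = misclassification_rate(
    [True, True, False, False, True],
    [True, False, False, True, True],
)
```

In this toy example one patient falls in each error category, so the rate is 2 of 5. How the PI is turned into a low-risk or high-risk call is not stated in the caption and is therefore left out of the sketch.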
Fig. 4 IPF genomic model predicts prognosis in two independent validation cohorts. The prognosis prediction specificity was assessed in the University of Chicago validation cohort (UCV, panel a) and the University of Pittsburgh validation cohort (UPV, panel b). IPF patients with predicted low (dotted line) and high risk (dashed line) were stratified by the prognostic index (PI) derived for each patient in the UCV and UPV cohorts from the genomic model. The red line denotes 50 % probability of survival. PI significantly predicted survival in univariate competing-event Cox regression in the UCV (SHR 2.0; 95 % CI 1.2-3.4; p = 0.005) and UPV (SHR 1.8; 95 % CI 1.1-2.7; p = 0.01) cohorts. This association remained in the UCV (SHR 1.7; 95 % CI 1.04-2.93; p = 0.035) and UPV (SHR 1.9; 95 % CI 1.2-3.0; p = 0.005) cohorts after adjusting for baseline CPI in multivariate Cox regression