Vincent Gardeux, Joanne Berghout, Ikbel Achour, A Grant Schissler, Qike Li, Colleen Kenost, Jianrong Li, Yuan Shang, Anthony Bosco, Donald Saner, Marilyn J Halonen, Daniel J Jackson, Haiquan Li, Fernando D Martinez, Yves A Lussier.
Abstract
OBJECTIVE: To introduce a disease prognosis framework enabled by a robust classification scheme derived from patient-specific transcriptomic response to stimulation.
Keywords: HRV stimulation; PBMC stimulated; asthma; dynamic expression; gene-by-environment; genomic classifier; pathways; personal transcriptome; precision medicine; prognostic; virogram
Year: 2017 PMID: 29016970 PMCID: PMC6080688 DOI: 10.1093/jamia/ocx069
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
Figure 1. Overview of the experimental and analytic design of the study. We hypothesize that a classifier based on the pathway-level transcriptional responses that differ between symptomatic and asymptomatic responses to HRV infection in healthy patients can predict which asthmatic patients will have exacerbations over a 1-year follow-up period, based on those patients’ transcriptomic responses to an in vitro HRV stimulation assay. Panel A illustrates the development of a classifier using innovative features. Both shared and unique genetic and nongenetic variables influence transcript expression and the healthy vs disease state within an individual. Exposure to a relevant stimulus (here, HRV infection) reveals relevant pathway gene sets whose genome- and environment-informed responses (G × E response pathways) can be used to predict individual prognosis. Panel B describes the development of the classifier from the training set using data from PBMCs of healthy volunteers exposed to HRV in vivo. For each patient, paired microarrays analyzing gene expression before and after HRV exposure were compared using N-of-1-pathways analysis to identify significant Gene Ontology biological process (GO-BP) features describing each response. Responses in asymptomatic patients were then compared to responses in symptomatic patients to develop the classifier. Panel C describes the laboratory stimulation assay of asthmatic patients' PBMCs in this study. As in the training set, paired microarrays for each patient were used to determine the response using N-of-1-pathways. The classifier developed in panel B was then applied to individual responses and used to predict recurrent exacerbation.
Descriptions of datasets
| Dataset | | Training set | Validation set |
|---|---|---|---|
| Purpose | | Learn classifier | Predict outcome |
| Source | Authors | Zaas et al. | Present study |
| | Data source | GSE17156 (downloaded 9/17/2014) | GSE68479 |
| Platform | Microarray | Affy. Human Gene U133A 2.0 | Affy. Human Gene 1.0ST |
| | Probes | 22277 | 33297 |
| Protocol | | Inhaled HRV; PBMC samples drawn before and 48 h after HRV inoculation | PBMCs isolated, then incubated in vitro with HRV (stimulated) or vehicle (unstimulated) |
| Subjects | Total | 19 healthy adult volunteers | 23 pediatric asthmatic patients |
| | | □ 10 symptomatic for common cold | □ 12 recurrent exacerbations of asthma (hospitalizations and/or emergency room visits) |
| | | □ 9 asymptomatic | □ 11 no exacerbation of asthma |
| Samples | Total | 38 RNA microarrays | 46 RNA microarrays |
| | | □ 19 PBMC drawn prior to infection | □ 23 PBMC unstimulated |
| | | □ 19 PBMC drawn after HRV inhalation | □ 23 PBMC HRV-stimulated in vitro |
Gene ontology (GO) gene sets of the G × E classifier obtained in the training set
| Class of GO-BP gene sets | GO ID | GO term |
|---|---|---|
| I. Acquired Immunity | GO:0002768* | immune response–regulating cell surface receptor |
| | ↪GO:0002429 | immune response–activating cell surface receptor |
| | ↪GO:0050851 | antigen receptor–mediated signaling pathway |
| | ↪GO:0050852 | T cell receptor signaling pathway |
| II. Innate Immune Response | GO:0045087* | innate immune response |
| | ↪GO:0060337 | type I interferon-mediated signaling pathway |
| | (p)↪GO:0034340 | response to type I interferon |
| | (i)↪GO:0071357 | cellular response to type I interferon |
| III. Morphogenesis | GO:0050807 | regulation of synapse organization |
| | GO:0001658 | branching involved in ureteric bud morphogenesis |
| | GO:0060688 | regulation of morphogenesis of a branching structure |
| | GO:2000027 | regulation of organ morphogenesis |
| IV. Response to Stimulus | GO:0043279 | response to alkaloid |
| | GO:0050795 | regulation of behavior |
| | GO:0009581 | detection of external stimulus |
| V. Chromatin Organization | GO:0006325* | chromatin organization |
| | ↪GO:0016568 | chromatin modification |
| | ↪GO:0016569 | covalent chromatin modification |
| | ↪GO:0016570 | histone modification |
| | GO:0006913 | nucleocytoplasmic transport |
Twenty GO biological processes (GO-BPs) responsive to HRV stimulation were selected by the best classifier after evaluation in the validation set (Materials and Methods section). GO-BPs were organized into 5 categories by an unbiased information theoretic similarity score (Materials and Methods section), and these classes were manually assigned representative names. Of note, some GO-BPs were ordered according to their GO hierarchy when available, with the parent term annotated with an asterisk and located above the child term. For example, GO:0002768 is the parent of GO:0002429. Legend: ↪ = “is a” (parent-child relationship); (p)↪ = “part of” (parent-child relationship); (i)↪ = inferred as “is a” (parent-child relationship).
Figure 2. Metrics derived from responsive pathways discriminate asymptomatic from symptomatic subjects in the training set of in vivo HRV-stimulation data. In panel A, principal component analysis was conducted using responsive gene sets derived from each subject in the training set (Figure 1B and Materials and Methods section). The scatter plot on the left illustrates the bivariate relationship of the first and third principal components for each patient’s ternary-represented N-of-1-pathways scores derived from paired samples of PBMCs collected at baseline and after HRV exposure. Each point represents a subject in the training set through a linear combination of pathway gene set–level scores that explain the maximal variation in the data (see Materials and Methods section for details on the ternary representation and PCA construction). The first and third principal components show 2 clusters emerging that separate asymptomatic from symptomatic individuals. Thus, N-of-1-pathways scores are associated with the phenotype of interest. On the right, side-by-side box plots display the first principal component scores among asymptomatic and symptomatic subjects, and this component alone significantly dichotomizes the 2 phenotypes (Mann-Whitney U test, P = .0069). Panel B lists the 20 Gene Ontology biological processes used as features in our classifier, organized according to broad biological function categories.
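The group comparison reported for the first principal component can be illustrated with a minimal sketch of a Mann-Whitney U test. This is not the authors' code: the function name `mann_whitney_u` and the scores below are hypothetical, and the p-value uses a plain normal approximation (no tie or continuity correction), so it will not reproduce the published P = .0069.

```python
import math

def mann_whitney_u(x, y):
    """Return (U for x, two-sided p) using a normal approximation.

    U for x counts the (xi, yj) pairs with xi > yj; ties count as 0.5.
    Assumption: no tie or continuity correction is applied.
    """
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    n1, n2 = len(x), len(y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|))
    p = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p

# Hypothetical first-principal-component scores, for illustration only
asymptomatic = [-2.1, -1.4, -0.9, -1.8, -0.3]
symptomatic = [0.7, 1.9, 1.2, -0.5, 2.4]
u, p = mann_whitney_u(asymptomatic, symptomatic)
print(f"U = {u}, approximate two-sided p = {p:.4f}")
```

With groups as small as those in the training set (9 and 10 subjects), an exact test is usually preferable to the normal approximation used here.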
Figure 3. Fully specified classifier derived from responsive pathways in the training set performs accurately in the independent asthmatic validation/virogram set to predict exacerbation status. Panel A shows the receiver operating characteristic (ROC) curve of the G × E classifier in the validation set (overall accuracy, 74%; sensitivity, 75%; specificity, 73%). In panel B, the star plots illustrate the level of response to HRV stimulation for each pathway in that patient. The classifier is designed using 20 pathways, with each radial line representing the score of a pathway. The area above and below the gray zone represents upregulation and downregulation, respectively, of any given pathway (see Materials and Methods section for complete details). In panel C, each star plot represents a single subject, with the label appearing above (eg, Subject 16 = SUB16). The star plot is located in the quadrant of the contingency table that represents the performance of the G × E classifier in predicting the clinical progression from a specific asthmatic patient’s data. This classifier prediction applied to the HRV-stimulation assay recapitulated the clinical progression in 17 out of the 23 asthmatic patients; 6 were misclassified: 3 false positives (SUBM1, SUB09, SUB23) and 3 false negatives (SUB2, SUB23, and SUBM2). One can straightforwardly identify that innate immunity (mauve) is upregulated in every asthmatic patient and does not contribute to the classification in the validation set. On the other hand, observed upregulation in acquired immunity (orange) pathways could be used to correctly classify 19 subjects.
Performance of 5 different pathway response classifiers built on N-of-1-pathway scores in the training set (in vivo) and validated without bias on N-of-1-pathway scores in the validation set (in vitro). The Random Forest (bold) classifier showed the highest scores and was chosen for all other analyses.
Classification performance in the validation set (metrics in %):

| Classifier | #Features (GO-BP terms) | Accuracy | Sensitivity | Specificity | Precision | AUC | TP | FP | FN | TN |
|---|---|---|---|---|---|---|---|---|---|---|
| Naïve Bayes | 20 | | 75 | 72.7 | 75 | 67.4 | 9 | 3 | 3 | 8 |
| Decision Tree | 20 | | 75 | 54.5 | 64.3 | 64.8 | 9 | 5 | 3 | 6 |
| Support Vector Machine | 20 | | 75 | 63.6 | 69.2 | 69.3 | 9 | 4 | 3 | 7 |
| Nearest Neighbor | 20 | | 66.7 | 72.7 | 72.7 | 69.7 | 8 | 3 | 4 | 8 |
The random forest and naïve Bayes classifiers showed the highest metrics on the validation set, and random forest was selected for further optimization.
TP: true positive; FP: false positive; FN: false negative; TN: true negative
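The percentage metrics in the table follow directly from the TP/FP/FN/TN counts. As a minimal sketch (the helper name `confusion_metrics` is hypothetical, not from the paper), the standard definitions can be checked against the Naïve Bayes row:

```python
def confusion_metrics(tp, fp, fn, tn):
    """Compute percentage metrics from confusion-matrix counts.

    Returns None for a metric whose denominator is zero
    (for example, precision when nothing is predicted positive).
    """
    def pct(num, den):
        return None if den == 0 else round(100.0 * num / den, 1)

    return {
        "accuracy": pct(tp + tn, tp + fp + fn + tn),
        "sensitivity": pct(tp, tp + fn),   # true positive rate
        "specificity": pct(tn, tn + fp),   # true negative rate
        "precision": pct(tp, tp + fp),     # positive predictive value
    }

# Counts taken from the Naïve Bayes row (TP=9, FP=3, FN=3, TN=8)
nb = confusion_metrics(tp=9, fp=3, fn=3, tn=8)
print(nb)  # {'accuracy': 73.9, 'sensitivity': 75.0, 'specificity': 72.7, 'precision': 75.0}
```

AUC cannot be derived from a single confusion matrix, since it depends on the classifier's scores across all thresholds. Rounding conventions may also differ slightly from those used in the published tables.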
Our classifier performed optimally across all metrics using 20 gene-set (GO-BP term) features
Classification performance in the validation set (metrics in %):

| Classifier | #Features | Accuracy | Sensitivity | Specificity | Precision | AUC | TP | FP | FN | TN |
|---|---|---|---|---|---|---|---|---|---|---|
| Random Forest | 10 | 47.8 | 0 | 100 | NA | 68.2 | 0 | 0 | 12 | 11 |
| Random Forest | 20 | 73.9 | 75.0 | 72.7 | 75.0 | 71.2 | 9 | 3 | 3 | 8 |
| Random Forest | 30 | 69.5 | 58.3 | 81.8 | 77.8 | 65.5 | 7 | 2 | 5 | 9 |
TP: true positive; FP: false positive; FN: false negative; TN: true negative