Vincent M Tutino, Kerry E Poppenberg, Lu Li, Hussain Shallwani, Kaiyu Jiang, James N Jarvis, Yijun Sun, Kenneth V Snyder, Elad I Levy, Adnan H Siddiqui, John Kolega, Hui Meng.
Abstract
BACKGROUND: Intracranial aneurysms (IAs) are dangerous because of their potential to rupture and cause deadly subarachnoid hemorrhages. Previously, we found significant RNA expression differences in circulating neutrophils between patients with unruptured IAs and aneurysm-free controls. Searching for circulating biomarkers for unruptured IAs, we tested the feasibility of developing classification algorithms that use neutrophil RNA expression levels from blood samples to predict the presence of an IA.
Keywords: Inflammation; Intracranial aneurysm; Machine learning; Neutrophils; Transcriptomics
Year: 2018 PMID: 30593281 PMCID: PMC6310942 DOI: 10.1186/s12967-018-1749-3
Source DB: PubMed Journal: J Transl Med ISSN: 1479-5876 Impact factor: 5.531
Clinical characteristics
| Characteristic | Training cohort: Control (n = 15) | Training cohort: Aneurysm (n = 15) | Testing cohort: Control (n = 5) | Testing cohort: Aneurysm (n = 5) |
|---|---|---|---|---|
| Age (mean ± SE) | 59 ± 4.8 | 63 ± 2.8 | 63 ± 7.2 | 52.6 ± 6.6 |
| Age [median (Q1/Q3)] | 61 (52.5/71.5) | 64 (56.5/68.5) | 68 (62/71) | 53 (47/54) |
| Sex (% of patients) | | | | |
| Female | 40% | 66.67% | 60% | 40% |
| Smoker (% of patients) | | | | |
| Yes | 0% | 20% | 40% | 60% |
| Comorbidities (% of patients) | | | | |
| Hypertension | 60% | 60% | 60% | 20% |
| Heart disease | 6.67% | 26.67% | 40% | 0% |
| High cholesterol | 26.67% | 40% | 60% | 0% |
| Stroke history | 6.67% | 0% | 0% | 0% |
| Diabetes | 33.33% | 20% | 20% | 0% |
| Osteoarthritis | 20% | 33.33% | 20% | 0% |
Clinical characteristics of the randomly created training and testing cohorts. With the exception of age, these factors were quantified as binary data points. The clinical factors were retrieved from the patients’ medical records via the latest “Patient Medical History” form administered prior to imaging.
Fig. 1 Neutrophil RNA expression differences between patients with intracranial aneurysms (IA) and IA-free controls, feature selection for classification model creation, and model training. a Transcriptome profiling demonstrated 95 differentially expressed transcripts (q-value < 0.05) between patients with IA and controls. Of these, 26 had a false discovery rate (FDR) < 0.05 and an absolute fold change ≥ 1.5 (in red). b Principal component analysis (PCA) using these 26 transcripts demonstrated general separation between samples from patients with IA (60%, circled in red) and those from controls (80%, circled in blue). c Estimation of model performance during leave-one-out (LOO) cross-validation in the training cohort demonstrated that most models performed with an accuracy of 0.50–0.73. Among the classification models, diagonal linear discriminant analysis (DLDA) had the highest combination of sensitivity, specificity, and accuracy (0.67, 0.80, and 0.73, respectively). d Receiver operating characteristic (ROC) analysis using classifications in the training dataset showed that the models had areas under the curve of 0.54 (support vector machines [SVM]) to 0.73 (DLDA). (F-C: fold-change; ABS: absolute value; Cosine NN: cosine nearest neighbors; NSC: nearest shrunken centroids)
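To make the model-training step in panel c more concrete, the following is a minimal sketch of DLDA evaluated by leave-one-out cross-validation. It is not the authors' pipeline: the function names (`dlda_fit`, `dlda_predict`, `loo_accuracy`) are hypothetical, and the data are synthetic values shaped like the training cohort (30 samples, 26 features). DLDA is treated here as linear discriminant analysis with a pooled covariance matrix restricted to its diagonal.

```python
import numpy as np

def dlda_fit(X, y):
    """Fit diagonal LDA: class means plus one pooled variance per feature."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    # Residuals of each sample from its own class mean.
    resid = X - means[np.searchsorted(classes, y)]
    # Pooled within-class variance per feature (covariances are ignored).
    var = (resid ** 2).sum(axis=0) / (len(y) - len(classes))
    priors = np.array([(y == c).mean() for c in classes])
    return classes, means, var, priors

def dlda_predict(model, X):
    """Assign each row of X to the class with the highest discriminant score."""
    classes, means, var, priors = model
    # Variance-scaled squared distance from every sample to every class mean.
    dist = (((X[:, None, :] - means[None, :, :]) ** 2) / var).sum(axis=2)
    scores = -0.5 * dist + np.log(priors)
    return classes[np.argmax(scores, axis=1)]

def loo_accuracy(X, y):
    """Leave-one-out accuracy: refit on n - 1 samples, predict the held-out one."""
    hits = 0
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        model = dlda_fit(X[keep], y[keep])
        hits += int(dlda_predict(model, X[i:i + 1])[0] == y[i])
    return hits / len(y)

# Synthetic stand-in data (an assumption, not the study's measurements):
# 15 "controls" (label 0) and 15 "IA" samples (label 1), 26 features each.
rng = np.random.default_rng(0)
y_demo = np.array([0] * 15 + [1] * 15)
X_demo = rng.normal(size=(30, 26))
X_demo[y_demo == 1] += 3.0  # well-separated classes, for illustration only
demo_accuracy = loo_accuracy(X_demo, y_demo)
```

The sketch takes the feature set as given. Sensitivity and specificity can be obtained from the same loop by tallying held-out predictions per class.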
Gene ontology (GO) analysis
| Category | GO term | Description | p-value | q-value |
|---|---|---|---|---|
| Transcripts with higher expression in intracranial aneurysms (IA) | ||||
| Process | GO:0031347 | Regulation of defense response | 5.11E−06 | 0.0658 |
| Process | GO:0050727 | Regulation of inflammatory response | 1.01E−05 | 0.0652 |
| Process | GO:0019934 | cGMP-mediated signaling | 3.77E−05 | 0.162 |
| Process | GO:0032101 | Regulation of response to external stimulus | 3.90E−05 | 0.125 |
| Process | GO:0031348 | Negative regulation of defense response | 4.45E−05 | 0.115 |
| Process | GO:0050728 | Negative regulation of inflammatory response | 5.21E−05 | 0.112 |
| Process | GO:0007165 | Signal transduction | 6.64E−05 | 0.122 |
| Function | GO:0004908 | Interleukin-1 receptor activity | 2.25E−06 | 0.00858 |
| Function | GO:0004872 | Receptor activity | 7.22E−05 | 0.138 |
| Function | GO:0060089 | Molecular transducer activity | 7.22E−05 | 0.092 |
| Function | GO:0038023 | Signaling receptor activity | 1.32E−04 | 0.127 |
| Transcripts with lower expression in IA | ||||
| Function | GO:0043295 | Glutathione binding | 1.16E−04 | 0.148 |
| Function | GO:0046906 | Tetrapyrrole binding | 1.40E−04 | 0.134 |
Gene set enrichment analysis was performed on the 95 significantly differentially expressed genes (q < 0.05) in peripheral blood samples obtained from patients with intracranial aneurysms (IA). Significantly enriched ontologies with a false discovery rate (FDR) adjusted p-value (q-value) < 0.20 were considered (FDR of 20%). Transcripts with higher expression in IA demonstrated regulation of inflammatory and defense responses, signaling, and cell motility. Significantly enriched ontologies in transcripts with lower expression in IA demonstrated regulation of glutathione and tetrapyrrole binding.
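The q-value threshold described above can be illustrated with a short sketch. The caption does not state which FDR procedure produced the q-values, so the Benjamini-Hochberg adjustment below is an assumption, and `bh_qvalues` and the example p-values are hypothetical.

```python
import numpy as np

def bh_qvalues(pvals):
    """Benjamini-Hochberg adjusted p-values (q-values) for a list of p-values."""
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)
    # Raw adjustment: p * m / rank, with ranks starting at 1.
    adjusted = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    adjusted = np.minimum.accumulate(adjusted[::-1])[::-1]
    q = np.empty(m)
    q[order] = np.minimum(adjusted, 1.0)
    return q

# Hypothetical p-values, not taken from the table.
example_p = [0.01, 0.04, 0.03, 0.005]
example_q = bh_qvalues(example_p)

# Keep terms below the 20% FDR threshold used in the caption.
kept = [i for i, q in enumerate(example_q) if q < 0.20]
```

Because the adjustment depends on the total number of tests, q-values for the same p-value differ between the Process and Function categories if each category is corrected separately.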
The 26 transcripts selected for classification model training
| Transcript | Gene ID | Accession no. | Log2 (F-C) | p-value | q-value |
|---|---|---|---|---|---|
| | 5819 | NM_002856.2 | 2.27 | 5.54E−12 | 6.94E−09 |
| | 1545 | NM_000104.3 | 1.53 | 4.13E−10 | 3.88E−07 |
| | 57126 | NM_020406.3 | 1.48 | 8.04E−06 | 2.91E−03 |
| | 5152 | NM_002606.2 | 1.45 | 5.67E−05 | 9.90E−03 |
| | 221481 | NM_145028.4 | 1.37 | 1.38E−12 | 2.07E−09 |
| | 55301 | NM_018324.2 | 1.15 | 1.71E−11 | 1.83E−08 |
| | 96764 | NM_024831.7 | 1.02 | 1.72E−14 | 4.31E−11 |
| | 9332 | NM_004244.5 | 0.98 | 2.65E−09 | 1.99E−06 |
| | 100506229 | NR_039975.1 | 0.96 | 1.23E−05 | 3.55E−03 |
| | 100506658 | NM_002538.3 | 0.85 | 4.07E−07 | 2.37E−04 |
| | 10501 | NM_032108.3 | 0.80 | 7.62E−05 | 1.19E−02 |
| | 84830 | NM_001143948.1 | 0.77 | 1.61E−05 | 4.47E−03 |
| | 23078 | NM_015058.1 | 0.70 | 2.56E−06 | 1.20E−03 |
| | 100463488 | NM_001190708.1 | 0.63 | 1.21E−05 | 3.55E−03 |
| | 3212 | NM_002145.3 | 0.62 | 6.25E−05 | 1.02E−02 |
| | 4072 | NM_002354.2 | 0.60 | 1.02E−05 | 3.50E−03 |
| | 8809 | NM_003855.3 | 0.59 | 1.17E−05 | 3.55E−03 |
| | 147710 | NM_001205280.1 | −0.80 | 5.87E−05 | 9.94E−03 |
| | 9536 | NM_004878.4 | −0.91 | 4.78E−05 | 8.98E−03 |
| | 50486 | NM_015714.3 | −0.96 | 6.71E−06 | 2.66E−03 |
| | 83416 | NM_031281.2 | −1.26 | 4.31E−06 | 1.80E−03 |
| | 400793 | NM_001135240.1 | −1.51 | 1.27E−14 | 4.31E−11 |
| | 10911 | NM_021995.2 | −1.93 | 8.85E−14 | 1.66E−10 |
| | 3048 | NM_000184.2 | −1.97 | 6.62E−10 | 5.53E−07 |
| | 56603 | NM_019885.3 | −2.99 | 4.32E−07 | 2.37E−04 |
| | 10882 | NM_006688.4 | −3.25 | 5.16E−22 | 3.88E−18 |
Significantly differentially expressed transcripts with FDR < 0.05 and absolute fold-change ≥ 1.5. (F-C: fold-change)
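The selection rule in this caption can be expressed as a small filter. The sketch below assumes the fold-change cutoff is applied on the log2 scale, i.e. |log2(F-C)| ≥ log2(1.5) ≈ 0.585, which is consistent with the smallest positive value in the table (0.59). The function names are hypothetical. The first four rows are copied from the table by Gene ID; the last two are hypothetical rows added to show rejections.

```python
import math

LOG2_FC_CUTOFF = math.log2(1.5)  # absolute fold-change >= 1.5, on the log2 scale
Q_CUTOFF = 0.05                  # FDR threshold from the caption

def passes_filter(log2_fc, q_value):
    """True if a transcript meets both the FDR and fold-change criteria."""
    return q_value < Q_CUTOFF and abs(log2_fc) >= LOG2_FC_CUTOFF

def linear_fold_change(log2_fc):
    """Convert a log2 fold-change back to a linear expression ratio."""
    return 2 ** log2_fc

rows = [
    # (identifier, log2 F-C, q-value)
    ("5819", 2.27, 6.94e-09),            # from the table
    ("8809", 0.59, 3.55e-03),            # from the table (smallest positive F-C)
    ("147710", -0.80, 9.94e-03),         # from the table
    ("10882", -3.25, 3.88e-18),          # from the table
    ("hypothetical_A", 0.40, 1.00e-04),  # hypothetical: fold-change too small
    ("hypothetical_B", 1.20, 8.00e-02),  # hypothetical: q-value too large
]

selected = [name for name, fc, q in rows if passes_filter(fc, q)]
```

On the linear scale, a log2 fold-change of 0.59 corresponds to a ratio of roughly 1.5, and −3.25 corresponds to roughly a ninefold decrease.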
Fig. 2 Performance of the four classification models during model testing. a PCA using the 26 transcripts showed general separation between patients with IA (100%, circled in red) and controls (80%, circled in blue). b Validation of the classification models in an independent testing cohort of patients demonstrated that DLDA had the best performance, with sensitivity, specificity, and accuracy of 0.80, 1.0, and 0.90, respectively. c ROC analysis in the testing cohort also showed that DLDA had the best area under the curve (AUC) (0.80).
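The DLDA figures in panel b follow directly from the cohort sizes. With 5 patients with IA and 5 controls in the testing cohort, a sensitivity of 0.80 and a specificity of 1.0 imply 4 true positives, 1 false negative, 5 true negatives, and 0 false positives. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # fraction of IA cases detected
        "specificity": tn / (tn + fp),            # fraction of controls cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Counts implied by the reported DLDA testing-cohort values (n = 5 per group).
dlda_test = classification_metrics(tp=4, fn=1, tn=5, fp=0)
```

With only 10 test samples, each misclassification shifts accuracy by 0.10, so the reported values are coarse estimates.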
Fig. 3 Assessment of model performance by LOO cross-validation of all data, and positive predictive value (PPV)/negative predictive value (NPV). a Estimation of model performance showed that the models performed with an accuracy of 0.63–0.80. DLDA had the highest combination of sensitivity, specificity, and accuracy (0.65, 0.95, and 0.80, respectively). b ROC analysis demonstrated that the models had AUC of 0.68 (NSC) to 0.84 (DLDA). c Plot showing the PPV of all models across all possible prevalence values. The blue region in the figure represents the range of IA prevalence reported in the current literature. The best performing model (DLDA) had the highest PPV, and cosine NN demonstrated the poorest PPV. d The DLDA model also had the best NPV, but only slightly better than that of the cosine NN, NSC, and SVM models.
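The dependence of PPV and NPV on prevalence in panels c and d follows from Bayes' rule. The sketch below uses the DLDA sensitivity and specificity from panel a (0.65 and 0.95). The prevalence values are illustrative assumptions, since the caption does not give the numeric range of the shaded region, and the function names are hypothetical.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value: P(disease | positive test)."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

def npv(sensitivity, specificity, prevalence):
    """Negative predictive value: P(no disease | negative test)."""
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)

SENS, SPEC = 0.65, 0.95  # DLDA values reported for LOO cross-validation of all data

# Illustrative prevalence values (assumed, not taken from the figure).
for prev in (0.01, 0.05, 0.10):
    print(f"prevalence={prev:.2f}  PPV={ppv(SENS, SPEC, prev):.3f}  "
          f"NPV={npv(SENS, SPEC, prev):.3f}")
```

At low prevalence the false-positive term dominates the PPV denominator, which is why PPV falls quickly while NPV stays close to 1.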
Fig. 4 Validation of RNA-Sequencing data for seven transcripts by quantitative polymerase chain reaction (qPCR). Six of seven differentially expressed transcripts in samples from patients with and without IA were also differentially expressed in neutrophils in the qPCR in an independent cohort. This demonstrates consistent expression differences between patients with IA and controls in ~86% (6/7) of the tested transcripts.
Fig. 5 Comparison of fold-change in expression in patients with “small” (< 5 mm) IAs vs. control and patients with “large” (≥ 5 mm) IAs vs. control. The plot shows the fold-change (F-C) in expression of the 26 classifier transcripts identified in the training cohort (n = 30, black line) compared to those for “small” IAs (vs. control, green) and “large” IAs (vs. control, orange). Expression changes were more pronounced in both the positive and negative direction in patients with larger IAs. Fold-changes across all 26 transcripts in the “large” group were on average 24% higher than those for the training cohort, while fold-changes for the “small” group were on average 35% lower.