Juan M Cubillos-Angulo, Eduardo R Fukutani, Luís A B Cruz, María B Arriaga, João Victor Lima, Bruno B Andrade, Artur T L Queiroz, Kiyoshi F Fukutani.
Abstract
BACKGROUND: Cigarette smoking is associated with an increased risk of developing respiratory diseases and various types of cancer. Early identification of such unfavorable outcomes in patients who smoke is critical for optimizing personalized medical care.
Year: 2020 PMID: 32097409 PMCID: PMC7041805 DOI: 10.1371/journal.pone.0222552
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. PRISMA flow chart of the microarray meta-analysis.
Selection of eligible GEO datasets for systems biology analysis according to the PRISMA 2009 flow diagram.
Clinical and demographic characteristics of the study participants included in each dataset evaluated.
| Characteristics | GSE3320 (discovery) | GSE4498 (discovery) | GSE20257 (validation) | GSE17905 (validation) | GSE13931 (validation) | p-value (smoking datasets) | GSE19804 (cancer dataset) |
|---|---|---|---|---|---|---|---|
| Age, mean (SD) | 36.8 (5.6) | 43.0 (6.1) | 43.6 (9.9) | 42.4 (8.6) | 42.0 (7.0) | 0.1549 | 61.2 (10.2) |
| Gender, male, n (%) | 7 (63.6%) | 17 (77.3%) | 95 (70.3%) | 107 (68.2%) | 73 (75.3%) | 0.6995 | 0 (0.0%) |
| Ethnicity, n (%) | | | | | | 0.9476 | |
| Black | 4 (36.4%) | 11 (50.0%) | 67 (49.7%) | 86 (54.8%) | 56 (57.7%) | | 0 (0.0%) |
| White | 5 (45.5%) | 9 (40.9%) | 44 (32.6%) | 46 (29.3%) | 32 (33.0%) | | 0 (0.0%) |
| Hispanic/Latino | 2 (18.2%) | 2 (9.1%) | 21 (15.5%) | 21 (13.4%) | 10 (10.3%) | | 0 (0.0%) |
| Afro-Hispanic | 0 (0.0%) | 0 (0.0%) | 1 (0.7%) | 2 (1.3%) | 0 (0.0%) | | 0 (0.0%) |
| Asian | 0 (0.0%) | 0 (0.0%) | 2 (1.5%) | 2 (1.3%) | 0 (0.0%) | | 60 (100.0%) |
| Smoking status, n (%) | | | | | | 0.6102 | |
| Non-smoker | 5 (45.5%) | 12 (45.5%) | 53 (39.3%) | 67 (42.7%) | 38 (39.2%) | | 0 (0.0%) |
| Smoker | 6 (54.5%) | 10 (54.5%) | 59 (43.7%) | 90 (57.3%) | 60 (61.9%) | | 0 (0.0%) |
| COPD, n (%) | 0 (0.0%) | 0 (0.0%) | 23 (17.8%) | 0 (0.0%) | 0 (0.0%) | | 0 (0.0%) |
| Lung cancer, n (%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | | 60 (100.0%) |
COPD: Chronic obstructive pulmonary disease.
Fig 2. Differentially expressed genes associated with cigarette smoking.
We analyzed publicly available data from 2 small-airway transcriptome datasets (microarray). (A) A principal component analysis (PCA) model of 13,516 genes was used to distinguish smokers from nonsmokers. (B) Volcano plot of all genes (smokers vs. nonsmokers). (C) Twenty-two differentially expressed genes (DEGs), defined as p<0.05 after 1% FDR correction and a 1.0-fold change in expression, were found and together were able to discriminate the clinical conditions.
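The DEG criterion in panel (C) combines a multiple-testing correction with a fold-change cutoff. The following is a minimal sketch of that kind of filter, not the authors' pipeline: it assumes a Benjamini-Hochberg adjustment and a cutoff on the absolute log2 fold change, neither of which is specified in this record. The function names, gene labels, and numbers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, pvalues, log2_fold_changes, alpha=0.05, min_abs_log2fc=1.0):
    """Keep genes passing both the adjusted p-value and fold-change cutoffs.

    alpha and min_abs_log2fc are illustrative defaults (assumptions).
    """
    adjusted = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, q, lfc in zip(genes, adjusted, log2_fold_changes)
        if q < alpha and abs(lfc) >= min_abs_log2fc
    ]


# Hypothetical toy input, not data from the study.
toy_genes = ["gene_a", "gene_b", "gene_c", "gene_d"]
toy_pvalues = [0.001, 0.02, 0.03, 0.8]
toy_log2fc = [2.1, 0.4, -1.5, 3.0]
toy_degs = select_degs(toy_genes, toy_pvalues, toy_log2fc)
```

In this toy input, `gene_b` fails the fold-change cutoff and `gene_d` fails the adjusted p-value cutoff, so only `gene_a` and `gene_c` are retained.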
Detailed information obtained from the ROC curve analysis used in the study.
| Dataset | Tissue | Genes/signature | AUC | 95% CI | p-value | Sensitivity (%) | 95% CI | Specificity (%) | 95% CI |
|---|---|---|---|---|---|---|---|---|---|
| GSE17905 | Small and large airway bronchial epithelium | 22-gene* | 0.864 | 0.808–0.989 | <0.0001 | 83.87 | 66.2%-94.5% | 95.24 | 76.1%-99.8% |
| GSE20257 | Small airway bronchial epithelium | 22-gene* | 0.862 | 0.845–0.973 | <0.0001 | 69.05 | 52.9%-82.3% | 98.04 | 89.5%-99.9% |
| GSE13931 | Alveolar macrophages | 22-gene* | 0.607 | 0.396–0.740 | 0.4236 | 80.00 | 61.4%-92.2% | 42.11 | 20.2%-66.5% |
| GSE19804 | Lung tissue | AKR1B10** | 0.760 | 0.720–0.880 | <0.0001 | 35.00 | 23.1%-48.4% | 98.31 | 90.9%-99.9% |
*Smokers versus nonsmokers comparison
**Cancer versus non-cancer comparison
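For readers less familiar with the quantities in the ROC table, the sketch below shows how AUC, sensitivity, and specificity are defined for a single score. It is an illustration under stated assumptions, not the software used in the study (which this record does not name): AUC is computed through its rank interpretation, the threshold is chosen by hand, and all scores are invented.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def sensitivity_specificity(pos_scores, neg_scores, threshold):
    """Classify score >= threshold as positive and return
    (sensitivity, specificity) as fractions between 0 and 1."""
    true_pos = sum(1 for s in pos_scores if s >= threshold)
    true_neg = sum(1 for s in neg_scores if s < threshold)
    return true_pos / len(pos_scores), true_neg / len(neg_scores)


# Hypothetical expression scores, not values from the study.
toy_pos = [7.2, 8.1, 6.5, 9.0]   # e.g. cases
toy_neg = [5.1, 6.8, 4.9, 6.0]   # e.g. controls
toy_auc = auc_from_scores(toy_pos, toy_neg)
toy_sens, toy_spec = sensitivity_specificity(toy_pos, toy_neg, threshold=7.0)
```

Raising the threshold generally trades sensitivity for specificity, which is the pattern seen in the AKR1B10 row of the table (low sensitivity, high specificity).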
Fig 3. Gene pathway analysis in smokers and nonsmokers.
(A) Co-expressed modules of all genes. Circle sizes are proportional to the normalized enrichment scores (NES). (B) The modules were annotated using the KEGG package for R. Dashed lines represent the significance threshold. (C) Hierarchical cluster analysis (Ward’s method) of the NES calculated for each annotated module and each person was employed to test discrimination between smokers and nonsmokers.
Fig 4. Defining the molecular signatures of smoking.
(A) Data on the 22 DEGs found in our discovery analyses were used to validate discrimination between smokers and nonsmokers in 3 different previously published datasets. (B) Machine-learning decision trees were built for each dataset to describe the most relevant genes driving discrimination. Of note, the gene AKR1B10 was found to be the main discriminator in 3 out of the 4 datasets examined. (C) Scatter plots of the AKR1B10 gene expression in the 4 datasets. (D) Venn diagram of the DEGs in each dataset shows AKR1B10 in the intersection of 3 datasets extracted from lung tissue specimens but not included among DEGs from alveolar macrophages. *p<0.05 (Student’s t-test).
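Panel (B) uses decision trees to identify which gene drives discrimination. A simplified way to see what "main discriminator" means is to find the gene that would be chosen at the root of a tree. The sketch below assumes a CART-style split on Gini impurity and a tree of depth one; the record does not state which tree algorithm or software the authors used, and the gene names and expression values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (1 = smoker, 0 = nonsmoker)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best single cut on one gene."""
    pairs = sorted(zip(values, labels))
    best_threshold, best_impurity = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no cut point between identical values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2.0
        left = [lab for _, lab in pairs[:i]]
        right = [lab for _, lab in pairs[i:]]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


def root_feature(expression, labels):
    """Pick the gene whose best cut leaves the lowest impurity (the tree root)."""
    scored = {gene: best_split(vals, labels) for gene, vals in expression.items()}
    gene = min(scored, key=lambda g: scored[g][1])
    return gene, scored[gene][0]


# Hypothetical toy data, not values from the study.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_expression = {
    "gene_a": [9.5, 9.0, 10.0, 5.0, 5.5, 6.0],  # separates the groups cleanly
    "gene_b": [4.0, 6.2, 5.1, 5.0, 6.0, 4.4],   # overlaps between groups
}
toy_root_gene, toy_root_threshold = root_feature(toy_expression, toy_labels)
```

A full decision tree repeats this search recursively within each branch; the root split alone is enough to show why one gene can dominate the tree when its expression separates the groups much better than the others.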
Fig 5. In nonsmokers, higher AKR1B10 expression is detected in lung cancer.
(A) We analyzed AKR1B10 gene expression values in a published microarray dataset of neoplastic lung tissue from nonsmoking individuals diagnosed with lung cancer and compared them with ipsilateral healthy lung tissue specimens (controls). Scatter plots of AKR1B10 gene expression in the groups. *p<0.05 (Student’s t-test). (B) Receiver operating characteristic (ROC) curve analysis indicated high accuracy in discriminating cancer tissue from controls.