Kazushi Matsumura, Shigeaki Ito.
Abstract
BACKGROUND: Chronic obstructive pulmonary disease (COPD) is a combination of progressive lung diseases. The diagnosis of COPD is generally based on pulmonary function testing; however, prognosis remains difficult for smokers and for patients with early-stage COPD because of the complexity and heterogeneity of the pathogenesis. Computational analysis of data from omics technologies is expected to be one way to resolve this complexity.
Keywords: Chronic obstructive pulmonary disease; Cigarette smoke; Classifier; Computational scoring; Gene expression; Logistic regression; Random forest
Year: 2020 PMID: 32013930 PMCID: PMC6998147 DOI: 10.1186/s12890-020-1062-9
Source DB: PubMed Journal: BMC Pulm Med ISSN: 1471-2466 Impact factor: 3.317
Fig. 2 Schematic diagram for identifying descriptive marker genes. The process for identifying descriptive marker genes for multi-classification and stepwise logistic regression analysis. Cig, cigarettes; CV, coefficient of variation; NI, normalized intensity value; NS, non-smokers; SMK, smokers; COPD, chronic obstructive pulmonary disease subjects
Overall summary of the publicly available datasets
| Study name | Study samples | Age | Sex | Sample type |
|---|---|---|---|---|
| E-MTAB-1690 | 14 NS, 27 SMK, and 21 COPD | 51.9 ± 8.69 | 53 male, 9 female | Respiratory tract |
| E-GEOD-20257 | 36 NS, 43 SMK, and 9 COPD | 42.8 ± 10.9 | 61 male, 27 female | Small airway |
| E-GEOD-8545 | 18 NS, 18 SMK, and 18 COPD | 45.7 ± 7.18 | 41 male, 13 female | Small airway |
NS non-smokers, SMK smokers, COPD COPD subjects
Fig. 1 Identifying descriptive marker genes from in vitro exposure studies. a Venn diagram of up- and downregulated genes (|fold change| > 1.5, false discovery rate-corrected p < 0.05) following exposure to the aqueous extract of 3R4F smoke at a concentration of 0.5, 1.0, or 2.0 cigarettes/L. b Hierarchical clustering analysis with time-independently differentially expressed genes (DEGs) (false discovery rate-corrected p < 0.05) following exposure to each test substance. The orange box denotes up- or downregulated gene clusters. c Venn diagram of up- and downregulated DEGs derived from (a) the cigarette smoke exposure studies and (b) the test substance exposure studies. The 15 identified descriptive genes are summarized in the table to the right of the Venn diagram. Cig, cigarettes
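The selection rule in panel a (|fold change| > 1.5 with a false discovery rate-corrected p < 0.05) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the caption does not name the correction method, so the Benjamini-Hochberg procedure is an assumption, as is the signed fold-change convention (negative values for downregulation). The gene names and numbers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: the caption only says "false discovery rate-corrected";
    BH is used here as a common choice, not a confirmed one.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, fold_changes, pvalues, fc_cutoff=1.5, fdr_cutoff=0.05):
    """Split genes into up- and downregulated lists by |fold change| and FDR."""
    fdr = benjamini_hochberg(pvalues)
    up, down = [], []
    for gene, fc, q in zip(genes, fold_changes, fdr):
        if q < fdr_cutoff and abs(fc) > fc_cutoff:
            (up if fc > 0 else down).append(gene)
    return up, down


# Hypothetical inputs, for illustration only (signed fold changes).
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
fold_changes = [2.1, -1.8, 1.2, -3.0]
pvalues = [0.001, 0.01, 0.02, 0.6]

up, down = select_degs(genes, fold_changes, pvalues)
print("up:", up, "down:", down)
```

Intersecting the resulting gene lists across exposure conditions, as in the Venn diagrams, then reduces to ordinary set intersection.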
Known function and association with chronic obstructive pulmonary disease (COPD) for identified genes
| Gene | Known function of the gene product | References (PMIDs) |
|---|---|---|
| Upregulated genes | ||
| AREG | Member of the EGF family, which interacts with the EGF/TGF-alpha receptor to promote the growth of normal epithelial cells. | Stolarczyk M, et al. (27561911), Wang J, et al. (30291869) |
| CYP1B1 | Member of the cytochrome P450 superfamily of enzymes. High expression is induced by cigarette smoke exposure. | Liu C, et al. (29110844), Slowikowski BK, et al. (28858732) |
| DUSP6 | Member of the dual-specificity protein phosphatase subfamily. It negatively regulates MAPK superfamily proteins, which are associated with cellular proliferation and differentiation. | – |
| PHLDA1 | Proline–histidine-rich nuclear protein that might play an important role in the anti-apoptotic effects of insulin-like growth factor-1. | – |
| SLC7A11 | Mediates sodium-independent, high-affinity exchange of anionic amino acids, with high specificity for the anionic forms of cystine and glutamate. | – |
| TXNIP | Thioredoxin-binding protein that inhibits the antioxidative function of thioredoxin, resulting in the accumulation of ROS and cellular stress. | – |
| WNT5A | Wnt family member 5A, ligand for members of the frizzled family of seven-transmembrane receptors. | Koopmans T, et al. (27468699), Baarsma HA, et al. (27979969) |
| ZBED2 | Zinc finger BED-type containing 2. | – |
| Downregulated genes | ||
| ADM | Preprohormone with several functions, including vasodilation, regulation of hormone secretion, promotion of angiogenesis, and antimicrobial activity. | Xu P, et al. (14720432), Meng DQ, et al. (24962223) |
| CXCR4 | CXC chemokine receptor specific for stromal cell-derived factor-1. | Weigold F, et al. (29566745), Karagiannis K, et al. (28804668) |
| EFNA1 | Member of the ephrin family. Its target receptors comprise the protein-tyrosine kinases, and it has been implicated in mediating developmental events. | – |
| EGLN3 | Hypoxia-inducible factor. Essential for the hypoxic regulation of neutrophilic inflammation, and it has a crucial role in the DNA damage response. | – |
| FBXO32 | F-box protein that functions in phosphorylation-dependent ubiquitination and subsequent proteasomal degradation. | – |
| HILPDA | Hypoxia-inducible lipid droplet-associated protein. Stimulates cytokine expression and enhances cell growth and proliferation. | – |
| IGFBP3 | Member of the insulin-like growth factor binding protein family. It prolongs the half-life of IGFs and alters their interaction with cell surface receptors. | – |
The cited references confirm an association of the selected genes with COPD or lung function. They were obtained by searching PubMed with the query ((“COPD” OR “Lung Function”) AND “name of each selected gene”)
Fig. 3 Expression values of the identified genes in publicly available samples. The box plot presents the normalized expression values of the 15 identified genes in publicly available samples for non-smokers (green), smokers (yellow), and chronic obstructive pulmonary disease (COPD) subjects (red). The box plot presents the median (line) and 25th and 75th percentiles (box); the whiskers are the 5th and 95th percentiles; and the outliers are denoted by open circles. One-way ANOVA with subsequent Tukey’s honest significant difference post-hoc analysis revealed differences between NS and SMK or COPD (*p < 0.05) and between SMK and COPD (†p < 0.05). NS, non-smokers; SMK, current smokers; COPD, COPD subjects
Multi-classification analysis with random forest (5-fold cross-validation repeated 100 times independently)
| Pred./Truth | Original: NS | Original: SMK | Original: COPD | Original: True rate | Published: NS | Published: SMK | Published: COPD | Published: True rate | Extended: NS | Extended: SMK | Extended: COPD | Extended: True rate |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| NS | 25.5 | 5.8 | 1.0 | 0.77 | 16.1 | 10.2 | 4.2 | 0.48 | 19.7 | 8.8 | 2.4 | 0.59 |
| SMK | 7.2 | 32.6 | 15.8 | 0.76 | 15.0 | 25.9 | 15.0 | 0.60 | 13.0 | 30.4 | 15.0 | 0.71 |
| COPD | 0.6 | 4.8 | 6.7 | 0.29 | 2.2 | 7.0 | 4.3 | 0.18 | 0.6 | 4.0 | 6.2 | 0.26 |
Random forest classification was performed using the 15 identified genes (Original) and previously published gene sets comprising genes cited in > 10 publications (Published) or > 6 publications (Extended)
NS non-smokers, SMK smokers, COPD COPD subjects
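The "True rate" values appear to be per-class recall: the diagonal entry divided by the total of its truth column. The table does not state this definition, so it is an assumption inferred from the numbers, as is the reading of the cell values as percentages of all samples. A minimal sketch using the cell values reported for the Original gene set:

```python
# Rows are predicted classes, columns are true classes.
# Values are copied from the "Original" block of the table; they appear
# to be percentages of all samples (assumption, not stated in the table).
classes = ["NS", "SMK", "COPD"]
original = [
    [25.5, 5.8, 1.0],    # predicted NS
    [7.2, 32.6, 15.8],   # predicted SMK
    [0.6, 4.8, 6.7],     # predicted COPD
]


def true_rates(matrix):
    """Per-class recall: diagonal entry divided by its column (truth) total."""
    n = len(matrix)
    rates = []
    for j in range(n):
        column_total = sum(matrix[i][j] for i in range(n))
        rates.append(matrix[j][j] / column_total)
    return rates


rates = true_rates(original)
for name, rate in zip(classes, rates):
    print(f"{name}: {rate:.3f}")
```

Because the cells are rounded to one decimal place, the recomputed rates can differ from the tabulated ones in the second decimal.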
The parameters calculated via stepwise logistic regression analysis
| | Estimate | Std. Error | Z value | Pr(>\|Z\|) |
|---|---|---|---|---|
| (Intercept) NS\|SMK | 1.6715 | 0.4229 | 3.953 | 7.73E−05 |
| ADM | −2.2568 | 1.0247 | −2.202 | 0.027641 |
| AREG | 2.0152 | 1.0407 | 1.936 | 0.052820 |
| CXCR4 | −3.1177 | 1.4308 | −2.179 | 0.029336 |
| EFNA1 | 3.7889 | 1.9572 | 1.936 | 0.052882 |
| EGLN3 | −4.0571 | 1.7627 | −2.302 | 0.021357 |
| FBXO32 | −3.8824 | 2.4113 | −1.610 | 0.107376 |
| HILPDA | 3.2193 | 0.8100 | 3.974 | 7.06E−05 |
| IGFBP3 | −8.2992 | 2.3747 | −3.495 | 0.000474 |
| SLC7A11 | −3.5355 | 1.2516 | −2.825 | 0.004730 |
| TXNIP | −5.6745 | 1.4851 | −3.821 | 1.33E−04 |
| WNT5A | 2.7391 | 0.8231 | 3.328 | 8.75E−04 |
| (Intercept) SMK\|COPD | −0.8555 | 0.2144 | −3.990 | 6.61E−05 |
| AREG | −1.3039 | 0.7058 | −1.847 | 0.06469 |
| DUSP6 | 1.4688 | 0.6254 | 2.349 | 0.01885 |
| EFNA1 | 2.3861 | 0.9058 | 2.634 | 0.00843 |
| TXNIP | −0.8847 | 0.5167 | −1.712 | 8.69E−02 |
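The Z value and Pr(>|Z|) columns are consistent with a Wald test, in which the estimate is divided by its standard error and compared with a standard normal distribution. The sketch below recomputes both quantities for three rows of the NS|SMK model. The two-sided normal test is an assumption based on the column headers, not something the table states.

```python
import math


def wald_test(estimate, std_error):
    """Return the Wald z statistic and its two-sided normal p-value."""
    z = estimate / std_error
    # P(|Z| > |z|) for a standard normal variable, via the
    # complementary error function.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# (estimate, standard error) pairs copied from the table above.
rows = {
    "ADM": (-2.2568, 1.0247),
    "HILPDA": (3.2193, 0.8100),
    "IGFBP3": (-8.2992, 2.3747),
}

results = {gene: wald_test(est, se) for gene, (est, se) in rows.items()}
for gene, (z, p) in results.items():
    print(f"{gene}: z = {z:.3f}, p = {p:.3g}")
```

The same check can be applied to any other row to confirm that its estimate, standard error, Z value, and p-value agree.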
Fig. 4 Potential risk factor calculation with publicly available samples. a Box plot showing the potential risk factor (PRF) indices of non-smokers (green), smokers (yellow), and chronic obstructive pulmonary disease (COPD) subjects (red) in publicly available samples. The box plot presents the median (line) and 25th and 75th percentiles (box); the whiskers are the 5th and 95th percentiles; and the outliers are denoted by open circles. One-way ANOVA with subsequent Tukey’s honest significant difference post-hoc analysis revealed differences between NS and SMK (*p < 0.05) and between SMK and COPD (†p < 0.05). b The correlations of the PRF indices with pack-years and age in smokers and COPD subjects. The Pearson correlation coefficient (R) is shown in the upper right of each image. NS, non-smokers; SMK, smokers; COPD, COPD subjects