Qingzhou Guan, Peng Zhao, Yange Tian, Liping Yang, Zhenzhen Zhang, Jiansheng Li.
Abstract
It is essential to assess the cancer risk for patients with chronic obstructive pulmonary disease (COPD). Comparing gene expression data from patients with lung cancer (a total of 506 samples) and those with cancer-adjacent normal lung tissues (a total of 370 samples), we generated a qualitative transcriptional signature consisting of 2046 gene pairs. The signature was verified in an evaluation dataset comprising 18 subjects with severe disease and 52 subjects with moderate disease (Wilcoxon rank-sum test; p = 7.33 × 10⁻⁵). Similar results were obtained in other independent datasets. Among the gene pairs in the signature, 326 COPD stage-related gene pairs were identified based on Spearman's rank correlation tests and those gene pairs comprised 368 unique genes. Of these 368 genes, 16 genes were significantly dysregulated in COPD rat model data compared with control data. Some of these genes (Dhx16, Upf2, Notch3, Sec61a1, Dyrk2, and Hmmr) were altered when the COPD rat model was treated with traditional Chinese medicines (TCM), including Bufei Yishen formula, Bufei Jianpi formula, and Yiqi Zishen formula. Overall, the signature could predict the cancer incidence-risk of COPD and the identified key genes might provide guidance regarding both the treatment of COPD using TCM and the prevention of cancer in patients with COPD.

KEY MESSAGES
- A cancer risk assessment signature was identified in patients with COPD.
- The signature is insensitive to batch effects and is well verified.
- COPD key genes identified in this study might play a crucial role in TCM treatment and cancer prevention.
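The abstract does not spell out how the gene-pair signature is turned into a score. A common reading of a qualitative (within-sample ordering) signature is sketched below in Python: each pair votes "cancer-like" when the first gene is expressed above the second, and the score is the fraction of such votes. The function name `pair_score`, the direction of each pair, and the expression values are all assumptions made for illustration; only the three gene-pair names come from the Figure 2 caption.

```python
def pair_score(expression, gene_pairs):
    """Fraction of usable gene pairs (a, b) with expression[a] > expression[b].

    Assumption: a pair is written so that a > b is the cancer-like ordering.
    Pairs with a gene missing from the sample are skipped.
    """
    votes = 0
    usable = 0
    for gene_a, gene_b in gene_pairs:
        if gene_a in expression and gene_b in expression:
            usable += 1
            if expression[gene_a] > expression[gene_b]:
                votes += 1
    if usable == 0:
        raise ValueError("no gene pair is measurable in this sample")
    return votes / usable


# Pair names from the Figure 2 caption; the direction is assumed.
PAIRS = [("BIRC5", "ASPA"), ("BARD1", "PTPRB"), ("CCNA2", "ACKR4")]

# Hypothetical expression values for one sample.
sample = {"BIRC5": 8.1, "ASPA": 5.2, "BARD1": 6.0,
          "PTPRB": 7.4, "CCNA2": 7.7, "ACKR4": 4.9}

score = pair_score(sample, PAIRS)  # two of the three pairs vote cancer-like
```

Because the score depends only on the ordering of two genes within the same sample, it does not change under any monotone rescaling of that sample, which is one plausible reason such a signature would be insensitive to batch effects.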
Keywords: Chronic obstructive pulmonary disease; incidence-risk score; lung cancer; qualitative transcriptional characteristics; traditional Chinese medicines
Year: 2022 PMID: 35993327 PMCID: PMC9415445 DOI: 10.1080/07853890.2022.2112070
Source DB: PubMed Journal: Ann Med ISSN: 0785-3890 Impact factor: 5.348
Data analysed in this study.
| GEO No. | No. of genes (a) | Platform | Normal sample size | Cancer sample size |
|---|---|---|---|---|
| GSE19804 | 20486 | Affymetrix GPL570 | 60 | 60 |
| GSE18842 | 20486 | Affymetrix GPL570 | 45 | 46 |
| GSE27262 | 20486 | Affymetrix GPL570 | 25 | 25 |
| GSE31210 | 20486 | Affymetrix GPL570 | 20 | 226 |
| GSE19188 | 20486 | Affymetrix GPL570 | 65 | 91 |
| GSE32863 | 25186 | Illumina GPL6884 | 58 | 58 |
| GSE31267 | 24384 | Illumina GPL6947 | 24 | – |
| GSE15197 | 18615 | Agilent GPL6480 | 13 | – |
| GSE40588 | 19595 | Agilent GPL6480 | 60 | – |
(a) The number of genes detected in the corresponding dataset.
–: there is no sample in the corresponding category.
Figure 1. Analysis flowchart for this study.
Figure 2. Distribution of gene expression levels for the three gene pairs BIRC5-ASPA (A), BARD1-PTPRB (B), and CCNA2-ACKR4 (C) in the GSE18842 and GSE27262 datasets. The horizontal axis shows cancer and normal lung tissue from datasets GSE18842 and GSE27262. The vertical axis shows the expression level of the corresponding gene.
Figure 3. Performance of the signature in COPD samples with different disease courses. The horizontal axis shows severe and moderate COPD samples from a public database. The vertical axis shows the signature score in severe and moderate COPD samples. The Wilcoxon rank-sum test was applied to calculate the p values.
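The comparison described in the Figure 3 caption can be illustrated with a small pure-Python Wilcoxon rank-sum test. This is a sketch under stated assumptions, not the authors' code: it uses the normal approximation without a tie correction to the variance, the helper names are hypothetical, and the scores in the usage lines are invented.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sum_test(x, y):
    """Return (z, two-sided p) for the rank sum of x against y.

    Normal approximation, no tie correction (an assumption of this sketch).
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first group
    mean_w = n1 * (n1 + n2 + 1) / 2
    var_w = n1 * n2 * (n1 + n2 + 1) / 12
    z = (w - mean_w) / math.sqrt(var_w)
    p = math.erfc(abs(z) / math.sqrt(2))     # equals 2 * (1 - Phi(|z|))
    return z, p


# Hypothetical signature scores, for illustration only.
severe = [0.90, 0.80, 0.85]
moderate = [0.10, 0.20, 0.30]
z, p = rank_sum_test(severe, moderate)
```

With groups as small as in this toy example an exact test would be preferable; the normal approximation is more reasonable for group sizes like the 18 and 52 subjects reported in the abstract.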
Differentially expressed genes between COPD rat model and control group.
| Gene symbol | FC | T | p value |
|---|---|---|---|
| | 0.530623 | −3.72567 | .003938 |
| | 1.039416 | 3.475603 | .005965 |
| | 1.015124 | 3.212685 | .00929 |
| | 0.884189 | −3.09422 | .011362 |
| | 1.012759 | 3.046952 | .012316 |
| | 1.013586 | 3.017057 | .01296 |
| | 0.975045 | −2.97443 | .01394 |
| | 0.984546 | −2.5892 | .026987 |
| | 0.923138 | −2.53068 | .029836 |
| | 1.039209 | 2.52876 | .029934 |
| | 0.688459 | −2.48504 | .032263 |
| | 0.943282 | −2.48184 | .032441 |
| | 0.718159 | −2.42208 | .035933 |
| | 0.971469 | −2.28776 | .045186 |
| | 1.043465 | 2.252802 | .047951 |
| | 0.857672 | −2.24158 | .048873 |
FC: fold-change of the COPD rat samples compared with control samples; T: test statistic value between COPD rat and control samples using the Student’s t-tests.
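As a rough illustration of the two statistics defined in this footnote, the Python sketch below computes a fold-change and an equal-variance Student's t statistic for one gene. It assumes FC is the ratio of group means (the source does not state the formula, although FC below 1 pairing with negative T in the table is consistent with that reading). The function name and the expression values are hypothetical.

```python
import math
from statistics import mean, variance


def fold_change_and_t(model, control):
    """Return (FC, T) for one gene.

    FC: ratio of group means (assumed definition).
    T: two-sample Student's t statistic with pooled variance.
    """
    n1, n2 = len(model), len(control)
    fc = mean(model) / mean(control)
    pooled_var = ((n1 - 1) * variance(model)
                  + (n2 - 1) * variance(control)) / (n1 + n2 - 2)
    t = (mean(model) - mean(control)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return fc, t


# Hypothetical expression values for one gene.
fc, t = fold_change_and_t(model=[2.0, 4.0, 6.0], control=[1.0, 2.0, 3.0])
```

Turning T into a p value requires the t distribution with n1 + n2 - 2 degrees of freedom, which the Python standard library does not provide; a statistics package would be needed for that step.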
Frequency of genes that were reversed between the treatment and model group.
| Gene symbol | FC (M_vs_C) | T (M_vs_C) | p_value (M_vs_C) | p_value (BYF_vs_M) | p_value (BJF_vs_M) | p_value (YZF_vs_M) | p_value (APL_vs_M) | Num | Num | Num |
|---|---|---|---|---|---|---|---|---|---|---|
| | 0.530623 | −3.72567 | .003938 | 0.701224 | 5.52E-09 | 0.097164 | 5.50E-05 | 2 | 3 | 3 |
| | 1.039416 | 3.475603 | .005965 | .027316 | .011337 | .058669 | .02064 | 3 | 4 | 4 |
| | 1.015124 | 3.212685 | .00929 | .022126 | .043653 | .93704 | .822763 | 2 | 2 | 2 |
| | 0.884189 | −3.09422 | .011362 | .78651 | .197268 | .106184 | .211585 | 0 | 0 | 2 |
| | 1.012759 | 3.046952 | .012316 | .560379 | .689214 | .271842 | .713968 | 0 | 0 | 0 |
| | 1.013586 | 3.017057 | .01296 | .010532 | .020327 | .118537 | .100091 | 2 | 2 | 4 |
| | 0.975045 | −2.97443 | .01394 | .475693 | .000212 | .088364 | .020111 | 2 | 3 | 3 |
| | 0.984546 | −2.5892 | .026987 | .728516 | .001616 | .317634 | .000346 | 2 | 2 | 2 |
| | 0.923138 | −2.53068 | .029836 | .718672 | .000155 | .19656 | .000206 | 2 | 2 | 3 |
| | 1.039209 | 2.52876 | .029934 | .202428 | .056216 | .006779 | .014547 | 2 | 3 | 3 |
| | 0.688459 | −2.48504 | .032263 | .994814 | .124428 | .03643 | .016983 | 2 | 2 | 3 |
| | 0.943282 | −2.48184 | .032441 | .492618 | .000315 | .509816 | .000261 | 2 | 2 | 2 |
| | 0.718159 | −2.42208 | .035933 | .36409 | .814336 | .458153 | .089266 | 0 | 1 | 1 |
| | 0.971469 | −2.28776 | .045186 | .138333 | .00598 | .351014 | .173025 | 1 | 1 | 3 |
| | 1.043465 | 2.252802 | .047951 | .656511 | .036755 | .571989 | .012966 | 2 | 2 | 2 |
| | 0.857672 | −2.24158 | .048873 | .547285 | .018171 | .724516 | .011406 | 2 | 2 | 2 |
FC: fold-change of COPD rat samples compared with control samples; T: test statistic value between COPD rat and control samples using Student’s t-tests; p_value: p value between the corresponding two groups of samples (Model vs Control, BYF vs Model, BJF vs Model, YZF vs Model, and APL vs Model) using Student’s t-tests; Num: number of the four treatment protocols in which the corresponding gene met a given threshold.
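The Num columns can be read as counts of treatment comparisons whose p value falls below a cutoff. The Python sketch below makes that reading concrete. The cutoffs 0.05, 0.10, and 0.20 are not stated in the table; they are assumptions, chosen because counting the four treatment-vs-model p values under them reproduces the three Num values in the rows shown. The sketch ignores the direction of change, which a full "reversal" criterion would presumably also check.

```python
# Assumed cutoffs; the source only says "one certain threshold" per Num column.
ASSUMED_THRESHOLDS = (0.05, 0.10, 0.20)


def count_below(p_values, thresholds=ASSUMED_THRESHOLDS):
    """For each threshold, count how many treatment p values fall below it."""
    return [sum(p < cutoff for p in p_values) for cutoff in thresholds]


# Treatment-vs-model p values copied from the first and fourth table rows.
first_row = [0.701224, 5.52e-09, 0.097164, 5.50e-05]
fourth_row = [0.78651, 0.197268, 0.106184, 0.211585]

print(count_below(first_row))   # [2, 3, 3], matching the tabulated Num values
print(count_below(fourth_row))  # [0, 0, 2], matching the tabulated Num values
```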