Jun Wang1, Peng Chen1, Mingyang Su1, Guocheng Zhong2, Shasha Zhang1, Deming Gou1.
Abstract
Immunotherapy is widely used in the treatment of lung cancer, and tumor mutation burden (TMB) is currently one of the most effective biomarkers for immunotherapy prognosis. Although whole-exome sequencing (WES) can be used to assess TMB, several problems prevent its routine clinical application. To develop a simplified TMB prediction model, patients with lung adenocarcinoma (LUAD) in The Cancer Genome Atlas (TCGA) were randomly split into training and validation cohorts and categorized into TMB-high (TMB-H) and TMB-low (TMB-L) groups. Based on 610 differentially expressed genes, 50 differentially expressed miRNAs, and 58 differentially methylated CpG sites between TMB-H and TMB-L patients, we constructed 4 predictive signatures and established a TMB prediction model through machine learning methods that integrate the expression or methylation profiles of 7 genes, 7 miRNAs, and 6 CpG sites. The multiomics model performed well in predicting TMB, with an area under the curve (AUC) of 0.911 in the training cohort and 0.859 in the validation cohort. In addition, the multiomics model score correlated significantly with TMB. In summary, we developed a prognostic TMB prediction model by integrating multiomics data from patients with LUAD, which might facilitate the further development of a quantitative real-time polymerase chain reaction (qRT-PCR)-based TMB prediction assay.
Year: 2022 PMID: 35097114 PMCID: PMC8794677 DOI: 10.1155/2022/2698190
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. Flowchart of the analysis process in this study. TMB: tumor mutation burden; TMB-H: TMB-high; TMB-L: TMB-low; PCA: principal component analysis; LASSO: least absolute shrinkage and selection operator; ROC: receiver operating characteristic.
Clinical information of 522 TCGA-LUAD patients.
| Variables | Statistics |
|---|---|
| Gender | |
| Male (%) | 242 (46.4%) |
| Female (%) | 280 (53.6%) |
| Age | |
| 80~89 (%) | 30 (5.8%) |
| 70~79 (%) | 150 (28.7%) |
| 60~69 (%) | 146 (28.0%) |
| 50~59 (%) | 83 (16.0%) |
| 40~49 (%) | 25 (4.8%) |
| 30~39 (%) | 2 (0.4%) |
| Not reported (%) | 86 (16.3%) |
| Race | |
| White (%) | 393 (75.3%) |
| Black or African American (%) | 53 (10.2%) |
| Asian (%) | 8 (1.5%) |
| American Indian or Alaska native (%) | 1 (0.2%) |
| Not reported (%) | 67 (12.9%) |
| Status | |
| Alive (%) | 334 (64.0%) |
| Dead (%) | 188 (36.0%) |
| Tumor stage | |
| I (%) | 279 (53.4%) |
| II (%) | 124 (23.8%) |
| III (%) | 85 (16.3%) |
| IV (%) | 26 (5.0%) |
| Not reported (%) | 8 (1.5%) |
LUAD: lung adenocarcinoma.
Figure 2. Division of patients with LUAD into TMB-H and TMB-L subgroups. (a) The distribution of TMB in patients with LUAD; (b) the number of TMB-H and TMB-L patients with LUAD; (c) the distribution of TMB across different tumor stages. TMB: tumor mutation burden; LUAD: lung adenocarcinoma; TMB-H: TMB-high; TMB-L: TMB-low; OS: overall survival.
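As a rough illustration of the division described in this caption, TMB is commonly expressed as somatic mutations per megabase of sequenced exome, and patients are labelled TMB-H or TMB-L by comparison with a cut-off. The sketch below assumes a 38 Mb exome and a cut-off of 10 mutations/Mb; neither value is stated in this record, and the function names and patient counts are hypothetical.

```python
# Illustrative only: exome size and cut-off are assumptions, not values from the study.
EXOME_SIZE_MB = 38.0   # assumed size of the captured exome, in megabases
TMB_CUTOFF = 10.0      # assumed TMB-H threshold, in mutations per megabase


def tumor_mutation_burden(n_mutations: int, exome_size_mb: float = EXOME_SIZE_MB) -> float:
    """Return TMB as somatic mutations per megabase."""
    if exome_size_mb <= 0:
        raise ValueError("exome_size_mb must be positive")
    return n_mutations / exome_size_mb


def classify_tmb(tmb: float, cutoff: float = TMB_CUTOFF) -> str:
    """Label a patient TMB-H when TMB reaches the cut-off, otherwise TMB-L."""
    return "TMB-H" if tmb >= cutoff else "TMB-L"


# Hypothetical somatic mutation counts for three patients.
mutation_counts = {"patient_a": 95, "patient_b": 380, "patient_c": 760}
labels = {
    patient: classify_tmb(tumor_mutation_burden(count))
    for patient, count in mutation_counts.items()
}
```

Keeping the cut-off as a parameter makes it easy to substitute whatever threshold a given study actually used.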
Figure 3. Multiomics data obtained from TCGA for patients with LUAD. (a) 440 patients with LUAD had matched WES, DNA methylation, RNA-seq, and miRNA-seq data; (b) 148 patients were classified as TMB-H and 292 patients were classified as TMB-L. WES: whole-exome sequencing; TMB-H: TMB-high; TMB-L: TMB-low.
Figure 4. The landscape of tumor-infiltrating immune cells in TMB-H and TMB-L patients. (a) Relative proportions of infiltrating immune cells in TMB-H and TMB-L patients; (b) correlation matrix of the proportions of the 6 detected immune cell types. TMB-H: TMB-high; TMB-L: TMB-low.
Figure 5. Characterization of the top 50 differentially expressed genes, miRNAs, and differentially methylated CpG sites between TMB-H and TMB-L patients. Volcano plots show the differentially expressed genes (a) and miRNAs (c) and the differentially methylated CpG sites (e) between TMB-H and TMB-L patients. The red dots represent upregulated genes, miRNAs, or hypermethylated CpG sites; the blue dots represent downregulated genes, miRNAs, or hypomethylated CpG sites; the black dots represent genes, miRNAs, or CpG sites with no significant difference in expression or methylation. Hierarchical clustering heatmaps show the differentially expressed genes (b) and miRNAs (d) and the differentially methylated CpG sites (f) between TMB-H and TMB-L patients. Orange indicates upregulated genes, miRNAs, or hypermethylated CpG sites; blue indicates downregulated genes, miRNAs, or hypomethylated CpG sites. TMB-H: TMB-high; TMB-L: TMB-low.
Figure 6. LASSO regression analysis for 4 possible prediction biomarker signatures. 10-fold cross-validation in LASSO regression analysis for the gene signature (a), miRNA signature (b), CpG site signature (c), and multiomics signature (d). LASSO: least absolute shrinkage and selection operator; AUC: area under curve.
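The 10-fold cross-validation mentioned in this caption partitions the samples into ten folds, each of which serves once as the held-out set while the model is fitted on the other nine. A minimal sketch of that partitioning follows; the function name, the seed, and the sample count of 100 are hypothetical, and the record does not say whether the folds were stratified by TMB group.

```python
import random
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int = 10, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs; each sample is held out exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # k near-equal, disjoint folds
    splits = []
    for i, test in enumerate(folds):
        # Training set is every fold except the one currently held out.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# Hypothetical cohort of 100 samples split into 10 folds.
splits = k_fold_indices(100, k=10, seed=42)
```

In a LASSO workflow, the model would be refitted on each training set for every candidate penalty value and evaluated on the matching held-out fold.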
The performance of 4 optimal biomarker signatures obtained by LASSO regression analysis.
| Biomarker signature | Optimal biomarkers | lambda.min | Measure |
|---|---|---|---|
| Gene | GTF2IRD1, FTSJ1, CHMP4B, KLC3, DMAC2, GIT1, SOHLH2, SYNGR3, SAP130, LRRC1, FN3KRP, POU4F1, ZNF526, KRT80, UBE2C, FOXE1, MEX3D, CIDECP1, PRR19, DHX16, FANCG, AC010632.1, AC019171.1 | 0.018 | 0.884 |
| miRNA | hsa-miR-22-5p, hsa-miR-486-5p, hsa-miR-492, hsa-miR-561-5p, hsa-miR-151b, hsa-miR-3677-5p, hsa-miR-3923, hsa-miR-4425, hsa-miR-4434, hsa-miR-4536-5p, hsa-miR-4679, hsa-miR-5702, hsa-miR-6727-5p, hsa-miR-6858-5p, hsa-miR-7107-5p, hsa-let-7g-3p, hsa-miR-136-3p, hsa-miR-155-3p, hsa-miR-371a-3p, hsa-miR-491-3p, hsa-miR-432-3p, hsa-miR-574-3p, hsa-miR-3074-3p, hsa-miR-3622b-3p, hsa-miR-3679-3p, hsa-miR-3150b-3p, hsa-miR-4639-3p, hsa-miR-4655-3p, hsa-miR-6798-3p, hsa-miR-6847-3p | 0.017 | 0.734 |
| CpG site | cg01862650, cg02031308, cg02916472, cg07184316, cg07729440, cg10120778, cg10488199, cg11002952, cg20151576, cg20297017, cg20671274, cg21827634, cg22773522, cg23049130, cg25841348 | 0.015 | 0.845 |
| Multiomics | cg01862650, cg07729440, cg20671274, cg21827634, cg22773522, GTF2IRD1, FTSJ1, TTI1, CHMP4B, KLC3, HNRNPUL1, UBE2S, BCL2L12, SYNGR3, KRT80, FOXE1, AC006213.3, hsa-miR-22-5p, hsa-miR-492, hsa-miR-4536-5p, hsa-miR-6727-5p, hsa-miR-7107-5p, hsa-miR-136-3p, hsa-miR-3679-3p, hsa-miR-6816-3p | 0.010 | 0.938 |
LASSO: least absolute shrinkage and selection operator; TMB: tumor mutation burden.
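For readers unfamiliar with the `lambda.min` column in the table above: in an L1-penalized (LASSO) regression, the coefficients minimize a loss plus a penalty on their absolute values weighted by λ, and `lambda.min` conventionally denotes the λ with the best cross-validated performance. The formulation below is the standard one for a binary outcome, written with assumed notation (y_i = 1 for TMB-H, x_i the biomarker vector of patient i, N patients, p candidate biomarkers); the record does not spell out the exact objective used.

```latex
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\,\beta}
  \left\{ -\frac{1}{N} \sum_{i=1}^{N}
    \left[ y_i \left(\beta_0 + x_i^{\top}\beta\right)
      - \log\!\left(1 + e^{\beta_0 + x_i^{\top}\beta}\right) \right]
    + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \right\}
```

Larger values of λ shrink more coefficients to exactly zero, which is how each signature ends up with a short list of biomarkers.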
Coefficient of each biomarker of multiomics signature in LASSO model analysis.
| Signature | Biomarker | Coefficient |
|---|---|---|
| Multiomics | cg01862650 | -1.818730262 |
| | cg07729440 | -6.256940647 |
| | cg20671274 | -0.911730577 |
| | cg21827634 | -3.867697524 |
| | cg22773522 | -2.460934084 |
| | GTF2IRD1 | 0.039415926 |
| | FTSJ1 | 0.012286235 |
| | TTI1 | 0.042479337 |
| | CHMP4B | 0.009500278 |
| | KLC3 | 0.454704618 |
| | HNRNPUL1 | 0.015834511 |
| | UBE2S | 0.014978452 |
| | BCL2L12 | 0.078730525 |
| | SYNGR3 | 0.192724739 |
| | KRT80 | 0.017563239 |
| | FOXE1 | 0.011660062 |
| | AC006213.3 | 0.175314579 |
| | hsa-miR-22-5p | 1.230322969 |
| | hsa-miR-492 | -0.185026622 |
| | hsa-miR-4536-5p | 7.457452192 |
| | hsa-miR-6727-5p | 0.560095048 |
| | hsa-miR-7107-5p | -0.131067725 |
| | hsa-miR-136-3p | 0.93577811 |
| | hsa-miR-3679-3p | -0.007961529 |
| | hsa-miR-6816-3p | 1.359432308 |
LASSO: least absolute shrinkage and selection operator.
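A LASSO model of this kind is typically applied as a weighted sum of the biomarker values, so the coefficients in the table above could be turned into a per-patient score roughly as sketched below. The intercept, the scaling of the inputs (for example, methylation beta values and normalized expression), and the logistic transform are assumptions, since none of them is reported in this record; the function names are hypothetical.

```python
import math
from typing import Mapping

# Coefficients copied from the multiomics LASSO coefficient table.
COEFFICIENTS = {
    "cg01862650": -1.818730262,
    "cg07729440": -6.256940647,
    "cg20671274": -0.911730577,
    "cg21827634": -3.867697524,
    "cg22773522": -2.460934084,
    "GTF2IRD1": 0.039415926,
    "FTSJ1": 0.012286235,
    "TTI1": 0.042479337,
    "CHMP4B": 0.009500278,
    "KLC3": 0.454704618,
    "HNRNPUL1": 0.015834511,
    "UBE2S": 0.014978452,
    "BCL2L12": 0.078730525,
    "SYNGR3": 0.192724739,
    "KRT80": 0.017563239,
    "FOXE1": 0.011660062,
    "AC006213.3": 0.175314579,
    "hsa-miR-22-5p": 1.230322969,
    "hsa-miR-492": -0.185026622,
    "hsa-miR-4536-5p": 7.457452192,
    "hsa-miR-6727-5p": 0.560095048,
    "hsa-miR-7107-5p": -0.131067725,
    "hsa-miR-136-3p": 0.93577811,
    "hsa-miR-3679-3p": -0.007961529,
    "hsa-miR-6816-3p": 1.359432308,
}


def tpm_score(features: Mapping[str, float], intercept: float = 0.0) -> float:
    """Weighted sum of biomarker values; the intercept is an assumption (not reported)."""
    missing = COEFFICIENTS.keys() - features.keys()
    if missing:
        raise KeyError(f"missing biomarkers: {sorted(missing)}")
    return intercept + sum(coef * features[name] for name, coef in COEFFICIENTS.items())


def to_probability(score: float) -> float:
    """Numerically stable logistic transform, assuming a logistic LASSO model."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    z = math.exp(score)
    return z / (1.0 + z)
```

Under these assumptions, a higher score corresponds to a higher predicted likelihood of TMB-H; the sign of each coefficient shows the direction in which that biomarker shifts the score.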
Figure 7. The performance of the TPM in the training cohort. (a) ROC analysis of the TPM score in the training cohort; (b) the TPM score is highly correlated with TMB. TPM: TMB prediction model; AUC: area under curve.
Figure 8. The performance of the TPM in the validation cohort. (a) The AUC of the ROC analysis was 0.859, indicating high predictive accuracy of the TPM; (b) the TPM score is highly correlated with TMB (p value = 1.19e-14). TPM: TMB prediction model; AUC: area under curve.
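The AUC reported in these ROC analyses can be read as the probability that a randomly chosen TMB-H patient receives a higher model score than a randomly chosen TMB-L patient. The sketch below computes it directly from that pairwise definition; the function name and the example labels and scores are hypothetical and are not data from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels use 1 for TMB-H and 0 for TMB-L; a tied pair counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and model scores for six patients.
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.6]
example_auc = roc_auc(example_labels, example_scores)
```

The pairwise loop is quadratic in cohort size, which is acceptable for a few hundred patients; a rank-based formula gives the same value more efficiently on larger data.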