Yingxia Li, Ulrich Mansmann, Shangming Du, Roman Hornung.
Abstract
Lung adenocarcinoma (LUAD) is a common and highly lethal cancer. Accurate staging is a prerequisite for effective diagnosis and treatment, so improving the accuracy of stage prediction for LUAD patients is of great clinical relevance. Previous works have mainly built predictive models from a single type of genomic data or from a small number of omics data types used concurrently; only a few have considered multi-omics data from genome to proteome. We used a publicly available dataset to illustrate the potential of multi-omics data for stage prediction in LUAD. In particular, we investigated the roles of the specific omics data types in the prediction process. For stage prediction we used a self-developed method, Omics-MKL, which combines an existing feature ranking technique, Minimum Redundancy and Maximum Relevance (mRMR), which avoids redundancy among the selected features, with multiple kernel learning (MKL), which applies different kernels to different omics data types. Each of the considered omics data types individually provided useful prediction results. Moreover, using multi-omics data delivered notably better results than using single-omics data. Gene expression and methylation information seem to play vital roles in the staging of LUAD. The Omics-MKL method retained 70 features after the selection process. Of these, 21 (30%) were methylation features and 34 (48.57%) were gene expression features. Moreover, 18 (25.71%) of the selected features are known to be related to LUAD, and 29 (41.43%) to lung cancer in general. Using multi-omics data from genome to proteome to predict the stage of LUAD seems promising because each omics data type may improve the accuracy of the predictions. Here, methylation and gene expression data may play particularly important roles.
Keywords: MKL; lung adenocarcinoma; mRMR; multi-omics data
Year: 2021 PMID: 34946821 PMCID: PMC8700916 DOI: 10.3390/genes12121872
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
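The abstract describes Omics-MKL as mRMR feature ranking followed by multiple kernel learning with a separate kernel per omics data type. The Python sketch below illustrates that general idea only and is not the authors' implementation. It makes several labeled assumptions: absolute Pearson correlation stands in for the mutual information that mRMR normally uses, the kernel weights are fixed rather than learned as in real MKL, and the feature names (`expr_a`, `expr_b`, `meth_c`) and all values are hypothetical toy data.

```python
import math

def abs_pearson(a, b):
    """Absolute Pearson correlation; an assumed stand-in for mutual information."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return abs(cov / math.sqrt(va * vb))

def mrmr_select(columns, y, k):
    """Greedy mRMR: repeatedly pick the feature with the highest
    relevance to y minus mean redundancy with already selected features."""
    relevance = {name: abs_pearson(col, y) for name, col in columns.items()}
    selected, remaining = [], list(columns)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                abs_pearson(columns[name], columns[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

def linear_kernel(rows):
    """Gram matrix of dot products between sample rows."""
    return [[sum(a * b for a, b in zip(r, s)) for s in rows] for r in rows]

def rbf_kernel(rows, gamma=1.0):
    """Gram matrix of Gaussian (RBF) similarities between sample rows."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(r, s)))
             for s in rows] for r in rows]

def combine_kernels(kernels, weights):
    """Weighted sum of per-data-type kernels (weights fixed here, not learned)."""
    n = len(kernels[0])
    return [[sum(w * K[i][j] for w, K in zip(weights, kernels))
             for j in range(n)] for i in range(n)]

# Hypothetical toy data: six samples, binary stage label (0 = early, 1 = late).
y = [0, 0, 0, 1, 1, 1]
columns = {
    "expr_a": [1, 2, 3, 4, 5, 6],  # strongly related to the label
    "expr_b": [1, 2, 3, 4, 5, 6],  # exact duplicate of expr_a, fully redundant
    "meth_c": [0, 1, 0, 1, 0, 1],  # weaker, but adds different information
}
selected = mrmr_select(columns, y, k=2)

# One kernel per (toy) omics type, built from the selected features.
expr_rows = [[v] for v in columns["expr_a"]]
meth_rows = [[v] for v in columns["meth_c"]]
combined = combine_kernels(
    [linear_kernel(expr_rows), rbf_kernel(meth_rows)], weights=[0.5, 0.5]
)
```

In this toy example the duplicate feature `expr_b` is skipped in favor of `meth_c`, which is the redundancy-avoiding behavior the abstract attributes to mRMR. The combined matrix could then be passed to any classifier that accepts a precomputed kernel.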
Figure 1. The case numbers available for each combination of omics data types.
Patients’ basic characteristics.
| Items | Category | Number | Percent (%) |
|---|---|---|---|
| Gender | Male | 162 | 46.15 |
| Age (year) | <60 | 94 | 26.78 |
| Average age at diagnosis | | 65.04 | |
| Stage | Early (T1, T2) | 270 | 76.92 |
| Total | | 351 | |
Data description.
| Data Type | No. of Features | No. of Selected Features |
|---|---|---|
| CNV | 25,988 | 10 |
| Methylation | 13,620 | 21 |
| Gene Expression | 15,751 | 34 |
| miRNA Expression | 595 | 3 |
| Protein Expression | 216 | 2 |
| Total | 56,170 | 70 |
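The percentages quoted in the abstract follow directly from this table. As a quick check, the short Python snippet below recomputes each data type's share of the 70 selected features from the counts above; the variable names are illustrative only.

```python
# Counts copied from the "Data description" table.
n_features = {
    "CNV": 25988,
    "Methylation": 13620,
    "Gene Expression": 15751,
    "miRNA Expression": 595,
    "Protein Expression": 216,
}
n_selected = {
    "CNV": 10,
    "Methylation": 21,
    "Gene Expression": 34,
    "miRNA Expression": 3,
    "Protein Expression": 2,
}

total_features = sum(n_features.values())
total_selected = sum(n_selected.values())

# Share of each data type among the selected features, in percent.
shares = {
    name: round(100 * count / total_selected, 2)
    for name, count in n_selected.items()
}

for name, share in shares.items():
    print(f"{name}: {n_selected[name]} of {total_selected} selected ({share}%)")
```

The methylation and gene expression shares match the 30% and 48.57% reported in the abstract, and the column totals match the table's "Total" row.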
Figure 2. Comparison between prediction performance obtained using multi-omics data and single-omics data (prediction method: Omics-MKL).
Figure 3. Comparison between the prediction performance obtained using all five omics data types and after removing one omics data type at a time (prediction method: Omics-MKL).
Figure 4. Comparison of the prediction methods applied to the whole multi-omics dataset.
Figure 5. The relationship between the number of selected features and the cross-validated AUC.
Figure 6. The percentages of features from each data type in the selected features.
Figure 7. The selected features ranked according to their association with the outcome and the mutual information between them. Red represents protein features, yellow represents miRNA features, brown represents CNV features, purple represents methylation features, and blue represents RNA-seq features.
Figure 8. Bar graph of enriched molecular functions based on the 70 selected features.
The top 10 features related to LUAD.
| Rank ID | Genes | The Content of the Report | PubMed ID |
|---|---|---|---|
| 2 | YTHDF2 | The m6A-related genes METTL3, YTHDF1, and YTHDF2 could serve as novel biomarkers for the prognosis of LUAD. | PMID: 32086933 |
| 3 | DSG2 | High DSG2 expression in both lung adenocarcinoma (LUAD) cell lines and tissues is associated with poor prognosis in LUAD patients. | PMID: 32272148 |
| 4 | XAF1 | XAF1 inhibits cell proliferation and induces apoptosis in the human lung. | PMID: 25539606 |
| 5 | CAPN1 | CAPN1 promotes malignant behavior and erlotinib resistance mediated by phosphorylation of c-Met and PIK3R2 via degrading PTPN1 in lung adenocarcinoma. | PMID: 32395869 |
| 8 | LINC00665 | Long non-coding RNA LINC00665 promotes lung adenocarcinoma progression and functions as ceRNA to regulate AKR1B10-ERK signaling by sponging miR-98. | PMID: 30692511 |
| 17 | PRKG1 | The MAPK, PI3K-Akt, Ras, and cGMP-PRKG1 signaling pathways were considered to be most probably correlated with platinum resistance. | PMID: 29288364 |
| 23 | BCAN | A survival prediction model composed of six TME-related genes (CLEC17A, TAGAP, ABCC8, BCAN, FLT3, and CCR2) was used in a Lung Adenocarcinoma Microenvironment. | PMID: 32337264 |
| 26 | SHC1 | In NSCLC, the failure of pathways which involve factors such as DAPK1, GADD45A, SHC1, and TP53, in response to short telomeres, could promote tumor progression. | PMID: 22433385 |
| 30 | NAV3 | The most commonly mutated genes with predicted neo-antigens are KRAS, TTN, RYR2, MUC16, TP53, USH2A, ZFHX4, KEAP1, STK11, FAT3, NAV3, and EGFR in lung adenocarcinoma. | PMID: 30075702 |
| 37 | BCL7B | Compared with the combined human ACs, 39 genes with similar expression changes in murine lung tumors and human ACs/LCCs were identified, such as the oncogene related BCL7B, the cell cycle regulator CDK4, and the proapoptotic Endophilin B1. | PMID: 14647414 |
Figure 9. Relation between the values of the 18 selected features known to be associated with LUAD and the stage. Purple boxes represent RNA-seq features, blue boxes DNA methylation features, red boxes protein features, and bars CNV features (CNV values: −2 = homozygous deletion; −1 = hemizygous deletion; 0 = neutral/no change; 1 = gain; 2 = high-level amplification).