Wei Li, Binchun Liu, Weiqian Wang, Can Sun, Jianpeng Che, Xuelian Yuan, Chunbo Zhai.
Abstract
Lung cancer is one of the leading causes of cancer death. Patients with early-stage lung cancer can be treated by surgery, while patients in the middle and late stages need chemotherapy or radiotherapy. Accurate staging of lung cancer is therefore crucial for doctors to formulate appropriate treatment plans. In this paper, the random forest algorithm is used as the lung cancer stage prediction model. Stage prediction is evaluated in the microbiome group, the transcriptome group, and the combined microbe and transcriptome fusion groups, and model performance is measured by indicators such as accuracy (ACC), recall, and precision. The results showed that the fusion analysis combining microbial and transcriptome data achieved the highest prediction accuracy, reaching 0.809. The study reveals the role of multimodal data and fusion algorithms in accurately diagnosing lung cancer stage, which could aid doctors in the clinic.
Year: 2022 PMID: 35880092 PMCID: PMC9308511 DOI: 10.1155/2022/2279044
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.809
Details of 189 samples downloaded from TCGA.
| Cancer | Group | Number |
|---|---|---|
| Lung | Stage I | 98 |
| Lung | Stages II, III, and IV | 91 |
Figure 1. (a) Volcano map. The figure compares the transcriptomes of 189 lung cancer patients. Each point represents a gene, the abscissa is the fold difference, and the ordinate is the negative logarithm of the p value. Colors distinguish whether genes are differentially expressed: blue represents downregulated genes, red represents upregulated genes, and gray represents genes that are not differentially expressed. Genes with greater differential expression lie farther from the center and are generally distributed toward the edges of the graph. (b) Heat map of the top 20 upregulated and downregulated genes, where the rows represent the stage of lung cancer and the columns represent the genes. (c)–(d) GO enrichment analysis and KEGG enrichment analysis. The horizontal axis represents the number of genes, the vertical axis represents the biological process and cell function, and the color represents the p value. The darker the color, the less significant the p value. In this paper, the top 10 pathways with the smallest p value were selected for display.
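The volcano map in Figure 1(a) places each gene by its fold difference and by the negative logarithm of its p value, then colors it by regulation status. The sketch below shows one common way such coordinates and colors can be derived. The log2 scaling of the fold change, the base-10 logarithm, and both cutoffs (`fc_cutoff`, `p_cutoff`) are assumptions for illustration; the source does not state the thresholds that were used.

```python
import math

# Color scheme taken from the figure caption.
COLORS = {"up": "red", "down": "blue", "not significant": "gray"}


def volcano_point(fold_change, p_value, fc_cutoff=2.0, p_cutoff=0.05):
    """Return (x, y, label) for one gene on a volcano plot.

    x is log2(fold change) and y is -log10(p value). The cutoffs are
    illustrative assumptions, not values reported in the source.
    """
    x = math.log2(fold_change)
    y = -math.log10(p_value)
    threshold = math.log2(fc_cutoff)
    if p_value < p_cutoff and x >= threshold:
        label = "up"
    elif p_value < p_cutoff and x <= -threshold:
        label = "down"
    else:
        label = "not significant"
    return x, y, label


# Hypothetical genes: (name, fold change, p value).
genes = [("GENE_A", 4.0, 0.001), ("GENE_B", 0.25, 0.001), ("GENE_C", 4.0, 0.5)]
for name, fc, p in genes:
    x, y, label = volcano_point(fc, p)
    print(f"{name}: x={x:.2f}, y={y:.2f}, {label} ({COLORS[label]})")
```

Under these assumptions, a gene needs both a small p value and a large absolute fold change to be colored red or blue, which is why strongly differential genes end up toward the upper left and upper right of the plot.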
Figure 2. Boxplot and beeswarm plot of the expression levels of 8 genera in different stages of lung cancer (rank sum test, p < 0.01).
Figure 3. The four sets of staging prediction results. (a) AUC values of the staging predictions. (b) Visual comparison of AUC, ACC, recall, and precision of the staging predictions.
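For readers unfamiliar with the indicators compared in Figure 3, the sketch below computes ACC, recall, precision, and AUC for a binary staging task using their standard definitions. It is not the authors' code. The label encoding (0 for Stage I, 1 for Stages II to IV), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration.

```python
def staging_metrics(y_true, y_pred, positive=1):
    """Accuracy, recall, and precision for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "acc": (tp + tn) / len(pairs),
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


def auc(y_true, scores, positive=1):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: assumed encoding 0 = Stage I, 1 = Stages II to IV.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1, 0.3, 0.6, 0.7, 0.2]  # hypothetical model outputs
y_pred = [1 if s >= 0.5 else 0 for s in scores]    # assumed 0.5 threshold

print(staging_metrics(y_true, y_pred))
print(auc(y_true, scores))
```

ACC, recall, and precision depend on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is one reason the two kinds of indicator are usually reported side by side.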