Qiao Sun1,2,3,4,5, Lin Bai1,2,3,4,5, Shaopin Zhu1,2,3,4,5, Lu Cheng1,2,3,4,5, Yang Xu6,7, Yu-Dong Cai8, Hui Chen1,2,3,4,5, Jian Zhang1,2,3,4,5.
Abstract
Lymphoma is a serious malignant tumor that comprises more than 70 different types and severely endangers the body's lymphatic system. The lymphatic system is the regulatory center of the immune system and plays an important role in the immune response to foreign antigens and tumors. Studies have shown that multiple genetic variants are associated with lymphoma, but determining the pathogenic mechanisms remains a challenge. In the present study, we first applied Gene Ontology (GO) and KEGG pathway enrichment analyses to lymphoma-associated and lymphoma-nonassociated genes. Next, the Boruta and max-relevance and min-redundancy (mRMR) feature selection methods were used to filter and rank the features. Then, the preselected and ranked features were fed into the incremental feature selection method with a decision tree model to identify the best GO terms and KEGG pathways and to extract classification rules. The results indicate that our predicted features, such as B-cell activation, negative regulation of protein processing, negative regulation of mast cell cytokine production, and natural killer cell-mediated cytotoxicity, are associated with the biological processes of lymphoma, consistent with recent publications. This study provides a new perspective for future research on the molecular mechanisms of lymphoma.
Year: 2022 PMID: 35795312 PMCID: PMC9251090 DOI: 10.1155/2022/8503511
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.246
Figure 1. Flow chart of the procedure for classifying the two types of genes in lymphoma. Gene Ontology (GO) and KEGG pathway enrichment are used to construct the features of the dataset, and the Boruta and mRMR feature selection methods are used to filter and rank the features. The optimal number of features and the optimal classifier are obtained by the incremental feature selection method with a decision tree (DT).
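The mRMR ranking step named in the Figure 1 caption can be illustrated with a minimal sketch. It assumes discrete feature values and the difference form of the criterion (relevance minus mean redundancy); this record does not state which mRMR variant or implementation was used, and the feature names and toy data below are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Only observed (x, y) pairs are summed, so log2 never receives zero.
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_rank(features, target):
    """Greedy mRMR ranking; `features` maps a name to a list of discrete values."""
    relevance = {name: mutual_information(col, target) for name, col in features.items()}
    selected, remaining = [], list(features)
    while remaining:
        def score(name):
            if not selected:
                return relevance[name]
            # Mean redundancy with the features that were already ranked.
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: feat_a_near almost duplicates feat_a,
# while feat_b is weakly relevant but independent of feat_a.
target = [1, 1, 1, 1, 1, 0, 0, 0]
features = {
    "feat_a": [1, 1, 1, 1, 0, 0, 0, 0],
    "feat_a_near": [1, 1, 1, 0, 0, 0, 0, 0],
    "feat_b": [1, 1, 0, 0, 1, 1, 0, 0],
}
ranking = mrmr_rank(features, target)
print(ranking)
```

In this toy case, ranking by relevance alone would place `feat_a_near` second. The redundancy penalty moves the independent `feat_b` ahead of it, which is the behavior mRMR is designed to produce.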
Figure 2. Incremental feature selection (IFS) curve of the DT classifier over different numbers of features. DT achieves its highest F1-measure of 0.486 when the top 805 features are used.
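The IFS procedure behind Figure 2 can be sketched as follows: evaluate a classifier on the top-k ranked features for increasing k, record the F1-measure at each k, and keep the k with the highest score. This is a minimal sketch under stated assumptions, not the authors' implementation. The `evaluate` function below is a hypothetical rule-based stand-in for training and validating a decision tree, and the feature names and samples are invented for illustration.

```python
def f1_measure(y_true, y_pred, positive=1):
    """F1-measure: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def incremental_feature_selection(ranked_features, evaluate, step=1):
    """Score the top-k prefixes of a ranked feature list and return the best k."""
    best_k, best_score, curve = 0, float("-inf"), []
    for k in range(step, len(ranked_features) + 1, step):
        score = evaluate(ranked_features[:k])
        curve.append((k, score))  # one point of the IFS curve
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score, curve


# Hypothetical toy samples: (feature values, label).
samples = [
    ({"feat_a": 1, "feat_b": 0, "feat_noise": 0}, 1),
    ({"feat_a": 0, "feat_b": 1, "feat_noise": 0}, 1),
    ({"feat_a": 0, "feat_b": 0, "feat_noise": 1}, 0),
    ({"feat_a": 0, "feat_b": 0, "feat_noise": 0}, 0),
]
ranked_features = ["feat_a", "feat_b", "feat_noise"]
y_true = [label for _, label in samples]


def evaluate(subset):
    # Stand-in classifier (NOT a decision tree): predict positive when any
    # selected feature is set. A real run would train and validate a DT here.
    y_pred = [1 if any(x[f] for f in subset) else 0 for x, _ in samples]
    return f1_measure(y_true, y_pred)


best_k, best_score, curve = incremental_feature_selection(ranked_features, evaluate)
print(best_k, best_score)
```

On this toy data the curve peaks at k = 2 and drops once the noise feature is added, mirroring how an IFS curve is read to pick the optimal number of features.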
Figure 3. Distribution of the GO features used in the best DT classifier across the three GO groups. Biological process (BP) GO terms are the most numerous, followed by molecular function (MF) and cellular component (CC) GO terms.