A Novel XGBoost Method to Infer the Primary Lesion of 20 Solid Tumor Types From Gene Expression Data.

Sijie Chen, Wenjing Zhou, Jinghui Tu, Jian Li, Bo Wang, Xiaofei Mo, Geng Tian, Kebo Lv, Zhijian Huang.

Abstract

PURPOSE: To establish a machine learning model, based on an ensemble (integrated) learning approach, that identifies the primary lesion of metastatic tumors and thereby improves the accuracy and efficiency of primary-lesion diagnosis.
METHODS: After removing features whose expression level falls below a threshold, we apply two feature-selection methods and use XGBoost for classification. The optimal model is selected through 10-fold cross-validation and then verified on an independent test set.
RESULTS: With around 800 selected genes used for training, the R²-score reaches 96.38% under 10-fold cross-validation on the training data and 83.3% on the test data.
CONCLUSION: These findings suggest that combining tumor data with machine learning methods yields a classification accuracy for each cancer type, which can be used to predict the location of the primary lesion of metastatic tumors. The machine-learning-based method can serve as an orthogonal diagnostic method, allowing the model's predictions to be checked against the actual clinical pathological findings.
Copyright © 2021 Chen, Zhou, Tu, Li, Wang, Mo, Tian, Lv and Huang.
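
The workflow summarized in METHODS (low-expression filter, feature selection, boosted-tree classifier, 10-fold cross-validation, independent test set) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the data are synthetic, the threshold and the number of selected features are arbitrary, a single ANOVA F-test selector stands in for the two unspecified feature-selection methods, and scikit-learn's GradientBoostingClassifier stands in for XGBoost (xgboost's XGBClassifier follows the same fit/predict interface and could be substituted).

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline


def drop_low_expression(X, threshold):
    """Keep only the columns (genes) whose mean value is at least `threshold`."""
    keep = X.mean(axis=0) >= threshold
    return X[:, keep], keep


# Synthetic stand-in for an expression matrix (samples x genes) with 3 tumor classes.
X, y = make_classification(n_samples=300, n_features=60, n_informative=10,
                           n_classes=3, random_state=0)
X = X - X.min()  # shift so values are non-negative, loosely like expression levels

# Hold out an independent test set before any filtering or model selection.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

# Step 1: remove low-expression features; the median cutoff is an arbitrary assumption.
cutoff = np.median(X_train.mean(axis=0))
X_train_f, keep = drop_low_expression(X_train, cutoff)
X_test_f = X_test[:, keep]

# Steps 2-3: feature selection followed by a boosted-tree classifier.
# The selector sits inside the pipeline so it is refit within each CV fold.
model = make_pipeline(
    SelectKBest(f_classif, k=15),
    GradientBoostingClassifier(n_estimators=30, max_depth=3, random_state=0),
)

# Step 4: 10-fold cross-validation on the training data (default score: accuracy).
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
cv_scores = cross_val_score(model, X_train_f, y_train, cv=cv)

# Step 5: refit on all training data and evaluate once on the held-out test set.
model.fit(X_train_f, y_train)
test_accuracy = model.score(X_test_f, y_test)

print("mean 10-fold CV accuracy:", cv_scores.mean())
print("held-out test accuracy:", test_accuracy)
```

Keeping the selector inside the pipeline means the held-out fold never influences which features are chosen, which avoids an optimistic bias in the cross-validation estimate.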

Keywords: CUP; XGBoost; feature selection; gene expression; tumor tissue-of-origin

Year: 2021   PMID: 33613644   PMCID: PMC7886791   DOI: 10.3389/fgene.2021.632761

Source DB: PubMed   Journal: Front Genet   ISSN: 1664-8021   Impact factor: 4.599

