Classification and prediction for multi-cancer data with ultrahigh-dimensional gene expressions.

Li-Pang Chen

Abstract

Analysis of gene expression data is an attractive topic in bioinformatics, and a typical application is to classify and predict individuals' diseases or tumors by treating gene expression values as predictors. A primary challenge is ultrahigh dimensionality, which implies that (i) many predictors in the dataset may be non-informative, and (ii) pairwise dependence structures may exist among the high-dimensional predictors, yielding a network structure. While many supervised learning methods have been developed, prediction performance is expected to suffer if the impact of ultrahigh dimensionality is not carefully addressed. In this paper, we propose a new statistical learning algorithm for multi-class classification with ultrahigh-dimensional gene expressions. The proposed algorithm first applies a model-free feature screening method to retain informative gene expression values from the ultrahigh-dimensional data, and then constructs predictive models that accommodate the network structure of the selected gene expressions. Unlike existing supervised learning methods that build predictive models on the entire dataset, our approach is able to identify informative predictors and dependence structures among gene expressions. Through analysis of a real dataset, we find that the proposed algorithm gives precise classification as well as accurate prediction, and outperforms some commonly used supervised learning methods.
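
The two-stage idea in the abstract (screen features marginally, then look for pairwise dependence among the retained ones) can be illustrated with a minimal sketch. The abstract does not name the screening statistic or the network estimator, so the choices below are assumptions made only for illustration: a rank-based Kruskal-Wallis statistic stands in for the model-free screening step, and a thresholded Spearman correlation stands in for the network step. All function names, the synthetic data, and the values of `keep` and `threshold` are hypothetical and are not taken from the paper.

```python
import random

def ranks(values):
    """Average ranks (1-based); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def kruskal_h(column, labels):
    """Kruskal-Wallis H statistic (no tie correction) of one feature across classes."""
    n = len(column)
    sums, counts = {}, {}
    for rank, y in zip(ranks(column), labels):
        sums[y] = sums.get(y, 0.0) + rank
        counts[y] = counts.get(y, 0) + 1
    between = sum(sums[y] ** 2 / counts[y] for y in sums)
    return 12.0 / (n * (n + 1)) * between - 3 * (n + 1)

def screen(X, labels, keep):
    """Indices of the `keep` features with the largest marginal H statistic."""
    p = len(X[0])
    scores = [kruskal_h([row[j] for row in X], labels) for j in range(p)]
    return sorted(range(p), key=lambda j: scores[j], reverse=True)[:keep]

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def network_edges(X, selected, threshold):
    """Pairs of retained features whose absolute Spearman correlation exceeds threshold."""
    cols = {j: ranks([row[j] for row in X]) for j in selected}
    edges = []
    for pos, a in enumerate(selected):
        for b in selected[pos + 1:]:
            if abs(pearson(cols[a], cols[b])) > threshold:
                edges.append((a, b))
    return edges

# Synthetic stand-in for expression data (an assumption, not the paper's dataset):
# 90 samples, 200 features, 3 classes; only features 0, 1, 2 depend on the class.
rng = random.Random(0)
n, p = 90, 200
labels = [i % 3 for i in range(n)]
X = []
for y in labels:
    row = [rng.gauss(0.0, 1.0) for _ in range(p)]
    row[0] += 2.0 * y
    row[1] = row[0] + 0.3 * rng.gauss(0.0, 1.0)  # feature 1 tracks feature 0
    row[2] += 2.0 * y
    X.append(row)

selected = screen(X, labels, keep=5)
edges = network_edges(X, selected, threshold=0.8)
print("retained features:", selected)
print("dependence edges:", edges)
```

A marginal correlation threshold is a crude network estimate, because features that both respond to the class label will look dependent even when they are conditionally independent; it is used here only to keep the sketch short.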

Year:  2022        PMID: 36107929      PMCID: PMC9477337          DOI: 10.1371/journal.pone.0274440

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.752

