High dimensional classification with combined adaptive sparse PLS and logistic regression.

Ghislain Durif, Laurent Modolo, Jakob Michaelsson, Jeff E Mold, Sophie Lambert-Lacroix, Franck Picard.

Abstract

Motivation: The high dimensionality of genomic data calls for the development of specific classification methodologies, especially to prevent over-optimistic predictions. This challenge can be tackled by compression and variable selection, which, combined, constitute a powerful framework for classification as well as for data visualization and interpretation. However, currently proposed combinations lead to unstable and non-convergent methods due to inappropriate computational frameworks. We hereby propose a computationally stable and convergent approach for classification in high dimensions based on sparse Partial Least Squares (sparse PLS).
Results: We start by proposing a new solution for the sparse PLS problem that is based on proximal operators for the case of univariate responses. Then we develop an adaptive version of the sparse PLS for classification, called logit-SPLS, which combines iterative optimization of logistic regression and sparse PLS to ensure computational convergence and stability. Our results are confirmed on synthetic and experimental data. In particular, we show how crucial convergence and stability can be when cross-validation is involved for calibration purposes. Using gene expression data, we explore the prediction of breast cancer relapse. We also propose a multicategorial version of our method, used to predict cell types based on single-cell expression data.
Availability and implementation: Our approach is implemented in the plsgenomics R package.
Contact: ghislain.durif@inria.fr.
Supplementary information: Supplementary data are available at Bioinformatics online.
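Illustration: The proximal-operator idea mentioned in the Results can be sketched for a single component. Assuming the standard L1-penalized formulation of sparse PLS with a univariate response, the weight vector is obtained by applying soft-thresholding (the proximal operator of the L1 norm) to the covariance vector X^T y and normalizing. The code below is a minimal sketch under that assumption, not the plsgenomics implementation; the function names `soft_threshold` and `sparse_pls_weights`, the penalty `lam` and the toy data are hypothetical, and the adaptive weighting and the iterative logistic-regression step of logit-SPLS are omitted.

```python
import math


def soft_threshold(z, lam):
    """Proximal operator of lam * |z|: shrink z toward zero by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def sparse_pls_weights(X, y, lam):
    """Sparse weight vector for one PLS component (univariate response).

    X is a list of n rows with p features, y a list of n responses.
    Both are assumed to be centered (hypothetical helper, illustration only).
    """
    n, p = len(X), len(X[0])
    # Covariance-like vector c = X^T y, one entry per feature.
    c = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    # Soft-thresholding sets weakly correlated features exactly to zero.
    w = [soft_threshold(cj, lam) for cj in c]
    norm = math.sqrt(sum(v * v for v in w))
    # Normalize to unit length unless every feature was thresholded out.
    return [v / norm for v in w] if norm > 0 else w


# Toy centered data: feature 1 is only weakly related to y.
X = [[1.0, 0.1, -1.0],
     [-1.0, -0.1, 1.0],
     [2.0, 0.0, -1.0],
     [-2.0, 0.0, 1.0]]
y = [1.0, -1.0, 2.0, -2.0]

w = sparse_pls_weights(X, y, lam=1.0)
# Latent component scores t = X w, usable as a compressed predictor.
t = [sum(xij * wj for xij, wj in zip(row, w)) for row in X]
```

With this toy data the weakly related feature receives a weight of exactly zero, which is the variable-selection effect; larger values of `lam` remove more features.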
© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

Year:  2018        PMID: 28968879     DOI: 10.1093/bioinformatics/btx571

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  4 in total

1.  Prediction of tumour pathological subtype from genomic profile using sparse logistic regression with random effects.

Authors:  Özlem Kaymaz; Khaled Alqahtani; Henry M Wood; Arief Gusnanto
Journal:  J Appl Stat       Date:  2020-03-11       Impact factor: 1.416

2.  Computational identification of new potential transcriptional partners of ERRα in breast cancer cells: specific partners for specific targets.

Authors:  Catherine Cerutti; Ling Zhang; Violaine Tribollet; Jing-Ru Shi; Riwan Brillet; Benjamin Gillet; Sandrine Hughes; Christelle Forcet; Tie-Liu Shi; Jean-Marc Vanacker
Journal:  Sci Rep       Date:  2022-03-09       Impact factor: 4.379

3.  Linking genotype to phenotype in multi-omics data of small sample.

Authors:  Xinpeng Guo; Yafei Song; Shuhui Liu; Meihong Gao; Yang Qi; Xuequn Shang
Journal:  BMC Genomics       Date:  2021-07-13       Impact factor: 3.969

4.  Classification based on extensions of LS-PLS using logistic regression: application to clinical and multiple genomic data.

Authors:  Caroline Bazzoli; Sophie Lambert-Lacroix
Journal:  BMC Bioinformatics       Date:  2018-09-06       Impact factor: 3.169

