
Computationally scalable regression modeling for ultrahigh-dimensional omics data with ParProx.

Seyoon Ko, Ginny X Li, Hyungwon Choi, Joong-Ho Won.

Abstract

Statistical analysis of ultrahigh-dimensional omics-scale data has long depended on univariate hypothesis testing. With growing numbers of features and samples, the obvious next step is to establish multivariable association analysis as a routine method to describe genotype-phenotype association. Here we present ParProx, a state-of-the-art implementation for optimizing overlapping and non-overlapping group lasso regression models for time-to-event and classification analysis, with selection of variables grouped by biological priors. ParProx enables multivariable model fitting for ultrahigh-dimensional data within an architecture for parallel or distributed computing via latent variable group representation. It thereby aims to produce interpretable regression models consistent with known biological relationships among independent variables, a property often explored post hoc rather than during model estimation. Simulation studies clearly demonstrate the scalability of ParProx with graphics processing units in comparison to existing implementations. We illustrate the tool using three different omics data sets featuring moderate to large numbers of variables, where we use genomic regions and biological pathways as variable groups, rendering the selected independent variables directly interpretable with respect to those groups. ParProx is applicable to a wide range of studies using ultrahigh-dimensional omics data, from genome-wide association analysis to multi-omics studies where model estimation is computationally intractable with existing implementations.
© The Author(s) 2021. Published by Oxford University Press.
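
The latent group representation and proximal gradient fitting named in the abstract can be illustrated with a minimal NumPy sketch. This is not ParProx's code or API: the function names (`expand_latent`, `prox_group_lasso`, `fit_logistic_group_lasso`), the unweighted penalty, the fixed step size, and the plain logistic loss are all assumptions made for illustration. The idea shown is that duplicating the columns shared by overlapping groups turns the penalty into a sum over disjoint blocks, whose proximal operator is block-wise soft-thresholding.

```python
import numpy as np


def expand_latent(X, groups):
    """Duplicate columns so that overlapping groups become disjoint blocks.

    groups: list of lists of column indices (hypothetical input format).
    """
    cols = np.concatenate([np.asarray(g, dtype=int) for g in groups])
    bounds = np.cumsum([0] + [len(g) for g in groups])
    blocks = [slice(bounds[i], bounds[i + 1]) for i in range(len(groups))]
    return X[:, cols], cols, blocks


def prox_group_lasso(v, blocks, thresh):
    """Block soft-thresholding: shrink each block's Euclidean norm by thresh."""
    out = np.zeros_like(v)
    for b in blocks:
        norm = np.linalg.norm(v[b])
        if norm > thresh:
            out[b] = (1.0 - thresh / norm) * v[b]
    return out


def fit_logistic_group_lasso(X, y, groups, lam, n_iter=500):
    """Proximal gradient descent for group-lasso-penalized logistic regression.

    Assumes y in {0, 1} and an unweighted penalty lam * sum_g ||beta_g||_2.
    """
    Xl, cols, blocks = expand_latent(X, groups)
    n = Xl.shape[0]
    # Step size 1/L, where L = ||Xl||_2^2 / (4n) bounds the curvature
    # of the mean logistic loss.
    step = 4.0 * n / np.linalg.norm(Xl, 2) ** 2
    beta = np.zeros(Xl.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xl @ beta)))
        grad = Xl.T @ (p - y) / n
        beta = prox_group_lasso(beta - step * grad, blocks, step * lam)
    # Sum the latent copies to recover one coefficient per original variable.
    coef = np.zeros(X.shape[1])
    np.add.at(coef, cols, beta)
    return coef, beta, blocks
```

For example, with `groups = [[0, 1], [1, 2]]` the shared variable 1 receives two latent coefficients, one per group, and a group is selected or dropped as a whole. Because each block is thresholded independently, the proximal step is the part of such a scheme that lends itself to parallel or GPU execution.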

Keywords:  latent group lasso; parallel computing; proximal gradient; sparse regression; ultrahigh-dimensional omics data

Year:  2021        PMID: 34254998      PMCID: PMC8575036          DOI: 10.1093/bib/bbab256

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622

