glmgraph: an R package for variable selection and predictive modeling of structured genomic data.

Li Chen, Han Liu, Jean-Pierre A Kocher, Hongzhe Li, Jun Chen.

Abstract

A central theme of modern high-throughput genomic data analysis is to identify relevant genomic features and to build a predictive model from the selected features for tasks such as personalized medicine. Correlating the large number of 'omics' features with a phenotype is particularly challenging because of the small sample size (n) and high dimensionality (p). To address this small n, large p problem, various forms of sparse regression models that exploit the sparsity assumption have been proposed. Among these, the network-constrained sparse regression model is of particular interest because it can use prior graph/network structure in the omics data. Despite its potential usefulness for omics data analysis, no efficient R implementation is publicly available. Here we present an R software package, 'glmgraph', that implements graph-constrained regularization for both sparse linear regression and sparse logistic regression. We implement both the L1 penalty and the minimax concave penalty for variable selection, and a Laplacian penalty for coefficient smoothing. An efficient coordinate descent algorithm is used to solve the optimization problem. We demonstrate the use of the package by applying it to a human microbiome dataset in which the phylogenetic structure among bacterial taxa is available.
AVAILABILITY AND IMPLEMENTATION: 'glmgraph' is implemented in R and C++ (Armadillo) and is publicly available on CRAN.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
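
The combination of penalties described in the abstract can be illustrated with a small sketch. It is not the package's code or API: it is plain Python (the package itself is R and C++), it covers only the linear model with the L1 penalty (not the minimax concave penalty or logistic regression), and it assumes the objective has the form (1/2n)||y - Xb||^2 + lam1 * sum|b_j| + (lam2/2) * b'Lb, where L is a graph Laplacian. The scaling of the terms and the names graph_lasso_cd, soft_threshold, lam1 and lam2 are assumptions made for illustration.

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: the closed-form solution of the L1 step."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def graph_lasso_cd(X, y, L, lam1, lam2, n_iter=200, tol=1e-8):
    """Cyclic coordinate descent for the assumed objective
    (1/2n)||y - Xb||^2 + lam1 * sum|b_j| + (lam2/2) * b'Lb.

    X is a list of n rows with p entries each, y a list of n responses,
    and L a p x p graph Laplacian given as a list of lists.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X*beta, with beta starting at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]

    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            old = beta[j]
            # Covariance of feature j with the partial residual (feature j excluded).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * old) for i in range(n)) / n
            # Contribution of the other coefficients through the Laplacian row.
            neigh = sum(L[j][k] * beta[k] for k in range(p) if k != j)
            z = rho - lam2 * neigh
            denom = col_sq[j] + lam2 * L[j][j]
            new = soft_threshold(z, lam1) / denom if denom > 0 else 0.0
            delta = new - old
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta


# Toy example: two features joined by one edge, so L = [[1, -1], [-1, 1]].
X_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y_toy = [2.0, 0.0, 2.0, 0.0]
L_toy = [[1.0, -1.0], [-1.0, 1.0]]
beta_smooth = graph_lasso_cd(X_toy, y_toy, L_toy, lam1=0.0, lam2=1.0)
```

In the toy example only the first feature is associated with the response, yet with lam2 > 0 the Laplacian term pulls the coefficient of its graph neighbour away from zero, which is the coefficient smoothing the abstract refers to. Raising lam1 shrinks coefficients toward zero and eventually removes them, which is the variable selection part.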

Year: 2015   PMID: 26315909   PMCID: PMC4692967   DOI: 10.1093/bioinformatics/btv497

Source DB: PubMed   Journal: Bioinformatics   ISSN: 1367-4803   Impact factor: 6.937


