Large numbers of explanatory variables, a semi-descriptive analysis.

D R Cox, H S Battey.

Abstract

Data with a relatively small number of study individuals and a very large number of potential explanatory features arise particularly, but by no means only, in genomics. A powerful method of analysis, the lasso [Tibshirani R (1996) J Roy Stat Soc B 58:267-288], takes account of an assumed sparsity of effects, that is, that most of the features are nugatory. Standard criteria for model fitting, such as the method of least squares, are modified by imposing a penalty for each explanatory variable used. There results a single model, leaving open the possibility that other sparse choices of explanatory features fit virtually equally well. The method suggested in this paper aims to specify simple models that are essentially equally effective, leaving detailed interpretation to the specifics of the particular study. The method hinges on the ability to make initially a very large number of separate analyses, allowing each explanatory feature to be assessed in combination with many other such features. Further stages allow the assessment of more complex patterns such as nonlinear and interactive dependences. The method has formal similarities to so-called partially balanced incomplete block designs introduced 80 years ago [Yates F (1936) J Agric Sci 26:424-455] for the study of large-scale plant breeding trials. The emphasis in this paper is strongly on exploratory analysis; the more formal statistical properties obtained under idealized assumptions will be reported separately.
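The idea of making many separate small analyses, so that each feature is assessed alongside several different sets of other features, can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the function name `cube_screen`, the arrangement of the feature indices in a k x k x k array, the t-statistic threshold of 2.0, and the rule of retaining features flagged in at least two of their three regressions are all assumptions made here for the example, not details stated in the abstract.

```python
import numpy as np

def cube_screen(X, y, k, t_threshold=2.0, min_hits=2):
    """Screen features by many small least-squares fits (illustrative sketch).

    Feature indices are laid out in a k x k x k array (an assumed layout).
    Each line of k indices along each axis defines one small regression,
    so every feature is analysed three times, each time with different
    companion features.
    """
    n, p = X.shape
    if p != k ** 3:
        raise ValueError("this sketch assumes exactly k**3 features")
    if n <= k + 1:
        raise ValueError("need more observations than parameters per fit")

    cube = np.arange(p).reshape(k, k, k)
    hits = np.zeros(p, dtype=int)

    for axis in range(3):
        # Move the chosen axis last so each row of `lines` holds k indices.
        lines = np.moveaxis(cube, axis, -1).reshape(-1, k)
        for idx in lines:
            # Least squares of y on an intercept plus the k features in this line.
            A = np.column_stack([np.ones(n), X[:, idx]])
            beta, *_ = np.linalg.lstsq(A, y, rcond=None)
            resid = y - A @ beta
            sigma2 = resid @ resid / (n - A.shape[1])
            cov = sigma2 * np.linalg.pinv(A.T @ A)
            # t-statistics for the k slopes (the intercept is skipped).
            t = beta[1:] / np.sqrt(np.diag(cov)[1:])
            hits[idx[np.abs(t) > t_threshold]] += 1

    # Keep features flagged in at least `min_hits` of their three analyses.
    return np.flatnonzero(hits >= min_hits), hits
```

Calling `cube_screen(X, y, k=3)` on an `n x 27` matrix returns the indices of the retained features together with the per-feature counts. In an exploratory setting the retained set would be a starting point for further stages of analysis rather than a final model.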

Keywords:  genomics; sparse effects; statistical analysis

Year:  2017        PMID: 28739925      PMCID: PMC5559019          DOI: 10.1073/pnas.1703764114

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


  2 in total

1.  Genes expressed in blood link osteoarthritis with apoptotic pathways.

Authors:  Yolande F M Ramos; Steffan D Bos; Nico Lakenberg; Stefan Böhringer; Wouter J den Hollander; Margreet Kloppenburg; P Eline Slagboom; Ingrid Meulenbelt
Journal:  Ann Rheum Dis       Date:  2013-07-17       Impact factor: 19.103

2.  Role of FGF/FGFR signaling in skeletal development and homeostasis: learning from mouse models. (Review)

Authors:  Nan Su; Min Jin; Lin Chen
Journal:  Bone Res       Date:  2014-04-29       Impact factor: 13.567

  6 in total

1.  Fusion of single-cell transcriptome and DNA-binding data, for genomic network inference in cortical development.

Authors:  Thomas Bartlett
Journal:  BMC Bioinformatics       Date:  2021-06-04       Impact factor: 3.169

2.  Adiposity, metabolomic biomarkers, and risk of nonalcoholic fatty liver disease: a case-cohort study.

Authors:  Yuanjie Pang; Christiana Kartsonaki; Jun Lv; Iona Y Millwood; Zammy Fairhurst-Hunter; Iain Turnbull; Fiona Bragg; Michael R Hill; Canqing Yu; Yu Guo; Yiping Chen; Ling Yang; Robert Clarke; Robin G Walters; Ming Wu; Junshi Chen; Liming Li; Zhengming Chen; Michael V Holmes
Journal:  Am J Clin Nutr       Date:  2022-03-04       Impact factor: 8.472

3.  The role of NMR-based circulating metabolic biomarkers in development and risk prediction of new onset type 2 diabetes.

Authors:  Fiona Bragg; Christiana Kartsonaki; Yu Guo; Michael Holmes; Huaidong Du; Canqing Yu; Pei Pei; Ling Yang; Donghui Jin; Yiping Chen; Dan Schmidt; Daniel Avery; Jun Lv; Junshi Chen; Robert Clarke; Michael R Hill; Liming Li; Iona Y Millwood; Zhengming Chen
Journal:  Sci Rep       Date:  2022-09-05       Impact factor: 4.996

4.  Circulating proteins and risk of pancreatic cancer: a case-subcohort study among Chinese adults.

Authors:  Christiana Kartsonaki; Yuanjie Pang; Iona Millwood; Ling Yang; Yu Guo; Robin Walters; Jun Lv; Michael Hill; Canqing Yu; Yiping Chen; Xiaofang Chen; Eric O'Neill; Junshi Chen; Ruth C Travis; Robert Clarke; Liming Li; Zhengming Chen; Michael V Holmes
Journal:  Int J Epidemiol       Date:  2022-06-13       Impact factor: 9.685

5.  Big data: Some statistical issues.

Authors:  D R Cox; Christiana Kartsonaki; Ruth H Keogh
Journal:  Stat Probab Lett       Date:  2018-05       Impact factor: 0.870

6.  Large numbers of explanatory variables: a probabilistic assessment.

Authors:  H S Battey; D R Cox
Journal:  Proc Math Phys Eng Sci       Date:  2018-07-04       Impact factor: 2.704
