Statistical Contributions to Bioinformatics: Design, Modeling, Structure Learning, and Integration.

Jeffrey S Morris, Veerabhadran Baladandayuthapani.

Abstract

The advent of high-throughput multi-platform genomics technologies providing whole-genome molecular summaries of biological samples has revolutionized biomedical research. These technologies yield highly structured big data, whose analysis poses significant quantitative challenges. The field of Bioinformatics has emerged to deal with these challenges, and comprises many quantitative and biological scientists working together to effectively process these data and extract the treasure trove of information they contain. Statisticians, with their deep understanding of variability and uncertainty quantification, play a key role in these efforts. In this article, we attempt to summarize some of the key contributions of statisticians to bioinformatics, focusing on four areas: (1) experimental design and reproducibility, (2) preprocessing and feature extraction, (3) unified modeling, and (4) structure learning and integration. In each of these areas, we highlight some key contributions and try to elucidate the key statistical principles underlying these methods and approaches. Our goals are to demonstrate major ways in which statisticians have contributed to bioinformatics, to encourage statisticians to get involved early in methods development as new technologies emerge, and to stimulate future methodological work that builds on the statistical principles elucidated in this article and uses all available information to uncover new biological insights.

Keywords: Bioinformatics; Epigenetics; Experimental Design; Genomics; Preprocessing; Proteomics; Regularization; Reproducible Research; Statistical Modeling

Year: 2017 PMID: 29129969 PMCID: PMC5679480 DOI: 10.1177/1471082X17698255

Source DB: PubMed Journal: Stat Modelling ISSN: 1471-082X Impact factor: 2.039
