Model-Based Clustering With Data Correction For Removing Artifacts In Gene Expression Data.

William Chad Young, Adrian E Raftery, Ka Yee Yeung.

Abstract

The NIH Library of Integrated Network-based Cellular Signatures (LINCS) contains gene expression data from over a million experiments, measured using Luminex Bead technology. Only 500 colors are used to measure the expression levels of the 1,000 landmark genes, so the genes are measured in pairs and the data for each pair are deconvolved. The raw data are sometimes inadequate for reliable deconvolution, leading to artifacts in the final processed data. These artifacts include the expression levels of paired genes being flipped, paired genes being given the same value, and clusters of values that are not at the true expression level. We propose a new method called model-based clustering with data correction (MCDC) that is able to identify and correct these three kinds of artifacts simultaneously. We show that MCDC improves the agreement of the resulting gene expression data with external baselines, and that it improves the results of subsequent analysis.
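To make the "flipped pair" artifact concrete, the following is a minimal sketch, not the authors' MCDC implementation. It assumes a single cluster per gene pair, an axis-aligned Gaussian, and hard assignments, and it handles only the flip artifact; the function names (`simulate_pairs`, `correct_flips`) and the simulated data are hypothetical. The idea is to alternate between fitting the cluster to the currently corrected data and deciding, for each observation, whether the observed or the swapped orientation is more likely under that fit.

```python
import math
import random


def simulate_pairs(n=200, flip_every=10, seed=0):
    """Simulate (gene_a, gene_b) readings; every `flip_every`-th pair is swapped.

    Purely illustrative data: gene_a ~ N(4, 0.3^2), gene_b ~ N(9, 0.3^2).
    """
    rng = random.Random(seed)
    truth = [i % flip_every == 0 for i in range(n)]
    points = []
    for is_flipped in truth:
        a, b = rng.gauss(4.0, 0.3), rng.gauss(9.0, 0.3)
        points.append((b, a) if is_flipped else (a, b))
    return points, truth


def fit(points):
    """Estimate per-coordinate mean and variance of 2-D points."""
    n = len(points)
    mean = [sum(p[d] for p in points) / n for d in range(2)]
    var = [
        max(sum((p[d] - mean[d]) ** 2 for p in points) / n, 1e-6)
        for d in range(2)
    ]
    return mean, var


def log_density(point, mean, var):
    """Log density of a bivariate Gaussian with independent coordinates."""
    return sum(
        -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        for x, m, v in zip(point, mean, var)
    )


def correct_flips(points, max_iter=50):
    """Alternate between fitting the cluster and choosing each pair's orientation.

    Returns the corrected points and a per-observation "was flipped" flag.
    """
    flipped = [False] * len(points)
    corrected = list(points)
    for _ in range(max_iter):
        mean, var = fit(corrected)
        new_flipped = [
            log_density((b, a), mean, var) > log_density((a, b), mean, var)
            for a, b in points
        ]
        if new_flipped == flipped:
            break  # orientation decisions are stable
        flipped = new_flipped
        corrected = [(b, a) if f else (a, b) for (a, b), f in zip(points, flipped)]
    return corrected, flipped


points, truth = simulate_pairs()
corrected, flagged = correct_flips(points)
print(sum(flagged), "pairs flagged as flipped out of", len(points))
```

The sketch relies on flipped observations being a minority, so that the initial fit still favors the true orientation. A fuller treatment along the lines described in the abstract would use several clusters and would also cover the other two artifact types.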

Keywords:  Gene regulatory network; LINCS; MCDC; Model-based clustering

Year:  2017        PMID: 30740193      PMCID: PMC6364860          DOI: 10.1214/17-AOAS1051

Source DB:  PubMed          Journal:  Ann Appl Stat        ISSN: 1932-6157            Impact factor:   2.083


  3 in total

1.  Integration of Multiple Data Sources for Gene Network Inference Using Genetic Perturbation Data.

Authors:  Xiao Liang; William Chad Young; Ling-Hong Hung; Adrian E Raftery; Ka Yee Yeung
Journal:  J Comput Biol       Date:  2019-04-22       Impact factor: 1.479

2.  A Bayesian approach to accurate and robust signature detection on LINCS L1000 data.

Authors:  Yue Qiu; Tianhuan Lu; Hansaim Lim; Lei Xie
Journal:  Bioinformatics       Date:  2020-05-01       Impact factor: 6.937

3.  Deep learning prediction of chemical-induced dose-dependent and context-specific multiplex phenotype responses and its application to personalized alzheimer's disease drug repurposing.

Authors:  You Wu; Qiao Liu; Yue Qiu; Lei Xie
Journal:  PLoS Comput Biol       Date:  2022-08-11       Impact factor: 4.779