Dealing with missing values in large-scale studies: microarray data imputation and beyond.

Tero Aittokallio

Abstract

High-throughput biotechnologies, such as gene expression microarrays or mass-spectrometry-based proteomic assays, suffer from frequent missing values due to various experimental reasons. Since the missing data points can hinder downstream analyses, there exists a wide variety of ways in which to deal with missing values in large-scale data sets. Nowadays, it has become routine to estimate (or impute) the missing values prior to the actual data analysis. After nearly a decade since the publication of the first missing value imputation methods for gene expression microarray data, new imputation approaches are still being developed at an increasing rate. However, what is lagging behind is a systematic and objective evaluation of the strengths and weaknesses of the different approaches when faced with different types of data sets and experimental questions. In this review, the present strategies for missing value imputation and the measures for evaluating their performance are described. The imputation methods are first reviewed in the context of gene expression microarray data, since most of the methods have been developed for estimating gene expression levels; then, we turn to other large-scale data sets that also suffer from the problems posed by missing values, together with pointers to possible imputation approaches in these settings. Along with a description of the basic principles behind the different imputation approaches, the review tries to provide practical guidance for the users of high-throughput technologies on how to choose the imputation tool for their data and questions, and some additional research directions for the developers of imputation methodologies.
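The abstract mentions imputation strategies and measures for evaluating their performance without showing either. The sketch below is one illustrative way to do both, not a method taken from the review. It hides a few known entries of a small toy matrix (rows stand in for genes, columns for samples), imputes them with a row-mean baseline and with a simple nearest-neighbour scheme, and scores each by root mean squared error on the hidden entries. The function names (`row_mean_impute`, `knn_impute`, `rmse`), the toy data, the distance definition and the choice `k=2` are all assumptions made for the example.

```python
import numpy as np


def row_mean_impute(X):
    """Replace each NaN with the mean of the observed values in its row."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    row_means = np.nanmean(X, axis=1)
    rows, cols = np.where(np.isnan(X))
    filled[rows, cols] = row_means[rows]
    return filled


def knn_impute(X, k=2):
    """Fill each NaN with the mean of that column over the k nearest rows.

    Distance between two rows is the root mean squared difference over the
    columns observed in both (an assumption made for this sketch).
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()
    fallback = row_mean_impute(X)  # used when no neighbour has the value
    for i in np.where(missing.any(axis=1))[0]:
        dists = []
        for j in range(X.shape[0]):
            if j == i:
                continue
            shared = ~missing[i] & ~missing[j]
            if shared.any():
                d = np.sqrt(np.mean((X[i, shared] - X[j, shared]) ** 2))
                dists.append((d, j))
        dists.sort()  # nearest rows first
        for c in np.where(missing[i])[0]:
            donors = [X[j, c] for _, j in dists if not missing[j, c]][:k]
            filled[i, c] = np.mean(donors) if donors else fallback[i, c]
    return filled


def rmse(truth, imputed, mask):
    """Root mean squared error restricted to the artificially hidden entries."""
    err = truth[mask] - imputed[mask]
    return float(np.sqrt(np.mean(err ** 2)))


# Toy "complete" matrix: two groups of rows with opposite trends.
complete = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 3.1, 4.1],
    [0.9, 1.9, 2.9, 3.9],
    [4.0, 3.0, 2.0, 1.0],
    [4.1, 3.1, 2.1, 1.1],
    [3.9, 2.9, 1.9, 0.9],
])

# Hide two known entries so the imputed values can be compared with the truth.
mask = np.zeros(complete.shape, dtype=bool)
mask[0, 3] = True
mask[3, 0] = True
observed = complete.copy()
observed[mask] = np.nan

mean_filled = row_mean_impute(observed)
knn_filled = knn_impute(observed, k=2)
rmse_mean = rmse(complete, mean_filled, mask)
rmse_knn = rmse(complete, knn_filled, mask)

print(f"row-mean RMSE: {rmse_mean:.3f}")
print(f"kNN RMSE:      {rmse_knn:.3f}")
```

On this toy matrix the neighbour-based fill recovers the hidden values far better than the row mean, because the missing entries sit at the extremes of their rows. Real data sets may rank the two methods differently, and the outcome depends on how the missing entries arise.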

Year:  2009        PMID: 19965979     DOI: 10.1093/bib/bbp059

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


  41 in total

1.  Biological impact of missing-value imputation on downstream analyses of gene expression profiles.

Authors:  Sunghee Oh; Dongwan D Kang; Guy N Brock; George C Tseng
Journal:  Bioinformatics       Date:  2010-11-02       Impact factor: 6.937

2.  Random Forest Missing Data Algorithms.

Authors:  Fei Tang; Hemant Ishwaran
Journal:  Stat Anal Data Min       Date:  2017-06-13       Impact factor: 1.051

3.  SIMON, an Automated Machine Learning System, Reveals Immune Signatures of Influenza Vaccine Responses.

Authors:  Adriana Tomic; Ivan Tomic; Yael Rosenberg-Hasson; Cornelia L Dekker; Holden T Maecker; Mark M Davis
Journal:  J Immunol       Date:  2019-06-14       Impact factor: 5.422

4.  Cardiac metabolic effects of KNa1.2 channel deletion and evidence for its mitochondrial localization.

Authors:  Charles O Smith; Yves T Wang; Sergiy M Nadtochiy; James H Miller; Elizabeth A Jonas; Robert T Dirksen; Keith Nehrke; Paul S Brookes
Journal:  FASEB J       Date:  2018-06-04       Impact factor: 5.191

5.  Predicting outcomes in radiation oncology--multifactorial decision support systems. (Review)

Authors:  Philippe Lambin; Ruud G P M van Stiphout; Maud H W Starmans; Emmanuel Rios-Velazquez; Georgi Nalbantov; Hugo J W L Aerts; Erik Roelofs; Wouter van Elmpt; Paul C Boutros; Pierluigi Granone; Vincenzo Valentini; Adrian C Begg; Dirk De Ruysscher; Andre Dekker
Journal:  Nat Rev Clin Oncol       Date:  2012-11-20       Impact factor: 66.675

6.  Effects of imputation on correlation: implications for analysis of mass spectrometry data from multiple biological matrices.

Authors:  Sandra L Taylor; L Renee Ruhaak; Karen Kelly; Robert H Weiss; Kyoungmi Kim
Journal:  Brief Bioinform       Date:  2017-03-01       Impact factor: 11.622

7.  A flexible, interpretable, and accurate approach for imputing the expression of unmeasured genes.

Authors:  Christopher A Mancuso; Jacob L Canfield; Deepak Singla; Arjun Krishnan
Journal:  Nucleic Acids Res       Date:  2020-12-02       Impact factor: 16.971

8.  Review, evaluation, and discussion of the challenges of missing value imputation for mass spectrometry-based label-free global proteomics. (Review)

Authors:  Bobbie-Jo M Webb-Robertson; Holli K Wiberg; Melissa M Matzke; Joseph N Brown; Jing Wang; Jason E McDermott; Richard D Smith; Karin D Rodland; Thomas O Metz; Joel G Pounds; Katrina M Waters
Journal:  J Proteome Res       Date:  2015-04-22       Impact factor: 4.466

9.  A study on the predictability of acute lymphoblastic leukaemia response to treatment using a hybrid oncosimulator.

Authors:  Eleftherios Ouzounoglou; Eleni Kolokotroni; Martin Stanulla; Georgios S Stamatakos
Journal:  Interface Focus       Date:  2017-12-15       Impact factor: 3.906

10.  Metabolomic profiling of the heart during acute ischemic preconditioning reveals a role for SIRT1 in rapid cardioprotective metabolic adaptation.

Authors:  Sergiy M Nadtochiy; William Urciuoli; Jimmy Zhang; Xenia Schafer; Joshua Munger; Paul S Brookes
Journal:  J Mol Cell Cardiol       Date:  2015-09-24       Impact factor: 5.000
