A comparative study of evaluating missing value imputation methods in label-free proteomics.

Liang Jin, Yingtao Bi, Chenqi Hu, Jun Qu, Shichen Shen, Xue Wang, Yu Tian.

Abstract

Missing values (MVs) in label-free quantitative proteomics greatly reduce data completeness. Imputation is widely used to handle MVs, and choosing an appropriate method is critical for the accuracy and reliability of the imputed data. Here we present a comparative study that evaluates the performance of seven popular imputation methods on a large-scale benchmark dataset and an immune cell dataset. Simulated MVs were introduced into the complete portion of each dataset at different combinations of MV rate and missing-not-at-random (MNAR) rate. Normalized root mean square error (NRMSE) was used to evaluate the accuracy of protein abundances and intergroup protein ratios after imputation. The detection of true positives (TPs) and the false altered-protein discovery rate (FADR) between groups were also compared using the benchmark dataset. Furthermore, the accuracy of handling real MVs was assessed by comparing enriched pathways and signature genes of cell activation after imputing the immune cell dataset. We observed that imputation accuracy is affected primarily by the MNAR rate rather than the MV rate, and that downstream analysis can be strongly affected by the choice of imputation method. A random forest-based imputation method consistently outperformed the other popular methods, achieving the lowest NRMSE, a high number of TPs with an average FADR < 5%, and the best detection of relevant pathways and signature genes, highlighting it as the most suitable method for label-free proteomics.
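
The evaluation design described above (mask known values at a chosen MV rate and MNAR rate, impute, then score with NRMSE) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the function names `simulate_missing` and `nrmse` are hypothetical, MNAR is modeled simply as removing the lowest abundances, NRMSE is normalized by the variance of the true masked values (one common convention), and mean imputation stands in for the seven methods compared in the study.

```python
import math
import random
import statistics


def simulate_missing(values, mv_rate, mnar_rate, seed=0):
    """Return a copy of `values` with some entries set to None.

    Assumed scheme (not taken from the paper): the MNAR share of the
    missing entries is removed from the lowest abundances, and the
    remainder is removed completely at random.
    """
    rng = random.Random(seed)
    n_missing = round(len(values) * mv_rate)
    n_mnar = round(n_missing * mnar_rate)
    # MNAR part: the lowest-abundance entries go missing first.
    by_abundance = sorted(range(len(values)), key=lambda i: values[i])
    mnar_idx = set(by_abundance[:n_mnar])
    # Random part: sample uniformly from the entries still observed.
    remaining = [i for i in range(len(values)) if i not in mnar_idx]
    random_idx = set(rng.sample(remaining, n_missing - n_mnar))
    masked = mnar_idx | random_idx
    return [None if i in masked else v for i, v in enumerate(values)]


def nrmse(truth, imputed, observed):
    """NRMSE over the masked entries (those that are None in `observed`)."""
    idx = [i for i, v in enumerate(observed) if v is None]
    mse = statistics.fmean([(truth[i] - imputed[i]) ** 2 for i in idx])
    # Normalize by the variance of the true values at the masked positions.
    return math.sqrt(mse / statistics.pvariance([truth[i] for i in idx]))


# Toy log-abundances; mean imputation is only a placeholder method.
truth = [float(v) for v in range(1, 11)]
observed = simulate_missing(truth, mv_rate=0.4, mnar_rate=0.5)
fill = statistics.fmean([v for v in observed if v is not None])
imputed = [fill if v is None else v for v in observed]
print(nrmse(truth, imputed, observed))
```

Sweeping `mv_rate` and `mnar_rate` over a grid and repeating the scoring for each imputation method would mirror the comparison described in the abstract, with a lower NRMSE indicating imputed values closer to the masked truth.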

Year: 2021; PMID: 33469060; PMCID: PMC7815892; DOI: 10.1038/s41598-021-81279-4

Source DB: PubMed; Journal: Sci Rep; ISSN: 2045-2322; Impact factor: 4.379

