Examining the practical limits of batch effect-correction algorithms: When should you care about batch effects?

Longjian Zhou, Andrew Chi-Hau Sue, Wilson Wen Bin Goh.

Abstract

Batch effects are technical sources of variation and can confound analysis. While many performance ranking exercises have been conducted to establish the best batch effect-correction algorithm (BECA), we hold the viewpoint that the notion of best is context-dependent. Moreover, alternative questions beyond the simplistic notion of "best" are also interesting: are BECAs robust against various degrees of confounding and if so, what is the limit? Using two different methods for simulating class (phenotype) and batch effects and taking various representative datasets across both genomics (RNA-Seq) and proteomics platforms, we demonstrate that under situations where sample classes and batch factors are moderately confounded, most BECAs are remarkably robust and only weakly affected by upstream normalization procedures. This observation is consistently supported across the multitude of test datasets. BECAs do have limits: When sample classes and batch factors are strongly confounded, BECA performance declines, with variable performance in precision, recall and also batch correction. We also report that while conventional normalization methods have minimal impact on batch effect correction, they do not impair downstream statistical feature selection, and in strongly confounded scenarios, may even outperform BECAs. In other words, removing batch effects is no guarantee of optimal functional analysis. Overall, this study suggests that simplistic performance ranking exercises are quite trivial, and all BECAs are compromises in some context or another.
Copyright © 2019 Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Genetics Society of China. Published by Elsevier Ltd. All rights reserved.
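
The kind of experiment the abstract describes (simulate class and batch effects, vary how strongly they are confounded, then score feature selection by precision and recall) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the authors' protocol: the simulation parameters, the helper names (`simulate`, `center_by_batch`, `t_stat`, `evaluate`), the t-statistic threshold, and above all the use of per-batch mean centering as a simple stand-in for a BECA. It is not one of the two simulation methods or any of the algorithms evaluated in the paper.

```python
import math
import random
import statistics


def simulate(confounding, n_features=200, n_true=20, n_per_class=20,
             class_effect=2.0, batch_sd=1.5, seed=0):
    """Simulate a features x samples matrix (hypothetical parameters).

    confounding: fraction of each class placed in "its" batch
                 (0.5 = balanced design, 1.0 = class and batch fully confounded).
    The first n_true features carry a real class effect.
    """
    rng = random.Random(seed)
    n_major = round(confounding * n_per_class)
    classes = [0] * n_per_class + [1] * n_per_class
    # Class 0 sits mostly in batch 0, class 1 mostly in batch 1.
    batches = ([0] * n_major + [1] * (n_per_class - n_major)
               + [1] * n_major + [0] * (n_per_class - n_major))
    data = []
    for f in range(n_features):
        shift = rng.gauss(0.0, batch_sd)  # feature-specific shift added to batch 1
        effect = class_effect if f < n_true else 0.0
        data.append([rng.gauss(0.0, 1.0) + effect * c + shift * b
                     for c, b in zip(classes, batches)])
    return data, classes, batches


def center_by_batch(row, batches):
    """Stand-in correction: subtract each batch's mean from its samples."""
    means = {b: statistics.mean([v for v, vb in zip(row, batches) if vb == b])
             for b in set(batches)}
    return [v - means[b] for v, b in zip(row, batches)]


def t_stat(row, classes):
    """Welch-style t-statistic for class 1 versus class 0."""
    a = [v for v, c in zip(row, classes) if c == 0]
    b = [v for v, c in zip(row, classes) if c == 1]
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.mean(b) - statistics.mean(a)) / se


def evaluate(confounding, correct=True, threshold=3.5, n_true=20, seed=0):
    """Return (precision, recall) of threshold-based feature selection."""
    data, classes, batches = simulate(confounding, n_true=n_true, seed=seed)
    if correct:
        data = [center_by_batch(row, batches) for row in data]
    selected = {f for f, row in enumerate(data)
                if abs(t_stat(row, classes)) > threshold}
    true_pos = len([f for f in selected if f < n_true])
    # Precision is reported as 0.0 when nothing is selected (a convention).
    precision = true_pos / len(selected) if selected else 0.0
    recall = true_pos / n_true
    return precision, recall


if __name__ == "__main__":
    for p in (0.5, 0.75, 1.0):
        print(p, evaluate(p, correct=True), evaluate(p, correct=False))
```

In this toy setting the balanced design keeps the class signal intact after centering, whereas at full confounding each batch mean coincides with a class mean, so centering removes the class effect together with the batch effect. This is only a caricature of the limit the abstract refers to; real BECAs differ in how they behave as confounding increases.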

Keywords:  Batch effects; Bioinformatics; Feature selection; Normalization; Statistics

Year: 2019   PMID: 31611172   DOI: 10.1016/j.jgg.2019.08.002

Source DB: PubMed   Journal: J Genet Genomics   ISSN: 1673-8527   Impact factor: 4.275


  5 in total

1.  Simulating ComBat: how batch correction can lead to the systematic introduction of false positive results in DNA methylation microarray studies.

Authors:  Tristan Zindler; Helge Frieling; Alexandra Neyazi; Stefan Bleich; Eva Friedel
Journal:  BMC Bioinformatics       Date:  2020-06-30       Impact factor: 3.169

2.  The role of gene to gene interaction in the breast's genomic signature of pregnancy.

Authors:  Pedro J Gutiérrez-Díez; Javier Gomez-Pilar; Roberto Hornero; Julia Martínez-Rodríguez; Miguel A López-Marcos; Jose Russo
Journal:  Sci Rep       Date:  2021-01-29       Impact factor: 4.379

3.  Mathematical-based microbiome analytics for clinical translation. (Review)

Authors:  Jayanth Kumar Narayana; Micheál Mac Aogáin; Wilson Wen Bin Goh; Kelin Xia; Krasimira Tsaneva-Atanasova; Sanjay H Chotirmall
Journal:  Comput Struct Biotechnol J       Date:  2021-11-22       Impact factor: 7.271

4.  Doppelgänger spotting in biomedical gene expression data.

Authors:  Li Rong Wang; Xin Yun Choy; Wilson Wen Bin Goh
Journal:  iScience       Date:  2022-07-19

5.  Perspectives for better batch effect correction in mass-spectrometry-based proteomics. (Review)

Authors:  Ser-Xian Phua; Kai-Peng Lim; Wilson Wen-Bin Goh
Journal:  Comput Struct Biotechnol J       Date:  2022-08-12       Impact factor: 6.155
