Batch effect removal methods for microarray gene expression data integration: a survey.

Cosmin Lazar, Stijn Meganck, Jonatan Taminau, David Steenhoff, Alain Coletta, Colin Molter, David Y Weiss-Solís, Robin Duque, Hugues Bersini, Ann Nowé.

Abstract

Genomic data integration is a key step towards large-scale genomic data analysis. The process is challenging because genomics experiments produce information from diverse sources. In this work, we review methods designed to combine genomic data recorded from microarray gene expression (MAGE) experiments. So-called 'batch effects' are acknowledged as the main source of variation between different MAGE datasets. The methods reviewed here perform data integration by removing (or, more precisely, attempting to remove) the unwanted variation associated with batch effects. They are presented in a unified framework together with a wide range of evaluation tools, which are necessary for assessing the efficiency and quality of the data integration process. We provide a systematic description of the MAGE data integration methodology, together with some basic recommendations to help users choose appropriate tools for integrating MAGE data for large-scale analysis and evaluate those tools from different perspectives in order to quantify their efficiency. All genomic data used in this study for illustration purposes were retrieved from InSilicoDB (http://insilico.ulb.ac.be).
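To make the idea of "removing variation associated with batch effects" concrete, the following is a minimal sketch of one simple location adjustment: per-batch, per-gene mean-centering. The choice of technique, the function name `batch_mean_center`, and the toy values are assumptions made for illustration; nothing here is taken from the abstract, and it should not be read as the authors' recommended method.

```python
from collections import defaultdict
from statistics import mean


def batch_mean_center(expression, batches):
    """Subtract each batch's per-gene mean from the samples in that batch.

    expression: list of samples, each a list of per-gene expression values.
    batches: list of batch labels, one per sample.
    Returns a new list of centered samples in the original sample order.
    (Hypothetical helper, written for illustration only.)
    """
    if len(expression) != len(batches):
        raise ValueError("need exactly one batch label per sample")

    # Group sample indices by batch label.
    members = defaultdict(list)
    for idx, label in enumerate(batches):
        members[label].append(idx)

    centered = [None] * len(expression)
    for label, idxs in members.items():
        n_genes = len(expression[idxs[0]])
        # Per-gene mean computed within this batch only.
        gene_means = [mean(expression[i][g] for i in idxs) for g in range(n_genes)]
        for i in idxs:
            centered[i] = [expression[i][g] - gene_means[g] for g in range(n_genes)]
    return centered


# Toy data (made up): two genes, two batches, batch "B" shifted upwards.
toy = [[1.0, 2.0], [3.0, 4.0], [11.0, 12.0], [13.0, 14.0]]
labels = ["A", "A", "B", "B"]
adjusted = batch_mean_center(toy, labels)
```

After centering, every gene has mean zero within each batch, so a constant offset between batches disappears. The same operation also removes any genuine biological difference that happens to coincide with batch membership, which is one reason evaluation of the integrated data matters.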

Keywords:  Microarray gene expression data; batch effect removal; combining microarray datasets; data integration; large-scale genomic data analysis; microarray gene expression data merging

Year:  2012        PMID: 22851511     DOI: 10.1093/bib/bbs037

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


  113 in total

1.  A new statistic for identifying batch effects in high-throughput genomic data that uses guided principal component analysis.

Authors:  Sarah E Reese; Kellie J Archer; Terry M Therneau; Elizabeth J Atkinson; Celine M Vachon; Mariza de Andrade; Jean-Pierre A Kocher; Jeanette E Eckel-Passow
Journal:  Bioinformatics       Date:  2013-08-19       Impact factor: 6.937

2.  Important Issues in Planning a Proteomics Experiment: Statistical Considerations of Quantitative Proteomic Data.

Authors:  Karin Schork; Katharina Podwojski; Michael Turewicz; Christian Stephan; Martin Eisenacher
Journal:  Methods Mol Biol       Date:  2021

Review 3.  Row versus column correlations: avoiding the ecological fallacy in RNA/protein expression studies.

Authors:  Jonathon J O'Brien; Harsha P Gunawardena; Bahjat F Qaqish
Journal:  Brief Bioinform       Date:  2018-09-28       Impact factor: 11.622

4.  Long non-coding RNA transcriptome of uncharacterized samples can be accurately imputed using protein-coding genes.

Authors:  Aritro Nath; Paul Geeleher; R Stephanie Huang
Journal:  Brief Bioinform       Date:  2020-03-23       Impact factor: 11.622

5.  Data aggregation at the level of molecular pathways improves stability of experimental transcriptomic and proteomic data.

Authors:  Nicolas Borisov; Maria Suntsova; Maxim Sorokin; Andrew Garazha; Olga Kovalchuk; Alexander Aliper; Elena Ilnitskaya; Ksenia Lezhnina; Mikhail Korzinkin; Victor Tkachev; Vyacheslav Saenko; Yury Saenko; Dmitry G Sokov; Nurshat M Gaifullin; Kirill Kashintsev; Valery Shirokorad; Irina Shabalina; Alex Zhavoronkov; Bhubaneswar Mishra; Charles R Cantor; Anton Buzdin
Journal:  Cell Cycle       Date:  2017-08-21       Impact factor: 4.534

6.  Positional effects revealed in Illumina methylation array and the impact on analysis.

Authors:  Chuan Jiao; Chunling Zhang; Rujia Dai; Yan Xia; Kangli Wang; Gina Giase; Chao Chen; Chunyu Liu
Journal:  Epigenomics       Date:  2018-02-22       Impact factor: 4.778

7.  Detecting hidden batch factors through data-adaptive adjustment for biological effects.

Authors:  Haidong Yi; Ayush T Raman; Han Zhang; Genevera I Allen; Zhandong Liu
Journal:  Bioinformatics       Date:  2018-04-01       Impact factor: 6.937

8.  An expanded landscape of human long noncoding RNA.

Authors:  Shuai Jiang; Si-Jin Cheng; Li-Chen Ren; Qian Wang; Yu-Jian Kang; Yang Ding; Mei Hou; Xiao-Xu Yang; Yuan Lin; Nan Liang; Ge Gao
Journal:  Nucleic Acids Res       Date:  2019-09-05       Impact factor: 16.971

9.  Evidence for shared molecular pathways of dysregulated decidualization in preeclampsia and endometrial disorders revealed by microarray data integration.

Authors:  Maria Belen Rabaglino; Kirk P Conrad
Journal:  FASEB J       Date:  2019-08-07       Impact factor: 5.191

Review 10.  The 'omics' of adrenocortical tumours for personalized medicine.

Authors:  Guillaume Assié; Anne Jouinot; Jérôme Bertherat
Journal:  Nat Rev Endocrinol       Date:  2014-02-04       Impact factor: 43.330

