Preserving biological heterogeneity with a permuted surrogate variable analysis for genomics batch correction.


Hilary S Parker¹, Jeffrey T Leek¹, Alexander V Favorov², Michael Considine¹, Xiaoxin Xia¹, Sameer Chavan¹, Christine H Chung¹, Elana J Fertig¹.

Abstract

MOTIVATION: Sample source, procurement process and other technical variations introduce batch effects into genomics data. Algorithms that remove these artifacts enhance differences between known biological covariates, but they also risk removing intragroup biological heterogeneity and, with it, any personalized genomic signatures. As a result, accurately identifying novel subtypes from batch-corrected genomics data is challenging with standard algorithms designed to remove batch effects for class comparison analyses. Nor can batch effects be corrected reliably in future applications of genomics-based clinical tests, in which the biological groups are by definition unknown a priori.
RESULTS: Therefore, we assess the extent to which various batch correction algorithms remove true biological heterogeneity. We also introduce an algorithm, permuted-SVA (pSVA), which uses a new statistical model that is blind to biological covariates to correct for technical artifacts while retaining biological heterogeneity in genomic data. This algorithm facilitated accurate subtype identification in head and neck cancer from gene expression data in both formalin-fixed and frozen samples. When applied to predict human papillomavirus (HPV) status, pSVA improved cross-study validation even when the sample batches were highly confounded with HPV status in the training set.
AVAILABILITY AND IMPLEMENTATION: All analyses were performed using R version 2.15.0. The code and data used to generate the results of this manuscript are available from https://sourceforge.net/projects/psva.
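
ILLUSTRATION: The idea of correcting batch effects with a model that never sees the biological labels can be sketched with a toy example. The code below is not pSVA and is not taken from the authors' R code; it is a much simpler stand-in (per-batch mean centering) in Python with NumPy, run on simulated data. The function name `remove_batch_blind`, the simulated matrix and all parameter values are assumptions made for illustration only.

```python
import numpy as np


def remove_batch_blind(expr, batch):
    """Subtract each batch's per-gene mean, then restore the overall gene mean.

    expr  : genes x samples expression matrix
    batch : length-samples array of known batch labels
    Only the technical covariate (batch) is used; no biological labels.
    This is plain batch-mean centering, NOT the pSVA algorithm.
    """
    expr = np.asarray(expr, dtype=float)
    batch = np.asarray(batch)
    grand_mean = expr.mean(axis=1, keepdims=True)
    corrected = expr.copy()
    for b in np.unique(batch):
        cols = batch == b
        corrected[:, cols] -= expr[:, cols].mean(axis=1, keepdims=True)
    return corrected + grand_mean


# Simulated data (hypothetical): 50 genes, 2 batches of 20 samples each.
rng = np.random.default_rng(0)
n_genes, n_per_batch = 50, 20
batch = np.repeat([0, 1], n_per_batch)

# A hidden subtype, balanced within each batch, shifts the first 10 genes.
subtype = np.tile([0, 1], n_per_batch)
signal = np.zeros((n_genes, 2 * n_per_batch))
signal[:10, subtype == 1] = 3.0

# A gene-specific technical offset is added to every sample in batch 1.
batch_shift = np.outer(rng.normal(0.0, 2.0, n_genes), batch)
expr = signal + batch_shift + rng.normal(0.0, 0.5, signal.shape)

corrected = remove_batch_blind(expr, batch)

# Per-gene difference between batch means after correction (should vanish).
batch_gap = (corrected[:, batch == 1].mean(axis=1)
             - corrected[:, batch == 0].mean(axis=1))

# Average subtype difference over the 10 affected genes (should survive).
subtype_gap = (corrected[:10, subtype == 1].mean()
               - corrected[:10, subtype == 0].mean())
```

In this toy setting the subtype is balanced across batches, so centering removes the batch offset while the hidden subtype difference survives. The sketch does not handle confounding between batch and biology, which is the harder case the abstract reports for the HPV training set.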


Year: 2014; PMID: 24907368; PMCID: PMC4173013; DOI: 10.1093/bioinformatics/btu375

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937
