Adversarial deconfounding autoencoder for learning robust gene expression embeddings.

Ayse B Dincer, Joseph D Janizek, Su-In Lee.

Abstract

MOTIVATION: The increasing number of gene expression profiles has enabled the use of complex models, such as deep unsupervised neural networks, to extract a latent space from these profiles. However, expression profiles, especially when collected in large numbers, inherently contain variation introduced by technical artifacts (e.g. batch effects) and uninteresting biological variables (e.g. age) in addition to the true signals of interest. These sources of variation, called confounders, produce embeddings that fail to transfer to different domains, i.e. an embedding learned from one dataset with a specific confounder distribution does not generalize to different distributions. To remedy this problem, we attempt to disentangle confounders from true signals to generate biologically informative embeddings.
RESULTS: In this article, we introduce the Adversarial Deconfounding AutoEncoder (AD-AE) approach to deconfounding gene expression latent spaces. The AD-AE model consists of two neural networks: (i) an autoencoder to generate an embedding that can reconstruct original measurements, and (ii) an adversary trained to predict the confounder from that embedding. We jointly train the networks to generate embeddings that can encode as much information as possible without encoding any confounding signal. By applying AD-AE to two distinct gene expression datasets, we show that our model can (i) generate embeddings that do not encode confounder information, (ii) conserve the biological signals present in the original space and (iii) generalize successfully across different confounder domains. We demonstrate that AD-AE outperforms a standard autoencoder and other deconfounding approaches.
AVAILABILITY AND IMPLEMENTATION: Our code and data are available at https://gitlab.cs.washington.edu/abdincer/ad-ae.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
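
The two-network setup described under RESULTS can be illustrated with a minimal sketch of the two competing losses. This is not the authors' implementation: the function names, the use of mean squared error for both terms, and the trade-off weight `lam` are assumptions made for illustration only. The autoencoder is rewarded for reconstructing the input and for making the adversary's confounder prediction worse, while the adversary is rewarded only for predicting the confounder well.

```python
def mse(pred, target):
    """Mean squared error between two equally sized lists of floats."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def adversary_objective(c_true, c_pred):
    """Loss the adversary minimizes: its error in predicting the confounder
    from the embedding (MSE is an assumed choice, suited to a continuous
    confounder such as age)."""
    return mse(c_pred, c_true)


def autoencoder_objective(x, x_recon, c_true, c_pred, lam=1.0):
    """Loss the autoencoder minimizes: reconstruction error minus `lam`
    times the adversary's error, so the embedding is pushed to carry
    little confounder information. `lam` is a hypothetical weight."""
    return mse(x_recon, x) - lam * adversary_objective(c_true, c_pred)


# Toy values: a perfect reconstruction and an adversary that is off by 1.0.
x = [1.0, 2.0]
x_recon = [1.0, 2.0]
c_true = [0.5]
c_pred = [1.5]

adv_loss = adversary_objective(c_true, c_pred)
ae_loss = autoencoder_objective(x, x_recon, c_true, c_pred, lam=1.0)
```

In an actual training loop, the two objectives would typically be minimized in alternation, each with respect to its own network's parameters; that schedule is likewise an assumption here rather than a detail stated in the abstract.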

Year:  2020        PMID: 33381842      PMCID: PMC7773484          DOI: 10.1093/bioinformatics/btaa796

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937