ACTIVA: realistic single-cell RNA-seq generation with automatic cell-type identification using introspective variational autoencoders.

A Ali Heydari^1,2, Oscar A Davalos^3,2, Lihong Zhao^1, Katrina K Hoyer^4,2, Suzanne S Sindi^1,2.

Abstract

MOTIVATION: Single-cell RNA sequencing (scRNAseq) technologies allow gene expression to be measured at single-cell resolution. This gives researchers a substantial advantage in detecting heterogeneity, delineating cellular maps or identifying rare subpopulations. However, a critical complication remains: the number of single-cell observations is often low, limited by the rarity of subpopulations, tissue degradation or cost. This lack of sufficient data may make downstream analysis inaccurate or irreproducible. In this work, we present ACTIVA (Automated Cell-Type-informed Introspective Variational Autoencoder): a novel framework for generating realistic synthetic data using a single-stream adversarial variational autoencoder conditioned on cell-type information. Within a single framework, ACTIVA can enlarge existing datasets and generate specific subpopulations on demand, rather than requiring two separate models (such as scGAN and cscGAN). Data generation and augmentation with ACTIVA can enhance scRNAseq pipelines and analyses, such as benchmarking new algorithms, studying the accuracy of classifiers and detecting marker genes. ACTIVA will facilitate analysis of smaller datasets, potentially reducing the number of patients and animals necessary in initial studies.
RESULTS: We train and evaluate models on multiple public scRNAseq datasets. In comparison to GAN-based models (scGAN and cscGAN), we demonstrate that ACTIVA generates cells that are more realistic, are harder for classifiers to identify as synthetic, and have better pair-wise correlation between genes. Data augmentation with ACTIVA significantly improves classification of rare subtypes (more than 45% improvement compared to not augmenting and 4% better than cscGAN), all while reducing run-time by an order of magnitude in comparison to both models.
AVAILABILITY OF DATA AND CODE: The code and datasets are hosted on Zenodo (https://doi.org/10.5281/zenodo.5879639). Tutorials are available at https://github.com/SindiLab/ACTIVA.
SUPPLEMENTARY INFORMATION: Supplementary material is available online.

Year: 2022 PMID: 35179571 PMCID: PMC9004654 DOI: 10.1093/bioinformatics/btac095

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.931
