BACKGROUND: Single-cell sequencing enables us to better understand genetic diseases, such as cancer or autoimmune disorders, which are often affected by changes in rare cells. No existing software is aimed at identifying single nucleotide variations or micro (1-50 bp) insertions and deletions in single-cell RNA sequencing (scRNA-seq) data. Generating high-quality variant data is vital to the study of these diseases, among others. RESULTS: In this study, we report the design and implementation of Red Panda, a novel method to accurately identify variants in scRNA-seq data. Variants were called on scRNA-seq data from human articular chondrocytes, mouse embryonic fibroblasts (MEFs), and simulated data derived from the MEF alignments. Red Panda had the highest positive predictive value (PPV) at 45.0%, while the other tools (FreeBayes, GATK HaplotypeCaller, GATK UnifiedGenotyper, Monovar, and Platypus) ranged from 5.8% to 41.53%. On the simulated data, Red Panda had the highest sensitivity at 72.44%. CONCLUSIONS: We show that our method provides a novel and improved mechanism to identify variants in scRNA-seq data compared with existing software. However, methods for identifying genomic variants from scRNA-seq data can still be improved.
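The two metrics reported above follow the standard definitions: positive predictive value is the fraction of called variants that are true, and sensitivity is the fraction of true variants that were called. The sketch below only illustrates those definitions. The function names and the counts are hypothetical and are not taken from the study, and it does not reproduce how Red Panda or the benchmarking pipeline computes them.

```python
def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """PPV = TP / (TP + FP): share of called variants that are real."""
    called = true_positives + false_positives
    if called == 0:
        raise ValueError("no variants were called")
    return true_positives / called


def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Sensitivity = TP / (TP + FN): share of real variants that were called."""
    real = true_positives + false_negatives
    if real == 0:
        raise ValueError("no true variants in the reference set")
    return true_positives / real


# Hypothetical counts chosen only for illustration (not from the paper).
example_ppv = positive_predictive_value(true_positives=45, false_positives=55)
example_sensitivity = sensitivity(true_positives=45, false_negatives=15)
print(f"PPV: {example_ppv:.2%}, sensitivity: {example_sensitivity:.2%}")
```

A caller can score high on one metric and low on the other, which is why the abstract reports both.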
Keywords:
Heterozygous variant calling; Human articular chondrocytes; Red panda; Single cell sequencing; Variant calling using scRNAseq