Lisa Neums1,2, Seiji Suenaga1, Peter Beyerlein2, Sara Anders2, Devin Koestler3, Andrea Mariani4, Jeremy Chien5. 1. Department of Cancer Biology, University of Kansas Medical Center, 3901 Rainbow Blvd., Kansas City, KS 66160, USA. 2. Department of Bioinformatics and Biosystems Technology, University of Applied Sciences Wildau, Hochschulring 1, 15745 Wildau, Germany. 3. Department of Biostatistics, University of Kansas Medical Center, 3901 Rainbow Blvd., Kansas City, KS 66160, USA. 4. Obstetrics and Gynecology, Cancer Center, Mayo Clinic, 200 First St. SW, Rochester, MN 55905, USA. 5. Department of Internal Medicine, University of New Mexico Health Sciences Center, 2325 Camino de Salud NE, Albuquerque, NM 87131, USA.
Abstract
Background: Advances in next-generation DNA sequencing technologies now enable detailed characterization of sequence variations in cancer genomes. Whole-genome sequencing can discover variations in both coding and non-coding sequences, but its cost currently limits its general use in research. Whole-exome sequencing characterizes sequence variations in coding regions, but the cost of capture reagents and biases in capture rate limit its full use in research. Additional limitations include uncertainty in assigning functional significance to mutations observed in non-coding regions or in genes that are not expressed in cancer tissue.
Results: We investigated the feasibility of uncovering mutations in expressed genes from RNA sequencing datasets with a method called Variant Detection in RNA (VaDiR), which integrates three variant callers: SNPiR, RVBoost, and MuTect2. Variants called by the combination of all three methods, which we called Tier 1 variants, had the highest precision, with true positive mutations from RNA-seq that could be validated at the DNA level. We also found that integrating Tier 1 variants with those called by MuTect2 and SNPiR produced the highest recall with acceptable precision. Finally, we observed a higher rate of mutation discovery in genes that are expressed at higher levels.
Conclusions: VaDiR makes it possible to uncover mutations from RNA sequencing datasets that could be useful in further functional analysis. In addition, our approach allows orthogonal validation of DNA-based mutation discovery by providing complementary sequence variation analysis from paired RNA/DNA sequencing datasets.
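The tiering described in the Results can be illustrated with plain set operations. The following is a minimal sketch, not the VaDiR implementation: it assumes each caller's output has already been parsed into a set of variants, that "the combination of all 3 methods" means variants reported by all three callers, and that "those called by MuTect2 and SNPiR" means variants reported by both of those callers. The `Variant` type and all function names are hypothetical and exist only for this illustration.

```python
from typing import NamedTuple, Set, Tuple


class Variant(NamedTuple):
    """Hypothetical minimal variant key; real pipelines carry more fields."""
    chrom: str
    pos: int
    ref: str
    alt: str


def tier1(snpir: Set[Variant], rvboost: Set[Variant], mutect2: Set[Variant]) -> Set[Variant]:
    # Assumption: Tier 1 = variants reported by all three callers.
    return snpir & rvboost & mutect2


def tier2(snpir: Set[Variant], rvboost: Set[Variant], mutect2: Set[Variant]) -> Set[Variant]:
    # Assumption: variants reported by both MuTect2 and SNPiR but not already in Tier 1.
    return (mutect2 & snpir) - tier1(snpir, rvboost, mutect2)


def high_recall_set(snpir: Set[Variant], rvboost: Set[Variant], mutect2: Set[Variant]) -> Set[Variant]:
    # Tier 1 integrated with the MuTect2/SNPiR calls. Under the assumptions
    # above this is equivalent to mutect2 & snpir.
    return tier1(snpir, rvboost, mutect2) | tier2(snpir, rvboost, mutect2)


def precision_recall(called: Set[Variant], dna_validated: Set[Variant]) -> Tuple[float, float]:
    # True positives are RNA-seq calls that are also present in the DNA-level call set.
    true_pos = len(called & dna_validated)
    precision = true_pos / len(called) if called else 0.0
    recall = true_pos / len(dna_validated) if dna_validated else 0.0
    return precision, recall
```

Given a DNA-validated call set for the same sample, `precision_recall(tier1(...), dna_validated)` and `precision_recall(high_recall_set(...), dna_validated)` would give the kind of comparison the Results summarize, with the stricter intersection expected to trade recall for precision.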