Evaluating the impact of sequencing error correction for RNA-seq data with ERCC RNA spike-in controls.

Li Tong, Cheng Yang, Po-Yen Wu, May D Wang.

Abstract

Sequencing errors are a major issue for several next-generation sequencing-based applications, such as de novo assembly and single nucleotide polymorphism detection. Several error-correction methods have been developed to improve raw data quality. However, error-correction performance is hard to evaluate because a ground truth is lacking. In this study, we propose a novel approach that uses ERCC RNA spike-in controls as the ground truth to facilitate error-correction performance evaluation. After aligning raw and corrected RNA-seq data, we characterized read quality with three metrics: mismatch patterns (e.g., the substitution rate of A to C) of reads aligned with one mismatch, mismatch patterns of reads aligned with two mismatches, and the percentage increase in reads aligned to the reference. We observed that the mismatch patterns for reads aligned with one mismatch are significantly correlated between ERCC spike-ins and real RNA samples. Based on these observations, we conclude that ERCC spike-ins can serve as ground truth for error correction, beyond their previous applications in validating dynamic range and fold-change response. In addition, the mismatch patterns for ERCC reads aligned with one mismatch can serve as a novel and reliable metric for evaluating the performance of error-correction tools.
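
The metrics described in the abstract can be illustrated with a minimal Python sketch. It assumes that mismatches have already been extracted from the alignments as (reference base, read base) pairs, one pair per read aligned with exactly one mismatch. The function names, the toy inputs, the definition of a substitution rate as a fraction of all single mismatches, and the use of Pearson correlation are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter
from itertools import permutations
from math import sqrt

BASES = "ACGT"
# The 12 possible substitution types; ("A", "C") means reference A read as C.
SUBSTITUTIONS = list(permutations(BASES, 2))


def mismatch_pattern(mismatches):
    """Fraction of each substitution type among the given mismatches.

    Assumption: the 'substitution rate' is normalised by the total number
    of single-mismatch reads; the study may normalise differently.
    """
    counts = Counter(mismatches)
    total = sum(counts[s] for s in SUBSTITUTIONS)
    if total == 0:
        return {s: 0.0 for s in SUBSTITUTIONS}
    return {s: counts[s] / total for s in SUBSTITUTIONS}


def percent_increase_aligned(raw_aligned, corrected_aligned):
    """Percentage increase in aligned reads after error correction."""
    if raw_aligned <= 0:
        raise ValueError("raw_aligned must be positive")
    return 100.0 * (corrected_aligned - raw_aligned) / raw_aligned


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sqrt(var_x) * sqrt(var_y))


# Hypothetical toy data, not results from the study.
ercc_mismatches = [("A", "C"), ("A", "C"), ("G", "T"), ("T", "C")]
sample_mismatches = [("A", "C"), ("G", "T"), ("G", "T"), ("T", "C")]

ercc_pattern = mismatch_pattern(ercc_mismatches)
sample_pattern = mismatch_pattern(sample_mismatches)

# Compare the two 12-element patterns in a fixed substitution order.
correlation = pearson(
    [ercc_pattern[s] for s in SUBSTITUTIONS],
    [sample_pattern[s] for s in SUBSTITUTIONS],
)
gain = percent_increase_aligned(raw_aligned=1000, corrected_aligned=1100)
```

In this sketch, a high `correlation` would correspond to the reported agreement between ERCC and real-sample mismatch patterns, and `gain` corresponds to the third metric. Assessing whether a correlation is statistically significant would require an additional test that is not shown here.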

Year:  2016        PMID: 27532064      PMCID: PMC4983418          DOI: 10.1109/BHI.2016.7455838

Source DB:  PubMed          Journal:  IEEE EMBS Int Conf Biomed Health Inform


