
Transcriptome assembly and isoform expression level estimation from biased RNA-Seq reads.

Wei Li, Tao Jiang.

Abstract

MOTIVATION: RNA-Seq uses high-throughput sequencing technology to identify and quantify transcriptomes at an unprecedentedly high resolution and low cost. However, RNA-Seq reads are usually not uniformly distributed, and biases in RNA-Seq data pose great challenges in many applications, including transcriptome assembly and the expression level estimation of genes or isoforms. Much effort has been made in the literature to calibrate the expression level estimation from biased RNA-Seq data, but the effect of biases on transcriptome assembly remains largely unexplored.
RESULTS: Here, we propose a statistical framework for both transcriptome assembly and isoform expression level estimation from biased RNA-Seq data. Using a quasi-multinomial distribution model, our method is able to capture various types of RNA-Seq biases, including positional, sequencing and mappability biases. Our experimental results on simulated and real RNA-Seq datasets exhibit interesting effects of RNA-Seq biases on both transcriptome assembly and isoform expression level estimation. The advantage of our method is clearly shown in the experimental analysis by its high sensitivity and precision in transcriptome assembly and the high concordance of its estimated expression levels with quantitative reverse transcription-polymerase chain reaction data.
AVAILABILITY: CEM is freely available at http://www.cs.ucr.edu/~liw/cem.html.
CONTACT: liw@cs.ucr.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
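
To make the idea of bias-aware isoform expression estimation concrete, the following is a minimal sketch and not the CEM implementation. It swaps in a plain multinomial mixture fitted by expectation-maximization rather than the quasi-multinomial model named in the abstract. The function name `em_isoform_proportions` and the `weights` matrix are hypothetical: `weights[r, i]` is assumed to hold a bias-adjusted probability of observing read `r` given isoform `i` (zero when the read is incompatible with that isoform), and every read is assumed to be compatible with at least one isoform.

```python
import numpy as np

def em_isoform_proportions(weights, n_iter=200, tol=1e-10):
    """Estimate the fraction of reads drawn from each isoform by EM.

    weights[r, i]: hypothetical bias-adjusted probability of read r given
    isoform i (0 if incompatible). Each row must have a positive entry.
    """
    weights = np.asarray(weights, dtype=float)
    n_reads, n_iso = weights.shape
    theta = np.full(n_iso, 1.0 / n_iso)  # start from uniform proportions
    for _ in range(n_iter):
        # E-step: posterior probability that each read came from each isoform
        joint = weights * theta
        resp = joint / joint.sum(axis=1, keepdims=True)
        # M-step: proportions are the average responsibilities
        new_theta = resp.sum(axis=0) / n_reads
        converged = np.abs(new_theta - theta).max() < tol
        theta = new_theta
        if converged:
            break
    return theta

# Three reads compatible only with isoform 0, one only with isoform 1.
example = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
proportions = em_isoform_proportions(example)
```

In this sketch, positional, sequencing, or mappability effects would enter only through the entries of `weights`; how such weights are derived is outside what the abstract specifies.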

Year:  2012        PMID: 23060617      PMCID: PMC3496342          DOI: 10.1093/bioinformatics/bts559

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


