Julia Salzman, Hui Jiang, Wing Hung Wong.
Abstract
Recently, ultra high-throughput sequencing of RNA (RNA-Seq) has been developed as an approach for analysis of gene expression. By obtaining tens or even hundreds of millions of reads of transcribed sequences, an RNA-Seq experiment can offer a comprehensive survey of the population of genes (transcripts) in any sample of interest. This paper introduces a statistical model for estimating isoform abundance from RNA-Seq data that is flexible enough to accommodate both single end and paired end RNA-Seq data and sampling bias along the length of the transcript. Based on the derivation of minimal sufficient statistics for the model, a computationally feasible implementation of the maximum likelihood estimator of the model is provided. Further, it is shown that using paired end RNA-Seq provides more accurate isoform abundance estimates than single end sequencing at fixed sequencing depth. Simulation studies are also given.
Keywords: Fisher information; Isoform abundance estimation; Minimal sufficiency; Paired end RNA-Seq data analysis
Year: 2011 PMID: 24307754 PMCID: PMC3846358 DOI: 10.1214/10-STS343
Source DB: PubMed Journal: Stat Sci ISSN: 0883-4237 Impact factor: 2.901