An Enumerative Combinatorics Model for Fragmentation Patterns in RNA Sequencing Provides Insights into Nonuniformity of the Expected Fragment Starting-Point and Coverage Profile.

Celine Prakash, Arndt von Haeseler.

Abstract

RNA sequencing (RNA-seq) has emerged as the method of choice for measuring the expression of RNAs in a given cell population. In most RNA-seq technologies, sequencing the full length of RNA molecules requires fragmentation into smaller pieces. Unfortunately, the issue of nonuniform sequencing coverage across a genomic feature has been a concern in RNA-seq and is attributed to biases for certain fragments in RNA-seq library preparation and sequencing. To investigate the expected coverage obtained from fragmentation, we develop a simple fragmentation model that is independent of bias from the experimental method and is not specific to the transcript sequence. Essentially, we enumerate all configurations for maximal placement of a given fragment length, F, on transcript length, T, to represent every possible fragmentation pattern, from which we compute the expected coverage profile across a transcript. We extend this model to incorporate general empirical attributes such as read length, fragment length distribution, and number of molecules of the transcript. We further introduce the fragment starting-point, fragment coverage, and read coverage profiles. We find that the expected profiles are not uniform and that factors such as fragment length to transcript length ratio, read length to fragment length ratio, fragment length distribution, and number of molecules influence the variability of coverage across a transcript. Finally, we explore a potential application of the model where, with simulations, we show that it is possible to correctly estimate the transcript copy number for any transcript in the RNA-seq experiment.
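The enumeration idea in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the exact rules, so the following rests on assumptions: "maximal placement" is read as placing non-overlapping fragments of length F on a transcript of length T until no uncovered gap can hold another fragment, and every such configuration is weighted equally. The function names `maximal_placements` and `expected_profiles` are hypothetical and are not taken from the paper.

```python
from fractions import Fraction


def maximal_placements(T, F):
    """Yield tuples of 0-based fragment start positions.

    Assumption (not from the paper): a placement is "maximal" when the
    length-F fragments do not overlap and every uncovered gap, including
    the two transcript ends, is shorter than F.
    """
    if F < 1 or T < 0:
        raise ValueError("F must be >= 1 and T must be >= 0")

    def extend(pos, starts):
        # pos is the first position after the last placed fragment.
        # The next fragment may follow a gap of 0..F-1 uncovered bases.
        for gap in range(F):
            s = pos + gap
            if s + F <= T:
                yield from extend(s + F, starts + (s,))
        # The configuration is complete once the tail cannot hold a fragment.
        if T - pos < F:
            yield starts

    yield from extend(0, ())


def expected_profiles(T, F):
    """Average starting-point and coverage profiles over all configurations.

    Assumption: all maximal placements are equally likely.
    Returns (start_profile, coverage_profile, number_of_configurations).
    """
    configs = list(maximal_placements(T, F))
    n = len(configs)
    start_counts = [0] * T
    cover_counts = [0] * T
    for starts in configs:
        for s in starts:
            start_counts[s] += 1
            for p in range(s, s + F):
                cover_counts[p] += 1
    start_profile = [Fraction(c, n) for c in start_counts]
    cover_profile = [Fraction(c, n) for c in cover_counts]
    return start_profile, cover_profile, n


if __name__ == "__main__":
    starts, cover, n = expected_profiles(5, 2)
    print(n, [str(x) for x in starts], [str(x) for x in cover])
```

Under these assumptions, T = 5 and F = 2 give three configurations, with expected coverage 2/3, 1, 2/3, 1, 2/3 across the five positions, so even this toy case is not uniform. Exhaustive enumeration grows quickly with T, so the sketch is only practical for short transcripts.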

Keywords:  RNA sequencing; enumerative combinatorics; expected starting-point distribution; fragmentation model; unbiased coverage

Year:  2016        PMID: 27661099      PMCID: PMC5346924          DOI: 10.1089/cmb.2016.0096

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


