On the identifiability of the isoform deconvolution problem: application to select the proper fragment length in an RNAseq library.

Juan A Ferrer-Bonsoms¹, Xabier Morales¹, Pegah T Afshar², Wing H Wong², Angel Rubio¹.

Abstract

MOTIVATION: Isoform deconvolution is an NP-hard problem. The accuracy of the proposed solutions is far from perfect. At present, it is not known if gene structure and isoform concentration can be uniquely inferred given paired-end reads, and there is no objective method to select the fragment length to improve the number of identifiable genes. Different pieces of evidence suggest that the optimal fragment length is gene-dependent, stressing the need for a method that selects the fragment length according to a reasonable trade-off across all the genes in the whole genome.
RESULTS: A gene is considered identifiable if both the structure and the concentration of its transcripts can be determined uniquely. Here, we present a method to establish the identifiability of this deconvolution problem. Assuming a given transcriptome and coverage sufficient to interrogate all junction reads of the transcripts, the method states whether or not a gene is identifiable given the read length and the fragment length distribution. Applying this method to different read and fragment length combinations shows that the optimal average fragment length for the human transcriptome is around 400-600 nt for coding genes and 150-200 nt for long non-coding RNAs. The optimal read length is the largest one that fits in the fragment length. The potential benefit of combining several libraries to reconstruct the transcriptome is also discussed: combining two libraries of very different fragment lengths results in a significant improvement in gene identifiability.
AVAILABILITY: Code is available in GitHub (https://github.com/JFerrer-B/transcriptome-identifiability).
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
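
The dependence of identifiability on fragment length can be illustrated with a toy sketch. This is not the authors' method (their code is in the repository linked above). It is a simplified illustration that assumes the gene structure is already known and asks only whether the isoform concentrations are uniquely determined. The function names, the use of a window of `k` consecutive junctions as a stand-in for fragment length, and the five-exon example gene are all assumptions made for this illustration.

```python
from fractions import Fraction


def junction_windows(isoform, k):
    """Return the set of runs of k consecutive splice junctions in an isoform.

    An isoform is a tuple of exon indices; k is a hypothetical proxy for how
    many junctions a single fragment can span.
    """
    junctions = list(zip(isoform, isoform[1:]))
    return {tuple(junctions[i:i + k]) for i in range(len(junctions) - k + 1)}


def rank(matrix):
    """Exact matrix rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def concentrations_identifiable(isoforms, k):
    """Toy criterion: the feature-by-isoform matrix has full column rank."""
    windows = [junction_windows(iso, k) for iso in isoforms]
    features = sorted(set().union(*windows))
    matrix = [[1 if f in w else 0 for w in windows] for f in features]
    return rank(matrix) == len(isoforms)


# Hypothetical gene: exons 0..4, with exons 1 and 3 as independent cassette exons.
isoforms = [(0, 1, 2, 3, 4), (0, 2, 4), (0, 1, 2, 4), (0, 2, 3, 4)]

short_fragments = concentrations_identifiable(isoforms, k=1)  # single junctions only
long_fragments = concentrations_identifiable(isoforms, k=2)   # pairs of junctions
print(short_fragments, long_fragments)
```

In this toy gene, fragments that see only one junction at a time cannot tell the mixture of the first two isoforms from the mixture of the last two, because both mixtures produce the same junction counts. Fragments that span two consecutive junctions link the two splicing events and remove the ambiguity.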


Year: 2022 PMID: 34978563 PMCID: PMC8896638 DOI: 10.1093/bioinformatics/btab873

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
