CAFU: a Galaxy framework for exploring unmapped RNA-Seq data.

Siyuan Chen¹, Chengzhi Ren¹, Jingjing Zhai¹, Jiantao Yu², Xuyang Zhao², Zelong Li¹, Ting Zhang¹, Wenlong Ma¹, Zhaoxue Han¹, Chuang Ma¹.

Abstract

A widely used approach in transcriptome analysis is the alignment of short reads to a reference genome. However, owing to the deficiencies of specially designed analytical systems, short reads unmapped to the genome sequence are usually ignored, resulting in the loss of significant biological information and insights. To fill this gap, we present Comprehensive Assembly and Functional annotation of Unmapped RNA-Seq data (CAFU), a Galaxy-based framework that can facilitate the large-scale analysis of unmapped RNA sequencing (RNA-Seq) reads from single- and mixed-species samples. By taking advantage of machine learning techniques, CAFU addresses the issue of accurately identifying the species origin of transcripts assembled using unmapped reads from mixed-species samples. CAFU also represents an innovation in that it provides a comprehensive collection of functions required for transcript confidence evaluation, coding potential calculation, sequence and expression characterization and function annotation. These functions and their dependencies have been integrated into a Galaxy framework that provides access to CAFU via a user-friendly interface, dramatically simplifying complex exploration tasks involving unmapped RNA-Seq reads. CAFU has been validated with RNA-Seq data sets from wheat and Zea mays (maize) samples. CAFU is freely available via GitHub: https://github.com/cma2015/CAFU.

Keywords: Galaxy; RNA-Seq; machine learning; pipeline; unmapped reads; workflow

Year: 2020 PMID: 30815667 PMCID: PMC7299299 DOI: 10.1093/bib/bbz018

Source DB: PubMed Journal: Brief Bioinform ISSN: 1467-5463 Impact factor: 11.622
