Giulio Ferrero1, Nicola Licheri1, Lucia Coscujuela Tarrero2,3, Carlo De Intinis1, Valentina Miano2,4, Raffaele Adolfo Calogero5, Francesca Cordero1, Michele De Bortoli2, Marco Beccuti1.
Abstract
Recent improvements in the cost-effectiveness of high-throughput technologies have allowed RNA sequencing of total transcriptomes, making it possible to evaluate the expression and regulation of circRNAs, a relatively novel class of transcript isoforms with suggested roles in transcriptional and post-transcriptional regulation of gene expression and with possible use as biomarkers, owing to their deregulation in various human diseases. A limited number of integrated workflows exist for the prediction, characterization, and differential expression analysis of circRNAs, and none of them complies with computational reproducibility requirements. We developed Docker4Circ for the complete analysis of circRNAs from RNA-Seq data. Docker4Circ runs a comprehensive analysis of circRNAs in human and model organisms, including: circRNA prediction; classification and annotation using six public databases; back-splice sequence reconstruction; analysis of internal alternative splicing of circularizing exons; alignment-free circRNA quantification from RNA-Seq reads; and differential expression analysis. Docker4Circ makes circRNA analysis easier and more accessible thanks to: (i) its R interface; (ii) the encapsulation of computational tasks into docker images; (iii) a user-friendly Java GUI; and (iv) no need for advanced bash scripting skills. Furthermore, Docker4Circ ensures a reproducible analysis, since all its tasks are embedded into a docker image following the guidelines provided by the Reproducible Bioinformatics Project.
Keywords: circRNA; docker images; pipeline; reproducible analysis
Year: 2019 PMID: 31906249 PMCID: PMC6982331 DOI: 10.3390/ijms21010293
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Schematic representation of the Docker4Circ modules with indication of all the functions (reported in bold in the hexagons) and the input/output files involved (reported in the squares). The different modules implemented in the framework are reported with different colors. BS = back-splicing.
Figure 2. (a) The Docker4Circ Graphical User Interface. Each module implemented in the framework is accessible using the panel on the left; the right panel reports the fields and parameters of each function. (b) Bar plot reporting the Docker4Circ classification of circRNAs identified in the analysis of RNA-Seq datasets from CRC cell lines. (c) Volcano plot reporting the -log10 p-value and the log2 expression fold change (red dashed lines) computed between Docker4Circ counts of BS-supporting reads from RNA-Seq datasets of NCM and CRC tissue samples.
Table reporting the number of RNA-Seq paired reads analyzed, the number of detected circRNAs, and the number of alternative splicing (AS) events predicted by CIRI-AS.
| Dataset ID | Reads | circRNAs | AS Events |
|---|---|---|---|
| NCM460_R1 | 66,144,999 | 14,003 | 1,482 |
| NCM460_R2 | 70,945,094 | 16,006 | 1,790 |
| NCM460_R3 | 73,804,226 | 12,413 | 1,078 |
| SW480_R1 | 88,915,933 | 8,627 | 532 |
| SW480_R2 | 97,303,573 | 5,688 | 335 |
| SW480_R3 | 66,144,999 | 7,154 | 470 |
| SW620_R1 | 91,406,400 | 10,216 | 790 |
| SW620_R2 | 67,013,355 | 4,624 | 214 |
| SW620_R3 | 69,789,394 | 6,541 | 332 |
| Average | 76,829,774.78 | 9,474.67 | 780.33 |
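The "Average" row can be checked directly from the nine per-sample rows. The snippet below is a minimal sketch, not part of Docker4Circ: the name `column_means` and the `ROWS` layout are illustrative assumptions, and the SW620_R1 circRNA count is read as 10,216, the value consistent with the reported average of 9,474.67.

```python
from statistics import mean

# Rows copied from the table: (dataset ID, paired reads, circRNAs, AS events).
# Assumption: the SW620_R1 circRNA count is 10,216.
ROWS = [
    ("NCM460_R1", 66_144_999, 14_003, 1_482),
    ("NCM460_R2", 70_945_094, 16_006, 1_790),
    ("NCM460_R3", 73_804_226, 12_413, 1_078),
    ("SW480_R1", 88_915_933, 8_627, 532),
    ("SW480_R2", 97_303_573, 5_688, 335),
    ("SW480_R3", 66_144_999, 7_154, 470),
    ("SW620_R1", 91_406_400, 10_216, 790),
    ("SW620_R2", 67_013_355, 4_624, 214),
    ("SW620_R3", 69_789_394, 6_541, 332),
]


def column_means(rows):
    """Return the mean of reads, circRNAs, and AS events, rounded to two decimals."""
    # Drop the dataset ID, then transpose the rows into three numeric columns.
    reads, circ_rnas, as_events = zip(*(row[1:] for row in rows))
    return tuple(round(mean(col), 2) for col in (reads, circ_rnas, as_events))


if __name__ == "__main__":
    for label, value in zip(("Reads", "circRNAs", "AS Events"), column_means(ROWS)):
        print(f"{label}: {value:,.2f}")
```

Run as a script, this prints one mean per column, which can be compared against the "Average" row.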