Peisen Sun, Guanglin Li.
Abstract
Circular RNAs (circRNAs), which play vital roles in many regulatory pathways, are widespread across species. Although many circRNAs have been discovered in plants and animals, their functions have not been fully investigated. Beyond their role as microRNA (miRNA) decoys, the translation potential of circRNAs is important for understanding their functions, yet few tools are available to identify it. With the development of high-throughput sequencing and the emergence of ribosome profiling, it is now possible to identify the coding ability of circRNAs with high sensitivity. To evaluate this coding ability, we developed the CircCode tool and used it to investigate the translation potential of circRNAs from humans and Arabidopsis thaliana. Based on ribosome profiling data downloaded from NCBI, we found 3,610 and 1,569 translated circRNAs in humans and A. thaliana, respectively. Finally, we tested the performance of CircCode and found a low false discovery rate and high sensitivity for identifying circRNA coding ability. CircCode is a simple, command line-based framework written in Python 3. To investigate the translation potential of circRNAs, the user fills in the provided configuration file and runs the Python 3 scripts. The tool is freely available at https://github.com/PSSUN/CircCode.
Keywords: bioinformatics; circular RNAs; classification; coding potential; ribosome profiling data; translation
Year: 2019 PMID: 31649739 PMCID: PMC6795751 DOI: 10.3389/fgene.2019.00981
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Figure 1. The workflow of CircCode. The top layer shows the input files required for each step of CircCode. The middle layer is divided into three parts, each representing a different stage of operation. From left to right, the first part is the filtering of the Ribo-seq data: quality control is performed by Trimmomatic, and rRNA reads are removed by Bowtie. The second part covers the steps used to produce the virtual genome and align the filtered reads to it with STAR. The last part is the identification of translated circRNAs by machine learning. The bottom layer shows the final step, which predicts the peptides translated from the circRNAs, and the final output, which includes information on the translated circRNAs and their translation products.
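The alignment stage described in the Figure 1 caption relies on Ribo-seq reads that cross a circRNA's back-splice junction. The following is a minimal sketch of that idea only, not CircCode's actual code: the function names `junction_window` and `count_junction_reads`, the `flank` and `min_overhang` parameters, and the way the junction sequence is built are all assumptions made for illustration.

```python
from typing import Iterable, Tuple


def junction_window(circ_seq: str, flank: int) -> str:
    """Return a linear sequence spanning the back-splice junction.

    Hypothetical helper: the last `flank` bases of the circRNA are joined
    to its first `flank` bases, so the junction sits at index `flank`.
    """
    if flank <= 0 or flank > len(circ_seq):
        raise ValueError("flank must be between 1 and len(circ_seq)")
    return circ_seq[-flank:] + circ_seq[:flank]


def count_junction_reads(
    read_spans: Iterable[Tuple[int, int]],
    junction_pos: int,
    min_overhang: int = 3,
) -> int:
    """Count reads that cover the junction with enough bases on both sides.

    `read_spans` holds (start, end) pairs, 0-based and half-open, in the
    coordinates of the junction window. `min_overhang` is an assumed
    threshold, not a value taken from the paper.
    """
    return sum(
        1
        for start, end in read_spans
        if junction_pos - start >= min_overhang
        and end - junction_pos >= min_overhang
    )


if __name__ == "__main__":
    window = junction_window("AAACCCGGGTTT", flank=3)
    print(window)  # junction lies between index 2 and index 3
    spans = [(0, 6), (2, 4), (1, 3), (0, 3)]
    print(count_junction_reads(spans, junction_pos=3, min_overhang=2))
```

A count of this kind corresponds to the junction read number (JRN) named in the Figure 2 caption; in a real pipeline the spans would come from an aligner's output rather than from a hand-written list.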
Figure 2. (A) Effect of Ribo-seq sequencing depth on the predicted number of translated circRNAs. (B) Effect of junction read number (JRN) on CircCode sensitivity at different sequencing depths.
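The sensitivity and false discovery rate mentioned in the abstract and in the Figure 2 caption follow the standard definitions, sketched below. The evaluation procedure the authors used is not given in this record, so the function names and the example counts here are purely illustrative assumptions.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of truly translated circRNAs that were recovered: TP / (TP + FN)."""
    total = true_pos + false_neg
    if total == 0:
        raise ValueError("no positive cases to evaluate")
    return true_pos / total


def false_discovery_rate(true_pos: int, false_pos: int) -> float:
    """Fraction of predicted translated circRNAs that are wrong: FP / (FP + TP)."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("no predictions to evaluate")
    return false_pos / total


if __name__ == "__main__":
    # Made-up counts for illustration only; they are not results from the paper.
    print(sensitivity(true_pos=90, false_neg=10))
    print(false_discovery_rate(true_pos=90, false_pos=10))
```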