Yidan Wang, Xuanping Zhang, Tao Wang, Jinchun Xing, Zhun Wu, Wei Li, Jiayin Wang
Abstract
BACKGROUND: Circular RNAs (circRNAs) are RNA molecules that lack poly(A) tails and form a closed-loop structure. Recent studies have emphasized that some circRNAs have functions different from those of canonical transcripts and are associated with complex diseases. Several computational methods have been developed for detecting circRNAs from RNA-seq data. However, the existing methods favor high-sensitivity strategies, which introduce many false positives. Thus, in clinical decision-support systems, a comprehensive filtering approach is needed to accurately recognize real circRNAs for decision models.
Keywords: Circular RNA; Detection method; High precision; Machine learning; RNA-seq data analysis
Year: 2020 PMID: 32646420 PMCID: PMC7346313 DOI: 10.1186/s12911-020-1117-0
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Fig. 1 The CIRCPlus2 pipeline for recognizing circRNAs from RNA-seq data
Fig. 2 Examples of different types of circRNAs and the related features. Split-read: a read that spans one or multiple covalent linkages. Discordant read pair: a read pair with both ends mapped, but whose locations are too far from or too close to each other compared with the library insert size. FSJ: forward-spliced junction read. BSJ: back-spliced junction read
List of features
| Category | Symbol | Meaning |
|---|---|---|
| Insert size of read pair | Concord | Concordant pair |
| Insert size of read pair | Discord | Discordant pair |
| BSJ reads | Mapping_quality | Mapping quality |
| BSJ reads | Support | Supporting read count |
| Breakpoint | l | Left breakpoint |
| Breakpoint | r | Right breakpoint |
| Mapping situation in breakpoint | SM | CIGAR value in the form of xS/HyM |
| Mapping situation in breakpoint | MS | CIGAR value in the form of xMyS/H |
| Mapping situation in breakpoint | SMS | CIGAR value in the form of xS/HyMzS/H |
| Splicing signal | GTAG | GT-AG signal |
| Depth | Depth | Average read depth |
| Depth | Cov | Average read base count |
| Region (circRNA length) | Up | Up of circRNA region |
| Region (circRNA length) | Down | Down of circRNA region |
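The three "mapping situation" features in the table describe CIGAR strings by shape: a clipped segment (S or H) followed by a match (SM), a match followed by a clipped segment (MS), or a match clipped on both sides (SMS). The following is a minimal sketch of how such a labelling could be done, not the CIRCPlus2 implementation. The function name `classify_cigar` and the decision to return `None` for any other CIGAR (for example one containing insertions, deletions, or skipped regions) are assumptions made for illustration.

```python
import re
from typing import Optional

# Hypothetical patterns for the three CIGAR shapes listed in the feature table.
# x, y, z in the table are read here as positive integer operation lengths.
_PATTERNS = (
    ("SMS", re.compile(r"^\d+[SH]\d+M\d+[SH]$")),  # xS/HyMzS/H
    ("SM", re.compile(r"^\d+[SH]\d+M$")),          # xS/HyM
    ("MS", re.compile(r"^\d+M\d+[SH]$")),          # xMyS/H
)


def classify_cigar(cigar: str) -> Optional[str]:
    """Return 'SM', 'MS' or 'SMS' if the CIGAR has that shape, else None."""
    for label, pattern in _PATTERNS:
        if pattern.match(cigar):
            return label
    # Assumption: anything else (e.g. '100M', or CIGARs with I/D/N) is unlabelled.
    return None


if __name__ == "__main__":
    for example in ("30S70M", "70M30H", "10S80M10S", "100M"):
        print(example, classify_cigar(example))
```

Because the patterns are anchored at both ends, a fully matched read such as `100M` is left unlabelled rather than being forced into one of the three categories.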
Fig. 3 (a) Sensitivity analyses under different linear transcript coverages (the read length was fixed to 100 bp). (b) Precision analyses under different linear transcript coverages. (c) F1-Score analyses under different linear transcript coverages
Fig. 4 (a) Sensitivity analyses under different read lengths. (b) Precision analyses under different read lengths. (c) F1-Score analyses under different read lengths
Fig. 5 (a) Sensitivity analyses under different read depths of linear transcripts (the read length was fixed to 100 bp). (b) Precision analyses under different read depths of linear transcripts. (c) F1-Score analyses under different read depths of linear transcripts
Fig. 6 (a) Sensitivity analyses under different read lengths. (b) Precision analyses under different read lengths. (c) F1-Score analyses under different read lengths
List of Three Confusion Matrices
| Testing group | TP | FN | FP | TN |
|---|---|---|---|---|
| 1 | 297 | 159 | 97 | 347 |
| 2 | 349 | 148 | 92 | 311 |
| 3 | 473 | 155 | 157 | 358 |
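The sensitivity, precision, and F1-Score reported in the figures follow directly from counts like those in the confusion matrices above. The sketch below applies the standard definitions to the three testing groups. The class name `Confusion` and the dictionary `GROUPS` are illustrative names, not part of CIRCPlus2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from one confusion matrix."""
    tp: int
    fn: int
    fp: int
    tn: int

    def sensitivity(self) -> float:
        # Fraction of real circRNAs that were recognized: TP / (TP + FN)
        return self.tp / (self.tp + self.fn)

    def precision(self) -> float:
        # Fraction of reported circRNAs that are real: TP / (TP + FP)
        return self.tp / (self.tp + self.fp)

    def f1(self) -> float:
        # Harmonic mean of precision and sensitivity: 2TP / (2TP + FP + FN)
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)


# Counts copied from the three testing groups in the table.
GROUPS = {
    1: Confusion(tp=297, fn=159, fp=97, tn=347),
    2: Confusion(tp=349, fn=148, fp=92, tn=311),
    3: Confusion(tp=473, fn=155, fp=157, tn=358),
}

if __name__ == "__main__":
    for group, cm in GROUPS.items():
        print(
            f"group {group}: sensitivity={cm.sensitivity():.3f} "
            f"precision={cm.precision():.3f} F1={cm.f1():.3f}"
        )
```

The F1-Score is written in its count form so that it does not need the intermediate precision and sensitivity values and avoids a second division.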
Fig. 7 F1-Score of CIRI2 and CIRCPlus2 on the HEK293 dataset