Zongji Wang1,2, Jinmin Lian2, Qiye Li3,4, Pei Zhang2, Yang Zhou2, Xiaoyu Zhan2,5, Guojie Zhang6,7.
Abstract
BACKGROUND: High-throughput sequencing (HTS) provides a powerful solution for the genome-wide identification of RNA-editing sites. However, it remains a great challenge to distinguish RNA-editing sites from genetic variants and technical artifacts caused by sequencing or read-mapping errors.
Keywords: Detection; Genome-wide; Identification; RES-Scanner; RNA editing; Software package
Year: 2016 PMID: 27538485 PMCID: PMC4989487 DOI: 10.1186/s13742-016-0143-4
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Fig. 1. Overview of the workflow of RES-Scanner. RES-Scanner employs a three-part framework to detect RNA-editing sites with matching DNA-seq and RNA-seq data: RNA/DNA-seq read mapping and filtering, homozygous genotype calling, and identification of RNA-editing candidates.
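The last two parts of this workflow (homozygous genotype calling from DNA-seq, then identification of RNA-editing candidates from RNA-seq) can be illustrated with a toy per-site sketch. This is not RES-Scanner's implementation: the function names, the simple majority-fraction genotype rule, and every threshold below are hypothetical choices made only to show the idea that a candidate is a site where the DNA is confidently homozygous but the RNA reads carry a different base.

```python
from collections import Counter

# Hypothetical thresholds for illustration only; not RES-Scanner defaults.
MIN_DNA_DEPTH = 10      # minimum DNA reads covering the site
MIN_DNA_PURITY = 0.95   # fraction of DNA reads that must agree
MIN_RNA_SUPPORT = 3     # minimum RNA reads showing the edited base


def call_homozygous(dna_bases):
    """Return the homozygous DNA base, or None if the site is not confidently homozygous."""
    if len(dna_bases) < MIN_DNA_DEPTH:
        return None
    base, count = Counter(dna_bases).most_common(1)[0]
    if count / len(dna_bases) < MIN_DNA_PURITY:
        return None  # likely heterozygous or noisy; cannot rule out a genetic variant
    return base


def editing_candidate(dna_bases, rna_bases):
    """Return (dna_base, rna_base, editing_level) for a candidate site, else None."""
    genotype = call_homozygous(dna_bases)
    if genotype is None or not rna_bases:
        return None
    mismatches = Counter(b for b in rna_bases if b != genotype)
    if not mismatches:
        return None
    alt_base, support = mismatches.most_common(1)[0]
    if support < MIN_RNA_SUPPORT:
        return None  # too few RNA reads; could be a sequencing error
    return genotype, alt_base, support / len(rna_bases)


# Example: DNA is homozygous A, 3 of 10 RNA reads show G (an A-to-G mismatch,
# which is how A-to-I editing appears in sequencing data).
site = editing_candidate(list("A" * 20), list("A" * 7 + "G" * 3))
```

A real pipeline would additionally apply the read-mapping and filtering steps named in the caption before any site reaches this stage.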
Performance of RES-Scanner compared with other methods applied to GM12878 human lymphoblastoid cell line data
| Method | All: total | All: % A-to-I | Alu: total | Alu: % A-to-I | Repetitive non-Alu: total | Repetitive non-Alu: % A-to-I | Nonrepetitive: total | Nonrepetitive: % A-to-I |
|---|---|---|---|---|---|---|---|---|
| Ramaswami et al. | 150,865 | 95.7 | 147,029 | 95.8 | 2,385 | 97.4 | 1,451 | 86.6 |
| REDItools | 222,288 | 91.3 | 221,401 | 91.2 | Not investigated | Not investigated | 887 | 92.2 |
| GIREMI | 37,591 | 98.6 | 36,131 | 99.0 | 267 | 83.7 | 1,193 | 82.8 |
| RES-Scanner (pre-aligned) | 151,952 | 96.3 | 147,542 | 96.4 | 3,247 | 97.0 | 1,163 | 87.5 |
| RES-Scanner (raw reads) | 153,848 | 95.8 | 149,710 | 95.9 | 2,794 | 97.8 | 1,344 | 81.3 |
Comparison of the cumulative CPU times (hours) for RES-Scanner, REDItools and GIREMI in processing the human GM12878 dataset from pre-aligned reads to final editing sites
| Chromosome | RES-Scanner | REDItools | GIREMI |
|---|---|---|---|
| chr1 | 39.71 | 118.98 | ND |
| chr2 | 44.51 | 128.62 | ND |
| chr3 | 37.95 | 106.15 | ND |
| chr4 | 29.83 | 81.61 | ND |
| chr5 | 31.16 | 88.89 | ND |
| chr6 | 28.99 | 85.49 | ND |
| chr7 | 25.60 | 82.45 | ND |
| chr8 | 25.38 | 64.02 | ND |
| chr9 | 20.23 | 63.79 | ND |
| chr10 | 23.01 | 71.45 | ND |
| chr11 | 20.80 | 68.59 | ND |
| chr12 | 23.33 | 75.51 | ND |
| chr13 | 14.65 | 40.37 | ND |
| chr14 | 16.10 | 51.55 | ND |
| chr15 | 14.82 | 48.73 | ND |
| chr16 | 15.64 | 48.15 | ND |
| chr17 | 14.54 | 54.50 | ND |
| chr18 | 13.03 | 35.29 | ND |
| chr19 | 13.07 | 31.63 | ND |
| chr20 | 12.19 | 31.04 | ND |
| chr21 | 7.59 | 15.66 | ND |
| chr22 | 7.67 | 21.62 | ND |
| chrX | 21.80 | 60.09 | ND |
| chrY | 1.62 | 1.94 | ND |
| Total | 503.23 | 1,476.12 | 10.87 |
Note: A total of 150 Gb of pre-aligned DNA reads and 9 Gb of pre-aligned RNA reads in BAM format were used as inputs for both RES-Scanner and REDItools, while 9 Gb of pre-aligned RNA reads and a list of SNVs derived from the RNA-seq data were used as inputs for GIREMI. The time for GIREMI included the cumulative CPU times of generating the SNV list from the RNA-seq data using SAMtools [40] (9.60 h) and running GIREMI (1.27 h). As GIREMI required all SNVs from the whole genome to construct the MI distribution, CPU times for individual chromosomes could not be determined. ND: not determined.
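The arithmetic behind the totals can be checked with a short script. The figures are copied from the table and note above; the variable names are illustrative, and the ratio is a derived quantity that the source does not itself report.

```python
# CPU times in hours, taken from the table and note above.
giremi_snv_calling_h = 9.60   # SAMtools SNV list generation
giremi_run_h = 1.27           # running GIREMI itself
giremi_total_h = round(giremi_snv_calling_h + giremi_run_h, 2)

res_scanner_total_h = 503.23
reditools_total_h = 1476.12

# Derived ratio (not stated in the source): how many times more cumulative
# CPU time REDItools used than RES-Scanner on the same pre-aligned inputs.
reditools_vs_res_scanner = reditools_total_h / res_scanner_total_h

print(f"GIREMI total: {giremi_total_h:.2f} h")
print(f"REDItools / RES-Scanner: {reditools_vs_res_scanner:.2f}x")
```

Note that the GIREMI figure is not directly comparable with the other two, since it was run on RNA-seq data only, without the 150 Gb of DNA reads.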