Woochang Hwang, Stefano Calza, Marco Silvestri, Yudi Pawitan, Youngjo Lee.
Abstract
Adenosine-to-Inosine (A-to-I) RNA editing is the most prevalent post-transcriptional modification of RNA molecules. Researchers have attempted to find reliable RNA editing using next generation sequencing (NGS) data. However, most of these attempts suffered from a high rate of false positives, and they did not consider the clinical relevance of the identified RNA editing, for example, in disease progression. We devised an effective RNA-editing discovery pipeline called CREDO, which includes novel statistical filtering modules based on integration of DNA- and RNA-seq data from matched tumor-normal tissues. CREDO was compared with three other RNA-editing discovery pipelines and found to give significantly fewer false positives. Application of CREDO to breast cancer data from the Cancer Genome Atlas (TCGA) project discovered highly confident RNA editing with clinical relevance to cancer progression in terms of patient survival. RNA-editing detection using DNA- and RNA-seq data from matched tumor-normal tissues should be more routinely performed as multiple omics data are becoming commonly available from each patient sample. We believe CREDO is an effective and reliable tool for this problem.
Year: 2019 PMID: 30911020 PMCID: PMC6433923 DOI: 10.1038/s41598-019-41294-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. (A) CREDO RNA-editing discovery pipeline. Novel statistical RNA-editing discovery modules are colored in dark blue. Scatter plots show A-to-G loci before and after application of the statistical discovery filter. (B) Number of editing sites discovered by CREDO. Zero-confidence score ≥ 10.0. (C) Number of editing sites identified in more than 4 individuals out of the loci in (B).
Figure 2. dbSNP overlap percentage of RNA-editing candidates discovered by CREDO and the other methods. The x-axis is the number of recurrent samples for the editing sites.
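As a rough illustration of the quantity plotted in Figure 2, the following sketch computes the percentage of candidate sites that coincide with known dbSNP positions at each recurrence level. It is not taken from CREDO: the function name, the input layout, and the reading of "number of recurrent samples" as "seen in at least k samples" are all assumptions made for this example.

```python
def dbsnp_overlap_by_recurrence(site_samples, dbsnp_sites):
    """Percent of candidate sites found in dbSNP, per recurrence threshold.

    site_samples: dict mapping (chromosome, position) -> set of sample IDs
                  in which the candidate site was called (hypothetical layout).
    dbsnp_sites:  set of (chromosome, position) tuples known to dbSNP.
    Returns a dict {k: percent}, where k means "called in at least k samples"
    (an assumed reading of the figure's x-axis).
    """
    max_k = max((len(samples) for samples in site_samples.values()), default=0)
    overlap = {}
    for k in range(1, max_k + 1):
        # Keep only sites that recur in at least k samples.
        recurrent = [site for site, samples in site_samples.items()
                     if len(samples) >= k]
        hits = sum(1 for site in recurrent if site in dbsnp_sites)
        overlap[k] = 100.0 * hits / len(recurrent)
    return overlap


# Toy input, invented purely for illustration.
toy_sites = {
    ("2", 100): {"s1", "s2", "s3"},
    ("2", 200): {"s1"},
    ("20", 300): {"s1", "s2"},
}
toy_dbsnp = {("2", 100), ("2", 200)}
toy_overlap = dbsnp_overlap_by_recurrence(toy_sites, toy_dbsnp)
```

A lower overlap percentage at a given recurrence level would suggest fewer candidates that are likely to be germline variants rather than editing events.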
Confident A-to-G loci (zero-confidence score ≥ 10.0, p-value < 0.05, OR ≥ 2.0 or OR ≤ 0.5) identified by CREDO that are also present in dbSNP. These 10 loci are the most recurrent among the 60 breast-cancer patients. Sample count is the number of individuals with the edited site. Gene is the symbol of the gene in which the site resides.
| Chromosome | Location | Sample Count | Gene | Region | Disease | Gene Description |
|---|---|---|---|---|---|---|
| 2 | 89246858 | 17 | IGKV1-5 | Exon | | Immunoglobulin Kappa Variable |
| 20 | 29628273 | 16 | FRG1B | Exon | Prostate cancer, Glioma | FSHD region gene 1 family, member B |
| 2 | 89567830 | 16 | IGKV1-33 | Exon | | Immunoglobulin Kappa Variable |
| 2 | 89246846 | 15 | IGKV1-5 | Exon | | Immunoglobulin Kappa Variable |
| 2 | 89442084 | 14 | IGKV3-20 | Exon | | Immunoglobulin Kappa Variable |
| 2 | 89384712 | 14 | IGKV3-15 | Exon | | Immunoglobulin Kappa Variable |
| 20 | 29623218 | 13 | FRG1B | Exon | Prostate cancer, Glioma | FSHD region gene 1 family, member B |
| 2 | 89326695 | 13 | IGKV3-11 | Exon | | Immunoglobulin Kappa Variable |
| 22 | 23243214 | 12 | IGLC2 | Exon | Breast cancer | Immunoglobulin Lambda Constant 2 |
| 2 | 89326707 | 12 | IGKV3-11 | Exon | | Immunoglobulin Kappa Variable |
Confident A-to-G loci (zero-confidence score ≥ 10.0, p-value < 0.05) identified by CREDO that are not present in dbSNP. These 10 loci are the most recurrent among the 60 breast-cancer patients. (A) OR ≥ 2.0, meaning a higher rate of editing in tumors. (B) OR ≤ 0.5, meaning a lower rate of editing in tumors. Sample count is the number of individuals with the edited site. Gene is the symbol of the gene in which the site resides.
| Chromosome | Location | Sample Count | Gene | Region | Disease | Gene Description |
|---|---|---|---|---|---|---|
| (A) OR ≥ 2.0 | | | | | | |
| 12 | 125396510 | 12 | UBC | Exon | Apocrine adenoma, | Ubiquitin C |
| 12 | 123253679 | 11 | DENR | 3′UTR | Pleomorphic adenoma carcinoma, Breast cancer | Density regulated re-Initiation and release factor |
| 1 | 160319987 | 9 | NCSTN | Exon | Tumor suppressor | Nicastrin |
| 12 | 123253657 | 8 | DENR | 3′UTR | Pleomorphic adenoma carcinoma, Breast cancer | Density regulated re-Initiation and release factor |
| 17 | 8280945 | 7 | RPL26 | Exon | Diamond-Blackfan Anemia, | Ribosomal Protein L26 |
| 7 | 74612697 | 7 | GTF2IP1 | Exon | Williams-Beuren Syndrome | General Transcription Factor 2I Pseudogene 1 |
| 6 | 29693034 | 7 | HLA-F | Exon | Autoimmune Disease, | Major Histocompatibility Complex |
| 19 | 18288551 | 6 | PIK3R2 | Exon | Malignant mixed tumor of corpus uteri, | Phosphoinositide-3-Kinase |
| 22 | 23040726 | 6 | IGLV2-23 | Exon | | Immunoglobulin Lambda Variable |
| 22 | 23101520 | 6 | IGLV2-14 | Exon | | Immunoglobulin Lambda Variable |
| (B) OR ≤ 0.5 | | | | | | |
| 13 | 46090371 | 25 | COG3 | Exon | Breast cancer | Component Of Oligomeric Golgi Complex 3 |
| 2 | 89567797 | 17 | IGKV1-33 | Exon | | Immunoglobulin Kappa Variable |
| 12 | 125396510 | 14 | UBC | Exon | Apocrine adenoma, Congenital granular cell tumor | Ubiquitin C |
| 2 | 89521317 | 14 | IGKV2-28 | Exon | | Immunoglobulin Kappa Variable |
| 2 | 89442169 | 13 | IGKV3-20 | Exon | | Immunoglobulin Kappa Variable |
| 14 | 106452683 | 11 | IGHV1-2 | Exon | | Immunoglobulin Heavy Variable |
| 14 | 106641723 | 9 | IGHV1-18 | Exon | | Immunoglobulin Heavy Variable |
| 14 | 106725213 | 9 | IGHV3-23 | Exon | | Immunoglobulin Heavy Variable |
| 2 | 89521251 | 9 | IGKV2-28 | Exon | | Immunoglobulin Kappa Variable |
| 2 | 89521275 | 9 | IGKV2-28 | Exon | | Immunoglobulin Kappa Variable |
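The thresholds quoted in the two table captions can be read as a simple per-locus filter. The sketch below is only an illustration of that reading: the record fields, the function names, and the way the odds ratio is formed from edited (G) and unedited (A) read counts in tumor versus normal tissue are assumptions, since the captions give the cut-offs but not how CREDO computes the underlying statistics.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    # All field names are hypothetical; the captions only name the quantities.
    chrom: str
    pos: int
    zero_confidence: float  # "zero-confidence score" from the captions
    p_value: float
    odds_ratio: float


def odds_ratio(tumor_g, tumor_a, normal_g, normal_a, pseudo=0.5):
    """Odds of an edited (G) read in tumor relative to normal.

    Assumed construction from a 2x2 read-count table; `pseudo` is a
    continuity correction that avoids division by zero.
    """
    return ((tumor_g + pseudo) * (normal_a + pseudo)) / (
        (tumor_a + pseudo) * (normal_g + pseudo)
    )


def classify(c: Candidate) -> Optional[str]:
    """Apply the caption thresholds; return None if the locus is filtered out."""
    if c.zero_confidence < 10.0 or c.p_value >= 0.05:
        return None
    if c.odds_ratio >= 2.0:
        return "higher_in_tumor"   # group (A) in the second table
    if c.odds_ratio <= 0.5:
        return "lower_in_tumor"    # group (B) in the second table
    return None


# Toy read counts, invented for illustration only.
toy_or = odds_ratio(tumor_g=30, tumor_a=70, normal_g=10, normal_a=90, pseudo=0.0)
toy_label = classify(Candidate("12", 1, zero_confidence=12.0,
                               p_value=0.01, odds_ratio=toy_or))
```

A locus with an odds ratio between 0.5 and 2.0 passes neither direction of the filter and is dropped, regardless of its score or p-value.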
Figure 3. (a) Kaplan-Meier curves of breast-cancer patient survival categorized according to the 4 top edited loci: Chr2:89544486 (IGKV2-30), Chr2:89185437 (IGKV4-1), Chr2:216236722 (FN1) and Chr19:20727605 (ZNF737). Because they are individually too infrequent, they are combined into a single edited group (n = 16) vs a non-edited group (n = 44). (b) Corresponding survival curves from FN1-mutated (n = 413) vs wild-type (n = 667) patients from the full TCGA breast-cancer data.
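For readers unfamiliar with the curves in Figure 3, the Kaplan-Meier (product-limit) estimator can be sketched in a few lines. This is a generic textbook implementation, not the authors' analysis code; the function name and the toy follow-up data are invented for illustration, and in practice one curve would be computed per group (for example, edited vs non-edited).

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up time for each patient.
    events: 1 if the patient died at that time, 0 if censored.
    Returns a list of (time, survival probability) at each time
    where at least one death occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= leaving
    return curve


# Toy data: four patients, the second one censored at time 2.
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Censored patients do not lower the curve themselves, but they shrink the risk set, so later deaths produce larger drops.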