Hasan Awad Aljohi, Wanfei Liu, Qiang Lin, Jun Yu, Songnian Hu.
Abstract
BACKGROUND: Precise and efficient exon recognition and splicing by the spliceosome is key to generating mature mRNAs. About one third to one half of disease-related mutations affect RNA splicing. The software PVAAS was developed to identify variants associated with aberrant splicing directly from RNA-seq data. However, it is based on the assumption that annotated splice sites represent normal splicing, which is not actually the case.
Keywords: Association; DNA mutation; RNA editing; RNA-seq; Sequence variant; Splicing event
Year: 2017 PMID: 28659141 PMCID: PMC5490186 DOI: 10.1186/s12859-017-1732-7
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Schematic diagram of the ISVASE software. a Identify splicing variants in RNA-seq data. All splicing variants can be divided into four types according to the relationship between the target splicing variant (red) and other splicing variants (from left to right): (i) unique splicing variant; (ii) splicing variants with the same junction start; (iii) splicing variants with the same junction end; and (iv) splicing variants with the same junction start or end. b Identify sequence variants for each splicing variant and for all related splicing variants. To handle all splicing variant types, sequence variants are identified separately for the two parts of a splicing junction. In the left part, the orange, yellow and red junctions each have three related splicing variants (all three of these junctions), whereas the green and blue junctions each have only one (itself). Similarly, in the right part, the red, green and blue junctions each have three related splicing variants, whereas the orange and yellow junctions each have only one (itself). c Identify associations. This step includes three significance judgements: for sequence variants, for junction existence, and for the association between sequence variants and junctions. The example shows two junctions with the same junction end. For junction one (top), two sequence variants are identified (left G(ref)->C(alt) and right G(ref)->A(alt)). In the sequence variant significance judgement, the left variant is filtered out (p value = 1) while the right variant passes (p value = 0.0476). In the junction significance judgement and the association judgement, the p values of the top junction are 0.0128 and 0.0070, respectively (both significant). Dashed lines represent gaps in the alignment
The statistics of SVASE identification using PVAAS and ISVASE
| Data | PVAAS Total | PVAAS dbSNP | PVAAS RADAR | ISVASE(novel) Total | ISVASE(novel) dbSNP | ISVASE(novel) RADAR | ISVASE(all) Total | ISVASE(all) dbSNP | ISVASE(all) RADAR |
|---|---|---|---|---|---|---|---|---|---|
| PVAAS test data | 8 | 0 | 0 | 14 | 7 | 0 | 172 | 129 | 0 |
| Control1(SRR388226) | 61 | 12 | 0 | 134 | 54 | 1 | 2577 | 2138 | 3 |
| Control2(SRR388227) | 63 | 9 | 0 | 120 | 50 | 2 | 2557 | 2130 | 3 |
| Control(common) | 28 | 2 | 0 | 87 | 36 | 1 | 2105 | 1788 | 2 |
| Knockdown1(SRR388228) | 93 | 18 | 0 | 187 | 83 | 1 | 2710 | 2250 | 2 |
| Knockdown2(SRR388229) | 89 | 24 | 0 | 168 | 73 | 1 | 2760 | 2293 | 1 |
| Knockdown(common) | 31 | 8 | 0 | 119 | 55 | 1 | 2298 | 1951 | 1 |
The performance comparison between PVAAS and ISVASE
| Data | Method | Precision | Consistency |
|---|---|---|---|
| PVAAS test data | PVAAS | 0.00(0/8) | - |
| | ISVASE(novel) | 0.50(7/14) | - |
| | ISVASE(all) | 0.75(129/172) | - |
| Control1(SRR388226) | PVAAS | 0.20(12/61) | 0.46(28/61) |
| | ISVASE(novel) | 0.40(54/134) | 0.65(87/134) |
| | ISVASE(all) | 0.83(2138/2577) | 0.82(2105/2577) |
| Control2(SRR388227) | PVAAS | 0.14(9/63) | 0.44(28/63) |
| | ISVASE(novel) | 0.42(50/120) | 0.73(87/120) |
| | ISVASE(all) | 0.83(2130/2557) | 0.82(2105/2557) |
| Control(common) | PVAAS | 0.07(2/28) | - |
| | ISVASE(novel) | 0.41(36/87) | - |
| | ISVASE(all) | 0.85(1788/2105) | - |
| Knockdown1(SRR388228) | PVAAS | 0.19(18/93) | 0.33(31/93) |
| | ISVASE(novel) | 0.44(83/187) | 0.64(119/187) |
| | ISVASE(all) | 0.83(2250/2710) | 0.85(2298/2710) |
| Knockdown2(SRR388229) | PVAAS | 0.27(24/89) | 0.35(31/89) |
| | ISVASE(novel) | 0.43(73/168) | 0.71(119/168) |
| | ISVASE(all) | 0.83(2293/2760) | 0.83(2298/2760) |
| Knockdown(common) | PVAAS | 0.26(8/31) | - |
| | ISVASE(novel) | 0.46(55/119) | - |
| | ISVASE(all) | 0.85(1951/2298) | - |
Precision: known SVASE/total SVASE, where a known SVASE is one present in dbSNP. Consistency: common SVASE/total SVASE, where a common SVASE is one identified in both replicate samples.
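The two metrics defined in the footnote are plain ratios of counts. As a minimal sketch (the function names `precision` and `consistency` are illustrative and are not part of ISVASE or PVAAS), the PVAAS row for Control1 can be recomputed from the counts in the tables above:

```python
def precision(known: int, total: int) -> float:
    """Fraction of identified SVASEs that are already present in dbSNP."""
    if total <= 0:
        raise ValueError("total must be positive")
    return known / total


def consistency(common: int, total: int) -> float:
    """Fraction of identified SVASEs also found in the replicate sample."""
    if total <= 0:
        raise ValueError("total must be positive")
    return common / total


# Control1 (SRR388226), PVAAS: 61 SVASEs, 12 in dbSNP, 28 shared with Control2.
control1_precision = precision(12, 61)
control1_consistency = consistency(28, 61)
print(f"precision={control1_precision:.2f} consistency={control1_consistency:.2f}")
```

Consistency is only defined for individual replicates, which is why the "common" rows carry a dash in that column.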
The running time comparison between PVAAS and ISVASE
| Data | PVAAS | ISVASE(novel) | ISVASE (all) |
|---|---|---|---|
| PVAAS test data | 1h38m25s | 11m22s | 13m11s |
| Control1(SRR388226) | 12h5m22s | 2h27m31s | 2h52m33s |
| Control2(SRR388227) | 12h52m19s | 2h29m50s | 2h53m17s |
| Knockdown1(SRR388228) | 15h45m40s | 2h37m36s | 3h4m3s |
| Knockdown2(SRR388229) | 16h40m40s | 2h42m27s | 3h9m38s |
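To compare the running times as ratios rather than raw durations, the values can be converted to seconds. The sketch below assumes the `XhYmZs` format used in the table; `to_seconds` and `speedup` are hypothetical helper names introduced here for illustration only.

```python
import re

# Matches durations such as "12h5m22s" or "11m22s"; every unit is optional.
_DURATION = re.compile(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?")


def to_seconds(text: str) -> int:
    """Convert a duration string from the table to a number of seconds."""
    cleaned = text.strip()
    match = _DURATION.fullmatch(cleaned)
    if not cleaned or match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def speedup(baseline: str, candidate: str) -> float:
    """How many times faster the candidate run is than the baseline run."""
    return to_seconds(baseline) / to_seconds(candidate)


# Running times copied from the table: (PVAAS, ISVASE novel, ISVASE all).
RUNS = {
    "PVAAS test data": ("1h38m25s", "11m22s", "13m11s"),
    "Control1(SRR388226)": ("12h5m22s", "2h27m31s", "2h52m33s"),
    "Control2(SRR388227)": ("12h52m19s", "2h29m50s", "2h53m17s"),
    "Knockdown1(SRR388228)": ("15h45m40s", "2h37m36s", "3h4m3s"),
    "Knockdown2(SRR388229)": ("16h40m40s", "2h42m27s", "3h9m38s"),
}

for name, (pvaas, novel, all_mode) in RUNS.items():
    print(f"{name}: {speedup(pvaas, novel):.1f}x (novel), "
          f"{speedup(pvaas, all_mode):.1f}x (all)")
```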
Gene Ontology enrichment analysis for genes related to the 65 common SVASEs using PANTHER (redundant records filtered)
| GO function | Total gene | SVASE gene | Expected | Fold Enrichment | P value |
|---|---|---|---|---|---|
| GO biological process complete | |||||
| antigen processing and presentation of endogenous peptide antigen via MHC class I | 15 | 3 | 0.02 | >100 | 0.00541 |
| antigen processing and presentation of peptide antigen via MHC class I | 108 | 6 | 0.12 | 50.28 | 1.51E-05 |
| antigen processing and presentation of endogenous antigen | 19 | 3 | 0.02 | >100 | 0.011 |
| antigen processing and presentation of exogenous antigen | 181 | 6 | 0.2 | 30 | 0.000317 |
| response to type I interferon | 74 | 6 | 0.08 | 73.37 | 1.6E-06 |
| response to interferon-gamma | 151 | 6 | 0.17 | 35.96 | 0.000109 |
| GO molecular function complete | |||||
| antigen binding | 107 | 6 | 0.12 | 50.75 | 4.69E-06 |
| GO cellular component complete | |||||
| MHC protein complex | 30 | 6 | 0.03 | >100 | 1.14E-09 |
| membrane-bounded vesicle | 1169 | 9 | 1.29 | 6.97 | 0.00285 |
| vesicle membrane | 508 | 7 | 0.56 | 12.47 | 0.00116 |
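The fold enrichment values in this table are consistent with the observed gene count divided by the expected count. The sketch below assumes that relation; `fold_enrichment` is an illustrative helper, not a PANTHER function. Because the Expected column is rounded to two decimals, recomputed values only approximate the reported ones (for example, 6 / 0.12 gives 50 against a reported 50.28).

```python
def fold_enrichment(observed: int, expected: float) -> float:
    """Observed gene count divided by the count expected by chance."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return observed / expected


# (GO term, SVASE genes, expected, reported fold enrichment) from the table.
ROWS = [
    ("antigen binding", 6, 0.12, 50.75),
    ("response to type I interferon", 6, 0.08, 73.37),
    ("membrane-bounded vesicle", 9, 1.29, 6.97),
    ("vesicle membrane", 7, 0.56, 12.47),
]

for term, observed, expected, reported in ROWS:
    recomputed = fold_enrichment(observed, expected)
    print(f"{term}: recomputed {recomputed:.2f}, reported {reported}")
```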
Fig. 2 The characteristics of novel and all SVASE sites in sample SRR388226. The following plots are shown for SVASEs located in new splicing events (upper half) and in all splicing events (lower half): the density of junction read number; the bar plot of junction number for different junction splicing signals; the boxplot of junction read number distribution for different junction splicing signals; the density of splicing signal score for the variant-replaced sequence and the reference sequence; the histogram of distances between sequence variant and exon 5′ side; the histogram of distances between sequence variant and exon 3′ side; the boxplot of distance distribution between sequence variant type and junction breakpoint; and the bar plot of sequence variant number for different sequence variant types