Fernando Carazo1, Marian Gimeno1, Juan A Ferrer-Bonsoms1, Angel Rubio2.
Abstract
BACKGROUND: Splicing is a genetic process that has important implications in several diseases including cancer. Deciphering the complex rules of splicing regulation is crucial to understand and treat splicing-related diseases. Splicing factors and other RNA-binding proteins (RBPs) play a key role in the regulation of splicing. The specific binding sites of an RBP can be measured using CLIP experiments. However, to unveil which RBPs regulate a condition, it is necessary to have a priori hypotheses, as a single CLIP experiment targets a single protein.
Keywords: Alternative splicing; CLIP-seq; RNA-binding protein; RNA-seq; Splicing factor
Year: 2019 PMID: 31238884 PMCID: PMC6592009 DOI: 10.1186/s12864-019-5900-1
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Overview of the four databases integrated in this work. a Number of experiments per organism in each database. b Number of experiments grouped by technology. c Genome versions available in the databases. All the experiments in POSTAR2 correspond to human hg38. d Proportional Venn diagram, built using [25], of the RBPs covered by each database. The four databases cover 195 different RNA-binding proteins (RBPs). POSTAR2 forms the basis of the CLIP database (171 RBPs), covering all but 24 of the RBPs. However, the other databases contribute a significant share of the experiments (Additional file 1: Supplementary material S1)
Fig. 4 Overview of the pipeline to predict splicing factors using CLIP and splicing data. A toy example with a cassette exon is shown. 1) Selecting splicing regions: the cassette exon has two isoforms, which give rise to Path 1 (p1), Path 2 (p2) and the reference (pref). Splicing regions (typically 300–400 nt upstream and downstream of the AS events [26]) are represented in orange. 2) The ExS matrix (Events x Splicing factors) is built by mapping RBPs against the splicing regions. 3) The Percent Spliced-In (PSI) of all the events is estimated from RNA-seq or microarrays, and a Fisher's exact test enrichment is performed to get a ranking of RBPs according to their CLIP binding sites
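The enrichment step named in the Fig. 4 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the ExS matrix is held as a mapping from each RBP to the set of events it binds, that the test is one-sided (over-representation), and that the "expected" number of hits is the hypergeometric mean. All names (`fisher_enrichment_pvalue`, `rank_rbps`, `RBP_A`, `RBP_B`) and the toy data are hypothetical.

```python
from math import comb


def fisher_enrichment_pvalue(found, bound, changed, total):
    """One-sided Fisher's exact test (hypergeometric upper tail).

    found   -- differentially spliced events bound by the RBP
    bound   -- all events bound by the RBP
    changed -- all differentially spliced events
    total   -- all events considered
    """
    upper = min(bound, changed)
    tail = sum(
        comb(bound, k) * comb(total - bound, changed - k)
        for k in range(found, upper + 1)
    )
    return tail / comb(total, changed)


def rank_rbps(exs, all_events, changed_events):
    """Rank RBPs by enrichment of their binding among changed events.

    exs -- assumed representation of the ExS matrix: RBP name -> set of
           event ids with a CLIP site in the splicing regions.
    Returns (rbp, expected_hits, found_hits, p_value), smallest p first.
    """
    total, changed = len(all_events), len(changed_events)
    rows = []
    for rbp, bound_events in exs.items():
        bound = len(bound_events)
        found = len(bound_events & changed_events)
        expected = bound * changed / total  # hypergeometric mean (assumption)
        p = fisher_enrichment_pvalue(found, bound, changed, total)
        rows.append((rbp, expected, found, p))
    return sorted(rows, key=lambda row: row[3])


# Toy data: 10 events, 3 of them differentially spliced.
all_events = set(range(10))
changed_events = {0, 1, 2}
exs = {"RBP_A": {0, 1, 2, 3}, "RBP_B": {5, 6, 7, 8}}

ranking = rank_rbps(exs, all_events, changed_events)
for rbp, expected, found, p in ranking:
    print(f"{rbp}: expected={expected:.2f} found={found} p={p:.4f}")
```

In this toy case the hypothetical `RBP_A` binds all three changed events and ranks first, while `RBP_B` binds none of them and receives a p-value of 1.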
Fig. 2 CLIP similarities of RNA-binding proteins (RBPs). A line connecting two RBPs represents a Pearson correlation ≥ 0.46. The width of the line is proportional to the correlation value. RBPs are grouped in clusters. Links corresponding to RBPs in different clusters are represented as red lines
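A similarity network of the kind described in the Fig. 2 caption can be sketched by correlating binding profiles and keeping the pairs at or above the 0.46 cutoff. The sketch below assumes each RBP is represented by a binary vector over events (1 = binding site present); the caption does not state which profiles were correlated, so that representation, the function names and the toy data are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: treat the correlation as 0
    return cov / sqrt(var_x * var_y)


def similarity_edges(profiles, threshold=0.46):
    """Return (rbp_a, rbp_b, r) for every pair with r >= threshold.

    profiles -- assumed input: RBP name -> binary binding vector.
    The correlation r can serve as the edge width when drawing.
    """
    edges = []
    for a, b in combinations(sorted(profiles), 2):
        r = pearson(profiles[a], profiles[b])
        if r >= threshold:
            edges.append((a, b, r))
    return edges


# Toy profiles over six events.
profiles = {
    "RBP_A": [1, 1, 0, 0, 1, 0],
    "RBP_B": [1, 1, 0, 0, 1, 0],
    "RBP_C": [0, 0, 1, 1, 0, 1],
}
edges = similarity_edges(profiles)
print(edges)
```

Only the pair with identical profiles survives the cutoff here; the anti-correlated profile produces no edge.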
Ranking of RNA-binding proteins (RBPs) for the experiments KD-SRSF1, KD-FUS and KD-TARDBP (CLIP p-value < 0.05; limma p-value < 0.05; |log2 FC| > 0.58). Four groups of columns, separated by thick vertical black lines, are shown: i) the knocked-down (KD) gene and the RBP of the ranking; ii) the prediction using the pipeline presented in this work (CLIP experiments); iii) differential expression (knock-down vs normal); and iv) the same prediction using previous algorithms based on the RBPs' consensus binding motifs, represented as Position Weight Matrices (PWMs). NA: the PWM is not available for this RBP. N.S.: non-significant
| Experiment | RBP | Ranking by CLIP | Differentially spliced hits (Expected) | Differentially spliced hits (Found) | CLIP p-value | Expression fold change (log2) | limma adjusted p-value | Ranking by PWM | PWM p-value |
|---|---|---|---|---|---|---|---|---|---|
| KD-SRSF1 | | 20 | 561 | 747 | 3.75E-15 | 0.96 | 7.76E-14 | NA | NA |
| KD-SRSF1 | | 27 | 261 | 384 | 1.03E-13 | -1.52 | 7.61E-21 | NA | NA |
| KD-SRSF1 | | 37 | 266 | 324 | 7.03E-11 | -0.83 | 2.77E-22 | NA | NA |
| KD-SRSF1 | | 46 | 69 | 164 | 2.94E-08 | -0.63 | 1.49E-16 | 15 | 1.11E-03 |
| KD-SRSF1 | | 62 | 454 | 520 | 1.29E-05 | -0.66 | 2.69E-19 | NA | NA |
| KD-SRSF1 | | 115 | 83 | 149 | 4.75E-02 | -0.78 | 5.05E-17 | NA | NA |
| KD-FUS | | 25 | 163 | 219 | 1.78E-06 | 0.69 | 1.43E-01 | 1 | 9.54E-02 |
| KD-FUS | | 56 | 49 | 73 | 3.12E-04 | -0.81 | 3.10E-01 | NA | NA |
| KD-FUS | | 64 | 145 | 182 | 4.94E-04 | -0.79 | 2.08E-01 | NA | NA |
| KD-FUS | | 109 | 40 | 52 | 2.58E-02 | -0.70 | 9.91E-02 | NA | NA |
| KD-TARDBP | | 26 | 160 | 205 | 6.69E-05 | 0.64 | 1.93E-02 | NA | NA |
| KD-TARDBP | | 58 | 144 | 174 | 3.74E-03 | -0.60 | 4.57E-02 | NA | NA |
| KD-TARDBP | | 68 | 223 | 255 | 7.86E-03 | 0.64 | 7.83E-02 | NA | NA |
| KD-TARDBP | | 81 | 49 | 63 | 1.81E-02 | 0.89 | 1.14E-01 | NA | NA |
| KD-TARDBP | | 83 | 37 | 49 | 1.92E-02 | 0.68 | 1.69E-02 | NA | NA |
Fig. 3 Overview of the two main pipelines in this work: 1) integrating and mapping CLIP experiments to splicing regions and 2) predicting context-specific splicing factors. The tasks performed in both pipelines are represented by colors: identifying the transcriptome binding sites of RBPs using previous CLIP experiments (orange); constructing a matrix containing information on specific events for each splicing factor (blue); calculating alternative splicing events from RNA-seq or microarray data (green); and combining both results with a statistical pipeline to obtain a ranking of splicing factors (black)