Domenica Marchese, Natalia Sanchez de Groot, Nieves Lorenzo Gotor, Carmen Maria Livi, Gian G Tartaglia.
Abstract
From transcription, to transport, storage, and translation, RNA depends on association with different RNA-binding proteins (RBPs). Methods based on next-generation sequencing and protein mass-spectrometry have started to unveil genome-wide interactions of RBPs but many aspects still remain out of sight. How many of the binding sites identified in high-throughput screenings are functional? A number of computational methods have been developed to analyze experimental data and to obtain insights into the specificity of protein-RNA interactions. How can theoretical models be exploited to identify RBPs? In addition to oligomeric complexes, protein and RNA molecules can associate into granular assemblies whose physical properties are still poorly understood. What protein features promote granule formation and what effects do these assemblies have on cell function? Here, we describe the newest in silico, in vitro, and in vivo advances in the field of protein-RNA interactions. We also present the challenges that experimental and computational approaches will have to face in future studies. WIREs RNA 2016, 7:793-810. doi: 10.1002/wrna.1378 For further resources related to this article, please visit the WIREs website.
Year: 2016 PMID: 27503141 PMCID: PMC5113702 DOI: 10.1002/wrna.1378
Source DB: PubMed Journal: Wiley Interdiscip Rev RNA ISSN: 1757-7004 Impact factor: 9.957
Figure 1. RNA-binding proteins (RBPs) and RNA life. RNA birth is regulated by RBPs (1) that are responsible for maturation (2) and modification (3). RBPs protect (4) and transport (5) RNA around the cell to specific sites (6). Interactions are regulated through a diverse set of binding sites that allows the formation of dynamic complexes sustained by reversible contacts and involving multiple partners (7, 8). When RNA is not required, it can be stored for future needs (7) or degraded (8). The last process of the RNA life cycle is the release of nucleotides that will be employed to build new RNA molecules. When RBPs are impaired (e.g., protein mutation, concentration deregulation, etc.), half-life, arrangement and location of RNA are affected (9).
List of Experimental Methods for the Identification of Protein–RNA Interactions
| Method | Advantages | Challenges | Disease-Related RBP/RNA | References |
|---|---|---|---|---|
| RIP | Genome-wide | High background noise | ELAVL1 (epilepsy, cancer) | Tenenbaum et al. |
| HiTS-CLIP | Genome-wide | False negative due to low cross-linking efficiency | CELF4 (epilepsy, hyperactivity) | Licatalosi et al. |
| RNA-compete | Large-scale | Nonphysiological conditions | ELAVL4 | Ray et al. |
| TRAP/RAT | Relatively easy and flexible | High background noise | | Hogg and Collins |
| Protein microarray | Large-scale | Possible artifacts due to nonphysiological conditions (protein folding, accessibility, post-translational modifications, etc.) | | Scherrer et al. |
| MS2-BioTRAP | Fast and easy set-up | Challenges associated with cell transfection | | Tsai et al. |
| ChIRP | Study of protein–RNA interactions under physiological conditions | Time and cost consuming | MALAT1 (cancer) | Chu et al. |
| Interactome capture | Study of protein–RNA interactions under physiological conditions | Possible artifacts (positive and false negative) due to cross-linking | | Castello et al. |
RIP, RNA immunoprecipitation; HiTS-CLIP, high-throughput sequencing of RNA isolated by cross-linking immunoprecipitation; PAR-CLIP, photoactivable ribonucleoside enhanced CLIP; iCLIP, individual-nucleotide resolution cross-linking and immunoprecipitation; SEQRS, in vitro selection high-throughput sequencing of RNA and sequence specificity landscape; RBNS, RNA Bind-n-Seq; RNA-Map, RNA on a massively parallel array; HiTS-RAP, high-throughput sequencing RNA affinity profiling; RNA-MITOMI, RNA mechanically induced trapping of molecular interactions; TRAP, tandem RNA affinity purification; RAT, RNA affinity in tandem; RaPID, RNA-binding protein purification and identification; MS2-BioTRAP, MS2 in vivo biotin tagged RNA affinity purification; ChIRP, chromatin isolation by RNA purification; CHART, capture hybridization analysis of RNA targets; RAP-MS, RNA antisense purification-mass spectrometry.
Figure 2. Schematic representation of experimental methods for the identification of protein–RNA interactions. Protein-centric methods. (a) In vivo approaches include native purification protocols (RNA immunoprecipitation, RIP) and denaturing protocols (cross-linking and immunoprecipitation, CLIP). In the first case, RNAs bound to a specific protein are immunoprecipitated from cell lysate under native conditions using a protein-specific antibody; after washing and protein removal by proteinase K treatment, the RNAs are reverse transcribed and identified by RNA sequencing.48 In the second case, cells are UV cross-linked to ‘freeze’ protein–RNA complexes. RNA is digested into fragments of a defined size, and the resulting complexes are immunoprecipitated and resolved by SDS-PAGE. After isolation from the membrane and proteinase K digestion, the RNA fragments are reverse transcribed and sequenced.49 Simulated enrichment values of a target and a nontarget RNA are shown in red and yellow, respectively (lower part of the panel). (b) As an example of in vitro approaches, the workflow of RNA-compete is shown.35 RNA libraries are generated by in vitro transcription. Transcripts are incubated with the protein of interest immobilized on an affinity matrix (e.g., streptavidin-biotin tag system), and the bound fragments are then fluorescently labeled and detected by hybridization on a microarray platform. RNA-centric methods. (c) In vivo approaches for the identification of proteins bound to an RNA of interest often derive from methods used to identify genomic DNA loci targeted by noncoding RNAs. Cells are cross-linked and lysed. Chromatin is sheared, and protein–RNA–DNA complexes are pulled down using biotinylated oligos complementary to the sequence of the RNA. After RNA digestion, proteins can be identified by western blot or mass spectrometry analysis.46 (d) In vitro approaches commonly use RNA tags to immobilize the RNA of interest on an affinity matrix. Upon incubation with cell lysate, proteins bind to the immobilized RNA. After washes, the protein–RNA complexes are eluted from the matrix and the proteins are characterized by western blot or mass spectrometry.50
List of Computational Methods for the Identification of Protein–RNA Interactions
| Prediction | Examples | Advantages | Disadvantages | References |
|---|---|---|---|---|
| Binding motif (RNA) | MEME | | High-throughput data are required as input | Bailey et al. |
| | SeAMotE | | Sequence complexity is a limitation | Agostini et al. |
| Binding residue | Pprint | Evolutionary information | RNA-binding domains cannot be identified | Kumar et al. |
| | BindN+ | | | Wang et al. |
| | RNAbindR+ | | | Walia et al. |
| Domain (protein) | HMMER | Domain recognition | Annotation of RNA-binding domains is required | Finn et al. |
| | | Annotation of RNA-binding domains is not required | Single amino acid resolution has not been implemented | Livi et al. |
| RNA–protein interaction | | Runs on high-throughput data | RNA < 1200 nt; protein < 750 aa | Bellucci et al. |
| | RPISeq | High sensitivity | Low specificity; max 100 sequences per run | Muppirala et al. |
Performances on Detecting RNA‐Binding Regions
| Method | ACC (a) | sens (b) | spec (c) | prec (d) |
|---|---|---|---|---|
| | 0.67 | 0.76 | 0.60 | 0.65 |
| | 0.38 | 0.37 | 0.39 | 0.38 |
| | 0.47 | 0.49 | 0.45 | 0.49 |
| | 0.48 | 0.53 | 0.42 | 0.48 |
We analyzed 102 proteins containing nonclassical RNA-binding domains and 102 proteins without annotated RNA-binding domains with catRAPID signature and three other algorithms.27, 68, 83, 100 Performance was measured as (a) accuracy, (b) sensitivity, (c) specificity, and (d) precision.
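The four measures reported in this table can all be derived from a confusion matrix. The sketch below is a minimal illustration, not the authors' evaluation code. The `Confusion` class, the `metrics` function, and the counts are hypothetical and chosen only to mimic a balanced benchmark of 102 positive and 102 negative proteins.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Confusion matrix for a binary RNA-binding classifier (hypothetical helper)."""
    tp: int  # RNA-binding proteins predicted as binding
    fn: int  # RNA-binding proteins predicted as non-binding
    tn: int  # non-binding proteins predicted as non-binding
    fp: int  # non-binding proteins predicted as binding


def metrics(c: Confusion) -> dict:
    """Return accuracy, sensitivity, specificity, and precision."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "precision": c.tp / (c.tp + c.fp),    # positive predictive value
    }


# Illustrative counts only (assumption): 102 positives and 102 negatives.
example = Confusion(tp=80, fn=22, tn=60, fp=42)
result = metrics(example)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

On a balanced set such as this one, accuracy equals the mean of sensitivity and specificity, which gives a quick consistency check on reported values.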
Figure 3. Methods to predict RNA-binding sites. We calculated the performance of BindN,93 BindN+,78 Pprint,79 RNAproB,101 RNABindR+,80 and catRAPID signature83 on a set of proteins whose RNA-binding sites have been validated by X-ray and NMR techniques. As in a recent work,102 three protein classes (‘fold,’ ‘family,’ and ‘superfamily’) were retrieved from SCOP.102 Performance was estimated as (sensitivity + specificity)/2 on 90 folds, 126 families, and 100 superfamilies (details at http://service.tartaglialab.com/static_files/shared/documentation_signature.html).
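The score (sensitivity + specificity)/2 can be computed per protein from residue-level annotations. The following sketch assumes a hypothetical encoding in which each residue is marked `1` (RNA-binding) or `0` (non-binding). The function name and the example strings are illustrative and are not taken from the benchmark.

```python
def balanced_score(true_sites: str, predicted_sites: str) -> float:
    """Compute (sensitivity + specificity) / 2 from per-residue labels.

    Both arguments are strings of '1' (binding) and '0' (non-binding),
    one character per residue (an assumed encoding).
    """
    if len(true_sites) != len(predicted_sites):
        raise ValueError("annotation and prediction must have equal length")
    pairs = list(zip(true_sites, predicted_sites))
    tp = sum(t == "1" and p == "1" for t, p in pairs)
    fn = sum(t == "1" and p == "0" for t, p in pairs)
    tn = sum(t == "0" and p == "0" for t, p in pairs)
    fp = sum(t == "0" and p == "1" for t, p in pairs)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one binding and one non-binding residue")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy 10-residue example (made-up labels).
annotated = "0011100100"
predicted = "0011000110"
score = balanced_score(annotated, predicted)
print(f"{score:.3f}")
```

Averaging the two rates keeps the score informative when binding residues are much rarer than non-binding ones, which is the usual situation for protein sequences.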
Figure 4. Features of yeast proteins forming ribonucleoprotein assemblies. Using physicochemical properties, the multicleverMachine approach122, 123 discriminates P-body and stress granule proteins from globular proteins. The datasets employed in this analysis comprise 52 P-body and 62 stress granule proteins, previously reported in experimental works,124 as well as five random sets of 100 globular proteins from Astral SCOPe 2.05 (<40% sequence identity).125 Three specific properties are reported: (a) nucleic acid binding, (b) hydrophobicity, and (c) disorder propensity. For each feature, enrichment or depletion of a set is indicated by color: green indicates that proteins contained in P-bodies or stress granules are enriched with respect to the proteome subsamples; red indicates depletion with respect to the proteome subsamples; yellow indicates no significant difference between the sets. As observed in previous experimental works,63, 104 proteins found in P-bodies and stress granules are more structurally disordered, more prone to bind nucleic acids, and less hydrophobic (p-values < 10^-5; Fisher's exact test). Sets and results can be accessed at http://www.tartaglialab.com/cs_multi/confirm/1207/c51ef6cff3/ where additional statistical analyses are also reported.
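Enrichment of a property in one protein set relative to another, as tested here with Fisher's exact test, can be sketched with a 2x2 contingency table. The implementation below is a generic two-sided test written with the standard library only. It is not the multicleverMachine code, and the counts in the example are invented for illustration.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    Rows: protein set (e.g., granule vs. globular); columns: property
    present vs. absent. Sums the hypergeometric probabilities of all
    tables that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Probability of x "present" proteins in row 1 with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


# Hypothetical counts: 40 of 62 granule proteins vs. 20 of 100 globular
# proteins carry the property of interest.
p_value = fisher_exact_two_sided(40, 22, 20, 80)
print(f"p = {p_value:.2e}")
```

Repeating the test against each of several random control sets, as the caption describes with five sets of 100 globular proteins, guards against a result that depends on one particular control sample.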