Hongli Ma1,2,3,4, Han Wen2, Zhiyuan Xue5, Guojun Li1,4,6, Zhaolei Zhang2,3,7.
Abstract
RNA molecules can adopt stable secondary and tertiary structures, which are essential for mediating physical interactions with partners such as RNA-binding proteins (RBPs) and for carrying out their cellular functions. In vitro and in vivo experiments such as RNAcompete and eCLIP have revealed, respectively, the in vitro binding preferences of RBPs for RNA oligomers and their in vivo binding sites in cells. Analysis of these binding data showed that the structural properties of the RNAs at these binding sites are important determinants of binding; however, incorporating structural information into an interpretable model has remained a challenge. Here we describe a new approach, RNANetMotif, which takes the predicted secondary structures of thousands of RNA sequences bound by an RBP as input and uses a graph-theoretic approach to recognize enriched subgraphs. These enriched subgraphs are in essence shared sequence-structure elements that are important in RBP-RNA binding. To validate our approach, we performed RNA structure modeling via coarse-grained molecular dynamics folding simulations for four selected RBPs, and RNA-protein docking for LIN28B. The simulation results, e.g., solvent accessibility and energetics, further support the biological relevance of the discovered network subgraphs.
Year: 2022 PMID: 35819951 PMCID: PMC9275694 DOI: 10.1371/journal.pcbi.1010293
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.779
Implanted RNA motifs in recovery analysis as described in Section 2.7.
| Implanted motif | Type |
|---|---|
| NNNCCACCANNN | Sequence |
| (((......))) | Structure |
| NNNAUGNNNNNNNNNNNNNNN | Sequence |
| (((...(((........))))))))) | Structure |
| NNNAANNNNNNAANNN | Sequence |
| (((..((()))..))) | Structure |
| NNNNNGAGAGAGANNNNN | Sequence |
| (((((........))))) | Structure |
| NNNCCCCNNNNNNAAAANNNNNNCCCCNNN | Sequence |
| (((....((((()))))....(((((((())))))))....))) | Structure |
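A sequence paired with a dot-bracket structure, as in the table above, can be read as a graph whose nodes are nucleotides and whose edges are backbone links and base pairs. The sketch below illustrates that general idea using the first implanted motif, with the extraction spaces removed from its structure string. It is not the RNANetMotif graph encoding, which is not described in this section; the function names `parse_dot_bracket` and `build_graph` and the edge labels are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[int, int, str]


def parse_dot_bracket(structure: str) -> List[Tuple[int, int]]:
    """Return base pairs (i, j) with i < j from a dot-bracket string."""
    stack: List[int] = []
    pairs: List[Tuple[int, int]] = []
    for idx, ch in enumerate(structure):
        if ch == "(":
            stack.append(idx)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {idx}")
            pairs.append((stack.pop(), idx))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {idx}")
    if stack:
        raise ValueError(f"unmatched '(' at positions {stack}")
    return sorted(pairs)


def build_graph(sequence: str, structure: str) -> Tuple[Dict[int, str], List[Edge]]:
    """Build a labeled graph: nodes are nucleotides, edges are backbone or base pair."""
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have the same length")
    nodes = {i: base for i, base in enumerate(sequence)}
    # Consecutive nucleotides are joined by backbone edges.
    edges: List[Edge] = [(i, i + 1, "backbone") for i in range(len(sequence) - 1)]
    # Paired positions are joined by base-pair edges.
    edges += [(i, j, "basepair") for i, j in parse_dot_bracket(structure)]
    return nodes, edges


# First implanted motif from the table (assumed 6 nt hairpin loop, CCACCA).
nodes, edges = build_graph("NNNCCACCANNN", "(((......)))")
unpaired = [i for i in nodes if all(i not in e[:2] for e in edges if e[2] == "basepair")]
print(unpaired)  # positions of the loop nucleotides
```

The parser also rejects unbalanced strings, which is a quick way to flag structure rows that were damaged during extraction.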
Recovery rates of implanted RNA motifs by different methods (Section 2.7).
| Software (type) | 3nt bulge loop | 4nt internal loop | 6nt hairpin loop | 8nt hairpin loop | 12nt multi loop | Overall recovery rate |
|---|---|---|---|---|---|---|
| RNANetMotif (Sequence) | 1 | 1 | 1 | 1 | 1 | 1 |
| RNANetMotif (Structure) | 1 | 1 | 1 | 1 | 1 | 1 |
| GraphProt (Sequence) | 1/3 | 1 | 2/3 | 5/8 | 7/12 | 2/3 |
| GraphProt (Structure) | 1/3 | 1 | 0 | 0 | 2/3 | 2/5 |
| RNAcontext (Sequence) | 1/3 | 3/4 | 2/3 | 1 | 5/6 | 5/7 |
| RNAcontext (Structure) | 0 | 0 | 0 | 1 | 1 | 2/5 |
| Zagros (Sequence) | 2/3 | 3/4 | 1 | 7/8 | 2/3 | 4/5 |
| Zagros (Structure) | 0 | 1 | 1 | 1 | 2/3 | 3/4 |
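The section does not state how the "Overall recovery rate" column is derived. One plausible reading, treated here purely as an assumption, is an unweighted mean of the five per-motif rates. The sketch below checks that reading with exact fractions for two rows of the table; the helper name `mean_recovery` is hypothetical.

```python
from fractions import Fraction
from typing import Dict, List

# Per-motif recovery rates copied from two rows of the table above.
per_motif: Dict[str, List[str]] = {
    "GraphProt (Structure)": ["1/3", "1", "0", "0", "2/3"],
    "RNAcontext (Structure)": ["0", "0", "0", "1", "1"],
}


def mean_recovery(rates: List[str]) -> Fraction:
    """Unweighted mean of per-motif rates (an assumed definition, not the paper's)."""
    values = [Fraction(r) for r in rates]
    return sum(values, Fraction(0)) / len(values)


for name, rates in per_motif.items():
    print(name, mean_recovery(rates))
```

Under this assumption both rows give 2/5, matching the table. Other rows come out close to, but not exactly, the reported fraction (for example, GraphProt (Sequence) gives 77/120 against a reported 2/3), so the published values may be rounded or computed differently.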
RBPs and the domain information.
| RBP and Cell Line | RNA binding domain | #peaks |
|---|---|---|
| AKAP1_HepG2 | KH, Tudor | 5338 |
| BCLAF1_HepG2 | | 22884 |
| DDX24_K562 | Helicase ATP-binding, Helicase C-terminal | 5841 |
| DDX3X_HepG2 | Helicase ATP-binding, Helicase C-terminal | 5976 |
| DDX3X_K562 | Helicase ATP-binding, Helicase C-terminal | 3961 |
| FAM120A_K562 | | 4218 |
| G3BP1_HepG2 | NTF2, RRM | 5204 |
| GRWD1_HepG2 | | 16040 |
| IGF2BP1_K562 | RRM1, RRM2, KH1, KH2, KH3, KH4 | 4690 |
| LARP4_HepG2 | HTH La-type RNA-binding, RRM | 4350 |
| LIN28B_K562 | CSD | 4616 |
| PABPC4_K562 | RRM1, RRM2, RRM3, RRM4, PABC | 4865 |
| PPIG_HepG2 | PPIase cyclophilin-type | 13538 |
| PUM2_K562 | PUM-HD | 4742 |
| RBM15_K562 | RRM1, RRM2, RRM3, SPOC | 6800 |
| RPS3_HepG2 | KH type-2 | 4697 |
| SND1_HepG2 | TNase-like 1, TNase-like 2, TNase-like 3, TNase-like 4, Tudor | 5697 |
| UCHL5_K562 | | 12866 |
| UPF1_HepG2 | | 8547 |
| UPF1_K562 | | 11708 |
| YBX3_K562 | CSD | 12706 |
| ZNF622_K562 | | 10551 |