Sheng-You Huang, Xiaoqin Zou.
Abstract
Protein-RNA interactions play important roles in many biological processes. Given the high cost and technical difficulties of experimental methods, computational prediction of binding complexes from individual protein and RNA structures is urgently needed, and a reliable scoring function is one of its critical components. Here, we have developed a knowledge-based scoring function, referred to as ITScore-PR, for protein-RNA binding mode prediction by using a statistical mechanics-based iterative method. The pairwise distance-dependent atomic interaction potentials of ITScore-PR were derived from experimentally determined protein-RNA complex structures. For validation, we have compared ITScore-PR with 10 other scoring methods on four diverse test sets. For bound docking, ITScore-PR achieved a success rate of up to 86% if the top prediction was considered and up to 94% if the top 10 predictions were considered. For truly unbound docking, the respective success rates of ITScore-PR were up to 24% and 46%. ITScore-PR can be used stand-alone or easily implemented in other docking programs for protein-RNA recognition.
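As a reading aid, the generic form of a pairwise distance-dependent score can be written as below. The notation is an assumption introduced here, not taken from the paper: $t(i)$ is the atom type of atom $i$, $r_{ij}$ is the distance between protein atom $i$ and RNA atom $j$, and $u_{ab}(r)$ is the pair potential for the type pair $(a, b)$. Details such as a distance cutoff are not given in this record.

```latex
E_{\text{score}} \;=\; \sum_{i \,\in\, \text{protein}} \; \sum_{j \,\in\, \text{RNA}} u_{t(i)\,t(j)}\!\left(r_{ij}\right)
```

With 20 protein atom types and 12 RNA atom types, such a scheme involves $20 \times 12 = 240$ distinct type pairs.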
Year: 2014 PMID: 24476917 PMCID: PMC3985650 DOI: 10.1093/nar/gku077
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
List of 20 protein atom types and 12 RNA atom types used in ITScore-PR, in which ‘*’ stands for any residue name of proteins
| Number | Symbol | Atom names |
|---|---|---|
| Protein | ||
| 1 | C2+ | ARG_CZ |
| 2 | C2- | ASP_CG, GLU_CD |
| 3 | C2M | *_C |
| 4 | C2S | ASN_CG, GLN_CD |
| 5 | Car | HIS_CD2, HIS_CE1, HIS_CG, PHE_CD1, PHE_CD2, PHE_CE1, PHE_CE2, PHE_CG, PHE_CZ, TRP_CD1, TRP_CD2, TRP_CE2, TRP_CE3, TRP_CG, TRP_CH2, TRP_CZ2, TRP_CZ3, TYR_CD1, TYR_CD2, TYR_CE1, TYR_CE2, TYR_CG, TYR_CZ |
| 6 | C3C | ALA_CB, ARG_CB, ARG_CG, ASN_CB, ASP_CB, GLN_CB, GLN_CG, GLU_CB, GLU_CG, HIS_CB, ILE_CB, ILE_CD1, ILE_CG1, ILE_CG2, LEU_CB, LEU_CD1, LEU_CD2, LEU_CG, LYS_CB, LYS_CD, LYS_CG, MET_CB, PHE_CB, PRO_CB, PRO_CG, THR_CG2, TRP_CB, TYR_CB, VAL_CB, VAL_CG1, VAL_CG2 |
| 7 | C3A | *_CA |
| 8 | C3X | ARG_CD, CYS_CB, LYS_CE, MET_CE, MET_CG, PRO_CD, SER_CB, THR_CB |
| 9 | N2N | ALA_N, ARG_N, ASN_N, ASP_N, CYS_N, GLN_N, GLU_N, GLY_N, HIS_N, ILE_N, LEU_N, LYS_N, MET_N, PHE_N, PRO_N, SER_N, THR_N, TRP_N, TYR_N, VAL_N |
| 10 | N2+ | ARG_NH1, ARG_NH2 |
| 11 | N2X | ASN_ND2, GLN_NE2 |
| 12 | Nar | HIS_ND1, HIS_NE2, TRP_NE1 |
| 13 | N21 | ARG_NE |
| 14 | N3+ | LYS_NZ |
| 15 | O2M | *_O |
| 16 | O2S | ASN_OD1, GLN_OE1 |
| 17 | O3H | SER_OG, THR_OG1, TYR_OH |
| 18 | O2- | ASP_OD1, ASP_OD2, GLU_OE1, GLU_OE2 |
| 19 | S31 | CYS_SG |
| 20 | S30 | MET_SD |
| RNA | ||
| 1 | C2X | C_C2, G_C6, U_C2, U_C4 |
| 2 | Car | C_C4, C_C5, C_C6, G_C2, U_C5, U_C6, A_C2, A_C4, A_C5, A_C6, A_C8, G_C4, G_C5, G_C8 |
| 3 | C3X | A_C1’, A_C2’, A_C3’, A_C4’, A_C5’, C_C1’, C_C2’, C_C3’, C_C4’, C_C5’, G_C1’, G_C2’, G_C3’, G_C4’, G_C5’, U_C1’, U_C2’, U_C3’, U_C4’, U_C5’ |
| 4 | N2N | C_N1, G_N1, U_N1, U_N3 |
| 5 | N2X | A_N6, C_N4, G_N2 |
| 6 | Nar | C_N3, G_N3, A_N1, A_N3, A_N7, G_N7 |
| 7 | N21 | A_N9, G_N9 |
| 8 | O2 | C_O2, G_O6, U_O2, U_O4 |
| 9 | O31 | A_O2’, C_O2’, G_O2’, U_O2’ |
| 10 | O32 | A_O3’, A_O4’, A_O5’, C_O3’, C_O4’, C_O5’, G_O3’, G_O4’, G_O5’, U_O3’, U_O4’, U_O5’ |
| 11 | O2- | A_OP1, A_OP2, C_OP1, C_OP2, G_OP1, G_OP2, U_OP1, U_OP2 |
| 12 | P | A_P, C_P, G_P, U_P |
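The table can be read as a lookup from a residue (or base) name plus an atom name to an atom-type symbol. The following is a minimal Python sketch of that lookup, assuming only a subset of the rows above; the function and variable names are hypothetical and are not part of ITScore-PR itself. Rows written with ‘*’ are treated as wildcards that apply to any residue name.

```python
# Subset of the atom-type table above (symbol -> "RESIDUE_ATOM" names).
# Only a few rows are copied here for illustration.
PROTEIN_TYPES = {
    "C2+": ["ARG_CZ"],
    "C2-": ["ASP_CG", "GLU_CD"],
    "N2+": ["ARG_NH1", "ARG_NH2"],
    "N3+": ["LYS_NZ"],
    "O3H": ["SER_OG", "THR_OG1", "TYR_OH"],
    "S31": ["CYS_SG"],
    "S30": ["MET_SD"],
}

# Wildcard rows ('*' = any residue name): atom name -> symbol.
PROTEIN_WILDCARDS = {"C": "C2M", "CA": "C3A", "O": "O2M"}

RNA_TYPES = {
    "N2X": ["A_N6", "C_N4", "G_N2"],
    "N21": ["A_N9", "G_N9"],
    "O2-": ["A_OP1", "A_OP2", "C_OP1", "C_OP2",
            "G_OP1", "G_OP2", "U_OP1", "U_OP2"],
    "P": ["A_P", "C_P", "G_P", "U_P"],
}


def _invert(table):
    """Turn {symbol: [names]} into {name: symbol} for direct lookup."""
    return {name: symbol for symbol, names in table.items() for name in names}


_PROTEIN_LOOKUP = _invert(PROTEIN_TYPES)
_RNA_LOOKUP = _invert(RNA_TYPES)


def protein_atom_type(residue, atom):
    """Return the type symbol for a protein atom, or None if this subset lacks it."""
    key = f"{residue}_{atom}"
    if key in _PROTEIN_LOOKUP:
        return _PROTEIN_LOOKUP[key]
    # Fall back to the wildcard rows (*_C, *_CA, *_O).
    return PROTEIN_WILDCARDS.get(atom)


def rna_atom_type(base, atom):
    """Return the type symbol for an RNA atom, or None if this subset lacks it."""
    return _RNA_LOOKUP.get(f"{base}_{atom}")
```

For example, `protein_atom_type("GLY", "CA")` resolves through the wildcard row to `C3A`, while `rna_atom_type("A", "N9")` returns `N21`.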
Figure 1. Two example pair potentials for ITScore-PR.
Figure 2. The score-RMSD plots of ITScore-PR for the ROSETTA docking decoys (five complexes) generated by the Varani group (9).
Figure 3. The success rates of ITScore-PR and five other scoring functions (dRNA, DARS-RNP, QUASI-RNP, ZDOCK 2.1 and PMF) as a function of the number of top ranked orientations for the bound test cases of the 72 complexes in the protein–RNA docking benchmark by Huang and Zou (49).
Figure 4. The success rates of ITScore-PR and five other scoring functions (dRNA, DARS-RNP, QUASI-RNP, ZDOCK 2.1 and PMF) as a function of the number of top ranked orientations for (a) all the unbound cases in which homologous unbound structures are included (72 complexes), and (b) the ‘truly’ unbound test cases of the 50 complexes from the protein–RNA docking benchmark by Huang and Zou (49). The details are explained in the text.
Figure 5. The comparison between the predicted complex (protein: light blue, RNA: yellow) and experimentally determined crystal structure (protein: red, RNA: cyan) of three selected unbound test cases: (a) 1QTQ, (b) 2ZZM and (c) 1H3E.
Figure 6. The success rates of ITScore-PR and five other scoring functions (dRNA, DARS-RNP, QUASI-RNP, ZDOCK 2.1 and PMF) as a function of the number of top ranked orientations for the unbound test cases of the 72 complexes from the protein–RNA docking benchmark by Perez-Cano et al. (48).
Figure 7. The success rates of ITScore-PR and four other scoring functions (DECK-RP, DARS-RNP, the Li potential and RPDOCK) as a function of the number of top ranked orientations for the RPDOCK docking decoys based on (a) the 43 test cases in the protein–RNA docking benchmark by Perez-Cano et al. (48) and (b) the 50 test cases in the protein–RNA docking benchmark by Huang and Zou (49).
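The success-rate curves in Figures 3, 4, 6 and 7 plot, for each N, the percentage of test cases with at least one successful prediction among the N top-ranked orientations. A minimal Python sketch of that calculation follows. The function name and input layout are hypothetical, and the criterion for what counts as a successful orientation is not stated in this record, so it is left to the caller as a precomputed boolean.

```python
def success_rate_curve(ranked_hits, max_n):
    """Success rate (%) as a function of the number of top-ranked orientations.

    ranked_hits: one list per test case, ordered best score first; each entry
                 is True if that orientation counts as a successful prediction
                 (the success criterion itself is an assumption of the caller).
    max_n:       largest number of top-ranked orientations to consider.
    """
    curve = []
    for n in range(1, max_n + 1):
        # A case succeeds at N if any of its top-N orientations is a hit.
        successes = sum(1 for hits in ranked_hits if any(hits[:n]))
        curve.append(100.0 * successes / len(ranked_hits))
    return curve


# Toy input with four hypothetical test cases (not data from the paper).
toy_cases = [
    [True, False, False],   # hit at rank 1
    [False, False, True],   # hit at rank 3
    [False, True, False],   # hit at rank 2
    [False, False, False],  # no hit in the top 3
]
toy_curve = success_rate_curve(toy_cases, max_n=3)
```

On the toy input, `toy_curve` is `[25.0, 50.0, 75.0]`: one of four cases succeeds at N = 1, two at N = 2 and three at N = 3.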