Ranjit Prasad Bahadur, Martin Zacharias, Joël Janin.
Abstract
We analyze the protein-RNA interfaces in 81 transient binary complexes taken from the Protein Data Bank. Those with tRNA or duplex RNA are larger than with single-stranded RNA, and comparable in size to protein-DNA interfaces. The protein side bears a strong positive electrostatic potential and resembles protein-DNA interfaces in its amino acid composition. On the RNA side, the phosphate contributes less, and the sugar much more, to the interaction than in protein-DNA complexes. On average, protein-RNA interfaces contain 20 hydrogen bonds, 7 that involve the phosphates, 5 the sugar 2'OH, and 6 the bases, and 32 water molecules. The average H-bond density per unit buried surface area is less with tRNA or single-stranded RNA than with duplex RNA. The atomic packing is also less compact in interfaces with tRNA. On the protein side, the main chain NH and Arg/Lys side chains account for nearly half of all H-bonds to RNA; the main chain CO and side chain acceptor groups, for a quarter. The 2'OH is a major player in protein-RNA recognition, and shape complementarity an important determinant, whereas electrostatics and direct base-protein interactions play a lesser part than in protein-DNA recognition.
Year: 2008 PMID: 18353859 PMCID: PMC2377425 DOI: 10.1093/nar/gkn102
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Table 1. The protein–RNA data set
| Class | PDB entries |
|---|---|
| A. Complexes with tRNA (21 entries) | 1asy, 1c0a, 1f7u*, 1ffy*, 1gax, 1h3e, 1h4s, 1j1u*, 1n78*, 1qf6, 1qtq, 1ser, 1ttt, 1u0b*, 1vfg, 2azx, 2bte, 2csx, 2drb, 2fk6, 2fmt |
| B. Ribosomal proteins (11 entries) | 1dfu*, 1f7y, 1feu*, 1g1x, 1i6u, 1mji, 1mms, 1mzp, 1s03, 1sds*, 2hw8* |
| C. Duplex RNA (17 entries) | 1di2*, 1e7k, 1hq1*, 1msw*, 1ooa, 1r3e*, 1rpu, 1si3, 1wne, 1yvp*, 1zbi*, 2az0, 2ez6*, 2f8s, 2gjw, 2hvy*, 2ipy |
| D. Single-stranded RNA (30 entries) | 1a9n, 1av6, 1cvj, 1g2e*, 1jbs*, 1jid*, 1k8w*, 1knz, 1kq2, 1lng*, 1m5o*, 1m8v, 1m8w*, 1n35, 1wpu*, 1wsu*, 1zbh, 1zh5*, 2a8v, 2anr*, 2asb*, 2b3j*, 2bx2, 2db3*, 2f8k*, 2g4b, 2gic, 2i82*, 2ix1, 2j0s* |
| E. Miscellaneous (2 entries) | 2bgg*, 2bh2* |
Asterisk indicates PDB entry with resolution better than 2.4 Å.
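The class sizes and the number of high-resolution entries can be checked directly from the list of PDB codes. The snippet below is a minimal sketch, not part of the original analysis: the dictionary `DATASET` and the helper `summarize` are hypothetical names introduced here, and the codes are transcribed from the data-set table above.

```python
# PDB codes transcribed from the data-set table; a trailing "*" marks an
# entry with resolution better than 2.4 Å. Names below are illustrative only.
DATASET = {
    "A tRNA": "1asy 1c0a 1f7u* 1ffy* 1gax 1h3e 1h4s 1j1u* 1n78* 1qf6 1qtq 1ser "
              "1ttt 1u0b* 1vfg 2azx 2bte 2csx 2drb 2fk6 2fmt",
    "B ribosomal": "1dfu* 1f7y 1feu* 1g1x 1i6u 1mji 1mms 1mzp 1s03 1sds* 2hw8*",
    "C duplex RNA": "1di2* 1e7k 1hq1* 1msw* 1ooa 1r3e* 1rpu 1si3 1wne 1yvp* 1zbi* "
                    "2az0 2ez6* 2f8s 2gjw 2hvy* 2ipy",
    "D single-strand": "1a9n 1av6 1cvj 1g2e* 1jbs* 1jid* 1k8w* 1knz 1kq2 1lng* "
                       "1m5o* 1m8v 1m8w* 1n35 1wpu* 1wsu* 1zbh 1zh5* 2a8v 2anr* "
                       "2asb* 2b3j* 2bx2 2db3* 2f8k* 2g4b 2gic 2i82* 2ix1 2j0s*",
    "E miscellaneous": "2bgg* 2bh2*",
}


def summarize(dataset):
    """Return {class: (number of entries, number of high-resolution entries)}."""
    summary = {}
    for cls, codes in dataset.items():
        entries = codes.split()
        high_res = [code for code in entries if code.endswith("*")]
        summary[cls] = (len(entries), len(high_res))
    return summary


SUMMARY = summarize(DATASET)
TOTAL = sum(n for n, _ in SUMMARY.values())
TOTAL_HIGH_RES = sum(h for _, h in SUMMARY.values())

if __name__ == "__main__":
    for cls, (n, h) in SUMMARY.items():
        print(f"{cls}: {n} entries, {h} high resolution")
    print(f"total: {TOTAL} entries, {TOTAL_HIGH_RES} high resolution")
```

The totals agree with the 81 complexes quoted in the abstract and with the 36 high-resolution entries mentioned in the notes on water molecules.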
Table 2. Average properties of the protein–RNA interfaces
| Average value of interface parameter | Protein/RNA, all classes | Protein/RNA, A tRNA | Protein/RNA, B ribosomal | Protein/RNA, C duplex RNA | Protein/RNA, D single-strand | Protein/DNA | Protein/protein |
|---|---|---|---|---|---|---|---|
| Number of complexes | 81 | 21 | 11 | 17 | 30 | 75 | 70 |
| Nucleotides in RNA | 42 ± 28 | 76 | 47 | 33 | 21 | – | – |
| BSA (Å²) | 2530 ± 1210 | 3460 | 2260 | 2630 | 1890 | 3100 | 1910 |
| Protein | 1210 | 1660 | 1110 | 1270 | 880 | 1540 | – |
| Nucleic acid | 1320 | 1800 | 1150 | 1360 | 1010 | 1560 | – |
| Number of | |||||||
| Amino acids N_aa | 43 ± 21 | 61 | 34 | 45 | 33 | 48 | 57 |
| Nucleotides N_nu | 17.5 ± 10 | 26 | 21 | 18 | 10 | 18 | – |
| BSA (Å²) per | |||||||
| Amino acid | 28 | 27 | 33 | 28 | 27 | 33 | 33 |
| Nucleotide | 75 | 68 | 53 | 75 | 106 | 72 | – |
| Percentage buried atoms f_bu | |||||||
| Protein | 29 ± 9 | 24 | 31 | 41 | 32 | 24 | 34 |
| Nucleic acid | 29 ± 8 | 24 | 32 | 42 | 30 | 28 | – |
| Packing index L_D | |||||||
| Protein | 37 ± 8 | 35 | 37 | 36 | 38 | 39 | 42 |
| Nucleic acid | 43 ± 9 | 38 | 40 | 42 | 46 | 46 | – |
| H-bonds | |||||||
| Number per interface | 20 ± 11 | 25 | 19 | 24 | 15 | 22 | 10 |
| BSA per bond (Å²) | 125 | 141 | 117 | 110 | 126 | 145 | 190 |
| Water molecules | |||||||
| Number per interface | 32 ± 19 | | | | | 21 | 20 |
| Per 1000 Å² BSA | 12.6 | | | | | 6.7 | 10.0 |
| Bridging H-bonds | 11 ± 7 | 6 | |||||
a. Data from ref. (4).
b. Data from ref. (19).
c. Bahadur et al. (20) report values of L_D = 42 for protein-protein complexes and L_D = 32 for crystal packing interfaces. All other values are from this work.
d. In 36 PDB entries with resolution better than 2.4 Å. The data for protein-protein interfaces are from ref. (28).
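Several rows of the table of average properties are simple ratios of other rows. The sketch below recomputes them for the protein/RNA "all classes" column; the dictionary keys and the function name `derived_ratios` are hypothetical. It takes ratios of the tabulated averages, which need not equal averages of per-complex ratios, so small deviations from the tabulated values are expected.

```python
# Averages for the protein/RNA "all classes" column, copied from the table.
PROTEIN_RNA = {
    "bsa_total": 2530,      # Å², protein plus RNA
    "bsa_protein": 1210,    # Å²
    "bsa_nucleic": 1320,    # Å²
    "n_amino_acids": 43,
    "n_nucleotides": 17.5,
    "h_bonds": 20,
    "waters": 32,           # from the high-resolution entries only
}


def derived_ratios(avg):
    """Ratios of averages; an approximation to the per-complex averages."""
    return {
        "bsa_per_amino_acid": avg["bsa_protein"] / avg["n_amino_acids"],
        "bsa_per_nucleotide": avg["bsa_nucleic"] / avg["n_nucleotides"],
        "bsa_per_h_bond": avg["bsa_total"] / avg["h_bonds"],
        "waters_per_1000_A2": 1000 * avg["waters"] / avg["bsa_total"],
    }


RATIOS = derived_ratios(PROTEIN_RNA)

if __name__ == "__main__":
    for name, value in RATIOS.items():
        print(f"{name}: {value:.1f}")
```

For this column the ratios land close to the tabulated entries (28 and 75 Å² per amino acid and nucleotide, about 125 Å² per H-bond, 12.6 water molecules per 1000 Å²).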
Figure 1. Size of protein–RNA interfaces. Histogram of the buried surface area (protein plus RNA) in each of the 81 complexes. The classes are defined in Table 1.
Figure 2. Buried interface area, atoms, residues and nucleotides. The number of interface atoms (A) and of interface amino acid residues or nucleotides (B) is plotted against the ASA lost by either the protein (x) or the RNA (•) component of the 81 complexes.
Table 3. Chemical composition of the interfaces
| Average area contribution (%) | Protein–RNA, interface | Protein–RNA, accessible surface | Protein–DNA, interface | Protein–DNA, accessible surface | Protein–protein, interface | Protein–protein, accessible surface |
|---|---|---|---|---|---|---|
| Polypeptide | ||||||
| Main chain | 15 | 20 | 13 | 20 | 20 | 23 |
| Side chain | 85 | 80 | 87 | 80 | 80 | 77 |
| Non-polar | 55 | 56 | 52 | 56 | 58 | 55 |
| Neutral polar | 21 | 22 | 24 | 23 | 28 | 29 |
| Charged (positive) | 20 | 12 | 23 | 12 | 9 | 8 |
| Charged (negative) | 4 | 10 | 2 | 9 | 5 | 8 |
| Nucleotide | ||||||
| Phosphate | 26 | 32 | 43 | 35 | ||
| Sugar | 39 | 36 | 29 | 38 | ||
| Base | 35 | 32 | 27 | 28 | ||
| Non-polar | 33 | 30 | 41 | 47 | ||
| Neutral polar | 41 | 39 | 16 | 19 | ||
| Charged (negative) | 26 | 32 | 43 | 34 | ||
a. The contributions of each atom type to the BSA or ASA are averaged over all the complexes.
b. Taken from ref. (4).
c. Calculated on the dataset in ref. (19).
d. All carbon-containing groups are counted as non-polar; O, N and S are counted as polar; N is positively charged in Arg/Lys side chains, and O is negatively charged in Asp/Glu side chains.
e. O1P, O2P and P atoms are ‘phosphate’. All carbon-containing groups are ‘non-polar’; N and O are ‘neutral polar’ except for O1P and O2P, which are negatively charged.
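The atom classification described in notes d and e can be written as a small set of rules. The following is a minimal sketch under stated assumptions, not the authors' code: the function names are hypothetical, atoms are identified by PDB-style names for standard residues and nucleotides, sugar atoms are recognized by a trailing prime (or asterisk in older files), the newer names OP1/OP2 are accepted alongside O1P/O2P, and P, which the notes do not assign a polarity class, is returned as "unclassified".

```python
PHOSPHATE_ATOMS = {"P", "O1P", "O2P", "OP1", "OP2"}
CHARGED_PHOSPHATE_OXYGENS = {"O1P", "O2P", "OP1", "OP2"}
MAIN_CHAIN_ATOMS = {"N", "CA", "C", "O", "OXT"}  # OXT treated as main chain (assumption)


def nucleotide_group(atom_name):
    """Assign a nucleotide atom to 'phosphate', 'sugar' or 'base'."""
    if atom_name in PHOSPHATE_ATOMS:
        return "phosphate"
    if atom_name.endswith("'") or atom_name.endswith("*"):
        return "sugar"  # assumption: primed names denote ribose atoms
    return "base"


def nucleotide_character(atom_name):
    """Polarity class of a nucleotide atom, following note e."""
    if atom_name in CHARGED_PHOSPHATE_OXYGENS:
        return "charged (negative)"
    element = atom_name[0]
    if element == "C":
        return "non-polar"
    if element in ("N", "O"):
        return "neutral polar"
    return "unclassified"  # e.g. P or hydrogens, not covered by the note


def protein_character(residue, atom_name):
    """Polarity class of a protein atom, following note d."""
    element = atom_name[0]
    in_side_chain = atom_name not in MAIN_CHAIN_ATOMS
    if element == "C":
        return "non-polar"
    if element == "N" and in_side_chain and residue in ("ARG", "LYS"):
        return "charged (positive)"
    if element == "O" and in_side_chain and residue in ("ASP", "GLU"):
        return "charged (negative)"
    if element in ("N", "O", "S"):
        return "neutral polar"
    return "unclassified"


if __name__ == "__main__":
    print(nucleotide_group("O2'"), nucleotide_character("O2'"))
    print(protein_character("ARG", "NH1"), protein_character("SER", "OG"))
```

Summing the buried area of each atom by these classes, complex by complex, would give contributions of the kind reported in the table, provided the buried area per atom is available from a separate surface calculation.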
Figure 3. Shape and electrostatic potential of protein–RNA interfaces. The molecular surface of the proteins is colored according to its electrostatic potential; blue is positive and red negative. The RNA backbone is drawn as a tube. (A) The RNase E subunit binds a 15-mer RNA with the 5′-end at its active site (23); the interface is one of the smallest in our sample, but the 15-mer makes other contacts in the crystal (2bx2, class D). (B) The splicing endonuclease is a dimer (36); it forms an average size interface with a double-stranded 19-mer (2gjw, class C). (C) Yeast arginyl-tRNA synthetase (37) forms an extensive interface with tRNA-Arg (1f7u, class A). (D) Ribosomal protein S8 in complex with a 37-mer stem-loop fragment of 16S rRNA (38) (1i6u, class B). (E) The SAM domain of the Vts1 post-transcriptional regulator in complex with a 16-mer hairpin RNA (39) (2f8k, class D). (F) The 15.5 kDa spliceosomal protein in complex with a 22-mer stem-loop fragment of U4 snRNA (40) (1e7k, class C). The figure was created using PyMOL (DeLano Scientific LLC, San Carlos, CA).
Table 4. Amino acid and nucleotide compositions
| Composition (%) | Number-based, protein–RNA surface | Number-based, protein–RNA interface | Area-based, protein–RNA surface | Area-based, protein–RNA interface | Area-based, protein–DNA surface | Area-based, protein–DNA interface | Area-based, protein–protein surface | Area-based, protein–protein interface |
|---|---|---|---|---|---|---|---|---|
| Nucleotides | ||||||||
| A | 20.0 | 20.5 | 20.2 | 24.2 | 26.7 | 24.6 | ||
| U/T | 20.6 | 21.6 | 21.2 | 23.3 | 27.1 | 31.5 | ||
| G | 31.8 | 28.7 | 30.8 | 25.4 | 23.8 | 23.4 | ||
| C | 27.6 | 29.1 | 26.8 | 26.5 | 22.5 | 20.5 | ||
| Amino acids | ||||||||
| Ala | 5.6 | 4.8 | 3.4 | 3.1 | 3.4 | 3.4 | 4.0 | 2.6 |
| Arg | 8.4 | 13.6 | 12.6 | 20.6 | 12.1 | 23.8 | 8.9 | 10.1 |
| Asn | 4.2 | 5.3 | 4.3 | 6.0 | 5.3 | 6.3 | 6.2 | 5.5 |
| Asp | 6.9 | 5.3 | 7.1 | 3.8 | 6.4 | 1.6 | 7.1 | 5.2 |
| Cys | 0.6 | 0.7 | 0.3 | 0.3 | 0.4 | 0.8 | 0.7 | 1.5 |
| Gln | 4.3 | 4.5 | 5.0 | 4.7 | 5.5 | 5.1 | 6.0 | 4.2 |
| Glu | 11.1 | 5.5 | 15.3 | 4.2 | 12.3 | 2.5 | 9.8 | 6.1 |
| Gly | 6.8 | 6.6 | 3.8 | 4.7 | 3.3 | 3.6 | 4.5 | 4.6 |
| His | 2.3 | 3.1 | 2.3 | 4.2 | 2.9 | 3.8 | 1.9 | 3.6 |
| Ile | 3.9 | 3.5 | 2.6 | 2.9 | 2.8 | 2.8 | 2.4 | 4.2 |
| Leu | 6.9 | 5.0 | 4.8 | 3.9 | 5.1 | 2.4 | 4.1 | 5.5 |
| Lys | 9.9 | 11.3 | 15.5 | 14.0 | 16.5 | 17.5 | 11.8 | 6.7 |
| Met | 1.7 | 2.1 | 1.5 | 1.9 | 1.8 | 1.2 | 1.2 | 3.2 |
| Phe | 3.1 | 3.4 | 2.3 | 3.6 | 1.8 | 3.8 | 2.0 | 4.4 |
| Pro | 5.3 | 4.5 | 4.9 | 3.6 | 4.3 | 2.2 | 5.1 | 4.0 |
| Ser | 5.1 | 6.2 | 3.5 | 4.6 | 4.7 | 6.3 | 8.4 | 5.5 |
| Thr | 4.8 | 5.3 | 3.8 | 4.2 | 4.4 | 6.7 | 7.3 | 5.1 |
| Trp | 1.1 | 1.2 | 0.9 | 1.9 | 0.8 | 0.5 | 1.3 | 4.5 |
| Tyr | 3.5 | 4.3 | 2.9 | 5.0 | 3.4 | 3.4 | 3.2 | 9.1 |
| Val | 4.7 | 4.0 | 3.2 | 2.7 | 2.9 | 2.4 | 3.6 | 3.8 |
a. Percent fraction of the number of nucleotides or amino acid residues of each type present on the surface or at the interface. U/T includes pseudouracil.
b. Percent fraction of the ASA or BSA contributed by each type of nucleotide or residue.
c. Data from ref. (4).
d. Data from ref. (3).
Figure 4. Euclidean distances between amino acid compositions. Values of Δf are calculated from the area-based compositions in Table 4 as reported under ‘Results’ section.
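A distance between two compositions can be illustrated with the area-based interface columns of the amino acid composition table. The sketch below uses a plain, unnormalized Euclidean distance; the exact definition and normalization of Δf are given in the Results section, which is not reproduced here, so the absolute values printed by this code should not be read as the Δf values of the figure. The names `INTERFACE` and `euclidean_distance` are hypothetical.

```python
import math

# Area-based interface compositions (%), in this residue order:
# Ala Arg Asn Asp Cys Gln Glu Gly His Ile Leu Lys Met Phe Pro Ser Thr Trp Tyr Val
INTERFACE = {
    "protein-RNA": [3.1, 20.6, 6.0, 3.8, 0.3, 4.7, 4.2, 4.7, 4.2, 2.9,
                    3.9, 14.0, 1.9, 3.6, 3.6, 4.6, 4.2, 1.9, 5.0, 2.7],
    "protein-DNA": [3.4, 23.8, 6.3, 1.6, 0.8, 5.1, 2.5, 3.6, 3.8, 2.8,
                    2.4, 17.5, 1.2, 3.8, 2.2, 6.3, 6.7, 0.5, 3.4, 2.4],
    "protein-protein": [2.6, 10.1, 5.5, 5.2, 1.5, 4.2, 6.1, 4.6, 3.6, 4.2,
                        5.5, 6.7, 3.2, 4.4, 4.0, 5.5, 5.1, 4.5, 9.1, 3.8],
}


def euclidean_distance(f1, f2):
    """Unnormalized Euclidean distance between two composition vectors."""
    if len(f1) != len(f2):
        raise ValueError("compositions must cover the same residue types")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))


if __name__ == "__main__":
    names = list(INTERFACE)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            d = euclidean_distance(INTERFACE[first], INTERFACE[second])
            print(f"{first} vs {second}: {d:.1f}")
```

With these numbers the protein–RNA interface composition is much closer to the protein–DNA interface than to the protein–protein interface, in line with the statement in the abstract that the protein side resembles protein–DNA interfaces in its amino acid composition.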
Table 5. Protein–nucleic acid hydrogen bonds
| H bonds | Protein–RNA | Protein–DNA |
|---|---|---|
| Number per interface | 20 | 22 |
| Protein chemical group (%) | ||
| Main chain O | 12 | 10 |
| Main chain N | 14 | 18 |
| Side chain groups | 74 | 73 |
| N Arg, Lys | 34 | 41 |
| N Asn, Gln, His, Trp | 11 | 14 |
| OH Ser, Thr, Tyr | 17 | 17 |
| S Cys, Met | 0.2 | 1 |
| O Asp, Glu, Asn, Gln | 12 | – |
| Nucleic acid chemical group (%) | ||
| Phosphate | 36 | 60 |
| Sugar | 33 | 6 |
| Base | 31 | 34 |
| Guanine | 10.5 | 16 |
| Adenine | 6.0 | 7 |
| Cytosine | 7.7 | 7 |
| Uracil/Thymine | 7.4 | 4 |
a. Data from ref. (4).
b. Percentage of the 1637 protein–RNA H bonds contributed by different chemical groups.
c. Includes pseudouracil.
Figure 5. The H-bonding pattern of RNA to proteins. The numbers are percent fractions of the 1637 protein–RNA H-bonds identified in the 81 complexes; a 5% fraction represents approximately one bond per complex. (A) Bonds involving the RNA backbone. (B) Bonds involving the bases. U/T includes pseudouracil.
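The conversion from a percent fraction to an average number of bonds per complex is a one-line calculation. The sketch below applies it to the nucleic acid groups of the hydrogen bond table; the function name `bonds_per_complex` is a hypothetical helper, and the percentages are copied from the protein–RNA column.

```python
TOTAL_H_BONDS = 1637   # protein–RNA H-bonds identified in the data set
N_COMPLEXES = 81

# Percent of protein–RNA H-bonds made by each nucleic acid group.
NUCLEIC_ACID_GROUP_PERCENT = {"phosphate": 36, "sugar": 33, "base": 31}


def bonds_per_complex(percent):
    """Average number of H-bonds per complex for a given percent fraction."""
    return percent / 100 * TOTAL_H_BONDS / N_COMPLEXES


if __name__ == "__main__":
    print(f"5% of all bonds: {bonds_per_complex(5):.2f} per complex")
    for group, percent in NUCLEIC_ACID_GROUP_PERCENT.items():
        print(f"{group}: {bonds_per_complex(percent):.1f} per complex")
```

A 5% fraction comes out at about one bond per complex, as stated in the caption, and the phosphate and base fractions correspond to roughly 7 and 6 bonds per interface, the figures quoted in the abstract.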