Yaroslav Chushak, Morley O Stone.
Abstract
In vitro selection of RNA aptamers that bind to a specific ligand usually begins with a random pool of RNA sequences. We propose a computational approach for designing a starting pool of RNA sequences for the selection of RNA aptamers for specific analyte binding. Our approach consists of three steps: (i) selection of RNA sequences based on their secondary structure, (ii) generation of a library of three-dimensional (3D) structures of RNA molecules and (iii) high-throughput virtual screening of this library to select aptamers with binding affinity to a desired small molecule. We developed a set of criteria that allows one to select a sequence with potential binding affinity from a pool of random sequences and developed a protocol for RNA 3D structure prediction. As verification, we tested the performance of in silico selection on a set of six known aptamer-ligand complexes. The structures of the native sequences for the ligands in the testing set were among the top 5% of the selected structures. The proposed approach reduces the RNA sequence search space by four to five orders of magnitude, significantly accelerating the experimental screening and selection of high-affinity aptamers.
Year: 2009 PMID: 19465396 PMCID: PMC2709588 DOI: 10.1093/nar/gkp408
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Reduction in size of the RNA sequence space for experimental screening and selection of RNA aptamers by the in silico approach. The secondary structure of more than 2.5 × 10⁸ RNA sequences was analyzed to select 100 000 sequences for the RNA 3D structure library. The high-throughput virtual screening of the developed library selected 10³–10⁴ sequences suitable for experimental screening and verification.
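The funnel described in the Figure 1 caption can be checked with a few lines of arithmetic. The sketch below uses only the pool sizes quoted in the caption; the variable and function names are illustrative, and "more than 2.5 × 10⁸" is treated as a lower bound.

```python
import math

# Pool sizes quoted in the Figure 1 caption.
initial_pool = 2.5e8                      # sequences analyzed by secondary structure (lower bound)
library_size = 1e5                        # sequences kept for the 3D structure library
screened_low, screened_high = 1e3, 1e4    # sequences remaining after virtual screening


def orders_of_magnitude(before, after):
    """Return log10 of the reduction factor between two pool sizes."""
    return math.log10(before / after)


stage1 = orders_of_magnitude(initial_pool, library_size)
overall_min = orders_of_magnitude(initial_pool, screened_high)
overall_max = orders_of_magnitude(initial_pool, screened_low)

print(f"secondary-structure filter: {stage1:.1f} orders of magnitude")
print(f"overall reduction: {overall_min:.1f} to {overall_max:.1f} orders of magnitude")
```

The overall range is consistent with the "four to five orders of magnitude" stated in the abstract.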
The free energy of secondary structure formation for RNA aptamers that bind different ligands, compared with the mean free energy of random sequences of the same length
| Ligand | Aptamer length (bases) | Mean free energy (kcal/mol) | SD (kcal/mol) | Aptamer free energy (kcal/mol) | Aptamer Z-score |
|---|---|---|---|---|---|
| ATP | 40 | −6.54 | 3.23 | −17.7 | −3.5 |
| Codeine | 34 | −4.96 | 2.90 | −7.60 | −0.9 |
| Flavin | 35 | −5.27 | 3.00 | −18.0 | −4.2 |
| Gentamicin | 27 | −3.18 | 2.47 | −13.9 | −4.3 |
| Guanine | 32 | −4.39 | 2.78 | −13.7 | −3.4 |
| Isoleucine | 27 | −3.18 | 2.47 | −7.30 | −1.7 |
| Neomycin | 23 | −2.17 | 2.10 | −11.0 | −4.2 |
| Neomycin | 31 | −4.23 | 2.70 | −7.50 | −1.2 |
| Theophylline | 33 | −4.73 | 2.80 | −11.6 | −2.5 |
| Tobramycin | 26 | −2.90 | 2.35 | −8.70 | −2.5 |
Non-functional neomycin aptamer.
Neomycin aptamer that can be integrated into riboswitch to regulate gene expression.
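The last column of the table above is consistent with a Z-score, that is, the aptamer free energy minus the mean free energy of same-length random sequences, divided by the standard deviation. The following sketch recomputes that quantity from the tabulated values; the tuple layout and function name are illustrative, and the interpretation of the column as a Z-score is an inference from the numbers rather than something stated in the extracted text.

```python
# (ligand, length, mean_dG, sd_dG, aptamer_dG, reported_last_column)
# Values copied from the table; energies in kcal/mol.
rows = [
    ("ATP",          40, -6.54, 3.23, -17.7, -3.5),
    ("Codeine",      34, -4.96, 2.90, -7.60, -0.9),
    ("Flavin",       35, -5.27, 3.00, -18.0, -4.2),
    ("Gentamicin",   27, -3.18, 2.47, -13.9, -4.3),
    ("Guanine",      32, -4.39, 2.78, -13.7, -3.4),
    ("Isoleucine",   27, -3.18, 2.47, -7.30, -1.7),
    ("Neomycin",     23, -2.17, 2.10, -11.0, -4.2),
    ("Neomycin",     31, -4.23, 2.70, -7.50, -1.2),
    ("Theophylline", 33, -4.73, 2.80, -11.6, -2.5),
    ("Tobramycin",   26, -2.90, 2.35, -8.70, -2.5),
]


def z_score(value, mean, sd):
    """Number of standard deviations by which value lies from the mean."""
    return (value - mean) / sd


for ligand, length, mean_dg, sd_dg, aptamer_dg, reported in rows:
    z = z_score(aptamer_dg, mean_dg, sd_dg)
    print(f"{ligand:<13}{length:>3} nt  computed {z:6.2f}  reported {reported:5.1f}")
```

The recomputed values agree with the reported column to within about 0.1, which is what one would expect if the tabulated inputs were rounded before publication.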
Figure 2. Secondary structure of RNA aptamers for different small-molecule ligands. Blue-circled bases participate in the formation of the ligand-binding pocket and red-circled bases form hydrogen bonds with the ligand.
Figure 3. Distribution of structural motifs in 10⁵ random RNA sequences. The solid blue line represents sequences selected under two constraints: a secondary-structure free energy below −5.7 kcal/mol and at least 11 unpaired bases. These sequences fold into 725 different secondary structures. The dashed red line shows the distribution when the number of sequences with the same structural motif is limited to 150; in this case, the pool of 100 000 sequences contains 997 different structural motifs.
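The selection criteria in the Figure 3 caption can be sketched as a simple filter. The example below assumes each candidate already carries a predicted secondary structure in dot-bracket notation and a folding free energy (for instance from an external folding program, which is not called here). Using the dot-bracket string itself as the "structural motif" key is a simplification made for illustration; the source does not say how motifs were identified, and the function and parameter names are hypothetical.

```python
from collections import defaultdict


def select_sequences(candidates, max_energy=-5.7, min_unpaired=11, max_per_motif=150):
    """Filter (sequence, dot_bracket, free_energy) tuples by the caption's criteria.

    A candidate is kept when its free energy (kcal/mol) is below max_energy,
    it has at least min_unpaired unpaired bases, and fewer than max_per_motif
    sequences with the same motif key have already been accepted.
    """
    motif_counts = defaultdict(int)
    selected = []
    for sequence, dot_bracket, free_energy in candidates:
        if free_energy >= max_energy:
            continue                       # secondary structure not stable enough
        if dot_bracket.count(".") < min_unpaired:
            continue                       # too few unpaired bases
        motif = dot_bracket                # assumption: structure string as motif key
        if motif_counts[motif] >= max_per_motif:
            continue                       # this motif is already well represented
        motif_counts[motif] += 1
        selected.append((sequence, dot_bracket, free_energy))
    return selected


# Toy input: made-up sequences and energies, for illustration only.
hairpin = "((((" + "." * 11 + "))))"
toy = [
    ("GGGGAAAAAAAAAAACCCC", hairpin, -6.1),
    ("GGGCAAAAAAAAAAAGCCC", hairpin, -5.0),   # rejected: energy too high
    ("GGGGAAAACCCC", "((((....))))", -8.0),   # rejected: only 4 unpaired bases
]
print(len(select_sequences(toy)))
```

Capping the count per motif trades a few stable sequences for structural diversity, which matches the caption's observation that the cap raises the number of distinct motifs from 725 to 997.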
The average backbone RMSD (Å) of the five lowest-energy predicted structures from the native structure
| PDB | Length (nt) | Rosetta score | Minimization |
|---|---|---|---|
| 1BYJ | 27 | 5.2 (1.2) | 3.8 (1.3) |
| 1EHT | 33 | 7.4 (0.9) | 6.9 (0.6) |
| 1ESY | 19 | 5.6 (0.3) | 4.9 (0.6) |
| 1KKA | 17 | 6.9 (1.1) | 6.2 (0.7) |
| 1Q9A | 27 | 6.5 (0.4) | 6.2 (0.4) |
| 1QWA | 21 | 7.3 (0.7) | 6.1 (0.8) |
| 28SP | 28 | 4.2 (0.9) | 3.7 (0.6) |
| 2F88 | 34 | 6.4 (0.6) | 5.8 (0.4) |
| 2TOB | 20 | 5.7 (0.6) | 5.1 (0.5) |
For NMR models, the first model was designated as the reference structure. Values in parentheses are standard deviations.
Predicted structures were ranked using the Rosetta scoring function (13).
Predicted structures were minimized using the AMBER force field and ranked based on their final energy.
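As a rough illustration of how entries of the form "mean (SD)" in the table above could be produced, the sketch below computes the RMSD between two sets of backbone coordinates and then averages it over several predicted models. It assumes the models have already been superimposed on the reference structure (no fitting is done here), and it uses the sample standard deviation; the source does not state which form of standard deviation was used. All function names are hypothetical.

```python
import math
import statistics


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z).

    Assumes the two coordinate sets are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def summarize(reference, models):
    """Return (mean, sample SD) of the RMSD of each model to the reference."""
    values = [rmsd(reference, model) for model in models]
    return statistics.mean(values), statistics.stdev(values)


# Toy coordinates, for illustration only.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(3.0, 4.0, 0.0), (4.0, 4.0, 0.0)]
mean_rmsd, sd_rmsd = summarize(reference, [reference, shifted])
print(f"{mean_rmsd:.1f} ({sd_rmsd:.1f})")
```

In practice the superposition step matters as much as the RMSD itself, so a real comparison would fit the backbone atoms first.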
Ranking and binding energy of the native RNA–ligand complex in a pool of 5000 generated 3D structures, based on the docking procedure
| Ligand | Native rank | Predicted ΔG (kcal/mol) | Expt. ΔG (kcal/mol) | Expt. Kd (µM) |
|---|---|---|---|---|
| Codeine | 216 | −9.64 | −7.62 | 2.56 |
| FMN | 116 | −7.67 | −8.59 | 0.50 |
| Gentamicin | 127 | −11.4 | −10.9 | 0.01 |
| Guanine | 131 | −7.87 | −7.83 | 1.80 |
| Isoleucine | 57 | −4.24 | −3.98 | 1200 |
| Theophylline | 102 | −4.91 | −8.72 | 0.40 |
The RNA structures were ranked by their binding affinity to the small-molecule ligand. The native aptamer structure is typically ranked among the top 5% of the structures.
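The experimental columns of the table above are consistent with the standard relation ΔG = RT ln Kd, with Kd in mol/L. The sketch below recomputes the experimental binding free energy from the dissociation constant and also checks each native rank against the 5% threshold of a 5000-structure pool. The temperature of 298 K is an assumption (the extracted text does not give it), as is the reading of the last column as Kd in µM; names are illustrative.

```python
import math

R_KCAL = 1.987e-3      # gas constant in kcal/(mol*K)
TEMPERATURE = 298.0    # K; assumed, not stated in the source
POOL_SIZE = 5000       # number of generated 3D structures per ligand

# (ligand, native_rank, predicted_dG, expt_dG, expt_Kd_uM) copied from the table.
rows = [
    ("Codeine",      216, -9.64, -7.62, 2.56),
    ("FMN",          116, -7.67, -8.59, 0.50),
    ("Gentamicin",   127, -11.4, -10.9, 0.01),
    ("Guanine",      131, -7.87, -7.83, 1.80),
    ("Isoleucine",    57, -4.24, -3.98, 1200),
    ("Theophylline", 102, -4.91, -8.72, 0.40),
]


def binding_free_energy(kd_micromolar, temperature=TEMPERATURE):
    """Return dG = RT ln(Kd) in kcal/mol for a Kd given in micromolar."""
    return R_KCAL * temperature * math.log(kd_micromolar * 1e-6)


for ligand, rank, predicted, expt_dg, kd in rows:
    dg = binding_free_energy(kd)
    percentile = 100.0 * rank / POOL_SIZE
    print(f"{ligand:<13} dG from Kd {dg:7.2f}  tabulated {expt_dg:6.2f}  "
          f"rank {rank} = top {percentile:.1f}%")
```

Under these assumptions the recomputed values match the tabulated experimental ΔG to within rounding, and every native rank falls inside the top 5% (rank 250 or better).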
Figure 4. Comparison of predicted docking structures with the experimental results for the gentamicin-binding aptamer. (a) The highest-scoring predicted docking pose of gentamicin (yellow) and the experimental pose (red) inside the experimental NMR structure of the RNA aptamer. The first NMR model was taken as the reference structure. (b) The predicted docking pose of gentamicin (blue) and the experimental configuration (red) inside the predicted structure of the RNA aptamer. The predicted RNA structure was aligned with the experimental structure using backbone atoms. The experimental pose of the gentamicin molecule was kept in the same position as in the experimental aptamer.