Wouter Boomsma, Thomas Hamelryck.
Abstract
BACKGROUND: Various forms of the so-called loop closure problem are crucial to protein structure prediction methods. Given an N- and a C-terminal end, the problem consists of finding a suitable segment of a certain length that bridges the ends seamlessly. In homology modelling, the problem arises in predicting loop regions. In de novo protein structure prediction, the problem is encountered when implementing local moves for Markov Chain Monte Carlo simulations. Most loop closure algorithms keep the bond angles fixed or semi-fixed, and only vary the dihedral angles. This is appropriate for a full-atom protein backbone, since the bond angles can be considered as fixed, while the (φ, ψ) dihedral angles are variable. However, many de novo structure prediction methods use protein models that consist only of Cα atoms, or otherwise do not make use of all backbone atoms. These methods require a loop closure algorithm that alters both bond and dihedral angles, since the pseudo bond angle between three consecutive Cα atoms also varies considerably.
Year: 2005 PMID: 15985178 PMCID: PMC1192790 DOI: 10.1186/1471-2105-6-159
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. A protein segment's Cα trace. The Cα positions are numbered, and the pseudo bond angles θ and pseudo dihedrals τ are indicated. The segment has length 5, and is thus fully described by two pseudo dihedral and three pseudo bond angles.
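The (θ, τ) description in the Figure 1 caption can be illustrated with a minimal sketch that computes pseudo bond angles and pseudo dihedrals from a list of Cα coordinates. The function names (`pseudo_bond_angle`, `pseudo_dihedral`, `internal_coordinates`) and the sign convention of the dihedral are assumptions made for illustration; they are not taken from the article.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

def pseudo_bond_angle(p0, p1, p2):
    """Angle theta (radians) at p1 formed by three consecutive C-alpha atoms."""
    u, v = sub(p0, p1), sub(p2, p1)
    c = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

def pseudo_dihedral(p0, p1, p2, p3):
    """Dihedral tau (radians, in [-pi, pi]) of four consecutive C-alpha atoms.
    The sign convention is an assumption of this sketch."""
    b0, b1, b2 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n0, n1 = cross(b0, b1), cross(b1, b2)
    return math.atan2(norm(b1) * dot(b0, n1), dot(n0, n1))

def internal_coordinates(trace):
    """Return (thetas, taus) for a C-alpha trace given as a list of 3-tuples."""
    thetas = [pseudo_bond_angle(*trace[i:i + 3]) for i in range(len(trace) - 2)]
    taus = [pseudo_dihedral(*trace[i:i + 4]) for i in range(len(trace) - 3)]
    return thetas, taus

# Hypothetical planar zigzag of five points (not real protein coordinates).
trace = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (2, 1, 0), (2, 2, 0)]
thetas, taus = internal_coordinates(trace)
print(len(thetas), len(taus))
```

For a trace of five points the sketch yields three pseudo bond angles and two pseudo dihedrals, matching the count stated in the caption.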
Figure 2. The action of the FCCD algorithm in Cα space. The Cα traces of the moving, fixed and closed segments are shown in red, green and blue, respectively. The Cα atoms are represented as spheres. The labels f0, f1 and f2 indicate the three fixed vectors at the N-terminus that are initially common between the fixed and moving segments. The loop is closed when the three C-terminal vectors of the moving segment (labelled m, m, m) superimpose with an RMSD below the given threshold on the three C-terminal vectors of the fixed segment (labelled f, f, f). This figure and Figure 3 were made with PyMol.
[Pseudocode listing of the FCCD algorithm. Only fragments of the listing survive: the comment "# Start iteration over pivots", the assignments "Σ =" and "Γ =" (right-hand sides lost), and the closing comment "# Failed: RMSD threshold not reached before maxit".]
The accept function rejects or accepts the proposed rotation, based on the resulting (θ, τ) pair. The svd function performs singular value decomposition, and calc_rmsd calculates the RMSD between two lists of vectors.
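As a rough illustration of two of the helpers named above, the sketch below shows one way `calc_rmsd` and a closure test could look. It assumes the two vector lists are already expressed in the same coordinate frame, so no superposition is performed. The `accept` shown here is a hypothetical stand-in that only checks user-supplied (θ, τ) ranges; the article's actual acceptance criterion is not reproduced. The name `is_closed`, the threshold and the ranges are likewise illustrative assumptions.

```python
import math

def calc_rmsd(a, b):
    """RMSD between two equally long lists of 3D vectors (no superposition)."""
    if len(a) != len(b) or not a:
        raise ValueError("vector lists must be non-empty and of equal length")
    total = sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q))
    return math.sqrt(total / len(a))

def is_closed(moving_end, fixed_end, threshold):
    """True when the C-terminal vectors of the moving segment lie within
    `threshold` RMSD of those of the fixed segment."""
    return calc_rmsd(moving_end, fixed_end) < threshold

def accept(theta, tau, theta_range, tau_range=(-math.pi, math.pi)):
    """Hypothetical acceptance rule: keep a proposed rotation only if the
    resulting (theta, tau) pair falls inside the given ranges (radians)."""
    return (theta_range[0] <= theta <= theta_range[1]
            and tau_range[0] <= tau <= tau_range[1])

# Illustrative values only, not data from the article.
fixed_end = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
moving_end = [(x, y, z + 2.0) for x, y, z in fixed_end]
print(calc_rmsd(moving_end, fixed_end))
print(is_closed(moving_end, fixed_end, threshold=0.1))
```

In an iterative closure loop, a check like `is_closed` would be evaluated after each proposed rotation, with the iteration count capped at `maxit`.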
Performance of the FCCD algorithm for various segment lengths. The first and second number in columns 2–4 refer to unconstrained and constrained FCCD, respectively. Columns 2 and 3 respectively show the average time and number of iterations needed for closing a single segment successfully. The percentage of loops successfully closed in under 1000 iterations is shown in the last column.
| Segment length | Average time (ms) | Average iterations | % Closed |
| 5 | 4.5/51.7 | 14.0/27.0 | 99.90/86.50 |
| 10 | 5.2/28.3 | 10.5/16.8 | 99.40/98.20 |
| 15 | 5.6/28.6 | 7.8/12.1 | 99.60/99.40 |
| 20 | 6.2/27.1 | 6.3/9.0 | 99.80/99.40 |
| 25 | 7.6/31.7 | 5.5/7.6 | 99.00/99.90 |
| 30 | 7.1/31.0 | 4.4/6.3 | 99.70/99.40 |
Minimum RMSD (out of 1000 tries) between a fixed segment derived from a protein structure and a closed segment generated by FCCD. The length of the loops is shown between parentheses in the upper row.
| Loop (4) | RMSD | Loop (8) | RMSD | Loop (12) | RMSD |
| 1dvj, A, 20–23 | 0.59 | 1cru, A, 85–92 | 2.31 | 1cru, A, 358–369 | 3.37 |
| 1dys, A, 47–50 | 0.67 | 1ctq, A, 144–151 | 2.22 | 1ctq, A, 26–37 | 2.40 |
| 1egu, A, 404–407 | 0.61 | 1d8w, A, 334–341 | 2.04 | 1d4o, A, 88–99 | 3.20 |
| 1ej0, A, 74–77 | 0.61 | 1ds1, A, 20–27 | 2.20 | 1d8w, A, 43–54 | 2.74 |
| 1i0h, A, 123–126 | 0.73 | 1gk8, A, 122–129 | 2.20 | 1ds1, A, 282–293 | 3.16 |
| 1id0, A, 405–408 | 0.66 | 1i0h, A, 145–152 | 2.42 | 1dys, A, 291–302 | 2.90 |
| 1qnr, A, 195–198 | 0.54 | 1ixh, 106–113 | 1.98 | 1egu, A, 508–519 | 3.06 |
| 1qop, A, 44–47 | 0.58 | 1lam, 420–427 | 2.16 | 1f74, A, 11–22 | 3.12 |
| 1tca, 95–98 | 0.76 | 1qop, B, 14–21 | 2.17 | 1q1w, A, 31–42 | 3.04 |
| 1thf, D, 121–124 | 0.56 | 3chb, D, 51–58 | 1.97 | 1qop, A, 175–186 | 2.97 |
| Average RMSD | 0.63 | Average RMSD | 2.17 | Average RMSD | 3.00 |
Figure 3. Loops generated by FCCD (blue) that are close to real protein loops (green). The loops with lowest RMSD to a given loop of length 4 (top), 8 and 12 (bottom) are shown (loops 1qnr, A, 195–198, 3chb, D, 51–58 and 1ctq, A, 26–37). The N-terminus is at the left hand side.
SABMark identifiers of the 236 structures used as fold representatives
| 1ew6a_ | 1ail__ | 1l1la_ | 1kid__ | 1n8yc1 | 1gzhb1 | 1e5da1 | 1ep3b2 | 1ihoa_ | 1m0wa1 |
| 1dhs__ | 1gpua2 | 2lefa_ | 1nsta_ | 1eaf__ | 1iiba_ | 1d5ra2 | 1foha3 | 1gpua3 | 1crza2 |
| 3pvia_ | 1i6pa_ | 1e4ft1 | 1kx5d_ | 2pth__ | 1lu9a2 | 1dkla_ | 1fsga_ | 1m2oa3 | 2dpma_ |
| 1ajsa_ | 1fxoa_ | 3tgl__ | 1bx4a_ | 1mtyg_ | 1duvg2 | 1qopb_ | 1iata_ | 1k2yx2 | 1f0ka_ |
| 1ayl_1 | 1toaa_ | 8abp__ | 1nh8a1 | 1bi5a2 | 2mhr__ | 1a2pa_ | 3lzt__ | 1dkia_ | 1e7la2 |
| 1bf4a_ | 1bb8__ | 1kpf__ | 1mu5a2 | 1lfda_ | 1gpea2 | 1jqca_ | 1a2va2 | 1jfma_ | 1ll7a2 |
| 1cjxa1 | 1lo7a_ | 1fm0e_ | 1fs1b2 | 1o0wa2 | 1dtja_ | 1k0ra3 | 1evsa_ | 1jpdx2 | 1qd1a1 |
| 1d5ya3 | 1h3fa2 | 1iq0a3 | 1tig__ | 1xxaa_ | 1ck9a_ | 1gyxa_ | 1e5qa2 | 1ivsa2 | 1qbea_ |
| 3grs_3 | 1f08a_ | 1c7ka_ | 1lkka_ | 1dq3a3 | 1uox_1 | 12asa_ | 1bob__ | 1m4ja_ | 1dv5a_ |
| 1f5ma_ | 1k2ea_ | 1ei1a2 | 1jdw__ | 1ln1a_ | 2pola2 | 1f0ia1 | 1rl6a1 | 1fvia2 | 1j7la_ |
| 1is2a1 | 1e8ga2 | 1qr0a1 | 2dnja_ | 1kuua_ | 1qh5a_ | 1ii7a_ | 1b8pa2 | 1j7na3 | 1chua3 |
| 1f00i3 | 1grj_1 | 1nkd__ | 1mwxa3 | 1jp4a_ | 1ih7a2 | 1eula2 | 1gnla_ | 1maz__ | 2por__ |
| 4htci_ | 1es7b_ | 1tocr1 | 1d1la_ | 1fd3a_ | 1i8na_ | 1h8pa1 | 4sgbi_ | 1fltv_ | 1quba1 |
| 1d4va3 | 1tpg_2 | 1iuaa_ | 1fv5a_ | 1mdya_ | 1zmec1 | 1fjgn_ | 1eska_ | 1i50i2 | 1fbva4 |
| 1dmc__ | 1e53a_ | 1ezvb1 | 1jeqa1 | 1k3ea_ | 1rec__ | 1lm5a_ | 1k82a1 | 1jaja_ | 1m0ka_ |
| 1c0va_ | 1kqfc_ | 1ocrk_ | 1h67a_ | 2cpga_ | 1ljra1 | 1brwa1 | 1hs7a_ | 2cbla2 | 1jmxa2 |
| 1hyp__ | 1cuk_2 | 1ecwa_ | 1l9la_ | 1g7da_ | 1jkw_1 | 1dgna_ | 1iqpa1 | 1pa2a_ | 1ko9a1 |
| 1f1za1 | 1ks9a1 | 2sqca2 | 1d2ta_ | 1h3la_ | 1wer__ | 1b3ua_ | 1n1ba2 | 1poc__ | 1e79i_ |
| 1m1qa_ | 1enwa_ | 1g4ma1 | 1e5ba_ | 1qhoa2 | 1kv7a2 | 1l4ia2 | 1c8da_ | 1amm_1 | 1ca1_2 |
| 1phm_2 | 1d7pm_ | 1jjcb2 | 1flca1 | 1gr3a_ | 1mjsa_ | 1a8d_1 | 1lf6a2 | 1fqta_ | 1jb0e_ |
| 1jh2a_ | 1lcya1 | 1mgqa_ | 1hcia1 | 1b3qa2 | 1jlxa1 | 1dar_1 | 1exma2 | 1ejea_ | 1agja_ |
| 1e79d2 | 2rspa_ | 1h0ha1 | 1gtra1 | 2erl__ | 1btn__ | 1lf7a_ | 1jmxa5 | 1crua_ | 1m1xa4 |
| 1hx0a1 | 1goia1 | 1ciy_2 | 1daba_ | 3tdt__ | 1gg3a1 | 1pmi__ | 1bdo__ | 1h3ia2 | 1gppa_ |
| 1f39a_ | 1k6wa1 | 1jqna_ | 1lu9a1 | 1m6ia1 | 1o94a3 |