Swakkhar Shatabda, M A Hakim Newton, Mahmood A Rashid, Duc Nghia Pham, Abdul Sattar.
Abstract
BACKGROUND: Given a protein's amino acid sequence, the protein structure prediction problem is to find a three-dimensional structure that has the native energy level. For many decades, it has been one of the most challenging problems in computational biology. A simplified version of the problem is to find an on-lattice self-avoiding walk that minimizes the interaction energy among the amino acids. Local search methods are widely preferred for the protein structure prediction problem because they find very good solutions quickly. However, they suffer from two main problems: re-visitation and stagnancy.
Year: 2013 PMID: 23368768 PMCID: PMC3549842 DOI: 10.1186/1471-2105-14-S2-S19
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
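The simplified problem stated in the abstract can be made concrete with a small sketch. It assumes a cubic lattice and an HP-style energy (each pair of non-consecutive hydrophobic residues on adjacent lattice points contributes -1); the abstract names neither the lattice nor the energy function, so both choices, and the names `MOVES`, `walk_positions`, and `hp_energy`, are illustrative only.

```python
# Unit steps on a cubic lattice (assumed lattice; direction letters as in Figure 1).
MOVES = {
    "E": (1, 0, 0), "W": (-1, 0, 0),
    "N": (0, 1, 0), "S": (0, -1, 0),
    "U": (0, 0, 1), "D": (0, 0, -1),
}


def walk_positions(encoding):
    """Turn a string of absolute directions into lattice coordinates."""
    pos = (0, 0, 0)
    path = [pos]
    for step in encoding:
        dx, dy, dz = MOVES[step]
        pos = (pos[0] + dx, pos[1] + dy, pos[2] + dz)
        path.append(pos)
    return path


def is_self_avoiding(path):
    """A walk is self-avoiding when no lattice point is visited twice."""
    return len(set(path)) == len(path)


def hp_energy(sequence, path):
    """Assumed HP-style energy: -1 per non-consecutive H-H lattice contact."""
    if len(sequence) != len(path):
        raise ValueError("one lattice point is needed per residue")
    if not is_self_avoiding(path):
        raise ValueError("walk is not self-avoiding")
    index = {p: i for i, p in enumerate(path)}
    energy = 0
    for i, p in enumerate(path):
        if sequence[i] != "H":
            continue
        for dx, dy, dz in MOVES.values():
            j = index.get((p[0] + dx, p[1] + dy, p[2] + dz))
            # j > i + 1 skips chain neighbours and counts each contact once.
            if j is not None and j > i + 1 and sequence[j] == "H":
                energy -= 1
    return energy
```

For example, the four-residue chain `"HHHH"` folded as `"ENW"` closes a unit square, so its first and last residues form one contact, whereas the straight walk `"EEE"` has none.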
Local Search Framework.
| Procedure localSearch() | Procedure selectMove() |
|---|---|
| initialize() | |
| initializeTabu() | |
| | |
| selectMonomerType() | |
| generateMoves() | |
| selectMove() | |
| performMove() | discard |
| updateCosts() | |
| | updateEliteSet() |
| storeLocalMinima() | return |
| | |
| | |
| initializeTabu() | |
| selectFromEliteSet() | |
| | |
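The step names in the table suggest a tabu-guided loop that restarts from stored local minima when the search stalls. The sketch below illustrates that general idea on a toy one-dimensional problem. The table gives step names only, so the loop structure, the greedy move choice, and the parameters `tenure`, `patience`, and `elite_size` are all assumptions, not the procedure from the paper.

```python
import random
from collections import deque


def local_search(start, neighbours, cost, max_iter=500,
                 tenure=4, patience=20, elite_size=3, seed=0):
    """Toy tabu search with an elite set of local minima (assumed control flow)."""
    rng = random.Random(seed)
    current = best = start
    tabu = deque(maxlen=tenure)   # recently visited states (cf. initializeTabu)
    elite = []                    # best local minima seen so far
    stalled = 0
    for _ in range(max_iter):
        moves = list(neighbours(current))                 # cf. generateMoves
        allowed = [s for s in moves if s not in tabu]
        # Store a true local minimum in the elite set (cf. storeLocalMinima).
        if all(cost(s) >= cost(current) for s in moves) and current not in elite:
            elite.append(current)
            elite.sort(key=cost)
            del elite[elite_size:]
        if allowed:
            tabu.append(current)
            current = min(allowed, key=cost)              # cf. selectMove / performMove
        if cost(current) < cost(best):
            best, stalled = current, 0
        else:
            stalled += 1
        # On stagnation, reset the tabu list and restart from an elite state.
        if (stalled >= patience or not allowed) and elite:
            tabu.clear()
            current = rng.choice(elite)                   # cf. selectFromEliteSet
            stalled = 0
    return best


def toy_cost(x):
    """Two basins: a shallow one at x = -6 and a deeper one at x = 5."""
    return min((x + 6) ** 2 + 3, (x - 5) ** 2)


def toy_neighbours(x):
    """Integer neighbours of x within the interval [-15, 15]."""
    return [y for y in (x - 1, x + 1) if -15 <= y <= 15]
```

Starting from `-10`, `local_search(-10, toy_neighbours, toy_cost)` first settles in the shallow basin; the tabu list then keeps it from stepping back, which carries it over the ridge into the deeper basin.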
Pseudo-code for Elite Set Methods.
| Procedure updateEliteSet() | Procedure selectFromEliteSet() |
|---|---|
| | |
| | |
| | |
| | |
| | |
| | return |
| | |
| | |
Figure 1. Isomorphic Encoding. Two identical structures in a cubic lattice with different absolute encodings: the structure on the left has the encoding "DSES", and the structure on the right has the encoding "UNEN", where D = Down, U = Up, N = North, S = South, E = East and W = West.
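The equivalence of the two encodings in Figure 1 can be checked by brute force. The sketch below tries all 24 proper rotations of the cubic lattice and takes the lexicographically smallest re-encoding as a canonical form. This is only an illustration under stated assumptions: the coordinate convention (E = +x, N = +y, U = +z) is assumed, reflections are not treated as equivalent, and the approach is a stand-in rather than the paper's `getNonIsoEncoding(s)` procedure.

```python
from itertools import permutations, product

# Assumed coordinate convention for the six absolute directions.
STEP = {
    "E": (1, 0, 0), "W": (-1, 0, 0),
    "N": (0, 1, 0), "S": (0, -1, 0),
    "U": (0, 0, 1), "D": (0, 0, -1),
}
NAME = {vec: name for name, vec in STEP.items()}


def cube_rotations():
    """Return the 24 proper rotations of the cube as (permutation, signs) pairs."""
    rotations = []
    for perm in permutations(range(3)):
        inversions = sum(perm[i] > perm[j]
                         for i in range(3) for j in range(i + 1, 3))
        perm_sign = -1 if inversions % 2 else 1
        for signs in product((1, -1), repeat=3):
            # Keep only signed permutations with determinant +1 (no reflections).
            if perm_sign * signs[0] * signs[1] * signs[2] == 1:
                rotations.append((perm, signs))
    return rotations


def rotate(vec, rotation):
    """Apply a signed axis permutation to a lattice vector."""
    perm, signs = rotation
    return tuple(signs[i] * vec[perm[i]] for i in range(3))


def canonical_encoding(encoding):
    """Smallest encoding (in string order) over all rotations of the structure."""
    return min(
        "".join(NAME[rotate(STEP[c], rot)] for c in encoding)
        for rot in cube_rotations()
    )


def are_isomorphic(a, b):
    """Two absolute encodings describe the same structure up to rotation."""
    return canonical_encoding(a) == canonical_encoding(b)
```

Under this convention, `are_isomorphic("DSES", "UNEN")` holds because a half-turn about the east-west axis maps D to U and S to N while leaving E fixed.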
Pseudo-code for Non-Isomorphic Encoding.
| Procedure getNonIsoEncoding(s) | |
|---|---|
| initMap() | |
| | |
| | |
| | |
| | |
| | |
| | |
| return | |
Experimental Results.
| Protein | Length | | LS-New | LS-Mem best | LS-Mem avg | LS-Mem rel. impr. | LS-Tabu best | LS-Tabu avg | LS-Tabu rel. impr. | LNS |
|---|---|---|---|---|---|---|---|---|---|---|
| R1 | 200 | -384 | - | -353 | -326 | 22.41% | -332 | -318 | 31.81% | -330 |
| R2 | 200 | -383 | - | -351 | -330 | 24.52% | -337 | -324 | 32.20% | -333 |
| R3 | 200 | -385 | - | -352 | -330 | 18.18% | -339 | -323 | 27.41% | -334 |
| f180_1 | 180 | -378* | - | -360 | -334 | 15.90% | -338 | -327 | 27.45% | -293 |
| f180_2 | 180 | -381* | - | -362 | -340 | 24.39% | -345 | -334 | 34.02% | -312 |
| f180_3 | 180 | -378 | - | -357 | -343 | 34.28% | -352 | -339 | 41.02% | -313 |
| 3no6 | 229 | -455 | - | -400 | -375 | 27.50% | -390 | -373 | 29.26% | - |
| 3mr7 | 189 | -355 | - | -311 | -292 | 19.04% | -301 | -287 | 25% | - |
| 3mse | 179 | -323 | - | -278 | -254 | 22.63% | -266 | -249 | 29.72% | - |
| 3mqz | 215 | -474 | - | -415 | -386 | 20.45% | -401 | -383 | 23.07% | - |
| 3on7 | 279 | ? | - | -499 | -463 | - | -491 | -461 | - | - |
| 3no3 | 258 | -494 | - | -397 | -361 | 11.27% | -388 | -359 | 12.59% | - |
The best and average energy levels achieved, and the relative improvements of our algorithm over the other algorithms, for the R and f180 instances and for instances taken from CASP.
Figure 2. Search Progress. Search progress of three algorithms for Protein R1 over 300 minutes.
Effect of Non-Isomorphic Encoding.
| Protein | Our Encoding: runtime (min) | Our Encoding: # of discards | Relative Encoding: runtime (min) | Relative Encoding: # of discards |
|---|---|---|---|---|
| R1 | 28.33 | 354712 | 39.6 | 91664 |
| R2 | 30.4 | 406572 | 42.6 | 91219 |
| R3 | 25.74 | 475357 | 42.6 | 92765 |
| f180_1 | 22.90 | 402738 | 35.4 | 103059 |
| f180_2 | 27.34 | 317317 | 39.0 | 93814 |
| f180_3 | 24.54 | 358326 | 37.8 | 89810 |
Comparison of runtime (in minutes) and number of discards when using our non-isomorphic encoding and the relative encoding [32] for the first 1 million iterations of the memory-based algorithm by Shatabda et al. [10].