Cheng-Hong Yang1,2, Kuo-Chuan Wu1,3, Yu-Shiun Lin1, Li-Yeh Chuang4, Hsueh-Wei Chang5,6,7.
Abstract
BACKGROUND: The function of a protein is determined by its native protein structure. Among many protein prediction methods, the Hydrophobic-Polar (HP) model, an ab initio method, simplifies the protein folding prediction process in order to reduce the prediction complexity.
Keywords: Global search; Hydrophobic-polar (HP) model; IMOG; Ion motion optimization; Local search; Protein folding
Year: 2018 PMID: 30116298 PMCID: PMC6083565 DOI: 10.1186/s13040-018-0176-6
Source DB: PubMed Journal: BioData Min ISSN: 1756-0381 Impact factor: 2.522
Fig. 1 Flowchart for developing an IMO algorithm
Fig. 2 Example of a 2D HP model. (a) 2D triangular lattice model with six neighbors. (b) Illustration of the best solution in the 2D HP model. Each numbered arrow indicates the direction from the current amino acid to the next. The red dotted lines are H-H contacts
Fig. 3 Illustration of the best fit obtained from the 2D HP model
Fig. 4 Illustration of IMO with a local search by a greedy algorithm. The greedy local search randomly selects a position and searches its six neighbors for the best fitness
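The greedy local search described in the Fig. 4 caption can be sketched as follows. This is a minimal illustration of one reading of the caption, not the authors' implementation: it assumes a fold is encoded as a list of direction indices 0 to 5 (one per bond), that "position" means an index into that list, and that lower fitness is better. The names `greedy_step`, `moves`, and `fitness` are hypothetical.

```python
import random


def greedy_step(moves, fitness, rng=random):
    """Pick one random position and keep the best of its six direction values.

    moves   -- list of direction indices in range(6) (assumed encoding)
    fitness -- callable returning a number, lower is better (assumed)
    rng     -- object with a randrange method, e.g. random.Random(seed)
    """
    pos = rng.randrange(len(moves))
    best = list(moves)
    best_fit = fitness(best)
    for direction in range(6):
        candidate = list(moves)
        candidate[pos] = direction
        candidate_fit = fitness(candidate)
        if candidate_fit < best_fit:
            best, best_fit = candidate, candidate_fit
    # The input list is never modified; a new list is returned.
    return best, best_fit


if __name__ == "__main__":
    # Toy fitness for demonstration only: the sum of the direction indices.
    new_moves, new_fit = greedy_step([3, 3, 3], sum, random.Random(0))
    print(new_moves, new_fit)
```

In a real run, `fitness` would be the HP energy of the fold, with self-intersecting folds mapped to a large penalty such as `float("inf")`.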
Table 1 First benchmark of amino acid sequences in the HP model [9]
| Sequence | Length | Best energy*1 | Amino acid sequence*2 |
|---|---|---|---|
| 1 | 20 | −15 | (101001)20110(01)2 |
| 2 | 24 | −17 | 1(100)21(001)51 |
| 3 | 25 | −12 | (001)2(100001)31 |
| 4 | 36 | −24 | 0(0011)2(0)5(1)7(001100)2100 |
| 5 | 48 | −43 | 001(0011)2(0)5(1)10(0)6(1100)2100(1)5 |
| 6 | 50 | −41 | 1(10)4(1)4(0100)300(1000)210111(10)411 |
| 7 | 60 | – | 001110(1)8000(1)1001000(1)12(0)4(1)4011010 |
| 8 | 64 | – | (1)12(01)20(1100)2(1001)2(100)2(1100)2(10)2(1)12 |
*1 Best energy [7] value in the 2D HP model. E is defined from the number of H-H contacts h, i.e., E = −|ε|h, where |ε| is a positive constant, so E is expressed in units of |ε|. For simplicity, E is calculated as described previously [9, 10], using formulas (1) to (3)
*2 1 represents a hydrophobic (H) residue and 0 a polar (P) residue in the amino acid sequence; (…)i represents i-fold repetition of the enclosed subsequence
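To make the energy definition E = −|ε|h concrete, here is a minimal sketch that counts H-H contacts on a 2D triangular lattice and returns E in units of |ε|. It is not formulas (1) to (3) of the paper, which are not reproduced here. The axial-coordinate neighbor offsets, their numbering 0 to 5, the use of the letters H and P for residues, and the function names are all assumptions of this sketch.

```python
# Six neighbor offsets on a 2D triangular lattice, in axial coordinates.
# The numbering 0..5 is an assumption of this sketch, not the paper's.
DIRECTIONS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]


def fold(moves):
    """Place residues from (0, 0); return None if the walk self-intersects."""
    positions = [(0, 0)]
    for move in moves:
        dx, dy = DIRECTIONS[move]
        nxt = (positions[-1][0] + dx, positions[-1][1] + dy)
        if nxt in positions:
            return None
        positions.append(nxt)
    return positions


def energy(sequence, moves):
    """Return E = -h in units of |eps|; h counts non-bonded H-H contacts."""
    positions = fold(moves)
    if positions is None or len(positions) != len(sequence):
        raise ValueError("invalid fold for this sequence")
    index = {p: i for i, p in enumerate(positions)}
    h = 0
    for i, (x, y) in enumerate(positions):
        if sequence[i] != "H":
            continue
        for dx, dy in DIRECTIONS:
            j = index.get((x + dx, y + dy))
            # j > i + 1 skips chain neighbors and counts each pair once.
            if j is not None and j > i + 1 and sequence[j] == "H":
                h += 1
    return -h


if __name__ == "__main__":
    # Three residues folded into a triangle: one non-bonded H-H contact.
    print(energy("HHH", [0, 2]))
```

For example, seven H residues arranged as a center plus a ring of six have twelve lattice contacts, six of which are chain bonds, so the sketch reports six non-bonded contacts.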
Table 2 Second benchmark of amino acid sequences in the HP model [21]
| Sequence | Length | Best energy*1 | Amino acid sequence*2 |
|---|---|---|---|
| 1 | 12 | −11 | 1(10)51 |
| 2 | 14 | −11 | 1100(10)5 |
| 3 | 14 | −11 | 1(100)2(10)31 |
| 4 | 16 | −11 | 110(100)41 |
| 5 | 16 | −11 | 1(100)2(10)3010 |
| 6 | 17 | −11 | 1(100)51 |
| 7 | 17 | −17 | 1(11)711 |
| 8 | 20 | −17 | 1(100)2(10)3(01)31 |
| 9 | 20 | −17 | 1(10)41(001)31 |
| 10 | 21 | −17 | 1(100)2(10100)21011 |
| 11 | 21 | −17 | 110(100)2(10)2(100)211 |
| 12 | 21 | −17 | 1100(10)3(01)2(001)21 |
| 13 | 22 | −17 | 1(100)2(10)3(010)2011 |
| 14 | 23 | −25 | 11(10)9111 |
| 15 | 24 | −17 | 1(100)711 |
| 16 | 24 | −25 | 11(10)3(01)711 |
| 17 | 24 | −25 | 11(10)4(01)611 |
| 18 | 30 | −25 | 11(100)41(01001)200111 |
| 19 | 30 | −25 | 11(100)3(10)2(01)2(001)311 |
| 20 | 37 | −29 | 11(100)3(10)21(001)3(0)5(10)2111 |
*1 Best energy value in the 2D HP model. E is calculated as described previously [9, 10], using formulas (1) to (3)
*2 1 represents a hydrophobic (H) residue and 0 a polar (P) residue in the amino acid sequence; (…)i represents i-fold repetition of the enclosed subsequence
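The i-fold repetition notation in the sequence column can be expanded mechanically. The sketch below is illustrative only: because the repetition counts are printed inline in the tables above, it assumes each count is written explicitly after a caret, for example `1(10)^5 1`, with a space separating a count from any digit that follows it. The function name `expand` and this caret syntax are assumptions, not part of the source.

```python
import re


def expand(compact):
    """Expand a compact 0/1 sequence such as '1(10)^5 1' into a plain string."""
    # Expand every '(bits)^count' group first, while spaces still separate
    # a repetition count from the digits that follow it.
    expanded = re.sub(
        r"\(([01]+)\)\^(\d+)",
        lambda m: m.group(1) * int(m.group(2)),
        compact,
    )
    expanded = expanded.replace(" ", "")
    if not re.fullmatch(r"[01]*", expanded):
        raise ValueError("unexpected characters in sequence: " + compact)
    return expanded


if __name__ == "__main__":
    seq = expand("1(10)^5 1")
    print(seq, len(seq))
```

Comparing `len(expand(...))` with the Length column is a quick way to check that a sequence has been transcribed with the intended repetition counts.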
Comparison of algorithms studied here for optimal solutions
| Sequence*1 | SGA | HGA | TS | ERS-GA | HHGA | IMOG |
|---|---|---|---|---|---|---|
| 1 | −11 | | | | | |
| 2 | −10 | −13 | | −13 | | |
| 3 | −10 | −10 | | −12 | | |
| 4 | −16 | −19 | | −20 | −23 | |
| 5 | −26 | −32 | −40 | −32 | | −40 |
| 6 | −21 | −23 | NA | −30 | −38 | |
| 7 | −40 | −46 | | −55 | −66 | −67 |
| 8 | −33 | −46 | −50 | −47 | −63 | |
*1 Sequences from Table 1. Bold numbers indicate the best solution for the same test sequence
NA: not available; SGA: simple genetic algorithm; HGA: hybrid genetic algorithm [22]; TS: tabu search [23]; ERS-GA: elite-based reproduction strategy-genetic algorithm [9]; HHGA: hybrid of hill climbing and genetic algorithm [9]; IMOG: ion motion optimization with a greedy algorithm
Comparison of the best prediction results of IMO with MMA algorithm
| Sequence*1 | MMA | IMOG | Sequence*1 | MMA | IMOG |
|---|---|---|---|---|---|
| 1 | NA | | 11 | | |
| 2 | | | 12 | | |
| 3 | | | 13 | | |
| 4 | | | 14 | | |
| 5 | | | 15 | −16 | |
| 6 | | | 16 | | |
| 7 | | | 17 | | |
| 8 | | | 18 | −24 | |
| 9 | | | 19 | −24 | |
| 10 | | | 20 | −26 | |
*1 Sequences from Table 2. Bold numbers indicate the best solution for the same test sequence
NA: not available; MMA: multimeme algorithm using the new mating strategy based on the contact map memory [24]; IMOG: ion motion optimization with a greedy algorithm
Comparison of the best solutions and stabilities with other algorithms
| Sequence*1 | Best energy*2 | ERS-GA best | ERS-GA mean | HHGA best | HHGA mean | IMOG best | IMOG mean |
|---|---|---|---|---|---|---|---|
| 1 | −15 | | −12.50 | | −14.73 | | −14.73 |
| 2 | −17 | −13 | −10.20 | | −14.93 | | −14.93 |
| 3 | −12 | | −8.47 | | −11.57 | | −11.57 |
| 4 | −24 | −20 | −16.17 | −23 | −21.27 | −23 | −21.27 |
| 5 | −43 | −32 | −28.13 | | −37.30 | | −37.30 |
| 6 | −41 | −30 | −25.30 | −38 | −34.10 | −38 | −34.10 |
| 7 | – | −55 | −49.43 | −66 | −61.83 | −66 | −61.83 |
| 8 | – | −47 | −42.37 | −63 | −56.53 | −63 | −56.53 |
*1 Sequences from Table 1. Bold numbers indicate the best solution for the same test sequence
*2 Best energy value in the 2D HP model. E is calculated as described previously [9, 10], using formulas (1) to (3)
ERS-GA: elite-based reproduction strategy-genetic algorithm [9]; HHGA: hybrid of hill climbing and genetic algorithm [9]; IMOG: ion motion optimization with a greedy algorithm