OPUS-CSF: A C-atom-based scoring function for ranking protein structural models.

Gang Xu1, Tianqi Ma2,3, Tianwu Zang2,3, Qinghua Wang4, Jianpeng Ma1,2,3,4.   

Abstract

We report a C-atom-based scoring function, named OPUS-CSF, for ranking protein structural models. Rather than using the traditional Boltzmann formula, we built a scoring function (CSF score) based on the native distributions (derived from the entire PDB) of coordinate components of mainchain C (carbonyl) atoms on selected residues of peptide segments of 5, 7, 9, and 11 residues in length. When tested on decoy recognition, OPUS-CSF recognized up to 257 native structures out of 278 targets in 11 commonly used decoy sets, significantly outperforming other popular all-atom empirical potentials. The average correlation coefficient with TM-score was also comparable with those of other potentials. OPUS-CSF is a highly coarse-grained scoring function that requires only partial mainchain information as input and is very fast. Thus, it is suitable for applications at the early stage of structure building.
© 2017 The Authors Protein Science published by Wiley Periodicals, Inc. on behalf of The Protein Society.

Keywords:  coarse-graining; decoy recognition; protein folding; protein structure modeling; scoring function

Year:  2017        PMID: 29047165      PMCID: PMC5734313          DOI: 10.1002/pro.3327

Source DB:  PubMed          Journal:  Protein Sci        ISSN: 0961-8368            Impact factor:   6.725


Introduction

A potential function plays a central role in predicting protein structures. Generally, there are two kinds of potential functions: physics‐based potentials and knowledge‐based potentials. Physics‐based potentials are typically all‐atom molecular mechanics force fields,1, 2, 3, 4, 5 such as CHARMM1, 2 and AMBER.4 They also include coarse‐grained potentials such as MARTINI,6 UNRES,7, 8 and OPEP.9 Knowledge‐based potentials are derived from statistical analysis of known structures and are widely used in structural prediction.10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41 They usually perform better than physics‐based potentials in structural prediction. In general, knowledge‐based potentials can be constructed either at the coarse‐grained residue level17, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31 or at the atomic level.32, 33, 34, 35, 36, 37, 38, 39, 40, 41 Although coarse‐grained potentials may be less rigorous, they help to focus on essential features and exclude less important details, thus reducing computational cost.42, 43 The performance of a coarse‐grained potential depends on how the coarse‐graining scheme is designed. For example, the OPUS‐Ca potential30 uses the positions of Cα atoms as input and calculates other atomic positions as pseudo‐positions, which significantly reduces the computing cost. Other applications of coarse‐grained models using Cα positions have also been reported in the literature.44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55 In this work, unlike traditional empirical potential functions that use the Boltzmann formula, we built a scoring function based on the native distributions of coordinate components of mainchain C (carbonyl) atoms on a few selected residues of small peptide segments of 5, 7, 9, and 11 residues in length.
A lookup table, termed the configurational native distribution (CND) lookup table, was first generated for native distributions of coordinate components by analyzing peptide segments in the entire Protein Data Bank (PDB). Then the scoring function, termed the CSF scoring function, was calculated for a particular test structure by comparing the information of its segments with the CND lookup table. The performance of OPUS‐CSF was tested on 11 commonly used decoy sets. The results indicated that OPUS‐CSF was able to identify significantly more native structures from their decoys than other empirical potentials. The correlation coefficients between CSF scores and TM‐scores were comparable to those of popular all‐atom empirical potentials. Most importantly, OPUS‐CSF achieved this performance despite its highly coarse‐grained nature. This indicates the advantages of OPUS‐CSF in terms of its speed and its applicability in the early stage of structural modeling, which is vitally important for applications such as building structural models from intermediate‐resolution data obtained with experimental techniques like cryogenic electron microscopy (cryo‐EM).

Results and Discussion

We compared the performance of OPUS‐CSF on 11 commonly used decoy sets with that of popular all‐atom potential functions. Table 1 lists the results of the 5‐residue segment case (OPUS‐CSF5) and the all‐segment combined case (OPUS‐CSF). In the 5‐residue segment case, OPUS‐CSF5 successfully recognized 244 out of 278 native structures from their decoys and had an average Z‐score (−3.56) nearly identical to that of GOAP (−3.57). In the combined segment case, OPUS‐CSF performed even better: it successfully recognized 257 out of 278 native structures from their decoys and had an average Z‐score (−4.12) better than that of GOAP (−3.57). It is interesting that, although OPUS‐CSF is a highly coarse‐grained scoring function, its performance is significantly better than that of the all‐atom potentials.
Table 1

The results of OPUS‐CSF5 (5‐residue segment) and OPUS‐CSF (combined segment length) on 11 decoy sets compared with different potentials (a)

| Decoy set | Total # of targets | DFIRE | RWplus | dDFIRE | OPUS‐PSP | GOAP | OPUS‐CSF5 | OPUS‐CSF |
|---|---|---|---|---|---|---|---|---|
| 4state_reduced | 7 | 6 (−3.48) | 6 (−3.51) | 7 (−4.15) | 7 (−4.49)* | 7 (−4.38) | 7 (−3.38) | 7 (−3.31) |
| fisa | 4 | 3 (−4.87)* | 3 (−4.79) | 3 (−3.80) | 3 (−4.24) | 3 (−3.97) | 2 (−2.31) | 2 (−2.55) |
| fisa_casp3 | 5 | 4 (−4.80) | 4 (−5.17) | 4 (−4.83) | 5 (−6.33)* | 5 (−5.27) | 4 (−4.38) | 4 (−6.72) |
| hg_structal | 29 | 12 (−1.97) | 12 (−1.74) | 16 (−1.33) | 18 (1.87) | 22 (−2.73) | 23 (−2.07)* | 23 (−2.06) |
| ig_structal | 61 | 0 (0.92) | 0 (1.11) | 26 (−1.02) | 20 (0.69) | 47 (−1.62) | 49 (−2.03) | 56 (−2.14)* |
| ig_structal_hires | 20 | 0 (0.17) | 0 (0.32) | 16 (−2.05) | 14 (−0.77) | 18 (−2.35) | 19 (−2.19) | 20 (−2.08)* |
| I‐TASSER | 56 | 49 (−4.02) | 56 (−5.77) | 48 (−5.03) | 55 (−7.43) | 45 (−5.36) | 55 (−5.32) | 56 (−6.39)* |
| lattice_ssfit | 8 | 8 (−9.44) | 8 (−8.85) | 8 (−10.12) | 8 (−6.75) | 8 (−8.38) | 8 (−9.56) | 8 (−11.79)* |
| lmds | 10 | 7 (−0.88) | 7 (−1.03) | 6 (−2.44) | 8 (−5.63) | 7 (−4.07) | 8 (−5.47) | 8 (−6.80)* |
| MOULDER | 20 | 19 (−2.97) | 19 (−2.84) | 18 (−2.74) | 19 (−4.84) | 19 (−3.58) | 20 (−3.18)* | 20 (−3.16) |
| ROSETTA | 58 | 20 (−1.82) | 20 (−1.47) | 12 (−0.83) | 39 (−3.00) | 45 (−3.70) | 49 (−3.68) | 53 (−4.53)* |
| Total | 278 | 128 (−1.94) | 135 (−2.13) | 164 (−2.52) | 196 (−2.86) | 226 (−3.57) | 244 (−3.56) | 257 (−4.12)* |

(a) The results of other potentials come from the GOAP paper. The numbers of targets whose native structures were successfully recognized by the various potentials are listed in the table. The numbers in parentheses are the average Z‐scores of the native structures. The larger the absolute value of the Z‐score, the better. Out of the total 278 targets in 11 decoy sets, OPUS‐CSF5 (5‐residue segment) recognized 244 and OPUS‐CSF (combined segment length) recognized 257 native structures from their decoys. The entry marked with an asterisk (*) in each row is the best one among all the potential functions for that particular decoy set (if the numbers of targets are the same, the marked entry is the one with the better Z‐score).
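The two quantities reported in Table 1 can be illustrated with a minimal Python sketch. The paper does not spell out the exact Z‐score convention, so the sketch assumes the common one: the native score minus the mean decoy score, divided by the population standard deviation of the decoy scores. The function names and the toy scores are hypothetical.

```python
from statistics import mean, pstdev


def native_z_score(native_score, decoy_scores):
    """Z-score of the native score relative to the decoy scores.

    Assumption: mean and (population) standard deviation are taken over
    the decoys only; the paper does not state which convention it uses.
    """
    mu = mean(decoy_scores)
    sigma = pstdev(decoy_scores)
    if sigma == 0:
        raise ValueError("decoy scores have zero spread")
    return (native_score - mu) / sigma


def native_is_recognized(native_score, decoy_scores):
    """True when the native scores strictly lower than every decoy."""
    return all(native_score < s for s in decoy_scores)


# Toy, made-up scores for one target (lower score = more native-like).
decoys = [12.0, 14.0, 16.0, 18.0]
print(native_z_score(9.0, decoys), native_is_recognized(9.0, decoys))
```

Under this convention a more negative Z‐score means the native structure is better separated from its decoys, which matches the reading of Table 1.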

We also calculated the Pearson correlation coefficients between CSF score and TM‐score56 in all decoy sets. The results are shown in Table 2. OPUS‐CSF has an average correlation coefficient comparable with those of GOAP and OPUS‐PSP, despite the fact that OPUS‐CSF is highly coarse‐grained and the other two are all‐atom potentials.
Table 2

Average Pearson correlation coefficients of CSF scores with TM‐scores (a)

| Decoy set | OPUS‐PSP | GOAP | OPUS‐CSF |
|---|---|---|---|
| 4state_reduced | −0.589 | −0.694* | −0.667 |
| fisa | −0.282 | −0.347 | −0.552* |
| fisa_casp3 | −0.095 | −0.221 | −0.333* |
| hg_structal | −0.752 | −0.825* | −0.803 |
| ig_structal | −0.779 | −0.865 | −0.882* |
| ig_structal_hires | −0.832 | −0.885 | −0.901* |
| I‐TASSER | −0.284 | −0.477* | −0.452 |
| lattice_ssfit | −0.051 | −0.058 | −0.151* |
| lmds | −0.091 | −0.146 | −0.342* |
| MOULDER | −0.802 | −0.886* | −0.863 |
| ROSETTA | −0.343 | −0.476* | −0.391 |
| Average | −0.521 | −0.632 | −0.624 |

(a) The correlation coefficient of a decoy set is the average coefficient of all targets in that decoy set. In calculating the correlation coefficients, the native structure was excluded. OPUS‐CSF has an average correlation coefficient comparable with those of the other two potentials. The entry marked with an asterisk (*) in each row is the best one among the three potential functions for that particular decoy set. For OPUS‐CSF, only the results for the combined segment case are listed.
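The per‐set averaging described in the footnote can be sketched as follows. This is only an illustration: the function names and the input layout (one pair of score list and TM‐score list per target, with the native already removed) are assumptions, not the authors' code.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return cov / sqrt(vx * vy)


def decoy_set_correlation(targets):
    """Average the per-target correlations of one decoy set.

    `targets` is a list of (scores, tm_scores) pairs, one per target,
    holding decoys only (the native structure is excluded beforehand).
    """
    return sum(pearson(s, t) for s, t in targets) / len(targets)


# Toy data: a good score should fall as the TM-score rises, so the
# correlation of a useful scoring function is negative.
print(pearson([3.0, 2.0, 1.0], [0.2, 0.5, 0.8]))
```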

For further analysis of the method, we use the 5‐residue segment case as an example. Figure 1 shows the histogram of the standard deviations of the coordinate components of mainchain C (carbonyl) atoms of the 1st and 5th residues in the CND lookup table. The distribution clearly peaks at a very small value, indicating that the coordinate components are clustered in a narrow distribution; that is, the configurational distributions of the 5‐residue peptide segments are narrow,57 which provides a foundation for the success of OPUS‐CSF. The narrow configurational distribution of small peptide fragments has also been seen in other studies.58 The average value of the standard deviation is 1.20 Å.
Figure 1

The histogram of standard deviations of the coordinate components in the CND lookup table for 5‐residue segment case. The distribution peaks at a very small value of standard deviation indicating that the coordinate components of the 1st and 5th mainchain C (carbonyl) are clustered in a narrow distribution, that is, the configurational distributions of the 5‐residue peptide segments are narrow. In addition, the average value of the standard deviation is 1.20 Å.

It should be noted that, in the implementation of OPUS‐CSF, we assume that the smaller the CSF score, the more likely the structure is to be native. This is an approximation, because even a native structure usually does not have a zero CSF score. However, the narrow distributions of the standard deviations of the coordinate components of mainchain C (carbonyl) atoms (Fig. 1) suggest small scores for the native structures. Figure 2 shows the population distribution of the CSF scores for the 278 native structures in the 11 decoy sets (per independent coordinate component). The average value of the native CSF scores is 0.84 and the standard deviation is 0.27. Thus, in native structures, the deviations of the coordinate components from their average values are, on average, less than one standard deviation of the coordinate component distribution in the CND lookup table. The fluctuation of the native CSF scores is also very small.
Figure 2

The population distribution of CSF scores for 278 native structures in 11 decoy sets. The X‐axis is the CSF score (per independent coordinate component variable). The Y‐axis is the histogram of the population.

Figure 3 shows the frequencies of sequence repeating in the CND lookup table in the 5‐residue case. In principle, the more times a sequence repeats in the PDB, the better the statistics for that sequence in the CND lookup table. In the 5‐residue case, half of the sequences repeat >26 times. The largest value on the X‐axis is 29,618, reached by one sequence. In constructing the CND lookup table, there is always a trade‐off between sequence diversity and sequence repeating frequency in the PDB.
Figure 3

The distribution of frequency of sequence repeating in the CND lookup table. The X‐axis is the repeating frequency, and the Y‐axis is the number of sequences with particular repeating frequency. Sequences that repeat less than five times were omitted in our study. Analysis of this distribution indicates that half of the sequences repeat >26 times. The largest value of X‐axis is 29,618 with one sequence, but not shown for the purpose of clarity.

We examined OPUS‐CSF using segments of different lengths. As the segment length increases, the ratio of the number of segments that appear at least five times to the total number of segments in the PDB naturally decreases (Table 3). Likewise, if coverage is defined as the ratio between the number of segments available in the CND lookup table and the total number of segments of a test sequence, the average coverage of the 11 decoy sets (278 targets in total) decreases as the segment length increases. A test sequence with <20% of its segments available in the CND lookup table, that is, with coverage <20%, is regarded as Unknown; the number of unknowns increases as the segment length increases. More details of OPUS‐CSF on different segment lengths can be found in the Supplemental Information.
Table 3

The results of OPUS‐CSF built with residue segments of different lengths (a)

| Segment length | Num_above5 | Num_all | Num_above5/Num_all |
|---|---|---|---|
| 5 residues | 1,766,273 | 2,350,969 | 0.751 |
| 7 residues | 3,736,778 | 9,544,858 | 0.391 |
| 9 residues | 3,713,506 | 10,262,243 | 0.362 |
| 11 residues | 3,743,204 | 10,698,802 | 0.350 |

(a) Num_above5 is the number of sequence segments that occur at least five times in the PDB. Num_all is the total number of sequence segments in the PDB. The ratio decreases as the segment length increases.
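The counting behind Table 3 can be sketched as a sliding‐window tally. The sketch assumes that both Num_above5 and Num_all count segment instances (not distinct sequences) and that the cutoff keeps sequences seen at least five times; the function names and toy sequences are hypothetical.

```python
from collections import Counter

MIN_OCCURRENCES = 5  # cutoff chosen empirically in the paper


def count_segments(sequences, length):
    """Count every window of `length` residues, sliding by one residue."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - length + 1):
            counts[seq[i:i + length]] += 1
    return counts


def usable_ratio(counts, min_occurrences=MIN_OCCURRENCES):
    """Fraction of segment instances whose sequence is seen often enough.

    Assumption: the ratio is taken over instances, which is one reading
    of Num_above5 / Num_all in Table 3.
    """
    num_all = sum(counts.values())
    num_above = sum(c for c in counts.values() if c >= min_occurrences)
    return num_above / num_all if num_all else 0.0


# Toy input: one repetitive chain and one chain of unique 5-mers.
toy_counts = count_segments(["AAAAAAAAA", "ACDEFG"], 5)
print(usable_ratio(toy_counts))
```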

The 5‐residue case delivers the best performance in terms of decoy recognition (244 out of 278 natives recognized; Table 4). However, the Z‐scores are better for the longer‐segment cases. This is probably because the longer segments preserve more sequence homology information.
Table 4

The performance of OPUS‐CSF based on residue segments of different lengths on the 11 decoy sets (a)

| | 5 residues | 7 residues | 9 residues | 11 residues |
|---|---|---|---|---|
| Success numbers | 244 (278) | 218 (278) | 220 (278) | 219 (278) |
| Z‐scores | −3.56 | −4.55 | −4.62 | −4.57 |
| Average coverage | 0.971 | 0.749 | 0.712 | 0.683 |
| Unknowns | 0 | 41 | 45 | 46 |

(a) Success numbers are the numbers of native structures that OPUS‐CSF correctly recognized from the decoys. Numbers in parentheses (278) are the total number of native structures (or targets) in the 11 decoy sets. The Z‐scores are calculated for the CSF scores of the native structures with respect to their decoys. Coverage is the ratio between the number of segments available in the CND lookup table and the total number of segments of a target sequence. The table shows the average coverage among the 278 targets in the 11 decoy sets. Unknowns are the numbers of target sequences that have <20% coverage. For these sequences, OPUS‐CSF is not applicable. Note that the 5‐residue case has no sequence classified as unknown, whereas the 7‐residue case, for example, has 41 out of 278 sequences to which OPUS‐CSF is not applicable. The number of unknowns increases slightly as the segment length increases. Note also that, in the combined segment case, the longer segments may make no contribution to the CSF score if they are unknowns. Since the 5‐residue segment case has no unknowns, it guarantees that OPUS‐CSF is applicable to all target sequences, even in the rare cases in which all longer segments are regarded as unknown.
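The coverage and Unknown definitions in the footnote translate directly into a few lines of Python. The sketch below is illustrative only; `known_segments` stands for the set of segment sequences present in a CND lookup table, and all names are hypothetical.

```python
UNKNOWN_THRESHOLD = 0.20  # coverage below 20% marks a target as Unknown


def coverage(sequence, length, known_segments):
    """Fraction of the sequence's segments that exist in the lookup table."""
    total = len(sequence) - length + 1
    if total <= 0:
        return 0.0  # sequence shorter than one segment
    found = sum(
        1 for i in range(total) if sequence[i:i + length] in known_segments
    )
    return found / total


def is_unknown(sequence, length, known_segments, threshold=UNKNOWN_THRESHOLD):
    """True when too few segments are available for scoring."""
    return coverage(sequence, length, known_segments) < threshold


# Toy lookup table with two known 5-residue segments.
known = {"ACDEF", "DEFGH"}
print(coverage("ACDEFGH", 5, known), is_unknown("ACDEFGH", 5, known))
```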

For the 5‐residue case, we also tested a scenario in which the CND lookup table was constructed using four residues (1, 2, 4, and 5) instead of the two terminal residues (1, 5). The number of natives recognized and the Z‐score are 226 and −3.60, whereas for (1, 5) they are 244 and −3.56 (Table 4). This is interesting, as it indicates that using the two terminal residues (1, 5) captures a better coarse‐graining level than using more residues (1, 2, 4, and 5). OPUS‐CSF has some obvious advantages. 
First, the CND lookup table is constructed directly from the entire PDB, and it contains all allowed configurational information of the native segments (at least for those repeated at least five times in the PDB). The results seem to indicate that this is better than Boltzmann formula based methods. Second, OPUS‐CSF is very fast, especially for longer polypeptide chains, because the entire chain is scanned once, linearly, and only partial mainchain atom coordinates are required to calculate the CSF score for a structure. Unlike other potentials such as GOAP40 and OPUS‐PSP,34 no inter‐atomic distances need to be calculated. We want to emphasize that, in modeling protein structures, an empirical potential function or scoring function should be fast and accurate. In the early stage of modeling, it is advantageous for the scoring function to require a minimal amount of structural information. In this regard, OPUS‐CSF seems to be a good choice.

Methods

Scanning through the polypeptide chain with a step size of one residue, we collected small peptide segments with sequence lengths of 5, 7, 9, and 11 residues and searched for their configurations in the entire PDB. In total, we downloaded 130,054 PDB structures on June 7, 2017 via ftp://ftp.wwpdb.org/pub/pdb/data/structures/divided/pdb. Sequences that appeared fewer than five times in the PDB were discarded. The number five was chosen empirically. Peptide segments with poorly resolved structures, such as those with broken bonds, were not included. Here we use the 5‐residue segment case as an example to illustrate the details of the procedure. The ratio of segments that appear at least five times to all segments in the PDB is 75.1%, which means that we can utilize 75.1% of the information in the whole PDB using 5‐residue segments (see also Table 3 in Results and Discussion). A local molecular coordinate system was defined for every segment using the positions of three mainchain atoms in the middle residue. The origin was set at the Cα atom; the X‐axis was defined along the line connecting the Cα and C (carbonyl) atoms; the Y‐axis was in the Cα–C–O plane, parallel to the component of the C–O vector perpendicular to the X‐axis; and the Z‐axis was defined correspondingly (Fig. 4).
Figure 4

Local molecular coordinate system in OPUS‐CSF defined by the mainchain atoms of the 3rd residue. The origin is on the Cα atom. The X‐axis is along the Cα–C line. The Y‐axis is in the plane of the Cα–C–O atoms, parallel to the component of the C–O vector orthogonal to the X‐axis. The Z‐axis is defined accordingly.
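The local coordinate system of Figure 4 can be built with a Gram‐Schmidt step, as in the minimal sketch below. The text only says the Z‐axis is "defined accordingly", so the right‐handed choice Z = X × Y is an assumption, as are the function names and the toy coordinates.

```python
from math import sqrt


def sub(a, b):
    return tuple(p - q for p, q in zip(a, b))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    norm = sqrt(dot(v, v))
    if norm == 0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(p / norm for p in v)


def local_frame(ca, c, o):
    """Unit axes (x, y, z) from the CA, C and O atoms of the middle residue."""
    x = normalize(sub(c, ca))                 # X along CA -> C
    co = sub(o, c)
    along_x = tuple(dot(co, x) * p for p in x)
    y = normalize(sub(co, along_x))           # part of C -> O perpendicular to X
    z = cross(x, y)                           # assumed right-handed frame
    return x, y, z


def to_local(point, ca, frame):
    """Coordinates of `point` in the frame whose origin is the CA atom."""
    d = sub(point, ca)
    return tuple(dot(d, axis) for axis in frame)


# Toy coordinates in angstroms (made up, not from a real structure).
frame = local_frame((1.0, 1.0, 1.0), (2.5, 1.0, 1.0), (3.0, 2.0, 1.0))
print(to_local((2.0, 3.0, 4.0), (1.0, 1.0, 1.0), frame))
```

Expressing the carbonyl C atoms of the flanking residues in this frame removes the dependence on the global position and orientation of the structure, which is what makes the per‐sequence distributions comparable across PDB entries.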

For a 5‐residue segment with a specific sequence, we saved the mainchain C (carbonyl) coordinates of the 1st and 5th residues in the local coordinate system, denoted as (x1, y1, z1) and (x5, y5, z5). We treated these coordinate components as six independent variables. By scanning through the entire PDB, we generated six independent distributions of these variables, called the configurational native distributions (CNDs) of 5‐residue segments. We then calculated the means and standard deviations of the distributions and kept them as the CND lookup table. For a test structure, we scanned through its sequence with 5‐residue segments. For each segment and its sequence, we obtained the Z‐scores of the six independent variables from the CND lookup table. Finally, we added up the absolute values of the Z‐scores of all variables for all segments; this sum is called the CSF score. We assume that the structure with the smallest CSF score has the largest likelihood of being the native structure. The segments of varying lengths are denoted as 5(1, 3, 5), 7(2, 4, 6), 9(1, 3, 5, 7, 9), and 11(2, 4, 6, 8, 10). In the notation 5(1, 3, 5), for example, the first number, 5, is the segment length; 1 and 5 in the parentheses are the residues whose C (carbonyl) atom positional distributions are recorded in the local coordinate system; and 3 is the residue on which the local coordinate system is defined. For 9(1, 3, 5, 7, 9) and 11(2, 4, 6, 8, 10), four atoms are used for recording mainchain C (carbonyl) positional distributions, so 12 independent variables are used in total. The CSF score can be calculated either from one particular segment length or by combining all segment lengths. 
In the case of combined segment lengths, the final CSF score is a linear sum of the CSF scores of the different segment lengths. No weighting function is introduced for the contributions of different segment lengths. The 11 commonly used decoy sets we used to test OPUS‐CSF are the same as those used in GOAP,40 including the decoy sets 4state_reduced,59 fisa,58 fisa_casp3,58 hg_structal, ig_structal, and ig_structal_hires (R. Samudrala, E. Huang, and M. Levitt, unpublished), I‐TASSER,39 lattice_ssfit,60, 61 lmds,62 MOULDER,63 and ROSETTA.64
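The lookup‐table construction and scoring described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the population standard deviation is used, segments absent from the table and components with zero spread are skipped, and all names and the data layout are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def build_cnd_table(observations, min_occurrences=5):
    """Map segment sequence -> (means, stds) of its coordinate components.

    `observations` yields (segment_sequence, coords) pairs, where coords
    holds the local coordinate components (6 values for 5-residue segments).
    """
    grouped = defaultdict(list)
    for seq, coords in observations:
        grouped[seq].append(coords)
    table = {}
    for seq, rows in grouped.items():
        if len(rows) < min_occurrences:
            continue  # too few occurrences for reliable statistics
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        stds = [sqrt(sum((v - m) ** 2 for v in col) / n)
                for col, m in zip(columns, means)]
        table[seq] = (means, stds)
    return table


def csf_score(segments, table):
    """Sum of |Z-score| over all components of all segments found in the table."""
    total = 0.0
    for seq, coords in segments:
        if seq not in table:
            continue  # assumption: unknown segments contribute nothing
        means, stds = table[seq]
        for value, m, s in zip(coords, means, stds):
            if s > 0:  # assumption: skip components with zero spread
                total += abs(value - m) / s
    return total


def combined_csf_score(segments_by_length, tables_by_length):
    """Unweighted sum of the CSF scores of the individual segment lengths."""
    return sum(csf_score(segments_by_length.get(length, []), table)
               for length, table in tables_by_length.items())
```

A structure is then ranked by this score, with the smallest value taken as the most native‐like. Because each segment needs only a dictionary lookup and a handful of arithmetic operations, the cost grows linearly with chain length, which is consistent with the speed argument made in Results and Discussion.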

Accessibility of OPUS‐CSF

The scoring function is freely available to the academic community.