M De Maeyer, J Desmet, I Lasters. Center for Transgene Technology and Gene Therapy, Flanders Interuniversity Institute for Biotechnology, KU Leuven, Belgium.
Abstract
BACKGROUND: About a decade ago, the concept of rotamer libraries was introduced to model sidechains given known mainchain coordinates. Since then, several groups have developed methods to handle the challenging combinatorial problem that arises when searching rotamer libraries. To avoid a combinatorial explosion, the dead-end elimination method detects and eliminates rotamers that cannot be members of the global minimum energy conformation (GMEC). Several groups have applied and further developed this method in the fields of homology modelling and protein design.
RESULTS: This work improves both prediction accuracy and calculation speed. The proposed enhancements eliminate more than one-third of the possible rotameric states before the dead-end elimination method is applied. This is achieved by using a highly detailed rotamer library, which allows an energy-based rejection criterion to be applied safely, without risking the elimination of a GMEC rotamer. As a result, we gain in both modelling accuracy and computational speed. Being completely automated, the current implementation of dead-end elimination for sidechain prediction can be applied to proteins of any size on the high-end computer systems currently used in molecular modelling. The improved accuracy is highlighted in a comparative study on a collection of proteins of varying size for which score results have previously been published by multiple groups. Furthermore, we propose a new validation method that scores the modelled structure against the experimental data based on the volume overlap of the predicted and observed sidechains. This overlap criterion is discussed in relation to the classic RMSD and the frequently used ±40° window for comparing chi 1 and chi 2 angles.
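The ±40° window mentioned above can be illustrated with a minimal sketch. The function names `chi_deviation` and `within_window` are hypothetical and are not taken from the paper. The sketch only shows that dihedral angles must be compared on a circle, so that 175° and -175° count as 10° apart. It does not handle symmetric sidechains (for example the chi 2 flip of Phe or Tyr), and it does not implement the volume overlap criterion, which the abstract does not specify.

```python
from typing import Sequence


def chi_deviation(pred_deg: float, obs_deg: float) -> float:
    """Smallest absolute difference between two dihedral angles, in degrees (0 to 180)."""
    diff = (pred_deg - obs_deg) % 360.0
    return min(diff, 360.0 - diff)


def within_window(pred_chis: Sequence[float],
                  obs_chis: Sequence[float],
                  window: float = 40.0) -> bool:
    """True if every predicted chi angle lies within `window` degrees of the observed one."""
    return all(chi_deviation(p, o) <= window for p, o in zip(pred_chis, obs_chis))
```

For example, `within_window([60.0, 170.0], [65.0, -160.0])` compares chi 1 and chi 2 of one residue and accepts the prediction, because the chi 2 deviation across the ±180° boundary is 30°, not 330°.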
CONCLUSIONS: We have shown that a very detailed library allows the introduction of a safe energy threshold rejection criterion, thereby increasing both the execution speed and the accuracy of the modelling program. We speculate that the current method will allow the sidechain prediction of medium-sized proteins and complex protein interfaces involving up to 150 residues on low-end desktop computers.
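To make the two-stage idea concrete, the following is a minimal sketch of an energy threshold filter followed by the original dead-end elimination criterion: a rotamer r at position i is eliminated when its best-case energy (self energy plus the minimum pairwise energy at every other position) is still higher than the worst-case energy of some alternative rotamer t at the same position. The data layout, the function names, and in particular the form of the threshold (self energy relative to the best rotamer at that position, with an arbitrary `cutoff`) are assumptions made for illustration. The abstract does not state the threshold the authors use, and this is not their implementation.

```python
from typing import Dict, List, Tuple

# Assumed layout: self_e[i][r] is the energy of rotamer r at position i with the
# fixed template; pair_e[((i, r), (j, s))] with i < j is the pairwise energy.
PairTable = Dict[Tuple[Tuple[int, int], Tuple[int, int]], float]


def pair(pair_e: PairTable, i: int, r: int, j: int, s: int) -> float:
    """Look up a pairwise energy regardless of the order of the two positions."""
    return pair_e[((i, r), (j, s))] if i < j else pair_e[((j, s), (i, r))]


def threshold_filter(self_e: List[List[float]], cutoff: float) -> List[List[int]]:
    """Hypothetical pre-filter: drop rotamers far above the best self energy at their position."""
    alive = []
    for energies in self_e:
        best = min(energies)
        alive.append([r for r, e in enumerate(energies) if e - best <= cutoff])
    return alive


def dee_pass(self_e: List[List[float]], pair_e: PairTable, alive: List[List[int]]) -> bool:
    """One sweep of the original DEE criterion; returns True if any rotamer was removed."""
    n = len(self_e)
    changed = False
    for i in range(n):
        others = [j for j in range(n) if j != i]

        def lower(r: int) -> float:  # best case for rotamer r
            return self_e[i][r] + sum(
                min(pair(pair_e, i, r, j, s) for s in alive[j]) for j in others)

        def upper(t: int) -> float:  # worst case for rotamer t
            return self_e[i][t] + sum(
                max(pair(pair_e, i, t, j, s) for s in alive[j]) for j in others)

        best_upper = min(upper(t) for t in alive[i])
        keep = [r for r in alive[i] if lower(r) <= best_upper]
        if len(keep) < len(alive[i]):
            alive[i] = keep
            changed = True
    return changed


def eliminate(self_e: List[List[float]], pair_e: PairTable, cutoff: float) -> List[List[int]]:
    """Apply the threshold filter, then iterate DEE until no rotamer can be removed."""
    alive = threshold_filter(self_e, cutoff)
    while dee_pass(self_e, pair_e, alive):
        pass
    return alive


# Toy system: two positions; rotamer 2 at position 0 clashes with the template.
self_e = [[0.0, 1.0, 50.0], [0.0, 2.0]]
pair_e = {
    ((0, 0), (1, 0)): 5.0, ((0, 0), (1, 1)): 1.0,
    ((0, 1), (1, 0)): -1.0, ((0, 1), (1, 1)): -1.0,
}
survivors = eliminate(self_e, pair_e, cutoff=30.0)
```

In this toy system the threshold removes the clashing rotamer before any pairwise energies involving it are needed, which is where the speed gain comes from. Unlike DEE itself, such a threshold carries no guarantee of preserving the GMEC; the abstract attributes its safety to the level of detail of the rotamer library.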