W Zheng1, S J Cho, I I Vaisman, A Tropsha.
Abstract
We propose new algorithms for sequence-structure compatibility (fold recognition) searches in multi-dimensional sequence-structure space. Individual amino acid residues in protein structures are represented by their Cα atoms; thus each protein is described as a collection of points in three-dimensional space. Delaunay tessellation of a protein generates an aggregate of space-filling, irregular tetrahedra, or Delaunay simplices. Statistical analysis of the quadruplet residue compositions of all Delaunay simplices in a representative dataset of protein structures leads to a novel four-body contact residue potential expressed as a log-likelihood factor q. The q factors are calculated for the native 20-letter amino acid alphabet and for several reduced alphabets. Two sequence-structure compatibility functions are computed as (i) the sum of q factors for all Delaunay simplices in a given protein, or (ii) 3D-1D Delaunay tessellation profiles, where the profile value of an individual residue is calculated as the sum of q factors for all simplices that share this residue as a vertex. Both threading functions have been implemented in structure-recognizes-sequence and sequence-recognizes-structure protocols for protein fold recognition. We find that both the profile-based and the total-score-based threading functions can distinguish the native fold from incorrect folds for a sequence, and the native sequence from non-native sequences for a fold.
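The two compatibility functions described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract states only that q is a log-likelihood factor, so the sketch assumes the form q = log(f / p), where f is the observed frequency of a quadruplet composition and p is the frequency expected from the overall residue frequencies, multiplied by a permutation factor for repeated residue types. The natural logarithm, the function names (`q_factors`, `total_score`, `profile`), and the input layout are likewise assumptions. The simplices are taken as given, as 4-tuples of residue indices; in practice they could come from a Delaunay tessellation of the Cα coordinates, for example via `scipy.spatial.Delaunay`.

```python
import math
from collections import Counter


def quad_key(residues):
    """Order-independent key for the composition of one simplex."""
    return "".join(sorted(residues))


def q_factors(structures):
    """Estimate q for every quadruplet composition seen in a dataset.

    structures: list of (sequence, simplices) pairs, where each simplex
    is a 4-tuple of residue indices into the sequence.
    Assumed form (not given in the abstract): q = log(f / p).
    """
    quad_counts = Counter()
    res_counts = Counter()
    for seq, simplices in structures:
        res_counts.update(seq)
        for s in simplices:
            quad_counts[quad_key(seq[i] for i in s)] += 1

    n_quads = sum(quad_counts.values())
    n_res = sum(res_counts.values())
    q = {}
    for key, n in quad_counts.items():
        f = n / n_quads  # observed frequency of this composition
        # Number of distinct orderings of the four residue types.
        perm = math.factorial(4)
        for mult in Counter(key).values():
            perm //= math.factorial(mult)
        # Expected frequency if residues were placed independently.
        p = perm * math.prod(res_counts[r] / n_res for r in key)
        q[key] = math.log(f / p)
    return q


def total_score(seq, simplices, q, default=0.0):
    """Function (i): sum of q over all simplices of the structure."""
    return sum(q.get(quad_key(seq[i] for i in s), default) for s in simplices)


def profile(seq, simplices, q, default=0.0):
    """Function (ii): per-residue sum of q over simplices sharing that vertex."""
    prof = [0.0] * len(seq)
    for s in simplices:
        value = q.get(quad_key(seq[i] for i in s), default)
        for i in s:
            prof[i] += value
    return prof


# Toy data, invented purely for illustration.
toy_seq = "AAAAL"
toy_simplices = [(0, 1, 2, 3), (1, 2, 3, 4)]
toy_q = q_factors([(toy_seq, toy_simplices)])
toy_total = total_score(toy_seq, toy_simplices, toy_q)
toy_profile = profile(toy_seq, toy_simplices, toy_q)
```

In a threading setting, a candidate sequence would be mapped onto the simplices of a candidate structure and scored with either function. Compositions absent from the training data fall back to `default`, which is a simplification chosen for this sketch.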
Year: 1997 PMID: 9390317
Source DB: PubMed Journal: Pac Symp Biocomput ISSN: 2335-6928