Yoshito Sawada, Shinya Honda.
Abstract
Integration of knowledge on the sequence-structure correlation of proteins provides a basis for the structural design of novel artificial proteins. One effective strategy is to treat a short segment, intermediate in size between an amino acid and a domain, as the correlation unit for exploring the structure-to-sequence relationship. Here we report the development of a database called ProSeg, which consists of two sub-databases, Segment DB and Cluster DB. Segment DB contains tens of thousands of segments prepared by dividing the primary sequences of 370 proteins using a sliding L-residue window (L = 5, 9, 11, 15). These segments were classified into several thousand clusters according to their three-dimensional structural resemblance. Cluster DB contains cluster-related information, including image, rank, frequency, secondary structure assignment, and sequence profile, among others. Users can search for a suitable cluster by entering a parameter (PDB ID, dihedral angles, or DSSP symbols) that identifies the backbone structure of a query segment. By analogy with a language, ProSeg can be regarded as a 'structure-sequence dictionary' that contains over 10,000 'protein words'. ProSeg is freely accessible through the Internet (http://riodb.ibase.aist.go.jp/proseg/).
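The sliding-window segmentation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the step size of one residue, and the toy sequence are assumptions made for illustration only; the window lengths (5, 9, 11, 15) are the ones stated in the abstract.

```python
# Illustrative sketch only; names and the one-residue step are assumptions,
# not details taken from the ProSeg implementation.

WINDOW_SIZES = (5, 9, 11, 15)  # window lengths L stated in the abstract


def sliding_segments(sequence, window):
    """Return every contiguous segment of `window` residues,
    advancing one residue at a time (assumed step size)."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    # A sequence shorter than the window yields an empty list.
    return [sequence[i:i + window] for i in range(len(sequence) - window + 1)]


def segment_protein(sequence, window_sizes=WINDOW_SIZES):
    """Map each window length L to the list of L-residue segments."""
    return {L: sliding_segments(sequence, L) for L in window_sizes}


if __name__ == "__main__":
    toy_sequence = "GEWTYDDATKTFTVTE"  # arbitrary 16-residue example
    for L, segments in segment_protein(toy_sequence).items():
        print(L, len(segments), segments[0])
```

With a step of one residue, a sequence of N residues yields N - L + 1 segments for each window length L, which is consistent with 370 proteins producing tens of thousands of segments. The subsequent clustering by three-dimensional resemblance is not sketched here because the abstract does not specify the method.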
Year: 2008 PMID: 18931918 DOI: 10.1007/s10822-008-9248-x
Source DB: PubMed Journal: J Comput Aided Mol Des ISSN: 0920-654X Impact factor: 3.686