Abstract
A novel representation of proteins is introduced that is independent of arbitrary decisions about which labels are assigned to the 20 natural amino acids. The approach assigns 20 unit vectors in a 20-dimensional vector space to the 20 natural amino acids. A protein is then represented by a walk, that is, a sequence of steps in the 20-dimensional space, analogous to a walk in the (x, y) plane in the case of binary strings. A straightforward numerical characterization of proteins is obtained from the distance matrix associated with the walk representing the protein in 20-dimensional space, which combines the information on the Euclidean distances between the various amino acids in the protein sequence. The Line Distance matrix offers an additional numerical characterization of proteins, while the lengths of the steps of the walk in 20-D space allow construction of a "protein profile," which represents the distribution of the average lengths of the steps and their powers.
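The walk and its distance matrix can be illustrated with a minimal sketch. The abstract does not spell out the construction, so the following rests on assumptions: each residue contributes one unit step along its own axis, the walk points are the running sums of those steps, and the distance matrix holds the Euclidean distances between walk points. The names `AMINO_ACIDS`, `walk`, and `distance_matrix` are hypothetical, and the Line Distance matrix and the protein profile are not attempted because the abstract does not define them.

```python
import math

# One-letter codes of the 20 natural amino acids. The ordering is arbitrary:
# it only decides which axis each residue gets (an assumption of this sketch).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def walk(sequence, alphabet=AMINO_ACIDS):
    """Return the points visited by the walk, one point per residue.

    Assumption: each residue adds a unit step along its own axis, so the
    point after k residues is the running sum of the first k unit vectors.
    """
    index = {aa: i for i, aa in enumerate(alphabet)}
    position = [0] * len(alphabet)
    points = []
    for aa in sequence:
        position[index[aa]] += 1
        points.append(tuple(position))
    return points


def distance_matrix(points):
    """Euclidean distance between every pair of walk points."""
    return [[math.dist(p, q) for q in points] for p in points]


if __name__ == "__main__":
    for row in distance_matrix(walk("ACA")):
        print(["%.3f" % d for d in row])
```

Under these assumptions, relabeling the amino acids only permutes the coordinate axes, which leaves every Euclidean distance unchanged. Passing a reordered `alphabet` to `walk` therefore yields the same distance matrix, which is one way to read the label independence claimed above.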
Year: 2008 PMID: 18484501 DOI: 10.1080/10629360802085066
Source DB: PubMed Journal: SAR QSAR Environ Res ISSN: 1026-776X Impact factor: 3.000