Geometry-based distance for clustering amino acids.

Samira F Abushilah, Charles C Taylor, Arief Gusnanto.

Abstract

Clustering amino acids is one of the most challenging problems in the functional and structural prediction of proteins. Previous studies have proposed clusters based on measurements of physical and biochemical characteristics of the amino acids, such as volume, area, hydrophilicity, polarity, hydrogen bonding, shape, and charge. These characteristics, although important, are less directly related to protein structure than geometrical characteristics such as the dihedral angles between amino acids. We propose using the p-value from a test of equality of dihedral-angle distributions as the basis of a distance measure for the clustering. In this novel approach, an energy test is modified to deal with bivariate angular data, and the p-value is obtained via a permutation method. The results indicate that the clusters of amino acids have a sensible interpretation, with Glycine, Proline, and Asparagine each forming a distinct cluster. A simulation study suggests that this approach has good working characteristics for clustering amino acids.
© 2019 Informa UK Limited, trading as Taylor & Francis Group.
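The permutation energy test described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toroidal distance (shortest arc per angle, combined by a Euclidean norm), the function names, the number of permutations, and the simulated von Mises samples are all assumptions made for illustration only.

```python
import numpy as np

def torus_distance(a, b):
    """Pairwise distances between (phi, psi) pairs in radians.

    Assumption: each angular difference is wrapped to its shortest arc,
    then the two arcs are combined with a Euclidean norm.
    a has shape (n, 2), b has shape (m, 2); the result has shape (n, m).
    """
    diff = np.abs(a[:, None, :] - b[None, :, :]) % (2 * np.pi)
    diff = np.minimum(diff, 2 * np.pi - diff)  # shortest arc per angle
    return np.sqrt((diff ** 2).sum(axis=-1))

def energy_statistic(d, idx_x, idx_y):
    """Two-sample energy statistic from a precomputed distance matrix d."""
    n, m = len(idx_x), len(idx_y)
    between = d[np.ix_(idx_x, idx_y)].mean()
    within_x = d[np.ix_(idx_x, idx_x)].mean()
    within_y = d[np.ix_(idx_y, idx_y)].mean()
    return (n * m / (n + m)) * (2 * between - within_x - within_y)

def permutation_pvalue(x, y, n_perm=499, seed=0):
    """Permutation p-value for equality of two bivariate angular samples."""
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    d = torus_distance(pooled, pooled)  # computed once, reused per permutation
    n, total = len(x), len(pooled)
    observed = energy_statistic(d, np.arange(n), np.arange(n, total))
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(total)  # random relabelling of the pooled sample
        if energy_statistic(d, perm[:n], perm[n:]) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Hypothetical data: simulated dihedral angles, not real protein measurements.
rng = np.random.default_rng(1)
sample_a = np.column_stack([rng.vonmises(-1.0, 4.0, 40), rng.vonmises(2.0, 4.0, 40)])
sample_b = np.column_stack([rng.vonmises(-1.0, 4.0, 40), rng.vonmises(2.0, 4.0, 40)])
sample_c = np.column_stack([rng.vonmises(1.0, 4.0, 40), rng.vonmises(-1.0, 4.0, 40)])

p_same = permutation_pvalue(sample_a, sample_b)
p_diff = permutation_pvalue(sample_a, sample_c)
```

A small p-value suggests that two amino acids have different dihedral-angle distributions. One simple way to turn this into a dissimilarity, assumed here rather than taken from the paper, is `1 - p` for each pair of amino acids, with the resulting matrix passed to a hierarchical clustering routine.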

Keywords:  Circular distance; energy statistic; hierarchical clustering; permutation two-sample test; similarity indices; squared Euclidean distance

Year:  2019        PMID: 35707020      PMCID: PMC9041948          DOI: 10.1080/02664763.2019.1673324

Source DB:  PubMed          Journal:  J Appl Stat        ISSN: 0266-4763            Impact factor:   1.416


  5 in total

1.  Structure validation by Calpha geometry: phi,psi and Cbeta deviation.

Authors:  Simon C Lovell; Ian W Davis; W Bryan Arendall; Paul I W de Bakker; J Michael Word; Michael G Prisant; Jane S Richardson; David C Richardson
Journal:  Proteins       Date:  2003-02-15

2.  A new criterion and method for amino acid classification.

Authors:  Carolin Kosiol; Nick Goldman; Nigel H Buttimore
Journal:  J Theor Biol       Date:  2004-05-07       Impact factor: 2.691

3.  A generative, probabilistic model of local protein structure.

Authors:  Wouter Boomsma; Kanti V Mardia; Charles C Taylor; Jesper Ferkinghoff-Borg; Anders Krogh; Thomas Hamelryck
Journal:  Proc Natl Acad Sci U S A       Date:  2008-06-25       Impact factor: 11.205

4.  A new approach to clustering the amino acids.

Authors:  L E Stanfel
Journal:  J Theor Biol       Date:  1996-11-21       Impact factor: 2.691

5.  Use of fuzzy clustering technique and matrices to classify amino acids and its impact to Chou's pseudo amino acid composition.

Authors:  D N Georgiou; T E Karakasidis; J J Nieto; A Torres
Journal:  J Theor Biol       Date:  2008-11-12       Impact factor: 2.691
