Identification of protein motifs using conserved amino acid properties and partitioning techniques.

Abstract

Analyzing a set of protein sequences involves a fundamental relationship between the coherency of the set and the specificity of the motif that describes it. Motifs may be obscured by training sets that contain incoherent sequences, in part due to protein subclasses, contamination, or errors. We develop an algorithm for motif identification that systematically explores possible patterns of coherency within a set of protein sequences. Our algorithm constructs alternative partitions of the training set data, where one subset of each partition is presumed to contain coherent data and is used for forming a motif. The motif is represented by multiple overlapping amino acid groups based on evolutionary, biochemical, or physical properties. We demonstrate our method on a training set of reverse transcriptases that contains subclasses, sequence errors, misalignments, and contaminating sequences. Despite these complications, our program identifies a novel motif for the subclass of retroviral and retrovirus-related reverse transcriptases. This motif has a much higher specificity than previously reported motifs and suggests the importance of conserved hydrophilic and hydrophobic residues in the structure of reverse transcriptases.
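The two ideas in the abstract, describing each aligned position by an amino acid property group and setting aside part of the training set to sharpen the motif, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the `GROUPS` table, the function names, the leave-one-out partitioning, and the use of "number of constrained positions" as a crude stand-in for specificity are not taken from the paper, which explores partitions more systematically and measures specificity differently.

```python
# Illustrative property groups (assumption: NOT the groups used in the paper).
GROUPS = {
    "hydrophobic": set("AVLIMFWC"),
    "aromatic": set("FWYH"),
    "polar": set("STNQYC"),
    "positive": set("KRH"),
    "negative": set("DE"),
    "charged": set("KRHDE"),
    "small": set("GASCTP"),
}


def column_label(residues):
    """Return the most specific description covering every residue in a column."""
    distinct = set(residues)
    if len(distinct) == 1:
        return next(iter(distinct))  # fully conserved residue
    # Groups may overlap, so pick the smallest one that covers the column.
    covering = [
        (len(members), name)
        for name, members in GROUPS.items()
        if distinct <= members
    ]
    if not covering:
        return "x"  # wildcard: no group describes this column
    return "[" + min(covering)[1] + "]"


def build_motif(aligned):
    """Build a motif (one label per column) from equal-length aligned sequences."""
    return [column_label(column) for column in zip(*aligned)]


def specificity(motif):
    """Crude proxy for specificity: the number of non-wildcard positions."""
    return sum(1 for label in motif if label != "x")


def best_leave_one_out(aligned):
    """Try each partition that sets one sequence aside; keep the most specific motif.

    Returns (score, index_of_excluded_sequence_or_None, motif).
    """
    full = build_motif(aligned)
    best = (specificity(full), None, full)
    for i in range(len(aligned)):
        subset = aligned[:i] + aligned[i + 1:]
        motif = build_motif(subset)
        score = specificity(motif)
        if score > best[0]:
            best = (score, i, motif)
    return best


# Hypothetical toy alignment: the last sequence plays the role of a contaminant.
training_set = ["LDK", "IDR", "VDK", "GPW"]
result = best_leave_one_out(training_set)
```

In this toy input the full set yields only wildcards, while setting aside the fourth sequence leaves a coherent subset whose motif constrains every position. A realistic implementation would need to consider larger excluded subsets and score motifs against a background database rather than by counting positions.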

Year: 1995 PMID: 7584465

Source DB: PubMed Journal: Proc Int Conf Intell Syst Mol Biol ISSN: 1553-0833

Cited by

2 in total

1. Highly specific protein sequence motifs for genome analysis.

Authors: C G Nevill-Manning; T D Wu; D L Brutlag
Journal: Proc Natl Acad Sci U S A Date: 1998-05-26 Impact factor: 11.205

2. Evaluating deterministic motif significance measures in protein databases.

Authors: Pedro Gabriel Ferreira; Paulo J Azevedo
Journal: Algorithms Mol Biol Date: 2007-12-24 Impact factor: 1.405