Ranking and compacting binding segments of protein families using aligned pattern clusters.

Abstract

BACKGROUND: Discovering sequence patterns with variation can unveil functions of a protein family that are important for drug discovery. Exploring protein families with existing methods such as multiple sequence alignment is computationally expensive, so pattern search, known in bioinformatics as motif finding, is used instead. At present, however, combinatorial algorithms produce large solution sets, and probabilistic models require a richer representation of the amino acid associations. To overcome these shortcomings, we present a method for ranking and compacting these solutions in a new representation referred to as Aligned Pattern Clusters (APCs). To tackle the large solution set, our method yields a reduced set of candidate solutions without losing any information. To address the problem of representation, our method captures the amino acid associations and conservations of the aligned patterns. Our algorithm produces a set of APCs by discovering, pruning, aligning, and synthesizing patterns from the input sequences of a protein family.
RESULTS: Our algorithm identifies the binding or other functional segments, and their embedded residues, which are important drug targets, in the cytochrome c and ubiquitin protein families taken from UniProt. The results are independently confirmed by Pfam's multiple sequence alignment. For the cytochrome c family, the number of resulting patterns with variations is 76.62% lower than the number of original patterns without variations. Furthermore, all of the top four candidate APCs correspond to binding segments, each with one of its conserved amino acids as the binding residue. The discovered proximal APCs agree with Pfam and PROSITE results. Surprisingly, the distal binding site discovered by our algorithm is found by neither Pfam nor PROSITE, but is confirmed by the three-dimensional cytochrome c structure. When applied to the ubiquitin protein family, our results agree with Pfam and reveal six of the seven lysine binding residues as conserved aligned columns with an entropy redundancy measure of 1.0.
CONCLUSION: The discovery, ranking, reduction, and representation of a set of patterns are important for averting time-consuming and expensive simulations and experiments during proteomic studies and drug discovery.
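
The abstract reports conserved aligned columns with an entropy redundancy measure of 1.0 but does not define the measure. The sketch below shows one common way such a score could be computed for the columns of an aligned pattern cluster: one minus the Shannon entropy of the column, normalized by the maximum entropy. The normalization by log(20), the function names, and the toy patterns are all assumptions made for illustration and are not taken from the paper. Under this assumed definition, a column holding a single residue type scores 1.0.

```python
import math
from collections import Counter


def column_redundancy(column, alphabet_size=20):
    """Return 1 - H/H_max for one aligned column (assumed definition).

    H is the Shannon entropy of the residues in the column, and H_max is
    log(alphabet_size). A fully conserved column gives 1.0.
    """
    counts = Counter(column)
    total = len(column)
    entropy = -sum((n / total) * math.log(n / total) for n in counts.values())
    return 1.0 - entropy / math.log(alphabet_size)


def apc_redundancies(patterns):
    """Score every column of a set of equal-length aligned patterns."""
    if len({len(p) for p in patterns}) != 1:
        raise ValueError("aligned patterns must all have the same length")
    # zip(*patterns) walks the alignment column by column.
    return [column_redundancy(col) for col in zip(*patterns)]


# Hypothetical aligned patterns, invented for this example only.
toy_apc = ["CAACH", "CSGCH", "CQQCH", "CAGCH"]
scores = apc_redundancies(toy_apc)

# Indices of columns that are fully conserved under the assumed measure.
conserved = [i for i, r in enumerate(scores) if math.isclose(r, 1.0)]
print(conserved)
```

In this toy cluster the first, fourth, and fifth columns each contain one residue type, so they are the ones reported as conserved. Gap symbols, if present, would be counted as an ordinary symbol by this sketch, which may differ from how the authors treat them.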

Year: 2013 PMID: 24564874 PMCID: PMC3907781 DOI: 10.1186/1477-5956-11-S1-S8

Source DB: PubMed Journal: Proteome Sci ISSN: 1477-5956 Impact factor: 2.480

3 in total

1. Discovering co-occurring patterns and their biological significance in protein families.

Authors: En-Shiun Lee; Sanderz Fung; Ho-Yin Sze-To; Andrew K C Wong
Journal: BMC Bioinformatics Date: 2014-11-06 Impact factor: 3.169

2. Revealing Subtle Functional Subgroups in Class A Scavenger Receptors by Pattern Discovery and Disentanglement of Aligned Pattern Clusters.

Authors: Pei-Yuan Zhou; En-Shiun Annie Lee; Antonio Sze-To; Andrew K C Wong
Journal: Proteomes Date: 2018-02-08

3. Discovery and disentanglement of aligned residue associations from aligned pattern clusters to reveal subgroup characteristics.

Authors: Pei-Yuan Zhou; Antonio Sze-To; Andrew K C Wong
Journal: BMC Med Genomics Date: 2018-11-20 Impact factor: 3.063