
Rapid detection, classification and accurate alignment of up to a million or more related protein sequences.

Andrew F Neuwald

Abstract

MOTIVATION: The patterns of sequence similarity and divergence present within functionally diverse, evolutionarily related proteins contain implicit information about corresponding biochemical similarities and differences. A first step toward accessing such information is to statistically analyze these patterns, which, in turn, requires that one first identify and accurately align a very large set of protein sequences. Ideally, the set should include many distantly related, functionally divergent subgroups. Because it is extremely difficult, if not impossible, for fully automated methods to align such sequences correctly, researchers often resort to manual curation based on detailed structural and biochemical information. However, multiply aligning vast numbers of sequences in this way is clearly impractical.
RESULTS: This problem is addressed using Multiply-Aligned Profiles for Global Alignment of Protein Sequences (MAPGAPS). The MAPGAPS program uses a set of multiply-aligned profiles both as a query to detect and classify related sequences and as a template to multiply-align the sequences. It relies on Karlin-Altschul statistics for sensitivity and on PSI-BLAST (and other) heuristics for speed. Using as input a carefully curated multiple-profile alignment for P-loop GTPases, MAPGAPS correctly aligned weakly conserved sequence motifs within 33 distantly related GTPases of known structure. By comparison, the sequence-based and structure-based alignment methods hmmalign and PROMALS3D misaligned at least 11 and 23 of these regions, respectively. When applied to a dataset of 65 million protein sequences, MAPGAPS identified, classified and aligned (with comparable accuracy) nearly half a million putative P-loop GTPase sequences.
AVAILABILITY: A C++ implementation of MAPGAPS is available at http://mapgaps.igs.umaryland.edu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
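
The Karlin-Altschul statistics mentioned in RESULTS can be illustrated with a minimal sketch. It assumes the standard relation E = K * m * n * exp(-lambda * S) between a raw alignment score S and the expected number of chance hits for a query of length m searched against a database of n residues. The function names and the numeric values of lambda, K, the lengths and the score are illustrative placeholders, not taken from the paper, and this is not MAPGAPS code.

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to a normalized bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def e_value(raw_score: float, query_len: int, db_len: int,
            lam: float, k: float) -> float:
    """Expected number of chance alignments scoring at least raw_score."""
    return k * query_len * db_len * math.exp(-lam * raw_score)


# Placeholder parameters (assumptions for illustration only).
LAMBDA = 0.267
K = 0.041

# Hypothetical search: a 200-residue profile against 1,000,000 residues.
example_e = e_value(raw_score=150, query_len=200, db_len=1_000_000,
                    lam=LAMBDA, k=K)
example_bits = bit_score(raw_score=150, lam=LAMBDA, k=K)
```

In this formulation a higher score gives an exponentially smaller E-value, which is why a statistical threshold can separate distant homologs from chance matches even in very large databases.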


Year:  2009        PMID: 19505947      PMCID: PMC2732367          DOI: 10.1093/bioinformatics/btp342

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


