
Highly specific protein sequence motifs for genome analysis.

C G Nevill-Manning, T D Wu, D L Brutlag.

Abstract

We present a method for discovering conserved sequence motifs from families of aligned protein sequences. The method has been implemented as a computer program called EMOTIF (http://motif.stanford.edu/emotif). Given an aligned set of protein sequences, EMOTIF generates a set of motifs with a wide range of specificities and sensitivities. EMOTIF can also generate motifs that describe possible subfamilies of a protein superfamily. A disjunction of such motifs can often represent the entire superfamily with high specificity and sensitivity. We have used EMOTIF to generate sets of motifs from all 7,000 protein alignments in the BLOCKS and PRINTS databases. The resulting database, called IDENTIFY (http://motif.stanford.edu/identify), contains more than 50,000 motifs. For each alignment, the database contains several motifs whose probabilities of matching a false positive range from 10^-10 to 10^-5. Such highly specific motifs are well suited to searching entire proteomes because they generate very few false predictions. IDENTIFY assigns biological functions to 25-30% of all proteins encoded by the Saccharomyces cerevisiae genome and by several bacterial genomes. In particular, IDENTIFY assigned functions to 172 proteins of unknown function in the yeast genome.

Year:  1998        PMID: 9600885      PMCID: PMC34488          DOI: 10.1073/pnas.95.11.5865

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


