Genome cluster database. A sequence family analysis platform for Arabidopsis and rice.

Kevin Horan, Josh Lauricha, Julia Bailey-Serres, Natasha Raikhel, Thomas Girke.

Abstract

The genome-wide protein sequences from Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa ssp. japonica) were clustered into families using sequence similarity and domain-based clustering. The two fundamentally different methods resulted in separate cluster sets with complementary properties that compensate for each method's limitations in accurate family analysis. Functional names for the identified families were assigned with an efficient computational approach that uses the description of the most common molecular function gene ontology node within each cluster. Subsequently, multiple alignments and phylogenetic trees were calculated for the assembled families. All clustering results and their underlying sequences were organized in the Web-accessible Genome Cluster Database (http://bioinfo.ucr.edu/projects/GCD), which provides interactive, user-friendly sequence family mining tools that help the plant science community analyze any given family of interest. An automated clustering pipeline keeps the information current as the annotations of the two genomes are updated and the clustering is improved. The analysis allowed the first systematic identification of family and singlet proteins present in both organisms as well as those restricted to one of them. In addition, the established Web resources for mining these data provide a road map for future studies of the composition and structure of protein families in the two species.
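
The abstract describes the family-naming step only in outline: each cluster takes the description of its most common molecular function gene ontology node. The following is a minimal sketch of that idea, not the authors' implementation. The function name `name_cluster`, the input dictionaries, the protein identifiers, the alphabetical tie-breaking rule, and the "unknown function" fallback are all assumptions made for illustration.

```python
from collections import Counter


def name_cluster(member_ids, go_mf_terms, default="unknown function"):
    """Return the most common molecular-function GO description in a cluster.

    member_ids  -- protein identifiers belonging to one cluster
    go_mf_terms -- dict mapping a protein identifier to its GO
                   molecular-function descriptions (hypothetical input format)
    """
    counts = Counter()
    for protein_id in member_ids:
        # Count each description once per protein so that a heavily
        # annotated protein does not dominate the vote.
        counts.update(set(go_mf_terms.get(protein_id, ())))
    if not counts:
        # No member carries a molecular-function annotation.
        return default
    # Highest count wins; ties are broken alphabetically so that the
    # result is reproducible (an assumption, not stated in the abstract).
    description, _ = min(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return description


# Made-up toy data: identifiers and annotations are placeholders.
clusters = {
    "cluster_1": ["ath_0001", "ath_0002", "osa_0001"],
    "cluster_2": ["ath_0003"],
}
go_mf_terms = {
    "ath_0001": ["protein kinase activity", "ATP binding"],
    "ath_0002": ["protein kinase activity"],
    "osa_0001": ["ATP binding", "protein kinase activity"],
}

names = {cid: name_cluster(members, go_mf_terms)
         for cid, members in clusters.items()}
for cid, name in names.items():
    print(cid, "->", name)
```

Counting each description once per protein is one possible design choice; counting raw annotation occurrences instead would favor proteins with many annotations.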

Year: 2005    PMID: 15888677    PMCID: PMC1104159    DOI: 10.1104/pp.104.059048

Source DB: PubMed    Journal: Plant Physiol    ISSN: 0032-0889    Impact factor: 8.340


