Automatic clustering of orthologs and in-paralogs from pairwise species comparisons.

M Remm, C E Storm, E L Sonnhammer.

Abstract

Orthologs are genes in different species that originate from a single gene in the last common ancestor of these species. Such genes have often retained identical biological roles in the present-day organisms. It is hence important to identify orthologs for transferring functional information between genes in different organisms with a high degree of reliability. For example, orthologs of human proteins are often functionally characterized in model organisms. Unfortunately, orthology analysis between human and e.g. invertebrates is often complex because of large numbers of paralogs within protein families. Paralogs that predate the species split, which we call out-paralogs, can easily be confused with true orthologs. Paralogs that arose after the species split, which we call in-paralogs, however, are bona fide orthologs by definition. Orthologs and in-paralogs are typically detected with phylogenetic methods, but these are slow and difficult to automate. Automatic clustering methods based on two-way best genome-wide matches, on the other hand, have so far not separated in-paralogs from out-paralogs effectively. We present a fully automatic method for finding orthologs and in-paralogs from two species. Ortholog clusters are seeded with a two-way best pairwise match, after which an algorithm for adding in-paralogs is applied. The method bypasses multiple alignments and phylogenetic trees, which can be slow and error-prone steps in classical ortholog detection. Still, it robustly detects complex orthologous relationships and assigns confidence values for both orthologs and in-paralogs. The program, called INPARANOID, was tested on all completely sequenced eukaryotic genomes. To assess the quality of INPARANOID results, ortholog clusters were generated from a dataset of worm and mammalian transmembrane proteins, and were compared to clusters derived by manual tree-based ortholog detection methods.
This study led to the identification with a high degree of confidence of over a dozen novel worm-mammalian ortholog assignments that were previously undetected because of shortcomings of phylogenetic methods. A WWW server that allows searching for orthologs between human and several fully sequenced genomes is installed at http://www.cgb.ki.se/inparanoid/. This is the first comprehensive resource with orthologs of all fully sequenced eukaryotic genomes. Programs and tables of orthology assignments are available from the same location. Copyright 2001 Academic Press.
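
The two steps named in the abstract (seeding a cluster with a two-way best match, then adding in-paralogs) can be illustrated with a minimal Python sketch. It is not the INPARANOID program. The function names, the score dictionaries, and the inclusion rule (a same-species gene joins the cluster if it scores at least as high against the seed as the two seed orthologs score against each other) are assumptions made for illustration; the abstract does not spell out the rule, and confidence values and cluster merging are omitted.

```python
def best_hits(between, query_first=True):
    """Return {query: (best_target, score)} from between-species scores."""
    best = {}
    for (a, b), score in between.items():
        query, target = (a, b) if query_first else (b, a)
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return best


def in_paralogs(seed, within, threshold):
    """Same-species genes scoring >= threshold against the seed (assumed rule)."""
    members = set()
    for (x, y), score in within.items():
        if score >= threshold:
            if x == seed:
                members.add(y)
            elif y == seed:
                members.add(x)
    members.discard(seed)
    return members


def cluster_orthologs(between, within_a, within_b):
    """Seed clusters with two-way best matches, then add in-paralogs."""
    best_a = best_hits(between, query_first=True)   # species A gene -> best B gene
    best_b = best_hits(between, query_first=False)  # species B gene -> best A gene
    clusters = []
    for a, (b, score) in sorted(best_a.items()):
        if best_b[b][0] != a:
            continue  # not a two-way best match, so not a seed pair
        clusters.append({
            "seed": (a, b),
            "A": {a} | in_paralogs(a, within_a, score),
            "B": {b} | in_paralogs(b, within_b, score),
        })
    return clusters


# Hypothetical similarity scores (e.g. alignment bit scores), invented for the example.
between = {("a1", "b1"): 100, ("a1", "b2"): 60, ("a2", "b1"): 80,
           ("a2", "b2"): 50, ("a3", "b3"): 90}
within_a = {("a1", "a2"): 120, ("a1", "a3"): 30}
within_b = {("b1", "b2"): 70}

clusters = cluster_orthologs(between, within_a, within_b)
for c in clusters:
    print(c["seed"], sorted(c["A"]), sorted(c["B"]))
```

In this toy input, a2 is closer to a1 than a1 is to b1, so it is added as an in-paralog. b2 is less similar to b1 than b1 is to a1, so under the assumed rule it is treated as an out-paralog and left out.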

Year:  2001        PMID: 11743721     DOI: 10.1006/jmbi.2000.5197

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469


  571 in total

1.  Genomic and proteomic analysis of mitochondrial carrier proteins in Arabidopsis.

Authors:  A Harvey Millar; Joshua L Heazlewood
Journal:  Plant Physiol       Date:  2003-02       Impact factor: 8.340

2.  CeCaFDB: a curated database for the documentation, visualization and comparative analysis of central carbon metabolic flux distributions explored by 13C-fluxomics.

Authors:  Zhengdong Zhang; Tie Shen; Bin Rui; Wenwei Zhou; Xiangfei Zhou; Chuanyu Shang; Chenwei Xin; Xiaoguang Liu; Gang Li; Jiansi Jiang; Chao Li; Ruiyuan Li; Mengshu Han; Shanping You; Guojun Yu; Yin Yi; Han Wen; Zhijie Liu; Xiaoyao Xie
Journal:  Nucleic Acids Res       Date:  2014-11-11       Impact factor: 16.971

3.  Predicting genetic modifier loci using functional gene networks.

Authors:  Insuk Lee; Ben Lehner; Tanya Vavouri; Junha Shin; Andrew G Fraser; Edward M Marcotte
Journal:  Genome Res       Date:  2010-06-09       Impact factor: 9.043

4.  Augmented annotation of the Schizosaccharomyces pombe genome reveals additional genes required for growth and viability.

Authors:  Danny A Bitton; Valerie Wood; Paul J Scutt; Agnes Grallert; Tim Yates; Duncan L Smith; Iain M Hagan; Crispin J Miller
Journal:  Genetics       Date:  2011-01-26       Impact factor: 4.562

5.  Comprehensive analysis of orthologous protein domains using the HOPS database.

Authors:  Christian E V Storm; Erik L L Sonnhammer
Journal:  Genome Res       Date:  2003-10       Impact factor: 9.043

6.  Experimental analysis of the Arabidopsis mitochondrial proteome highlights signaling and regulatory components, provides assessment of targeting prediction programs, and indicates plant-specific mitochondrial proteins.

Authors:  Joshua L Heazlewood; Julian S Tonti-Filippini; Alexander M Gout; David A Day; James Whelan; A Harvey Millar
Journal:  Plant Cell       Date:  2003-12-11       Impact factor: 11.277

7.  Functionality of system components: conservation of protein function in protein feature space.

Authors:  Lars Juhl Jensen; David W Ussery; Søren Brunak
Journal:  Genome Res       Date:  2003-10-14       Impact factor: 9.043

8.  Quantitative measures for the management and comparison of annotated genomes.

Authors:  Karen Eilbeck; Barry Moore; Carson Holt; Mark Yandell
Journal:  BMC Bioinformatics       Date:  2009-02-23       Impact factor: 3.169

9.  Mapping metabolic and transcript temporal switches during germination in rice highlights specific transcription factors and the role of RNA instability in the germination process.

Authors:  Katharine A Howell; Reena Narsai; Adam Carroll; Aneta Ivanova; Marc Lohse; Björn Usadel; A Harvey Millar; James Whelan
Journal:  Plant Physiol       Date:  2008-12-12       Impact factor: 8.340

10.  PanOCT: automated clustering of orthologs using conserved gene neighborhood for pan-genomic analysis of bacterial strains and closely related species.

Authors:  Derrick E Fouts; Lauren Brinkac; Erin Beck; Jason Inman; Granger Sutton
Journal:  Nucleic Acids Res       Date:  2012-08-16       Impact factor: 16.971
