Leon Goldovsky, Paul Janssen, Dag Ahrén, Benjamin Audit, Ildefonso Cases, Nikos Darzentas, Anton J Enright, Núria López-Bigas, José M Peregrin-Alvarez, Mike Smith, Sophia Tsoka, Victor Kunin, Christos A Ouzounis.
Abstract
MOTIVATION: CoGenT++ is a data environment for computational research in comparative and functional genomics, designed to address issues of consistency, reproducibility, scalability and accessibility. DESCRIPTION: CoGenT++ facilitates the redistribution of all fully sequenced and published genomes, storing information about species, gene names and protein sequences. We describe our scalable implementation of ProXSim, a continually updated all-against-all similarity database, which stores pairwise relationships between all genome sequences. Based on these similarities, derived databases are generated for gene fusions (AllFuse), putative orthologs (OFAM), protein families (TRIBES), phylogenetic profiles (ProfUse) and phylogenetic trees. Extensions based on the CoGenT++ environment include disease gene prediction, pattern discovery, automated domain detection, genome annotation and ancestral reconstruction.
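As an illustration of how a derived database might be computed from stored pairwise similarities, the following minimal Python sketch builds presence/absence phylogenetic profiles from a list of similarity hits. It is not the ProXSim or ProfUse implementation: the record layout `(query protein, hit genome, E-value)`, the function name `phylogenetic_profiles` and the `max_evalue` cutoff are all assumptions made for this example.

```python
from collections import defaultdict

def phylogenetic_profiles(hits, genomes, max_evalue=1e-10):
    """Build one presence/absence vector per query protein.

    hits    : iterable of (query_protein, hit_genome, evalue) tuples
              (hypothetical record layout, not the ProXSim schema)
    genomes : ordered list of genome identifiers defining vector positions
    """
    present = defaultdict(set)
    for query, hit_genome, evalue in hits:
        # Keep only hits at or below the (assumed) E-value cutoff.
        if evalue <= max_evalue:
            present[query].add(hit_genome)
    # 1 if the protein has a qualifying hit in that genome, else 0.
    return {q: tuple(int(g in gs) for g in genomes) for q, gs in present.items()}

# Toy data, invented for illustration only.
hits = [
    ("protA", "genome1", 1e-50),
    ("protA", "genome3", 1e-20),
    ("protB", "genome2", 1e-5),   # weaker than the cutoff, ignored
    ("protB", "genome3", 1e-30),
]
genomes = ["genome1", "genome2", "genome3"]
profiles = phylogenetic_profiles(hits, genomes)
```

Proteins with identical or similar vectors are candidates for functional association; how the actual resource scores and thresholds similarities is not stated in the abstract.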
Year: 2005 PMID: 16216832 DOI: 10.1093/bioinformatics/bti579
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937