Virginie Lopez Rascol, Anthony Levasseur, Olivier Chabrol, Simona Grusea, Philippe Gouret, Etienne G J Danchin, Pierre Pontarotti.
Abstract
BACKGROUND: Understanding genome evolution provides insight into biological mechanisms. For many years, comparative genomics and the analysis of conserved chromosomal regions have helped to unravel the mechanisms of genome evolution and their implications for the study of biological systems. Detecting conserved regions (those descending from a common ancestor) not only helps clarify genome evolution but also makes it possible to identify quantitative trait loci (QTLs) and investigate gene function. Identifying and comparing conserved regions on a genome scale is computationally intensive, which makes process automation essential. Three requirements are key: consideration of phylogeny to identify orthologs between multiple species, frequent updating of the annotation and of the panel of compared genomes, and computation of statistical tests to assess the significance of the identified conserved gene clusters.
Year: 2009 PMID: 19740451 PMCID: PMC2756280 DOI: 10.1186/1471-2105-10-284
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. CASSIOPE multi-agent system, showing all the agents and the communications (blue) between them, together with non-system elements. Pink: expert system; violet: persistence agent and Postgres database; green: tree agent and FIGENIX platform; yellow: Web agent.
Figure 2. Overlay of processes and agents. Agents are shown in pink, green and yellow, as in Fig. 1. Arrows represent communication channels between the expert system and the other agents. The persistence agent is not shown because it is used throughout the process. (Input) The user gives the two boundary genes of the target region. (1) The region is completed with all genes present in the Ensembl database. (2) A phylogenetic tree is computed for each gene. (3) Orthologous genes are identified (forester). (4) Each orthologous gene is located on the chromosome in its species. (5) Orthologous genes are clustered if they lie on the same chromosome in the same species; a cluster containing fewer than three genes is removed. (6) Clusters are completed with genes that are contained in the region but are not orthologous genes. (7) A score is calculated for each cluster; if the conserved site is not significant, the cluster is removed. (8) The system restarts with new regions.
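Step 5 of the caption above (grouping orthologs by species and chromosome, then discarding clusters with fewer than three genes) can be illustrated with a minimal Python sketch. This is not the CASSIOPE implementation: the `Ortholog` record, its fields, and the `cluster_orthologs` function are hypothetical names chosen for illustration, and sorting by start position is an added convenience rather than something the caption states.

```python
from collections import defaultdict
from typing import NamedTuple


class Ortholog(NamedTuple):
    """Hypothetical record for one located ortholog (output of step 4)."""
    gene_id: str
    species: str
    chromosome: str
    position: int  # assumed: start coordinate in base pairs


# From the caption: clusters with fewer than three genes are removed.
MIN_CLUSTER_SIZE = 3


def cluster_orthologs(orthologs, min_size=MIN_CLUSTER_SIZE):
    """Group orthologs by (species, chromosome) and drop small groups."""
    groups = defaultdict(list)
    for ortholog in orthologs:
        groups[(ortholog.species, ortholog.chromosome)].append(ortholog)
    # Keep only groups that reach the minimum size, ordered along the chromosome.
    return {
        key: sorted(genes, key=lambda g: g.position)
        for key, genes in groups.items()
        if len(genes) >= min_size
    }
```

Called on a list of located orthologs, the function returns a dictionary keyed by `(species, chromosome)`; any pair with fewer than three orthologs is simply absent from the result.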
Global parameters
| Completed | Yes |
| Scope | No |
| Source URI | |
| Ortholog/paralog | Ortholog |
| Reverse search | Yes |
| Range | All |
| Pipeline model | CassiopePhylo+M |
| NoDuplication Range | [9615, 10090] |
| Database | Ensembl |
| Tree of life | cassiope1 [see Additional file] |
| Distance between 2 genes | 10,000,000 bp |
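The table does not say how the "Distance between 2 genes" parameter is applied. One plausible reading, sketched below purely as an assumption, is that two neighbouring genes on a chromosome stay in the same cluster only if they lie no more than 10,000,000 bp apart. The function name `split_by_distance` and the use of a single start coordinate per gene are hypothetical.

```python
# Value taken from the "Distance between 2 genes" row of the table.
MAX_GENE_DISTANCE_BP = 10_000_000


def split_by_distance(positions, max_gap=MAX_GENE_DISTANCE_BP):
    """Split gene positions (bp) into runs whose consecutive gaps are <= max_gap.

    Assumption: the parameter is a maximum gap between neighbouring genes.
    """
    runs = []
    for pos in sorted(positions):
        if runs and pos - runs[-1][-1] <= max_gap:
            runs[-1].append(pos)  # close enough to the previous gene: same run
        else:
            runs.append([pos])  # first gene, or gap too large: start a new run
    return runs
```

Under this reading, genes at 1 Mb, 5 Mb, 25 Mb and 30 Mb on one chromosome would form two runs, because the 20 Mb gap between the second and third gene exceeds the threshold.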
Figure 3. Conserved regions from a human MHC-like region (283 genes). Each box represents a species possessing one or several regions conserved with the start region. Regions are represented on the corresponding chromosome in the species. The right-hand-side boxes show Ensembl results (regions are delimited by red boxes), and the left-hand-side boxes show CASSIOPE results. If Ensembl results are lacking, the box is left empty. CASSIOPE found more conserved regions than Ensembl. The presence of duplication in teleost fish genomes is shown.