Mary E Dolan, Richard M Baldarelli, Susan M Bello, Li Ni, Monica S McAndrews, Carol J Bult, James A Kadin, Joel E Richardson, Martin Ringwald, Janan T Eppig, Judith A Blake.
Abstract
The Mouse Genome Database (MGD) is the model organism database component of the Mouse Genome Informatics (MGI) system at The Jackson Laboratory. MGD is the international data resource for the laboratory mouse and facilitates the use of mice in the study of human health and disease. Since its beginnings, MGD has included comparative genomics data with a particular focus on human-mouse orthology, an essential component of the use of mouse as a model organism. Over the past 25 years, novel algorithms and the addition of orthologs from other model organisms have enriched the comparative genomics data in MGD, extending the use of orthology data to support the laboratory mouse as a model of human biology. Here, we describe current comparative data in MGD and review the history and refinement of orthology representation in this resource.
Year: 2015 PMID: 26223881 PMCID: PMC4534493 DOI: 10.1007/s00335-015-9588-5
Source DB: PubMed Journal: Mamm Genome ISSN: 0938-8990 Impact factor: 2.957
A chronological list of significant changes to MGD orthology representation
| Year | Significant changes to MGD orthology representation |
|---|---|
| 1994 | MGD went online |
| 1997 | Determination of homology in MGD is based on experimental analysis |
| | Interactive Oxford Grids displaying comparative mapping between two species are available for mouse, human, rat, cow, pig, sheep, and cat |
| 1998 | Over 2500 mouse/human homologies are found in MGD as well as a more limited number of homology assertions for >60 other mammalian species |
| | Mammalian homologs can also be displayed as part of the detail for graphical map displays |
| 2000 | The type of evidence used to determine the homology relationship is provided: sequence similarity, conserved location, or functional analysis |
| | MGD starts to emphasize the relationship of mouse genes to those in other model organisms such as Drosophila |
| 2002 | MGD provides gene family pages that summarize information about curated orthology assertions of mouse, human, and rat orthologs |
| 2004 | MGD works with the HomoloGene resource at the NCBI to reciprocally incorporate some of the HomoloGene computational three-way reciprocal best-hit sets into the MGI system |
| 2005 | MGD’s priority effort focuses on the creation of orthology sets among mouse, human, and rat |
| 2007 | MGD incorporates UniProt Protein Information Resource Superfamily (PIRSF) protein classifications into a Protein Superfamily Vocabulary Browser |
| | MGD provides new mouse–human–rat comparative GO graphs |
| 2008 | MGD includes links to the TreeFam resource |
| 2013 | A banner displaying information about the human ortholog of each mouse gene is added to the Gene Detail pages in MGD to improve comparisons of gene–disease associations in mouse and human |
| | MGI implements a many-to-many homology paradigm to better reflect current understanding about the relationships between genes among mammals |
| 2015 | MGI expands the many-to-many homology paradigm to include HGNC orthology assertions to maximize the use of human:mouse comparative genomics |
A summary of the increased representation of mouse:human and mouse:rat orthology sets in MGD
| Year | Mouse/human orthologs | Mouse/rat orthologs |
|---|---|---|
| 1998 | 2500 | |
| 2002 | 6123 | |
| 2003 | 7488 | |
| 2004 | 9987 | |
| 2005 | 14,893 | |
| 2006 | 15,849 | 15,532 |
| 2007 | 15,672 | 14,758 |
| 2008 | 16,927 | 15,801 |
| 2009 | 16,685 | 15,787 |
| 2010 | 17,787 | 16,768 |
| 2011 | 17,852 | |
| 2012 | 17,847 | 16,686 |
| 2013 | 17,773 | 17,253 |
| 2014 | 17,092 | 17,811 |
| 2015 | 17,055 | 18,461 |
Fig. 1 Comparative graphs present human, mouse, and rat GO annotations in the context of the ontology structure to better enable comparison among organisms. The graphs have been adapted, as shown here, to accommodate MGI’s many:many homology paradigm
Fig. 2 The Vertebrate homology ribbon on the mouse gene Klk1 detail page displays information on the HomoloGene class that contains 1 human and 14 mouse genes. There are links to HCOP homology predictions for the human gene KLK3 called by HomoloGene and to KLK1 called by HGNC. The Human homolog ribbon displays additional information on both human genes associated with Klk1. The orthology data presented on the gene detail page are inclusive of orthologs called by both sources
Fig. 3 The Human Disease and Mouse Model Detail page provides a direct comparison of mouse and human orthologs of genes associated with a human disease. These associations are based on the hybrid homology rules. In cases where HomoloGene and HGNC agree, as in the last three genes displayed here, that agreement is shown in the last column. In cases where the orthology sets disagree, our rules select the more inclusive set, as shown here for the HomoloGene pair FGFR3–Fgfr3 and the HGNC set for ACAN–Acan. The hybrid homology set includes 25,999 ortholog sets from HomoloGene and 33,717 from HGNC
Fig. 4 Searching the HMDC with mouse or human symbols returns a row with the Hybrid homology set for each gene matching a search term. The mouse phenotype annotations and human and mouse disease annotations for genes in the homology set are shown in the row. The matrix shown in the figure has been filtered to reduce the number of rows and columns. The source for each homology cluster is: ACAN, Acan: HGNC; APOE, Apoe: HomoloGene and HGNC; C4A, C4B, C4a, C4b: HomoloGene and HGNC; GK, Gk: HGNC; and SMN1, SMN2, Smn1: HomoloGene. The C4A, C4B, C4a, C4b cluster represents a case where MGI constructed a multi-gene homology cluster from several HGNC pairs. This constructed cluster is identical to the one in HomoloGene. For the ACAN, Acan and SMN1, SMN2, Smn1 clusters, the source selected is the only one that had a cluster containing both mouse and human genes. For GK, Gk, both sources had clusters containing mouse and human genes; the hybrid uses HGNC clusters in these cases
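The source selection described in the Fig. 4 caption can be sketched roughly as follows. This is an illustrative reading of that caption only, not MGI's implementation: the function names, the `(species, symbol)` representation of genes, and the use of connected components to merge HGNC pairs are all assumptions made for the example.

```python
from collections import defaultdict


def clusters_from_pairs(pairs):
    """Merge (human_symbol, mouse_symbol) pairs that share a gene into clusters.

    Assumption: a multi-gene cluster is the connected component of the pairs.
    Genes are represented as hypothetical (species, symbol) tuples.
    """
    parent = {}

    def find(node):
        # Union-find lookup with path halving.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for human, mouse in pairs:
        parent[find(("human", human))] = find(("mouse", mouse))

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return [frozenset(members) for members in groups.values()]


def has_both_species(cluster):
    """True if the cluster contains at least one human and one mouse gene."""
    return cluster is not None and {"human", "mouse"} <= {sp for sp, _ in cluster}


def choose_source(homologene_cluster, hgnc_cluster):
    """Pick the source label for a hybrid cluster (illustrative rules only)."""
    homologene_ok = has_both_species(homologene_cluster)
    hgnc_ok = has_both_species(hgnc_cluster)
    if homologene_ok and hgnc_ok:
        if homologene_cluster == hgnc_cluster:
            return "HomoloGene and HGNC"  # identical clusters: credit both
        return "HGNC"  # both usable but different: caption says HGNC is used
    if hgnc_ok:
        return "HGNC"
    if homologene_ok:
        return "HomoloGene"
    return None  # neither source links a mouse gene to a human gene
```

For instance, passing four hypothetical HGNC pairs linking C4A and C4B to C4a and C4b through `clusters_from_pairs` yields a single four-gene cluster; if HomoloGene holds the same four genes, `choose_source` labels the cluster "HomoloGene and HGNC", which matches the C4 case in the caption. The actual HGNC pairs behind that cluster are not given in the source, so the input pairs are invented for illustration.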