Janan T Eppig, Joel E Richardson, James A Kadin, Cynthia L Smith, Judith A Blake, Carol J Bult.
Abstract
The Mouse Genome Database (MGD, www.informatics.jax.org) is the international scientific database for genetic, genomic, and biological data on the laboratory mouse to support the research requirements of the biomedical community. To accomplish this goal, MGD provides broad data coverage, serves as the authoritative standard for mouse nomenclature for genes, mutants, and strains, and curates and integrates many types of data from literature and electronic sources. Among the key data sets MGD supports are: the complete catalog of mouse genes and genome features, comparative homology data for mouse and vertebrate genes, the authoritative set of Gene Ontology (GO) annotations for mouse gene functions, a comprehensive catalog of mouse mutations and their phenotypes, and a curated compendium of mouse models of human diseases. Here, we describe the data acquisition process, specifics about MGD's key data areas, methods to access and query MGD data, and outreach and user help facilities.
Keywords: database; gene function; genomics; human disease models; mouse
Year: 2015 PMID: 26150326 PMCID: PMC4545690 DOI: 10.1002/dvg.22874
Source DB: PubMed Journal: Genesis ISSN: 1526-954X Impact factor: 2.487
Figure 1. Data integration. The process of data integration includes: (a) gathering data from various sources (other database resources, electronic files from online submissions or literature curation, etc.), (b) identifying common objects among the input, (c) resolving conflicting or inconsistent information and discovering missing data, and (d) assembling the relationships among the resolved data. Integration is key to knowledge discovery. For example, in this diagram the pink object is now seen to be related to the orange object through its associations with the dark blue and light blue squares, even though the pink and orange objects did not appear together in the incoming data.
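The discovery step described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not MGD's actual pipeline: the three input sources, the object names, and the chain of associations (pink to dark blue to light blue to orange) are assumptions made up to mirror the figure, and the conflict-resolution step (c) is omitted.

```python
from collections import defaultdict, deque

def integrate(sources):
    """Merge (object, related_object) pairs from several sources into one graph."""
    graph = defaultdict(set)
    for records in sources:
        for a, b in records:
            # Identical identifiers from different sources collapse into one
            # node, which corresponds to step (b), identifying common objects.
            graph[a].add(b)
            graph[b].add(a)
    return graph

def find_path(graph, start, goal):
    """Breadth-first search for a chain of relationships linking two objects."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbor in sorted(graph[path[-1]]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None

# Hypothetical inputs: no single source mentions both "pink" and "orange".
source_a = [("pink", "dark_blue")]
source_b = [("dark_blue", "light_blue")]
source_c = [("light_blue", "orange")]

graph = integrate([source_a, source_b, source_c])
path = find_path(graph, "pink", "orange")
print(path)  # ['pink', 'dark_blue', 'light_blue', 'orange']
```

The relationship between the two end objects only becomes visible after the sources are merged, which is the point the figure makes about integration.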
Common Ways to Search MGD and the Data That Can Be Retrieved^a
| Search form | Access from | What is searched for | What is retrieved |
|---|---|---|---|
| Quick Search (most used access method) | Located prominently at the top of the topics index boxes on the homepage ( | Broad searches through all MGD data, including nomenclature, vocabulary/ontology terms, and annotations; provides less search specificity, but greater breadth in a simple search | Genome Features (protein coding genes, non‐coding RNA genes, QTL, symbol synonyms, etc.) and Vocabulary Terms (GO terms, Phenotype terms, Diseases, etc.) with links to data details |
| Genes and Markers Query | Use Search pull‐down on the navigation bar (Genes) or select Genes from topic index boxes on the homepage | Genes and genome features using search parameters: nomenclature, feature type ( | Genes and genome features with location, maps, homologs, mutants and alleles, GO annotations, embryonic expression, sequence and protein links, references |
| Phenotypes, Alleles & Disease Model Search | Use Search pull‐down on the navigation bar (Phenotypes) or select Phenotypes & Mutant Alleles from topic index boxes on the homepage | Mutant or genetically engineered alleles, transgenes, QTL, etc. using search parameters: phenotype or disease terms, gene or allele nomenclature, genome location, allele generation method and/or allele attributes, or allele project collections | Summary of alleles for specified parameters and displaying allele attributes, system level phenotypes, and human diseases modeled. Links to genotypes, phenotypes, and disease annotations |
| Human‐Mouse Disease Connection | Use Search pull‐down on the navigation bar (Human Disease) or select Human‐Mouse: Disease Connection from the topic index boxes on the homepage | Mouse or human genes and orthologs, genome locations, phenotypes and disease terms. Searches accept mouse or human values as input; VCF files and text files of genes or IDs are also accepted input | Grid and table views showing mouse/human orthologs, phenotype classes, and human diseases/disease models fitting the parameters entered |
| Mouse Genome Browser | Use Search pull‐down on the navigation bar (Mouse Genome Browser) or select the Browser under the Genes topic from the homepage | Search by chromosome and genome coordinates | Optionally turn on tracks for mutant alleles, SNPs, QTL, phenotypes, or switch to viewing human GRCh38 build or pseudo genomes for strains other than the C57BL/6J reference genome |
| Vocabulary Browsers | Use Search pull‐down on the navigation bar and, hovering over the Vocabularies section, select GO, Mammalian Phenotype Ontology, OMIM | Search or browse vocabulary terms and select term of interest; GO and Phenotype Ontology terms are displayed hierarchically; OMIM terms are displayed alphabetically | GO and MP terms provide definitions, synonyms and links to all annotations for the selected term; OMIM terms link to MGD Disease Model pages and to OMIM entries |
^a This is not an exhaustive list of search methods or data that can be retrieved from MGD. Users are encouraged to explore MGD, to visit the “Getting Started” section on the homepage (www.informatics.jax.org) and the Help and FAQ sections in the upper left of each web page, and to contact User Support (email mgi-help@jax.org) with additional questions or requests for assistance.
Figure 2. MGD Homepage. This figure shows the top portion of the homepage, including the table of contents for the database. The topic‐specific boxes (left) present the major areas of data content. Each topic‐specific box links to a sub‐page that describes the contents of that topic and provides links to specific search pages for that topic, FAQs, documentation, and collaborators. Beneath the topic index on the homepage is a “Getting Started” section of particular interest to new users. The “Quick Search” at the top of the table of contents is described further in Figure 3. The right side of the homepage includes links to an “About” page and to “MGI Publications,” a rotating informational image, a “What's New” section providing information on the latest software and web changes, and MGI statistics on data content. The bottom section (not shown) holds items of “Community Interest.” The navy blue navigation bar at the top appears on each web page and allows users to jump quickly to the area of interest within MGD. The “Search,” “Download,” and “More Resources” items are each pull‐down menus leading to additional choices.
Figure 3. Results of a Quick Search query. In this example, the search term “cardiac arrhythmia” was used. The results page displays two sections: Genome Features, listing genes and genome features matched in the search (10 displayed of 5,579 results that can be viewed sequentially), and Vocabulary Terms, showing annotations in MGD to phenotype, disease, expression, function, and protein domain terms matched in the search (10 of 652 results shown). Results are ranked by weighting so that the “best matched” results appear first; users rarely need to scroll through many pages to find their term of interest.
Figure 4. Genes and Markers Query Form. This advanced query form illustrates the additional precision that specific query forms offer over the Quick Search. As in the Quick Search, one can specify gene(s) by name or symbol. In addition, the search can be restricted by one or more characteristics: feature type, genome location, GO terms, protein domains, and mouse phenotypes or human disease terms. To access the Genes and Markers Query form, use the “Search” pull‐down menu in the navigation bar and follow the Genes section to select Genes and Markers Query, or click the Genes topic box on the homepage and follow the appropriate link.
Figure 5. Mouse Genome Browser results. Here the Mouse Genome Browser shows a view of 221.6 kb of chromosome 11, from nucleotide 86478542 to 86700141. Gene structures are displayed at the top in green, and the associated phenotypes are shown in gold. There are several ways to access the Mouse Genome Browser. It is listed directly as the last item in the “Search” pull‐down menu on the navigation bar of each web page. The Mouse Genome Browser link also appears on the Genes and Sequences submenus of the “Search” pull‐down menu and under the Genes topic box on the homepage.
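The window size quoted in this caption follows directly from the two coordinates. The sketch below checks the arithmetic; the `chr:start-end` region string is a common genome-browser convention used here as an assumption, not a documented MGD input format, and the helper names are hypothetical.

```python
import re

def parse_region(region):
    """Parse a 'chr:start-end' string into (chromosome, start, end)."""
    match = re.fullmatch(r"(?:chr)?(\w+):(\d+)-(\d+)", region.replace(",", ""))
    if match is None:
        raise ValueError(f"unrecognized region: {region!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end

def span_kb(start, end):
    """Length of an inclusive coordinate range, in kilobases."""
    return (end - start + 1) / 1000

chrom, start, end = parse_region("chr11:86478542-86700141")
print(chrom, span_kb(start, end))  # 11 221.6
```

Treating both endpoints as inclusive gives 221,600 bp, which matches the 221.6 kb stated in the caption.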
Figure 6. Human‐Mouse: Disease Connection (HMDC), www.diseasemodel.org. The top panel shows the upper portion of the HMDC homepage with three distinct search boxes that allow searching by mouse or human genes, genome locations, or disease or phenotype terms. Options are also provided to upload a gene file or a VCF file to use as search parameters. In this example, “Ehlers–Danlos syndrome, Type I” was entered in the disease/phenotype term box. The lower panel shows the resulting grid display, in which human and mouse orthologs are shown in rows and phenotypes and diseases are shown in columns. Blue indicates mouse data; orange indicates human data. The highlighted Ehlers–Danlos syndrome column shows that both human COL5A1 and mouse Col5a1 are associated with the disease. Mouse gene Lum and human genes COL1A1 and COL5A2 are also associated with this human disease, but in each case only one member of the ortholog pair carries the association. These data suggest that mice with mutations in Col1a1 and Col5a2 should be examined for phenotypes correlated with human Ehlers–Danlos syndrome and that human patients with Ehlers–Danlos phenotypes might be checked for mutations in the LUM gene.
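The reasoning in this caption amounts to comparing disease associations across ortholog pairs. The following sketch reproduces it with the gene symbols named in the caption; the data structures and the `classify` function are hypothetical illustrations, not the HMDC implementation or its data model.

```python
# Ortholog pairs (human symbol, mouse symbol) and per-species associations
# with Ehlers-Danlos syndrome, as described in the Figure 6 caption.
orthologs = [
    ("COL5A1", "Col5a1"),
    ("COL1A1", "Col1a1"),
    ("COL5A2", "Col5a2"),
    ("LUM", "Lum"),
]
human_assoc = {"COL5A1", "COL1A1", "COL5A2"}
mouse_assoc = {"Col5a1", "Lum"}

def classify(orthologs, human_assoc, mouse_assoc):
    """Split ortholog pairs by which species carries the disease association."""
    coincident, human_only, mouse_only = [], [], []
    for human, mouse in orthologs:
        in_human = human in human_assoc
        in_mouse = mouse in mouse_assoc
        if in_human and in_mouse:
            coincident.append((human, mouse))
        elif in_human:
            human_only.append((human, mouse))
        elif in_mouse:
            mouse_only.append((human, mouse))
    return coincident, human_only, mouse_only

coincident, human_only, mouse_only = classify(orthologs, human_assoc, mouse_assoc)
print("Associated in both species:", coincident)
print("Human only (candidate mouse models to examine):", human_only)
print("Mouse only (candidate human genes to check):", mouse_only)
```

Pairs in the human-only list correspond to the mouse genes the caption suggests examining (Col1a1, Col5a2), and the mouse-only list corresponds to the human gene it suggests checking (LUM).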