Mouse Phenome Database (MPD).

Terry P Maddatu, Stephen C Grubb, Carol J Bult, Molly A Bogue.

Abstract

The Mouse Phenome Project was launched a decade ago to complement mouse genome sequencing efforts by promoting new phenotyping initiatives under standardized conditions and collecting the data in a central public database, the Mouse Phenome Database (MPD; http://phenome.jax.org). MPD houses a wealth of strain characteristics data to facilitate the use of the laboratory mouse in translational research for human health and disease, helping alleviate problems involving experimentation in humans that cannot be done practically or ethically. Data sets are voluntarily contributed by researchers from a variety of institutions and settings, or in some cases, retrieved by MPD staff from public sources. MPD maintains a growing collection of standardized reference data that assists investigators in selecting mouse strains for research applications; houses treatment/control data for drug studies and other interventions; offers a standardized platform for discovering genotype-phenotype relationships; and provides tools for hypothesis testing. MPD improvements and updates since our last NAR report are presented, including the addition of new tools and features to facilitate navigation and data mining as well as the acquisition of new data (phenotypic, genotypic and gene expression).

Year:  2011        PMID: 22102583      PMCID: PMC3245053          DOI: 10.1093/nar/gkr1061

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Phenomics, the comprehensive study of phenotypes, complements advances in knowledge from genomics, pharmacogenomics and other -omics approaches. Most human diseases are complex, polygenic and subject to modification by environmental, genetic and epigenetic factors. A phenomics approach requires collecting comprehensive high-dimensional phenotypic data, integrating it with other types of diverse biological data, and evaluating it with computational tools and bioinformatic approaches. MPD is the only database of its kind: it amasses, annotates, integrates and maintains quantitative phenotypic data for the laboratory mouse, and provides online tools to access and analyze those data. MPD is an established community resource and research tool enabling investigators to access thousands of data points from hundreds of strains, saving time and money by precluding the need to (re)characterize strains.

MPD provides many tools to facilitate mouse research. For example, MPD enables users to:
Compare strains to identify mouse models of human disease.
Select optimal genetic backgrounds for new or existing mutations.
Take advantage of the MPD Protocol Library containing detailed, illustrated procedures that have been validated by experts in their fields (and further, to view data generated from these protocols to confirm experimental results).
Discover phenotype–genotype relationships.
Validate and refine results from traditional mapping studies.

More powerful mapping tools are increasingly essential in the analysis of complex traits and systems genomics, to reduce the broad, unwieldy intervals of previously identified quantitative trait loci (QTL), which contain hundreds (if not thousands) of genes, down to the gene level.
For several years now, the community has had access to high-density SNP data for 16 inbred strains at over 8 million genome-wide locations through the NIEHS-Perlegen resequencing effort (1) and for 94 strains at 132K locations provided by the Mouse HapMap Project (2). These data were used to develop a novel high-density genotyping array, the Mouse Diversity Genotyping Array (3) (see also http://cgd/tools/diversityarray.shtml). These SNP data sets are available in MPD: Perlegen2 (4), Broad2 (5) and CGD2 (6), respectively. MPD has Mouse Diversity Array data for 123 inbred strains at 550K locations. In addition, the Wellcome Trust Sanger Institute recently released whole-genome sequence data for 17 inbred mouse strains and is generating transcriptome data (RNA-Seq) for these strains (7,8). Taken together with high-quality phenotype data, transcriptome and dense genotypic data enable the identification of regions of the genome that may be causal to a phenotype of interest through correlative analyses. Sanger SNPs, indels and structural variants will be added to MPD and made available in the coming months. MPD history, standards, schema and other topics have been presented elsewhere (9–11). Here, we review a few salient points regarding MPD contents. The recommended basic study design is a strain survey format, e.g. 10 females and 10 males of a large number of genetically stable, reproducible strains (such as inbred strains). Acquired data sets are annotated and formatted to meet a set of defined MPD standards. Each data set is accompanied by a detailed protocol; health status and environmental parameters of the test animals; and any other information essential to understanding and evaluating the data. Data and documentation are then bundled together into an MPD ‘project’, which is given a descriptive title and annotated with submitter information, references and funding acknowledgements. 
MPD phenotypic data sets are typically quantitative, although we do house categorical data such as MHC haplotypes and a growing collection of annotated images. Data from microarray gene expression studies are considered a special class of phenotype data and handled accordingly. Summary statistics are computed as part of the MPD accessioning process, where the unit of analysis is an MPD ‘measurement’ with strain (by sex) being the analysis group. Each measurement in a data set is classified and integrated in the MPD phenotype category structure. Male and female data are not combined. Data from different MPD measurements are not averaged together although these data may be displayed in the same plot for comparison (composite/consensus view). As part of the accessioning process, regression analysis is performed on all measurements (pair-wise) to identify compelling correlations, which may indicate common genetic determinants or overlapping biological pathways. Results are stored to optimize and support queries based on correlations. Strain genotypes are collected and stored in MPD so that phenotypic and genotypic data can be linked or visually juxtaposed, facilitating the ability to determine how allele-specific variations translate to variation in mouse phenotype. Our last NAR update was published in 2009 (11). Here, we summarize new data and data types, and describe major improvements in MPD functionality. This update refers to Supplementary Data.
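The accessioning steps described above (summary statistics per strain and sex, followed by pair-wise comparison of measurements) can be illustrated with a minimal sketch. The record layout, function names and numbers are hypothetical and do not reflect MPD's schema or data; a Pearson correlation of strain means over shared strains stands in for the regression analysis, whose exact method is not given in the text.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def summarize(records):
    """Group (strain, sex, value) records and return n, mean, SD and SEM per group.

    The analysis group is strain by sex, so male and female data are never pooled.
    """
    groups = defaultdict(list)
    for strain, sex, value in records:
        groups[(strain, sex)].append(value)
    stats = {}
    for key, values in groups.items():
        n = len(values)
        sd = stdev(values) if n > 1 else 0.0
        stats[key] = {"n": n, "mean": mean(values), "sd": sd, "sem": sd / sqrt(n)}
    return stats


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def correlate(stats_a, stats_b, sex):
    """Correlate the strain means of two measurements over the strains they share."""
    shared = sorted(s for (s, sx) in stats_a if sx == sex and (s, sex) in stats_b)
    xs = [stats_a[(s, sex)]["mean"] for s in shared]
    ys = [stats_b[(s, sex)]["mean"] for s in shared]
    return pearson(xs, ys), shared


# Made-up individual-animal values for two hypothetical measurements.
body_weight = [("A/J", "f", 20.0), ("A/J", "f", 22.0),
               ("C57BL/6J", "f", 24.0), ("C57BL/6J", "f", 26.0),
               ("DBA/2J", "f", 28.0), ("DBA/2J", "f", 30.0)]
fat_mass = [("A/J", "f", 2.0), ("A/J", "f", 2.4),
            ("C57BL/6J", "f", 3.0), ("C57BL/6J", "f", 3.4),
            ("DBA/2J", "f", 4.0), ("DBA/2J", "f", 4.4)]

weight_stats = summarize(body_weight)
fat_stats = summarize(fat_mass)
r, shared_strains = correlate(weight_stats, fat_stats, "f")
```

Working from strain means rather than individual animals mirrors the statement that the unit of analysis is a measurement with strain (by sex) as the analysis group.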

CONTENTS

‘MPD current contents’ are summarized in Table 1, and a snapshot of ‘What’s New’ is shown in Table 2. Users can search the database by our keyword search, or browse by strain, phenotype category, data submitter, interventions, study design and other convenient groupings (Supplementary Data, Navigation and Finding Data: S1–S8). Briefly, over 750 strains of mice are represented in MPD with phenotypic and/or genotypic data available. Strain names are kept updated with current nomenclature, and linkouts to other databases are provided. Table 1 provides an inventory of strains, strain panels and strain sources (also Search or Browse Strains: S1, S2, S3). Investigators associated with MPD as data contributors or as formal collaborators represent approximately 150 institutions from 14 countries supported by over 80 funding agencies and research foundations worldwide (Investigators: S4; Large-Scale Initiatives: S5).
Table 1. MPD current contents

Phenotype measurements (including pending) [S7], collectively ∼600 strains: ∼3000
SNPs [S11]
    Total calls (excluding imputed): ∼350 million
    Genomic locations: 8+ million
    Number of strains with data: 450+
Gene expression strain surveys [S12]: 8
    Number of probesets: 280K
Strains [S2, S3]: 761
Strain types
    Inbred: 208
    Recombinant inbred strains: 330
    F1 hybrids: 87
    Wild-derived inbred: 70
    Chromosome substitution strains: 38
    Other: 28
Recombinant inbred panels: 9
    AKXD
    AXB
    BXA
    BXD
    BXH
    CXB
    ILSXISS
    LGXSM
    SWXJ
Chromosome substitution panels: 2
    C57BL/6J-Chr#A/J/NaJ: 23
    C57BL/6J-Chr#PWD/Ph/ForeJ: 19
Animal vendors and repositories: 10
    Animal Resources Centre (AUS)
    B&K Universal (UK)
    Charles River
    Harlan Laboratories
    The Jackson Laboratory
    Nihon Clea, Tokyo (Japan)
    Pasteur Institute (France)
    RIKEN BioResource Center (Japan)
    National Institute of Genetics - Shigen (Japan)
    Taconic

Table 2. Summary of What's New since our last NAR update

Treatment and intervention studies [S8] (with new data)
Dietary interventions
    Capsaicin
    Diet restriction
    High-fat diet
    High-fat diet + ethanol
    Sucrose
Drugs and alcohol
    Acetaminophen
    Ethanol
    Fluoxetine
    Haloperidol
    Isoniazid
    Lithium
    β-adrenergic drugs
Environmental insult
    Trichloroethylene
Infectious diseases
    Candida
    Influenza A (H1N1)
    Malaria
Lifestyle and healthspan
    Aging
    Exercise

MPD categories with new data
Appearance
    Coat color
Behavior
    Anxiety
    Exploratory
    Learning and memory
    Locomotor activity
    Motor coordination and activity
    Social
    Stress reactivity
    Wildness
Blood
    Clinical chemistry
    Hematology
    Lipids
Body composition
    BMI
    Fat pads
Body metrics
    Size, weight, growth curve
Bone
    Mineral density and content
    Metrics
Brain
    Drug concentration
    Gene expression
Cardiovascular
    Echocardiography
    Blood pressure, heart rate
Cell and tissue damage
    Apoptosis
    Chromosome instability
Ear
    Balance
    Hearing
Endocrine
    Adrenal morphology
    Hormones
Immune system
    Cell counts
    Immunoglobulins
    Response to infection
Kidney
    Urinalysis
    Vesico-ureteral reflux
Liver
    Pathology
    Homeostasis
    Gene expression
Longevity
    Lifespan
Metabolism
    Body temperature
    Energy
    Food and water intake
Nervous system
    Autonomic function
    Neuron projection morphology
    Neurogenesis
    Neuromuscular function
    Sensorimotor function
Neurosensory
    Nociception
    Sensory gating
Organ metrics
    Eye
    Kidney
    Liver
    Spleen
Reproduction
    Fecundity
    Female
    Male
    Sex distribution
Respiratory
    Airway resistance
    Breathing pattern
    Lung compliance
    Lung volume

Phenotype data

MPD contains approximately 3000 phenotypic strain survey measurements, collectively representing approximately 600 of the 750+ strains in MPD (Find Measurements: S6, S7, S8). Most measurements (70%) include data for both sexes. On average, 20 strains are tested per measurement, but as many as 72 strains have been tested in a single project (CGDpheno1, pending). Currently available phenotypic measurements are classified as baseline (64%), aging-related data (14%) or controlled studies of intervention effects (22%), e.g. administering drugs or high-fat diet (Interventions: S8). MPD has acquired and formatted hundreds of experimental procedures, housing a library of protocols where detailed procedures and assays are maintained so that the research community may benefit from their use (Protocols: S9, S10).

Genotype data

The current MPD SNP database contains ∼350 million SNP calls (count excludes imputed SNPs) obtained from approximately 30 publicly available or contributed SNP sources, including Broad, Celera, Center for Genome Dynamics, Merck, Perlegen (NIEHS), Wellcome Trust, Novartis (GNF) and The Jackson Laboratory (MPD SNP Database: S11). MPD makes SNPs available in sets for RI panels (8 total), C57BL/6 substrains (13 variants) and 129 substrains (7 variants). In total, MPD has SNP data for >450 verified strains (Find Strains with SNPs: S2, S3). MPD combines SNP data with gene annotations from external resources such as Mouse Genome Informatics (MGI) (12), NCBI dbSNP (13) and Ensembl (14).

Gene expression data

In addition to new phenotypic and genotypic data, MPD has recently begun accessioning gene expression strain survey data. Currently, there are eight gene expression projects, representing ∼280K probesets. Four of these projects are part of a treatment/control paradigm, and phenotype data are available in parallel. These intervention studies are from two laboratories studying the effects of a high-fat diet [Shockley1 (15,16)] and an environmental insult (trichloroethylene exposure) [Rusyn2 (17,18)]. Thus far, data are from either whole brain or liver; efforts are underway to obtain data from other tissues and cell types. Most data sets are inbred strain surveys (10–30 strains), and one is for the ILSXISS recombinant inbred strain panel (61 strains) [GX-Tabakoff2 (19)]. Functionality of the new gene expression interface is showcased below in detail (see also Gene Expression: S12).

Since our last report, we have accessioned over 70 new phenotype or gene expression projects from over 100 investigators representing >40 institutions. See Table 2 for ‘What's New’ in MPD categories and intervention groups. Several of these new projects are important drug studies, including acetaminophen toxicity [Threadgill1, Threadgill2 (20–23)], haloperidol effects on behavior [Crowley1 (24,25)] and effects of β-adrenergic drugs on heart function [Maurer1 (26,27)] (Projects: S6; Interventions: S8). Most have associated publications, providing assurance that the study has been peer-reviewed and the data validated by experts.

FUNCTIONALITY

MPD provides many options to view, analyze and download data (MPD Toolbox Demo: S13). Since our last NAR update, we have made significant improvements and added new or expanded functionality to the MPD Toolbox. Here, we present a comprehensive example to demonstrate the power and utility of MPD and to highlight some of our new tools and features. We start with MPD phenotype data (IGF-1 levels at 6 months of age) and use new MPD tools to confirm and further narrow published mapping results (Yuan1) (28,29). This example spans Figures 1–3 and illustrates the new or updated features itemized with the figures below.

MPD has been exploring and evaluating methods for genome-wide association studies (GWAS) that may be developed into general tools for MPD users. One approach that has received considerable attention is ‘efficient mixed-model association’ (EMMA), which corrects for inbred strain population structure, thereby significantly reducing the number of false positive associations (30). GWAS results for selected MPD measurements (e.g. testing at least 20 strains) are available in collaboration with the UCLA Computer Science and Bioinformatics group (ZarLab) using EMMA and the Mouse HapMap for 94 inbred strains at 132K locations (2). Wild-derived strains are automatically excluded from these analyses (see more details in Figure 2). For ineligible MPD measurements (<20 strains), MPD provides an option to produce input-ready files specifically formatted for a real-time run on the UCLA EMMA Server (mouse.cs.ucla.edu/emmaserver). Users can choose to include wild-derived strains and are able to further manipulate the input file, e.g. to delete specific strains, and to select other SNP data sets for interrogation.
Figure 2.

New GWAS tools. Continuing with the IGF-1 example [Yuan1 (28)], we performed a genome-wide association analysis by clicking on the ‘GWAS’ option above the plot shown in Figure 1 (the relevant part of this panel is shown here for convenience). We found significant EMMA-corrected peaks for females on chromosomes 2, 10, 13, 14, 15, 17 and 18 (lower panel; green stars were manually overlaid on this plot to ensure visibility of peaks). Interestingly Leduc et al. (29) (the data submitter’s publication for Yuan1) found three of the same in silico QTLs using another method, haplotype association mapping (HAM), which employed the Hidden Markov Model and a set of SNPs at 70K locations (includes imputed calls). We chose to concentrate on the most significant peak identified by these methods on Chr 10 in the vicinity of the Igf1 gene. Using other supporting data sets and bioinformatics tools, Leduc et al. narrowed their results from over 45 genes annotated in the QTL interval to 21 candidate genes. We wondered if we might narrow these results even further by using another new MPD tool for finding significant gene expression correlations (see Figure 3). The new GWAS tool is provided in collaboration with the UCLA Computer Science and Bioinformatics group (ZarLab) (30). MPD is not responsible for the UCLA website or EMMA Server (mouse.cs.ucla.edu/emmaserver).

Improvement of various tools with new layouts, more options and links to other tools and information (Figure 1).
Figure 1.

Phenotype data: new layout, viewing options and tools. An MPD summary plot is shown from an aging study measuring IGF-1 levels in males and females of 33 inbred strains at 6 months of age [Yuan1 (28); see also accompanying paper (29)]. From this plot, users have several choices for viewing (see plot options), and there are one-click options for deploying relevant tools specific for this measurement (boxed selections just above the plot), which include finding other phenotype measurements that correlate significantly, finding genes with correlated expression patterns and performing EMMA-corrected genome-wide association analysis (GWAS). For most MPD measurements with at least 20 inbred strains tested, the GWAS option is a direct link to pre-computed results hosted on the UCLA EMMA website (see text). About the plot: strain means are shown (±1 SEM); dotted horizontal lines indicate 1 SD for the overall strain mean (red = female; blue = male). All words in blue are links to tools, other views or additional information. For a continuation of this example, see Figure 2 (GWAS) and Figure 3 (gene expression).

Development of interfaces to facilitate association mapping using the UCLA EMMA-Correction Server (see text and Figure 2).
Implementation of a user interface for working with gene expression data, with tools for finding significant correlations to phenotype data using an innovative variance-based data-screening feature to minimize false positives (Gene Expression: S12; Correlations Center: S14) (Figure 3).
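The variance-based screening mentioned above is described in the Figure 3 caption as a data point variance (DPV) filter that omits probesets whose standard errors are large relative to the range of strain means. The text does not give the formula, so the sketch below is only one plausible reading: the ratio of the average per-strain SEM to the range of strain means. The function names, the threshold and the data are all hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def dpv_ratio(values_by_strain):
    """Average per-strain SEM divided by the range of strain means.

    This definition is an assumption made for illustration, not MPD's
    documented DPV formula. Each strain needs at least two values.
    """
    means, sems = [], []
    for values in values_by_strain.values():
        means.append(mean(values))
        sems.append(stdev(values) / sqrt(len(values)))
    spread = max(means) - min(means)
    if spread == 0:
        return float("inf")  # no strain differences to detect
    return mean(sems) / spread


def screen(probesets, max_ratio):
    """Keep probesets whose ratio is at or below the cutoff; lower is stricter."""
    return [name for name, data in probesets.items()
            if dpv_ratio(data) <= max_ratio]


# Made-up expression values for two hypothetical probesets across three strains.
probesets = {
    "probe_clean": {"s1": [1.0, 1.2], "s2": [3.0, 3.2], "s3": [5.0, 5.2]},
    "probe_noisy": {"s1": [1.0, 3.0], "s2": [2.0, 4.0], "s3": [2.5, 4.5]},
}
kept = screen(probesets, max_ratio=0.25)
```

Under this reading, a probeset with tight replicates and well-separated strain means passes, while one whose within-strain noise rivals the between-strain spread is dropped before any correlation is computed.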
Figure 3.

New gene expression interface. Continuing with the IGF-1 example [Yuan1 (28)] from Figures 1 and 2, we interrogated the Chr 10 QTL interval (84.5–90.0 Mbp) identified by both association mapping methods (HAM and EMMA) using the new MPD gene expression interface. Users can simply click on the correlated probes/genes option above the plot shown in Figure 1 (the relevant part of this panel is shown in the upper-left panel for convenience), or users may choose to begin in the Correlations Center (S14). Of those 21 candidate genes identified by Leduc et al. (29) (see Figure 2), we went on to confirm that 6 of those genes had compelling expression profiles consistent with IGF-1 levels (P < 0.05). Three of these genes are listed in the middle-left panel (see Note 1 for more information about this panel). As indicated, three probesets from two gene expression projects [GX-Su1 (31,32), GX-Tabakoff1 (33)] were identified using various filtering and sorting options that are available with this tool. Users can adjust stringency to broaden or narrow results by selecting P-value cutoff and setting the data point variance (DPV) quality filter where stricter settings will omit probesets that have larger standard errors relative to the range of strain means. Users may further filter results by minimum number of strains, sex, specific data set and chromosome. Results can be sorted by P-value, correlation coefficient, location order, MPD project or MPD category. A phenotype-gene expression scatterplot is available through the ‘Plot’ link. IGF-1 versus Nr1h4 for males is shown in the lower-left panel. By clicking on gene name, we accessed the gene directory page (lower-right panel, and see Note 2 below), to deploy the 550K SNP panel [CGD2 (3,6)] specifically for the gene of interest to compare the segregation of phenotype with polymorphisms. 
Using MPD SNP tools, we confirmed that three of the six genes have polymorphic intronic or UTR regions, consistent with the notion that these genes are differentially regulated. The corresponding SNP results for Nr1h4 are shown (middle-right panel) where strains are grouped according to ‘hi’ (left) or ‘lo’ (right) IGF-1 levels. These strains fall into two major haplotypes (wild-derived strains fall into a third haplotype, not shown). For ease of comparison, an optional view of the male IGF-1 data is shown in the upper-right panel [color-guides from the SNP results to plot: blue = high IGF-1 levels, yellow = low IGF-1 levels (see also, strain distribution in Figure 1)]. All words in blue are links to tools, other views or additional information. While we demonstrate that MPD can help identify (or further verify) candidate genes exhibiting differential expression, we also note that caution should be used in making claims of causality based solely on microarray data, SNPs and functional relevance of positional candidates. All MPD phenotype-gene expression pair-wise correlations are available for bulk download (Download Center: S16).
Note 1: The ‘collect gene’ option in the middle of the gene expression results (middle-left panel) is one place where users can take advantage of our new ‘workbook’ feature for paring down and gathering candidate genes. Lists are provided in a new window, and checkboxes are supplied to exclude (or include) genes in an updated list. At any time, users may opt to generate a separate clean list of gene symbols or various accession ID options to copy/paste into batch query forms of other bioinformatics resources.
Note 2: The gene directory box shown in the bottom-right panel is a hub for information and linkouts to other resources (additional features are added as needed).
We provide one-click deployment to SNP tools, finding MPD phenotype measurements that correlate, and information about available gene expression probesets, including tools to compare probesets between gene expression projects and to compare female and male for a single probeset within a project. Users will arrive at a gene directory page from any gene link on the MPD website, from gene searches and from general searches that turn up genes. Developers wanting to link to our gene directory page can use this URL template: http://phenome.jax.org/db/qp?rtn=markers/details&reqsym=Abcd1, where Abcd1 is the gene symbol (For Developers: S22).
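The gene directory URL template given in Note 2 can be filled in programmatically. The following is a minimal sketch; the helper name is hypothetical, the endpoint and parameter names are copied from the template as published, and whether the address still resolves is not checked here.

```python
from urllib.parse import urlencode

# Endpoint and query parameters as they appear in the published URL template.
MPD_QUERY_ENDPOINT = "http://phenome.jax.org/db/qp"


def gene_directory_url(symbol):
    """Build a gene directory link for one gene symbol (hypothetical helper)."""
    symbol = symbol.strip()
    if not symbol:
        raise ValueError("gene symbol must not be empty")
    # safe="/" keeps the slash in "markers/details" unescaped, as in the template.
    query = urlencode({"rtn": "markers/details", "reqsym": symbol}, safe="/")
    return f"{MPD_QUERY_ENDPOINT}?{query}"


template_example = gene_directory_url("Abcd1")
figure_example = gene_directory_url("Nr1h4")
```

Encoding the symbol through `urlencode` rather than string concatenation guards against symbols that contain characters needing escaping.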

Implementation of a ‘workbook’ feature to help users pare down and gather candidate genes with an option to generate an output of gene IDs for quick copy–paste into other bioinformatics resources (Figure 3).

In addition to the highlighted examples (Figures 1–3), MPD has also implemented the following since our last NAR update:
Correlations Center: centralizes, organizes and explains all MPD correlation operations on a single webpage, and provides convenient links to relevant tools (Correlations Center: S14).
Private Analysis Portal: users enter their own experimental data and run real-time regression analysis across the entire MPD to identify statistically significant correlations to existing phenotype and gene expression data (Private Analysis Portal: S15).
Tool to generate ratio or difference vectors for user-selected MPD measurements: users can view plots and identify independent MPD measurements that are highly correlated to the de novo generated vector (MPD Toolbox Demo: S13; Correlations Center: S14).
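The ratio or difference vector tool can be pictured as follows. This is a minimal sketch under stated assumptions: measurements are represented as dictionaries of strain means, Pearson correlation is used as the similarity measure, and all function names, measurement names and values are hypothetical rather than taken from MPD.

```python
from math import sqrt
from statistics import mean


def derive_vector(a, b, mode="ratio"):
    """Combine two measurements (strain -> mean) over the strains they share."""
    shared = sorted(set(a) & set(b))
    if mode == "ratio":
        return {s: a[s] / b[s] for s in shared}
    if mode == "difference":
        return {s: a[s] - b[s] for s in shared}
    raise ValueError("mode must be 'ratio' or 'difference'")


def pearson(xs, ys):
    """Pearson correlation; raises if either input has zero variance."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / sqrt(sxx * syy)


def rank_correlated(vector, measurements, min_strains=3):
    """Rank other measurements by absolute correlation with the derived vector."""
    results = []
    for name, means in measurements.items():
        shared = sorted(set(vector) & set(means))
        if len(shared) < min_strains:
            continue  # too few strains in common to be informative
        r = pearson([vector[s] for s in shared], [means[s] for s in shared])
        results.append((name, r, len(shared)))
    return sorted(results, key=lambda item: abs(item[1]), reverse=True)


# Made-up strain means for illustration only.
fat_mass = {"A/J": 4.0, "C57BL/6J": 6.0, "DBA/2J": 3.0, "129S1/SvImJ": 5.0}
body_weight = {"A/J": 20.0, "C57BL/6J": 24.0, "DBA/2J": 20.0, "129S1/SvImJ": 25.0}
others = {
    "measure_x": {"A/J": 10.0, "C57BL/6J": 12.0, "DBA/2J": 8.0, "129S1/SvImJ": 10.0},
    "measure_y": {"A/J": 1.0, "C57BL/6J": 2.0, "DBA/2J": 3.0, "129S1/SvImJ": 1.5},
    "measure_z": {"A/J": 7.0, "C57BL/6J": 9.0},
}
fat_fraction = derive_vector(fat_mass, body_weight, mode="ratio")
ranked = rank_correlated(fat_fraction, others)
```

Restricting each comparison to shared strains, and skipping measurements with too few of them, reflects the fact that different projects test different strain sets.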

Additional supplementary data

Individual animal data, summary statistics and other metadata are available for downloading in bulk or in customized data sets specified by the user (Download Center: S16; SNP Downloads: S11). For convenience, we are providing Supplemental Material for general content and information (MPD Content and Quick Facts: S17) and a user's manual (MPD User Manual and FAQ: S18).

HIGH-LEVEL OVERVIEW OF IMPLEMENTATION

The specifics of our implementation have not changed significantly since our previous update (11). MPD runs on a SUSE Linux/Apache 2 system using open source database, HTML generation and graphics generation software, with custom-written C modules used in performance-critical areas. Phenotype data sets are typically submitted by data contributors as Excel spreadsheets. Genotype and gene expression data sets are usually downloaded by us from open public resources. Ad hoc processing methods are used to integrate these data into our standardized framework. In conjunction with data contributors, MPD staff creates supporting documentation in HTML. We add content and perform testing and QA on an internal development node; then this instance is copied over for public release. MPD follows an agile development philosophy with rapid prototyping and frequent updates; version numbering is not used. For more documentation, see MPD ‘BioDBcore Statement’ (S19). Information for developers is also available (For Developers: S20).

FUTURE MPD DIRECTIONS AND AN INVITATION TO INVESTIGATORS

MPD will continue implementing useful linkages to other databases and populating the database with phenotype, genotype and gene expression data. We continue to refine MPD data classifications with relevant ontologies, optimizing search and navigation functionality as well as facilitating data integration within MPD and across databases. We are in the process of loading Sanger SNPs, indels and structural variants derived from whole-genome sequencing of 17 inbred mouse strains (7,8), and we will take advantage of the availability of the RNA-seq data being released by Sanger on this set of strains. RNA-seq data will complement existing MPD data and will enable queries to determine expression levels of different alleles of a gene and to detect post-transcriptional mutations or other genetic alterations. Many phenotypic domains are currently represented in MPD; however, the acquisition of new data is open-ended with the goal of collecting data on a broader scope (and in some cases to a deeper level for more granular phenotypes or endophenotypes) as well as collecting data generated from new, more sophisticated phenotyping technologies. Although we have added a significant number of phenotypes since our last update (Table 2), we will continue working toward accessioning more data for inbred strains and other strain types as well as for new panels such as the Collaborative Cross (34–36). We would like to take this opportunity to make an appeal for clinically relevant data in toxicology and specific disease areas: cancer, digestive diseases, kidney disease, neurological disorders, drug addiction and infectious diseases. To expand the scope and maximize the utility of MPD, members of the global scientific community are invited to contribute their strain survey data or join us in a coordinated effort to seek funding that will support these studies. This spirit of collaboration has shaped and will continue to guide the future growth and development of MPD. 
Researchers interested in contributing data or in collaborating on new phenotyping projects should contact us at phenome@jax.org. Data submission guidelines are accessible through the MPD homepage (Contributing Data: S21).

CITING MPD AND USER SUPPORT

For general citation of MPD, this NAR article may be used. In addition, the following citation format is preferred when referring to MPD projects or using MPD data sets (for more information, visit our website and search on ‘citing’):

Investigator(s) name (year project posted) Project title. MPD project symbol (e.g. Smith1) and/or accession number (MPD:XXX). Mouse Phenome Database Website, The Jackson Laboratory, Bar Harbor, ME, USA. (URL: http://phenome.jax.org), date of download or access.

MPD provides user support through online documentation (S17, S18) and via email (phenome@jax.org). We welcome user input and suggestions, which can be submitted anonymously through our Suggestion Box links at the bottom of most MPD webpages.
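
Because the preferred project citation described in this section follows a fixed pattern, it can be assembled mechanically. The Python sketch below shows one possible rendering; the function `format_mpd_citation` and its parameter names are hypothetical, the way the project symbol and accession number are joined is an assumption, and the example values (investigator, year, title, access date) are placeholders rather than a real MPD project.

```python
from typing import Optional

# Fixed trailing portion of the citation, copied from the format in the text.
MPD_SOURCE = ("Mouse Phenome Database Website, The Jackson Laboratory, "
              "Bar Harbor, ME, USA. (URL: http://phenome.jax.org)")


def format_mpd_citation(investigators: str, year_posted: int, title: str,
                        accessed: str,
                        project_symbol: Optional[str] = None,
                        accession: Optional[str] = None) -> str:
    """Assemble a project citation string (hypothetical helper)."""
    # The format allows the project symbol and/or the accession number.
    identifiers = [item for item in (project_symbol, accession) if item]
    if not identifiers:
        raise ValueError("give a project symbol, an accession number, or both")
    clean_title = title.strip().rstrip(".")  # avoid a doubled full stop
    return (f"{investigators} ({year_posted}) {clean_title}. "
            f"{', '.join(identifiers)}. {MPD_SOURCE}, {accessed}.")


if __name__ == "__main__":
    # Placeholder values only; "MPD:XXX" is the placeholder used in the text.
    print(format_mpd_citation("Smith J", 2010, "Example strain survey",
                              "accessed 1 January 2012",
                              project_symbol="Smith1", accession="MPD:XXX"))
```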

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online: Supplementary Table 1.

FUNDING

The Jackson Laboratory and National Institutes of Health (grant numbers HG003057, HL66611, AG025707, AG038070, MH071984, DA028420). MPD houses and maintains data from research projects supported by over 80 funding agencies and research foundations worldwide (Funding: S22). Funding for open access charge: National Institutes of Health (DA028420). Conflict of interest statement. None declared.

REFERENCES (10 of 23 shown)

Review 1.  A mouse phenome project.

Authors:  K Paigen; J T Eppig
Journal:  Mamm Genome       Date:  2000-09       Impact factor: 2.957

2.  Genetic analysis in the Collaborative Cross breeding population.

Authors:  Vivek M Philip; Greta Sokoloff; Cheryl L Ackert-Bicknell; Martin Striz; Lisa Branstetter; Melissa A Beckmann; Jason S Spence; Barbara L Jackson; Leslie D Galloway; Paul Barker; Ann M Wymore; Patricia R Hunsicker; David C Durtschi; Ginger S Shaw; Sarah Shinpock; Kenneth F Manly; Darla R Miller; Kevin D Donohue; Cymbeline T Culiat; Gary A Churchill; William R Lariviere; Abraham A Palmer; Bruce F O'Hara; Brynn H Voy; Elissa J Chesler
Journal:  Genome Res       Date:  2011-07-06       Impact factor: 9.043

Review 3.  The collaborative cross: a recombinant inbred mouse population for the systems genetic era.

Authors:  David W Threadgill; Darla R Miller; Gary A Churchill; Fernando Pardo-Manuel de Villena
Journal:  ILAR J       Date:  2011

4.  Interstrain differences in the liver effects of trichloroethylene in a multistrain panel of inbred mice.

Authors:  Blair U Bradford; Eric F Lock; Oksana Kosyk; Sungkyoon Kim; Takeki Uehara; David Harbourt; Michelle DeSimone; David W Threadgill; Volodymyr Tryndyak; Igor P Pogribny; Lisa Bleyle; Dennis R Koop; Ivan Rusyn
Journal:  Toxicol Sci       Date:  2010-12-06       Impact factor: 4.849

5.  Genetic analysis of complex traits in the emerging Collaborative Cross.

Authors:  David L Aylor; William Valdar; Wendy Foulds-Mathes; Ryan J Buus; Ricardo A Verdugo; Ralph S Baric; Martin T Ferris; Jeff A Frelinger; Mark Heise; Matt B Frieman; Lisa E Gralinski; Timothy A Bell; John D Didion; Kunjie Hua; Derrick L Nehrenberg; Christine L Powell; Jill Steigerwalt; Yuying Xie; Samir N P Kelada; Francis S Collins; Ivana V Yang; David A Schwartz; Lisa A Branstetter; Elissa J Chesler; Darla R Miller; Jason Spence; Eric Yi Liu; Leonard McMillan; Abhishek Sarkar; Jeremy Wang; Wei Wang; Qi Zhang; Karl W Broman; Ron Korstanje; Caroline Durrant; Richard Mott; Fuad A Iraqi; Daniel Pomp; David Threadgill; Fernando Pardo-Manuel de Villena; Gary A Churchill
Journal:  Genome Res       Date:  2011-03-15       Impact factor: 9.043

6.  Mouse Phenome Project: understanding human biology through mouse genetics and genomics.

Authors:  Molly Bogue
Journal:  J Appl Physiol (1985)       Date:  2003-10

7.  A sequence-based variation map of 8.27 million SNPs in inbred mouse strains.

Authors:  Kelly A Frazer; Eleazar Eskin; Hyun Min Kang; Molly A Bogue; David A Hinds; Erica J Beilharz; Robert V Gupta; Julie Montgomery; Matt M Morenzoni; Geoffrey B Nilsen; Charit L Pethiyagoda; Laura L Stuve; Frank M Johnson; Mark J Daly; Claire M Wade; David R Cox
Journal:  Nature       Date:  2007-07-29       Impact factor: 49.962

8.  Mouse genomic variation and its effect on phenotypes and gene regulation.

Authors:  Thomas M Keane; Leo Goodstadt; Petr Danecek; Michael A White; Kim Wong; Binnaz Yalcin; Andreas Heger; Avigail Agam; Guy Slater; Martin Goodson; Nicholas A Furlotte; Eleazar Eskin; Christoffer Nellåker; Helen Whitley; James Cleak; Deborah Janowitz; Polinka Hernandez-Pliego; Andrew Edwards; T Grant Belgard; Peter L Oliver; Rebecca E McIntyre; Amarjit Bhomra; Jérôme Nicod; Xiangchao Gan; Wei Yuan; Louise van der Weyden; Charles A Steward; Sendu Bala; Jim Stalker; Richard Mott; Richard Durbin; Ian J Jackson; Anne Czechanski; José Afonso Guerra-Assunção; Leah Rae Donahue; Laura G Reinholdt; Bret A Payseur; Chris P Ponting; Ewan Birney; Jonathan Flint; David J Adams
Journal:  Nature       Date:  2011-09-14       Impact factor: 49.962

9.  Sequence-based characterization of structural variation in the mouse genome.

Authors:  Binnaz Yalcin; Kim Wong; Avigail Agam; Martin Goodson; Thomas M Keane; Xiangchao Gan; Christoffer Nellåker; Leo Goodstadt; Jérôme Nicod; Amarjit Bhomra; Polinka Hernandez-Pliego; Helen Whitley; James Cleak; Rebekah Dutton; Deborah Janowitz; Richard Mott; David J Adams; Jonathan Flint
Journal:  Nature       Date:  2011-09-14       Impact factor: 49.962

10.  Antipsychotic-induced vacuous chewing movements and extrapyramidal side effects are highly heritable in mice.

Authors:  J J Crowley; D E Adkins; A L Pratt; C R Quackenbush; E J van den Oord; S S Moy; K C Wilhelmsen; T B Cooper; M A Bogue; H L McLeod; P F Sullivan
Journal:  Pharmacogenomics J       Date:  2010-11-16       Impact factor: 3.550

