
Genome-wide generation and systematic phenotyping of knockout mice reveals new roles for many genes.

Jacqueline K White¹, Anna-Karin Gerdin, Natasha A Karp, Ed Ryder, Marija Buljan, James N Bussell, Jennifer Salisbury, Simon Clare, Neil J Ingham, Christine Podrini, Richard Houghton, Jeanne Estabel, Joanna R Bottomley, David G Melvin, David Sunter, Niels C Adams, David Tannahill, Darren W Logan, Daniel G Macarthur, Jonathan Flint, Vinit B Mahajan, Stephen H Tsang, Ian Smyth, Fiona M Watt, William C Skarnes, Gordon Dougan, David J Adams, Ramiro Ramirez-Solis, Allan Bradley, Karen P Steel.

Abstract

Mutations in whole organisms are powerful ways of interrogating gene function in a realistic context. We describe a program, the Sanger Institute Mouse Genetics Project, that provides a step toward the aim of knocking out all genes and screening each line for a broad range of traits. We found that hitherto unpublished genes were as likely to reveal phenotypes as known genes, suggesting that novel genes represent a rich resource for investigating the molecular basis of disease. We found many unexpected phenotypes detected only because we screened for them, emphasizing the value of screening all mutants for a wide range of traits. Haploinsufficiency and pleiotropy were both surprisingly common. Forty-two percent of genes were essential for viability, and these were less likely to have a paralog and more likely to contribute to a protein complex than other genes. Phenotypic data and more than 900 mutants are openly available for further analysis.

Year: 2013 PMID: 23870131 PMCID: PMC3717207 DOI: 10.1016/j.cell.2013.06.022

Source DB: PubMed Journal: Cell ISSN: 0092-8674 Impact factor: 41.582

Introduction

The availability of well-annotated genome sequences for a variety of organisms has provided a strong foundation on which much biological knowledge has been assembled, including the generation of comprehensive genetic resources. This has been achieved in several model organisms, including E. coli, S. cerevisiae, S. pombe, A. thaliana, C. elegans, and D. melanogaster, greatly facilitating studies focused on single genes and enabling genome-wide genetic screens. Annotation of the human genome has identified over 20,000 protein-coding genes as well as many noncoding RNAs. Despite the dramatic increase in the knowledge of variation in human genomes, the normal function of many genes is still unknown or predicted from sequence analysis alone, and consequently, the disease significance of rare variants remains obscure. Furthermore, there remains a large bias toward research on a small number of the best-known genes (Edwards et al., 2011). Realizing the full value of the complete human genome sequence requires broadening this focus, and the availability of comprehensive biological resources will facilitate this process. The mouse is a key model organism for assessing mammalian gene function, providing access to conserved processes such as development, metabolism, and physiology. Genetic studies in mice, mostly via targeted mutagenesis in ES cells, have described a function for 7,229 genes (ftp://ftp.informatics.jax.org/pub/reports/MGI_PhenotypicAllele.rpt, February 2013). The vast majority of these studies have been directed at previously studied (known) genes, driven by previous biological knowledge. Phenotype-driven screens have also identified genes associated with specific phenotypes, although to a smaller extent. Although targeted mutagenesis has been very successful, the global distribution of the effort has resulted in significant heterogeneity in allele design, genetic background of mice used, and their phenotypic analysis. 
Furthermore, the biological focus of most targeted knockout experiments is constrained by the expertise of the specific research group. As a result, many phenotypes have not been detected, and consequently, the full biological function of many genes studied using knockout mice is significantly underreported. Some efforts to generate and phenotype sizeable sets of new targeted alleles of genes of interest have been reported previously (e.g., Tang et al., 2010). These studies focused on specific categories of molecules such as secreted and transmembrane proteins or other “drugable” targets. Other research centers have established mouse clinics, with the aim of carrying out a comprehensive analysis of the phenotypes of mutant lines of specific interest (e.g., Fuchs et al., 2012; Wakana et al., 2009; Laughlin et al., 2012). The genome-wide set of targeted mutations in ES cells established by the KOMP, EUCOMM, and MirKO programs (Skarnes et al., 2011; Prosser et al., 2011; Park et al., 2012) provides an opportunity to conduct systematic, large-scale gene function analysis in a mammalian system without the variables inherent in studies by individual groups. The Sanger Institute’s Mouse Genetics Project (MGP) was one of the first programs to pursue this objective, established in 2006 when the first targeted ES cells became available. The MGP later expanded to contribute to a European phenotyping effort, EUMODIC, and more recently has become a founding member of the International Mouse Phenotyping Consortium (IMPC). Summaries of the developing efforts and aspirations of the IMPC have been reported (e.g., Brown and Moore, 2012; Ayadi et al., 2012). As the first established large-scale project using the KOMP/EUCOMM ES cells, the MGP has provided pilot data to inform the design of the international effort, such as the advantages of a single pipeline design, optimum numbers of mice, and details of variance for specific phenotyping tests. 
To date, the MGP has generated more than 900 lines of mutants using KOMP/EUCOMM resources (http://www.sanger.ac.uk/mouseportal/), and here, we describe the analysis of 489 of these for viability and fertility and 250 lines that have passed through a systematic screen for adult phenotypes, providing a glimpse into the wealth of biological insight that will emerge from these programs. Publicly available data enable the construction of new hypotheses, and the mouse mutants provide an invaluable resource for follow-up studies.

Results

Genes and Alleles

Mice carrying targeted knockout first conditional-ready alleles from the KOMP/EUCOMM ES cell resources (Figures S1A and S1B available online; Skarnes et al., 2011) were established on a C57BL/6 genetic background. The mutants generated are listed in Tables S1 and S2, and all are available through public repositories including EMMA (http://www.emmanet.org/) and KOMP (http://www.komp.org/). Two classes of alleles are represented: those targeted with a promoter-driven selectable marker, and those with promoterless targeting vectors. Most are expected to be null alleles based on previous experience with this design (Mitchell et al., 2001; Testa et al., 2004). Data from 25 alleles showed that most (15) had <0.5% of normal transcript level detected in liver with a minority (4) showing a “leakiness” of ∼20% (column X; Table S2). The structure of each allele was confirmed when established in mice (Figure S1C).

Figure S1

Allele Design, Genotyping, and Chromosomal Distribution of Genes Selected, Related to Figure 1

(A and B) Examples of the allele designs used, illustrating the two main alleles: (A) Nsun2 contains a promoter-driven targeting vector, and (B) Smc3 contains a promoterless targeting vector [gene build Mouse NCBIM37 (Ensembl 66: Feb 2012)]. The promoterless allele design is biased toward genes that are expressed in ES cells. The alleles are expected to be null alleles, but assessment of the degree of knockdown and the extent of off-target effects on nearby genes has not been carried out systematically.

(C) Genotyping and quality control of mice. ES cells: Long-range (LR) PCR, using one primer in the cassette and another outside of the homology arms of the allele design, was used to confirm the targeting on either the 3′ or 5′ side of the vector prior to micro-injection. Mice: To determine the genotype and confirm gene identity, three short-range PCR assays were used: mutant allele-specific, wild-type allele-specific and to detect the lacZ gene. Targeting was confirmed by either LRPCR, loss of the wild-type specific short-range PCR product in homozygotes or a qPCR assay confirming loss of the wild-type allele. Presence of the 3′ LoxP site was detected by either qPCR or short-range PCR assays. Further details of the QC protocols are available from: http://www.knockoutmouse.org/kb/25/. Initially mice were genotyped using a combination of the three short-range PCR assays, but to facilitate high-throughput, we later switched to a qPCR neo cassette counting-based system. Initial genotyping was carried out using ear punches from ∼14 day old mice, so that mice of the desired genotypes for screening could be identified and weaned together. Genotyping was repeated at the far-end of the pipeline after culling, and data were only accepted from mice for which the second genotype was concordant with the 14 day genotype.

(D) Genomic distribution of genes studied. An illustration of the mouse karyotype showing the location of genes targeted (red arrowheads) across all chromosomes except Y.

Viability

Viability was assessed at postnatal day 14 (P14) by genotyping offspring of heterozygous crosses (Figure 1A). Data from 489 targeted alleles are summarized in Figure 2A. Overall, 58% were fully viable, whereas 29% produced no homozygotes at P14 and were classed as lethal, consistent with the proportion of homozygous embryonic/perinatal lethal mutants reported by MGI (2,183 of 7,229 lines of mice [30%]; ftp://ftp.informatics.jax.org/pub/reports/MGI_PhenoGenoMP.rpt, February 2013). A further 13% produced homozygotes at a frequency of 13% or less and were considered to be subviable. Genes required for survival included alleles generated with both promoter-driven and promoterless selection cassettes, but the latter were significantly more likely to be lethal (Figure 2B; Table S3) despite a greater level of persistent gene expression (11 of 14 promoter-driven compared with 4 of 11 promoterless alleles with <0.5% expression; column X; Table S2).
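The classification described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual software: the function name, its parameters, and the example counts are hypothetical, while the thresholds (0% lethal, >0% and ≤13% subviable, >13% viable, minimum of 28 live progeny) are taken from the Figure 2A legend.

```python
def classify_viability(hom_count, total_progeny, min_progeny=28, subviable_max_percent=13):
    """Classify a line from P14 genotype counts of a heterozygous intercross.

    Returns "lethal", "subviable", or "viable", or None when fewer than
    `min_progeny` live progeny have been genotyped.
    """
    if total_progeny < min_progeny:
        return None  # too few progeny to assign a viability status
    if hom_count == 0:
        return "lethal"
    # Compare as integers to avoid floating-point trouble at the 13% boundary
    if hom_count * 100 <= subviable_max_percent * total_progeny:
        return "subviable"
    return "viable"


# Hypothetical counts, for illustration only
print(classify_viability(0, 60))   # no homozygotes among 60 progeny
print(classify_viability(5, 60))   # 8.3% homozygotes
print(classify_viability(14, 60))  # 23.3% homozygotes
print(classify_viability(3, 20))   # fewer than 28 progeny
```

Under these assumptions the four calls give lethal, subviable, viable, and no call, respectively.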

Figure 1

Illustration of the Phenotyping Pipelines

(A) An overview of the typical workflow from chimera to entry into phenotyping pipelines, encompassing homozygous (Hom) viability, fertility, and target gene expression profiling using the lacZ reporter. Het, heterozygous.

(B) The Sanger Institute MGP clinical phenotyping pipeline showing tests performed during each week. Seven male and seven female mutant mice are processed for each allele screened. In addition, seven male and seven female WT controls per genetic background are processed every week.

See also Figure S1 and Tables S1 and S2.

Figure 2

Homozygous Viability and Fertility Overview

(A) Homozygous viability at P14 was assessed in 489 EUCOMM/KOMP targeted alleles. A minimum of 28 live progeny were required to assign viability status. Lines with 0% homozygotes were classed as lethal, >0% and ≤13% as subviable, and >13% as viable.

(B) Comparison of homozygous viability data from targeted alleles carrying either a promoter-driven or promoterless neomycin selection cassette.

(C) Lines classed as lethal or subviable at P14 were further assessed for viability at E14.5. Of the 205 targeted alleles eligible for this recessive lethality screen, 143 are reported here. A total of 28 embryos were required to assign viability status, and outcomes were categorized by both the number and dysmorphology of homozygous offspring.

(D) A basic dysmorphology screen encompassing 12 parameters was performed on all embryos for the 75 targeted alleles classed as viable or subviable at E14.5. A total of 34 targeted alleles showed one or more abnormality, and the percentage incidence is presented.

(E–G) Examples of E14.5 dysmorphology (arrowheads indicate abnormalities) are presented. Homozygous progeny were detected at a Mendelian frequency in all three examples. Sixty-seven percent (six of nine) Mks1 embryos presented with edema, polydactyly, and eye defects (E). Sixty-two percent (five of eight) Spnb2 embryos presented with edema and hemorrhage (F). Eighty-six percent (six of seven) Psat1 embryos presented with growth retardation, exencephaly, and craniofacial abnormalities (G).

(H) Fertility was assessed in homozygous viable lines (307 mouse lines assessed from a total of 331 eligible lines). At least four independent 6-week-old mice of each sex were mated for a minimum of 6 weeks, and if progeny were born, the line was classed as fertile, regardless of whether the progeny survived to weaning. Of note is the strong skew toward male (blue circle) fertility issues (15 of 16 genes) compared to 4 of 15 genes that displayed female (red circle) fertility issues.

Fertility

Fertility of heterozygotes was assessed from heterozygous intercrosses. Of 489 alleles assessed, all heterozygotes were able to produce offspring. Homozygous mutants for 307 of the viable lines were then assessed. A homozygous infertility rate of 5.2% (n = 16) was observed (Figure 2H), strongly male biased with 15 of 16 genes exhibiting male infertility. A total of 11 genes affected only males, whereas just 1 was female specific (Pabpc1l). Of these 16 genes, 7 have not previously been associated with infertility. Although some were good candidates such as Usp42, expressed during mouse spermatogenesis (Kim et al., 2007), others are novel genes such as 3010026O09Rik and may suggest new pathways or mechanisms influencing fertility.

Adult Phenotypes

We report here the results of our screen of the first 250 lines to complete all primary phenotyping pipelines. In contrast to previous focused screens by Mitchell et al. (2001) and Tang et al. (2010), a broad range of gene products was included. The 250 genes reported span all chromosomes except Y (Figure S1D) and include eight control lines published previously and 87 genes proposed by the research community. For 34 of the 250 genes, no functional information has been published. A comparison of this gene set with all mouse genes indicates minimal GO term enrichment spread over a variety of processes and underrepresentation only in sensory perception of smell, indicating that the gene set can be regarded as a reasonable sample of the genome. A series of tests was used (Figure 1B), designed to detect robust variations in phenotypes that were key indicators of a broad spectrum of disease categories. Of the 250 reported lines, 104 were either lethal or subviable; most of these were screened as heterozygotes (n = 90), and the remaining lines were screened as homozygotes and/or hemizygotes (n = 160). All mutant lines generated passed through all primary phenotypic screens. For most tests in the pipeline, seven males and seven females were used, tested in small batches so that the data for each genotype were gathered on different days (Figure S2A). Assays culminated in the collection of samples at 16 weeks of age (Table S4). The primary screen included a high-fat diet challenge to exacerbate any latent phenotypes. Separate pipelines included challenges with two infectious agents: Salmonella Typhimurium and Citrobacter rodentium (Table S4).

Figure S2

Batch Size and Baseline Variation over Time, Related to Experimental Procedures

(A) Batch size of mutant mice. Frequency distribution of cohort size of mice of the same genotype issued to the phenotyping pipeline at a time. For each mutant allele, typically 3 mice of a defined sex and zygosity were issued to the Clinical Phenotyping Pipeline at one time. However, the number ranged from 1-8 mice issued in a single batch or cohort.

(B) Baseline variation over time. Example of baseline week to week variation seen in the control data. Example shown is red blood cell count presented weekly from 02/04/09 to 29/10/10 for male mice for the strain group B6Brd;B6Dnk;B6N-Tyr. Each boxplot represents data collected from control mice in one week. The size of this effect is significant as shown by some of the box plots not overlapping each other, indicating a Cohen’s d > 3. The pale green area indicates the 95% reference range calculated from the 2.5 and 97.5 percentile values as the data accumulate. Red arrows show the cumulative total of animals contributing to the reference range from 55 mice in May 2009 up to 623 mice in October 2010. The reference range becomes stable after about 70 control mice.

Phenotypic data from the first 250 mutant alleles through the adult pipelines are summarized in Tables S1 and S2, with significant differences from the control baseline (hits) indicated by a red box. To make robust phenotypic calls, a reference range method was implemented that uses accumulated wild-type (WT) data to identify and refine the 95% reference range (Figure S2B). Mutant data were compared to the relevant reference range and variant phenotypes determined using a standardized set of rules (Figure S3). We aimed to highlight phenotypes with large effect sizes. This approach results in conservative calls and minimizes false positives. There was very little missing data (2.14% of all calls; Table S2). The maximum number of parameters collected per line was 263. Of these, 147 were categorical variables, for example normal or abnormal teeth, whereas 116, such as plasma magnesium levels, exhibited a continuous distribution from which outliers were identified. Examples of parameters with continuous variables (cholesterol, high-density lipoprotein [HDL], low-density lipoprotein [LDL], mean weights, and auditory brainstem responses [ABR]) are illustrated in Figure 3.
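As a rough illustration of the reference range idea for continuous parameters, the sketch below derives a 95% range from the 2.5 and 97.5 percentiles of accumulated WT values and reports the fraction of mutant values falling outside it. It is a simplification under stated assumptions: the function names, the linear-interpolation percentile, and all data values are hypothetical, and the standardized rules used to call a hit (Figure S3) are not reproduced here.

```python
def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in 0..100) of pre-sorted values."""
    if not sorted_values:
        raise ValueError("no data")
    rank = (len(sorted_values) - 1) * p / 100
    low = int(rank)
    high = min(low + 1, len(sorted_values) - 1)
    return sorted_values[low] + (sorted_values[high] - sorted_values[low]) * (rank - low)


def reference_range(wt_values):
    """95% reference range: the 2.5 and 97.5 percentiles of the WT baseline."""
    ordered = sorted(wt_values)
    return percentile(ordered, 2.5), percentile(ordered, 97.5)


def fraction_outside(mutant_values, ref_range):
    """Fraction of mutant values lying outside the reference range."""
    low, high = ref_range
    outside = [v for v in mutant_values if v < low or v > high]
    return len(outside) / len(mutant_values)


# Hypothetical baseline and mutant cohort, for illustration only
wt_baseline = list(range(101))            # 101 WT values: 0, 1, ..., 100
mutants = [98, 99, 101, 50, 99, 100, 102]  # seven mutant mice

ref = reference_range(wt_baseline)
print(ref)                                 # (2.5, 97.5)
print(fraction_outside(mutants, ref))      # 6 of 7 values fall outside
```

Because the range is recomputed from whatever WT data have accumulated, the same functions can be rerun as the baseline grows.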

Figure S3

Decision-Making Process for Calling Hits, Related to Experimental Procedures

The figures show the process we used to call significant hits for three different types of data: (A) continuous, (B) time course and (C) categorical.

Figure 3

Data Distributions for Selected Parameters

(A–F) Distribution of mean total cholesterol (A and B), mean HDL cholesterol (C and D), and mean LDL cholesterol (E and F) at 16 weeks of age in both sexes for 250 unique alleles. Outliers are identified by gene name. The insets in (A)–(F) present the data for one outlier, Sec16b (red circles represent individual mice), compared to the WT controls processed during the same week (green circles), and a cumulative baseline of all WT mice of that age, sex, and genetic background (>260 WT mice) is presented as the median and 95% confidence interval.

(G and H) Distribution of mean body weight at 16 weeks in (G) female and (H) male mutant lines of mice. Outliers are identified by gene name.

(I) Distribution of mean click ABR threshold at 14 weeks (typically n = 4, independent of sex). Outliers are identified by gene name including positive controls highlighted in red.

Gene expression was examined by whole-mount lacZ reporter gene expression in 41 tissues and organs of adults, typically using heterozygotes (≥6 weeks old; n = 243 lines; Table S1). Ubiquitous expression was recorded for eight lines (3.3%) and complete absence of expression in nine lines (3.7%). Of the remaining lines, 168 (69.1%) showed expression in <20 of the tissues, suggesting a relatively specific expression pattern, whereas 58 (23.9%) were more broadly expressed (≥20 tissues with detectable lacZ expression). The data and images can be viewed on the Sanger Institute’s mouse portal, accompanied by step-by-step examples of how to access the data (http://www.sanger.ac.uk/mouseportal/). Much of the raw data can be downloaded from the MGP Phenotyping Biomart (http://www.sanger.ac.uk/htgt/biomart/martview/) for further analysis. Summaries can be found by searching for each gene of interest in Wikipedia (http://en.wikipedia.org/wiki/Category:Genes_mutated_in_mice) and Mouse Genome Informatics (http://www.informatics.jax.org/).

Many Unexpected Phenotypes Discovered

A few examples of the wide range of phenotypes we discovered are illustrated in Figure 4. Body weight and fat/lean composition were among the most common anomalies, with both overweight (n = 2) and underweight (n = 21) mutants discovered. The Kptn mutant is an example of an unexpected phenotype. Kptn is a putative actin binding protein proposed as a candidate for deafness because it is expressed in sensory hair cells (Bearer et al., 2000). Instead, the homozygous Kptn mutant has increased body weight on a high-fat diet (Figure 4A) and increased bacterial counts following Salmonella Typhimurium challenge but normal hearing (Table S2). Additional new phenotypes were detected in genes that had been published previously, such as reduced grip strength and ankylosis of the metacarpophalangeal joints in Dnase1l2 mutants (Fischer et al., 2011), delayed response in the hot plate test in Git2 mutants (Schmalzigaug et al., 2009), and small sebaceous glands in Cbx7 mutants (Forzati et al., 2012) (Figures 4B–4E, 4G, and 4H). Phenotypes were also detected in genes that had not been published previously, such as impaired hearing in Fam107b mutants and elevated plasma magnesium concentration in Rg9mtd2 mutants (Figures 4F and 4I, respectively). These examples demonstrate that many phenotypes will be missed unless they are specifically looked for and illustrate the value of carrying out a broad range of screens with all mutants going through all screens. They also reveal our collective inability to predict phenotypes based on sequence or expression pattern alone.

Figure 4

Examples of Novel Phenotypes from a Wide Range of Assays with Particular Focus on Novel Genes

(A) Elevated body weight gain of Kptn females (n = 7) fed a high-fat diet from 4 weeks of age. Mean ± SD body weight is plotted against age for Kptn females (red line) and local WT controls run during the same weeks (n = 16; green line). The median and 95% reference range (2.5% and 97.5%; dotted lines) for all WT mice of the same genetic background and sex (n = 956 females) are displayed on the pale green background.

(B) Reduced grip strength in Dnase1l2 males (n = 7) (red symbols) compared with controls (n = 8) (green symbols) and the reference range (n = 289). Each mouse is represented as a single symbol on the graph. Median, 25th and 75th percentile (box), and the lowest and highest data point still within 1.5× the interquartile range (IQR) (whiskers) are shown.

(C and D) Ankylosis of the metacarpophalangeal joints (arrowheads) shown by X-ray in Dnase1l2 mice (C) (six of seven males; five of seven females) compared with WT controls (D) correlates with reduced grip strength (B).

(E) Increased latency to respond to heat stimulus in Git2 females (n = 6) (red symbols) compared with controls (n = 4) (green symbols) and the reference range (n = 115), with box and whisker plots on the left (see Figure 4B legend).

(F) Mild hearing impairment at the middle range of frequencies in Fam107b mutants (n = 8) (red line shows mean ± SD) compared with controls (n = 10) and the reference range (n = 440).

(G) Smaller sebaceous glands (indicated by bracket) in Cbx7 mutant tail skin hairs compared with WT (H).

(I) Increased plasma magnesium levels in Rg9mtd2 males (n = 8) (red symbols) compared with local controls (n = 15) (green symbols) and the reference range (n = 241), with box and whisker plots on the left (see Figure 4B legend).

(J) Decreased lean mass in Atp5a1 females (n = 3) (blue symbols) compared with local controls (n = 15) (green symbols) and the reference range (n = 757), with box and whisker plots on the left (see Figure 4B legend).

(K and L) Histopathology showed opacities in the vitreous of eyes from Asxl1 mice (K) (arrowheads; scale bar, 500 μm) compared with empty vitreous in WT (L).

(M and N) Higher magnification revealed round opacities extending from the lens into the vitreous (arrowheads; scale bar, 50 μm) in Asxl1 mice (M) compared with a normal lens contained within the lens capsule in WT mice (N).

Haploinsufficient and Nonessential Genes

Haploinsufficient phenotypes were detected in 38 of the 90 lines screened as heterozygotes (42%). Thus, haploinsufficiency is relatively common, suggesting that screening heterozygotes of knockout lines can yield valuable insight into gene function and provide models for dominantly inherited human disorders. All 90 genes screened as heterozygotes had at least 1 hit (usually viability) and together gave a total of 181 hits (ranging from 1 to 14 per line), an average of 2.0 per line, or 1.0 per line if we consider that abnormal viability is a feature of the homozygote. The distributions of phenotypic hits are shown in Figures 5A and 5B. Two examples of haploinsufficiency are illustrated in Figures 4J–4N.

Figure 5

Characteristics of Phenotypic Hits Detected

(A) Distribution of the number of phenotypic hits in each line screened as homozygotes showing the peak at no hits but a long tail of lines with multiple hits up to 41.

(B) Distribution of hits in lines screened as heterozygotes; all lines had at least one hit (for viability) with a spread up to 14 hits.

(C) Distribution of lines with hits in different disease areas showing a peak of lines with just one area affected (colors indicate which areas) but some lines with multiple disease areas involved, indicating a high degree of pleiotropy.

(D) Principal component analysis score scatterplot showing the deviation of each gene from the first two principal components to visualize the clustering in genes within the multidimensional space. The black ovoid represents the Hotelling’s T² 95% confidence limits. Colored ovoids mark four different clusters of mutant lines. The two main principal components (or latent variables) in the model are significant in explaining 19.2% and 11.7% of the variation, respectively, and are predictive.

(E) Principal component analysis contribution plot indicating the contribution of the variables to the separation between the red and green clusters compared to the blue and yellow clusters in (D). Major phenotypic contributions are labeled.

Key to variables is presented in Table S7.

A total of 837 phenotypic variants were detected in the 250 mutant lines, 1.27% of the total calls (Tables S1 and S2). Of the lines screened as homozygotes or hemizygotes, 35% (56 of 160) appeared completely normal in our screen. There are several possible explanations for the lack of a detected phenotype, such as incomplete inactivation of the gene, a subtle change in phenotype not detected by our screen, or the gene may be nonessential. So far, there is no overlap between the 56 mouse lines with no detected phenotype and genes homozygously inactivated in humans, but both data sets are limited in coverage to date (MacArthur et al., 2012). The remaining 104 homozygous/hemizygous lines gave a total of 656 hits (range 1–41 per line), an average of 6.3 hits per line.
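The hit totals and averages quoted for heterozygous and homozygous/hemizygous lines can be cross-checked with simple arithmetic. The short sketch below uses only counts stated in the text; the variable names are ours.

```python
# Counts reported in the text
het_lines, het_hits = 90, 181
hom_lines, hom_lines_no_hits, hom_hits = 160, 56, 656

hom_lines_with_hits = hom_lines - hom_lines_no_hits
total_hits = het_hits + hom_hits

summary = {
    "total_hits": total_hits,
    "hom_lines_with_hits": hom_lines_with_hits,
    "percent_hom_lines_no_hits": round(100 * hom_lines_no_hits / hom_lines, 1),
    "mean_hits_per_het_line": round(het_hits / het_lines, 1),
    # Each heterozygous line carries one viability hit, so subtract one per line
    "mean_non_viability_hits_per_het_line": round((het_hits - het_lines) / het_lines, 1),
    "mean_hits_per_hom_line_with_hits": round(hom_hits / hom_lines_with_hits, 1),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

The totals agree with the reported figures: 837 hits overall, 104 homozygous/hemizygous lines with at least one hit, and averages of 2.0, 1.0, and 6.3 hits per line.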

Sensitivity of the MGP Screen

To assess the sensitivity of our screen, the phenotypes were compared with published data on alternative alleles where available. A total of 91 of 250 genes had published data reported in MGI (Table S5), and for 61 of these, our observations detected features of the published phenotypes. Importantly, for 56 genes, a new phenotype was detected by our screen (column K, Table S5). For 31 genes, features of the published phenotype were assessed but not detected by our pipeline. For example, Asxl1 mice are published as being viable (Fisher et al., 2010), but we found that Asxl1 homozygotes were lethal, with none detected among 276 progeny from heterozygous intercrosses (χ² = 95.13, df = 2; p < 2.2 × 10⁻¹⁶). These discrepant cases may reflect differences in the allele and/or genetic background. In other cases (77 genes), the reported characteristics required a specialized test not included in our screen, such as the calcium signaling defect in cardiomyocytes of Anxa6 mutants (Hawkins et al., 1999).
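A χ² statistic with df = 2 is what a goodness-of-fit test of the three genotype classes against the 1:2:1 Mendelian expectation of a heterozygous intercross would produce. The sketch below shows that calculation, assuming this is the test behind the reported value. The text gives only the total (276 progeny) and the absence of homozygotes; the wild-type/heterozygote split used here (80/196) is an assumption, chosen as one split consistent with the reported statistic, and is not the published count.

```python
import math


def mendelian_chi_square(wt, het, hom):
    """Chi-square goodness-of-fit test against a 1:2:1 genotype ratio (df = 2)."""
    observed = (wt, het, hom)
    total = sum(observed)
    expected = (total * 0.25, total * 0.5, total * 0.25)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For df = 2 the chi-square survival function is exactly exp(-x / 2)
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value


# 276 progeny and 0 homozygotes are from the text; the 80/196 split is assumed
chi2, p = mendelian_chi_square(wt=80, het=196, hom=0)
print(f"chi2 = {chi2:.2f}, p = {p:.1e}")
```

The p value computed this way is far below 2.2 × 10⁻¹⁶, a bound that some statistical software prints in place of very small p values.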

New Mouse Models for Human Disease

The data set reported here includes 59 orthologs of known human disease genes. We compared our data with human disease features described in OMIM (Table S6). Approximately half (27) of these mutants exhibited phenotypes that were broadly consistent with the human phenotype. However, many additional phenotypes were detected in the mouse mutants suggesting additional features that might also occur in patients that have hitherto not been reported. Interestingly, a large proportion of genes underlying recessive disorders in humans are homozygous lethal in mice (17 of 37 genes), possibly because the human mutations are not as disruptive as the mouse alleles. Of the 59 genes, 26 represent the first mouse mutant with publicly available data. Three examples (Sms, Ap4e1, and Smc3) representing the first targeted mouse mutant for each gene are illustrated in Figure 6, and all show similar phenotypic features to their human counterparts.

Figure 6

Correlated Disease Characteristics in Knockouts of Three Known Human Disease Genes

(A–E) Male hemizygotes for the Sms mutation showed similar features to X-linked Snyder-Robinson syndrome.

(A) Reduced grip strength in Sms/Y mice (n = 8) (purple symbols) compared with WT controls (n = 30) (green symbols) and the reference range (n = 793). Each mouse is represented as a single symbol on the graph, with box and whisker plots on the left (see Figure 4B legend).

(B and C) Decreased lean mass (B) and bone mineral density (C) in Sms/Y mice (n = 8) (purple symbols) compared with controls (n = 27) (green symbols) and the reference range (n = 753), with box and whisker plots on the left (see Figure 4B legend).

(D and E) Lumbar lordosis shown by X-ray (seven of eight males) in Sms/Y (E) compared with WT (D).

(F–J) Ap4e1 mice displayed similarities to spastic quadriplegic cerebral palsy 4.

(F–I) Increased lateral ventricle area (arrowheads in F and G) and decreased corpus callosum span (solid lines in F and G) in Ap4e1 mice (G) compared with WT mice (F) with measurements plotted (mean ± SD) in (H) and (I), respectively (∗p < 0.05, ∗∗ p < 0.01; n = 3 mutant males and 34 WT males). Error bars in (H) and (I) are SD.

(J) Decreased rearing in Ap4e1 females (n = 7) (red symbols) compared with WT controls (n = 8) (green symbols) and the reference range (n = 180), with box and whisker plots on the left (see Figure 4B legend).

(K–O) Surviving Smc3 mice showed similar features to Cornelia de Lange syndrome 3.

(K) Decreased body weight in Smc3 females (n = 7) fed on high-fat diet. Mean ± SD body weight is plotted against age for Smc3 females (blue line), WT mice (n = 24; green line), and the reference range (n = 948).

(L and M) Distinct craniofacial abnormalities in Smc3 mice including upturned snout (M) (three of seven males, one of seven females), which was not observed in WTs (L) (n = 850 male and 859 female).

(N and O) The lacZ reporter gene revealed a distinct Smc3 expression pattern including (N) hair follicles and (O) key brain substructures, noteworthy because of the hirsutism and neurodevelopmental delay aspects of Cornelia de Lange syndrome 3.

Pleiotropic Effects of Mutations

The phenotypes detected in this study vary from discrete specific defects (e.g., decreased platelet cell number in Crlf3 mutants) to complex phenotypes in which many organ systems are involved (e.g., Spns2 homozygotes show eye, hearing, and immune defects; Nijnik et al., 2012). The distribution of phenotypic hits is shown in Figures 5A and 5B for homozygous and heterozygous mutants, respectively. The peak for homozygotes was the category with no detected abnormalities, whereas the second biggest group consists of mutants with just one phenotypic call. The lines examined as heterozygotes all have at least one hit (viability), but 20 lines have in addition one other abnormal phenotype, and a handful have several. Classifying parameters into five disease categories, we analyzed the distribution of disease areas represented across all 250 mouse lines. The most common phenotypic call was in the category reproduction, development, and musculoskeletal (Figure 5C). Some abnormal phenotypes are clearly not primary effects; for example, reduced weight may be a secondary consequence of a number of different primary defects. Given that certain phenotypic features would be expected to co-occur frequently, reflecting physiological or developmental associations, a principal component analysis was conducted to look for correlated patterns in the data. Plotting principal component 1 against 2 revealed four main clusters of mouse lines (colored ovoids in Figure 5D). The separation along principal component 2 arises from viability. The remaining separation of clusters marked by red and green from clusters marked blue and yellow (Figure 5D) arises from body weight and associated variables, including DEXA measurements and energy use (Figure 5E). Body weight is a common covariable in disease (Reed et al., 2008), so it is not surprising that it dominates the principal component analysis.

Features of Essential Genes

Genes are generally defined as essential if they are required for survival or fertility. Studies in yeast and worms suggest that genes with paralogs are much less likely to be essential, presumably because the paralog can compensate for the function of the inactivated gene (Gu et al., 2003; Conant and Wagner, 2004). Previous analyses of published data on mouse knockouts did not find a significant difference in essential genes between singleton and duplicated genes (Liang and Li, 2007; Liao and Zhang, 2007). However, the published gene set is biased toward genes involved in development (Makino et al., 2009). In contrast, we found that genes in our set without a paralog were more than twice as likely to be essential, a significant effect (Table S3; Figure 7A).

Figure 7

Features Associated with Essential Genes

Essential genes (black bars) are compared with genes that are not essential for viability (red bars). The asterisk (*) indicates significant difference. ns, no significant difference in proportion of essential genes between the two categories. Statistics are presented in Table S3.

(A) Genes with no paralog show a significantly larger proportion of essential lines than genes with at least one paralog.

(B) Genes predicted to contribute to protein complexes showed a significantly larger proportion of essential lines than genes not predicted to contribute to a complex.

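(C) The proportion of essential genes among novel genes (those with no associated publications) was not significantly different from that among known genes.

(D) Genes known to underlie human disease were no more likely to be essential than genes not yet associated with human disease.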

We next asked if the essential genes in our gene set are more likely to be involved in a protein complex, using an experimentally validated data set of human protein complexes from the CORUM database (Ruepp et al., 2010). We found that genes with a human ortholog that is part of a complex were significantly more likely to be essential (Table S3; Figure 7B). Finally, we asked if there were certain types of gene products that were more likely to be important for viability/fertility than others. In humans, transcription factor mutations appear enhanced in prenatal disease, and enzymes are overrepresented in diseases with onset in the first year after birth (Jimenez-Sanchez et al., 2001). We investigated four classes of protein identified by GO terms: transcription factors (n = 7), transmembrane proteins (n = 50), enzymes (n = 131), and chromatin-associated proteins (n = 24). Numbers of each were limited, but there was no significant enrichment for essential genes among any of the four groups (Tables S2 and S3). In summary, we found that essential genes were less likely to have a paralog and more likely to be part of a protein complex, but no specific class of protein appeared more likely to be predictive of essentiality.

Annotating the Function of Novel Genes

There is a large bias in the literature toward analysis of known genes (Edwards et al., 2011), but are genes that have yet to be examined experimentally less likely to underlie disease? Genes in our set that had no associated publications (other than high-throughput genome-wide reports) were compared with genes where some aspect of their function had been described. The proportion of essential genes among the novel set was not significantly different from the known genes (Figure 7C). Furthermore, there was no significant difference in the number of hits observed per line between known and novel genes (Tables S2 and S3). As a second test, we asked if genes with orthologs involved in human disease (having an OMIM disease ID) were enriched in essential genes or the number of phenotypic hits compared with genes not (yet) ascribed to human disease, but there was no significant difference (Tables S2 and S3; Figure 7D). Finally, we compared genes that had been proposed for inclusion by the community (n = 87) with those with no specific request to ask if genes of interest to the community were more likely to be essential or to have detected phenotypes. There was no significant difference between the two groups (Table S3). Thus, known genes are no more likely to be involved in disease than novel genes, emphasizing that much new biology will be uncovered from the analysis of mutations in novel genes.

Discussion

Genetic studies in mice via targeted mutagenesis of ES cells have been successful at illuminating selected aspects of the function of more than 7,000 mammalian genes. However, until recently, these studies have been conducted by individual laboratories and largely directed at previously studied genes. The focused collection of phenotypic information from these mutants has been very information rich, but many aspects remain undetected because they are outside the area of interest of the laboratory generating the mutant. Individual endeavors have led to wide variation in allele design and genetic backgrounds used, and all too often, the mutant is not available to other groups for further analysis. In contrast, the mutant mice described here have the advantage of a common genetic background and a standard allele design with the option of generating conditional mutations, and all are available from public repositories. The phenotyping described here was not intended to provide an exhaustive characterization of the phenotype of the mutant lines but, rather, to place mutant alleles into broad categories by using screens, generating a pool of genetic resources from which individual mutants can be selected based on their phenotype for secondary follow-up studies. Several of the mutants have been analyzed further following an initial phenotypic observation in the screen, and these add to the depth of our knowledge of biological mechanisms of disease (e.g., Nijnik et al., 2012; Crossan et al., 2011). As the assembled data expands, it will become possible to discern patterns between phenotypes and come to more holistic conclusions about categories of genes. Genes linked by common phenotypes can be grouped together to test for regulatory or other functional interactions and ultimately placed into pathways that in turn will implicate other genes in the disease process. 
For example, of the four genes associated with abnormal fasting glucose levels in our data set, Slc16a2 can be linked to Ldha via regulation of L-triiodothyronine (Friesema et al., 2006; Miller et al., 2001), but the other two genes, Nsun2 and Cyb561, have no reported regulatory links apart from in vitro protein-protein interactions, so these represent candidates to investigate further. Already some broad conclusions can be drawn from the data set, such as the value of analyzing novel genes, the increased incidence of essentiality in genes with no paralog, and the increased number of genes required for male compared to female fertility. Many completely unexpected associations between genes and phenotypes have been discovered, illustrating the value of a broad-based screen. Another aspect of our study was the examination of heterozygous mutants, a genotype that often is not studied by individual laboratories. Although this was restricted to mutants that displayed lethality or subviability of homozygotes, it revealed a number of genes with haploinsufficiency, a feature commonly associated with mutations in the human genome but rarely described in mouse knockouts. The tests used in screening varied considerably in their complexity, cost, and suitability in a high-throughput scenario. The performance of these tests across 250 alleles provided insight into those that should be included or excluded in the efforts to examine 5,000 alleles through the activities of the IMPC. Key considerations are variance in the control group, specificity, sensitivity, effect size, and redundancy. The major contribution of null alleles will be an improved understanding of biological processes and molecular mechanisms of disease. The null allele will give insight into the temporal and spatial requirements for the gene and will contribute to the establishment of gene networks involved in mammalian disease processes. 
Furthermore, our data set demonstrates that many features of human Mendelian diseases can be found in the corresponding mouse mutant. The mouse alleles studied here are expected to be null alleles or strong hypomorphs, which may not always reflect the consequence of the human mutation. However, null alleles should reveal haploinsufficiency and recessive effects due to deleterious mutations such as frameshift and nonsense mutations. Null alleles in the mouse are likely to make the largest impact upon understanding human diseases caused by rare variants of large effect size. Complex multifactorial diseases, which may depend on human-specific variants with small effect size or more specific molecular effects such as gain-of-function mutations, will require more customized approaches such as knockin of specific human mutations. Alternative approaches using the mouse for discovering loci underlying complex disease include the Hybrid Mouse Diversity panel and the Collaborative Cross (reviewed by Flint and Eskin, 2012). These allow interrogation of many different loci simultaneously and study of epistatic interactions and can lead to identification of single gene variants causing disease (e.g., Orozco et al., 2012; Andreux et al., 2012), when variants affecting the trait of interest are present in the founders. ENU mutagenesis is another powerful technique that can be used to produce allelic series of mutations with differing effects upon function of single genes (e.g., Andrews et al., 2012). However, the null alleles that we describe here are a complement to these alternative approaches and will be invaluable for defining mechanisms of gene function on a standard genetic background. The study described in this report builds on the large KOMP/EUCOMM resource of targeted mutations in mouse ES cells (Skarnes et al., 2011) and illustrates the breadth of phenotypic information that can be garnered from an organized effort. 
The Clinical Phenotyping Pipeline optimized here has been adopted by several other programs within the IMPC; multiple groups are now working together to extend what is described in this report for 250 genes to 5,000 genes over the next 4 years with the vision that this will eventually cover all protein-coding genes. The primary phenotypes and genetic resources emerging from these programs will make a significant contribution to our understanding of mammalian gene function.

Experimental Procedures

Animals

Mice carrying knockout first conditional-ready alleles (Figures S1A and S1B) were generated from the KOMP/EUCOMM targeted ES cell resource using standard techniques. Eight in-house lines were included as known mutant controls. Details of the 250 lines can be found in Table S2. All lines are available from http://www.knockoutmouse.org/ or by request from mouseinterest@sanger.ac.uk. Mice were maintained in a specific pathogen-free unit under a 12 hr light, 12 hr dark cycle with ad libitum access to water and food. The care and use of mice were in accordance with the UK Home Office regulations, UK Animals (Scientific Procedures) Act of 1986.

Genotyping and Allele Quality Control

Short-range, long-range and quantitative PCR strategies (http://www.knockoutmouse.org/kb/25/) were used to evaluate the quality of each allele (Figure S1C). A subset of these assays was used to genotype offspring. The degree of knockdown in homozygotes was assessed by qRT-PCR of adult liver in a subset known to show expression in liver. Details are given in Extended Experimental Procedures.

Phenotyping Pipeline and Tests

The typical workflow from chimera to primary phenotyping pipelines and an outline of the clinical phenotyping pipeline are presented in Figure 1. Details of batch size are given in Figure S2A. These pipelines include established tests used to characterize systematically every line of mice as described in Table S4.

Histochemical Analysis of the lacZ Reporter

Adult whole-mount lacZ reporter gene expression was carried out essentially as described by Valenzuela et al. (2003).

Statistical and Bioinformatic Analysis

For continuous data, including time course, a reference range approach was used to identify phenotypic variants as detailed in Figures S3A and S3B. Fisher’s exact test was used to assess categorical data (Figure S3C). These automated calls were complemented by a manual assessment made by biological experts. An example of the establishment of the reference range is given in Figure S2B. Downstream data analysis was performed using SPSS (version 17.0.2), R, and SIMCA-P (V-12.0, Umetrics). The data structure and biological question determined the statistical test used; details are in Table S3. Principal components analysis was performed in SIMCA-P (http://www.umetrics.com). Further details of analyses and gene annotations are given in Extended Experimental Procedures.
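As an illustration of the categorical comparison described above, the following minimal sketch computes a two-sided Fisher's exact test for a 2×2 table using only the Python standard library. The function name and the counts are hypothetical and are not taken from the study. The two-sided p value is defined here as the sum of the probabilities of all tables no more likely than the observed one, which is one common convention and may differ from the one used in the published analysis.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def table_prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = table_prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the
    # observed one (the small tolerance guards against floating-point ties).
    return sum(
        p
        for p in (table_prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Hypothetical counts: abnormal vs. normal calls in mutants (row 1)
# and in controls (row 2).
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided p = {p_value:.4f}")
```

With so few animals per group, only large differences in the proportion of abnormal calls reach conventional significance thresholds; the hypothetical table above is not significant.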

Online Database

Results can be accessed at http://www.sanger.ac.uk/mouseportal/, accompanied by step-by-step examples of how to navigate the data. Alternatively, much of the raw data can be downloaded from the MGP Phenotyping Biomart at http://www.sanger.ac.uk/htgt/biomart/martview/. Advice on navigating this Biomart is provided at ftp://ftp.sanger.ac.uk/pub/mgp/extracting_mouse_genetic_program_raw_phenotyping_data.docx. Results have also been summarized in Wikipedia (http://en.wikipedia.org/wiki/Category:Genes_mutated_in_mice) and Mouse Genome Informatics (http://www.informatics.jax.org/).

Animals

The mutant lines screened are listed in Table S2. We generated most of the mutant lines (242/250) reported in detail here using the EUCOMM/KOMP knockout first conditional-ready targeted ES cell resource on a C57BL/6N background (Skarnes et al., 2011) (Figures S1A and S1B). We maintained the mice on a consistent inbred C57BL/6N background (n = 47 lines), or for early lines on mixed C57BL/6 backgrounds (e.g., 190 lines were maintained on a C57BL/6N;C57BL/6Brd-Tyr background), to minimize variation in screening results due to strain variation and to facilitate comparison across mutant lines. Eight mutant lines with known phenotypes from other sources were included in early screening as positive controls. All lines are available from http://www.knockoutmouse.org/ or by request from mouseinterest@sanger.ac.uk. Mice were maintained ad libitum on Mouse Breeders Diet (LabDiets 5021-3, IPS, Richmond, USA) unless otherwise stated.

Genotyping and Allele Quality Control

ES cell QC was performed as described (Skarnes et al., 2011). Furthermore, extensive quality control of each allele was performed in mice using a panel of short-range PCR assays (specific for the mutant or wild-type allele, the lacZ reporter gene, 5′ FRT site or 3′ loxP site), quantitative (q) PCR assays (neo cassette and loss of wild-type allele counting systems), and long-range (LR) PCR assays (5′ and 3′ using one primer in the cassette and another outside of the homology arms of the allele design) as summarized in Figure S1C. Further details of the QC protocols are given at http://www.knockoutmouse.org/kb/25/. Typically, mice were genotyped at postnatal day (P)14 using a combination of the three short-range PCR assays or the qPCR neo cassette allele-counting assay. Upon completion of phenotyping, genotyping was repeated and data were only accepted from mice for which the second genotype was concordant with the P14 genotype. A subset of 25 lines, selected because of previously reported gene expression in liver, was used to assess the degree of knockdown resulting from the targeting event. A TaqMan assay was devised to detect wild-type splicing between the exons on either side of the mutagenic cassette. Samples showing wild-type expression by qRT-PCR were confirmed by end-point RT-PCR and sequencing. Approximately 350 ng of total RNA extracted from liver was used in each reaction, performed in triplicate as duplex reactions using the RNA-to-Ct one step kit (Applied Biosystems) with Gapdh or B2m as endogenous controls and analyzed using a 7900HT qPCR machine with RQ manager software v1.2 (Applied Biosystems).
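The degree of knockdown from duplex qRT-PCR data of this kind is commonly expressed as a relative quantity. The sketch below assumes the comparative Ct (2^-ΔΔCt) method with a single endogenous control and 100% amplification efficiency; the text does not state the exact calculation performed, and the function name and all Ct values are hypothetical.

```python
from statistics import mean


def relative_expression(ct_target_mut, ct_ref_mut, ct_target_wt, ct_ref_wt):
    """Target expression in mutant relative to wild type (comparative Ct).

    Each argument is a list of replicate Ct values. Assumes roughly 100%
    amplification efficiency for both the target and the reference assay.
    """
    # Normalize the target to the endogenous control within each genotype.
    delta_mut = mean(ct_target_mut) - mean(ct_ref_mut)
    delta_wt = mean(ct_target_wt) - mean(ct_ref_wt)
    # A higher Ct means less template, hence the negative exponent.
    return 2 ** -(delta_mut - delta_wt)


# Hypothetical triplicate Ct values for one line (not data from the study).
ratio = relative_expression(
    ct_target_mut=[28.0, 28.1, 27.9],
    ct_ref_mut=[18.0, 18.0, 18.0],
    ct_target_wt=[24.0, 24.1, 23.9],
    ct_ref_wt=[18.0, 18.0, 18.0],
)
print(f"relative expression = {ratio:.4f}, knockdown = {(1 - ratio) * 100:.1f}%")
```

In this hypothetical case the target transcript in the mutant crosses threshold four cycles later than in wild type after normalization, which corresponds to about one-sixteenth of wild-type expression.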

Phenotyping Pipeline and Tests

Viability was assessed at P14 and, for those lines classed as lethal or sub-viable at P14, again at E14.5 (Figure 1A). A minimum of 28 genotype-confirmed, live progeny from heterozygous intercrosses were required to assess viability. Based on exact binomial probability calculations, zero homozygotes from 28 progeny gave 95% confidence that the probability of homozygote survival was ≤ 40%. Lines were classed as homozygous fertile if offspring were born from homozygous parents, regardless of whether the offspring survived to weaning. At 4 weeks of age, mice undergoing the Clinical Phenotyping Pipeline (Figure 1B) were transferred from Mouse Breeders Diet to a high fat (21.4% fat by crude content; 42% calories provided by fat) dietary challenge (Special Diet Services Western RD 829100, SDS, Witham, UK) for the remainder of the pipeline. This pipeline included the established phenotyping tests described in Table S4. For most tests in this pipeline 7 male and 7 female mutants were used, with the exceptions of the Auditory Brainstem Response (n = 4, independent of gender) and erythrocyte micronuclei (7 males), both deemed to be sufficient with the reduced numbers, and indirect calorimetry (7 males) and biobanking of 41 tissues and organs in paraffin blocks (2 males, 2 females), both limited by operational constraints. If homozygotes were lethal, difficult to produce due to sub-viability or were unsuitable for screening due to welfare concerns, we used heterozygotes for screening. For each genetic background, control cohorts (7M + 7F) were run each week. A second primary pipeline included challenges with two infectious agents, Citrobacter rodentium (8 females) and Salmonella Typhimurium (8 males), with matched controls run simultaneously (Table S4). In both challenges we looked at colonization of target tissues at 14 and 28 days post infection. 
Tissue was biobanked in paraffin and serum taken from the Salmonella challenged animals to measure antigen specific IgG (and subclass) antibodies.
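The viability threshold of 28 progeny can be checked with a short exact binomial calculation. The sketch below assumes that the 95% confidence statement is a one-sided bound obtained by solving (1 - p)^n = 0.05 for the homozygote proportion p, and that the 40% figure is expressed relative to the 25% Mendelian expectation from a heterozygous intercross. Both are interpretations rather than details stated in the text, and the function name is hypothetical.

```python
def max_homozygote_fraction(n_progeny, confidence=0.95):
    """One-sided upper bound on the homozygote proportion when zero
    homozygotes are observed among n_progeny genotyped offspring."""
    alpha = 1 - confidence
    # With true proportion p, the chance of seeing no homozygotes is
    # (1 - p) ** n. Setting this equal to alpha and solving for p gives
    # the largest proportion still compatible with the observation.
    return 1 - alpha ** (1 / n_progeny)


MENDELIAN_EXPECTATION = 0.25  # homozygotes expected from het x het matings

upper = max_homozygote_fraction(28)
relative = upper / MENDELIAN_EXPECTATION
print(f"upper bound on homozygote proportion: {upper:.3f}")
print(f"as a fraction of Mendelian expectation: {relative:.2f}")
```

Under these assumptions the bound is about 10% of progeny, or roughly 40% of the number of homozygotes expected under Mendelian inheritance, in line with the threshold quoted above.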

Statistical and Bioinformatic Analysis

For GO term enrichment using TermFinder (Boyle et al., 2004) only high quality experimental evidence codes were included (EXP, IDA, IMP, IPI, IGI, IEP and IC). Positive control lines were excluded from the gene set for this analysis. For GO term enrichment using FuncAssociate v2.0 (Berriz et al., 2009) (http://llama.mshri.on.ca/funcassociate/), the evidence code IEA (Inferred from Electronic Annotation) was excluded. This software used a gene association file downloaded from ftp.geneontology.org on 26th September 2011. Of the 14139 GO terms queried, 83 (0.59%) were classed as being over-represented and 6 (0.04%) under-represented in our gene set. Revigo was used to reduce the GO term redundancy to give a representative subset of terms (Supek et al., 2011). The gene set was found to be under-represented in only one area: “Sensory perception of smell.” The gene set was found to be over-represented in 20 areas spread over a variety of processes with no one area dominating. Principal components analysis, performed in SIMCA-P (http://www.umetrics.com), was used to assess gene clustering based on variation across the phenotypic hits. The methodology was as default, except that data rescaling was turned off as it was not required for this binary data set. Model predictivity was confirmed by a cross validation procedure (Wold, 1978). The differences between the clusters were investigated using a contribution plot, which showed the differences in scaled units between the highlighted groups for all variables in the model. Using the G*Power (Version 3.1.3) sensitivity approach (Erdfelder et al., 1996) we calculated the power of our gene set to ask if specific types of protein were more likely to be important for viability and fertility. 
For a target power of 80%, with the numbers of genes we had in each category, the transcription factor data set needed an effect size (ES) of 45%, the transmembrane proteins needed an ES of 19.3%, the enzyme data set needed an ES of 19% and the chromatin-associated proteins needed an ES of 47% to be detected at the 0.05 threshold. For protein characterization, amino acid sequences were obtained from Ensembl, release 59, and the longest protein product was chosen as a representative for each gene. Paralog assignments were also obtained from Ensembl (Flicek et al., 2011). Transcription factor annotations were obtained from the DBD database (Wilson et al., 2008) and transmembrane regions were predicted with Phobius (Käll et al., 2004). Involvement in a protein complex was inferred from CORUM, an experimentally-validated data set of human protein complexes (Ruepp et al., 2010).
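To illustrate the principal components analysis of binary phenotypic hits described above, the following sketch extracts first-component scores from a small hit matrix without unit-variance rescaling. It is not the SIMCA-P procedure: it uses plain power iteration on the covariance matrix, assumes that mean-centering was retained when rescaling was turned off, and runs on a hypothetical toy matrix. The function name is also hypothetical.

```python
from math import sqrt


def first_component_scores(rows, iterations=100):
    """Scores of each row (mouse line) on the first principal component.

    Columns are mean-centered but not rescaled to unit variance.
    """
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    centered = [[r[j] - means[j] for j in range(m)] for r in rows]
    # Covariance matrix of the phenotype columns.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(m)]
        for i in range(m)
    ]
    # Power iteration: repeated multiplication by the covariance matrix
    # converges to its leading eigenvector (the first loading vector).
    v = [float(j + 1) for j in range(m)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    # Project each centered row onto the loading vector.
    return [sum(c[j] * v[j] for j in range(m)) for c in centered]


# Hypothetical hit matrix: 6 lines x 4 binary phenotype calls (1 = hit).
hits = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
for line_index, score in enumerate(first_component_scores(hits)):
    print(f"line {line_index}: PC1 score = {score:+.2f}")
```

Lines that share the same pattern of hits receive the same score, and the two hypothetical groups fall on opposite sides of zero. The sign of a principal component is arbitrary, so only the separation between groups is meaningful.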

47 in total

1. Next generation software for functional trend analysis.

Authors: Gabriel F Berriz; John E Beaver; Can Cenik; Murat Tasan; Frederick P Roth
Journal: Bioinformatics Date: 2009-08-28 Impact factor: 6.937

2. Systems genetics of metabolism: the use of the BXD murine reference panel for multiscalar integration of traits.

Authors: Pénélope A Andreux; Evan G Williams; Hana Koutnikova; Riekelt H Houtkooper; Marie-France Champy; Hugues Henry; Kristina Schoonjans; Robert W Williams; Johan Auwerx
Journal: Cell Date: 2012-08-30 Impact factor: 41.582

3. Loss-of-function Additional sex combs like 1 mutations disrupt hematopoiesis but do not cause severe myelodysplasia or leukemia.

Authors: Cynthia L Fisher; Nicolas Pineault; Christy Brookes; Cheryl D Helgason; Hideaki Ohta; Caroline Bodner; Jay L Hess; R Keith Humphries; Hugh W Brock
Journal: Blood Date: 2009-10-27 Impact factor: 22.113

Review 4. Introduction to the Japan Mouse Clinic at the RIKEN BioResource Center.

Authors: Shigeharu Wakana; Tomohiro Suzuki; Tamio Furuse; Kimio Kobayashi; Ikuo Miura; Hideki Kaneda; Ikuko Yamada; Hiromi Motegi; Hideaki Toki; Maki Inoue; Osamu Minowa; Tetsuo Noda; Kazunori Waki; Nobuhiko Tanaka; Hiroshi Masuya; Yuichi Obata
Journal: Exp Anim Date: 2009-10

5. NIH Mouse Metabolic Phenotyping Centers: the power of centralized phenotyping.

Authors: Maren R Laughlin; K C Kent Lloyd; Gary W Cline; David H Wasserman
Journal: Mamm Genome Date: 2012-09-01 Impact factor: 2.957

6. A systematic survey of loss-of-function variants in human protein-coding genes.

Authors: Daniel G MacArthur; Suganthi Balasubramanian; Adam Frankish; Ni Huang; James Morris; Klaudia Walter; Luke Jostins; Lukas Habegger; Joseph K Pickrell; Stephen B Montgomery; Cornelis A Albers; Zhengdong D Zhang; Donald F Conrad; Gerton Lunter; Hancheng Zheng; Qasim Ayub; Mark A DePristo; Eric Banks; Min Hu; Robert E Handsaker; Jeffrey A Rosenfeld; Menachem Fromer; Mike Jin; Xinmeng Jasmine Mu; Ekta Khurana; Kai Ye; Mike Kay; Gary Ian Saunders; Marie-Marthe Suner; Toby Hunt; If H A Barnes; Clara Amid; Denise R Carvalho-Silva; Alexandra H Bignell; Catherine Snow; Bryndis Yngvadottir; Suzannah Bumpstead; David N Cooper; Yali Xue; Irene Gallego Romero; Jun Wang; Yingrui Li; Richard A Gibbs; Steven A McCarroll; Emmanouil T Dermitzakis; Jonathan K Pritchard; Jeffrey C Barrett; Jennifer Harrow; Matthew E Hurles; Mark B Gerstein; Chris Tyler-Smith
Journal: Science Date: 2012-02-17 Impact factor: 47.728

7. Essential role of the keratinocyte-specific endonuclease DNase1L2 in the removal of nuclear DNA from hair and nails.

Authors: Heinz Fischer; Sandra Szabo; Jennifer Scherz; Karin Jaeger; Heidemarie Rossiter; Maria Buchberger; Minoo Ghannadan; Marcela Hermann; Hans-Christian Theussl; Desmond J Tobin; Erwin F Wagner; Erwin Tschachler; Leopold Eckhart
Journal: J Invest Dermatol Date: 2011-02-10 Impact factor: 8.551

8. Ensembl 2011.

Authors: Paul Flicek; M Ridwan Amode; Daniel Barrell; Kathryn Beal; Simon Brent; Yuan Chen; Peter Clapham; Guy Coates; Susan Fairley; Stephen Fitzgerald; Leo Gordon; Maurice Hendrix; Thibaut Hourlier; Nathan Johnson; Andreas Kähäri; Damian Keefe; Stephen Keenan; Rhoda Kinsella; Felix Kokocinski; Eugene Kulesha; Pontus Larsson; Ian Longden; William McLaren; Bert Overduin; Bethan Pritchard; Harpreet Singh Riat; Daniel Rios; Graham R S Ritchie; Magali Ruffier; Michael Schuster; Daniel Sobral; Giulietta Spudich; Y Amy Tang; Stephen Trevanion; Jana Vandrovcova; Albert J Vilella; Simon White; Steven P Wilder; Amonida Zadissa; Jorge Zamora; Bronwen L Aken; Ewan Birney; Fiona Cunningham; Ian Dunham; Richard Durbin; Xosé M Fernández-Suarez; Javier Herrero; Tim J P Hubbard; Anne Parker; Glenn Proctor; Jan Vogel; Stephen M J Searle
Journal: Nucleic Acids Res Date: 2010-11-02 Impact factor: 16.971

9. A resource of vectors and ES cells for targeted deletion of microRNAs in mice.

Authors: Haydn M Prosser; Hiroko Koike-Yusa; James D Cooper; Frances C Law; Allan Bradley
Journal: Nat Biotechnol Date: 2011-08-07 Impact factor: 54.908

10. DBD--taxonomically broad transcription factor predictions: new content and functionality.

Authors: Derek Wilson; Varodom Charoensawan; Sarah K Kummerfeld; Sarah A Teichmann
Journal: Nucleic Acids Res Date: 2007-12-11 Impact factor: 16.971
