Systematic analysis, comparison, and integration of disease based human genetic association data and mouse genetic phenotypic information.

Yonqing Zhang, Supriyo De, John R Garner, Kirstin Smith, S Alex Wang, Kevin G Becker.

Abstract

BACKGROUND: The genetic contributions to human common disorders and mouse genetic models of disease are complex and often overlapping. In common human diseases, unlike classical Mendelian disorders, genetic factors generally have small effect sizes, are multifactorial, and are highly pleiotropic. Likewise, mouse genetic models of disease often have pleiotropic and overlapping phenotypes. Moreover, phenotypic descriptions in the literature in both human and mouse are often poorly characterized and difficult to compare directly.
METHODS: In this report, human genetic association results from the literature are summarized with regard to replication, disease phenotype, and gene specific results; and organized in the context of a systematic disease ontology. Similarly summarized mouse genetic disease models are organized within the Mammalian Phenotype ontology. Human and mouse disease and phenotype based gene sets are identified. These disease gene sets are then compared individually and in large groups through dendrogram analysis and hierarchical clustering analysis.
RESULTS: Human disease and mouse phenotype gene sets are shown to group into disease and phenotypically relevant groups at both a coarse and fine level based on gene sharing.
CONCLUSION: This analysis provides a systematic and global perspective on the genetics of common human disease as compared to itself and in the context of mouse genetic models of disease.

Year:  2010        PMID: 20092628      PMCID: PMC2822734          DOI: 10.1186/1755-8794-3-1

Source DB:  PubMed          Journal:  BMC Med Genomics        ISSN: 1755-8794            Impact factor:   3.063


Background

Common complex diseases such as cardiovascular disease, cancer, and autoimmune disorders; metabolic conditions such as diabetes and obesity; and neurological and psychiatric disorders account for the majority of morbidity and mortality in developed countries. The specific genetic contributions to disease etiology, and their relationships to environmental factors, remain unclear in common disorders. They are complicated by many factors, such as gene-gene interactions, the balance between susceptibility and protective alleles, copy number variation, the low relative risk contributed by each gene, and a myriad of complex environmental inputs. Genetic association studies using a candidate gene approach, and more recently genome-wide association studies (GWAS), have produced a large and rapidly increasing amount of information on the genetics of common disease. In parallel, mouse genetic models for human disease have provided a wealth of genetic and phenotypic information. While not always perfect models for human common complex disorders, mouse disease models offer genetic purity and experimental flexibility that have produced valuable insights relevant to human disease. Gene nomenclature standardization [1], database efforts [2-4], and phenotype ontology projects [5] in both human and mouse over the past decade have provided the foundation for integrating information on genetic contributions to disease and phenotypes. This creates the opportunity for systematic comparison and higher order systems analysis of disease and phenotypic information. In this report, we summarize and integrate large-scale human genetic association information and mouse genetically determined phenotypic information, with the goal of identifying fundamental relationships in human disease and mouse models of human disease.

Methods

The Genetic Association Database

The Genetic Association Database (GAD) [2] (http://geneticassociationdb.nih.gov) is an archive of summary data from published human genetic association studies of many common disease types. GAD is primarily focused on archiving information on common complex human disease rather than the rare Mendelian disorders found in the Online Mendelian Inheritance in Man (OMIM) [6]. GAD contains curated information on candidate gene studies and, more recently, on genome-wide association studies. It builds on the curation of the CDC HuGENet info literature database [3], in part by adding molecular and ontological annotation, creating a bridge between epidemiological and molecular information. This allows the large-scale integration of disease-based genetic association information with genomic and molecular information, as well as with the software tools and computational approaches that use genomic information [7-12]. This report is a summary and analysis of the genes and diseases with positive associations in GAD with regard to replication, comparisons between diseases, and broad phenotypic disease classes. Although GAD contains information on gene variation, this report is at the gene level only and does not consider specific gene variation or genetic polymorphism. GAD currently contains approximately 40,000 individual gene records of genetic association studies taken from over 23,000 independent publications. Importantly, a large number (11,568) of the records in GAD carry a designation of whether the gene of record was reported to be associated (Y) or not associated (N) with the disease phenotype for that specific record. Many records, for various reasons, do not have such a designation. In addition, a portion of the database records have been annotated with standardized disease phenotype keywords from the MeSH vocabulary (http://www.nlm.nih.gov/mesh/).
The GAD summations shown below are a subset of the records in GAD. They include only those records that (a) are positively associated with a disease phenotype and (b) have a MeSH disease phenotype annotation. This subset comprises 10,324 records. Records designated as not associated (N) with a disease phenotype, and those without a MeSH disease annotation, are not considered in this report.

Mouse phenotypic database

The mouse phenotypic information described here was obtained from the Phenotypes, Alleles and Disease Models section of the Mouse Genome Informatics (MGI) database [4] (http://www.informatics.jax.org/). The file used for mouse phenotypic information (see "Mouse phenotypic information" below) comprises 5011 unique genes and 5142 unique phenotypic terms derived from specific gene mutations in multiple mouse strains. The mouse phenotypic information had been annotated to the mouse gene mutation records using Mammalian Phenotype terms and codes in the mouse phenotype database as a component of the Mouse Phenotyping Project [5,13].

Quantitation of genes and disease phenotypes

Quantitation of how often a disease phenotype was positively associated with a gene was performed as follows. GAD records having both a recorded positive association and annotated MeSH disease keywords were extracted and stored in a database according to their relationships. Using a Perl script, the number of times each MeSH disease keyword co-occurred with a positive association for a specific gene in GAD was counted. These counts were sorted in descending order for each unique gene, grouped by the MeSH disease term with which they are associated.
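The counting step described above can be sketched in a few lines of Python. This is an illustration only, not the authors' Perl script: the record layout (gene, MeSH term, Y/N flag), the function names, and the toy records are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def count_gene_disease(records):
    """Count positive (gene, MeSH term) co-occurrences.

    `records` is an iterable of (gene, mesh_term, associated) tuples.
    This layout is hypothetical; the real GAD export is not specified here.
    """
    counts = Counter()
    for gene, mesh_term, associated in records:
        # Keep only records that are positive AND carry a MeSH annotation.
        if associated == "Y" and mesh_term:
            counts[(gene, mesh_term)] += 1
    return counts

def ranked_diseases_by_gene(counts):
    """Group counts by gene and sort each gene's diseases by descending count."""
    by_gene = defaultdict(list)
    for (gene, mesh_term), n in counts.items():
        by_gene[gene].append((mesh_term, n))
    for gene in by_gene:
        # Ties are broken alphabetically so the output is deterministic.
        by_gene[gene].sort(key=lambda pair: (-pair[1], pair[0]))
    return dict(by_gene)

# Invented toy records, for illustration only.
records = [
    ("ACE", "Hypertension", "Y"),
    ("ACE", "Hypertension", "Y"),
    ("ACE", "Myocardial Infarction", "Y"),
    ("ACE", "Hypertension", "N"),   # not associated: skipped
    ("NOS3", "Hypertension", "Y"),
    ("NOS3", "", "Y"),              # no MeSH annotation: skipped
]
ranked = ranked_diseases_by_gene(count_gene_disease(records))
print(ranked)
```

Filtering on both conditions inside the counting loop mirrors the subset definition used for the GAD summaries (positive association plus MeSH annotation).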

Mouse phenotypic information

The mouse phenotypic information described here was obtained from the Phenotypes, Alleles and Disease Models section (ftp://ftp.informatics.jax.org/pub/reports/index.html#pheno) of the Mouse Genome Informatics (MGI) database (http://www.informatics.jax.org/), using these three files downloaded on 4-4-2008:

ftp://ftp.informatics.jax.org/pub/reports/MPheno_OBO.ontology
ftp://ftp.informatics.jax.org/pub/reports/MGI_PhenotypicAllele.rpt
ftp://ftp.informatics.jax.org/pub/reports/MGI_PhenoGenoMP.rpt

The mouse phenotype files were processed with a Perl script that annotated each gene with the phenotype term associated with each Mammalian Phenotype (MP) code.

Venn Diagram overlap of individual gene lists

Individual GAD primary gene sets were analyzed using Venny [14] (http://bioinfogp.cnb.csic.es/tools/venny/index.html). Pathway Venn diagram comparisons were performed by placing individual GAD primary gene sets into WebGestalt [15] (http://bioinfo.vanderbilt.edu/webgestalt/) to identify KEGG pathways, then placing the resulting pathway names into Venny.
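The set logic behind a Venn diagram comparison can be sketched as follows. This is not how Venny is implemented; it is a minimal illustration, with a hypothetical helper name, and the gene lists are small subsets chosen for the example rather than the full GAD primary gene sets.

```python
from itertools import combinations

def venn_regions(named_sets):
    """Map each combination of set names to the items found in exactly those sets.

    Empty regions are omitted. `named_sets` is a dict of name -> iterable of items.
    """
    names = sorted(named_sets)
    regions = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # Items present in every set of the combination ...
            inside = set.intersection(*(set(named_sets[n]) for n in combo))
            # ... and absent from every set outside it.
            outside = set().union(*(set(named_sets[n]) for n in names if n not in combo))
            exclusive = inside - outside
            if exclusive:
                regions[combo] = exclusive
    return regions

# Illustrative subsets only, not complete disease gene sets.
gene_sets = {
    "Hypertension": ["ACE", "AGT", "NOS3", "ADRB2"],
    "Myocardial Infarction": ["NOS3", "ACE", "SERPINE1", "APOE"],
    "Coronary Disease": ["ACE", "NOS3", "APOE", "AGT"],
}
regions = venn_regions(gene_sets)
for combo, genes in regions.items():
    print(" & ".join(combo), "->", sorted(genes))
```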

Dendrogram analysis of gene sets

Relationships between diseases were identified by a method similar to phylogenetic classification. First, the distance between each pair of diseases was calculated by counting the genes common to the pair and dividing that count by the size of the smaller gene set of the pair. This number was then subtracted from 1, so that two identical lists (100% match) have a distance of 0. This is represented in the formula:

$$d_{ij} = 1 - \frac{N(C_i \cap C_j)}{\min\left(N(C_i),\, N(C_j)\right)}$$

where C_k is the set of genes in disease set k (k = i, j); N(C_k) is the number of genes in disease set k; d_ij is the pairwise distance; and i and j index the disease sets being compared (i = 1, 2, 3, ..., n; j = 1, 2, 3, ..., m).

The disease relationships were calculated from the distance matrix using the Fitch program from the Phylip package [16]. It calculates the relationships based on the Fitch and Margoliash method of constructing phylogenetic trees [17] using the following formula (from the Phylip manual):

$$\text{Sum of squares} = \sum_{i} \sum_{j} \frac{n_{ij}\,(D_{ij} - d_{ij})^{2}}{D_{ij}^{P}}$$

where D is the observed distance between gene sets i and j and d is the expected distance, computed as the sum of the lengths of the segments of the tree from gene set i to gene set j. The quantity n is the number of times each distance has been replicated. In simple cases n is taken to be one. If n is chosen to be more than 1, the distance is assumed to be a mean of those replicates. The power P is what distinguishes the Fitch and Neighbor-Joining methods: for the Fitch-Margoliash method P is 2.0 and for the Neighbor-Joining method it is 0.0. Because running Fitch took a long time when the gene sets were very large (weeks for the human gene sets and months for the mouse gene sets), the Neighbor-Joining method was used to create the replicate dendrograms (not shown) after randomizing the input order for greater confidence. The resulting coefficient matrix files were displayed using the Phylodraw graphics program [18].
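The pairwise distance described above (one minus the number of shared genes divided by the size of the smaller gene set) can be sketched as below. The function names are hypothetical and the gene sets are small illustrative subsets, not the full disease gene sets; the output matrix would still need to be written in the input format expected by the Fitch program, which is not shown.

```python
def gene_set_distance(set_i, set_j):
    """Distance = 1 - |shared genes| / size of the smaller gene set."""
    set_i, set_j = set(set_i), set(set_j)
    smaller = min(len(set_i), len(set_j))
    if smaller == 0:
        raise ValueError("gene sets must be non-empty")
    return 1.0 - len(set_i & set_j) / smaller

def distance_matrix(gene_sets):
    """Return (sorted disease names, symmetric matrix of pairwise distances)."""
    names = sorted(gene_sets)
    matrix = [
        [gene_set_distance(gene_sets[a], gene_sets[b]) for b in names]
        for a in names
    ]
    return names, matrix

# Illustrative subsets only.
gene_sets = {
    "Hypertension": {"ACE", "AGT", "NOS3", "CYP11B2"},
    "Myocardial Infarction": {"NOS3", "ACE", "SERPINE1"},
    "Pancreatitis": {"SPINK1", "CFTR", "PRSS1"},
}
names, matrix = distance_matrix(gene_sets)
for name, row in zip(names, matrix):
    print(name, [round(d, 3) for d in row])
```

Dividing by the smaller set means that a small gene set fully contained in a larger one has distance 0 from it, which is a property of this measure worth keeping in mind when reading the dendrograms.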

Hierarchical clustering of gene sets

Ward's minimum variance method [19] was used to find the distance between two diseases. The distance between two clusters is the ANOVA sum of squares between the two clusters added up over all the variables. At each generation, the within-cluster sum of squares is minimized over all partitions obtainable by merging two clusters from the previous generation. Ward's method joins clusters to maximize the likelihood at each level of the hierarchy under the assumptions of multivariate normal mixtures, spherical covariance matrices, and equal sampling probabilities. The distance for Ward's method (taken from the JMP manual) is:

$$D_{KL} = \frac{\lVert \bar{x}_K - \bar{x}_L \rVert^{2}}{\dfrac{1}{N_K} + \dfrac{1}{N_L}}$$

where N_K is the number of observations in C_K (the Kth cluster, a subset of {1, 2, ..., n}, where n is the number of observations), and \bar{x}_K is the mean vector for cluster C_K.
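As a small worked illustration, the sketch below computes Ward's distance between two clusters in its standard form: the squared Euclidean distance between the cluster mean vectors, divided by (1/N_K + 1/N_L). The function names and toy observations are assumptions for the example; how each disease was encoded as an observation vector in the actual analysis is not stated here, and the analysis itself was run in JMP, not with this code.

```python
def mean_vector(points):
    """Component-wise mean of a list of equal-length numeric vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]

def ward_distance(cluster_k, cluster_l):
    """Ward's distance: ||mean_K - mean_L||^2 / (1/N_K + 1/N_L)."""
    mean_k = mean_vector(cluster_k)
    mean_l = mean_vector(cluster_l)
    squared_gap = sum((a - b) ** 2 for a, b in zip(mean_k, mean_l))
    return squared_gap / (1.0 / len(cluster_k) + 1.0 / len(cluster_l))

# Toy observations, invented for illustration.
cluster_k = [[0.0, 0.0], [2.0, 0.0]]   # mean vector (1, 0), N_K = 2
cluster_l = [[4.0, 4.0]]               # mean vector (4, 4), N_L = 1
d_kl = ward_distance(cluster_k, cluster_l)
print(d_kl)  # squared gap 25 divided by (1/2 + 1/1)
```

At each merge step, an agglomerative procedure using this criterion would join the pair of clusters with the smallest such distance.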

Results

Each record in GAD represents a specific gene from a unique publication of a human population-based genetic association study and is categorized into one of 24 general disease classes corresponding to broad MeSH disease or disease phenotypic groupings. Table 1 summarizes the number of positively associated human genes in each MeSH human disease class. As these disease classes show, GAD covers a broad selection of diseases, including aging studies, cancer, immune disorders, psychiatric diseases, metabolic conditions, pharmacogenomic studies, and studies of chemical dependency, among others. Similarly, each record in the phenotype files from the MGI phenotype database represents a unique mouse gene-specific genetic model. Table 2 shows the general categories represented by the mouse phenotype summary files and the number of mouse genes found in each top-level phenotype class. The mouse files contain a greater number of intermediate developmental and morphological phenotypes (e.g. insulin resistance, absent CD4+ T cells, abnormal spatial learning), while the human files tend to comprise a greater number of end-stage clinical disease phenotypes (e.g. Type 2 Diabetes, multiple sclerosis, autism).
Table 1

Number of human genes associated in each Disease Class

| Disease class | Number of human genes |
|---|---|
| Neoplasms | 1835 |
| Cardiovascular Diseases | 1112 |
| Pathological Conditions, Signs and Symptoms | 938 |
| Nervous System Diseases | 902 |
| Nutritional and Metabolic Diseases | 838 |
| Mental Disorders | 554 |
| Digestive System Diseases | 407 |
| Male Urogenital Diseases | 396 |
| Musculoskeletal Diseases | 366 |
| Respiratory Tract Diseases | 362 |
| Bacterial Infections and Mycoses | 256 |
| Disorders of Environmental Origin | 243 |
| Female Urogenital Diseases and Pregnancy Complications | 226 |
| Virus Diseases | 224 |
| Skin and Connective Tissue Diseases | 212 |
| Hemic and Lymphatic Diseases | 183 |
| Eye Diseases | 176 |
| Congenital, Hereditary, and Neonatal Diseases and Abnormalities | 142 |
| Stomatognathic Diseases | 130 |
| Immune System Diseases | 116 |
| Endocrine System Diseases | 98 |
| Parasitic Diseases | 57 |
| Otorhinolaryngologic Diseases | 35 |
| Animal Diseases | 4 |

Table 2

Number of Mouse genes in each General Phenotypic Class

| Phenotypic class | Number of mouse genes |
|---|---|
| unassigned top level | 19186 |
| nervous system phenotype | 8149 |
| immune system phenotype | 6414 |
| homeostasis/metabolism phenotype | 5976 |
| skeleton phenotype | 5559 |
| growth/size phenotype | 5556 |
| behavior/neurological phenotype | 5417 |
| cardiovascular system phenotype | 5221 |
| hematopoietic system phenotype | 5163 |
| reproductive system phenotype | 4762 |
| lethality-prenatal/perinatal | 4409 |
| embryogenesis phenotype | 3416 |
| skin/coat/nails phenotype | 3048 |
| vision/eye phenotype | 2710 |
| hearing/vestibular/ear phenotype | 2447 |
| muscle phenotype | 2370 |
| cellular phenotype | 2335 |
| normal phenotype | 2120 |
| renal/urinary system phenotype | 2104 |
| endocrine/exocrine gland phenotype | 1871 |
| life span-post-weaning/aging | 1857 |
| respiratory system phenotype | 1832 |
| digestive/alimentary phenotype | 1780 |
| lethality-postnatal | 1777 |
| liver/biliary system phenotype | 1498 |
| limbs/digits/tail phenotype | 1282 |
| tumorigenesis | 1268 |
| adipose tissue phenotype | 1067 |
| craniofacial phenotype | 1016 |
| pigmentation phenotype | 634 |
| touch/vibrissae phenotype | 625 |
| no phenotypic analysis | 403 |
| other phenotype | 343 |
| taste/olfaction phenotype | 156 |
Table 3 introduces examples of human genes from fundamental biological pathways that have been consistently associated with major disease phenotypes, highlighting the sometimes broad pleiotropic effects that major regulatory molecules have on multiple disease phenotypes. Genes such as NOS3 (nitric oxide synthase 3), which regulates nitric oxide production; HLA-DQB1 (the MHC class II molecule DQ beta 1), involved in antigen presentation; ACE (angiotensin I converting enzyme), central to the renin-angiotensin system; and PPARG (peroxisome proliferator-activated receptor gamma), which regulates transcription in pathways important in lipid metabolism, are examples of genes that affect multiple tissues and different organ systems through the complex course of disease progression. Importantly, all the mouse orthologs of the human genes in Table 3 have experimentally determined phenotypes that are similar to or broadly overlapping with human clinical disease phenotypes (see below).
Table 3

Selected Major Genes and Disease Phenotypes

| Gene | Disease phenotype (count) | Gene | Disease phenotype (count) |
|---|---|---|---|
| APOE | ALZHEIMER DISEASE (70) | VDR | PROSTATIC NEOPLASMS (10) |
| | CORONARY DISEASE (8) | | OSTEOPOROSIS, POSTMENOPAUSAL (8) |
| | CARDIOVASCULAR DISEASES (7) | | BREAST NEOPLASMS (7) |
| | MYOCARDIAL INFARCTION (6) | | DIABETES MELLITUS, TYPE 1 (6) |
| | DIABETES MELLITUS, TYPE 2 (6) | | OSTEOPOROSIS (6) |
| ACE | HYPERTENSION (47) | MTHFR | NEURAL TUBE DEFECTS (6) |
| | DIABETES MELLITUS, TYPE 2 (25) | | COLORECTAL NEOPLASMS (5) |
| | MYOCARDIAL INFARCTION (17) | | DIABETES MELLITUS, TYPE 2 (5) |
| | CORONARY DISEASE (16) | | ESOPHAGEAL NEOPLASMS (5) |
| | DIABETIC NEPHROPATHIES (15) | | ADENOCARCINOMA (4) |
| HLA-DQB1 | DIABETES MELLITUS, TYPE 1 (30) | CYP17A1 | BREAST NEOPLASMS (10) |
| | PAPILLOMAVIRUS INFECTIONS (7) | | PROSTATIC NEOPLASMS (9) |
| | CELIAC DISEASE (6) | | PROSTATIC HYPERPLASIA (4) |
| | AUTOIMMUNE DISEASES (5) | | OSTEOPOROSIS, POSTMENOPAUSAL (3) |
| | TUBERCULOSIS, PULMONARY (5) | | ENDOMETRIAL NEOPLASMS (2) |
| DRD2 | ALCOHOLISM (17) | ADRB2 | ASTHMA (12) |
| | SCHIZOPHRENIA (14) | | OBESITY (10) |
| | PERSONALITY DISORDER (2) | | HYPERTENSION (8) |
| | DEPRESSIVE DISORDER (2) | | DIABETES MELLITUS, TYPE 2 (4) |
| | DYSKINESIA, DRUG INDUCED (2) | | BRONCHIAL HYPERREACTIVITY (4) |
| PPARG | DIABETES MELLITUS, TYPE 2 (18) | NOS3 | HYPERTENSION (20) |
| | OBESITY (11) | | MYOCARDIAL INFARCTION (18) |
| | DIABETES MELLITUS (6) | | CORONARY ARTERY DISEASE (15) |
| | INSULIN RESISTANCE (4) | | CORONARY DISEASE (12) |
| | GLUCOSE INTOLERANCE (2) | | DIABETES MELLITUS, TYPE 2 (10) |

Summaries of genes and phenotypes in human and mouse

The majority of this report is built upon large non-redundant general summary lists for both human and mouse, shown below. These lists take two complementary forms in both human and mouse. The first sets are GENE-to-Disease/Phenotype lists. These are non-redundant lists of genes showing the diseases or phenotypes that have been associated with each gene (Table 4 human, Table 5 mouse, and Table 6 human-mouse). The second sets of basic lists are DISEASE/PHENOTYPE-to-Gene lists. These are non-redundant lists of diseases or phenotypes with the genes that have been associated with that disease or phenotype (Table 7 human and Table 8 mouse).
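The two complementary list forms are inversions of one another. As a minimal sketch (the data structure and function name are assumptions, not the authors' implementation), a GENE-to-Disease mapping with counts can be turned into a DISEASE-to-Gene mapping ranked by count. The counts in the example are taken from the ACE, AGT, and NOS3 rows of Table 4, restricted to two diseases.

```python
from collections import defaultdict

def invert_counts(gene_to_disease):
    """Turn {gene: [(disease, count), ...]} into {disease: [(gene, count), ...]}.

    Genes under each disease are sorted by descending count,
    with ties broken alphabetically by gene symbol.
    """
    disease_to_gene = defaultdict(list)
    for gene, diseases in gene_to_disease.items():
        for disease, n in diseases:
            disease_to_gene[disease].append((gene, n))
    for disease in disease_to_gene:
        disease_to_gene[disease].sort(key=lambda pair: (-pair[1], pair[0]))
    return dict(disease_to_gene)

# Subset of Table 4 (positive association counts).
gene_to_disease = {
    "ACE": [("Hypertension", 47), ("Myocardial Infarction", 17)],
    "AGT": [("Hypertension", 24), ("Myocardial Infarction", 5)],
    "NOS3": [("Hypertension", 20), ("Myocardial Infarction", 18)],
}
disease_to_gene = invert_counts(gene_to_disease)
for disease, genes in disease_to_gene.items():
    print(disease, genes)
```

For these three genes, the resulting order under Hypertension (ACE, AGT, NOS3) agrees with the first three gene ranks listed for Hypertension in Table 7.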
Table 4

Selected Human Genes and Disease Phenotype (MeSH counts), positive associations

| Gene ID | HUGO gene symbol | MeSH term 1 | MeSH term 2 | MeSH term 3 | MeSH term 4 |
|---|---|---|---|---|---|
| 348 | APOE | Alzheimer Disease (70) | Coronary Disease (8) | Cardiovascular Diseases (7) | Diabetes Mellitus, Type 2 (6) |
| 1636 | ACE | Hypertension (47) | Diabetes Mellitus, Type 2 (25) | Myocardial Infarction (17) | Coronary Disease (16) |
| 3119 | HLA-DQB1 | Diabetes Mellitus, Type 1 (30) | Papillomavirus Infections (7) | Celiac Disease (6) | Tuberculosis, Pulmonary (5) |
| 1493 | CTLA4 | Diabetes Mellitus, Type 1 (28) | Graves Disease (21) | Thyroiditis, Autoimmune (10) | Autoimmune Diseases (8) |
| 183 | AGT | Hypertension (24) | Coronary Disease (6) | Diabetic Nephropathies (5) | Myocardial Infarction (5) |
| 1814 | DRD3 | Schizophrenia (24) | Dyskinesia, Drug-Induced (6) | Psychotic Disorders (5) | Alcoholism (2) |
| 4846 | NOS3 | Hypertension (20) | Myocardial Infarction (18) | Coronary Artery Disease (15) | Coronary Disease (12) |
| 3075 | CFH | Macular Degeneration (19) | Choroidal Neovascularization (3) | Hemolytic-Uremic Syndrome (2) | Atrophy (2) |
| 3077 | HFE | Hemochromatosis (18) | Cardiovascular Diseases (1) | Colorectal Neoplasms (1) | Liver Cirrhosis (1) |
| 3356 | HTR2A | Schizophrenia (18) | Alzheimer Disease (4) | Depressive Disorder (4) | Depressive Disorder, Major (4) |
| 1585 | CYP11B2 | Hypertension (18) | Cardiovascular Diseases (2) | Ventricular Dysfunction, Left (2) | Cardiomyopathy, Dilated (2) |
| 5468 | PPARG | Diabetes Mellitus, Type 2 (18) | Obesity (11) | Diabetes Mellitus (6) | Insulin Resistance (4) |
| 2784 | GNB3 | Hypertension (18) | Insulin Resistance (4) | Diabetes Mellitus, Type 2 (3) | Obesity (3) |
| 1815 | DRD4 | Attention Def. Dis. with Hyperact. (17) | Schizophrenia (8) | Substance-Related Disorders (4) | Mood Disorders (4) |
| 1813 | DRD2 | Alcoholism (17) | Schizophrenia (14) | Personality Disorders (2) | Depressive Disorder (2) |
| 155 | ADRB3 | Obesity (17) | Diabetes Mellitus, Type 2 (9) | Insulin Resistance (6) | Endometrial Neoplasms (2) |
| 9370 | ADIPOQ | Diabetes Mellitus, Type 2 (17) | Insulin Resistance (11) | Obesity (8) | Hypertension (4) |
| 3123 | HLA-DRB1 | Arthritis, Rheumatoid (16) | Diabetes Mellitus, Type 1 (16) | Multiple Sclerosis (8) | Lupus Erythematosus, Systemic (7) |
| 118 | ADD1 | Hypertension (16) | Cardiovascular Diseases (3) | Cerebral Hemorrhage (1) | Diabetic Angiopathies (1) |
| 3117 | HLA-DQA1 | Diabetes Mellitus, Type 1 (15) | Graves Disease (4) | Autoimmune Diseases (4) | Celiac Disease (4) |
| 1956 | EGFR | Lung Neoplasms (15) | Carcinoma, Non-SC Lung (10) | Adenocarcinoma (6) | Neoplasm Recurrence, Local (3) |
| 6690 | SPINK1 | Pancreatitis (15) | Chronic Disease (11) | Acute Disease (3) | Pancreatitis, Alcoholic (3) |
| 6934 | TCF7L2 | Diabetes Mellitus, Type 2 (15) | Insulin Resistance (4) | Diabetes Mellitus (2) | Liver Neoplasms (1) |
| 1234 | CCR5 | HIV Infections (14) | Diabetes Mellitus, Type 2 (4) | Diabetic Nephropathies (4) | Asthma (3) |
| 5663 | PSEN1 | Alzheimer Disease (14) | Down Syndrome (2) | Dementia (1) | Cerebral Amyloid Angiopathy (1) |
| 11132 | CAPN10 | Diabetes Mellitus, Type 2 (14) | Insulin Resistance (3) | Polycystic Ovary Syndrome (2) | Obesity (2) |
| 3553 | IL1B | Stomach Neoplasms (14) | Helicobacter Infections (6) | Alzheimer Disease (5) | Periodontitis (3) |
| 6532 | SLC6A4 | Depressive Disorder, Major (13) | Depressive Disorder (13) | Bipolar Disorder (10) | Alcoholism (8) |
| 4210 | MEFV | Familial Mediterranean Fever (13) | Amyloidosis (4) | Behcet Syndrome (3) | Colitis, Ulcerative (2) |
| 3172 | HNF4A | Diabetes Mellitus, Type 2 (13) | Glucose Intolerance (2) | Birth Weight (1) | Fetal Macrosomia (1) |
| 7157 | TP53 | Carcinoma, Squamous Cell (13) | Lung Neoplasms (12) | Breast Neoplasms (10) | Carcinoma, Non-SC Lung (9) |
| 672 | BRCA1 | Breast Neoplasms (12) | Ovarian Neoplasms (5) | Carcinoma, Endometrioid (1) | DNA Damage (1) |
| 185 | AGTR1 | Hypertension (12) | Myocardial Infarction (3) | Coronary Disease (3) | Pregnancy Comp., Cardiovascular (2) |
| 154 | ADRB2 | Asthma (12) | Obesity (10) | Hypertension (8) | Diabetes Mellitus, Type 2 (4) |
| 3953 | LEPR | Obesity (12) | Body Weight (4) | Insulin Resistance (4) | Glucose Intolerance (3) |
| 2169 | FABP2 | Diabetes Mellitus, Type 2 (12) | Insulin Resistance (10) | Obesity (7) | Hyperlipidemias (4) |
| 929 | CD14 | Asthma (12) | Myocardial Infarction (5) | Arteriosclerosis (4) | Colitis, Ulcerative (4) |
| 26191 | PTPN22 | Arthritis, Rheumatoid (11) | Diabetes Mellitus, Type 1 (9) | Lupus Erythematosus, Systemic (5) | Arthritis, Psoriatic (2) |
| 3596 | IL13 | Asthma (11) | Hypersensitivity, Immediate (4) | Pulmonary Dis., Chronic Obstr. (4) | Respiratory Hypersensitivity (2) |
| 1080 | CFTR | Cystic Fibrosis (10) | Pancreatitis (5) | Chronic Disease (3) | Acute Disease (2) |

Table 5

Selected Mouse Genes-Disease Phenotypes

| Mouse gene symbol | Human ortholog gene symbol | Mouse phenotype 1 | Mouse phenotype 2 | Mouse phenotype 3 | Mouse phenotype 4 | Mouse phenotype 5 |
|---|---|---|---|---|---|---|
| A4galt | A4GALT | abnormal induced morb./mort. | abnormal resp./metab. to xenobiotics | life span-post-weaning/aging | homeostasis/metab. phenotype | |
| Abca2 | ABCA2 | tremors | decreased body weight | behavior/neurological phenotype | hyperactivity | increased startle reflex |
| Abcc2 | ABCC2 | abnormal blood chemistry | abnormal liver physiology | abnormal urine chemistry | abnormal kidney physiology | Abn. resp./metabolism to xenobiotics |
| Abi2 | ABI2 | abn. corpus callosum morph. | abnormal cerebral cortex morph. | abnormal hippocampus morph. | abnormal dentate gyrus morph. | microphthalmia |
| Acaca | ACACA | abnormal liver physiology | abnormal lipid level | incr. circulating free fatty acid level | hyperglycemia | embryonic growth arrest |
| Acads | ACADS | hypoglycemia | behavior/neurological phenotype | abnormal drinking behavior | abnormal food preference | abnormal urine chemistry |
| Accn1 | ACCN1 | retinal degeneration | vision/eye phenotype | abnormal eye electrophysiology | | |
| Adad1 | ADAD1 | impaired fertilization | male infertility | asthenozoospermia | oligozoospermia | reproductive system phenotype |
| Adam23 | ADAM23 | tremors | behavior/neurological phenotype | ataxia | postnatal lethality | lethality-postnatal |
| Adarb1 | ADARB1 | behavior/neurological phenot. | seizures | postnatal lethality | behavior/neurological phenotype | normal phenotype |
| Adipoq | ADIPOQ | vasculature congestion | increased body weight | decreased body weight | abnormal CNS syn. transmission | abnormal coat appearance |
| Adora1 | ADORA1 | behavior/neurological phenot. | increased anxiety-related response | abnormal body temperature regulation | abnormal angiogenesis | abnormal nervous system electrophys. |
| Ager | AGER | increased bone density | abnormal cancellous bone morph. | abnormal blood chemistry | reproductive system phenotype | abnormal cell proliferation |
| Akap1 | AKAP1 | reduced female fertility | decreased litter size | abnormal female meiosis | increased cholesterol level | |
| Apoc1 | APOC1 | abnormal circ. cholesterol level | abnormal lipid level | increased circulating triglyceride level | abnormal immune sys. morph. | abnormal bile composition |
| B2m | B2M | decreased hematocrit | abnormal interleukin-10 physiology | rectal prolapse | abnormal dorsal root gang. morph. | enlarged spleen |
| Bax | BAX | enlarged spleen | increased thymocyte number | abnormal motor neuron morph. | short snout | abnormal sympathetic neuron morph. |
| Bcl2 | BCL2 | small ears | absent melanin granules in hair follicle | abnormal snout morph. | herniated abdominal wall | abnormal small intestine morph. |
| Bmp1 | BMP1 | abnormal heart morph. | abnormal aorta morph. | abnormal ventricular septum morph. | abnormal awl hair | prenatal lethality |
| Brca1 | BRCA1 | abnormal cell death | increased cell proliferation | decreased cell proliferation | decreased anxiety-related resp. | kinked tail |
| Capn10 | CAPN10 | abnormal pancreas physiology | endocrine/exocrine gland phenotype | digestive/alimentary phenotype | decreased inflammatory response | |
| Casp1 | CASP1 | abnormal apoptosis | abnormal induced morbidity/mortality | abnormal inflammatory response | decr. suscep. to endotoxin shock | tumorigenesis |
| Ccr4 | CCR4 | immune system phenotype | decreased tumor necrosis factor secr. | decreased interleukin-1 beta secretion | abnormal induced morbid./mort. | |
| Dusp1 | DUSP1 | thick alveolar septum | abnormal circ. alanine transaminase | hypotension | increased thymocyte number | lung inflammation |
| E2f1 | E2F1 | abnormal cell death | decreased salivation | enlarged thymus | pale liver | exencephaly |
| Epo | EPO | abnormal erythropoiesis | abnormal pericardium morph. | small liver | postnatal growth retardation | abnormal hepatocyte morph. |
| Ercc4 | ERCC4 | abnormal cell content/morph. | abnormal liver morph. | decreased body weight | absent blood islands | liver/biliary system phenotype |
| F5 | F5 | behavior/neurological phenot. | abnormal somite development | abnormal yolk sac morph. | increased suscep. to bact. infect. | hemorrhage |
| Fcgr1 | FCGR1A | impaired macrophage phagocyt. | abnormal inflammatory response | decreased inflammatory response | abnormal yolk sac morph. | abnormal cell-mediated immunity |
| Foxo1 | FOXO1 | absent organized vascular net. | abnormal looping morphogenesis | abnormal vasculature | exencephaly | absent vitelline blood vessels |
| Gadd45a | GADD45A | decreased leukocyte cell num. | increased cell proliferation | increased thymocyte number | postnatal lethality | skin irradiation sensitivity |
| Gap43 | GAP43 | decreased body weight | abnormal optic nerve innervation | absent optic tract | abnormal erythropoiesis | nervous system phenotype |
| Gata1 | GATA1 | decreased hematocrit | abnormal thrombopoiesis | extramedullary hematopoiesis | overexpanded resp. alveoli | liver hypoplasia |
| Grin1 | GRIN1 | abn. trigeminal nerve morph. | atelectasis | lung hemorrhage | abnormal tympanic ring morph. | decreased body weight |
| Hoxa1 | HOXA1 | small ears | abnormal inner ear morph. | abnormal malleus morph. | increased susceptibility to injury | abnormal cochlea morph. |
| Hspa1a | HSPA1A | decreased body weight | increased cell. sens. to gamma-irrad. | chromosome breakage | increased body weight | homeostasis/metabolism phenotype |
| Icam1 | ICAM1 | increased leukocyte cell number | increased neutrophil cell number | increased monocyte cell number | abnormal spatial learning | abnormal retina morph. |
| Igbp1 | IGBP1 | decreased thymocyte number | behavior/neurological phenotype | abnormal cued conditioning behavior | intestinal ulcer | abnormal thymus lobule morph. |

Table 6

Selected Human-Mouse Phenotype Overlap

| Mouse gene symbol | Human gene symbol | Human gene ID | Human disease MeSH term (count) | Mouse phenotype term |
|---|---|---|---|---|
| Npc1l1 | NPC1L1 | 29881 | Hypercholesterolemia (1) | abnormal circulating LDL cholesterol level; decreased circulating HDL cholesterol level; abnormal triglyceride level; abnormal lipid homeostasis; ... |
| Nkx2-5 | NKX2-5 | 1482 | Heart Defects, Congenital (1); Heart Block (1) | abnormal heart development; abnormal looping morphogenesis; abnormal heart tube morphology; abnormal heart shape; thin ventricular wall; ... |
| Oprm1 | OPRM1 | 4988 | Alcoholism (9); Substance-Related Disorders (5); Heroin Dependence (2); Pain, Postoperative (2); Epilepsy, Generalized (1); Substance Withdrawal Syndrome (1); Cocaine-Related Disorders (1); Diabetes Mellitus, Type 2 (1); Kidney Failure, Chronic (1); Pain (1); Ischemia (1); Opioid-Related Disorders (1); Postoperative Nausea and Vomiting (1) | abnormal response to addictive substance; preference for addictive substance; abnormal touch/nociception; abnormal pain threshold; decreased chemically-elicited antinociception; sensitivity to addictive substance; excitatory postsyn. potential; resistance to addictive substance; altered response to anesthetics; ... |
| Homer1 | HOMER1 | 9456 | Cocaine-Related Disorders (1) | cocaine preference; abnormal conditioning behavior; abnormal response to addictive substance; nervous system phenotype; abnormal nervous system physiology; behavior/neurological phenotype; ... |
| Insl3 | INSL3 | 3640 | Cryptorchidism (3); Abnormalities, Multiple (1); Hypospadias (1); Gonadal Dysgenesis (1); Infertility, Male (1); Testicular Diseases (1) | abnormal male reproductive anatomy; small testis; abnormal spermatogenesis; behavior/neurological phenotype; male infertility; female infertility; abnormal estrous cycle; abnormal gametogenesis; decreased germ cell number; cryptorchism; ... |
| Stat6 | STAT6 | 6778 | Asthma (3); Hypersensitivity (3); Dermatitis, Atopic (2); Anaphylaxis (2); Nut Hypersensitivity (1); Nephrotic Syndrome (1); Infertility (1); Hypersensitivity, Immediate (1); Graves Disease (1); Endometriosis (1); ... | abnormal humoral immune response; decreased IgM level; decreased IgA level; decreased susceptibility to viral infection; decreased IgE level; increased IgG level; increased IgA level; abnormal interleukin physiology; abnormal interferon physiology; abnormal CD8-positive T cell morphology; ... |
| En2 | EN2 | 2020 | Autistic Disorder (1); Asperger Syndrome (1) | abnormal social investigation; abnormal spatial learning; abnormal social/conspecific interaction; abnormal cerebellum morphology; abnormal cerebellar foliation; abnormal vermis morphology; abnormal cerebellar granule layer; abnormal colliculi morphology; hyperactivity; impaired coordination; abnormal grooming behavior; ... |
| Hsd11b1 | HSD11B1 | 3290 | Diabetes Mellitus, Type 2 (2); Obesity (2); Hypertension (2); Insulin Resistance (2); Polycystic Ovary Syndrome (1); Hyperandrogenism (1) | abnormal abdominal fat pads; abnormal circulating cholesterol level; decreased circulating LDL cholesterol level; enlarged adrenal glands; increased circulating HDL cholesterol level; abnormal glucose homeostasis; decreased circulating triglyceride level; abnormal corticosterone level; improved glucose tolerance; ... |
| Msh3 | MSH3 | 4437 | Lung Neoplasms (1); Head and Neck Neoplasms (1); Colonic Neoplasms (1); Carcinoma, Squamous Cell (1); Carcinoma, Small Cell (1) | tumorigenesis; increased tumor incidence; premature death; life span-post-weaning/aging |
| Crb1 | CRB1 | 23418 | Optic Atrophies, Hereditary (1); Blindness (1) | abnormal retinal photoreceptor morphology; abnormal retina morphology; retinal degeneration; decreased retinal photoreceptor cell number; photosensitivity; abnormal ocular fundus morphology; nervous system phenotype; abnormal retinal photoreceptor layer; abnormal photoreceptor inner segment morph; ... |
| Chrna7 | CHRNA7 | 1139 | Schizophrenia (3); Auditory Perceptual Disorders (1); Memory Disorders (1) | pharmacologically induced seizures; decreased anxiety-related response; abnormal spatial learning; abnormal hippocampus function; abnormal tumor necrosis factor physiology; homeostasis/metabolism phenotype |
| Inha | INHA | 3623 | Ovarian Failure, Premature (2); Amenorrhea (1) | kyphoscoliosis; abnormal liver morphology; abnormal ovarian follicle morphology; enlarged testes; abnormal spermatogenesis; increased circulating follicle stimulating hormone; male infertility; female infertility; tumorigenesis; ovary hemorrhage; cachexia; diffuse hepatic necrosis; pancytopenia; liver/biliary system phenotype; ... |
| Slc6a3 | SLC6A3 | 6531 | Attention Deficit Disorder w/Hyp. (7); Tobacco Use Disorder (3); Schizophrenia (2); Alcohol Withdrawal Delirium (2); Eating Disorders (1); Substance Withdrawal Syndrome (1); Stress Disorders, Post-Traumatic (1); Child Behavior Disorders (1); Bulimia (1); Alcoholism (1) | abnormal maternal nurturing; hyperactivity; hypoactivity; impaired coordination; increased exploration in new environment; decreased exploration in new environment; abnormal spatial learning; abnormal pituitary secretion; abnormal lactation; increased dopamine level; cocaine preference; ... |
| Cyp11b2 | CYP11B2 | 1585 | Hypertension (18); Cardiovascular Diseases (2); Ventricular Dysfunction, Left (2); Cardiomyopathy, Dilated (2); Arteriosclerosis (1); Acromegaly (1); Fibrosis (1); Arthritis, Rheumatoid (1); Polycystic Ovary Syndrome (1); Metabolic Syndrome X (1) | decreased body size; hypotension; increased circulating corticosterone level; decreased circulating aldosterone level; decreased circulating chloride level; increased circulating renin level; abnormal enzyme/coenzyme level; lethality-postnatal; homeostasis/metabolism phenotype; growth/size phenotype; ... |
| Ptpn22 | PTPN22 | 26191 | Arthritis, Rheumatoid (11); Diabetes Mellitus, Type 1 (9); Lupus Erythematosus, Systemic (5); Arthritis, Psoriatic (2); Autoimmune Diseases (2); Arthritis, Juvenile Rheumatoid (2); Nephritis (1); Multiple Sclerosis (1); Asthma (1); Cholangitis, Sclerosing (1) | enlarged spleen; enlarged lymph nodes; abnormal Peyer's patch germinal center morph; abnormal T cell physiology; increased IgE level; increased B cell number; immune system phenotype; hematopoietic system phenotype; increased follicular B cell number; increased spleen germinal center number; increased IgG1 level; increased IgG2a level; ... |

Table 7

Selected Human Disease Phenotypes (MeSH) and Gene counts, positive associations

Disease MeSH Term | Gene Rank 1 | Gene Rank 2 | Gene Rank 3 | Gene Rank 4 | Gene Rank 5 | Gene Rank 6 | Gene Rank 7 | Gene Rank 8
DISEASE CLASS - CARDIOVASCULAR
Hypertension | ACE(47) | AGT(24) | NOS3(20) | CYP11B2(18) | GNB3(18) | ADD1(16) | AGTR1(12) | ADRB2(8)
Myocardial Infarction | NOS3(18) | ACE(17) | SERPINE1(11) | ITGA2(7) | LPL(6) | APOE(6) | GP1BA(5) | F7(5)
Coronary Disease | ACE(16) | NOS3(12) | PON1(11) | APOB(11) | APOE(8) | LPL(7) | AGT(6) | SERPINE1(6)
Coronary Artery Disease | NOS3(15) | PON1(9) | ACE(7) | APOA5(6) | APOE(5) | AGT(4) | ABCA1(4) | APOA1(4)
Hypertrophy, Left Ventricular | ACE(15) | GNB3(3) | AGTR2(2) | EDN1(2) | TNNT2(2) | NOS3(2) | ENPP1(1) | ACE2(1)
Venous Thrombosis | F5(8) | F2(5) | SERPINE1(4) | MTHFR(3) | ABO(2) | F8(2) | JAK2(2) | PROCR(2)
Cardiovascular Diseases | APOE(7) | CETP(6) | ACE(5) | NOS3(5) | PON1(4) | APOA5(4) | APOC3(4) | SERPINE1(3)
Myocardial Ischemia | ACE(6) | LPL(5) | NOS3(2) | ITGB3(2) | APOB(2) | AGT(1) | AGTR1(1) | SELPLG(1)
Arteriosclerosis | ACE(5) | CD14(4) | PON1(3) | FGB(3) | MTHFR(3) | NOS3(3) | APOE(3) | TLR4(2)
Cardiomyopathies | TTR(5) | HFE(1) | APOA1(1) | HLADQB1(1) | SOD2(1) | CCR2(1) | SELE(1) | MMP9(1)
Heart Failure | ADRA2C(5) | ADRB1(5) | ACE(3) | NOS3(3) | ADRB2(3) | AMPD1(2) | SCNN1B(1) | EDN1(1)
DISEASE CLASS - DIGESTIVE SYS. DISEASES
Pancreatitis | SPINK1(15) | CFTR(5) | PRSS1(3) | HLA-DRB1(2) | HLA-A(1) | TLR4(1) | UGT1A7(1) | KRT8(1)
Cystic Fibrosis | CFTR(10) | NOS1(2) | SERPINA1(2) | SPINK1(1) | CAPN10(1) | SFTPA2(1) | GCLC(1) | FCGR2A(1)
Celiac Disease | HLADQB1(6) | CTLA4(6) | HLADQA1(4) | TNF(2) | PTPN22(1) | IFNG(1) | TIPARP(1) | IL21(1)
Crohn Disease | IL23R(6) | NOD2(5) | TNF(5) | ABCB1(4) | CD14(4) | IBD5(3) | DLG5(3) | MIF(3)
Liver Cirrhosis, Alcoholic | ALDH2(6) | ACE(1) | TNF(1) | SOD2(1) | ADH1C(1) | ADH1B(1) | DRD2(1) | CYP2E1(1)
Colitis, Ulcerative | ABCB1(5) | IL23R(5) | TNF(4) | CD14(4) | TLR4(3) | ICAM1(3) | IL1RN(3) | CTLA4(3)
Gastritis, Atrophic | MPO(3) | TLR4(1) | IL13(1) | PTPN11(1) | TNF(1) | ABO(1) | CMA1(1) | IL1B(1)
Inflammatory Bowel Diseases | TNF(3) | ABCB1(3) | IL23R(3) | ITPA(2) | NOD2(2) | DLG5(2) | HP(2) | PON1(1)
Cholangitis, Sclerosing | HLADRB1(2) | HP(2) | PTPN22(1) | MMP1(1) | HLADQA1(1) | TNF(1) | HLADQB1(1) | MMP3(1)
DISEASE CLASS - DIS. OF ENVIRONMENTAL ORIGIN
Alcoholism | DRD2(17) | OPRM1(9) | SLC6A4(8) | ALDH2(7) | MAOA(6) | GABRA2(4) | NPY(4) | ADH1B(3)
DNA Damage | XRCC1(7) | TP53(3) | CYP1A1(3) | GSTM1(3) | OGG1(3) | LIG4(2) | APEX1(2) | BRCA2(2)
Substance-Related Disorders | SLC6A4(5) | OPRM1(5) | DRD4(4) | DRD5(2) | BDNF(2) | ADH4(2) | CNR1(2) | DRD2(2)
Fractures, Bone | ESR1(4) | ESR2(2) | COL1A1(2) | CYP19A1(1) | IGF1(1) | TNFRSF11B(1) | P2RX7(1) | TGFB1(1)
Tobacco Use Disorder | CYP2A6(3) | SLC6A3(3) | CHRNA4(2) | TH(2) | BDNF(1) | PPP1R1B(1) | SLC18A2(1) | PTEN(1)
Cocaine-Related Disorders | PDYN(2) | HOMER1(1) | TTC12(1) | ANKK1(1) | DBH(1) | GSTP1(1) | OPRM1(1)
Heroin Dependence | OPRM1(2) | BDNF(1) | OPRD1(1) | SLC6A4(1) | COMT(1) | MAOA(1)
Spinal Fractures | COL1A1(2) | CYP19A1(1) | TNFRSF11B(1) | GC(1) | PLXNA2(1) | AR(1) | NOS3(1)
DISEASE CLASS - IMMUNE SYSTEM
Autoimmune Diseases | CTLA4(8) | HLADQB1(5) | HLADRB1(4) | HLADQA1(4) | PTPN22(2) | HLA-A(2) | CYP2D6(2) | CIITA(2)
Hypersensitivity, Immediate | IL4R(8) | IL13(4) | CD14(4) | IL4(2) | SERPINE1(2) | CCL5(2) | NOS2A(2) | CTLA4(2)
Graft vs Host Disease | IFNG(3) | TNF(2) | TLR4(1) | NOD2(1) | HLA-DPB1(1) | HLA-A(1) | IL10(1) | IL1R1(1)
Hypersensitivity | IL4(3) | STAT6(3) | IL4R(2) | IFNG(1) | IFNGR1(1) | TLR2(1) | FADS1(1) | IL13(1)
Antiphospholipid Syndrome | F2(2) | SELPLG(1) | SERPINE1(1) | FCGR2A(1) | HLADMA(1)
Food Hypersensitivity | IL4(1) | IL4R(1) | STAT6(1) | IL13(1) | HLADQB1(1) | CD14(1)
DISEASE CLASS - MENTAL DISORDERS
Schizophrenia | DRD3(24) | HTR2A(18) | DRD2(14) | COMT(10) | HTR2C(8) | BDNF(8) | DRD4(8) | NOTCH4(8)
Attention Deficit Disorder with Hyperactivity | DRD4(17) | SLC6A3(7) | SLC6A4(6) | ADRA2A(4) | MAOA(3) | SNAP25(3) | SLC6A2(2) | DRD5(2)
Depressive Disorder | SLC6A4(13) | HTR2A(4) | TPH1(3) | CYP2D6(2) | CYP2C19(2) | MAOA(2) | BDNF(2) | DRD2(2)
Depressive Disorder, Major | SLC6A4(13) | TPH1(5) | HTR2A(4) | TPH2(3) | BDNF(2) | DRD2(2) | GNB3(2) | DTNBP1(1)
Bipolar Disorder | SLC6A4(10) | BDNF(6) | MAOA(5) | COMT(5) | XBP1(3) | GABRA5(3) | HTR2A(3) | TPH2(3)
Anxiety Disorders | SLC6A4(7) | MAOA(3) | PLXNA2(1) | BDNF(1) | DBI(1) | MED12(1) | GABRB3(1) | DRD2(1)
Mood Disorders | SLC6A4(5) | DRD4(4) | CLOCK(2) | MAOA(2) | BDNF(2) | ACE(1) | CRH(1) | DRD3(1)
Psychotic Disorders | DRD3(5) | SLC6A4(3) | DRD4(3) | HTR2A(3) | DTNBP1(2) | DISC1(2) | DRD2(2) | MAOA(1)
Obsessive-Compulsive Disorder | SLC6A4(4) | HTR2A(3) | COMT(3) | SLC1A1(2) | DRD4(2) | HTR1B(1) | BDNF(1) | NRCAM(1)
Panic Disorder | CCK(4) | HTR1A(2) | MAOA(2) | HTR2A(2) | DBI(1) | CCKAR(1) | ADORA2A(1) | PGR(1)
Cognition Disorders | APOE(3) | BDNF(3) | DRD4(2) | COMT(2) | HMGCR(1) | DTNBP1(1) | SLC6A4(1) | NQO1(1)
DISEASE CLASS - NERVOUS SYSTEM DISEASES
Alzheimer Disease | APOE(70) | PSEN1(14) | A2M(10) | CYP46A1(8) | ACE(7) | BCHE(7) | IL1A(7) | BDNF(6)
Parkinson Disease | PARK2(9) | LRRK2(9) | CYP2D6(7) | MAOB(7) | BDNF(5) | SNCA(5) | PON1(4) | PINK1(4)
Multiple Sclerosis | HLADRB1(8) | APOE(5) | CTLA4(4) | PTPRC(4) | MBP(3) | HLA-DQB1(3) | IFNG(2) | CRYAB(2)
Amyotrophic Lateral Sclerosis | SOD1(6) | PON1(2) | PON2(2) | VEGFA(2) | SMN1(1) | MAPT(1) | MT-ND5(1) | PON3(1)
Brain Ischemia | FGB(5) | PDE4D(3) | NOS3(3) | ACE(2) | PON1(2) | MTHFR(2) | ITGB3(2) | TLR4(1)
Cerebrovascular Accident | NOS3(5) | APOE(5) | FGB(5) | PON1(4) | SERPINE1(3) | ALOX5AP(3) | ACE(2) | KL(2)
Carotid Artery Diseases | NOS3(4) | PON1(3) | MTHFR(3) | CCL2(2) | IL6(2) | APOE(2) | CD14(2) | ACE(1)
Dementia | APOE(4) | MAPT(3) | MT-ND1(1) | PRNP(1) | PSEN1(1) | TNF(1) | CDC2(1) | IGF1R(1)
DISEASE CLASS - NUTR. AND METABOLIC DISEASES
Diabetes Mellitus, Type 2 | ACE(25) | PPARG(18) | ADIPOQ(17) | TCF7L2(15) | CAPN10(14) | HNF4A(13) | FABP2(12) | NOS3(10)
Obesity | ADRB3(17) | LEPR(12) | MC4R(11) | PPARG(11) | UCP2(11) | ADRB2(10) | UCP1(8) | ADIPOQ(8)
Insulin Resistance | ADIPOQ(11) | FABP2(10) | INSR(7) | IRS1(7) | ENPP1(7) | ADRB3(6) | NOS3(6) | ACE(5)
Diabetes Mellitus | PPARG(6) | ACE(3) | INS(3) | NOS3(3) | PON1(2) | UBL5(2) | IRS1(2) | TCF7L2(2)
Hyperlipidemias | APOA5(5) | FABP2(4) | LPL(3) | APOE(3) | ACE(2) | APOA1(2) | PPARA(2) | PPARG(2)
Hypertriglyceridemia | APOC3(5) | APOA5(4) | LPL(3) | APOE(3) | ADRB2(2) | APOA4(2) | GP1BA(1) | LTA(1)
Glucose Intolerance | LEPR(3) | ADIPOQ(3) | IGF1(2) | KCNJ11(2) | PTPN1(2) | PPARG(2) | HNF4A(2) | NEUROG3(1)
Hypercholesterolemia | APOA1(3) | APOB(3) | F12(3) | ACE(2) | LDLR(2) | LPL(2) | PCSK9(2) | ABCG8(2)
Metabolic Syndrome | APOC3(3) | UBL5(2) | NOS3(2) | ACE(1) | PPARD(1) | NPY5R(1) | ACE2(1) | RGS2(1)
DISEASE CLASS - EYE DISEASES
Macular Degeneration | CFH(19) | APOE(4) | PON1(2) | C2(1) | CFB(1) | ABCA1(1) | HTRA1(1) | MELAS(1)
Diabetic Retinopathy | VEGFA(7) | AKR1B1(4) | PON1(3) | RAGE(3) | AGER(3) | ACE(2) | ITGA2(2) | ICAM1(2)
Glaucoma | CYP1B1(3) | OPTN(2) | OPA1(2) | OPTC(1) | EDNRA(1) | MYOC(1)
Ocular Hypertension | OPTN(2) | CYP1B1(1) | OLFM2(1) | OPA1(1)
Cataract | GALT(1) | AIPL1(1) | IFNGR1(1) | GCNT2(1)
Retinal Degeneration | NDP(1) | GUCA1A(1) | AIPL1(1) | COL2A1(1) | RHO(1) | GUCA1B(1) | ABCA4(1)
Myopia | HLADPB1(1) | LUM(1) | COL2A1(1) | NYX(1) | MYOC(1)
Table 8

Selected Mouse Disease Related Phenotypes

PhenoCodePhenoType
DISEASE CLASS CARDIOVASCULARGene 1Gene 2Gene 3Gene 4Gene 5Gene 6Gene 10
MP:0005048thrombosisAbca5Actc1Adamts13AhrAlox12Anxa2F2rl2
MP:0005341decreased sus. to atherosclerosisAPOA1ApoeArtlesAth17Ath29Ath37Icam1
MP:0000231hypertensionAbcc9Ace2Add2AgtAlb1-RenBpq5Chga
MP:0004181abnormal carotid artery morphologyAldh1a2ChrdCrkEdnraFgf8Foxm1Shc1
MP:0004111abnormal coronary artery morph.AdmAhrFgf8Gja1Hspg2Itga4Vegfa
MP:0005338atherosclerotic lesionsAorls1Aorls2ApoeAth29Ath6Ath8Fabp4-Aebp1
MP:0000343altered resp. to myocardial infarctionAgtr2Aifm1Ak1Bnip3CMV-Abcc9Ccr1Ckm-Prkaa2
MP:0006058decreased cerebral infarction sizeACTB-NgbEGFPAdora2aCx3cl1F11F12Plat
MP:0003037increased infarction sizeAifm1Fgf2Hmox1KitMapk1Myh6-tTAThbd
MP:0004875Inc. mean arterial blood pressureDdah1Edn1EdnrbKcnn3Ptger1Tagln-tTA
MP:0005339Inc. susceptibility to atherosclerosisApoa1ApoeArtlesAscla1Ascla2Ascla3Ath18
DISEASE CLASS - DIGESTIVE SYSTEM DISEASESGene 1Gene 2Gene 3Gene 4Gene 5Gene 6Gene 10
MP:0003119abnormal digestive system dev.Cdkn1cCyp26a1Foxp4Mapk7Mcm4Nckap1Tbx6
MP:0000462abnormal digestive system morph.ApcBmp5Cdcs1Cdkn1cCftrCtnnbip1Gast
MP:0001663abnormal digestive system phys.ApoeCd44CftrClec7aCol2a1Fut2Gpx1
MP:0000474abnormal foregut morphologyApcFoxa2Gata4Gdf1HgsLdb1Otx2
MP:0000488abnormal intestinal epithelium morphAtrB4galt1B9d2Bdkrb2Cbfa2t2Col1a1Elf3
MP:0003449abnormal intestinal goblet cellsAregCbfa2t2CftrClca3Ctnnb1E2f4Il13
MP:0006001abnormal intestinal transit timeDrd2Gfra2Gucy1b3Hmox2Mrvi1Smtn
MP:0000470abnormal stomach morphologyAhrAireBarx1Celsr3Cfc1Col1a1Gdf11
DISEASE CLASS - DIS. OF ENVIRONMENTAL ORIGINGene 1Gene 2Gene 3Gene 4Gene 5Gene 6Gene 10
MP:0001425abnormal alcohol consumptionAaq1Alcp1Alcp19Alcp2Ap7qAp8qPpp1r1b
MP:0005443abnormal ethanol metabolismAdh1Adh7Afteq1Afteq2Alcw3Htas2
MP:0002552abnormal response to addictive sub.Adora2aAdra1dAlcw1Alcw2Alcw3Alcw4Chrna4
MP:0001987alcohol preferenceAlcp1Alcp25Alcp3Alcp4AlprfAp1qAp5q
MP:0001988cocaine preferenceGrm2Homer1Homer2Per2Slc6a3Slc6a4
MP:0003546decreased alcohol consumptionCamk2aGnasGria3Prkcetmgc55
MP:0004048resistance to addictive substanceAdora2aAdra1bApba1Aqp4Btbd14bChrna4Slc6a3
DISEASE CLASS - IMMUNE SYSTEM DISEASESGene 1Gene 2Gene 3Gene 4Gene 5Gene 6Gene 10
MP:0001844autoimmune responseTcraTcrbACTBAireCd1d1FasIkzf3
MP:0005016decreased lymphocyte cell numberAtmBcl2Bcl6bBirc2C3ar1Ccr9Ctsd
MP:0008088abnormal T-helper 1 cell diff.CbfbIfngr2Il2Il4Irf4Mapk8Sit1
MP:0002499chronic inflammationCcr7Gstz1Hmox1Il10Il1rnJak3Plcg2
MP:0004804dec. sus. to autoimmune diabetesHLA-DQA1HLA-DQB1Art2aB2mCd4Cd4DsRedCdk4
MP:0002411decreased sus. to bacterial infectionAnthAnth2B2mC4bCasp1Cd97Dcn
MP:0005597dec. sus. to type I hypers-reactionAlox5Alox5apCysltr1Cysltr2Fcer1aFcer1gOrai1
MP:0003725increased autoantibody levelTcraTcrbAcla1Acla2AireCd276Cia38
MP:0005014increased B cell numberBCL2Bak1BaxBcl11bBcl2l11Bst1Cdkn2c
MP:0005013increased lymphocyte cell numberAxlB4galt1Bak1Casp8Cd19Ewsr1Galnt1
MP:0004803Inc. sus. to autoimmune diabetesIns1-CatTyrB2mCd274Cd28Cd38Cdk2
MP:0005350Inc. sus. to autoimmune disorderTcraTcrbAds1Ads2Ads3Ads4Bak1
MP:0002412increased sus. to bacterial infectionAdamts13Adcyap1r1Adh5Atf2Bbaa21Bcl10C3
DISEASE CLASS - MENTAL DISORDERSGene 1Gene 2Gene 3Gene 4Gene 5Gene 6Gene 10
MP:0001412excessive scratchingAtp2b4BdnfCtslEIF1AXLck-Il31raMapt
MP:0001362abnormal anxiety-related responseAppArafAxtofd1Axtofd3Axtofd4Axtofd5
MP:0001458abnormal object recognition memoryGabbr1GalGrin1PrnpPrnp-AppPsen1Crhr1
MP:0001360abnormal social investigationAvpr1aAvpr1bCadps2En2Gnao1Grin1
MP:0002557abnormal social/conspecific int.ArCadps2Disc1En2Grin1Grin3bMaoa
MP:0002065abnormal fear/anxiety-related beh.APPV717IAppAtp1a2CrebbpEgr1Gnai1Oxt
MP:0001364decreased anxiety-related responseAPPAdcy8Adcyap1Adcyap1r1Avpr1aB3galt2Nos3
MP:0002573behavioral despairAdra2cB3gnt2Cacna1cCrhr2Desp1Desp2Camk2a
MP:0001462abn. avoidance learning behaviorAalAapDcxIduaNtrk2Nr3c1
DISEASE CLASS - NUTR. AND METABOLIC DISEASESGene 1Gene 2Gene 3Gene 4Gene 5Gene 6Gene 10
MP:0005560decreased circulating glucose levelIns1-CatTyrAcadmAdipoqApcs-LepApoe
MP:0004185abnormal adipocyte glucose uptakeAkt2Bglap1CebpaPik3r1PrkciPtprvCd36
MP:0000188abnormal circulating glucose levelAdipor1CideaCiitaCkmCrhDbm3
MP:0001560abnormal circulating insulin levelCacna1cCebpaFoxa1GalGckIGFBP2Irs2
MP:0003383abnormal gluconeogenesisAdipoqAdipor1CebpaCebpbLpin1Mc2rMgat4a
MP:0005291abnormal glucose toleranceAdipoqFstl3Irs4LepPcsk1Pnpla2Smarcb1
MP:0003564abnormal insulin secretionEif2ak3GastGckGjd2Ins2Lep
MP:0002727decreased circulating insulin levelAdcyap1r1Adipor2AhsgAkt2Apcs-LepApoa2
MP:0002711decreased glucagon secretionCacna1eDbhKcnj11Nkx2-2Pcsk2Bglap1
MP:0003059decreased insulin secretionAbcc8Anxa7Bglap1Cacna1eCartptChrm3
MP:0001548hyperlipidemiaAPOC1Acox1ApcApoeCdkn1bCpt1cEif2s1
MP:0005293impaired glucose toleranceAPPswePSEN1dE9Abcc8AcadvlAdcyap1r1AdipoqLepr
MP:0005292improved glucose toleranceAdipor2AhsgBcat2CblCrebbpCxcl14Akt2
MP:0004892increased adiponectin levelActbAdipor2CidebCrebbpPde3bPtenGcgr
MP:0002575Inc. circulating ketone body levelAcacbAdcyap1GckAZIPIns2Ins2-Nos2Scd1
MP:0003645Inc. pancreatic beta cell numberACTBAkt2ArxHnf4aCdkn1bFoxo1Ins2-rtTA
MP:0001759increased urine glucose levelAqp1Aqp7Cdk4Cdk4Cryaa-TAgDnajc3aIns1
MP:0005331insulin resistanceAPOBAdipoqAdipor1Clcn5Adra1bAkt2Bglap1
DISEASE CLASS - EYE DISEASESGene 1Gene 2Gene 3Gene 4Gene 5Gene 6Gene 10
MP:0001299abnormal eye distance/positionDstEdg2Hectd1Hesx1Itgb1Nrtn
MP:0000776abnormal inferior colliculusAtg5En1Ext1Fgf17Fgf8Fgfr1
MP:0003236abnormal lens capsule morphologyAbi2Cdkn2aCryaaCrygaHsf1Hsf4Otx2
MP:0002864abnormal ocular fundus morphologyCrb1Gpr143MitfPitx3Rd9Rp1h
MP:0002638abnormal pupillary reflexCat4Cnga3Cry1EccpFoxe3Iactmgc25
MP:0002699abnormal vitreous bodyAldh1a1Aldh1a3Bmp4Cdkn2aFzd4Gas1
MP:0001314corneal opacityAlmApoAregBmp4Cat4Col4a1Lim2
MP:0001851eye inflammationAdam17Atf2EdaFignITGA2ITGA5Dsc1
MP:0005542corneal vascularizationDstnEdaFignFlt1Foxe3Ifnar1Plg
MP:0003011delayed dark adaptationRbp1Rdh11Rdh12Rdh5Rdh8Rlbp1Pgf
MP:0005172reduced eye pigmentationAp3b1Ap3d1Hps5Hps6MitfNf1Sema4a

Human

Table 4 shows examples of selected genes, one per row, that have been positively associated with specific disease phenotype keywords. Each human gene symbol is followed by the MeSH disease terms with which it has been positively associated and the number of such associations, in declining order. A major feature of Table 4 is that individual genes have been positively associated with multiple, sometimes overlapping, disease phenotypes, ranging from frequently to rarely reported associations. Table 4 is a small representative subset, truncated in the number of genes (rows) and the number of MeSH terms (columns). The complete list of 1,584 human genes with additional information can be found in Table S1a [20]. An interactive version of the same list can be found in Table S1b[21]. Quite often the list of phenotypes associated with a specific gene includes the major disease phenotype followed by specific sub-phenotypes that contribute distinct aspects to the overall clinical disease phenotype. For example, IL13 has been associated with asthma at least 11 times and with the asthma sub-phenotype immediate hypersensitivity 4 times. Similarly, the gene CFH has been associated with macular degeneration at least 19 times and with choroidal neovascularization, an endo-phenotype of macular degeneration, 3 times. Although replication in genetic association studies has been widely debated[22], consistent replication by independent groups, even with modest risk and significance values[23], suggests a fundamental measure of scientific validity. This is true for both candidate gene and GWAS studies. In other cases, individual genes have been associated with independent but related disorders that may share fundamental biological pathways in disease etiology, as with HLA-DQB1, CTLA4, and PTPN22 in autoimmune disorders. 
This gene overlap emphasizes the fundamental, often step-wise biochemical role each gene plays in shared disease etiology [24-27]. That is, HLA-DQB1 acts in antigen presentation, CTLA4 in regulation of the expansion of T cell subsets, and PTPN22 in T cell receptor signaling, all contributing to immunological aberrations and progression to clinical disease, as in rheumatoid arthritis, systemic lupus erythematosus, and type 1 diabetes. In other cases, the same gene has been associated with quite different clinical phenotypes, suggesting that complex biological mechanisms are shared at a deeper level. For example, the gene CFTR, widely recognized as the cause of cystic fibrosis, has been consistently associated with pancreatitis, may be implicated in chronic rhinitis [28], and may play a protective role in gastrointestinal disorders [29].
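
The structure of a GENE-to-Disease summary like Table 4 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each positive association report is available as a (gene, MeSH term) pair; the function names and the toy records are hypothetical and are not taken from the GAD database itself.

```python
from collections import Counter, defaultdict

def gene_to_disease_summary(records):
    """Count positive associations per gene and MeSH term.

    `records` is an iterable of (gene, mesh_term) pairs, one pair per
    positive association report (an assumed input layout).
    Returns {gene: [(mesh_term, count), ...]} in declining count order.
    """
    counts = defaultdict(Counter)
    for gene, mesh_term in records:
        counts[gene][mesh_term] += 1
    return {
        gene: sorted(term_counts.items(), key=lambda tc: (-tc[1], tc[0]))
        for gene, term_counts in counts.items()
    }

def format_row(gene, term_counts):
    """Render one row in a Term(count) style similar to Table 4."""
    return gene + ": " + "; ".join(f"{term}({n})" for term, n in term_counts)

# Toy records only; the real counts are those reported in Table 4 and Table S1a.
records = (
    [("IL13", "Asthma")] * 3
    + [("IL13", "Hypersensitivity, Immediate")]
    + [("CFH", "Macular Degeneration")] * 2
)
summary = gene_to_disease_summary(records)
print(format_row("IL13", summary["IL13"]))
```

Sorting by declining count puts the most frequently replicated phenotype first for each gene, which is the ordering described for Table 4.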

Mouse

Tables 5 and S2 are the mouse equivalents of the human GENE-to-Disease/Phenotype lists (Tables 4 and S1). These were developed from the mouse phenotype table of genes annotated with mouse phenotype ontology codes (ftp://ftp.informatics.jax.org/pub/reports/index.html#pheno), downloaded on 4-4-08. To build Tables 5 and S2, each Mammalian Phenotype code (MP:#) was replaced with its matching phenotypic term. This resulted in mouse GENE-to-Disease/Phenotype tables (Tables 5 and S2) similar in structure to the human GENE-to-Disease/Phenotype tables (Tables 4 and S1). Unlike the human tables, the mouse GENE-to-Disease/Phenotype tables come from individual mouse experimental knockout or other genetic studies. They are not based on population based epidemiological studies. They also lack the quantitative aspect of the human tables, in which publication frequency counts are attached to each record. In addition, although they include a wide variety of physiological, neurological, and behavioral phenotypes, they emphasize the developmental studies and observational morphological phenotypes common in mouse knockout studies. Table 5 is a small representative subset, truncated in the number of genes (rows) and the number of phenotype terms (columns). The complete list of 5,011 mouse genes with annotated phenotypes and additional information can be found in Table S2a[30]. An interactive version of the same list can be found in Table S2b[31].
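
The code-to-term substitution described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the input dictionaries are an assumed layout (the column structure of the MGI report is not reproduced here), and `annotate_mouse_genes` is a hypothetical name. The three MP codes and terms are taken from Table 8.

```python
def annotate_mouse_genes(gene_to_codes, code_to_term):
    """Replace each Mammalian Phenotype code with its phenotype term.

    Unknown codes are kept as-is, and repeated terms are listed once,
    in first-seen order.
    """
    table = {}
    for gene, codes in gene_to_codes.items():
        terms = []
        for code in codes:
            term = code_to_term.get(code, code)
            if term not in terms:
                terms.append(term)
        table[gene] = terms
    return table

# MP codes and terms as they appear in Table 8.
code_to_term = {
    "MP:0001988": "cocaine preference",
    "MP:0004048": "resistance to addictive substance",
    "MP:0002864": "abnormal ocular fundus morphology",
}
# Assumed gene-to-code annotations, following the gene lists in Table 8.
gene_to_codes = {
    "Slc6a3": ["MP:0001988", "MP:0004048"],
    "Crb1": ["MP:0002864"],
}
mouse_table = annotate_mouse_genes(gene_to_codes, code_to_term)
print(mouse_table["Slc6a3"])
```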

Direct comparison of human and mouse genes disease/phenotypes

We can now compare these tables directly, thereby allowing gene-by-gene comparison of human disease phenotypes and mouse genetic phenotypes. Tables 6 and S3 compare the genes that overlap between the human and mouse gene lists (Table S1 and Table S2), showing mouse gene symbols and their human orthologs. Table 6 is a small subset of selected gene-phenotype cross species comparisons. Even though in some cases the human studies have not been replicated, there is often a striking concordance between human disease phenotypes and mouse genetically determined phenotypes. For example, the human gene inhibin alpha (INHA) has been associated with premature ovarian failure[32], and the mouse gene shows phenotypes of abnormal ovarian follicle morphology, female infertility, and ovarian hemorrhage[33], among other phenotypes relevant to human disease. Similarly, in humans the engrailed homeobox 2 gene (EN2) has been associated with autistic disorder[34], while mutations in mouse En2 are involved in abnormal social investigation, spatial learning, and social/conspecific interaction, among others[35]. Importantly, the few mouse studies highlighted above, and many found in the main Table S3, were published after the corresponding human genetic population based epidemiological studies. Given concerns about false positives and publication bias in human genetic association studies, direct comparisons to related mouse phenotypes may provide supporting evidence that a given gene is relevant to a specific human disease phenotype. Table S3[36] is a full listing of the 1104 genes shared between the human disease and mouse phenotype summaries.
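
A gene-by-gene comparison of this kind amounts to joining the two summaries through an ortholog map. The sketch below assumes a simple mouse-symbol-to-human-symbol dictionary as the ortholog source; that dictionary, the function name, and the toy inputs are illustrative assumptions, with the Inha/INHA entries following the Table 6 row shown earlier.

```python
def shared_gene_table(human, mouse, orthologs):
    """Join human and mouse summaries on orthologous genes.

    human:     {human_gene: [disease terms]}
    mouse:     {mouse_gene: [phenotype terms]}
    orthologs: {mouse_gene: human_gene} (assumed mapping source)
    Returns rows of (mouse_gene, human_gene, diseases, phenotypes)
    for genes present in both summaries.
    """
    rows = []
    for mouse_gene in sorted(mouse):
        human_gene = orthologs.get(mouse_gene)
        if human_gene in human:
            rows.append((mouse_gene, human_gene, human[human_gene], mouse[mouse_gene]))
    return rows

# Toy inputs; only the Inha/INHA entries follow Table 6.
human = {
    "INHA": ["Ovarian Failure, Premature(2)", "Amenorrhea(1)"],
    "CFH": ["Macular Degeneration(19)"],
}
mouse = {
    "Inha": ["abnormal ovarian follicle morphology", "female infertility"],
    "Tbx6": ["abnormal digestive system dev."],
}
orthologs = {"Inha": "INHA", "Tbx6": "TBX6"}

rows = shared_gene_table(human, mouse, orthologs)
for mouse_gene, human_gene, diseases, phenotypes in rows:
    print(mouse_gene, human_gene, diseases, phenotypes)
```

In this toy case only Inha/INHA appears in both summaries, so it is the only row returned.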

Summaries of phenotypes and genes in human and mouse

The second main type of summary table is the DISEASE/PHENOTYPE-to-Gene list. Disease/Phenotype gene summaries are essentially transposed versions of the GENE-to-Disease/Phenotype summaries (Tables S1 & S2) that allow different types of comparisons. These are non-redundant lists of phenotype keywords, MeSH disease terms in the case of human and Mammalian Phenotype (MP) terms in the case of mouse, each followed by the genes associated with or annotated to that disease phenotype keyword. Table 7 shows examples of selected human disease phenotypes, one per row, positively associated with specific human genes for 8 major MeSH disease classes: cardiovascular, digestive system diseases, diseases of environmental origin, immune system diseases, mental disorders, nervous system diseases, nutritional and metabolic diseases, and eye diseases. Each MeSH phenotype term is followed by the associated genes, each with the number of times it has been positively associated with that disease term, in decreasing order. Table 7 is a small representative set, truncated in the number of disease phenotypes (rows) and the number of genes (columns). The complete list of 1,318 MeSH disease phenotype terms with additional information can be found in Table S4a[37]. An interactive version of the complete list can be found in Table S4b[38]. Tables 8 and S5 constitute the mouse DISEASE/PHENOTYPE-to-Gene summaries. Table 8 consists of selected mouse phenotypes that fall into general classes similar to those of the human Table 7, followed by 6 representative genes that have been assigned to the appropriate phenotypic term on the basis of a specific mouse genetic model. Unlike the human Disease/Phenotype-to-gene Tables 7 and S4, the mouse Tables 8 and S5 do not have quantitative information. Table 8 is also a small representative set, truncated in the number of disease phenotypes (rows) and the number of genes (columns). 
The complete list of 5,142 mouse phenotype terms with their corresponding Mammalian PhenoCode designations can be found in Table S5a[39]. An interactive version of the complete list can be found in Table S5b[40].
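
Because the DISEASE/PHENOTYPE-to-Gene lists are essentially transposed GENE-to-Disease/Phenotype lists, the transposition can be sketched in a few lines of Python. The function name and input layout are assumptions for illustration; the counts in the toy input are those shown for these genes in Table 7.

```python
from collections import defaultdict

def transpose_summary(gene_to_terms):
    """Turn {gene: [(term, count), ...]} into {term: [(gene, count), ...]}.

    Genes under each term are ordered by decreasing count, with ties
    broken alphabetically (the tie rule is an assumption).
    """
    term_to_genes = defaultdict(list)
    for gene, term_counts in gene_to_terms.items():
        for term, count in term_counts:
            term_to_genes[term].append((gene, count))
    return {
        term: sorted(genes, key=lambda gc: (-gc[1], gc[0]))
        for term, genes in term_to_genes.items()
    }

# Counts as listed in Table 7 for three genes and two disease terms.
gene_to_terms = {
    "ACE": [("Hypertension", 47), ("Myocardial Infarction", 17)],
    "NOS3": [("Hypertension", 20), ("Myocardial Infarction", 18)],
    "AGT": [("Hypertension", 24)],
}
disease_to_genes = transpose_summary(gene_to_terms)
print(disease_to_genes["Hypertension"])
```

For the mouse tables, which carry no counts, the same idea applies with a constant count or with plain gene lists.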

Using disease and gene lists

The purpose of this project is not simply to generate lists and information. It is to provide a distillation of disease and phenotype information that can be used in dissecting the complexities of human disease and mouse biology. Now that we have generated GENE-to-disease/phenotype summaries and DISEASE/PHENOTYPE-to-gene summaries for both mouse and human, they can be used for systematic analysis, comparison, and integration of orthologous data, with the goal of providing higher order interpretations of human disease and mouse genetically determined phenotypes.

Human disease and mouse phenotype based gene sets

Gene sets have been defined simply as groups of genes that share common biological function, chromosomal location, or regulation[41]. Gene sets are used in high-throughput systematic analysis of microarray data using a priori knowledge. Unlike previously defined gene sets based on biological pathways or differentially expressed genes[41], GAD disease gene sets are unique in that they are composed of genes that have been shown both to be polymorphic and to be positively associated with a specific disease phenotype in a human population based genetic association study. Similarly, Table S5a[39], the mouse DISEASE/PHENOTYPE-to-Gene list, is used as a source of gene sets for mouse phenotypes (MP gene sets), each composed of genes supported by a mouse genetic model. These gene set files are currently the largest gene set files publicly available and the only gene set files in which each gene is based on direct human or mouse genetic studies.
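
As a rough illustration of how a DISEASE/PHENOTYPE-to-Gene list can be turned into gene set records, the sketch below writes one line per term, with the term followed by its genes. The tab-delimited layout, the function names, and the `min_genes` parameter are assumptions made for this example; the actual file format of the supplementary gene set tables is not specified here.

```python
def to_gene_set_lines(term_to_genes, min_genes=1):
    """Serialize {term: [genes]} as 'term<TAB>gene1<TAB>gene2...' lines.

    Duplicate genes are dropped, and sets smaller than `min_genes`
    are skipped.
    """
    lines = []
    for term in sorted(term_to_genes):
        genes = sorted(set(term_to_genes[term]))
        if len(genes) >= min_genes:
            lines.append("\t".join([term] + genes))
    return lines

def parse_gene_set_line(line):
    """Parse one line back into (term, set_of_genes)."""
    term, *genes = line.rstrip("\n").split("\t")
    return term, set(genes)

# Toy input: the Hypertension genes are the top three from Table 7;
# "Example Term" and its genes are hypothetical placeholders.
term_to_genes = {
    "Hypertension": ["ACE", "AGT", "NOS3", "ACE"],
    "Example Term": ["GENE1", "GENE2"],
}
lines = to_gene_set_lines(term_to_genes, min_genes=3)
print(lines)
```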

Comparison of individual GAD disease gene sets

One aspect of common complex disease is that diseases and disease phenotypes quite often present along a broad spectrum of symptoms and share clinical characteristics, endo-phenotypes, or quantitative traits with closely related disorders [25]. This is evident in gene sharing, as mentioned above, and equally in the overlap of biological pathways between related disorders. Using GAD disease gene sets, Venn diagram comparisons among related disorders show modest gene sharing. However, when gene sets are placed into biological pathways and compared by Venn analysis, there is a marked increase in the overlap in pathways between related disorders. This was not found in gene sets from unrelated disorders. For example, major autoimmune disorders quite often share endophenotypes of lymphoproliferation, autoantibody production, and alterations in apoptosis, as well as other immune cellular and biochemical aberrations. As shown in Figure 1a, genes that have been positively associated with type 1 diabetes, rheumatoid arthritis, and Crohn's disease show a modest overlap. However, when the individual gene sets are fitted into biological pathways and then compared for overlap of pathway membership, there is a striking increase in the overlap at the pathway level. This is also true in a comparison of genes and pathways for type 2 diabetes, insulin resistance, and obesity (Figure 1b). This pattern of major pathway overlap does not seem to occur between unrelated disorders, such as insulin resistance, rheumatoid arthritis, and bipolar disorder (Figure 1c). This disease related sharing at the pathway level suggests common regulatory mechanisms between these disorders and that the original positive associations are not necessarily due to random chance alone.
Figure 1

Venn Diagram analysis of individual GAD disease gene sets (circles) versus pathways (rectangles) produced from the corresponding gene set. All Venn Diagrams were produced with Venny http://bioinfogp.cnb.csic.es/tools/venny/index.html.
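
The gene-level versus pathway-level comparison can be sketched with plain set operations. This is only a toy illustration: the gene sets are small subsets chosen for the example rather than the sets behind Figure 1, the pathway labels P1 to P3 are hypothetical placeholders, and the gene-to-pathway mapping is an assumption rather than output from any pathway database.

```python
def pathways_for(genes, gene_to_pathways):
    """Collect every pathway that contains at least one gene of the set."""
    found = set()
    for gene in genes:
        found |= gene_to_pathways.get(gene, set())
    return found

def shared_by_all(named_sets):
    """Return the central Venn region: members common to all sets."""
    return set.intersection(*named_sets.values())

# Toy gene sets (subsets for illustration only).
gene_sets = {
    "Type 1 Diabetes": {"PTPN22", "CTLA4"},
    "Rheumatoid Arthritis": {"PTPN22", "TNF"},
    "Crohn Disease": {"IL23R", "NOD2", "TNF"},
}
# Hypothetical pathway membership; P1..P3 are placeholder labels.
gene_to_pathways = {
    "PTPN22": {"P1"},
    "CTLA4": {"P1", "P2"},
    "TNF": {"P2", "P3"},
    "IL23R": {"P1", "P3"},
    "NOD2": {"P3"},
}
pathway_sets = {
    disease: pathways_for(genes, gene_to_pathways)
    for disease, genes in gene_sets.items()
}

print("genes shared by all:", shared_by_all(gene_sets))
print("pathways shared by all:", shared_by_all(pathway_sets))
```

In this toy case no gene is shared by all three sets, while two pathways are, which mirrors the qualitative pattern described for related disorders.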

Group analysis of GAD disease gene sets between major classes of disease/phenotypes

Dendrogram analysis of human disease gene sets

As archival information grows, analysis of complex molecular and genetic datasets using clustering or network approaches has become increasingly useful [13,42-45]. Therefore, in addition to comparisons between individual diseases using human and mouse gene sets, we analyzed large gene groups using dendrogram and clustering approaches based on gene sharing between gene sets. Figure 2 shows a broad based dendrogram comparison based on gene sharing between 480 GAD disease gene sets, each containing at least 3 genes. A striking feature of this analysis is that, at a coarse level, major disease groups cluster together in space, demonstrating shared genes between major clinically important disease groups. Disease domains are represented by groups such as cardiovascular disorders, metabolic disorders, cancer, immune and inflammatory disorders, vision, and chemical dependency. At finer detail within a specific broader group, it becomes clear that individual diseases with overlapping phenotypes are found close in space, such as asthma, allergic rhinitis, and atopic dermatitis. This overlap due to gene sharing recapitulates an overlap in clinical characteristics between these related disorders. Similarly, phenotypes within the metabolic group related to diabetes are closely aligned in space, including insulin resistance, hyperglycemia, hyperinsulinemia, and hyperlipidemia. This close apposition of related disease phenotypes and sub-phenotypes at both a coarse and fine level is a consistent feature of the overall display. The human gene sets used in creating this tree diagram can be found in Table S6[46]. It is important to emphasize that this display and the distance relationships between diseases are calculated through an unbiased gene-sharing algorithm independent of disease phenotype labels and not as a result of an imposed logical hierarchy or an ontological annotation system. 
This grouping of major disease phenotypes based solely on gene sharing provides supporting evidence that the underlying disease based gene sets may have a fundamental relevance to disease and may not be reported in the literature by chance alone.
Figure 2

Human dendrogram comparison of 480 GAD disease gene sets based on gene sharing. The input GAD gene set file for this figure can be found in Table S6[46].
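
One simple way to quantify gene sharing between gene sets is a Jaccard distance, sketched below. The gene-sharing algorithm used for the dendrogram is not specified in this section, so the Jaccard measure, the function names, and the `min_genes` filter are assumptions chosen for illustration. The first three toy gene sets use the top-ranked genes from Table 7; the last is deliberately truncated so that the size filter has something to exclude.

```python
from itertools import combinations

def jaccard_distance(a, b):
    """1 - |A and B| / |A or B|; 0.0 means identical sets, 1.0 means no shared genes."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def pairwise_distances(gene_sets, min_genes=3):
    """Distance for every pair of gene sets that pass the size filter."""
    kept = {name: set(genes) for name, genes in gene_sets.items()
            if len(set(genes)) >= min_genes}
    return {
        (x, y): jaccard_distance(kept[x], kept[y])
        for x, y in combinations(sorted(kept), 2)
    }

gene_sets = {
    "Depressive Disorder": {"SLC6A4", "HTR2A", "TPH1"},
    "Depressive Disorder, Major": {"SLC6A4", "TPH1", "HTR2A", "TPH2"},
    "Hypertension": {"ACE", "AGT", "NOS3"},
    "Ocular Hypertension": {"OPTN", "CYP1B1"},  # truncated toy set, filtered out
}
distances = pairwise_distances(gene_sets, min_genes=3)
for pair, d in distances.items():
    print(pair, round(d, 2))
```

A distance table of this kind is the usual input to a tree-building or clustering step; related phenotypes that share most of their genes end up with small distances.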

Dendrogram analysis of mouse phenotypic gene sets

Figure 3 is a dendrogram similar to the human tree, built from 1056 mouse phenotypic gene sets each containing at least 10 genes. It was produced using the same gene sharing algorithm as for the human gene sets in Figure 2. As with the human dendrogram, the mouse tree displays informative groupings at both a coarse and fine level. This tree falls into major groupings nominally assigned as brain development and brain function, embryonic development, cardiovascular, reproduction, inflammation, renal function, bone development, metabolism, and skin/hair development. The identification of major groupings emphasizing developmental processes reflects the emphasis of gene knockouts and developmental models on observable morphological traits, and less so on end stage clinical diseases as in the human dendrogram. As in the human dendrogram (Figure 2), discrete major functional groupings in the mouse dendrogram suggest that individual experimental observations are not random. Fundamental complex processes such as metabolism, cardiovascular phenomena, and developmental processes are integrated by extensive sharing of related pleiotropic genes. Moreover, like the human tree, fine structure in the mouse tree shows that related mouse phenotypes are closely positioned in space. For example, in the metabolism major grouping, the individual phenotypes of body mass, adipose phenotypes, and weight gain are closely positioned. Similarly, in the brain function group, the behavioral phenotypes of anxiety, exploration, and responses to novel objects are found next to one another. This pattern is a fundamental feature of this tree. Like the human tree, the mouse dendrogram shown here is based solely on a gene sharing algorithm using genes assigned to individual phenotypes. It is not based on an imposed predetermined hierarchy or ontology. 
Importantly, unlike the human tree, the information contained in the mouse tree is derived from individual independent mouse genetic studies and phenotypic observations and not from large case controlled population based epidemiological studies. Controversial issues such as publication bias or study size which confound human genetic association studies are not as relevant here in the context of studies of experimentally determined individual mouse gene knockouts and related studies. The mouse gene sets used in creating this tree diagram can be found in Table S7[47].
Figure 3

Mouse dendrogram comparison of 1056 mouse phenotype (MP) gene sets based on gene sharing. The input MP gene set file for this figure can be found in Table S7[47].

Hierarchical clustering of human and mouse gene sets

Hierarchical clustering has become a common tool in the analysis of large molecular data sets[48], allowing identification of similar patterns in a scalable fashion from the whole experiment down to a level of fine structure. To provide further evidence of the disease relevance and biological content of both the human and mouse gene sets, hierarchical clustering was performed on both. Four hundred and eighty human gene sets were clustered, producing 46 major disease clusters. In the mouse, clustering was performed on 2067 mouse phenotype gene sets, each containing at least 3 genes. This resulted in 165 major subgroups of functional phenotypic specificity. Hierarchical clustering is shown for human [Additional file 1 and Additional file 2] and for mouse [Additional file 3 and Additional file 4]. Like the human and mouse dendrograms, this hierarchical clustering showed functional disease grouping at both a coarse group level and at a fine level within major phenotypic groupings. The fact that these clusters, in both human and mouse, fall into closely defined broad functional groups as well as closely related clinical, physiological, and developmental phenotypes demonstrates a general pattern of relevance to disease in their original underlying genetic associations. As in the dendrogram displays, this suggests that the genes nominally positively associated with these disorders, drawn from the medical literature, are not pervasively randomly assigned or due to a widespread pattern of random false positive associations.
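
The general idea of hierarchical clustering on gene sharing can be sketched in pure Python. The clustering method, distance measure, and cut-off used for the additional files are not described in this section, so everything below is an assumption for illustration: Jaccard distance between gene sets, average linkage between clusters, and a hypothetical `max_distance` threshold at which merging stops. The toy gene sets use the top three genes for each disease from Table 7.

```python
def jaccard_distance(a, b):
    """1 - |A and B| / |A or B| for two sets of gene symbols."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def average_linkage_clusters(gene_sets, max_distance):
    """Agglomerative clustering of gene sets with average linkage.

    Repeatedly merges the two closest clusters until the closest pair
    is farther apart than `max_distance`. Returns sorted lists of names.
    """
    clusters = [[name] for name in sorted(gene_sets)]

    def cluster_distance(c1, c2):
        # Mean pairwise Jaccard distance between members of two clusters.
        total = sum(jaccard_distance(gene_sets[x], gene_sets[y])
                    for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)

gene_sets = {
    "Hypertension": {"ACE", "AGT", "NOS3"},
    "Myocardial Infarction": {"NOS3", "ACE", "SERPINE1"},
    "Depressive Disorder": {"SLC6A4", "HTR2A", "TPH1"},
    "Bipolar Disorder": {"SLC6A4", "BDNF", "MAOA"},
}
clusters = average_linkage_clusters(gene_sets, max_distance=0.9)
print(clusters)
```

With these toy sets the two cardiovascular terms merge first, the two mood-related terms merge next, and the two resulting groups stay separate because they share no genes.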

Discussion and Conclusion

This report describes a summary of the positive genetic associations to disease phenotypes found in the Genetic Association Database as well as a summary of mouse genetically determined phenotypes from the MGI phenotypes database. The gene and disease lists described here were derived from a broad literature mining approach. We have shown disease relevance in three distinct ways: a) by comparing individual gene lists and pathways; b) by comparing between species; and c) by broad based comparative analysis utilizing complex systems approaches. Moreover, we identify disease based gene sets for 1,317 human disease phenotypes as well as 5,142 mouse experimentally determined phenotypes. These resources are the largest gene set files currently publicly available and the only gene set files derived from population based human epidemiological genetic studies and mouse genetic models of disease. Each individual GAD disease gene set (i.e. a single disease term followed by a string of genes) or mouse phenotype gene set becomes a candidate for a number of uses and applications including: a) contributing to complex (additive, multiplicative, gene-environment) statistical models for any given disease phenotype [49-53]; b) use in comparative analysis between disease phenotypes; c) use in interrogating other related data types, such as microarray (see below), proteomic, or SNP data [54-56]; and d) integration into annotation engines[57], genome browsers[58], or other analytical software to add disease information in comparative genomic analysis. In a sense, each individual human or mouse disease/phenotype gene set becomes a unique hypothesis, testable in a variety of ways. Increasingly, combinations of genes, as opposed to single candidate genes, may have important predictive value as combinatorial biomarkers in predicting disease risk [59,60].
In addition, in an ongoing parallel set of experiments using a Gene Set Analysis (GSA) approach with the web tool Disease/Phenotype web-PAGE for the analysis of orthologous microarray data (De S, Zhang Y, Garner JR, Wang SA, Becker KG: Disease and phenotype gene set analysis of disease based gene expression, unpublished), both the human and mouse disease/phenotype gene sets defined above demonstrate striking disease specificity in PAGE[61] gene set analysis of previously published microarray based gene expression studies from numerous independent laboratories, in both a species specific and cross species manner. This was true for gene expression studies of type 2 diabetes, obesity, myocardial infarction, and sepsis, among others, providing further evidence of the disease and clinical relevance of both the human and mouse gene sets.
This approach is limited in a number of ways. In particular, the GAD database compares the results of human population based epidemiological studies performed using different sample sizes, populations, and statistical models, and at different times over approximately the last 16 years. In addition, the GAD database draws on association studies of varying quality with different degrees of detail provided. Although every human genetic association discussed here has been individually reported as positively associated with a disease or phenotype in a peer reviewed journal, we make no assertion that any individual study is correct, and we recognize the controversy in the genetics community regarding the statistical and biological significance of genetic association studies. Moreover, although the GAD database contains information on polymorphism and variation, and each GAD record is fundamentally based on polymorphism, this report does not consider variation or polymorphism in the summaries shown.
Likewise, mouse genetic models in many cases are weighted toward gene knockouts, which may not necessarily be directly representative of multifactorial human common complex disease. However, even with these limitations, we believe valuable insights can be gained from broad based literature assessments of the genetic contribution in human common complex disease and in mouse phenotypic biology. More importantly, this suggests greater opportunities for systematic mining and analysis of published data and for cross comparison of archival molecular databases in both human and animal models of disease with regard to genetic variation, population comparisons, and integration with many different types of orthologous information.

Abbreviations

GAD: Genetic Association Database; MGI: Mouse Genome Informatics; MeSH: Medical Subject Headings; GWAS: Genome Wide Association Study; CDC: Centers for Disease Control and Prevention; HuGENet: Human Genome Epidemiology Network.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

YZ performed statistical analysis, gene set assembly, and contributed to the manuscript. SD performed dendrogram and clustering analysis and contributed to the manuscript. JG, KS, and SAW did database curation and analysis. KGB organized the project, did database curation, performed comparisons, and wrote the manuscript. All authors read and approved the manuscript.

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1755-8794/3/1/prepub

Additional file 1

Hierarchical clustering of 480 Human GAD disease gene sets. This file contains a display of hierarchical clustering of 480 Human GAD disease gene sets, each gene set containing at least 3 genes.

Additional file 2

Individual human disease functional clusters. This file contains selected subsets of Additional File 1 including: a. tumorigenesis, b. autoimmune, c. cardiovascular, d. metabolism, and e. behavior.

Additional file 3

Hierarchical clustering of 2067 Mouse phenotypic gene sets. This file contains a display of hierarchical clustering of 2067 Mouse phenotypic gene sets, each gene set containing at least 10 genes.

Additional file 4

Individual mouse phenotypic functional clusters. This file contains selected subsets of Additional File 3 including: a. immune function, b. metabolism, c. neurological function/behavior, d. DNA replication/tumorigenesis, e. development, and f. cardiovascular.
  47 in total

1.  The human disease network.

Authors:  Kwang-Il Goh; Michael E Cusick; David Valle; Barton Childs; Marc Vidal; Albert-László Barabási
Journal:  Proc Natl Acad Sci U S A       Date:  2007-05-14       Impact factor: 11.205

Review 2.  Construction of phylogenetic trees.

Authors:  W M Fitch; E Margoliash
Journal:  Science       Date:  1967-01-20       Impact factor: 47.728

3.  Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles.

Authors:  Aravind Subramanian; Pablo Tamayo; Vamsi K Mootha; Sayan Mukherjee; Benjamin L Ebert; Michael A Gillette; Amanda Paulovich; Scott L Pomeroy; Todd R Golub; Eric S Lander; Jill P Mesirov
Journal:  Proc Natl Acad Sci U S A       Date:  2005-09-30       Impact factor: 11.205

Review 4.  The PTPN22 C1858T functional polymorphism and autoimmune diseases--a meta-analysis.

Authors:  Y H Lee; Y H Rho; S J Choi; J D Ji; G G Song; S K Nath; J B Harley
Journal:  Rheumatology (Oxford)       Date:  2006-06-07       Impact factor: 7.580

5.  Replication of putative candidate-gene associations with rheumatoid arthritis in >4,000 samples from North America and Sweden: association of susceptibility with PTPN22, CTLA4, and PADI4.

Authors:  Robert M Plenge; Leonid Padyukov; Elaine F Remmers; Shaun Purcell; Annette T Lee; Elizabeth W Karlson; Frederick Wolfe; Daniel L Kastner; Lars Alfredsson; David Altshuler; Peter K Gregersen; Lars Klareskog; John D Rioux
Journal:  Am J Hum Genet       Date:  2005-11-01       Impact factor: 11.025

6.  Multifactor dimensionality reduction-phenomics: a novel method to capture genetic heterogeneity with use of phenotypic variables.

Authors:  H Mei; M L Cuccaro; E R Martin
Journal:  Am J Hum Genet       Date:  2007-10-23       Impact factor: 11.025

7.  Cluster analysis and display of genome-wide expression patterns.

Authors:  M B Eisen; P T Spellman; P O Brown; D Botstein
Journal:  Proc Natl Acad Sci U S A       Date:  1998-12-08       Impact factor: 11.205

8.  PAGE: parametric analysis of gene set enrichment.

Authors:  Seon-Young Kim; David J Volsky
Journal:  BMC Bioinformatics       Date:  2005-06-08       Impact factor: 3.169

9.  GLOSSI: a method to assess the association of genetic loci-sets with complex diseases.

Authors:  High-Seng Chai; Hugues Sicotte; Kent R Bailey; Stephen T Turner; Yan W Asmann; Jean-Pierre A Kocher
Journal:  BMC Bioinformatics       Date:  2009-04-03       Impact factor: 3.169

10.  Network-based analysis of affected biological processes in type 2 diabetes models.

Authors:  Manway Liu; Arthur Liberzon; Sek Won Kong; Weil R Lai; Peter J Park; Isaac S Kohane; Simon Kasif
Journal:  PLoS Genet       Date:  2007-06       Impact factor: 5.917
