Mulin Jun Li, Panwen Wang, Xiaorong Liu, Ee Lyn Lim, Zhangyong Wang, Meredith Yeager, Maria P Wong, Pak Chung Sham, Stephen J Chanock, Junwen Wang.
Abstract
Recent advances in genome-wide association studies (GWAS) have made it possible to identify thousands of genetic variants (GVs) associated with human diseases. As next-generation sequencing technologies become less expensive, more GVs will be discovered in the near future. Existing databases, such as the NHGRI GWAS Catalog, collect only GVs that reach genome-wide significance. However, many true disease susceptibility loci have relatively moderate P values and are not included in these databases. We have developed GWASdb, which contains 20 times more data than the GWAS Catalog and includes less significant GVs (P < 1.0 × 10⁻³) manually curated from the literature. In addition, GWASdb provides comprehensive functional annotations for each GV, including genomic mapping information, regulatory effects (transcription factor binding sites, microRNA target sites and splicing sites), amino acid substitutions, evolution, gene expression and disease associations. Furthermore, GWASdb classifies these GVs by disease using Disease-Ontology Lite and the Human Phenotype Ontology, and it can conduct pathway enrichment and protein-protein interaction (PPI) network association analysis for these diseases. GWASdb provides an intuitive, multifunctional database for biologists and clinicians to explore GVs and their functional inferences. It is freely available at http://jjwanglab.org/gwasdb and will be updated frequently.
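The inclusion criterion described in the abstract can be illustrated with a minimal Python sketch. It is not GWASdb's curation code: the `Association` record, the `classify` helper and the example rsIDs are hypothetical, and the 5 × 10⁻⁸ genome-wide cutoff is a commonly used convention assumed here, not a value stated in the abstract. Only the P < 1.0 × 10⁻³ cutoff comes from the source.

```python
from dataclasses import dataclass

# Cutoff quoted in the abstract for GVs collected by GWASdb.
GWASDB_CUTOFF = 1.0e-3
# ASSUMPTION: conventional genome-wide significance level, not from the source.
GENOME_WIDE_CUTOFF = 5.0e-8


@dataclass(frozen=True)
class Association:
    """Hypothetical record for one curated GV-trait association."""
    rsid: str
    trait: str
    p_value: float


def classify(assoc: Association) -> str:
    """Label an association by the significance tier it falls into."""
    if assoc.p_value < GENOME_WIDE_CUTOFF:
        return "genome-wide"   # would pass a strict genome-wide filter
    if assoc.p_value < GWASDB_CUTOFF:
        return "moderate"      # kept only under the looser 1e-3 cutoff
    return "excluded"          # fails both cutoffs


# Made-up example records (placeholders, not real curated data).
records = [
    Association("rs_example_1", "trait A", 3.0e-12),
    Association("rs_example_2", "trait A", 4.0e-5),
    Association("rs_example_3", "trait B", 2.0e-2),
]

kept = [r for r in records if classify(r) != "excluded"]
```

Under these assumptions, `kept` retains the first two records: one at genome-wide significance and one with a moderate P value that a stricter catalogue would drop.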
Year: 2011 PMID: 22139925 PMCID: PMC3245026 DOI: 10.1093/nar/gkr1182
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Overview of the GWASdb database design. GWASdb consists of three main functions: precise scientific curation and resource integration for GWAS, comprehensive annotation of genetic variants, and disease-oriented analysis in terms of DOLite and HPO.
Description of annotations organized in GWASdb
| Level | Item | Description | Reference |
|---|---|---|---|
| SNP summary | General information | dbSNP 132 annotation for each GV | dbSNP-Q |
| | Genome-wide association | Manual curation and collection | GWASdb |
| | 1000 Genomes SNP | SNPs and indels from the 1049 subjects of the 1000 Genomes Project (May 2011 release) | 1000 Genomes Project |
| | LD plot | LD data from HapMap Phase II+III | HapMap |
| Genomic mapping | Reference gene | Gene annotation from NCBI RefSeq | NCBI RefSeq |
| | Ensembl gene | Gene annotation from Ensembl | Ensembl |
| | Known gene | Gene annotation from UCSC | UCSC |
| | Small RNA | snoRNA and miRNA annotations from UCSC | UCSC |
| | MicroRNA target | miRNA target site predictions generated by TargetScan | UCSC |
| | Transcription factor binding site | Transcription factor binding sites conserved in the human/mouse/rat alignment, based on the TRANSFAC Matrix Database (v7.0) | UCSC |
| | Enhancer | Experimentally verified human enhancers | VISTA Enhancer DB |
| | Insulator | CTCF binding site database for characterization of human genomic insulators | CTCFBSDB |
| Regulatory effects | Transcription factor binding site affinity | GV affinity of TFBS prediction based on fold energy change with PWM scanning | GWASdb, TRANSFAC |
| | MicroRNA target site affinity (PITA) | GV affinity of miRNA target prediction based on fold and hybrid energy change for PITA top targets | GWASdb, PITA |
| | MicroRNA target site affinity (miRanda) | GV affinity of miRNA target prediction based on hybrid energy change for miRanda targets | GWASdb, miRanda |
| | Splicing site affinity | GV affinity of splicing site prediction | ssSNPTarget |
| Amino acid substitution | Non-synonymous SNP functional prediction | Deleteriousness prediction for non-synonymous GVs | dbNSFP |
| Evolution | SNP positive selection | Estimation of FST and heterozygosity of the GV for positive selection | SNP@Evolution |
| | Gene positive selection | Estimation of FST and heterozygosity of the gene for positive selection | SNP@Evolution |
| | Conserved functional RNA | Conserved functional RNA, from RNA secondary structure predictions made with the EvoFold program | UCSC |
| | Conserved elements | Conserved elements produced by the PhastCons program based on a whole-genome alignment of vertebrates | UCSC |
| Gene expression | Three-way SNP expression association | Gene co-expression relationships with GV effect | SNPxGE2 |
| Disease association | OMIM | Online Mendelian Inheritance in Man | OMIM |
| | DGV | Curated catalogue of structural variation in the human genome | Database of Genomic Variants |
| | GAD | Archive of human genetic association studies of complex diseases and disorders | Genetic Association Database |
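The Evolution rows of the table refer to FST and heterozygosity. As a rough illustration of those two quantities, the sketch below computes expected heterozygosity and Wright's FST for one biallelic variant from per-population allele frequencies. It uses the textbook definition with equally weighted populations, which is an assumption here. The source does not say which estimator SNP@Evolution uses, and the function names and frequencies are hypothetical.

```python
from typing import Sequence


def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) for a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def wright_fst(freqs: Sequence[float]) -> float:
    """Wright's FST = (H_T - H_S) / H_T across equally weighted populations.

    ASSUMPTION: textbook definition; not necessarily the estimator used by
    SNP@Evolution or GWASdb.
    """
    if not freqs:
        raise ValueError("at least one population frequency is required")
    # H_S: mean within-population expected heterozygosity.
    h_s = sum(expected_heterozygosity(p) for p in freqs) / len(freqs)
    # H_T: expected heterozygosity of the pooled (mean) allele frequency.
    p_bar = sum(freqs) / len(freqs)
    h_t = expected_heterozygosity(p_bar)
    if h_t == 0.0:
        return 0.0  # monomorphic in the pooled sample; FST is undefined, report 0
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies of one GV in two populations.
fst_value = wright_fst([0.2, 0.8])
```

A high FST means allele frequencies differ strongly between populations, which is one signal a positive-selection scan can use. Identical frequencies give an FST of zero.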
Figure 2. Classification of GVs by genic region and by trait/disease in GWASdb. (a) Proportion of GV/gene transcripts with different functional properties in the genic regions (together representing 43.5% of all GVs in GWASdb). (b) The top 15 traits/diseases with the most significant GVs in the database, based on the DOLite catalog.
Figure 3. Illustration of the circular GWAS plot. (a) Overview of the circular GWAS plot; dots show the top two GVs for each study. (b) Description of each component of the plot.