Jiwen Xin, Adam Mark, Cyrus Afrasiabi, Ginger Tsueng, Moritz Juchler, Nikhil Gopal, Gregory S Stupp, Timothy E Putman, Benjamin J Ainscough, Obi L Griffith, Ali Torkamani, Patricia L Whetzel, Christopher J Mungall, Sean D Mooney, Andrew I Su, Chunlei Wu.
Abstract
Efficient tools for data management and integration are essential for many aspects of high-throughput biology. In particular, annotations of genes and human genetic variants are commonly used but highly fragmented across many resources. Here, we describe MyGene.info and MyVariant.info, high-performance web services for querying gene and variant annotation information. These web services are currently accessed more than three million times per month. They also demonstrate a generalizable cloud-based model for organizing and querying biological annotation information. MyGene.info and MyVariant.info are provided as high-performance web services, accessible at http://mygene.info and http://myvariant.info. Both are offered free of charge to the research community.
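As a rough illustration of what querying these services can look like, the sketch below builds request URLs for a gene search and a single-variant lookup using only the Python standard library. The API version prefixes (`/v3`, `/v1`), the `query` and `variant` endpoint paths, the field names, and the example variant identifier are assumptions not stated in the abstract; they should be checked against the services' current documentation before use.

```python
import json
from urllib.parse import urlencode, quote
from urllib.request import urlopen

# Assumed base URLs and API version prefixes; not stated in the abstract.
MYGENE_BASE = "https://mygene.info/v3"
MYVARIANT_BASE = "https://myvariant.info/v1"


def gene_query_url(term, species="human", fields=("symbol", "name", "entrezgene")):
    """Build a gene search URL; parameter and field names are assumptions."""
    params = {"q": term, "species": species, "fields": ",".join(fields)}
    return f"{MYGENE_BASE}/query?{urlencode(params)}"


def variant_url(hgvs_id, fields=None):
    """Build a URL for one variant, keyed by an HGVS-style identifier."""
    # Keep ':' readable in the path; other reserved characters are percent-encoded.
    url = f"{MYVARIANT_BASE}/variant/{quote(hgvs_id, safe=':')}"
    if fields:
        url += "?" + urlencode({"fields": ",".join(fields)})
    return url


def fetch_json(url, timeout=30):
    """Fetch a URL and decode its JSON body (needs network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only the URLs are printed here; call fetch_json(url) to send a request.
    print(gene_query_url("cdk2"))
    print(variant_url("chr7:g.140453134T>C", fields=["clinvar"]))
```

Separating URL construction from the network call keeps the request logic easy to inspect and test offline; `fetch_json` can then be applied to either URL when a live response is wanted.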
Keywords: API; Annotation; Cloud; Database; Gene; Repository; Variant
Year: 2016 PMID: 27154141 PMCID: PMC4858870 DOI: 10.1186/s13059-016-0953-9
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Schematic design of the MyGene.info architecture. Colors depict different update frequencies. Small gray circles indicate multiple nodes for scalability. MyVariant.info shares the same architecture as MyGene.info except for different sets of annotation data sources and update frequencies. Additional file 2: Figure S1 shows the exact architecture of MyVariant.info
Fig. 2 The demo workflow for candidate gene prioritization using MyVariant.info and MyGene.info web services. We reimplemented five filtering steps in this workflow to prioritize candidate genes from a Miller syndrome study [17]. Selected R code is displayed for each filter step, using myvariant and mygene Bioconductor packages. The number of candidate genes left at each filtering step is displayed at the left side. The full code is available at https://github.com/sulab/myvariant.info/blob/master/docs/ipynb/myvariant_R_miller.ipynb, and also in Additional file 2: Supplementary Note 1
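The general shape of such a stepwise prioritization can be sketched as a chain of filters that reports how many distinct genes survive each step. This is a Python illustration, not the R code shown in the figure: the record fields, the toy data, and the three filter criteria are hypothetical and do not reproduce the five steps of the published workflow.

```python
def run_filters(variants, steps):
    """Apply named predicates in order; record distinct genes left after each."""
    remaining = list(variants)
    report = []
    for name, keep in steps:
        remaining = [v for v in remaining if keep(v)]
        genes = {v["gene"] for v in remaining}
        report.append((name, len(genes)))
    return remaining, report


# Toy records with hypothetical fields; real annotations would come from the services.
variants = [
    {"gene": "GENE_A", "effect": "missense", "in_dbsnp": False, "allele_freq": 0.0},
    {"gene": "GENE_A", "effect": "synonymous", "in_dbsnp": True, "allele_freq": 0.2},
    {"gene": "GENE_B", "effect": "missense", "in_dbsnp": True, "allele_freq": 0.3},
    {"gene": "GENE_C", "effect": "nonsense", "in_dbsnp": False, "allele_freq": 0.001},
]

# Hypothetical criteria for illustration, not the steps used in the figure.
steps = [
    ("nonsynonymous", lambda v: v["effect"] != "synonymous"),
    ("not in dbSNP", lambda v: not v["in_dbsnp"]),
    ("rare", lambda v: v["allele_freq"] < 0.01),
]

survivors, report = run_filters(variants, steps)
for name, n_genes in report:
    print(f"{name}: {n_genes} candidate genes")
```

Keeping each criterion as a named predicate makes it straightforward to reorder steps or swap in annotation-based checks while still tracking the candidate count at every stage.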
The list of data sources for MyGene.info. Column 1 lists the names of all eight data sources included in MyGene.info. Column 2 lists the version of each data source. Column 3 lists the URL for each data source
| Source | Version | URL | Reference |
|---|---|---|---|
| NCBI Entrez | 2015-10-24 | | |
| Ensembl | 82 | | |
| UniProt | 2015-10-15 | | |
| NetAffx | na35 | | |
| PharmGKB | 2015-10-05 | | |
| UCSC | 2015-10-20 | | |
| CPDB | 31 | | |
| RefSeq | 68 | | |
The list of data sources for MyVariant.info. Column 1 lists the names of all 14 data sources included in MyVariant.info. Column 2 lists the version of each data source. Column 3 shows the number of variants from each data source included in MyVariant.info. Column 4 lists the URL for each data source
| Source | Version | No. of variants | URL | Reference |
|---|---|---|---|---|
| dbNSFP | v3.0c | 82,030,830 | | |
| dbSNP | v144 | 145,132,257 | | |
| ClinVar | 2015-09 | 114,627 | | |
| EVS | v2 | 1,977,300 | | |
| CADD | v1.2 | 163,690,986 | | |
| MutDB | - | 420,221 | | |
| GWAS Catalog | From UCSC | 15,243 | | |
| COSMIC | v68 from UCSC | 1,024,498 | | |
| DOCM | - | 1119 | | |
| SNPedia | - | 5907 | | |
| EMVClass | - | 12,066 | | |
| Wellderly | - | 21,240,519 | | |
| ExAC | v0.3 | 10,195,872 | | |
| GRASP | v2.0.0.0 | 2,212,148 | | |