Javier Lopez, Jacobo Coll, Matthias Haimel, Swaathi Kandasamy, Joaquin Tarraga, Pedro Furio-Tari, Wasim Bari, Marta Bleda, Antonio Rueda, Stefan Gräf, Augusto Rendon, Joaquin Dopazo, Ignacio Medina.
Abstract
High-profile genomic variation projects, such as the 1000 Genomes Project and the Exome Aggregation Consortium, are generating a wealth of knowledge about human genomic variation that can serve as an essential reference for identifying disease-causing genotypes. However, accessing these data, comparing the various studies and integrating the data into downstream analyses remain cumbersome. The Human Genome Variation Archive (HGVA) tackles these challenges and facilitates access to genomic data for key reference projects in a clean, fast and integrated fashion. HGVA provides an efficient and intuitive web interface for easy data mining, a comprehensive RESTful API, and client libraries in Python, Java and JavaScript for fast programmatic access to its knowledge base. HGVA calculates population frequencies for these projects and enriches their data with variant annotation provided by CellBase, a rich and fast annotation solution. HGVA serves as a proof of concept of the genome analysis developments being carried out by the University of Cambridge together with the UK's 100 000 Genomes Project and the National Institute for Health Research BioResource Rare-Diseases, in particular the deployment of the Open-source for Computational Biology (OpenCB) software platform for storing and analyzing massive genomic datasets.
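The abstract states that HGVA calculates population frequencies for each project but does not say how. As a rough illustration of what such a calculation involves, the sketch below counts alleles across VCF-style genotype calls for a single variant site. The function name `allele_frequencies`, the input format and the sample genotypes are assumptions made for this example; they are not taken from HGVA or OpenCB.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return {allele: frequency} for one variant site in one population.

    `genotypes` is a list of VCF-style genotype strings such as "0/1" or
    "1|1" (hypothetical input format). Missing calls (".") are skipped.
    """
    counts = Counter()
    for gt in genotypes:
        # Treat phased ("|") and unphased ("/") genotypes the same way.
        for allele in gt.replace("|", "/").split("/"):
            if allele != ".":
                counts[allele] += 1

    total = sum(counts.values())
    if total == 0:
        return {}  # no called alleles at this site
    return {allele: n / total for allele, n in counts.items()}


# Made-up genotypes for five samples; one sample has a missing call.
cohort = ["0/0", "0/1", "1|1", "./.", "0/1"]
freqs = allele_frequencies(cohort)
```

Here the four called samples contribute eight alleles, four reference ("0") and four alternate ("1"), so both frequencies are 0.5. A production pipeline would also need to handle multi-allelic sites, sex chromosomes and per-population grouping, which this sketch leaves out.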
Year: 2017 PMID: 28535294 PMCID: PMC5570161 DOI: 10.1093/nar/gkx445
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
List of studies and versions available at HGVA
| Project name | Studies | Version/date |
|---|---|---|
| Reference GRCh37 | 1000 genomes project GRCh37 | Phase 3 2016-05 |
| | Exome Sequencing Project | 2016-05 |
| | Exome Aggregation Consortium | 0.3.1 2016-05 |
| | Genome of the Netherlands | Release 5 2016-05 |
| | UK10K project | 2016-05 |
| | Spanish Medical Genome Project | 2016-12 |
| Reference GRCh38 | 1000 genomes project GRCh38 | Phase 3 2016-10 |
| Cancer GRCh37 | QIMR Berghofer melanoma | 2016-12 |
| | Chronic myeloid leukemia–Russian Academy of Medical Sciences | 2016-12 |
| Platinum | Illumina platinum | 2015-08 |
Figure 1. Pipeline for processing and loading variants into the Human Genome Variation Archive (HGVA).
Figure 2. HGVA infrastructure/architecture.
Figure 3. (A) Variant grid showing a summary of variant annotation data. (B) Detailed annotation for the variant selected in the variant grid. (C) Filters menu, with filtering options ranging from genomic region or gene names to CADD scores, HPO or Gene Ontology terms. (D) Filter bar, which displays the selected filters for quick editing or removal.
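The filters listed for Figure 3C (genomic region, gene names, CADD score) can be pictured as independent predicates applied to each variant. The sketch below assumes that active filters are combined with a logical AND; the source does not confirm this. The `Variant` record, the `filter_variants` function and the sample data are hypothetical illustrations, not the HGVA web interface or its API.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical minimal variant record used only for this example."""
    chrom: str
    pos: int
    gene: str
    cadd: float


def filter_variants(variants, chrom=None, start=None, end=None,
                    genes=None, min_cadd=None):
    """Keep variants that satisfy every filter that is set (not None)."""
    selected = []
    for v in variants:
        if chrom is not None and v.chrom != chrom:
            continue
        if start is not None and v.pos < start:
            continue
        if end is not None and v.pos > end:
            continue
        if genes is not None and v.gene not in genes:
            continue
        if min_cadd is not None and v.cadd < min_cadd:
            continue
        selected.append(v)
    return selected


# Made-up variants with placeholder gene names and scores.
variants = [
    Variant("1", 1_000, "GENE_A", 5.2),
    Variant("1", 2_500, "GENE_A", 22.0),
    Variant("1", 9_000, "GENE_B", 30.1),
    Variant("2", 2_000, "GENE_C", 25.4),
]

# Region 1:1-5000 restricted to variants with a CADD score of at least 15.
hits = filter_variants(variants, chrom="1", start=1, end=5_000, min_cadd=15.0)
```

With these inputs only the variant at position 2 500 on chromosome 1 passes all three filters. Removing one filter, as the filter bar in panel D allows in the interface, corresponds to setting that argument back to `None`.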