Peter D Stenson1, Matthew Mort2, Edward V Ball2, Molly Chapman2, Katy Evans2, Luisa Azevedo2,3, Matthew Hayden2, Sally Heywood2, David S Millar2, Andrew D Phillips2, David N Cooper2.
Abstract
The Human Gene Mutation Database (HGMD®) constitutes a comprehensive collection of published germline mutations in nuclear genes that are thought to underlie, or are closely associated with, human inherited disease. At the time of writing (June 2020), the database contains in excess of 289,000 different gene lesions identified in over 11,100 genes, manually curated from 72,987 articles published in over 3100 peer-reviewed journals. Two main groups of users utilise HGMD on a regular basis: research scientists and clinical diagnosticians. This review aims to highlight how to make the most of HGMD data in each setting.
Year: 2020 PMID: 32596782 PMCID: PMC7497289 DOI: 10.1007/s00439-020-02199-3
Source DB: PubMed Journal: Hum Genet ISSN: 0340-6717 Impact factor: 4.132
Numbers and types of different variants and genes present in HGMD Professional release 2020.2 and the publicly available version of the database (as of June 7th 2020)
| Mutation type | Number of Mutations in HGMD Professional 2020.2 (disease-associated/functional polymorphism sub-total) | Number of Mutations (publicly available via |
|---|---|---|
| Missense substitutions | 136,383 (6435) | 85,225 |
| Nonsense substitutions | 31,407 (392) | 20,779 |
| Splicing substitutions (intronic and exonic) | 24,976 (735) | 17,183 |
| Regulatory (5′ and 3′ and intergenic) | 4723 (3006) | 3544 |
| Small deletions (≤ 20 bp) | 41,749 (369) | 28,155 |
| Small insertions/duplications (≤ 20 bp) | 17,760 (212) | 11,745 |
| Small indels (≤ 20 bp) | 3813 (70) | 2679 |
| Gross deletions (> 20 bp) | 20,448 (170) | 14,186 |
| Gross insertions/duplications (> 20 bp) | 5219 (98) | 3445 |
| Complex rearrangements | 2299 (138) | 1747 |
| Repeat variations | 569 (331) | 498 |
| All HGMD data | 289,346 (11,954) | 189,186<sup>a</sup> |
| HGVS nomenclature provided<sup>b</sup> | 263,452 (10,923) | 0 |
| Genomic coordinates/Variant Call Format (VCF) provided<sup>c</sup> | 263,160 (10,845) | 168,473<sup>d</sup> |

DM: disease-causing mutation; DM?: likely disease-causing, but with questionable pathogenicity; DP: disease-associated polymorphism; DFP: disease-associated polymorphism with supporting functional evidence; FP: in vitro/laboratory or in vivo functional polymorphism
<sup>a</sup> Mutations available via the HGMD Public Website (http://www.hgmd.org)
<sup>b</sup> As described in den Dunnen et al. (2016)
<sup>c</sup> As described by Danecek et al. (2011)
<sup>d</sup> The Ensembl HGMD_PUBLIC release (https://www.ensembl.org/) contains hg19/hg38 genomic coordinates and HGMD accession numbers only
<sup>e</sup> Total excludes mitochondrial genes (searchable but no variant data) and retired records
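
As a quick consistency check on the table above, the per-type counts can be summed and compared with the "All HGMD data" row, and the share of each mutation type that is publicly available can be derived. The following is a minimal Python sketch: the counts are transcribed from the table, while the names `COUNTS`, `column_totals` and `public_share` are illustrative assumptions and not part of any HGMD tooling.

```python
# Counts transcribed from the table above:
# (HGMD Professional 2020.2, publicly available version).
# All names below are illustrative, not an HGMD interface.
COUNTS = {
    "Missense substitutions": (136_383, 85_225),
    "Nonsense substitutions": (31_407, 20_779),
    "Splicing substitutions": (24_976, 17_183),
    "Regulatory": (4_723, 3_544),
    "Small deletions": (41_749, 28_155),
    "Small insertions/duplications": (17_760, 11_745),
    "Small indels": (3_813, 2_679),
    "Gross deletions": (20_448, 14_186),
    "Gross insertions/duplications": (5_219, 3_445),
    "Complex rearrangements": (2_299, 1_747),
    "Repeat variations": (569, 498),
}


def column_totals(counts):
    """Return (professional_total, public_total) summed over all mutation types."""
    professional = sum(pro for pro, _ in counts.values())
    public = sum(pub for _, pub in counts.values())
    return professional, public


def public_share(counts):
    """Return, per mutation type, the fraction of Professional entries that are public."""
    return {name: pub / pro for name, (pro, pub) in counts.items()}


if __name__ == "__main__":
    professional, public = column_totals(COUNTS)
    print(f"Professional: {professional:,}  Public: {public:,}")
    for name, share in public_share(COUNTS).items():
        print(f"{name}: {share:.1%} publicly available")
```

With these figures, the column totals reproduce the reported 289,346 and 189,186, so roughly 65% of the Professional entries are available in the public version.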
Fig. 1 Mutation totals by year of publication subdivided by variant class. *Figures for 2019 and 2020 not yet complete. DM disease-causing mutation, DM? Likely disease-causing, but with questionable pathogenicity
Fig. 2 Top 20 journals by number of mutation entries (HGMD Professional release 2020.2 June 7th 2020) in relation to both primary and additional (secondary) references
Fig. 3 Example, online batch result set from HGMD Professional 2020.2
HGMD variant classes
| HGMD variant class | Relevance | Clinical diagnostic setting | NGS research setting |
|---|---|---|---|
| DM—Disease-causing mutation | Literature indicates causal (or likely causal) link with disease | Most important. These data should be prioritized | Depending on the user’s remit, these should be looked at first |
| DM?—Likely disease-causing, but with additional uncertainty | As for DM, but the authors, curators or other literature evidence indicate that further caution is warranted | If no DM variants are found, these should be looked for next, or they should be ranked lower priority if there are DM results | These data may also be of interest, depending upon requirements (e.g. gene ontology or disease concept stratification) |
| DP—Disease-associated polymorphism | Significant statistical association with a clinical phenotype. Likely to be functionally relevant | Likely to be irrelevant in a clinical diagnostic setting | These should be included if personal disease risk is being assessed |
| DFP—DP with supporting functional evidence | As for DP, but definitive functional evidence exists (e.g. via an in vitro luciferase assay) | Potentially important in terms of calculating disease risk (e.g. venous thrombosis risk and Factor V Leiden). Other relevant disease risk or drug response variants are also present in this class | If the aim is to look at personal disease risk, or for disease modifiers, then these should be included |
| FP—functional polymorphism with no reported disease association | Functional effect has been demonstrated, but no disease association has been reported as yet | Likely to be irrelevant in a clinical diagnostic setting, although drug response variants may be present | Interesting from a research perspective as potential risk modifier variants |
| R—retired from HGMD | Record has been retired and is no longer considered to be phenotypically relevant | Potentially relevant for the purpose of variant exclusion | Potentially relevant if the researcher is interested in variant re-annotation, etc. |
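
The clinical diagnostic column of the table above can be read as a rough triage order: DM first, DM? next, with retired (R) records set aside for variant exclusion. The sketch below illustrates one way to encode that reading in Python. It is an assumption-laden illustration only: the table gives qualitative guidance, so the numeric ranks (including the relative order of DFP, DP and FP), the `Annotated` record type and the `triage` function are all hypothetical and do not reflect any HGMD software interface.

```python
from dataclasses import dataclass

# Illustrative ranking for a clinical diagnostic setting. The table only says
# DM comes first and DM? next; the positions of DFP, DP and FP are assumptions.
CLINICAL_PRIORITY = {"DM": 0, "DM?": 1, "DFP": 2, "DP": 3, "FP": 4}


@dataclass(frozen=True)
class Annotated:
    variant: str      # hypothetical variant identifier
    hgmd_class: str   # one of DM, DM?, DP, DFP, FP, R


def triage(variants):
    """Return (ranked, retired): active records by priority, retired ones apart."""
    ranked, retired = [], []
    for v in variants:
        if v.hgmd_class == "R":
            retired.append(v)          # kept for variant exclusion
        elif v.hgmd_class in CLINICAL_PRIORITY:
            ranked.append(v)
        else:
            raise ValueError(f"unknown HGMD class: {v.hgmd_class!r}")
    # sorted() is stable, so records of the same class keep their input order.
    ranked = sorted(ranked, key=lambda v: CLINICAL_PRIORITY[v.hgmd_class])
    return ranked, retired
```

For an NGS research setting, the same structure could be reused with a different priority mapping, for example one that moves DP, DFP and FP up when personal disease risk or modifier variants are of interest.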
Fig. 4 Example of an NGS/diagnostic workflow
Fig. 5 HGMD vs ClinVar vs OMIM comparison (as of March 2020)