Gillian S Townend, Friederike Ehrhart, Henk J van Kranen, Mark Wilkinson, Annika Jacobsen, Marco Roos, Egon L Willighagen, David van Enckevort, Chris T Evelo, Leopold M G Curfs.
Abstract
Rett syndrome (RTT) is a monogenic rare disorder that causes severe neurological problems. In most cases, it results from a loss-of-function mutation in the gene encoding methyl-CpG-binding protein 2 (MECP2). Currently, about 900 unique MECP2 variations (benign and pathogenic) have been identified, and it is suspected that different mutations contribute to different levels of disease severity. For researchers and clinicians, it is important that genotype-phenotype information is available to identify disease-causing mutations for diagnosis, to aid in clinical management of the disorder, and to provide counseling for parents. In this study, 13 genotype-phenotype databases were surveyed for their general functionality and availability of RTT-specific MECP2 variation data. For each database, we investigated findability and interoperability alongside practical user functionality, and the type and amount of genetic and phenotype data. The main conclusions are that these databases, and the specific MECP2 variants held within them, are challenging to find, and that interoperability is as yet poorly developed, so searching across databases requires effort. Nevertheless, we found several thousand online database entries for MECP2 variations and their associated phenotypes, diagnosis, or predicted variant effects, which is a good starting point for researchers and clinicians who want to provide, annotate, and use the data.
Keywords: FAIR data; MECP2; Rett syndrome; databases; genetic variation; phenotype
Year: 2018 PMID: 29704307 PMCID: PMC6033003 DOI: 10.1002/humu.23542
Source DB: PubMed Journal: Hum Mutat ISSN: 1059-7794 Impact factor: 4.878
MECP2 mutations selected for test database searches
| | Variant 1 | Variant 2 | Variant 3 | Variant 4 | Variant 5 |
|---|---|---|---|---|---|
| Description | MBD hotspot mutation from (Zappella et al., | Frequently reported nonsense mutation | Frequently reported missense mutation | WES variant from (Rauch et al., | WGS variant from (Gilissen et al., |
| Genomic DNA | g.153296882G>A | g.153296777G>A | g.153296363G>A | g.153296093_153296115del | g.153295929_153296514del |
| cDNA | c.397C>T | c.502C>T | c.916C>T | c.1200_1222del | c.765_1350del |
| Protein | p.(Arg133Cys) | p.(Arg168*) | p.(Arg306Cys) | p.(Pro401Argfs*8) | p.(Lys256Asnfs*31) |
The current genome build at the time of writing this article is GRCh38, but most databases were still using GRCh37. For MECP2, the genomic coordinates of the two builds differ by an offset ranging from 735 to 659 kbp.
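The two deletion variants in the table can be cross-checked arithmetically, because an HGVS range is inclusive at both ends. The following is a minimal sketch, not part of the original study: the regular expression and the helper names `deletion_length` and `summarize` are hypothetical, and the pattern only handles plain range deletions of the form shown in the table (no intronic offsets or other HGVS syntax). It confirms that the genomic and cDNA descriptions of each deletion span the same number of bases, and that neither length is a multiple of three, which is consistent with the frameshift (`fs`) protein descriptions.

```python
import re

# Deletion descriptions copied from the table above (variants 4 and 5).
DELETIONS = {
    "Variant 4": {"genomic": "g.153296093_153296115del", "cdna": "c.1200_1222del"},
    "Variant 5": {"genomic": "g.153295929_153296514del", "cdna": "c.765_1350del"},
}

# Hypothetical helper pattern: matches only simple range deletions such as
# "c.1200_1222del"; it is not a general HGVS parser.
RANGE_DEL = re.compile(r"^(?P<level>[gc])\.(?P<start>\d+)_(?P<end>\d+)del$")


def deletion_length(hgvs: str) -> int:
    """Return the number of deleted bases in a simple HGVS range deletion."""
    match = RANGE_DEL.match(hgvs)
    if match is None:
        raise ValueError(f"unsupported HGVS description: {hgvs!r}")
    # HGVS ranges include both the start and the end position.
    return int(match["end"]) - int(match["start"]) + 1


def summarize(deletions: dict) -> dict:
    """Compare g. and c. deletion lengths and flag lengths not divisible by 3."""
    summary = {}
    for name, levels in deletions.items():
        g_len = deletion_length(levels["genomic"])
        c_len = deletion_length(levels["cdna"])
        summary[name] = {
            "length": c_len,
            "levels_agree": g_len == c_len,
            "frameshift": c_len % 3 != 0,
        }
    return summary


if __name__ == "__main__":
    for name, info in summarize(DELETIONS).items():
        print(name, info)
```

A check of this kind is only a sanity test on the notation; it does not replace a proper HGVS validator or a coordinate conversion between genome builds.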
Overview of databases included in the review
| Database and link | Contact | How to cite (literature reference for database) | Short description |
|---|---|---|---|
| RettBase | | Christodoulou et al. ( | Specific focus on RTT. Database of genetic information about RTT patients. Contains mutation information about MECP2 as well as CDKL5 and FOXG1, which cause different syndromes (formerly named Rett-like syndromes). |
| Korean Mutation Database (KMD) | Contact via | – | Genotype-disease database. Collection of disease-causing variants in genes. |
| ClinVar | | Landrum et al. ( | Genotype–phenotype database. Focus on disease-causing variants in genes. |
| HGMD “professional” | | Stenson et al. ( | Commercial genotype–phenotype database. |
| LOVD | MECP2 curator: | Fokkema et al. ( | Genetic variants database. Locus/gene specific, all genes. |
| DECIPHER | | Firth et al. ( | Genotype–phenotype database. All genes. |
| EVS | | – | Genetic variants database. Originally for variants that contribute to heart, lung, and blood disorders. Now open to all genes, linked to dbSNP and dbGAP. |
| ExAC Browser | | Lek et al. ( | Database/project to collect and harmonize whole exome sequencing data. Allows search for variants at certain locations or in single genes, and direct search for variants. |
| dbSNP | | Kitts et al. ( | Genetic variation database. Collection of single nucleotide polymorphisms (SNPs) and an effect predictor score. |
| dbVAR | | Lappalainen et al. ( | Database for genomic structural variations, including indels, mobile element insertions, duplications, inversions, translocations, and complex chromosomal arrangements. |
| EVA | | – | Variant browser. Allows search for variants at specific locations or in specific genes. |
| Cafe Variome | | Lancaster et al. ( | Meta-database for genetic variants and genotype-phenotype databases. Links to 1000 Genomes Project, dbSNP, Diagnostic Variants, Diagnostic Mutation Database, The Frequency of Inherited Disorders Database, Finnish Disease, FORGE Canada Consortium, PhenCode, UniProt, Human Gene Mutation Database, and Locus-specific Databases. Freely available, but some content of the linked databases is only available after registration. |
| DisGeNET | | Pinero et al. ( | Database for gene-disease and variant-disease associations. Imports data from curated databases such as UniProt, ClinVar, and GWAS Catalog. |
Description of database structure and information types
| Database | ↑ Upload and ↓ download of data; API (if available) | Phenotype information available | Genotype information available |
|---|---|---|---|
| RettBase | ↑ Submission of data by mail possible. ↓ No download function, web interface. No API or similar. | Information on whether RTT or not; distinguishes between classical, atypical, preserved speech, and forme fruste RTT, mental retardation (not Rett), and autism | According to HGVS change on the mRNA/cDNA level, RefSeq NM_004992 unless stated otherwise |
| KMD | ↑ Submission of data by registered users. ↓ No download function, web interface. No API or similar. | Diagnosed with RTT using the OMIM identifier (= RTT/RTT preserved speech variant) | According to HGVS change on the mRNA/cDNA level and RefSeq |
| ClinVar | ↑ Possible, detailed submission templates and instructions available. ↓ Download/export of search results in the form of text files or UI lists possible. API available. | Information on whether pathogenic or not; diagnosis, for example, RTT, autism, X-linked mental retardation | According to HGVS change on the mRNA/cDNA level (mostly) and RefSeq |
| HGMD “professional” | ↑ Not possible, HGMD has its own data acquisition resources. ↓ Download and export possible (for registered paying users). | UMLS (ontology), HPO (ontology), OMIM, SNOMED CT, MeSH | Descriptive, e.g., 11 kb deletion in exon 1–2; HGVS format in the detailed description |
| LOVD | ↑ Upload possible after registration with Submitter clearance. ↓ Download of complete database possible, not for specific genes/search results. API available for LOVD 2.0, for LOVD 3.0 | Variant effect predictor: “+” indicating the variant affects function, “+?” probably affects function, “-” does not affect function, “-?” probably does not affect function, “?” effect unknown, “.” effect not classified. | According to HGVS change on the mRNA/cDNA, DNA, and protein level and RefSeq |
| DECIPHER | ↑ Open upload, bulk upload templates. ↓ Web interface, and “Anonymised consented DECIPHER data can be made available in the form of a downloadable encrypted file from a secure server under a data access agreement. Please see the section on data access agreement on the Data Sharing page.” API available. | Detailed phenotype description, using HPO annotations | According to HGVS change on the mRNA/cDNA level and Ensembl ID of transcript used (includes RefSeq) |
| EVS | ↑ Data is exclusively from NHLBI GO Exome Sequencing Project (ESP). ↓ Bulk download files, download of specific gene variant information search results as text or VCF. No API or similar. | Variant effect prediction by PolyPhen2 | According to HGVS on the mRNA/cDNA and protein level, rs IDs |
| ExAC Browser | ↑ No upload possible, ExAC includes data from a list of projects. ↓ Export of variation table as CSV possible. API available. | Variant effect prediction: consequence of variation, for example, intronic variation, and consequence of protein aa change | rs IDs, genomic position, RefSeq, and allele |
| dbSNP | ↑ Submission possible either directly or via EVA, dbGAP, or ClinVar. ↓ Possible, “Send to file” function for search results, batch query function for machine readability. API at | Variant effect prediction; consequences such as intronic variation and consequences for phenotype, for example, increased susceptibility to diseases, are given. No RTT mutations are yet available. | rs IDs, HGVS (mRNA/cDNA) |
| dbVAR | ↑ Possible, no clinical data (ClinVar), no sensitive data (dbGAP). ↓ “Send-to-file” function. API at | Clinical assertion: pathogenic/uncertain significance | rs ID and allele |
| EVA | ↑ Open to everyone, submission guidelines. ↓ Free; export function (CSV). API available. | Variant effect prediction by PolyPhen2/SIFT | rs IDs and allele |
| Cafe Variome | ↑ Upload direct to Café Variome “hosted” or “in-a-box”. ↓ Export of search results in different formats (CSV, html, LOVD…). API available. | dbSNP: “phenotype” column, no entries; HGMD: no phenotype data; Locus specific: no phenotype data; PhenCode: phenotype entry for 1/5 of entries: diagnosis (RTT, X-linked mental retardation); UniProt: same as PhenCode | dbSNP: HGVS (mRNA/cDNA) allele and RefSeq; HGMD: no variant data visible; Locus specific: HGVS (mRNA/cDNA) allele and RefSeq; PhenCode: HGVS (mRNA/cDNA), reference links to original data source; UniProt: HGVS (mRNA/cDNA), reference links to UniProt ID |
| DisGeNET | ↑ No submission, adding of data by text mining and other databases. ↓ Download of search results possible in different formats (download page | Diagnosis | rs IDs |
Findable at FairSharing.org.
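Where the table lists an API as available, a gene-level query can usually be expressed as a single URL. The sketch below is an illustration only and rests on an assumption not stated in the table: that ClinVar, as an NCBI resource, is queried through the NCBI E-utilities `esearch` endpoint with `db=clinvar` and a `[gene]` field tag. The function name `clinvar_gene_search_url` is hypothetical. The code only builds the request URL and sends nothing over the network.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint. Using it for ClinVar is an assumption of
# this sketch; the table above only records that an API is available.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def clinvar_gene_search_url(gene: str, retmax: int = 20) -> str:
    """Build (but do not send) an esearch URL for ClinVar records of a gene."""
    params = {
        "db": "clinvar",           # target database
        "term": f"{gene}[gene]",   # restrict the search term to the gene field
        "retmode": "json",         # ask for a JSON response
        "retmax": retmax,          # maximum number of record IDs to return
    }
    return f"{ESEARCH_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(clinvar_gene_search_url("MECP2"))
```

Each of the other databases with an API has its own query syntax and response format, which is part of the interoperability effort the survey describes.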
Number of database entries for MECP2 or RTT in general and for five specific variants (status March 2018). A number gives the count of entries in which the variant is present; + means the variant is present but displayed without details; − means the variant was not found.
| Database | Total number of MECP2 variant entries | Variant 1 c.397C>T missense | Variant 2 c.502C>T nonsense | Variant 3 c.916C>T missense | Variant 4 c.1200_1222del | Variant 5 c.765_1350del |
|---|---|---|---|---|---|---|
| RettBase | 4738 (897 unique) | 217 | 363 | 246 | − | − |
| Korean Mutation Database | 35 | 1 | 1 | 1 | − | − |
| ClinVar | 1145 | 1 | 13 | 13 | − | − |
| HGMD “professional” | 975 | − | + | + | − | + |
| LOVD3.0 MECP2 | 4588 (807 unique) | 197 | 335 | 218 | − | + |
| DECIPHER | 203 | 6 | 4 | 2 | − | − |
| EVS | 117 | − | − | − | − | − |
| ExAC | 599 | − | − | − | − | − |
| dbSNP | 4229 | + | + | + | − | − |
| dbVAR | 469 | + | + | + | − | − |
| EVA | 378 | + | + | + | − | − |
| Cafe Variome – dbSNP | 500 | − | 1 | 1 | − | − |
| Cafe Variome – PhenCode | 809 | 1 | 1 | 1 | − | − |
| Cafe Variome – UniProt | 71 | 1 | − | 1 | − | − |
| Cafe Variome – HGMD | 249 | − | − | − | − | − |
| Cafe Variome – Locus‐specific Databases | 10 | − | − | 1 | − | − |
| DisGeNET | + | + | + | + | − | − |
dbSNP and the Café Variome request to dbSNP provided different numbers of MECP2 entries; the same applies to LOVD and HGMD. As the Café Variome link uses the public version of HGMD, the exact variants are not shown.
Search was done via rs number, which does not give the exact variation, only the position.
The numbers for dbSNP and dbVAR are from NCBI's Variation Viewer for MECP2 (GRCh37.p13).
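The presence data in the table above can be tallied to show how many of the listed sources return each test variant. This is a minimal sketch with the table cells transcribed by hand. The names `COLUMNS`, `ROWS`, and `databases_per_variant` are hypothetical, the ASCII `-` stands in for the table's "−" symbol, and each Cafe Variome sub-source is counted as its own row, as in the table.

```python
# Cells transcribed from the table above, in the order Variant 1..Variant 5.
# "-" stands for "variant not found"; a number or "+" counts as present.
COLUMNS = ["Variant 1", "Variant 2", "Variant 3", "Variant 4", "Variant 5"]
ROWS = {
    "RettBase": ["217", "363", "246", "-", "-"],
    "Korean Mutation Database": ["1", "1", "1", "-", "-"],
    "ClinVar": ["1", "13", "13", "-", "-"],
    "HGMD professional": ["-", "+", "+", "-", "+"],
    "LOVD3.0 MECP2": ["197", "335", "218", "-", "+"],
    "DECIPHER": ["6", "4", "2", "-", "-"],
    "EVS": ["-", "-", "-", "-", "-"],
    "ExAC": ["-", "-", "-", "-", "-"],
    "dbSNP": ["+", "+", "+", "-", "-"],
    "dbVAR": ["+", "+", "+", "-", "-"],
    "EVA": ["+", "+", "+", "-", "-"],
    "Cafe Variome - dbSNP": ["-", "1", "1", "-", "-"],
    "Cafe Variome - PhenCode": ["1", "1", "1", "-", "-"],
    "Cafe Variome - UniProt": ["1", "-", "1", "-", "-"],
    "Cafe Variome - HGMD": ["-", "-", "-", "-", "-"],
    "Cafe Variome - Locus-specific Databases": ["-", "-", "1", "-", "-"],
    "DisGeNET": ["+", "+", "+", "-", "-"],
}


def databases_per_variant(rows: dict) -> dict:
    """Count, for each variant column, the rows in which the variant is present."""
    counts = {column: 0 for column in COLUMNS}
    for cells in rows.values():
        for column, cell in zip(COLUMNS, cells):
            if cell != "-":
                counts[column] += 1
    return counts


if __name__ == "__main__":
    for variant, count in databases_per_variant(ROWS).items():
        print(f"{variant}: found in {count} of {len(ROWS)} sources")
```

Read this way, the table shows that the three frequently reported point mutations are widely covered, whereas the two deletions identified by exome and genome sequencing are found in few or none of the sources.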