Chien-Yueh Lee, Amrita Chattopadhyay, Li-Mei Chiang, Jyh-Ming Jimmy Juang, Liang-Chuan Lai, Mong-Hsun Tsai, Tzu-Pin Lu, Eric Y Chuang.
Abstract
Integrated analysis of DNA variants and gene expression profiles may facilitate precise identification of gene regulatory networks involved in disease mechanisms. Despite the widespread availability of public resources, we lack databases that are capable of simultaneously providing gene expression profiles, variant annotations, functional prediction scores and pathogenic analyses. VariED is the first web-based querying system that integrates an annotation database and expression profiles for genetic variants. The database offers a user-friendly platform and locates gene/variant names in the literature by connecting to established online querying tools, biological annotation tools and records from free-text literature. VariED acts as a central hub for organized genome information consisting of gene annotation, variant allele frequency, functional prediction, clinical interpretation and gene expression profiles in three species: human, mouse and zebrafish. VariED also provides a novel scoring scheme to predict the functional impact of a DNA variant. With one single entry, all results regarding queried DNA variants can be downloaded. VariED can potentially serve as an efficient way to obtain comprehensive variant knowledge for clinicians and scientists around the world working on important drug discoveries and precision treatments.
Year: 2019 PMID: 31317185 PMCID: PMC6637258 DOI: 10.1093/database/baz075
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Overview of the VariED database.
Comparison of functions and query results offered by existing databases
| Population allele frequency | GE | VCF file | Batch search | Functional prediction scores | Clinical interpretation | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1000 Genomes | ESP | IJGVD | TWB | ExAC | gnomAD | REVEL | GERP++ | CADD | prediction | ||||
| VariED | + | + | + | + | + | + | + | + | + | + | + | + | + |
| ANNOVAR | + | + | + | + | + | + | + | + | + | ||||
| InterVar | + | + | + | +/−ᵃ | +/−ᵃ | + | + | ||||||
| ProteinAtlas | + | ||||||||||||
| HGMD | +/−ᵇ | +/−ᵇ | +/−ᵇ | ||||||||||
| Uniprot | +/−ᶜ | + | |||||||||||
| GeneCard | + | +/−ᵈ | ||||||||||||
| FUMA | + | + | + | ||||||||||
| UCSC | + | + | + | + | + | + | + | ||||||
| HaploReg | + | + | + | ||||||||||
| NCBI | + | + | + | + | + | ||||||||
| Ensembl | + | + | + | + | + | + | + | + | + | ||||
Notes: ESP, NHLBI Exome Sequencing Project; IJGVD, Integrative Japanese Genome Variation Database; TWB, Taiwan Biobank; ExAC, Exome Aggregation Consortium; GE, gene expression; REVEL, Rare Exome Variant Ensemble Learner; GERP, Genomic Evolutionary Rate Profiling; CADD, Combined Annotation Dependent Depletion; +, complete information support; +/−, partial information support.
ᵃ Script version only.
ᵇ Professional version only.
ᶜ Simple declaration only.
ᵈ Limited to 100 genes per query unless an annual unlimited license is purchased.
Indices and their functional consequences
| Index | Criteria | Functional consequence |
|---|---|---|
| 0 | Intronic, intergenic or synonymous | Non-pathogenic |
| 1 | Low precedenceᵃ or REVEL score < thresholdᵇ | Less-pathogenic |
| 2 | High precedenceᶜ or splicing or (low precedence and REVEL score > thresholdᵇ) | Moderately pathogenic |
| 3 | Index = 2 and (expression value (TPM) ≥ 0.5 or GERP++ score > 2) | Highly pathogenic |
ᵃ Low precedence: non-frameshift insertion, non-frameshift deletion, non-frameshift substitution and nonsynonymous SNV.
ᵇ Threshold = 0.5 (default); for higher specificity, the threshold can be set to 0.75.
ᶜ High precedence: frameshift insertion, frameshift deletion, frameshift substitution, stop-gain and stop-loss.
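The index table can be read as a small decision procedure. The Python sketch below is one possible reading of it, not the VariED implementation. The function name `varied_index`, the category spellings and the handling of missing scores are assumptions made for illustration. In particular, a low-precedence variant is assumed to receive index 1 unless its REVEL score exceeds the threshold, and a missing score is treated as not meeting its condition.

```python
# Illustrative reading of the index table; names and edge-case handling are
# assumptions, not the VariED source code.
NON_PATHOGENIC = {"intronic", "intergenic", "synonymous"}
LOW_PRECEDENCE = {
    "non-frameshift insertion",
    "non-frameshift deletion",
    "non-frameshift substitution",
    "nonsynonymous SNV",
}
HIGH_PRECEDENCE = {
    "frameshift insertion",
    "frameshift deletion",
    "frameshift substitution",
    "stop-gain",
    "stop-loss",
}
LABELS = {
    0: "Non-pathogenic",
    1: "Less-pathogenic",
    2: "Moderately pathogenic",
    3: "Highly pathogenic",
}


def varied_index(consequence, revel=None, tpm=None, gerp=None, threshold=0.5):
    """Return an index from 0 to 3 for one variant, following the table."""
    if consequence in NON_PATHOGENIC:
        return 0
    if consequence in HIGH_PRECEDENCE or consequence == "splicing":
        index = 2
    elif consequence in LOW_PRECEDENCE:
        # Assumption: a missing or at-threshold REVEL score stays at index 1.
        index = 2 if (revel is not None and revel > threshold) else 1
    else:
        raise ValueError(f"unlisted consequence: {consequence!r}")
    # Index 2 is raised to 3 when expression or conservation supports it.
    expressed = tpm is not None and tpm >= 0.5
    conserved = gerp is not None and gerp > 2
    if index == 2 and (expressed or conserved):
        index = 3
    return index


if __name__ == "__main__":
    idx = varied_index("nonsynonymous SNV", revel=0.8, tpm=3.2)
    print(idx, LABELS[idx])
```

Passing `threshold=0.75` corresponds to the higher-specificity setting mentioned in the footnote; under this reading, a nonsynonymous SNV with a REVEL score of 0.6 drops from index 2 to index 1.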
Figure 2. Screenshot of the ‘Variants search’ input page. This page is used to (i) enter user-queried variants (using chromosomal coordinates or VCF files) and (ii) select output information including allele frequency/counts, reference populations (1000 Genomes/IJGVD/ESP/TWB/ExAC/gnomAD), tissue of interest (any), gene expression profile score (TPM or Rank), functional prediction scores (REVEL, CADD and PolyPhen2), clinical interpretation (ClinVar) and dbSNP versions (Build 151 or Build 152). These options determine the gene description, allele frequency, functional prediction and clinical interpretation reported for one or more user-queried variants.
Figure 3. Screenshots of outputs from the ‘Variants search’ function. (A) Gene annotation and description of the queried variant; the ‘rsID’ column provides a hyperlink to dbSNP, and the ‘External’ column provides hyperlinks to Ensembl, NCBI gene and NIH Genetics Home Reference (NIH-GHI). (B) Allele frequencies of queried variants for each of the chosen reference populations. (C) Functional prediction along with scores such as VariED index, REVEL, CADD_raw, CADD_Phred and GERP++. (D) Clinical significance of the queried variant; the ‘AlleleID’ column provides a hyperlink to ClinVar.
Figure 4. Screenshot of the ‘Expression profiles’ page. This page is used to (i) enter user-specified gene names, (ii) select reference populations/species and (iii) indicate the tissue/organ of interest for obtaining gene expression profiles.
Figure 5. Screenshots of outputs from the ‘Expression profiles’ function. (A) Gene expression profiles for heart and testis in human and mouse for genes SCN5A, MYBPC3, GK2 and GAPDH. (B) Mouse orthologs for the SCN5A gene.