Manuel Holtgrewe, Oliver Stolpe, Mikko Nieminen, Stefan Mundlos, Alexej Knaus, Uwe Kornak, Dominik Seelow, Lara Segebrecht, Malte Spielmann, Björn Fischer-Zirnsak, Felix Boschann, Ute Scholl, Nadja Ehmke, Dieter Beule.
Abstract
VarFish is a user-friendly web application for the quality control, filtering, prioritization, analysis, and user-based annotation of DNA variant data with a focus on rare disease genetics. It is capable of processing variant call files with single or multiple samples. The variants are automatically annotated with population frequencies, molecular impact, and presence in databases such as ClinVar. Further, it provides support for pathogenicity scores including CADD, MutationTaster, and phenotypic similarity scores. Users can filter variants based on these annotations and presumed inheritance pattern and sort the results by these scores. Variants passing the filter are listed with their annotations and many useful link-outs to genome browsers, other gene/variant data portals, and external tools for variant assessment. VarFish allows users to create their own annotations including support for variant assessment following ACMG-AMP guidelines. In close collaboration with medical practitioners, VarFish was designed for variant analysis and prioritization in diagnostic and research settings as described in the software's extensive manual. The user interface has been optimized for supporting these protocols. Users can install VarFish on their own in-house servers where it provides additional lab notebook features for collaborative analysis and allows re-analysis of cases, e.g. after update of genotype or phenotype databases.
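The filter-then-sort idea described in the abstract can be illustrated with a minimal Python sketch. It is not VarFish's implementation: the `Variant` class, its field names, the `is_de_novo` helper, and the genotype encoding (`"0/1"`, `"0/0"`) are all assumptions made for illustration, and only one inheritance pattern (de novo in a trio) is shown.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical annotated variant record (not VarFish's data model)."""
    gene: str
    gnomad_af: float   # population allele frequency
    impact: str        # molecular impact, e.g. "missense"
    cadd: float        # pathogenicity score, higher means more suspicious
    genotypes: dict    # sample name -> genotype string such as "0/1"


def is_de_novo(variant, index, father, mother):
    """Heterozygous in the index patient, homozygous reference in both parents."""
    return (
        variant.genotypes[index] == "0/1"
        and variant.genotypes[father] == "0/0"
        and variant.genotypes[mother] == "0/0"
    )


def filter_and_sort(variants, max_af, impacts, index, father, mother):
    """Keep rare variants with a selected impact that fit the de novo pattern,
    then sort them by descending pathogenicity score."""
    passing = [
        v for v in variants
        if v.gnomad_af <= max_af
        and v.impact in impacts
        and is_de_novo(v, index, father, mother)
    ]
    return sorted(passing, key=lambda v: v.cadd, reverse=True)
```

Other inheritance patterns (e.g. recessive or compound heterozygous) would need their own predicates in place of `is_de_novo`.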
Year: 2020 PMID: 32338743 PMCID: PMC7319464 DOI: 10.1093/nar/gkaa241
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Feature comparison of state-of-the-art web-based tools for variant filtering and prioritization. The tools are ordered by the date of the most recent update; ties are broken by number of features.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
| 2013 | ✓ | ✓ | ✓ | ✓ | ||||||||||||||
|
| 2014 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||
|
| 2015 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||||
|
| 2017 | ✓ | ✓ | ✓ | ✓ | ||||||||||||||
|
| 2017 | ✓ | ✓ | ✓ | ✓ | ✓ | (✓) | ||||||||||||
|
| 2018 | (✓) | (✓) | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ||||||
|
| 2019 | ✓ | ✓ | ✓ | ✓ | ||||||||||||||
|
| 2020 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | (✓) | ✓ | ||||||||||
|
| 2020 | ✓ | ✓ | ✓ | ✓ | ||||||||||||||
|
| 2020 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | |||||||||
|
| 2020 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓* | ✓* | ✓* |
A tick mark (✓) indicates that the feature is present. ‘Interactivity’ allows users to interactively change filter and sorting options. ‘Tabular downloads’ allows users to download a spreadsheet (e.g. Excel) file with their results. ‘User annotations’ allow users to leave flags, color codes, or free-text comments; ‘ACMG-AMP support’ assists users in creating variant assessments following the ACMG-AMP guidelines. ‘Quality control’ allows users to perform visual quality control, e.g. for depth of coverage, while ‘sanity checks’ allow users to compare the relatedness or sex inferred from the data with user-provided metadata. ‘Multi-sample VCFs’ supports cases with more than one sample, while ‘multiple VCFs’ supports querying multiple cases at the same time. ‘Bring your server’ allows users to create their own installations on their own server. ‘In-house DB’ allows users to build a database of variants identified at their institution. An asterisk (*) indicates that the feature is only available on installations on the user's own server. Parentheses around the tick mark in the ‘multi-sample VCF’ row indicate that filtering is restricted to predefined models of inheritance. Parentheses for GeneTalk indicate that the feature is only available when using the PEDIA tool.
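To make the ‘sanity checks’ feature concrete, the following Python sketch compares a sex inferred from the data with the sex stated in user-provided metadata. It relies on one common heuristic (a low fraction of heterozygous calls on chromosome X suggests a male sample) and is not taken from VarFish. The function names, the genotype encoding, and the threshold of 0.25 are assumptions chosen for illustration only.

```python
def x_het_fraction(chrx_genotypes):
    """Fraction of non-reference chrX calls that are heterozygous.

    Returns None when there are no non-reference calls to judge from.
    """
    calls = [g for g in chrx_genotypes if g in ("0/1", "1/1")]
    if not calls:
        return None
    return sum(g == "0/1" for g in calls) / len(calls)


def infer_sex(chrx_genotypes, threshold=0.25):
    """Guess the sex from chrX heterozygosity (threshold is illustrative)."""
    fraction = x_het_fraction(chrx_genotypes)
    if fraction is None:
        return "unknown"
    return "female" if fraction > threshold else "male"


def sex_mismatches(chrx_by_sample, stated_sex):
    """List samples whose inferred sex contradicts the stated metadata."""
    mismatches = []
    for sample, genotypes in chrx_by_sample.items():
        inferred = infer_sex(genotypes)
        if inferred != "unknown" and inferred != stated_sex[sample]:
            mismatches.append(sample)
    return mismatches
```

A real check would restrict the calls to the non-pseudoautosomal part of chromosome X and calibrate the threshold on known samples.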
Figure 1. Illustration of the VarFish workflow. Subfigure (A) shows the preprocessing and import step, which can be triggered either by computational staff on the command line (e.g. in parallel on a compute cluster) or by non-computational staff by upload to VarFish Kiosk. The annotated files are then imported into the VarFish database. Subfigure (B) shows the query construction and execution step. The form values are converted into a SQL query that is sent to the database server for execution. After execution, the results are reported to the user.
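The conversion of form values into a SQL query described in (B) might look roughly like the following Python sketch. It is a hedged illustration, not VarFish's query builder: the `variants` table, its columns, and the form field names are hypothetical, and SQLite from the Python standard library stands in for the database server.

```python
import sqlite3

# Hypothetical mapping from form fields to parameterized SQL conditions.
CONDITIONS = {
    "max_frequency": "gnomad_af <= ?",
    "min_cadd": "cadd >= ?",
    "gene": "gene = ?",
}


def build_query(form):
    """Translate filled-in form fields into a SQL string plus parameters.

    Values are passed as bound parameters rather than interpolated into
    the SQL text, which avoids SQL injection.
    """
    clauses, params = [], []
    for field, condition in CONDITIONS.items():
        value = form.get(field)
        if value is not None:
            clauses.append(condition)
            params.append(value)
    sql = "SELECT gene, gnomad_af, cadd FROM variants"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY cadd DESC"
    return sql, params


def run_query(conn, form):
    """Execute the generated query and return all result rows."""
    sql, params = build_query(form)
    return conn.execute(sql, params).fetchall()
```

Form fields left empty (`None`) simply contribute no condition, so the same function serves both coarse presets and fine-tuned filters.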
Figure 2. Quality control plots following the Peddy (26) approach. The plots are described in the main text.
A selection of the most important databases whose data is integrated into VarFish or that VarFish links out to. A full list can be found in the online manual, available at https://varfish-server.rtfd.io, which contains an updated version.
| Category | | Database |
|---|---|---|
| Integrated | Frequency | gnomAD |
| Integrated | Clinical | ClinVar |
| Integrated | Phenotype | HPO |
| Integrated | Gene description | NCBI Gene & GeneRIF |
| Integrated | Constraint scores | gnomAD pLI/LOEUF |
| Integrated | Conservation | UCSC 100 Vertebrates |
| Link-Out | Gene database | NCBI Entrez, GeneCards, PubMed, PanelApp |
| Link-Out | Variant score/tool | MutationTaster, varSEAKSplicing, VariantValidator |
| Link-Out | Variant database | Beacon Network, VarSome |
| Link-Out | Genome browser | Locus in local IGV, Public UCSC, DGV, ENSEMBL |
Figure 3. User annotation of variants. Users can apply flags and color codes to variants and leave free-text annotations. Flags include ‘bookmark’, ‘reported as candidate’ and ‘final causative variant’ as well as ‘no phenotype linked to gene’. Color codes can be assigned in the categories ‘raw data visual inspection’, ‘gene clinical/phenotype match’ and ‘validation results’, as well as an overall summary color. Also see Section S2 in the Supplemental Material.
Figure 4. The filtering interface. The ‘Quick Presets’ control allows for the coarsest (yet easiest to use) update of filter criteria. The other fields in the top row allow presets for each category, while the tabs in the form below allow fine-tuning of filter and prioritization options where necessary.
Figure 5. Catel-Manzke cohort filtering results (first 15 variants shown) to reproduce the finding that TGDS is the most likely candidate for being the disease gene.