Jeffrey R MacDonald, Robert Ziman, Ryan K C Yuen, Lars Feuk, Stephen W Scherer.
Abstract
Over the past decade, the Database of Genomic Variants (DGV; http://dgv.tcag.ca/) has provided a publicly accessible, comprehensive curated catalogue of structural variation (SV) found in the genomes of control individuals from worldwide populations. Here, we describe updates and new features, which have expanded the utility of DGV for both the basic research and clinical diagnostic communities. The current version of DGV consists of 55 published studies, comprising >2.5 million entries identified in >22,300 genomes. Studies included in DGV are selected from the accessioned data sets in the archival SV databases dbVar (NCBI) and DGVa (EBI), and then further curated for accuracy and validity. The core visualization tool (gbrowse) has been upgraded with additional functions to facilitate data analysis and comparison, and a new query tool has been developed to provide flexible and interactive access to the data. The content from DGV is regularly incorporated into other large-scale genome reference databases and represents a standard data resource for new product and database development, in particular for copy number variation testing in clinical labs. The accurate cataloguing of variants in DGV will continue to enable medical genetics and genome sequencing research.
Year: 2013 PMID: 24174537 PMCID: PMC3965079 DOI: 10.1093/nar/gkt958
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Content of the DGV. Increase in variants reported in DGV since inception, highlighting the recent transition towards NGS-based approaches for variant discovery (numbers based on year of publication).
DGV content
| Database content | Number of entries |
|---|---|
| Studies | 55 |
| Unique samples | 14 316 |
| Variant regions | 202 431 |
| Deletion | 77 268 |
| Duplication | 668 |
| Loss | 64 185 |
| Gain | 24 891 |
| Gain + loss | 3850 |
| Insertion | 24 140 |
| Inversion | 1149 |
| Complex | 4090 |
| Unknown | 2189 |
| Variant calls | 2 393 718 |
| CNV | 2 391 408 |
| Inversion | 2310 |
| Filtered variants | 3 900 253 |
An overall summary of the number of studies and samples reported in the database (July 2013 update, mapped to GRCh37 assembly). Individual variant types are reported highlighting the distribution of SV content in the database.
Overview of novel features incorporated in DGV
| New tools/features | Categories | Description |
|---|---|---|
| Gbrowse | Navigation | Click and drag zoom capabilities on chromosome and/or position bar. |
| | Filter | Option to display only selected entries for DGV structural variant data. |
| | Export | Option to save data from DGV and annotation tracks to a text file for the region, chromosome or whole genome. |
| | Annotations | Additional relevant annotations including ISCA and DECIPHER consented patient data. |
| Query tool | Study | Information on each individual study in DGV. |
| | Variant | Complete list of all structural variants with details on mapping location, samples and the study of origin. |
| | Sample | Details on the identifier, gender, ethnicity and source of samples used in each study. |
| | Method | Description of discovery and validation methods used for each study. |
| | Platform | The name of the platform used in each experiment with links to GEO and Array Express. |
| | Analysis | Individual tools, algorithms and approaches used with associated descriptions. |
| | Export Options | Allows users to save output as csv, excel or PDF file. |
| | Filter Options | Can apply multiple search options across all fields in the database. |
| Variant details page | Allele State | Identifies if variant is heterozygous or homozygous. |
| | Allele Origin | Identifies if a variant is |
| | Copy Number | Reporting the absolute number of copies for a variant call. |
| | Allele length | The length of insertion sequences is listed when available. |
| | Probe number | The number of probes reported for an individual variant call. |
| | Method | Description of discovery and validation methods used for each study. |
| | Analysis | Individual tools, algorithms and approaches used to identify a variant. |
| | Platform | The name of the platform used in each experiment. |
| Accessions | nsv | NCBI structural variant (variant region). |
| | nssv | NCBI ssv (variant call). |
| | esv | EBI structural variant (variant region). |
| | essv | EBI ssv (variant call). |
| | dgv | DGV merged variant; generated if two or more variant regions share >70% reciprocal overlap within a study. |
Improvements in the number of options for navigation and display (gbrowse) are outlined in addition to an overview of the content provided in the relevant tables (query tool). An increased number of attributes have been defined and reported (where applicable) and are outlined with details on the new SV accessions.
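The ">70% reciprocal overlap" criterion listed for dgv accessions can be illustrated with a minimal sketch. This is not DGV's merging code, which is not described here. The function names, the 1-based inclusive coordinate convention, and the pairwise comparison are assumptions made for illustration. The caller is assumed to have already restricted the comparison to two variant regions on the same chromosome within the same study.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Return the smaller of the two overlap fractions for two intervals.

    Assumption: coordinates are 1-based and inclusive, and both intervals
    lie on the same chromosome.
    """
    overlap = min(a_end, b_end) - max(a_start, b_start) + 1
    if overlap <= 0:
        return 0.0  # the intervals do not intersect
    len_a = a_end - a_start + 1
    len_b = b_end - b_start + 1
    # Reciprocal overlap requires the shared span to cover the stated
    # fraction of BOTH intervals, so the smaller fraction decides.
    return min(overlap / len_a, overlap / len_b)


def shares_reciprocal_overlap(a, b, threshold=0.70):
    """True if intervals a and b (start, end) exceed the threshold."""
    return reciprocal_overlap(a[0], a[1], b[0], b[1]) > threshold


# Two hypothetical 100 bp regions sharing 80 bp: 80% of each is covered.
print(shares_reciprocal_overlap((1, 100), (21, 120)))
# A small region fully inside a much larger one covers only 10% of the
# larger region, so the pair fails the reciprocal test.
print(shares_reciprocal_overlap((1, 100), (1, 1000)))
```

Taking the minimum of the two fractions is what makes the measure reciprocal: a small variant nested inside a much larger one does not qualify, even though 100% of the small variant is covered.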
Figure 2. Functionality and navigation options for accessing entries in DGV. (A) An example of search options available in the DGV query tool, which identify sample-level deletions in study nstd65 mapped to the GRCh37 assembly. (B) Links for each variant in the query tool result allow for navigation to the variant details page, which includes a summary of all available attributes. (C) Links from the variant details page provide access to the genome browser to allow for evaluation of selected variants in their respective genomic region.