Daniel Chubb, Peter Broderick, Sara E Dobbins, Richard S Houlston.
Abstract
The advent of high-throughput sequencing has accelerated our ability to discover genes predisposing to disease and is transforming clinical genomic sequencing. In both contexts, knowledge of the spectrum and frequency of genetic variation in the general population and in disease cohorts is vital to the interpretation of sequencing data. While population-level data are becoming increasingly available from publicly accessible sources, as exemplified by the Exome Aggregation Consortium (ExAC), the availability of large-scale disease-specific frequency information is limited. Such disease-specific data are of particular importance for contextualising findings from clinical mutation screens and small gene discovery projects. This is especially true for cancer, which is typified by a number of hereditary predisposition syndromes. Although mutation frequencies in tumours are available from resources such as COSMIC and The Cancer Genome Atlas (TCGA), a similar facility for germline variation is lacking. Here we present the Cancer Variation Resource (CanVar), an online database developed using the ExAC framework to provide open access to germline variant frequency data from the sequenced exomes of cancer patients. In its first release, CanVar catalogues the exomes of 1,006 familial early-onset colorectal cancer (CRC) patients sequenced at The Institute of Cancer Research. It is anticipated that CanVar will host data for additional cancers, providing a resource for others studying cancer predisposition and an example of how the research community can utilise the ExAC framework to share sequencing data.
Keywords: CanVar; ExAC; Germline; NGS; cancer; colorectal cancer; database; exome sequencing
Year: 2016 PMID: 28105316 PMCID: PMC5200944 DOI: 10.12688/f1000research.10058.1
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Figure 1. The CanVar front page features a search bar, example queries and additional news and updates.
Figure 2. The Gene page is divided into three parts.
A) metadata and external links, including the ExAC page for a given gene; B) coverage plot and exon/intron structure; C) table containing annotations and variant frequencies for each variant identified within a gene. The ExAC_AF column refers to the frequency from non-TCGA ExAC. The variant table has a menu (C.1), which is used to select which cancer frequencies are displayed. Currently only NSCCG CRC samples are available.
Figure 3. The Variant page can be divided into five parts.
A) call rate of a given variant; B) metadata and external links, including the equivalent ExAC page; C) quality metrics; D) transcript annotations; E) frequency information in different studies.
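To illustrate how the frequency information shown on the Gene and Variant pages might be used, the following minimal sketch compares the allele frequency of a variant in a cancer cohort with its frequency in a reference population such as non-TCGA ExAC. It is not part of CanVar: the `AlleleCounts` class, the `frequency_ratio` function and all counts are hypothetical and chosen only for illustration. It assumes that allele frequency is calculated as allele count divided by allele number, the number of called alleles at the site.

```python
from dataclasses import dataclass


@dataclass
class AlleleCounts:
    """Allele count (AC) and allele number (AN) for one variant in one cohort."""

    ac: int  # alternate alleles observed
    an: int  # called alleles at the site (up to 2 per diploid sample)

    @property
    def af(self) -> float:
        """Allele frequency, AC / AN; 0.0 when no alleles were called."""
        return self.ac / self.an if self.an else 0.0


def frequency_ratio(cases: AlleleCounts, reference: AlleleCounts) -> float:
    """Ratio of the case AF to the reference AF.

    Returns infinity when the variant is seen in cases but absent from the
    reference, and NaN when it is absent from both.
    """
    if reference.af == 0.0:
        return float("inf") if cases.af > 0.0 else float("nan")
    return cases.af / reference.af


# Hypothetical counts: 1,006 fully called diploid samples give AN = 2,012.
crc = AlleleCounts(ac=4, an=2012)
exac = AlleleCounts(ac=10, an=100_000)

print(f"CRC AF:  {crc.af:.5f}")
print(f"ExAC AF: {exac.af:.5f}")
print(f"Ratio:   {frequency_ratio(crc, exac):.1f}")
```

A ratio well above 1 only flags a variant for closer inspection. Formal evidence of association would require an appropriate statistical test and attention to differences in coverage, call rate and ancestry between the cohorts.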