Abstract
Copy number variation (CNV) is one of the most prevalent genetic variations in the genome, leading to an abnormal number of copies of moderate to large genomic regions. High-throughput technologies such as next-generation sequencing often identify thousands of CNVs involved in biological or pathological processes. Despite the growing demand to filter and classify CNVs by factors such as population frequency, biological features, and function, no online web server for CNV annotation has been made available to the research community. Here, we present CNVannotator, a web server that accepts an input set of human genomic positions in a user-friendly tabular format. CNVannotator overlaps the input coordinates with various functional features, including 356,817 reported common CNVs, 181,261 disease CNVs, and 140,342 SNPs from genome-wide association studies. In addition, CNVannotator incorporates 2,211,468 genomic features, including ENCODE regulatory elements, cytobands, segmental duplications, genome fragile sites, pseudogenes, promoters, enhancers, CpG islands, and methylation sites. For users in the cancer research community, CNVannotator can apply various filters to retrieve the subgroup of CNVs that fall within hundreds of tumor suppressor genes and oncogenes. In total, 5,277,234 unique genomic coordinates with functional features are available to generate output in a plain text format that is free to download. In summary, we provide a comprehensive web resource for human CNVs. The annotated results along with the server can be accessed at http://bioinfo.mc.vanderbilt.edu/CNVannotator/.
Year: 2013 PMID: 24244640 PMCID: PMC3828214 DOI: 10.1371/journal.pone.0080170
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
All annotations in the CNVannotator web server.
| Data source | Number of genomic coordinates | Source and reference |
| --- | --- | --- |
| Common CNVs | 356,817 | Common CNVs from DGV database |
| Disease CNVs | 181,261 | Disease CNVs from CNVD database |
| dbVar | 2,716,881 | Genomic structural variants in dbVAR |
| GWASdb | 137,111 | Human genetic variants by GWAS |
| GWAS Catalog | 6381 | Etiologic and functional variants |
| GAD | 3057 | Genetic variants by association studies |
| Gene fusion | 1198/1103 | Experimentally validated gene fusion events from ChimerDB |
| NGS Catalog | 1071 | Genetic variants from NGS-based studies in human |
| microRNA target | 52,920 | Targeting gene for all human miRNAs |
| Coding gene | 30,770 | Protein-coding RefSeq genes |
| Long non-coding RNA | 21,033 | Long non-coding genes (UCSC browser) |
| Other non-coding RNA | 1337 | Non-coding genes from UCSC browser (excluding long non-coding RNAs) |
| ENCODE regulomeDB | 1,880,556 | Genomic functional elements from ENCODE data |
| Segmental duplication | 40,832 | Global analysis result of human segmental duplications |
| Promoter | 29,119 | 500 bp upstream from the transcription start sites using UCSC data |
| CpG island | 28,691 | CpG island data from UCSC browser |
| Methylation | 19,754 | Human disease methylation sites from DiseaseMeth database |
| Pseudogene | 11,983 | Pseudogene data from UCSC browser |
| Enhancer | 1478 | Enhancer data from UCSC browser |
| Cytoband | 862 | Cytoband data from UCSC browser |
| Fragile site | 69 | Human genomic fragile sites from Entrez gene database |
| COSMIC | 125,753 | Somatic mutations in cancer |
| Tumor suppressor | 716 | Coding and non-coding tumor suppressor genes from TSGene database |
| Oncogene | 263 | Coding oncogenes integrated from UniProt and TAG databases |
The two numbers in the gene fusion row represent the unique genomic regions for the fusion gene pairs.
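The promoter row in the table above defines a promoter as the 500 bp upstream of the transcription start site. As an illustration only, the following Python sketch derives such an interval from a transcript record. It assumes 0-based, half-open coordinates with `tx_start < tx_end` and a `+`/`-` strand field, as in UCSC-style gene tables; the function name `promoter_interval` and its parameters are hypothetical and are not taken from CNVannotator.

```python
def promoter_interval(chrom, tx_start, tx_end, strand, upstream=500):
    """Return (chrom, start, end) for the region upstream of the TSS.

    Assumes 0-based, half-open coordinates [tx_start, tx_end).
    On the '+' strand the TSS is at tx_start; on the '-' strand the
    transcript is read right to left, so the TSS is at the tx_end side.
    """
    if strand == "+":
        # Upstream means smaller coordinates; clip at the chromosome start.
        return chrom, max(0, tx_start - upstream), tx_start
    if strand == "-":
        # Upstream means larger coordinates. The chromosome length is not
        # known in this sketch, so the end is not clipped.
        return chrom, tx_end, tx_end + upstream
    raise ValueError(f"unknown strand: {strand!r}")


plus_promoter = promoter_interval("chr1", 1000, 2000, "+")
minus_promoter = promoter_interval("chr1", 1000, 2000, "-")
clipped_promoter = promoter_interval("chr1", 100, 900, "+")
```

The strand check matters because a fixed "start minus 500" rule would place the promoter inside or downstream of genes on the reverse strand.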
Figure 1. The input, annotation categories, and output of CNVannotator.
Figure 2. The layout of CNVannotator gene-based query viewer.
The single gene and multiple gene query interfaces are shown. Both queries require an input of the official gene symbol(s).
Figure 3. The genomic region-based query viewer in CNVannotator.
(A) Access to the analytic tools for the functional annotation modules. (B) The web interface for entering a list of CNVs to annotate. In the drop-down menu, users can choose the most relevant annotation option.
Figure 4. An example of the CNVannotator region-based search result layout.
After a CNV list is successfully uploaded, the selected annotations are overlapped with the input CNVs, and the results are presented as hyperlinks to the UCSC Genome Browser. Additionally, a tabular text file is available to download for further filtering and classification.
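The overlap step described above can be illustrated with a small Python sketch. The server's actual implementation is not described in this record, so everything here is an assumption: regions are treated as 0-based, half-open intervals, the `Region` type and the `overlaps` and `annotate` functions are hypothetical names, and the example coordinates are invented for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # assumed 0-based, half-open [start, end)
    end: int
    label: str


def overlaps(a: Region, b: Region) -> bool:
    """True if the two regions share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def annotate(cnvs, features):
    """Map each CNV label to the labels of the features it overlaps."""
    by_chrom = defaultdict(list)
    for feature in features:
        by_chrom[feature.chrom].append(feature)
    return {
        cnv.label: [f.label for f in by_chrom.get(cnv.chrom, []) if overlaps(cnv, f)]
        for cnv in cnvs
    }


# Invented example data, not taken from CNVannotator.
cnvs = [Region("chr1", 1000, 5000, "cnv_a"), Region("chr2", 200, 300, "cnv_b")]
features = [
    Region("chr1", 4500, 6000, "promoter_x"),
    Region("chr1", 5000, 7000, "cpg_y"),  # only touches cnv_a, no shared base
    Region("chr2", 100, 250, "enhancer_z"),
]
hits = annotate(cnvs, features)

# Emit a tab-separated table, one CNV per line, for downstream filtering.
for name, labels in hits.items():
    print(name, ",".join(labels) or ".", sep="\t")
```

This linear scan per chromosome is enough for a short CNV list; for millions of features, an interval tree or a sorted sweep would be the usual choice.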
The annotation results for the top ten novel CNVs from microsatellite stable hereditary nonpolyposis colorectal cancer samples using the CNVannotator web server.
| Data source | Number of annotations |
| --- | --- |
| Structural variants from dbVAR | 2008 |
| Disease CNVs | 1432 |
| Common CNVs | 141 |
| Gene fusion events | 104 |
| Cancer mutations | 88 |
| Significant SNPs from GWASdb | 82 |
| The microRNA target genes | 28 |
| Known protein-coding genes | 27 |
| Methylation sites in promoter region | 19 |
| Segmental duplication regions | 17 |
| Promoter regions | 16 |
| Cytobands | 10 |
| CpG islands | 8 |
| Long non-coding RNAs | 6 |
| Pseudogenes | 4 |
| Tumor suppressor genes | 3 |
| Fragile sites | 3 |
| Significant SNPs from GWAS catalog | 2 |
| Oncogenes | 2 |