Simon A Forbes, Nidhi Bindal, Sally Bamford, Charlotte Cole, Chai Yin Kok, David Beare, Mingming Jia, Rebecca Shepherd, Kenric Leung, Andrew Menzies, Jon W Teague, Peter J Campbell, Michael R Stratton, P Andrew Futreal.
Abstract
COSMIC (http://www.sanger.ac.uk/cosmic) curates comprehensive information on somatic mutations in human cancer. Release v48 (July 2010) describes over 136,000 coding mutations in almost 542,000 tumour samples; of the 18,490 genes documented, 4803 (26%) have one or more mutations. Full scientific literature curations are available on 83 major cancer genes and 49 fusion gene pairs (19 new cancer genes and 30 new fusion pairs this year) and this number is continually increasing. Key amongst these is TP53, now available through a collaboration with the IARC p53 database. In addition to data from the Cancer Genome Project (CGP) at the Sanger Institute, UK, and The Cancer Genome Atlas project (TCGA), large systematic screens are also now curated. Major website upgrades now make these data much more mineable, with many new selection filters and graphics. A Biomart is now available allowing more automated data mining and integration with other biological databases. Annotation of genomic features has become a significant focus; COSMIC has begun curating full-genome resequencing experiments, developing new web pages, export formats and graphics styles. With all genomic information recently updated to GRCh37, COSMIC integrates many diverse types of mutation information and is making much closer links with Ensembl and other data resources.
Year: 2010 PMID: 20952405 PMCID: PMC3013785 DOI: 10.1093/nar/gkq929
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Total contents in v48 of the COSMIC database, July 2010 release
| Curated data type | Curated data count |
|---|---|
| Experiments | 2,760,220 |
| Tumours | 541,928 |
| Mutations | 136,326 |
| References | 10,383 |
| Genes | 18,490 |
| Fusions | 4946 |
| Structural variants | 2307 |
| Whole cancer genomes | 29 |
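The 26% figure quoted in the abstract can be checked against the gene count in the table. The snippet below is only an arithmetic sketch; the variable and function names are illustrative and are not part of COSMIC.

```python
# Counts taken from the abstract and the v48 contents table.
genes_documented = 18_490
genes_mutated = 4_803  # genes with one or more mutations (abstract)


def percent(part: int, whole: int) -> float:
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


mutated_gene_pct = percent(genes_mutated, genes_documented)
print(f"Genes with at least one mutation: {mutated_gene_pct:.1f}%")
```

Rounded to the nearest whole number, this agrees with the 26% stated in the abstract.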
Figure 1. Circos diagram summarizing the full somatic mutation content of cell line NCI-H209. Concentric rings summarize the data on different types of mutation. From the inside out, the core displays the structural rearrangements; intrachromosomal rearrangements are in green, interchromosomal in purple. The next ring out shows the chromosomal copy number in histogram form, with inner red patches indicating regions of LOH. Further out, several rings of single base coding substitutions are shown (black tiles show splice site mutations, red stop-gained, purple non-synonymous and grey synonymous changes). The inner dark orange and outer light orange histograms represent non-coding mutations, showing the relative frequencies of homozygous and heterozygous mutations, respectively. In the final ring before the chromosome indicators, indels are shown in green; light green represents insertions and dark green deletions.
Figure 2. The gene histogram page for TP53. The histogram shows relative frequencies of mutations (y-axis) across the CDS of the gene (x-axis). Underneath the x-axis scale bar are complex replacement mutations, followed by simple deletions (blue triangles) and insertions (red triangles). Under this, zoom options are available. On the left, the new specialization filters are shown, offering many query options.
Figure 3. Pie charts (here showing the TP53 gene) are increasingly used for summarization of complex spectrum data in COSMIC. Two are currently live with many more forthcoming. The top graph (a) shows the breakdown of all observed mutations by type, and the lower (b) shows the breakdown of mutated samples by source. The total number differs slightly due to some samples having more than one mutation, thus being counted once in (b) but twice or more in (a).
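The difference between the two totals in Figure 3 can be illustrated with a small sketch. The records, sample identifiers and category labels below are invented for illustration and are not COSMIC data; the point is only that chart (a) counts every mutation while chart (b) counts each sample once.

```python
from collections import Counter

# Hypothetical records: (sample_id, sample_source, mutation_type).
records = [
    ("S1", "cell line", "missense"),
    ("S1", "cell line", "nonsense"),  # second mutation in the same sample
    ("S2", "primary tumour", "missense"),
    ("S3", "primary tumour", "deletion"),
]

# Chart (a): every observed mutation is counted, grouped by type.
mutations_by_type = Counter(mut_type for _, _, mut_type in records)

# Chart (b): each mutated sample is counted once, grouped by source.
source_of_sample = {sample_id: source for sample_id, source, _ in records}
samples_by_source = Counter(source_of_sample.values())

total_mutations = sum(mutations_by_type.values())
total_samples = sum(samples_by_source.values())
print(total_mutations, total_samples)
```

Here sample S1 carries two mutations, so the mutation total exceeds the sample total by one.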
Figure 4. Mutation spectrum histogram for whole-genome-resequencing sample COLO-829, displaying the considerable overrepresentation of C:G>T:A events in its coding mutation repertoire, reflecting the characteristic signature of DNA damage due to ultraviolet light exposure common in malignant melanoma.
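A spectrum such as the one in Figure 4 groups single-base substitutions into base-pair classes like C:G>T:A, so that a G>A change on one strand and a C>T change on the other fall into the same class. The following is a minimal sketch of that grouping, assuming the common convention of describing each change from the pyrimidine (C or T) of the reference base pair. The function name and the example calls are hypothetical and do not represent COSMIC's own code or the COLO-829 data.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref: str, alt: str) -> str:
    """Collapse a single-base substitution into one of six base-pair classes."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in COMPLEMENT or alt not in COMPLEMENT or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in ("A", "G"):
        # Purine reference: describe the change on the opposite strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}:{COMPLEMENT[ref]}>{alt}:{COMPLEMENT[alt]}"


# Hypothetical (reference, alternate) base calls, for illustration only.
toy_calls = [("C", "T"), ("G", "A"), ("C", "T"), ("T", "C"), ("A", "C")]
spectrum = Counter(substitution_class(ref, alt) for ref, alt in toy_calls)
print(spectrum.most_common())
```

In this toy input, three of the five calls collapse into the C:G>T:A class, the class that Figure 4 reports as overrepresented in COLO-829.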