Claire M Simpson, Bin Zhang, Peter V Hornbeck, Florian Gnad.
Abstract
BACKGROUND: Perturbed posttranslational modification (PTM) landscapes commonly cause pathological phenotypes. The Cancer Genome Atlas (TCGA) project profiles thousands of tumors, allowing the identification of spontaneous cancer-driving mutations, while UniProt and dbSNP manage genetic disease-associated variants in the human population. PhosphoSitePlus (PSP) is the most comprehensive resource for studying experimentally observed PTM sites and the only repository with daily updates on functional annotations for many of these sites. To elucidate altered PTM landscapes on a large scale, we integrated disease-associated mutations from TCGA, UniProt, and dbSNP with PTM sites from PhosphoSitePlus. We characterized each dataset individually, compared somatic with germline mutations, and analyzed PTM sites intersecting directly with disease variants. To assess the impact of mutations in the flanking regions of phosphosites, we developed DeltaScansite, a pipeline that compares Scansite predictions on wild-type versus mutated sequences. Disease mutations are also visualized in PhosphoSitePlus.
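The wild-type versus mutant comparison described for DeltaScansite can be illustrated with a minimal sketch. This is not the authors' pipeline and it does not call Scansite: the function names (`apply_missense`, `flank`, `toy_score`, `delta_score`), the 7-residue half-window, and the toy scoring rule are all assumptions introduced here for illustration. In practice, `toy_score` would be replaced by a real motif scorer.

```python
from typing import Callable


def apply_missense(seq: str, pos: int, ref: str, alt: str) -> str:
    """Return seq with the residue at 1-based position pos changed from ref to alt."""
    if seq[pos - 1] != ref:
        raise ValueError(f"expected {ref} at position {pos}, found {seq[pos - 1]}")
    return seq[:pos - 1] + alt + seq[pos:]


def flank(seq: str, site: int, half_width: int = 7) -> str:
    """Return the window centered on the 1-based site, padded with '-' at the termini."""
    padded = "-" * half_width + seq + "-" * half_width
    start = site - 1  # the site sits at index (site - 1 + half_width) in padded
    return padded[start:start + 2 * half_width + 1]


def toy_score(window: str) -> float:
    """Hypothetical placeholder scorer (NOT Scansite): rewards P at +1 and R/K at -3."""
    center = len(window) // 2
    score = 0.0
    if window[center + 1] == "P":
        score += 1.0
    if window[center - 3] in "RK":
        score += 1.0
    return score


def delta_score(seq: str, site: int, mut_pos: int, ref: str, alt: str,
                scorer: Callable[[str], float] = toy_score,
                half_width: int = 7) -> float:
    """Score of the mutated flanking window minus score of the wild-type window."""
    if mut_pos == site:
        raise ValueError("mutation hits the modified residue itself, not its flank")
    if abs(mut_pos - site) > half_width:
        return 0.0  # mutation lies outside the window, so the score cannot change
    wild_type = flank(seq, site, half_width)
    mutant = flank(apply_missense(seq, mut_pos, ref, alt), site, half_width)
    return scorer(mutant) - scorer(wild_type)


# Made-up sequence with a serine at position 8, R at -3 and P at +1.
example_seq = "AAAARAASPAAAAAAA"
print(delta_score(example_seq, site=8, mut_pos=9, ref="P", alt="L"))
```

A negative delta means the mutation lowers the predicted motif score for the site, and a positive delta means it raises it. Mutations on the modified residue itself are rejected because they belong to the direct-intersection analysis rather than the flanking-region analysis.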
Keywords: Cancer; Disease; PhosphoSitePlus; Posttranslational modification; Signal transduction; TCGA; dbSNP
Year: 2019 PMID: 31345222 PMCID: PMC6657027 DOI: 10.1186/s12920-019-0543-2
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 Overview of PhosphoSitePlus and TCGA datasets. (a) Total number of human PTM sites by PTM type in PhosphoSitePlus. (b) Bar plot showing total number of tumors and violin plot showing the distribution of the number of missense mutations per tumor for each TCGA cancer type
Fig. 2 Correlation of hotspot mutation score (delta S) and the number of mutations. (a) The logarithm of the number of mutations found in a protein normalized by its sequence length plotted against its hotspot mutation score. (b) The logarithm of the number of times the most frequent alteration occurs in a protein plotted against its hotspot score. The associated density is in blue
Fig. 3 Manhattan plot of TCGA alterations. Each alteration in the TCGA dataset is plotted by its frequency. Light green indicates a low hotspot mutation score for the associated protein, while dark blue indicates a high hotspot mutation score. The dashed line indicates the 3-tumor cutoff used to define hotspot mutations
Fig. 4 PhosphoSitePlus lollipop plots of (a) p53 and (b) CTNNB1. Circles indicate PTM sites with a height reflecting the number of references describing the site. Squares indicate hotspot mutations with a height reflecting the number of TCGA tumors containing mutations on that residue
Fig. 5 Comparison of alterations found on PTM sites and all alterations. (a) Bar plot comparing the number of observed (orange) to expected (green) mutated PTM sites in the SNP dataset. (b) Heatmap of the frequencies of alteration types within the germline datasets on PTM sites and their unmodified counterparts, normalized by row. Mutation residues are annotated by residue type
Fig. 6 Comparison of Delta-ScanSite and MIMP scoring for (a) TCGA hotspot mutations and (b) disease SNPs in flanking regions of PTM sites. The difference between predicted mutated and wild-type kinase binding scores for ScanSite is plotted against the difference in scores calculated by MIMP. Dots are colored according to the group of the kinase