Di Peng1, Huiqin Li1, Bosu Hu1, Hongwan Zhang2, Li Chen1, Shaofeng Lin3, Zhixiang Zuo2, Yu Xue3, Jian Ren1,2, Yubin Xie1.
Abstract
High-throughput sequencing technologies have identified millions of genetic mutations in multiple human diseases. However, interpreting the pathogenesis of these mutations and discovering the driver genes that dominate disease progression remain major challenges. Combining functional features such as protein post-translational modification (PTM) with genetic mutations is an effective way to predict such alterations. Here, we present PTMsnp, a web server that implements a Bayesian hierarchical model to identify driver genetic mutations targeting PTM sites. PTMsnp accepts genetic mutations in standard variant call format (VCF) or tabular format as input and outputs several interactive charts of mutations that potentially affect PTMs. Additional functional annotations are performed to evaluate the impact of PTM-related mutations on protein structure and function, as well as to classify variants relevant to Mendelian disease. A total of 411,574 modification sites from 33 different types of PTMs and 1,776,848 somatic mutations from TCGA across 33 different cancer types are integrated into the web server, enabling identification of candidate cancer driver genes based on PTM. Applications of PTMsnp to the cancer cohorts and a GWAS dataset of type 2 diabetes identified a set of potential drivers together with several known disease-related genes, indicating its reliability in distinguishing disease-related mutations and providing potential molecular targets for new therapeutic strategies. PTMsnp is freely available at: http://ptmsnp.renlab.org.
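To make the idea of a "PTM-related mutation" concrete, the following is a minimal sketch, not the PTMsnp implementation. It only flags protein-level mutations that fall within a flanking window of a known modification site; it does not reproduce the Bayesian hierarchical model. The function names, the tuple layout of the tabular input, the `flank` value of 7 residues, and the toy proteins `PROT1` and `PROT2` are all assumptions made for illustration.

```python
from collections import defaultdict


def index_ptm_sites(ptm_sites):
    """Group (protein, position, ptm_type) records by protein for fast lookup."""
    index = defaultdict(list)
    for protein, position, ptm_type in ptm_sites:
        index[protein].append((position, ptm_type))
    return index


def ptm_related_mutations(mutations, ptm_sites, flank=7):
    """Return mutations lying within `flank` residues of a PTM site.

    `mutations` holds (protein, position, ref_aa, alt_aa) tuples.
    `flank` is a hypothetical window size, not a value taken from PTMsnp.
    """
    index = index_ptm_sites(ptm_sites)
    hits = []
    for protein, position, ref_aa, alt_aa in mutations:
        for site_pos, ptm_type in index.get(protein, []):
            distance = abs(position - site_pos)
            if distance <= flank:
                hits.append({
                    "protein": protein,
                    "mutation": f"{ref_aa}{position}{alt_aa}",
                    "ptm_site": site_pos,
                    "ptm_type": ptm_type,
                    "distance": distance,
                })
    return hits


# Toy data with made-up proteins and positions (illustrative only).
ptm_sites = [
    ("PROT1", 100, "phosphorylation"),
    ("PROT2", 50, "acetylation"),
]
mutations = [
    ("PROT1", 103, "V", "E"),  # 3 residues from the PROT1 site
    ("PROT1", 200, "A", "T"),  # far from any site, not reported
    ("PROT2", 50, "K", "R"),   # directly on the modified residue
]
hits = ptm_related_mutations(mutations, ptm_sites, flank=7)
for hit in hits:
    print(hit)
```

A statistical model such as the one PTMsnp describes would then assess whether mutations of this kind are enriched in a protein beyond what is expected by chance; that step is outside the scope of this sketch.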
Keywords: Bayesian hierarchical model; disease; driver genes; genetic mutations; protein post-translational modification
Year: 2020 PMID: 33240890 PMCID: PMC7683509 DOI: 10.3389/fcell.2020.593661
Source DB: PubMed Journal: Front Cell Dev Biol ISSN: 2296-634X
FIGURE 1. A schematic workflow of the PTMsnp web server. (A) Data input section. (B) Six options set for the PTMsnp program. (C) Task records to monitor the running task and view the results. The result page consists of five parts: (D) a summary table of significantly PTM-mutated proteins; (E) statistical graphs of significant PTM-related mutations and mutated PTM types in identified proteins; (F) the mutation sites on the protein sequence and its known functional domains; (G) GO annotation of identified proteins; and (H) KEGG pathway enrichment of identified proteins.
FIGURE 2. Significantly mutated proteins identified in TCGA cancer cohorts for five PTM types. (A) Number of significantly PTM-mutated genes across five PTM types identified in different cancers. (B) Schematic diagram of mutations and protein phosphorylation regions within the BRAF gene in five cancer types. The upper panel shows the number of mutated samples per position. The blue and yellow dashed boxes represent the P-loop and activation loop on the BRAF protein, respectively. The lower panel shows the mutations and phosphorylation within the 594–606 region of the BRAF protein in SKCM. Positions 596–600 are the activation segment. The amino acid sequence is shown above the position coordinates. Phosphorylated amino acids are marked with a solid yellow circle. The altered amino acid after mutation is shown above the original sequence. V600 has three different mutation forms, marked with different colors. (C) The enriched pathways of PTM-mutated proteins in SKCM. (D) The enriched GO terms obtained from the identified PTM-mutated proteins in SKCM.
FIGURE 3. The top 30 PTM-mutated genes identified in more than 7 cancer types, among which the known cancer genes are indicated in red.
FIGURE 4. Significantly PTM-mutated proteins identified from a GWAS dataset of type 2 diabetes (T2D) samples with 1,916 individuals. (A) The top 30 genes ranked by the number of significant PTM-related mutations. Bar height shows the number of samples harboring mutations in each PTM type. The red and white gradient bar below represents the FDR q-value. (B) The proportion of PTM-related mutations of each modification type in identified proteins. (C) SLC16A1 has the most frequent PTM-related mutations, affecting three types of modifications. The upper panel shows the number of mutated samples per position. Protein domains of SLC16A1 are shown as green regions along the sequence. The modified regions of three PTM types on the SLC16A1 protein are shown below. The modified position where the mutation has occurred is indicated by a red arrow. (D) Mutation M1808I was identified as significantly altering the phosphorylation status of WNK1. Protein domains of WNK1 are shown in blue and orange (PK, protein kinase domain; OSR1-C, oxidative-stress-responsive kinase 1 C-terminal domain). The modified position where the mutation has occurred is indicated by a red arrow.