Marta J Strumillo, Michaela Oplová, Cristina Viéitez, David Ochoa, Mohammed Shahraz, Bede P Busby, Richelle Sopko, Romain A Studer, Norbert Perrimon, Vikram G Panse, Pedro Beltrao.
Abstract
Protein phosphorylation is the best characterized post-translational modification and regulates almost all cellular processes through diverse mechanisms such as changing protein conformations, interactions, and localization. While the inventory of phosphorylation sites across different species has rapidly expanded, their functional roles remain poorly investigated. Here, we combine 537,321 phosphosites from 40 eukaryotic species to identify highly conserved phosphorylation hotspot regions within domain families. Mapping these regions onto structural data reveals that they are often found at interfaces and near catalytic residues, and that they tend to harbor functionally important phosphosites. Notably, functional studies of a phospho-deficient mutant in the C-terminal hotspot region within the ribosomal S11 domain of the yeast ribosomal protein uS11 show impaired growth and defective cytoplasmic 20S pre-rRNA processing at 16 °C and 20 °C. Altogether, our study identifies phosphorylation hotspots for 162 protein domains, suggestive of an ancient role in the control of diverse eukaryotic domain families.
Year: 2019 PMID: 31036831 PMCID: PMC6488607 DOI: 10.1038/s41467-019-09952-x
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Prediction of phosphorylation hotspot regions for eukaryotic domain families. a Phylogenetic tree of the species from which phosphorylation data were obtained. The numbers in the left column correspond to the phosphosites obtained per species, and those in the right column to the phosphosites found within Pfam domains. b Hotspot regions are defined as those having a higher than randomly expected number of phosphosites. A rolling window is used to count the observed average number of phosphosites in the alignment (black line), and a background expectation is calculated from random sampling (gray line, with a gray band for the standard deviation). A p-value is calculated for the enrichment of phosphorylation at each position and projected onto structural models. c The capacity to discriminate phosphosites of known function from other phosphosites was tested using a ROC curve. We compared the discrimination power of the hotspot p-value (blue line for ST and yellow line for Y). d Enrichment over random of human phosphosites with known functions for residues predicted as a hotspot region when compared with the rest of the domain (blue for ST and yellow for Y; p-values from Fisher’s exact test)
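The rolling-window comparison in panel b can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the window size, the number of permutations, and the choice to build the background by shuffling per-column counts and to report an empirical p-value are all assumptions made for illustration.

```python
import random

def rolling_mean(counts, window):
    """Average phosphosite count over each full window along the alignment."""
    return [sum(counts[i:i + window]) / window
            for i in range(len(counts) - window + 1)]

def hotspot_pvalues(counts, window=5, n_perm=1000, seed=0):
    """Empirical per-window p-value (hypothetical helper).

    counts: phosphosites observed at each alignment column.
    The background is approximated by shuffling the column counts, which
    is an assumption and may differ from the sampling used in the paper.
    """
    rng = random.Random(seed)
    observed = rolling_mean(counts, window)
    exceed = [0] * len(observed)
    shuffled = list(counts)
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        for i, mean in enumerate(rolling_mean(shuffled, window)):
            if mean >= observed[i]:
                exceed[i] += 1
    # The add-one correction keeps every p-value strictly above zero.
    pvalues = [(e + 1) / (n_perm + 1) for e in exceed]
    return observed, pvalues

# Toy alignment: a cluster of phosphosites in the middle of 45 columns.
toy_counts = [0] * 20 + [5, 6, 7, 6, 5] + [0] * 20
toy_observed, toy_pvalues = hotspot_pvalues(toy_counts, window=5, n_perm=200)
```

In the toy input the window covering the central cluster receives the smallest p-value, whereas windows with no phosphosites receive a p-value of 1.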
Fig. 2 Phosphorylation hotspots overlapping with human phosphosites of known function. For 4 protein domain families we show the enrichment over random of protein phosphorylation along the domain sequence. The average number of phosphosites observed per rolling window is plotted as a solid black line (observed). The background level of expected phosphorylation calculated from random sampling is shown as a gray line, with standard deviations as a gray band. The blue line represents the negative logarithm of the p-value at each position (right y axis). A horizontal red line indicates a cut-off of the Bonferroni corrected p-value of 0.01. Positions with a −log(p-value) above this cut-off and an average number of phosphosites per window higher than 2 are considered putative regulatory regions and are highlighted under a vertical yellow bar. Red circles indicate human phosphosite positions with known regulatory function. In the structural representations the predicted hotspot regions are highlighted in yellow
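The two criteria stated in this caption (a Bonferroni corrected p-value cut-off and more than 2 phosphosites per window on average) can be combined as in the sketch below. The function names are hypothetical, and it is an assumption that the Bonferroni correction divides the significance level by the number of positions tested in the domain.

```python
def call_hotspot_positions(window_means, pvalues, alpha=0.01, min_mean=2.0):
    """Return indices passing both criteria from the caption.

    Assumption: the Bonferroni cut-off is alpha divided by the number of
    tested positions. Comparing p < cutoff is equivalent to comparing
    -log(p) against -log(cutoff), whatever the base of the logarithm.
    """
    cutoff = alpha / len(pvalues)
    return [i for i, (mean, p) in enumerate(zip(window_means, pvalues))
            if p < cutoff and mean > min_mean]

def merge_runs(indices):
    """Merge consecutive indices into inclusive (start, end) regions."""
    regions = []
    for i in indices:
        if regions and i == regions[-1][1] + 1:
            regions[-1] = (regions[-1][0], i)
        else:
            regions.append((i, i))
    return regions

# Toy values: only positions 1 and 2 pass both criteria.
toy_means = [0.5, 3.0, 3.5, 1.0, 4.0]
toy_pvalues = [0.5, 1e-5, 1e-4, 1e-6, 0.01]
toy_regions = merge_runs(call_hotspot_positions(toy_means, toy_pvalues))
```

Position 3 is rejected because its window average is too low despite a small p-value, and position 4 because its p-value does not pass the corrected cut-off.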
Fig. 3 Structural features of phosphorylation hotspots. a Enrichment over random of structural features comparing hotspots with other residues within the same domains. The tested features include water accessibility (residues with >20% RSA); protein disorder as predicted by DISOPRED; catalytic residues; residues within 5 amino acids in sequence of a catalytic residue; residues within 5 Å of a catalytic residue; and residues at an interface based on 3DID. For each feature we report the −log(p-value), with the p-value calculated using Fisher’s exact test. b Examples of hotspot regions at interfaces where the hotspot region (red) from a domain (gray) has been observed contacting many other types of domains (other colors) in empirical structures
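As an illustration of the enrichment test named in panel a, the sketch below computes a one-sided Fisher's exact test on a 2x2 table of hotspot versus non-hotspot residues, with and without a structural feature. Whether the authors used a one-sided or two-sided test is not stated in the caption, so the one-sided form here is an assumption, as are the function names and the toy table.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the table [[a, b], [c, d]].

    a: hotspot residues with the feature     b: hotspot residues without
    c: other residues with the feature       d: other residues without
    Returns P(X >= a) under the hypergeometric null with fixed margins.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    numerator = sum(comb(row1, k) * comb(n - row1, col1 - k)
                    for k in range(a, upper + 1))
    return numerator / comb(n, col1)

def odds_ratio(a, b, c, d):
    """Sample odds ratio; infinite when an off-diagonal cell is zero."""
    return (a * d) / (b * c) if b * c else float("inf")

# Toy table: 3 of 4 hotspot residues and 1 of 4 other residues carry the feature.
toy_p = fisher_exact_greater(3, 1, 1, 3)
toy_or = odds_ratio(3, 1, 1, 3)
```

For the toy table the p-value is 17/70 and the odds ratio is 9, so a table this small would not be called enriched.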
Fig. 4 Examples of putative regulatory hotspots at or near catalytic residues. The average number of phosphosites observed per rolling window is plotted as a solid black line (observed). The background level of randomly expected phosphorylation is shown as a gray line, with standard deviations as a gray band. The blue line represents the negative logarithm of the p-value at each position (right y axis). A horizontal red line indicates a cut-off of the Bonferroni corrected p-value of 0.05. Positions with a −log(p-value) above this cut-off and an average number of phosphosites per window higher than 2 are considered putative regulatory regions and are highlighted under the yellow bar. Red circles indicate human phosphosite positions with known regulatory function, and yellow circles represent catalytic residue positions. In the structural representations the predicted hotspot regions are highlighted in yellow. The catalytic residues are represented as orange sticks, and substrates or products as red sticks
Fig. 5 Hotspot regions near catalytic residues that are distal in protein sequence. a The IMPDH hotspot region is represented as a yellow segment. In the insets, the loop near the hotspot region is shown changing from an open conformation (blue volumes) to a closed conformation (magenta volumes). A serine residue within the hotspot region (yellow sticks) points toward the substrate binding pocket and is often found phosphorylated across species (see alignment). b The transaldolase hotspot region is shown in yellow. In the structural inset a serine within the hotspot region (yellow sticks) is found at the entrance of the substrate cavity. The identified phosphorylation sites contributing to the identification of the hotspot region are shown in red in the alignments
Fig. 6 Rps14a T119A mutant shows growth and 20S processing defects in cold shock. a The phosphorylation enrichment over random for the ribosomal S11 domain (PFAM:PF00411) is plotted as a solid black line. The background expectation is shown as a gray line, with standard deviations as a gray band. The blue line represents the negative logarithm of the p-value (right y axis). A horizontal red line indicates a cut-off equivalent to a Bonferroni corrected p-value of 0.05. Mutated residues in Rps14a (T119 and S123) are indicated by orange stars in the plot and shown in b as orange stick representations (PDB:5wnt_K). c Conservation of phosphorylation sites in this region across species. d Growth curve for Rps14a T119A and S123A mutants at 25 °C in SC media. Source data are provided as a Source Data file. e In situ hybridization with a Cy3-labeled oligonucleotide complementary to the 5′ portion of ITS1 was assayed at 30 and 20 °C. f Structural representation of contacts between the hotspot region of Rps14a (represented in gray) and the ATPase domain of Fap7 (represented in orange)