
A method for large-scale identification of protein arginine methylation.

Thomas Uhlmann, Vincent L Geoghegan, Benjamin Thomas, Gabriela Ridlova, David C Trudgian, Oreste Acuto.

Abstract

The lack of methods for proteome-scale detection of arginine methylation restricts our knowledge of its relevance in physiological and pathological processes. Here we show that most tryptic peptides containing methylated arginine(s) are highly basic and hydrophilic. Consequently, they could be considerably enriched from total cell extracts by simple protocols using either one of strong cation exchange chromatography, isoelectric focusing, or hydrophilic interaction liquid chromatography, the latter being by far the most effective of all. These methods, coupled with heavy methyl-stable isotope labeling by amino acids in cell culture and mass spectrometry, enabled in T cells the identification of 249 arginine methylation sites in 131 proteins, including 190 new sites and 93 proteins not previously known to be arginine methylated. By extending considerably the number of known arginine methylation sites, our data reveal a novel proline-rich consensus motif and identify for the first time arginine methylation in proteins involved in cytoskeleton rearrangement at the immunological synapse and in endosomal trafficking.


Year:  2012        PMID: 22865923      PMCID: PMC3494207          DOI: 10.1074/mcp.M112.020743

Source DB:  PubMed          Journal:  Mol Cell Proteomics        ISSN: 1535-9476            Impact factor:   5.911


Knowledge of the type, extent and dynamics of post-translational modifications (PTMs) reveals the changing network of protein interactions and the regulation of cellular functions. Methylated arginines (Meth-R) are relatively frequent in cellular proteins (e.g. 0.7–1% of total arginines (1) and this work) and have a very slow turnover (2), likely to confer lasting functional properties to proteins. Meth-Rs are often found at glycine-arginine-rich (GAR) sequences and modulate protein-protein and protein-nucleic acid interactions by reducing hydrogen-bonding and local hydrophilicity (2). Methylation at arginine can weaken interactions but also enhance the binding of adaptors called Tudor domains to Meth-R-containing sequences (3, 4). Meth-R occurs in three forms: asymmetric dimethylarginine (ADMA), bearing two methyl groups on one nitrogen of the guanidino group; symmetric dimethylarginine (SDMA), where both nitrogens are singly methylated; and monomethylarginine (MMA), a reaction intermediate. In humans, these reactions are catalyzed by a family of protein arginine methyltransferases (PRMTs), with PRMT1, 3, 4, 6, and 8 forming ADMA and PRMT5 and 7 producing SDMA. In mice, PRMT1 and PRMT4 deficiency causes embryonic (5) or perinatal (6) lethality, respectively, whereas lack of PRMT3 leads to a growth defect (7). Arginine methylation has been investigated for its role in the regulation of gene expression at multiple levels and in DNA repair (2). PRMT1, 4, 5, 6, and 7 control the epigenetic code by methylating histone N-terminal tails (H1, H2A, 3, and 4) (4, 8–12) and tune the activity of transcription factors (2, 13–15), coactivators (16, 17) and corepressors (8, 11). Factors involved in mRNA splicing (18, 19), elongation (20), transport and translation (19, 21) often contain Meth-R, suspected to be involved in protein-RNA interaction (2). Moreover, T cell CD28 co-stimulation (22), TNFα (15), TLR4 (15), NGF (23) and NFkB (15) signaling pathways may be regulated by PRMTs.
Protein arginine methylation is implicated in pathogenic processes, including oncogenesis (24), cardiovascular disease (25), autoimmunity and viral infections (26), raising the possibility that abnormally methylated proteins can be disease markers and PRMTs potential therapeutic targets (27). Only a very limited number of arginine methylated proteins and sites have been identified with certainty to date (see UniProtKB). In vitro elegant screenings were developed to search for PRMT substrates (21) but identifying the Meth-R concerned is usually cumbersome. The existence of Meth-R sites is in many cases only suspected by protein similarity such as the presence of GAR motifs, which are substrates for some but not all PRMTs (2). These vexing limitations hamper progress of a very promising field in biology. Previous work developed a powerful approach using heavy methyl-SILAC coupled to MS to identify with high certainty peptides containing Meth-R (28). An anti-ADMA antibody was used to enrich for Meth-R-containing proteins but only a limited number of Meth-R were identified (28). The relative low abundance of Meth-R-containing proteins and peptides and lack of adequate methods for their enrichment remains therefore a major hurdle for both large- and small-scale discovery of Meth-R sites, leaving the heavy methyl-SILAC method essentially unexploited. Here, we have uncovered distinct physico-chemical properties of Meth-R-containing tryptic peptides and showed that hydrophilic interaction liquid chromatography (HILIC) is a simple method providing excellent enrichment of Meth-R-containing peptides. When coupled to heavy methyl-stable isotope labeling by amino acids in cell culture (SILAC) based MS, HILIC allowed us to identify hundreds of Meth-R sites, providing for the first time a means for “methylome”-scale investigation that vastly outperforms the use of anti-ADMA antibody.

EXPERIMENTAL PROCEDURES

SILAC Labeling and Cell Lysis for MS Analysis

Custom-made RPMI 1640 medium lacking L-methionine, L-arginine, and L-lysine (Invitrogen, Carlsbad, CA) was supplemented with 10% dialyzed fetal calf serum (Invitrogen), L-arginine and L-lysine and either a) L-methionine (CK Gas Products Ltd.), or b) L-methionine-methyl-13C-methyl-D3 (Sigma, ISOTEC™) at a final concentration of 0.1 mM L-methionine, 0.29 mM L-arginine and 0.219 mM L-lysine and filter sterilized (0.22 μm pore size, Millipore, Billerica, MA). Jurkat cells were grown at 37 °C in a humidified 5% CO2-containing atmosphere for 5–7 cell doublings in labeling media. Human primary CD4+ lymphocytes were isolated from whole blood of healthy donors by negative selection using a Dynal isolation kit (Invitrogen) according to the manufacturer's protocol. 1 × 10⁶ to 2 × 10⁶ CD4+ lymphocytes in 10 ml of light or heavy medium were then stimulated by plate-bound anti-CD3 at 10 μg/ml and soluble CD28.2 (BioLegend, San Diego, CA) at 1 μg/ml. IL-2 (AbDSerotec) was added to a concentration of 50 ng/ml. Cells were grown at 37 °C in a humidified 5% CO2-containing atmosphere for 7–9 days. Cells were harvested and lysed for either a) 10 min in ice-cold lysis buffer (20 mM Tris pH 7.5, 150 mM NaCl, 0.1 mM EDTA, 1% Nonidet P-40, 0.5% deoxycholate, protease inhibitor mixture (Roche)), or b) 20 min in 8 M urea in 25 mM Tris pH 8. Lysates were cleared by centrifugation at 14,000 × g for 20 min and mixed in a 1:1 ratio of protein content (measured by Bradford method and absorbance at 280 nm). Lysates were used for a) immunoprecipitation, or b) strong cation exchange (SCX) chromatography, HILIC and OFFGEL isoelectric focusing.

Immunoprecipitation and In-gel Digestion

Immunoprecipitation was performed on the mixed cell lysates overnight at 4 °C on Sepharose protein G beads (GE Healthcare) pre-incubated with an anti-DMA mAb 7E6 (Abcam, Cambridge, UK). Beads were washed four times with fresh ice-cold lysis buffer and eluted with 100 mM triethylamine pH 11.5 for 10 min at RT. The eluates were neutralized with 1/20 volume of 1 M phosphate buffer pH 6.8, boiled in reducing SDS NuPAGE sample buffer (Invitrogen), alkylated with 55 mM iodoacetamide (Sigma) and separated on 4–12% gradient Bis-Tris NuPAGE gels (Invitrogen). The gel was washed with distilled water and lightly stained with Colloidal Blue (Invitrogen). The gel lane was divided into 15 slices and subjected to GeLC-MS/MS.

SCX, HILIC, and OFFGEL Fractionation

Cleared urea lysate was acetone precipitated and resuspended in 1.6 M urea in 25 mM Tris pH 8. Overnight digestion at 37 °C was carried out with 12.5 ng/μl trypsin (Proteomics grade, Sigma) or chymotrypsin (Sequencing grade, Promega, Madison, WI). For SCX, 4–6 mg peptides were acidified with 0.1% trifluoroacetic acid (Sigma) and cleared by centrifuging at 20,000 × g for 10 min. Peptides were applied to a 1 ml SCX column (Resource S, GE Healthcare) using Buffer A (50 mM KH2PO4, 20% acetonitrile, pH 2.7). Bound peptides were eluted with increasing Buffer B (50 mM KH2PO4, 20% acetonitrile, 1 M KCl, pH 2.7). In a first region of separation the gradient was stepped up to 7% Buffer B to elute 1+ and most of the 2+ charged peptides. A shallow gradient to 35% Buffer B separated the remaining 2+ from 3+ and higher charged peptides. In the last stage the Buffer B was changed to pH 7 and a steep gradient to 100% Buffer B ensured complete elution of bound peptides. Following C18 desalting (Empore Octadecyl C18, Sigma), the fractions were analyzed by LC-MS/MS. For HILIC, 1 mg of peptides was acidified with 0.1% trifluoroacetic acid (TFA) and cleared by centrifuging at 20,000 × g for 10 min. Peptides were loaded onto a reverse phase C2/C18 column (GE Healthcare) with Buffer A (0.1% formic acid, pH 2.7). Elution of peptides was carried out with a steep gradient to 100% Buffer B (0.1% formic acid, 95% acetonitrile, pH 2.7). Desalted peptides were then dried under vacuum, re-suspended in Buffer B and applied to a HILIC column (Merck). Bound peptides were eluted with a shallow gradient of Buffer A up to 80%. For IEF on OFFGEL, 1 mg of peptides was loaded onto an IPG strip (13 cm, pH 7–11, GE Healthcare) with IPG buffer (pH 7–11, GE Healthcare) according to manufacturer's instructions. Peptides were focused in solution with an Agilent 3100 OFFGEL fractionator (Agilent Technologies, Santa Clara, CA) for 20 kVh up to a maximum voltage of 8000 V.
Focused peptides were removed and desalted with C18 (Empore Octadecyl C18, Sigma). Peptides were also extracted by immersing the paper wicks from the cathode end in 0.1% TFA for 1–2 h, followed by brief sonication. Peptides from the respective fractions were collected and adjusted to 0.1% TFA, 5% acetonitrile and applied to a homemade C18 stage tip for desalting. Bound peptides were washed with 0.1% TFA, 2% acetonitrile and eluted in 0.1% TFA, 60% acetonitrile. Eluted peptides were dried and sent to the PNAC facility in Cambridge (http://www.bioc.cam.ac.uk/pnac/aaa.hml) for amino acid analysis on a Biochrome 30 Analyzer.

Mass Spectrometry Data Acquisition

Samples were analyzed on an Ultimate 3000 nano HPLC (Dionex, Camberley, UK) system run in direct injection mode coupled to either a LTQ XL Orbitrap mass spectrometer, or a Q Exactive mass spectrometer (Thermo Electron, Hemel Hempstead, UK). Samples were resolved on a 15 cm by 75 μm inner diameter picotip analytical column (New Objective, Woburn, MA), which was packed in-house with Reprosil-Pur C18-AQ phase, 3 μm bead (Dr. Maisch, Germany). A 120 min gradient was used to separate the peptides and each sample was typically injected three times and data merged in order to increase sample coverage. The LTQ XL Orbitrap mass spectrometer was operated in a “Top 5” data dependent acquisition mode and the Q Exactive mass spectrometer was operated in a “Top 10” data dependent acquisition mode. Precursor scans were performed at a resolving power of 60,000, from which either five precursor ions (LTQ XL Orbitrap) or 10 precursor ions (Q Exactive) were selected and fragmented. Charge state +1 ions were rejected.

Mass Spectrometry Data Analysis

MS/MS peak lists were converted to mzXML format using ReAdW version 4.4.1 (LTQ XL Orbitrap data) or to mgf format using ProteoWizard msconvert release 3.0.3535 (Q Exactive data). Both tools were used with default parameters except that zlib compression of spectral data was enabled. Data was uploaded to the central proteomics facilities pipeline (CPFP at: https://cpfp-master.molbiol.ox.ac.uk/cpfp_demo/auth/login) (29). Files were searched using Mascot version 2.3.01 (Matrix science), X!TANDEM version 2008.12.01.1 (30) and OMSSA version 2.1.8 (31) against a concatenated target and reversed decoy version of the IPI human sequence database (EBI), versions 3.78 and 3.79 containing 86,702 and 86,635 target protein sequences respectively. Enzyme was set to trypsin (or chymotrypsin for some experiments) allowing for up to 3 missed cleavages. Carbamidomethyl cysteine was set as a fixed modification. For analysis of arginine methylation oxidized methionine, heavy oxidized methionine, monomethylarginine, heavy monomethylarginine, dimethylarginine, and heavy dimethylarginine were set as variable modifications. For analysis of lysine methylation, oxidized methionine, heavy oxidized methionine, monomethyllysine, heavy monomethyllysine, dimethyllysine, heavy dimethyllysine, trimethyllysine, and heavy trimethyllysine were set as variable modifications. Mass tolerances for MS and MS/MS peak identifications were 10 ppm and 0.5 Da respectively (LTQ XL Orbitrap) or 20 ppm and 0.1 Da respectively (Q Exactive). Mass spectrometry data are available at http://www.peptideatlas.org/PASS/PASS00081. Assignment of a methylation site required identification by searches in the Central proteomics facility pipeline (CPFP) at 1% FDR and the presence of a confirming peptide in the precursor spectrum of equal intensity separated by a mass difference introduced by the light/heavy methyl group(s). 
The actual false positive rate of identification using the heavy methyl SILAC criteria (1.2%, supplemental Fig. S5) was calculated on a representative dataset by analyzing all false positive hits for arginine methylation for the presence of confirming heavy methyl SILAC pairs. This number was then doubled and divided by the total number of hits for arginine methylation. The requirement for the presence of a methylSILAC pair to corroborate each methylated peptide increases confidence and enables even low scoring peptides to be considered (28). Modification localization scoring was performed using the ModLS algorithm (unpublished) which extends the PTMScore method to incorporate automatic specificity expansion. For each variable modification chosen for the database search all amino acid specificities defined in the Unimod database (www.unimod.org) are considered during localization. This allows the correct localization of e.g. dimethylation to lysine, even if only arginine methylation was chosen for the search. Ambiguous modification localizations are annotated as such in supplemental Table S5. Within CPFP, inference of protein identifications from peptide sequences was performed using ProteinProphet (32). Default parameters were used for protein grouping and the assignment of isoform identifications. Precursor m/z, charge, InterProphet probability for all methylated peptides found in this work are listed in supplemental Table S4.
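The false positive rate estimate described above can be sketched in a few lines of Python. This is only an illustration of the stated arithmetic (decoy hits that still show a confirming heavy methyl-SILAC pair, doubled, divided by all arginine methylation hits). The function name and the example counts are hypothetical, and treating "false positive hits" as reversed-decoy hits is an assumption based on the target/decoy search described in this section.

```python
def methyl_silac_false_positive_rate(decoy_hits_with_pair: int,
                                     total_methyl_hits: int) -> float:
    """Estimate the false positive rate under the heavy methyl-SILAC criteria.

    decoy_hits_with_pair: false positive (assumed: decoy) Meth-R hits that
        nevertheless show a confirming light/heavy pair.
    total_methyl_hits: all hits reported for arginine methylation.
    """
    if total_methyl_hits <= 0:
        raise ValueError("total_methyl_hits must be positive")
    if decoy_hits_with_pair < 0:
        raise ValueError("decoy_hits_with_pair must be non-negative")
    # The count is doubled, as described in the text, then normalized.
    return 2 * decoy_hits_with_pair / total_methyl_hits


# Hypothetical counts, chosen only to illustrate the calculation.
rate = methyl_silac_false_positive_rate(decoy_hits_with_pair=3,
                                        total_methyl_hits=500)
print(f"{rate:.1%}")
```

The doubling reflects the usual target/decoy reasoning that each observed decoy hit implies roughly one undetected false hit among the targets.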

RESULTS

Rationale and Methods for Enrichment of Meth-R Containing Peptides

Biosynthetic labeling with methionine (M0) and [13CD3]-methionine (M4) (heavy methyl-SILAC) (Fig. 1A) enables detection of protein methylation sites by LC-MS/MS with high confidence (28). In this study, we labeled Jurkat T cells by heavy methyl-SILAC and extended this approach to primary cells by showing efficient SILAC labeling of antigen receptor stimulated T-cells (supplemental Figs. S1, S2). In a previous study, enrichment of Meth-R-containing proteins from HeLa cells with the monoclonal antibody (mAb) 7E6 to MMA and ADMA led to the identification of 59 Meth-R sites (28). By a similar approach (see Methods), we identified 41 Meth-R sites in Jurkat T cells, only 6 of which were novel (i.e. not annotated in the UniProtKB database) (supplemental Table S1). However, cell proteomes should comprise higher numbers of Meth-R-containing proteins, since 0.7–1.0% of arginine in proteins extracted from cell lines or tissues is methylated (1) (and our unpublished data). Moreover, similar to previous investigations (28), we identified Meth-R sites almost invariably at RG sites (supplemental Table S1), pointing again at the limited usefulness of 7E6 mAb for proteomic-scale studies. These limitations and the small number of Meth-R sites (137) known to date (e.g. in UniProtKB), spurred us to develop effective, unbiased approaches that enrich for Meth-R-containing peptides for large-scale identification of Meth-R-containing sites using LC-MS/MS analysis (Fig. 1A).
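As an illustration of why M0/M4 labeling gives a recognizable doublet, the following Python sketch computes the mass shift introduced by one 13CD3 methyl group relative to CH3 and checks whether two precursor m/z values are spaced as expected for a methyl-SILAC pair. The isotope masses are standard monoisotopic values; the function names, the 10 ppm default and the example m/z values are assumptions for illustration and do not reproduce the CPFP implementation. The sketch also ignores heavy-labeled methionine residues in the peptide, which would add further shifts.

```python
# Monoisotopic masses (Da) of the isotopes involved in the label.
MASS_12C = 12.0
MASS_13C = 13.0033548
MASS_1H = 1.0078250
MASS_2H = 2.0141018

# One 13CD3 methyl group versus one 12CH3 methyl group: about 4.022 Da.
HEAVY_METHYL_SHIFT = (MASS_13C - MASS_12C) + 3 * (MASS_2H - MASS_1H)


def pair_mz_spacing(n_methyl: int, charge: int) -> float:
    """Expected m/z distance between light and heavy forms of a peptide."""
    if n_methyl < 1 or charge < 1:
        raise ValueError("n_methyl and charge must be >= 1")
    return n_methyl * HEAVY_METHYL_SHIFT / charge


def is_methyl_silac_pair(light_mz: float, heavy_mz: float,
                         n_methyl: int, charge: int,
                         tol_ppm: float = 10.0) -> bool:
    """True if heavy_mz lies at the expected spacing above light_mz."""
    expected = light_mz + pair_mz_spacing(n_methyl, charge)
    return abs(heavy_mz - expected) / expected * 1e6 <= tol_ppm


# Hypothetical doubly charged, dimethylated peptide: spacing of ~4.022 m/z.
print(round(pair_mz_spacing(n_methyl=2, charge=2), 4))
print(is_methyl_silac_pair(500.0, 504.0222, n_methyl=2, charge=2))
```

A full check along the lines described in Methods would also require the two peaks to have roughly equal intensity, since light and heavy lysates were mixed 1:1.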
Fig. 1.

Proteins were digested with trypsin or, in some experiments, with chymotrypsin. Arginine methylated peptides (depicted as red lines) were enriched through the use of SCX, IEF (in an OFFGEL apparatus) or HILIC and analyzed by mass spectrometry. MS spectra for putative arginine methylated peptides were manually verified for the presence of a 1:1 methyl-SILAC pair. An example methyl-SILAC pair is shown. For each separation method, the total number of Meth-R sites identified as well as the novel ones is indicated; B, Meth-R-containing tryptic peptides have distinct physico-chemical properties. The Table indicates maximum, minimum and median values for isoelectric points (pI) and GRAVY scores of Meth-R peptides identified with the indicated enrichment methods. The number of unique peptides (n) from each method used to calculate values is shown. C, 3D scatter plot of 33000 non-methylated tryptic peptides (gray circles) identified from pH3–11 IPG strip and 128 arginine methylated tryptic peptides (yellow-red circles) identified from anti-DMA-IP, HILIC, SCX and IEF. Three different views of the same plot are shown. The peptides are plotted according to pI, GRAVY score and charge state at pH 2.7, representing the different selection criteria exploited by the enrichment methods. Yellow represents a low pI and red a high pI.


SCX Chromatography

Approximately 70% of tryptic peptides in proteomes exhibit a +2 net charge at pH 2.7 and can be separated from other peptides according to single charge differences using SCX chromatography (33). Because trypsin cuts poorly at Meth-R, the missed cleavages produce peptides with charge ≥+3 at pH 2.7. Thus, most Meth-R peptides should be present in the ≥+3 peptide-containing fractions after SCX chromatography. SCX fractions enriched for tryptic peptides of charge <+3 from total proteins of Jurkat cell, exhibited ∼1.0% of Meth-R (as revealed by amino acid analysis, see Methods and supplemental Fig. S3) but no Meth-R peptide could be identified by LC-MS/MS (data not shown). In contrast, in SCX fractions of tryptic peptides of charge ≥+3, ∼4.0% of arginine was present as Meth-R (supplemental Fig. S3) allowing us to identify 39 Meth-R sites from Jurkat and primary human T cells in four independent experiments (Fig. 1A and supplemental Table S1), of which 25 were new and, interestingly, 4 were not at RG motifs (supplemental Table S1). SCX fractionation enabled us to discover 10 new Meth-R containing proteins and Meth-R within sequences rarely selected by the 7E6 mAb (supplemental Table S1). In addition, as predicted by the poor trypsin cleavage at Meth-K, a known Meth-K peptide was also found (histone H4, supplemental Table S1).
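The charge argument behind the SCX fractionation can be made concrete with a small Python sketch. It uses a common approximation, stated here as an assumption rather than as the authors' calculation: at pH 2.7 acidic groups are treated as uncharged, so the net charge is the N-terminal amine plus one for each K, R and H (methylated arginine is assumed to keep its positive charge). The peptide sequences are hypothetical.

```python
def approx_charge_at_ph_2_7(peptide: str) -> int:
    """Approximate net positive charge of a peptide at pH 2.7.

    Counts the free N-terminus plus every Lys, Arg and His residue.
    Carboxyl groups are treated as neutral, which is only an approximation.
    """
    peptide = peptide.upper()
    basic_residues = sum(peptide.count(aa) for aa in "KRH")
    return basic_residues + 1  # +1 for the N-terminal amine


# A fully cleaved tryptic peptide ends in K or R and typically carries +2.
print(approx_charge_at_ph_2_7("AGDLSVEK"))
# A missed cleavage at an internal (e.g. methylated) arginine gives >= +3.
print(approx_charge_at_ph_2_7("SGRGGNFGFGDSR"))
```

Under the same approximation, a histidine-containing tryptic peptide with no missed cleavage also reaches +3.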

IEF

Histidine-containing tryptic peptides (charge ≥+3 at pH 2.7) were the major contaminant (60–70%) within the SCX fractions in which most Meth-R peptides were detected (not shown). This likely contributed to the analyte complexity, limiting the number of Meth-R peptides identified. However, these data brought to our attention key physico-chemical properties of Meth-R tryptic peptides. Arginines prone to be methylated are likely to be fully exposed to solvent and therefore most often embedded within sequences containing hydrophilic and/or neutral amino acid residues. Moreover, we noticed that sequences surrounding Meth-R sites (from UniProtKB and those found by 7E6 Ab and SCX enrichment) infrequently contained negatively charged amino acids. This suggested that tryptic peptides carrying one or more Meth-R should frequently display basic isoelectric points (pIs). This was indeed the case for Meth-R peptides found by 7E6 mAb and SCX enrichment (Fig. 1B) with a median pI> 11.00. To test this prediction on a larger scale we fractionated tryptic digests from heavy methyl-SILAC-labeled Jurkat or primary CD4+ lymphocytes by isoelectric focusing on IPG strips (Fig. 1A, see Methods). The results of four independent experiments consistently showed that > 90% of Meth-R peptides displayed a pI ≥ 9 with a median pI of 11 (Figs. 1B and 1C), indicating that Meth-R tryptic peptides are largely highly basic in nature. Interestingly, at the basic end of the IEF gradient, (pH 9–11 and in the electrode pad itself) the enrichment for Meth-R peptides was substantial, reaching values of ≥ 5% of peptides containing Meth-R as revealed by LC-MS/MS analysis (not shown). However, the lack of IPG strips with a suitable pH range precluded good recovery of most of the highly basic peptides. Using IEF we were able to detect 66 Meth-R sites of which 34 were new (Fig. 1A and supplemental Table S1) and found 20 that did not contain an RG motif (supplemental Table S1). 
IEF fractionation permitted the identification of 19 proteins previously unknown to contain Meth-R and 3 Meth-K sites.

HILIC

HILIC has recently been shown to be an excellent method to complement reverse-phase fractionation (34) by enriching for highly hydrophilic peptides (35). Because the vast majority of Meth-R-containing tryptic peptides identified after 7E6 mAb, SCX or IEF enrichment were highly hydrophilic with a median GRAVY score (36) of –1.05 (Fig. 1B), we reasoned that HILIC fractionation should also enable selection for Meth-R containing peptides. Trypsin digests of Methyl-SILAC Jurkat or primary CD4+ lymphocytes were loaded onto a HILIC column in 90% acetonitrile. Bound peptides were eluted with increasing concentration of aqueous buffer (supplemental Fig. S3) and were subjected to amino acid and LC-MS/MS analysis. Fractions containing highly hydrophilic peptides (see supplemental Fig. S3) were enriched ∼10 fold for Meth-R, as compared with fractions with fewer hydrophilic peptides, in which no Meth-R-containing peptide could be detected (not shown). Meth-R peptides identified after HILIC separation had a median GRAVY score of −0.84 (Fig. 1B), a value similar to that of Meth-R peptides found by anti-DMA, SCX and IEF. In five independent experiments, LC-MS/MS analysis of the twelve most hydrophilic HILIC fractions led to the identification of 215 Meth-R sites (Fig. 1A and supplemental Table S1). Notably, HILIC enabled also the detection in chymotrypsin-digested peptides of 37 Meth-R sites not found after tryptic digestion (supplemental Table S1). HILIC alone identified 3 to 5 times higher numbers than 7E6 mAb, SCX and IEF individually and covered ∼70% of the 101 unique sites identified by the other three methods together (supplemental Table S1). Fig. 2A illustrates the power of HILIC method to discover Meth-R sites as compared with SCX, 7E6 and IEF individually. Importantly, 171 (∼ 80%) of Meth-R sites found with HILIC were new (Fig. 1A and Table S1), the highest discovery rate among the four different approaches. 87 (∼50%) of these new sites were at non-RG motifs. 
Mono-methylated-Rs were more frequently seen with HILIC than with the three other methods. HILIC alone allowed us to assign 215 Meth-R sites to 115 proteins, 94 of which were not previously known to be substrates of PRMTs. In total, 190 (77%) of the 249 unique Meth-R sites identified by 7E6 mAb, SCX, IEF and HILIC together were new (Fig. 2B and supplemental Table S1). Unique sites were assigned to 131 different proteins, 93 (72%) of which were previously unknown to be methylated on arginine (Fig. 2C, supplemental Table S1 and Fig. 5). Our study brings the total number of Meth-R sites and Meth-R proteins in the human proteome to 327 and 149 (Fig. 2C), respectively, substantially widening, in a single study, our insight into the regulation of cellular functions by PRMTs. HILIC was vastly superior to each of the three other enrichment methods alone or combined for detecting known and new Meth-R sites.
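For readers unfamiliar with the GRAVY score used above, it is the mean Kyte-Doolittle hydropathy value over all residues of a peptide; negative values indicate hydrophilic sequences. A minimal Python sketch follows. The hydropathy table is the standard Kyte-Doolittle scale, while the example peptides are hypothetical and the treatment of methylated arginine as plain R is an assumption.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    peptide = peptide.upper()
    if not peptide:
        raise ValueError("peptide must not be empty")
    # A KeyError here signals a non-standard residue letter.
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


# A glycine/arginine-rich peptide scores clearly negative (hydrophilic),
# whereas an aliphatic peptide scores positive (hydrophobic).
print(round(gravy("GRGGFGGR"), 3))
print(round(gravy("LLVVAI"), 3))
```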
Fig. 2.

B, Venn diagram showing the number of Meth-R sites found only in this study, the Meth-R sites annotated in UniProtKB database and those in common. C, Venn diagram of the number of arginine methylated proteins identified in this study, those previously reported in the UniProtKB database and those in common. Circles are approximately scaled relative to number of sites.

Fig. 5.

Overview of identified proteins containing methylated arginine/lysine. Proteins are grouped according to annotated function in the UniprotKB database. Proteins newly identified as PRMT substrates in this study are drawn in dark blue. Previously known PRMT substrates are shaded in gray. Individual arginine methylation sites are marked as dots, blue representing novel Meth-R sites and gray representing previously known Meth-R sites. Green denotes a protein methylated on lysine.


Bioinformatic Analysis of Meth-R Sites and Meth-R Proteins

Heavy methyl-SILAC validation (28) greatly increases the confidence in the identification of Meth-R (and Meth-K) containing peptides. Excluding heavy methyl-SILAC validation, reduced the confidence for Meth-R containing peptides to 67% FDR (despite the fact that for the overall dataset of modified and unmodified peptides the FDR was 1%). However, inclusion of putative Meth-R peptides requiring a corresponding light/heavy methyl-SILAC partner raised the confidence of identifying Meth-R sites to ∼1% FDR (supplemental Fig. S5). Therefore we feel highly confident about the Meth-R sites identified in our study, which were all manually validated by inspecting individual light/heavy pairs. This is further supported by the fact that our investigation found 43% of the known Meth-R sites. Of the 190 new Meth-R sites discovered in our study, 57% occurred at RG sites, at times within GAR motifs (Fig. 3A and supplemental Table S1), thought to be mainly substrates for PRMT1 (2). The remaining 43% were found at non-RG motifs. This is a considerable increase in the frequency of non-RG containing sequences among Meth-R sites when compared with 25% within the 137 sites known to date (supplemental Fig. S6). In ∼ 50% of non-RG sites found here, proline occurred quite frequently before and/or after R (Fig. 3A). Such a feature was also observable, though less frequently (32%), in previously identified non-GAR containing methylation sites (not shown). Inspecting the set of non-RG sites for the occurrence of proline at a particular position flanking Meth-R (see Methods), revealed that proline at positions −1 (R being 0) and +4, was significantly favored, the sequences containing proline at either or both of these positions (x(P)Rxxx(P)) accounting for 33% of all the methylation sites we found (Fig. 3B). We have provisionally designated this subset of new Meth-R consensus sites as Proline-Rich Arginine Methylation (PRAM) motif. 
Although glycine and proline often followed the Meth-R, we found a number of sites where the Meth-R was followed by leucine and alanine (Fig 3C). Clearly, with growing numbers of Meth-R sites to be identified in future proteomics studies, it should be possible to refine consensus site subsets and help assign PRMT specificities.
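The motif classes discussed above can be expressed as a simple rule set. The Python sketch below assigns a methylated arginine to "RG" when glycine follows it, to "PRAM" when a non-RG site has proline at position −1 and/or +4 (R being position 0), and to "other" otherwise. The function name, the precedence of RG over PRAM and the example sequences are assumptions made for illustration; they are not the procedure used to produce Fig. 3.

```python
def classify_meth_r_site(sequence: str, r_index: int) -> str:
    """Classify a Meth-R site by its flanking residues.

    sequence: protein or peptide sequence (one-letter code).
    r_index: 0-based index of the methylated arginine in `sequence`.
    """
    sequence = sequence.upper()
    if not 0 <= r_index < len(sequence) or sequence[r_index] != "R":
        raise ValueError("r_index must point at an arginine")

    def residue(offset: int) -> str:
        # Returns "" when the position falls outside the sequence.
        i = r_index + offset
        return sequence[i] if 0 <= i < len(sequence) else ""

    if residue(+1) == "G":
        return "RG"
    if residue(-1) == "P" or residue(+4) == "P":
        return "PRAM"  # x(P)Rxxx(P): proline at -1 and/or +4
    return "other"


# Hypothetical sequences illustrating each class.
print(classify_meth_r_site("GGRGGF", 2))    # glycine at +1
print(classify_meth_r_site("AAPRAAAA", 3))  # proline at -1
print(classify_meth_r_site("AARAAAPA", 2))  # proline at +4
print(classify_meth_r_site("AARLAAAA", 2))  # neither
```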
Fig. 3.

Of methylated arginines, 57% were followed by a glycine and were generally flanked by a glycine rich sequence. The remaining 43% of sites reveal a previously unreported proline-rich arginine methylation (PRAM) motif. B, Frequency of proline at each position relative to the methylated arginine. Dashed lines indicate frequency level of proline with a probability of occurrence by chance of 0.05 or 0.01. p values for proline frequency at position –1 and +4 are shown. p values were calculated by bootstrapping. C, Most common motifs found at Meth-R sites. Note that motifs can be composed of combinations of motifs listed.

The newly discovered Meth-R proteins appear to reside in the cytoplasm and/or nucleus with no preference for either compartment (Fig. 4A). As previously noticed (2), a high proportion of Meth-R proteins are known to bind nucleic acids and are involved in regulating chromatin structure, transcription, RNA processing and translation. Our study considerably increases the number of such proteins regulated by PRMTs, including five implicated in chromatin structure, 22 participating in the transcription process and 21 in RNA processing (see Figs. 4B and 5, blue circles). Most interestingly, we also found proteins implicated in other cellular functions not previously reported to be subjected to regulation by PRMTs. Among these (see Fig. 5) were two nuclear pore components (RANBP2 and TPR) and 6 proteins (DNM2, PICALM, SEC24C, GGA3, SNX3, and CLINT1; see UniProtKB) implicated in endosomal trafficking and three, WASP/WIP complex (37) and DNM2 (38), involved in actin cytoskeletal re-arrangement at the immunological synapse (IS). A considerable number of Meth-R sites were found in proteins of unknown function (Fig. 5).
Fig. 4.

Breakdown of arginine methylated proteins according to distribution in cellular compartments (

Although SCX and HILIC could in principle enrich also for Meth-K proteins, we found only 10 lysine methylation sites (supplemental Table S1 and Fig. 5). Perhaps Meth-K occurs less frequently than Meth-R in proteins and/or Meth-K- and Meth-R-containing peptides present different physico-chemical properties and thus both would not be enriched by the methods used here. Nevertheless, we found three proteins previously unknown to be Meth-K, including two heat shock proteins, HSPA5 and HSPA8. Overall, the methods explored here, with HILIC being by far the most effective, allowed us to considerably enrich for Meth-R peptides other than those in canonical RG/GAR motifs and uncover a substantial number of new proteins and functions likely to be regulated by PRMTs.

DISCUSSION

To date, no effective and unbiased method exists for enriching Meth-R-containing proteins and/or peptides. An mAb (7E6) directed at ADMA and MMA permitted the identification of only a handful of Meth-R sites (28) (and this work), almost exclusively at RG sites. Because antibodies are unlikely to discriminate effectively, and with high affinity, the subtle chemical difference between Meth-R and unmodified R, pan-Meth-R antibodies may not be suitable. A major obstacle in MS-based identification of Meth-R, namely methylation being mistaken for other isobaric chemical modifications, was recently overcome by a heavy methyl-SILAC approach (28), raising the possibility of accurate large-scale studies of protein arginine methylation. Even so, a simple and robust method for enriching Meth-R-containing peptides for small- and proteome-scale analysis of arginine methylation was still missing. The exploratory phase of this investigation revealed that >90% of Meth-R-containing tryptic peptides displayed a highly basic pI and high hydrophilicity. Enrichment using IEF was limited by the unavailability of IPG strips with an adequately high pH range. We found, however, that HILIC had excellent capacity to enrich for Meth-R peptides. HILIC alone allowed us to identify 215 Meth-R sites from trypsin or chymotrypsin digests of heavy methyl-SILAC lysates of Jurkat cells and primary CD4+ lymphocytes; 80% of these were new and accounted for almost 90% of the 190 new Meth-R sites discovered here by all four methods. Our study more than doubles the 137 Meth-R sites known to date and adds 93 Meth-R-containing proteins to the 59 previously known in humans, substantially increasing in a single effort our knowledge of the extent and biological significance of arginine methylation.
Although these numbers seem limited in comparison to those reported for other PTMs, it is possible that arginine methylation is more restricted and selective, also considering that in this study we explored only the T cell proteome. Although HILIC performance overshadowed SCX and IEF, it missed 40 Meth-R sites identified by the latter two methods, despite a similar number of analyses performed. However, the pI and GRAVY scores of HILIC and SCX/IEF Meth-R peptides did not differ significantly, suggesting that the major cause of hits missed by HILIC could be a persistent complexity of the peptide mixture in the HILIC fractions. In an effort to discover higher numbers of Meth-R sites after HILIC enrichment, we made use of a Q Exactive Orbitrap (see Methods), a powerful new MS/MS instrument whose faster fragmentation rates yield higher numbers of identified peptides (39). This allowed us to confirm in a single experiment about half of the Meth-R sites identified by several analyses of HILIC fractions on an LTQ XL Orbitrap and to discover 11 new ones, indicating that the Q Exactive is preferable for proteome-scale discovery of Meth-R. Meth-R proteins detected in Jurkat cells and primary CD4+ lymphocytes overlapped only partially (supplemental Fig. S4). Possible differences in protein expression and/or PRMT levels in the two cell types may explain this divergence, considering that Jurkat is a T cell lymphoma and that cancer cells may express modified repertoires of Meth-R proteins (24). Similarly, differences in tissue expression may explain why our data missed 18 Meth-R proteins known in the mammalian proteome. Only 25% of Meth-R sites annotated to date are at non-RG sites (supplemental Table S2). Some contain prolines and may be CARM1 substrates (21). In contrast to the use of the 7E6 mAb, HILIC (as well as SCX and IEF) enrichment seems to be unbiased, as almost half of the sites we identified were at non-RG sites (supplemental Table S1 and Fig. 3A).
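The GRAVY score used above to compare peptides is the mean Kyte-Doolittle hydropathy of the residues in a sequence, with more negative values indicating more hydrophilic peptides. The following minimal sketch shows the calculation on unmodified sequences; the function name and example peptides are illustrative assumptions, and the scale does not account for the methyl groups themselves.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(peptide: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    if not peptide:
        raise ValueError("peptide must contain at least one residue")
    # A KeyError here signals a non-standard residue letter
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide.upper()) / len(peptide)


# Made-up peptides: a GAR-like repeat versus a hydrophobic sequence
gar_like = gravy("GRGRGR")
hydrophobic = gravy("AVLIAK")
```

A GAR-like repeat scores strongly negative on this scale, consistent with the hydrophilicity that HILIC exploits.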
The relatively low number of Meth-R sites known to date is largely derived from the most abundant proteins implicated in RNA metabolism (supplemental Table S3), as RG sites tend to occur at repeated GAR sequences of RNA-binding proteins. Indeed, only 7 such proteins (of 56 total) account for half of the 137 Meth-R sites known to date (see supplemental Tables S2 and S3). Thus, HILIC alone removes the above biases and reveals that PRMTs control proteins implicated in many diverse cellular functions. Meth-R occurring at GAR sequences of RNA-binding proteins may contribute to protein-RNA interaction (2). The PRAM sequences defined in our study could be part of protein-protein interaction sites, and Meth-R may modulate such interactions (40). The PRxxxP motif is predicted to be a poly-proline II helix able to interact with SH3 and other protein domains. Inspection of methylated PRxxxP motifs revealed that a hydrophobic amino acid often precedes the +4 proline (PRxxΦP) (supplemental Fig. S7), a signature of poly-proline II helices that interact with SH3 domains, which supports a role for PRMTs in modulating protein-protein interactions (40). Interestingly, our data revealed that the WASP/WIP complex and DNM2, both involved in the formation and maintenance of the IS induced by TCR signaling, contained PRAM sequences known to be part of SH3 binding sites for several binding partners (37, 38). These data, together with the identification of novel Meth-R proteins implicated in endosomal trafficking, chromatin remodeling, transcription, RNA processing and translation, open new perspectives to further uncover the role of PRMTs in the T cell activation process. While HILIC represents a simple and robust method to enrich for Meth-R peptides, a few technical improvements lie ahead for routine “methylome”-scale studies.
One immediate possibility to further reduce the complexity of peptide mixtures prior to MS analysis is the orthogonal combination of two chromatographic methods, chosen according to the physico-chemical features of Meth-R-containing peptides highlighted in this study. Large or highly charged peptides might escape detection by MS, a problem that can be partly overcome by using proteases other than trypsin. Indeed, chymotrypsin enabled us to identify a sizable number of Meth-R-containing peptides not found in trypsin digests. Moreover, SCX, IEF and, preferentially, HILIC can be readily employed in small-scale formats to discover arginine methylation sites in low-complexity peptide mixtures such as those derived from digests of purified protein complexes or of individual proteins proven or suspected to interact with PRMTs. The discovery of methods for unbiased enrichment of Meth-R-containing peptides, such as HILIC, will also allow us to ask why and to what extent abnormal protein arginine methylation plays a role in carcinogenesis (e.g. prostate cancer and breast cancer) and in several other important pathologies, including cardiovascular and infectious diseases, perhaps contributing to opening avenues for new therapeutic tools and drugs.
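The PRxxxP and PRxxΦP motifs discussed above place proline at position –1 and +4 relative to the arginine, with Φ at +3. As a minimal sketch of how candidate sites could be screened in a protein sequence, the code below uses regular expressions with a lookahead so that overlapping motifs are all reported. The hydrophobic residue set chosen for Φ, the function name and the example sequences are illustrative assumptions, not definitions taken from the study.

```python
import re

HYDROPHOBIC = "AVILMFWY"  # assumption: residues counted as Φ

# Lookahead groups let overlapping occurrences be found
PRXXXP = re.compile(r"(?=(PR...P))")
PRXX_PHI_P = re.compile(rf"(?=(PR..[{HYDROPHOBIC}]P))")


def find_motifs(sequence, pattern):
    """Return (1-based arginine position, matched motif) for each hit."""
    hits = []
    for match in pattern.finditer(sequence.upper()):
        # match.start() is the 0-based index of the proline at position -1,
        # so the arginine sits one residue later
        hits.append((match.start() + 2, match.group(1)))
    return hits


# Made-up sequences for illustration only
with_phi = find_motifs("MAPRGSLPKK", PRXX_PHI_P)
without_phi = find_motifs("APRGSGPA", PRXX_PHI_P)
generic = find_motifs("APRGSGPA", PRXXXP)
```

A sequence match only flags a candidate site; whether the arginine is methylated still has to be shown experimentally, for example by heavy methyl-SILAC.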
  41 in total

1.  Methylation of SPT5 regulates its interaction with RNA polymerase II and transcriptional elongation properties.

Authors:  Youn Tae Kwak; Jun Guo; Shashi Prajapati; Kyu-Jin Park; Rama M Surabhi; Brady Miller; Peter Gehrig; Richard B Gaynor
Journal:  Mol Cell       Date:  2003-04       Impact factor: 17.970

2.  A statistical model for identifying proteins by tandem mass spectrometry.

Authors:  Alexey I Nesvizhskii; Andrew Keller; Eugene Kolker; Ruedi Aebersold
Journal:  Anal Chem       Date:  2003-09-01       Impact factor: 6.986

3.  Large-scale characterization of HeLa cell nuclear phosphoproteins.

Authors:  Sean A Beausoleil; Mark Jedrychowski; Daniel Schwartz; Joshua E Elias; Judit Villén; Jiaxu Li; Martin A Cohn; Lewis C Cantley; Steven P Gygi
Journal:  Proc Natl Acad Sci U S A       Date:  2004-08-09       Impact factor: 11.205

4.  TANDEM: matching proteins with tandem mass spectra.

Authors:  Robertson Craig; Ronald C Beavis
Journal:  Bioinformatics       Date:  2004-02-19       Impact factor: 6.937

5.  Open mass spectrometry search algorithm.

Authors:  Lewis Y Geer; Sanford P Markey; Jeffrey A Kowalak; Lukas Wagner; Ming Xu; Dawn M Maynard; Xiaoyu Yang; Wenyao Shi; Stephen H Bryant
Journal:  J Proteome Res       Date:  2004 Sep-Oct       Impact factor: 4.466

6.  Arginine methylation inhibits the binding of proline-rich ligands to Src homology 3, but not WW, domains.

Authors:  M T Bedford; A Frankel; M B Yaffe; S Clarke; P Leder; S Richard
Journal:  J Biol Chem       Date:  2000-05-26       Impact factor: 5.157

7.  High-resolution X-ray and NMR structures of the SMN Tudor domain: conformational variation in the binding site for symmetrically dimethylated arginine residues.

Authors:  Remco Sprangers; Matthew R Groves; Irmgard Sinning; Michael Sattler
Journal:  J Mol Biol       Date:  2003-03-21       Impact factor: 5.469

8.  A simple method for displaying the hydropathic character of a protein.

Authors:  J Kyte; R F Doolittle
Journal:  J Mol Biol       Date:  1982-05-05       Impact factor: 5.469

9.  Arginine methylation of the histone H3 tail impedes effector binding.

Authors:  Aimee N Iberg; Alexsandra Espejo; Donghang Cheng; Daehoon Kim; Jonathan Michaud-Levesque; Stephane Richard; Mark T Bedford
Journal:  J Biol Chem       Date:  2007-12-11       Impact factor: 5.157

10.  Loss of CARM1 results in hypomethylation of thymocyte cyclic AMP-regulated phosphoprotein and deregulated early T cell development.

Authors:  Jeesun Kim; Jaeho Lee; Neelu Yadav; Qi Wu; Carla Carter; Stéphane Richard; Ellen Richie; Mark T Bedford
Journal:  J Biol Chem       Date:  2004-04-19       Impact factor: 5.157

