Maxim I Maron1, Stephanie M Lehman2, Sitaram Gayatri3,4,5, Joseph D DeAngelo1, Subray Hegde1, Benjamin M Lorton1, Yan Sun1, Dina L Bai2, Simone Sidoli1, Varun Gupta6, Matthew R Marunde7, James R Bone7, Zu-Wen Sun7, Mark T Bedford3,4,5, Jeffrey Shabanowitz2, Hongshan Chen1, Donald F Hunt8, David Shechter1. 1. Department of Biochemistry, Albert Einstein College of Medicine, Bronx, NY 10461, USA. 2. Department of Chemistry, University of Virginia, Charlottesville, VA 22904, USA. 3. Department of Epigenetics and Molecular Carcinogenesis, The University of Texas MD Anderson Cancer Center, Smithville, TX, USA. 4. Center for Cancer Epigenetics, The University of Texas MD Anderson Cancer Center, Smithville, TX, USA. 5. Graduate Program in Genetics and Epigenetics, The University of Texas MD Anderson UT Health Graduate School of Biomedical Sciences, Houston, TX 77030, USA. 6. Department of Cell Biology, Albert Einstein College of Medicine, Bronx, NY 10461, USA. 7. EpiCypher, Inc., Research Triangle Park, NC 27709, USA. 8. Departments of Chemistry and Pathology, University of Virginia, Charlottesville, VA 22904, USA.
Abstract
Protein arginine methyltransferases (PRMTs) catalyze the post-translational monomethylation (Rme1), asymmetric (Rme2a), or symmetric (Rme2s) dimethylation of arginine. To determine the cellular consequences of type I (Rme2a) and II (Rme2s) PRMTs, we developed and integrated multiple approaches. First, we determined total cellular dimethylarginine levels, revealing that Rme2s was ∼3% of total Rme2 and that this percentage was dependent upon cell type and PRMT inhibition status. Second, we quantitatively characterized in vitro substrates of the major enzymes and expanded upon PRMT substrate recognition motifs. We also compiled our data with publicly available methylarginine-modified residues into a comprehensive database. Third, we inhibited type I and II PRMTs and performed proteomic and transcriptomic analyses to reveal their phenotypic consequences. These experiments revealed both overlapping and independent PRMT substrates and cellular functions. Overall, this study expands upon PRMT substrate diversity, the arginine methylome, and the complex interplay of type I and II PRMTs.
Methylation, a conserved post-translational modification (PTM), is the enzymatic transfer of a methyl group by a methyltransferase from the donor, S-adenosyl-L-methionine (SAM), to a substrate amino acid. Protein methylation primarily occurs on lysines and arginines (Levy, 2019; Lorton and Shechter, 2019). Arginine methylation is found on diverse substrates and has a significant role in transcription, RNA splicing, and DNA repair (Guccione and Richard, 2019). However, its protein locations and cellular consequences are still incompletely defined.
Vertebrates have nine protein arginine methyltransferases (PRMTs) grouped into three types (Guccione and Richard, 2019; Tewary et al., 2019). Type I, II, and III PRMTs catalyze the monomethylation of the terminal nitrogen of the arginine guanidinium group (Rme1 or MMA) (ω-NG-monomethylarginine). Type I PRMTs (PRMT1,2,3,4,6,8 [PRMT4 is also known as CARM1]) further modify the monomethylated nitrogen to produce an asymmetrically dimethylated arginine residue (Rme2a or ADMA) (ω-NG,NG-asymmetric dimethylarginine). Type II PRMTs (5 and 9) catalyze an additional methylation of the unmodified guanidinium nitrogen, thereby creating a symmetrically dimethylated arginine residue (Rme2s or SDMA) (ω-NG,NG′-symmetric dimethylarginine). PRMT5 is the predominant type II enzyme (Stopa et al., 2015). Importantly, while PRMT1 and PRMT9 have been shown to exhibit semi-processivity (Brown et al., 2018; Gui et al., 2013; Jain et al., 2016; Osborne et al., 2007), PRMT enzymes are generally distributive (Burgos et al., 2015; Hu et al., 2016; Jacques et al., 2016; Lakowski and Frankel, 2008, 2009; Obianyo et al., 2008; Wang et al., 2013, 2014) and are capable of scavenging each other's substrates (Dhar et al., 2013). These observations suggest a complex enzyme/substrate interplay and imply potentially complementary cellular roles.
Both PRMT1 and PRMT5, considered the primary enzymes for Rme2a and Rme2s, respectively, are essential for cell viability and development (Guccione and Richard, 2019).
Many of the known roles for arginine methylation in the cell involve protein binding or DNA and RNA binding (Lorton and Shechter, 2019). Hydrogen bonding, water displacement, and electrostatic interactions govern arginine:RNA interactions (Hofweber and Dormann, 2019). The methylation pattern can direct binding preferences: both the number of methylations and the methylation symmetry are relevant for function (Lorton and Shechter, 2019). Studies on histone arginine methylation highlight the importance of the PTM's geometry. For instance, both histone H3R2 and H4R3 are either symmetrically or asymmetrically modified, leading to inverse cellular transcriptional consequences: H3R2me1, H3R2me2s, and H4R3me2a are “activation” marks (Chen et al., 2017; Migliori et al., 2012; Wang et al., 2001), while H3R2me2a and H4R3me2s are “repressive” marks (Guccione et al., 2007; Hyllus et al., 2007; Zhao et al., 2009). Furthermore, we and others have shown that the WDR5 reader structurally distinguishes between the various methylarginines (Chen et al., 2017; Hyllus et al., 2007; Lorton et al., 2020; Migliori et al., 2012). Therefore, understanding how many proteins are methylated, at what sites, and with which type of arginine methylation is critical for future mechanistic studies.
Efforts to determine the total methylarginine content of cells have employed techniques including protein hydrolysis and high-performance liquid chromatography (Boffa et al., 1977; Bulau et al., 2006; Dhar et al., 2013; Esse et al., 2014; Paik and Kim, 1970) and, more recently, nuclear magnetic resonance (NMR) spectroscopy (Zhang et al., 2021). These studies suggest that approximately 0.5–4% of all arginine residues are methylated in mammalian cells, with the predominant methylarginine species being Rme2a; Rme1 and Rme2s are estimated to be ∼10% of Rme2a.
To characterize PRMT enzymes' substrate motifs, a variety of targeted enzymatic and computational approaches have been used (Gathiaka et al., 2016; Gayatri et al., 2016; Nguyen et al., 2015; Wooderchak et al., 2008). While most studies show that glycine and arginine-rich regions (“GAR” motifs) are common targets for arginine methylation (Branscombe et al., 2001; Dhar et al., 2013), PRMTs have also been demonstrated to methylate arginine residues not flanked by glycine (Hadjikyriacou et al., 2015; Wooderchak et al., 2008). Prior studies revealed that CARM1 prefers “PGM” motifs—sequences enriched for proline, glycine, and methionine (Cheng et al., 2007; Gayatri et al., 2016; Shishkova et al., 2017). Alternatively, PRMT7 has been demonstrated to methylate an “RxR” motif (Feng et al., 2013). Finally, others have tested the transcriptomic and proteomic consequences of inhibiting various PRMTs (Fedoriw et al., 2019; Fong et al., 2019; Li et al., 2021; Musiani et al., 2019; Radzisheuskaya et al., 2019). To characterize the arginine methylome, recent investigations have used a variety of approaches. For example, PTMScan—developed to immunoprecipitate post-translationally modified peptides from proteolyzed lysate (Stokes et al., 2012)—was used in HCT116 colorectal cells to identify methylarginine modified residues (Guo et al., 2014), HEK293 cells to identify PRMT-regulated Rme1 sites (Larsen et al., 2016), and in Toxoplasma to identify PRMT1 substrates (Yakubu et al., 2017). Others have used PTMScan with stable isotope labeling by amino acids in cell culture (SILAC) to determine sites of all three methylarginine states, mostly found to be enriched in RNA processing factors (Fedoriw et al., 2019; Fong et al., 2019; Li et al., 2021; Musiani et al., 2019; Radzisheuskaya et al., 2019). 
More recently, middle-down proteomics coupled with electron-transfer dissociation (ETD) was used to specifically identify arginine methylation in arginine- and serine-rich domains in RNA-binding proteins (Kundinger et al., 2020). Although these studies provide tools and begin to establish a broad role for PRMTs in cellular physiology, more work is still needed to refine our understanding of these enzymes and their interplay.
We set out to provide a larger and more complete picture of the arginine methylome, PRMT substrate motifs, and the PRMT-regulated proteome and transcriptome. To elucidate the diverse cellular consequences of type I and II PRMTs, here we develop and integrate many different techniques, including a new time-efficient, high-resolution, direct-injection-based mass spectrometric analysis of Rme2a and Rme2s; PRMT enzyme peptide substrate array library assays; PTMScan for all three methylarginine states using a decision-tree-based MS/MS approach with combined ETD and HCD (higher-energy collisional dissociation); and transcriptomic, proteomic, and phenotypic profiling. We demonstrate how drugging either type I or type II PRMTs leads to independent consequences. Ultimately, we provide a new and broader view of the role of PRMTs within the cell.
Results
PRMTs are frequently amplified in lung cancer cells
Our goal was to understand the proteomic and transcriptional regulatory roles of the PRMT family. PRMTs together catalyze all three possible methylarginine products (Figure 1A). PRMT1 and PRMT5 are the two most abundant PRMTs. To determine if any PRMTs were mutated or had altered expression in human cancers, we probed the Cancer Genome Atlas. A cBioPortal oncoprint for lung adenocarcinomas (n = 520) revealed that at least one PRMT had an alteration in 45% of all tumors, with the vast majority of these being gene amplifications or transcriptional upregulation (Figure S1A) (Cerami et al., 2012; Gao et al., 2013). Almost no mutations are found in PRMTs in lung cancers or any other cancers we probed (data not shown). A few of the PRMT expression changes are co-enriched, including PRMT1 and PRMT5 (Figure S1B). Indeed, elevated expression of PRMTs is correlated with overall survival hazard ratio (Figures S1C and S1D), further supporting that PRMTs are important in cancer biology.
Figure 1
Total proteome Rme2s and Rme2a fraction
(A) Schematic of the reactions catalyzed by the three types of protein arginine methyltransferases (type I PRMTs catalyze Rme1 and Rme2a; type II PRMTs catalyze Rme1 and Rme2s; type III PRMTs catalyze only Rme1).
(B) Experimental setup: A549 cells were cultured for 1 week with either 0.01% DMSO, 1 μM GSK591, or 1 μM MS023. Total protein lysates (8M Urea) were precipitated with TCA followed by complete hydrolysis in 6M HCl and heat. Resulting amino acid products were diluted in acetonitrile and direct injected onto an Orbitrap Fusion Lumos.
(C) Example MS2 spectra from varying ratios of Rme2a:Rme2s highlighting abundance changes in the unique Rme2s fragment ion. Schematic at top indicates characteristic ions for Rme2s and Rme2a in the MS2.
(D) Standard curve of the change in Rme2s relative to total Rme2 over varying concentrations of Rme2a and Rme2s.
(E) Fraction of Rme2s and Rme2a of total Rme2 for A549 cells treated with either 0.01% DMSO (gray), 1 μM GSK591 (green), or 1 μM MS023 (purple), as well as IMR90 cells (orange) and Xenopus cell-free egg extract (blue). Data are represented as mean ± SD.
To determine how the RNA expression of each PRMT correlated between normal lung tissue, lung tumors, and lung cancer cell lines, we probed the Metabolic Gene Rapid Visualizer (MERAV) database (Figure S1E) (Shaul et al., 2016). We observed large increases in expression for PRMT1 and PRMT5 when comparing normal tissue to tumors and cancer cells. Interestingly, many of the cancer cell lines exhibited even larger increases in expression for most of the PRMTs, consistent with elevated arginine methylation as a driver or key aspect of cell growth and proliferation. For PRMT5, this was consistent with our previous measurement of protein levels across multiple normal cells and cancer cell lines (Chen et al., 2017).
As we had previously used the well characterized A549 lung adenocarcinoma cell line, we tested how PRMT expression in these cells compared with the other cell lines. We observed that expression levels of most of the PRMTs in A549 cells were distributed within one standard deviation of the median, making these an appropriate cell line for further study of PRMT function (Figure S1E, red circles).
Total proteome Rme2a and Rme2s
A long-standing question in PRMT biology is the abundance of the various species of methylarginine within cells. As we were interested in the reciprocity of type I and type II PRMTs, we developed an assay to dissect the relative abundance of Rme2s and Rme2a. To accomplish this, we treated human A549 lung adenocarcinoma cells with 0.01% DMSO (control), 1 μM GSK591 (PRMT5 inhibitor), or 1 μM MS023 (type I PRMT inhibitor) for one week (Figure 1A) (Duncan et al., 2016; Eram et al., 2016). We ensured that the DMSO concentration did not affect cell viability and that neither treatment substantially changed the expression of any of the PRMTs (Figures S2A–S2C). Next, we prepared total cell lysates, precipitated total protein, and hydrolyzed the proteins to their constituent amino acids (Figure 1B). On these samples, we confirmed that PRMT inhibition resulted in the expected methylation changes (Figure S2D). We then used direct-injection mass spectrometry with an Advion TriVersa NanoMate coupled online with a high-resolution Orbitrap Fusion Lumos (Thermo Scientific) (Sun et al., 2021). Direct injection allowed for rapid characterization of (un)modified arginine and minimized potential biases introduced by differential chromatographic retention of the modified versus unmodified species. We discriminated Rme2s from Rme2a abundance by using a distinct Rme2s fragment ion (Figure 1C). By comparing changes in the unique Rme2s fragment ion over varying concentrations of 1H-NMR-calibrated Rme2s and Rme2a, we were able to derive a standard curve used to solve for abundances of these methylarginine species (Figure 1D). We employed this method to determine the relative fraction of Rme2s and Rme2a in each of the PRMT inhibited cell states, as well as in IMR90 fetal lung fibroblasts and Xenopus laevis cell-free egg extract (Figure 1E). This analysis revealed that in A549 cells, in the absence of PRMT inhibition, Rme2s was ∼3.5% of total Rme2.
In contrast, in GSK591-treated A549 cells Rme2s was below the threshold of detection and in MS023-treated cells Rme2s increased to ∼8.5% of total Rme2 (Figure 1E). Furthermore, when compared with untreated A549 cells, we observed that the IMR90 cells had significantly more Rme2s (∼12.5%), and this was even more pronounced in Xenopus eggs (∼25% Rme2s) (Figure 1E). These results are consistent with type II substrate scavenging following inhibition of type I PRMTs and further support the potential importance of Rme2s in early development.
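The standard-curve step described above can be sketched in a few lines. This is a minimal illustration that assumes a linear relationship between the normalized Rme2s diagnostic fragment-ion signal and the known Rme2s fraction of total Rme2; the text does not state the functional form of the curve. The calibration values and the function names `fit_line` and `rme2s_fraction` are hypothetical and are not taken from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def rme2s_fraction(signal, slope, intercept):
    """Invert the calibration line and clamp the result to [0, 1]."""
    fraction = (signal - intercept) / slope
    return min(1.0, max(0.0, fraction))


# Hypothetical calibration mixtures (NOT data from the paper):
# known Rme2s fraction of total Rme2 vs. normalized diagnostic-ion signal.
known_fraction = [0.0, 0.05, 0.10, 0.25, 0.50]
fragment_signal = [0.02, 0.07, 0.12, 0.27, 0.52]

slope, intercept = fit_line(known_fraction, fragment_signal)

# Solve for the Rme2s fraction of an unknown sample from its measured signal.
unknown_signal = 0.055
print(round(rme2s_fraction(unknown_signal, slope, intercept), 4))
```

A signal below the fitted intercept clamps to zero, which loosely mirrors reporting a sample as below the threshold of detection.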
In vitro PRMT substrates
To gain further information about PRMT substrates of the major and most abundant enzymes (PRMT1, PRMT4/CARM1, and PRMT5), we used oriented peptide array libraries (OPALs) and recombinant enzymes. OPAL substrate peptides contained a fixed arginine (R) surrounded by one of any 19 amino acids (cysteine was not included) in fixed positions, with the remainder as a degenerate mix of amino acids (Figure 2A). Next, recombinant human PRMT1, human PRMT4 (CARM1), C. elegans PRMT5, and Xenopus laevis PRMT5-MEP50 were incubated on the OPAL array in the presence of 3H-SAM. Using scintillation counting on the OPAL peptides, we determined relative methyltransferase activity (Figures 2B–2E) (Cornett et al., 2018; Wu et al., 2012). When comparing the probability of amino acid distribution adjacent to the methylarginine, we observed a few striking features: PRMT1 methylated “GR”-rich substrates and accommodated some hydrophobic residues but did not methylate substrates containing acidic residues; PRMT4/CARM1 methylated the known “PR”-rich motifs with additional enrichment of “F/W”-rich motifs and decreased activity toward glycine or acidic residues; both the C. elegans PRMT5—that is active without MEP50/WDR77—and the vertebrate PRMT5-MEP50 complex methylated “GR”-rich substrates and also accommodated acidic aspartic acid residues (Figures 2F–2I, bottom panels). These results revealed previously unknown differences in in vitro PRMT substrates, including the ability of PRMT4 and PRMT5 to tolerate hydrophobic and acidic residues, respectively.
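One plausible way to turn relative activity values into per-position residue probabilities, as displayed in a sequence logo probability plot, is to normalize within each position. The sketch below assumes this simple normalization; the exact procedure used to build the logos is not given in this section. The input numbers and the name `position_probabilities` are hypothetical.

```python
def position_probabilities(activity):
    """Normalize relative activities to probabilities within each position.

    activity maps position (relative to the fixed R) to a dict of
    residue -> non-negative relative activity.
    """
    probabilities = {}
    for position, by_residue in activity.items():
        total = sum(by_residue.values())
        if total == 0:
            # No measurable activity at this position: report zeros.
            probabilities[position] = {aa: 0.0 for aa in by_residue}
        else:
            probabilities[position] = {
                aa: value / total for aa, value in by_residue.items()
            }
    return probabilities


# Hypothetical relative activities (percent of maximum), NOT measured values.
example_activity = {
    -1: {"G": 60.0, "A": 30.0, "D": 10.0},
    +1: {"G": 50.0, "F": 50.0},
}

probs = position_probabilities(example_activity)
for position in sorted(probs):
    print(position, probs[position])
```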
Figure 2
In vitro substrates of the major enzymes PRMT1, PRMT4/CARM1, and PRMT5
(A) Oriented peptide array library (OPAL) substrate degeneracy schematic. As shown, the substrate arginine (R) is fixed as is one other position per mixture (Z) (locations −3, −2, −1, or +1, +2, +3 relative to the fixed R). The remainder of the substrate peptide residues are degenerate (X).
(B) Homo sapiens (Hs) PRMT1 relative activity toward the OPAL substrate library. Each row represents the fixed amino acid in each position. Charged residues are colored (blue = positive, red = negative) and shown at the top. Relative activity is shown as a heatmap (0–100%, white to blue).
(C) HsPRMT4/CARM1 relative activity shown as a heatmap.
(D) C. elegans (Ce) PRMT5 relative activity shown as a heatmap.
(E) X. laevis (Xl) PRMT5-MEP50 complex relative activity shown as a heatmap.
(F) Sequence logo probability plot of PRMT1 relative activity. Acidic residues are shown in red, basic in blue, hydrophobic in black, neutral in purple, and polar residues shown in green.
(G) Sequence logo probability plot of HsPRMT4/CARM1 relative activity.
(H) Sequence logo probability plot of CePRMT5 relative activity.
(I) Sequence logo probability plot of XlPRMT5-MEP50 relative activity.
Establishment of a decision-tree-based mass spectrometric method for identification of methylated peptides
To characterize which proteins are methylated by the family of PRMTs, we turned to the PTMScan approach (Stokes et al., 2012). Following PRMT inhibition with either GSK591 or MS023, we probed A549 cell total lysate with the Cell Signaling Technology (CST) Rme1, Rme2s, and Rme2a antibodies (Figure 3A) (Gayatri et al., 2016). In our studies, we performed triplicate biological replicates in which we digested the total proteome with either trypsin or GluC proteases and isolated the resultant peptides (Figures S3A and 3B). We then performed sequential peptide immunoprecipitations with the Rme1, Rme2a, and Rme2s antibodies (Figure 3B). As described below, both the triplicate input peptides—representing the total proteome in each condition—as well as the immunoprecipitated peptides were subjected to mass spectrometry analysis.
Figure 3
PTMScan of arginine methylated proteins in cells treated with PRMT inhibitors
(A) Total proteome western blots of all three methylarginine states. Using the CST Rme1 (left), Rme2s (center left panel), and Rme2a (center right panel) antibodies, the changes in methylarginine protein abundance are shown for the control (DMSO), GSK591, and MS023 conditions. The right panel shows the Direct Blue 71 (DB71) membrane stain.
(B) Schematic of PTMScan approach. Purified tryptic or GluC peptides—in biological triplicate—were sequentially immunoprecipitated with the CST Rme1 antibodies, Rme2a antibodies, and then Rme2s antibodies. Peptides were eluted and subject to mass spectrometry. A sample of input peptides was reserved for total proteome analysis.
(C) Ratio of charge distribution of methylated (dark) versus non-methylated (light) peptides in either trypsin (top) or GluC (bottom) samples.
(D) Example ETD spectrum of the C-terminal peptide from small nuclear ribonucleoprotein SmD1 (SNRPD1). The peptide fragment from residue 93 to 118 containing 9x dimethylarginines, all site localized. The region from 350 to 750 m/z of the full mass spectrum (left/top) is expanded in the (right/bottom) spectra.
The combination of the multiply modified peptides along with the trypsin and GluC digestions, sequential IPs, and cell treatment resulted in a complicated network of results. Therefore, we developed a decision tree-based mass spectrometry analysis that took advantage of the Thermo Orbitrap Fusion Tribrid's two mass analyzers, multiple fragmentation techniques, and highly customizable method builder. As trypsin preferentially cleaves at unmodified arginine residues, peptides modified with methylarginine were expected to contain more basic residues and consequently higher charge states than the nonmethylated counterparts. This made ETD the more optimal fragmentation method. As such, we relied on ETD as the primary means of fragmentation but also incorporated HCD-based analysis (Figure 3B).
We analyzed the shorter and lower charged peptides with the linear ion trap, while the longer, more complex, higher charged peptides were analyzed with the Orbitrap. This customized method was designed to focus instrument resources such as time and resolution on higher charged, low m/z peptides, which were anticipated to contain arginine methylation. In this approach, which allowed for charge-state and m/z-based MS/MS analysis, both the fragmentation method and spectral acquisition were optimized based on the properties of the peptides (Figure 3B). As methylarginine-containing peptides had higher charge distributions relative to unmethylated peptides, the decision tree-based instrument method improved the detection of more complex peptides (Figure 3C). For instance, for an SmD1 (SNRPD1) C-terminal peptide (aa 93–118) containing 9x modified methylarginines, we first considered both the m/z (418.9755) and the charge (+7) (Figure 3D). Owing to the lower m/z and the higher charge, the peptide was directed to be fragmented by ETD. This resulted in comprehensive fragmentation, which, followed by mass spectrum acquisition with the high-resolution Orbitrap mass analyzer, allowed us to site localize each modification.
In our new approach, mass spectra were searched with a combination of Byonic by Protein Metrics and Proteome Discoverer 2.4 by Thermo Scientific. Proteome Discoverer filtered MS/MS by mass analyzer for database searching. Byonic enabled the database search to contain sufficient missed cleavages resulting from the modified arginine residues, as well as adequate numbers of modifications.
The average total number of peptides identified in each PTMScan IP is shown in Figure S4A. There were a comparable number of peptides resulting from either trypsin or GluC digestion. Across all the IPs, approximately 20–40% of the total peptides contained arginine methylation (Figure S4B).
As expected from the sequential immunoprecipitations, there were about 50% more Rme1-enriched peptides than Rme2s or Rme2a (Figure S4B). The peptide abundances revealed that most peptides exhibited a hybrid state with both a monomethyl- and dimethylarginine (Figure S4C).
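The charge-state and m/z-based routing described above can be illustrated with a small decision function. This is only a sketch of the general idea: the numeric cutoffs (`ETD_MIN_CHARGE`, `ETD_MAX_MZ`, `ORBITRAP_MIN_CHARGE`) are hypothetical placeholders, since the actual thresholds of the instrument method are not given in this section, and the `Precursor` type and `route` function are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Precursor:
    mz: float     # observed mass-to-charge ratio
    charge: int   # positive charge state


# Hypothetical cutoffs, chosen for illustration only.
ETD_MIN_CHARGE = 3         # ETD generally favors higher charge states
ETD_MAX_MZ = 800.0         # ...and lower m/z precursors
ORBITRAP_MIN_CHARGE = 4    # send complex, highly charged peptides to the Orbitrap


def route(precursor):
    """Return (fragmentation, analyzer) for a precursor ion."""
    use_etd = (
        precursor.charge >= ETD_MIN_CHARGE and precursor.mz <= ETD_MAX_MZ
    )
    fragmentation = "ETD" if use_etd else "HCD"
    if precursor.charge >= ORBITRAP_MIN_CHARGE:
        analyzer = "Orbitrap"
    else:
        analyzer = "ion trap"
    return fragmentation, analyzer


# The SmD1 C-terminal peptide from the text: m/z 418.9755, charge +7.
smd1 = Precursor(mz=418.9755, charge=7)
print(route(smd1))

# A hypothetical short, doubly charged peptide.
print(route(Precursor(mz=900.0, charge=2)))
```

Under these assumed cutoffs the SmD1 precursor is routed to ETD with Orbitrap detection, matching the outcome described in the text for that peptide.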
PRMT inhibition results in global transcriptomic and proteomic consequences
Prior to performing an in-depth analysis of the arginine methylome, we wanted to understand both the global proteomic and transcriptomic consequences of PRMT inhibition. To accomplish this, we analyzed the total proteome and also performed random hexamer RNA sequencing following 7-day treatment with either GSK591 or MS023. We found that independent trypsin and GluC digestions allowed us to sample diverse proteins and therefore to increase the resolution of our proteomic analysis we combined both trypsin and GluC results (Figure S4D). When comparing protein-level abundances in GSK591-treated cells, we observed 305 unique proteins with significant changes (P < 0.05) (Figure 4A and Table S1). Within the quadrants, we listed the five most significant down- or up-regulated proteins. Consistent with a role for PRMT5 in the regulation of global gene expression, when looking at the transcriptome, we observed 10,764 differentially expressed transcripts (Padj < 0.05) (Figure 4B and Table S1). We next intersected the significantly affected proteins and transcripts to determine their correlation. Of 305 differentially expressed proteins, 212 had a significant change in transcript abundance. Interestingly, we observed 62 proteins that were upregulated despite having decreased transcript and 23 that were downregulated with increased transcript (Spearman's rank correlation coefficient (ρ) = 0.23) (Figure 4C and Table S1).
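The protein-versus-transcript comparison above rests on two simple computations: Spearman's rank correlation of the paired log2 fold changes, and a count of genes whose protein and transcript move in opposite directions. The sketch below implements both with the standard library only. The fold-change values are hypothetical, and the helper names are illustrative rather than the authors' pipeline.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def discordant_counts(protein_lfc, transcript_lfc):
    """Count (protein up / transcript down, protein down / transcript up)."""
    pairs = list(zip(protein_lfc, transcript_lfc))
    up_down = sum(1 for p, t in pairs if p > 0 and t < 0)
    down_up = sum(1 for p, t in pairs if p < 0 and t > 0)
    return up_down, down_up


# Hypothetical log2 fold changes for four genes (NOT data from the paper).
protein = [1.2, -0.8, 0.5, -1.5]
transcript = [0.9, -0.2, -0.4, 0.7]

print(spearman_rho(protein, transcript))
print(discordant_counts(protein, transcript))
```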
Figure 4
Proteomic and Transcriptomic analysis reveals robust PRMT-inhibitor dependent changes
(A) Volcano plot of combined trypsin (circle) and GluC (square) derived total protein changes in GSK591-treated A549 cells relative to DMSO where x axis is log2(fold change relative to DMSO); y axis is -log2(P) (dashed y axis line represents P = 0.05; significant values are green). The top 5 most significant proteins are listed in the upper quadrants.
(B) Volcano plot of transcriptomic changes in GSK591-treated A549 cells relative to DMSO where x axis is log2(fold change relative to DMSO); y axis is -log10(Padj) (dashed y axis line represents Padj = 0.05; significant values are green). The top 5 most significant transcripts are listed in the upper quadrants.
(C) Comparison of common significant transcript and protein log2(fold change relative to DMSO) for GSK591-treated cells. The top 5 most significant genes are listed in their respective quadrants.
(D) Volcano plot of combined trypsin (circle) and GluC (square) derived total protein changes in MS023-treated A549 cells relative to DMSO where x axis is log2(fold change relative to DMSO); y axis is -log2(P) (dashed y axis line represents P = 0.05; significant values are purple). The top 5 most significant proteins are listed in the upper quadrants.
(E) Volcano plot of transcriptomic changes in MS023-treated A549 cells relative to DMSO where x axis is log2(fold change relative to DMSO); y axis is -log10(Padj) (dashed y axis line represents Padj = 0.05; significant values are purple). The top 5 most significant transcripts are listed in the upper quadrants.
(F) Comparison of common significant transcript and protein log2(fold change relative to DMSO) for MS023-treated cells. The top 5 most significant genes are listed in their respective quadrants.
(G) Comparison of common significant protein log2(fold change relative to DMSO) for GSK591- and MS023-treated cells. The top 5 most significant proteins are listed in their respective quadrants.
(H) Comparison of common significant transcript log2(fold change relative to DMSO) for GSK591- and MS023-treated cells. The top 5 most significant transcripts are listed in their respective quadrants.
(I) Over-representation analysis for Cellular Component of the top 300 most significant differentially abundant proteins in either GSK591- or MS023-treated cells. Circle size is proportional to the Gene Ratio, while color denotes significance (orange is more significant, purple is less significant).
(J) Over-representation analysis for Cellular Component of the top 300 most significant differentially abundant transcripts in either GSK591- or MS023-treated cells. Circle size is proportional to the Gene Ratio, while color denotes significance (orange is more significant, purple is less significant).
We performed the same analysis with MS023-treated cells. MS023 resulted in differential expression of 499 proteins (P < 0.05) (Figure 4D). Similar to PRMT5 inhibition, perturbation of type I PRMTs promoted gross transcriptomic changes, with 8,685 differentially expressed transcripts (Padj < 0.05) (Figure 4E). When compared with the proteome, there were 307 genes with altered protein and transcript abundance (ρ = 0.46). Of those, 55 proteins were increased despite a decrease in transcript while 20 decreased with increased transcript (Figure 4F).
Next, we asked how the proteins with differential expression compared between GSK591 and MS023. There were 116 proteins in common between both conditions and these had a strongly significant correlation in their differential expression (ρ = 0.93) (Figure 4G). Despite this correlation, most proteins were not coregulated between treatments: 189 and 383 proteins were unique to either GSK591 or MS023, respectively (Figure S5A).
We performed a parallel analysis with the transcriptome and observed a much poorer correlation between either inhibitor treatment (ρ = 0.19) (Figure 4H). There were 5,663 common transcripts to GSK591 and MS023 with 2,134 having inverse expression changes. We confirmed these results on select genes whose expression was either dependent on the type of PRMT inhibition or not changed to validate by RT-qPCR (Figure S5B). Taken together, these results support that PRMT5 and type I PRMTs have both mutual and independent roles in the global regulation of gene expression.To gain an understanding of what pathways PRMTs regulate, we performed over representation analysis on both the proteome and transcriptome with either GSK591 or MS023 treatment (Figures 4I and 4J). Despite the weak correlation of the transcriptome and proteome, many of the represented ontologies were similar. With GSK591 treatment—in both the proteome and transcriptome—we observed an enrichment of terms related to chromatin, RNA processing, and translation (Figures 4I and 4J). In the transcriptome specifically, we also noted an enrichment of cytoskeletal and extracellular matrix related pathways. In both the transcriptome and proteome of MS023-treated cells, we observed a strong enrichment of pathways pertaining to the cytoskeleton and extracellular matrix (Figures 4I and 4J). These results support that—despite the limited correlation of the transcriptome and proteome—the most overrepresented pathways affected by PRMT inhibition are highly coincident. Furthermore, as most ontologies were unique to either inhibitor, this is consistent with PRMT5 and type I PRMTs having independent roles in cellular homeostasis.
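As an illustration of the statistic that commonly underlies over-representation analysis, the sketch below computes an upper-tail hypergeometric P value for one ontology term, together with a gene ratio (assumed here to be the fraction of the hit list annotated to the term). The function names and all numbers are hypothetical and are not taken from this study; the software and background set actually used may differ.

```python
from math import comb


def ora_pvalue(universe, term_size, query_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe   -- number of background genes/proteins (assumed input)
    term_size  -- background members annotated to the ontology term
    query_size -- size of the hit list (e.g., the top 300 features)
    overlap    -- hit-list members annotated to the term
    """
    total = comb(universe, query_size)
    upper = min(term_size, query_size)
    tail = sum(
        comb(term_size, i) * comb(universe - term_size, query_size - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


def gene_ratio(overlap, query_size):
    """Fraction of the hit list annotated to the term."""
    return overlap / query_size


# Hypothetical numbers, for illustration only (not taken from the study).
p = ora_pvalue(universe=12000, term_size=400, query_size=300, overlap=30)
print(f"gene ratio = {gene_ratio(30, 300):.2f}, p = {p:.3g}")
```

In practice, P values computed this way for many terms would also need correction for multiple testing, which this sketch omits.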
Type I PRMTs and PRMT5 promote contrasting changes in cellular phenotype
As many of the transcripts and proteins that were affected by either GSK591 or MS023 treatment are involved in the cytoskeleton (Dystonin and Tight Junction Protein 3 were among the top 5 most upregulated proteins in either condition), we tested whether there were clear morphological and phenotypic differences in A549 cells following treatment with either inhibitor. To accomplish this, we treated A549 cells for one week and then performed phalloidin staining to capture their phenotype (Figure 5A). Strikingly, we observed opposing gross morphological changes with GSK591 and MS023 treatment. Although A549 cells treated with DMSO appeared stellate and maintained moderate intercellular spacing, GSK591 treatment resulted in large, outstretched cells. In contrast, MS023 led to the clustering of cells with reduced intercellular space. When cell area was measured, the GSK591-treated cells had an increased relative cell area (77.6 ± 8.66; P < 0.001), whereas DMSO- and MS023-treated cells had comparable relative cell areas (15.5 ± 1.30 and 15.2 ± 1.50, respectively) (Figure 5B). DAPI staining also indicated that GSK591 led to a higher frequency of multinucleate cells (1.29 ± 0.05) compared with either DMSO (1.04 ± 0.02; P < 0.001) or MS023 (1.12 ± 0.04; P = 0.005) (Figure 5C).
Figure 5
Phenotypic consequences of type I PRMT- or PRMT5-inhibition
(A) Rhodamine phalloidin (red) and DAPI (blue) staining of A549 cells treated with 0.01% DMSO, 1 μM GSK591, or 1 μM MS023 for 7 days. Scale bar is 50 μm.
(B) Cell size analysis of A549 cells treated with 0.01% DMSO (gray), 1 μM GSK591 (green), or 1 μM MS023 (purple) for 7 days. Y axis denotes relative cell area. Data are represented as mean ± SD.
(C) Nuclei per cell analysis of A549 cells treated with 0.01% DMSO (gray), 1 μM GSK591 (green), or 1 μM MS023 (purple) for 7 days. Y axis denotes observed cells.
(D) Migration assay of A549 cells treated with 0.01% DMSO (gray), 1 μM GSK591 (green), or 1 μM MS023 (purple) for 7 days. Micrographs of crystal violet stained (purple) cells (left); scale bar is 50 pixels. Quantitation of successfully migrating cells relative to control (right). Data are represented as mean ± SD.
(E) Invasion assay of A549 cells treated with 0.01% DMSO (gray), 1 μM GSK591 (green), or 1 μM MS023 (purple) for 7 days. Micrographs of crystal violet stained (purple) cells (left); scale bar is 50 pixels. Quantitation of successfully invading cells relative to control (right). Data are represented as mean ± SD.
(F) BLISS synergy and antagonism score for dose response matrix of cells treated with GSK591 (y axis) or MS023 (x axis) for 7 days (blue represents increased synergy; red represents increased antagonism).
(G) Combenefit analysis for drug synergy with GSK591 and MS023 using Loewe, BLISS, and HSA models (positive more synergistic; negative more antagonistic).
To further test the biological consequences of PRMT inhibition, we performed migration and Matrigel invasion assays (Figures 5D and 5E) (Chen et al., 2017). We observed a severely reduced ability of A549 cells to migrate toward media containing serum following starvation relative to DMSO when treated with either GSK591 (13.3 ± 1.8%; P < 0.001) or MS023 (42.7 ± 4.8%; P < 0.001). This is consistent with our previous tests using shRNA targeting PRMT5 or GSK591 at 500 nM for 4 days (Chen et al., 2017). We observed a similar phenotype with MS023 treatment, supporting that type I PRMTs are also necessary for invasive cellular phenotypes, although the effect of PRMT5 inhibition was greater than that of MS023 (P < 0.01).

Since type I PRMTs and PRMT5 promote opposing phenotypic changes while causing similar migration and invasion phenotypes, we performed drug synergy screens to further define the enzymes' cellular functions. Recent publications have used the power of chemical synergy and antagonism screening to identify interacting molecular pathways (Barretina et al., 2012; Dietlein et al., 2015; Kryukov et al., 2016). Co-inhibition of type I PRMTs and PRMT5 has also been demonstrated as a successful strategy for killing tumor cells (Fedoriw et al., 2019; Gao et al., 2019). We reasoned that we could use this approach to test and confirm the independent or complementary roles of each PRMT family. For these studies, we established a dose matrix and systematically tested combinations of inhibitors on cell viability using an MTS redox assay. We then tested the combination of GSK591 and MS023 at varying concentrations and determined a broad range of concentrations of either inhibitor that led to increased cell death (Figure 5F). Strong synergy between inhibitors was supported in all three models used for analysis: Loewe, BLISS, and HSA (67.8, 57.7, and 78.4, respectively) (Figure 5G) (Di Veroli et al., 2016). Taken together, these data reinforce that type I PRMTs and PRMT5 maintain independent roles in regulating cellular viability.
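For readers unfamiliar with the synergy models named above, the following minimal sketch shows how Bliss independence and highest single agent (HSA) reference effects are defined for a single dose pair. It is not the Combenefit implementation, which scores a full dose-response surface, and it omits the Loewe model because that model requires fitted single-agent dose-response curves. The input values are hypothetical fractional losses of viability.

```python
def bliss_expected(fa, fb):
    """Expected combined effect if the two drugs act independently."""
    return fa + fb - fa * fb


def hsa_expected(fa, fb):
    """Highest single agent: the larger effect of either drug alone."""
    return max(fa, fb)


def excess(observed, expected):
    """Positive values suggest synergy; negative values suggest antagonism."""
    return observed - expected


# Hypothetical fractional losses of viability (0 = no effect, 1 = complete).
fa, fb, observed = 0.30, 0.40, 0.70
print(round(excess(observed, bliss_expected(fa, fb)), 2))  # Bliss excess
print(round(excess(observed, hsa_expected(fa, fb)), 2))    # HSA excess
```

Evaluating the excess at every cell of a dose matrix gives a synergy map of the kind summarized in Figure 5F, under the assumptions stated here.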
Arginine methylation is enriched on a diverse array of substrates
As we had established that type I PRMTs and PRMT5 maintain independent control over both the proteome and transcriptome (consistent with the phenotypic differences and high level of synergy seen with GSK591 and MS023), we next sought to characterize methylarginine level changes that may be mediating these robust cellular consequences. Using our novel decision tree-based MS/MS approach to identify highly charged and redundant arginine methylated peptides (Figure 3B), in addition to independent trypsin and GluC digestions, we observed 2,444 unique methylations (Table S2). Of these, 666 were only modified with Rme1, 548 only Rme2, and 615 with both (Figure S4E). We then determined the sequence distributions of the flanking 10 amino acids of all methylated arginines, as well as those modified with either Rme1 or Rme2 (Figures S6A–S6C). When looking at all methylarginines in our data, the consensus sequence was consistent with the well-established “GAR” motif, although there was an increased probability of a downstream glycine compared with an upstream one. We also noted a high probability of adjacent proline, serine, and aspartic acid residues.

Next, we wanted to understand which proteins contained the most methylarginine. We identified the RNA-binding proteins TAF15 (36), EWSR1 (34), and FUS (29) as the three most heavily modified methylarginine-containing proteins (Figure 6A). Together referred to as the FET family, these proteins are frequently rearranged and aberrantly expressed in sarcomas and hematopoietic cancers (Kovar, 2011). This result highlights the FET family, as well as other heavily modified methylarginine-containing proteins, as potentially important PRMT targets in cancer biology.
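The flanking-sequence analysis described above can be pictured with a short sketch: for each methylated arginine, take a fixed window on either side and tally residues by position. The helper names, the padding character, and the toy sequence below are assumptions made for illustration; they do not reproduce the tools used to generate Figures S6A–S6C.

```python
from collections import Counter


def flank(sequence, site, width=10, pad="-"):
    """Return `width` residues on each side of a 1-based site, padded at termini."""
    idx = site - 1  # convert to a 0-based index
    if not 0 <= idx < len(sequence):
        raise ValueError("site lies outside the sequence")
    left = sequence[max(0, idx - width):idx]
    right = sequence[idx + 1:idx + 1 + width]
    return left.rjust(width, pad) + sequence[idx] + right.ljust(width, pad)


def position_counts(windows):
    """Count residues at each window position (one Counter per column)."""
    return [Counter(column) for column in zip(*windows)]


# Hypothetical glycine/arginine-rich toy sequence with sites at R3 and R9.
toy = "GGRGGFGGRGG"
windows = [flank(toy, site, width=2) for site in (3, 9)]
counts = position_counts(windows)
print(windows, counts[2])
```

Dividing each column's counts by the number of windows gives the per-position residue frequencies that a sequence logo displays.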
Figure 6
PTMScan protein and residue level analysis reveals PRMT-inhibitor dependent changes in arginine methylation
(A) Number of unique methylarginine residues per protein (x axis) versus the number of proteins (y axis).
(B) Intersection of proteins with significant differential expression and methylarginine abundance in A549 cells treated with either GSK591 (green) or MS023 (purple) relative to DMSO.
(C–E) Volcano plots of combined trypsin (circle) and GluC (square) monomethylarginine (Rme1) (C), asymmetric dimethylarginine (Rme2a) (D), and symmetric dimethylarginine (Rme2s) (E) peptide enrichments for GSK591 (left, green) and MS023 (right, purple) treated cells where x axis is log2(fold change relative to DMSO); y axis is -log2(P) (dashed y axis line represents P = 0.05). The top 5 most significant proteins are listed in the upper quadrants.
(F) Over-representation analysis for Biological Process (BP), Molecular Function (MF), and Cellular Component (CC) ontologies of the significant differentially abundant protein enrichments according to their IP in either GSK591- or MS023-treated cells. Circle size is proportional to the Protein Ratio, while color denotes significance (orange is more significant, purple is less significant).
Given the robust impact PRMT inhibition had on the proteome, we wanted to compare how many of the proteins that changed in abundance also had significant changes in arginine methylation. Surprisingly, we observed that most proteins that had significant changes in arginine methylation did not have altered abundance (Figure 6B). Furthermore, when looking specifically at proteins that had both altered methylarginine levels and abundances, we noted that factors that contained higher amounts of methylarginine and were involved in RNA processing (such as SNRPB and TAF15 in GSK591 or FUS, YBX3, and RBMX in MS023) were decreased.

Next, using IP enrichment, we analyzed GSK591- or MS023-mediated differences in methylated peptides (Figures 6C–6E, Rme1, Rme2a, Rme2s IPs, respectively). Because trypsin- and GluC-based digestions yield distinct peptides, we combined both data sets to increase our resolution of methylarginine-containing residues (Figure S4F). As expected from the Western blots, there were fewer overall peptide level abundance changes with GSK591 treatment (Figures 6C–6E, left green plots) with a clear decrease in Rme2s-enriched peptides. In contrast, MS023 treatment (purple) resulted in increases in both Rme1- and Rme2s-containing peptides. The top five most significant peptides are listed in the upper quadrants. Specific methylation-site level changes are shown in Table S3.

To understand the classes of proteins that are mono- or dimethylated, we performed over-representation analysis on each class of enriched peptides (Figure 6F). Consistent with prior studies, the methylated-protein ontologies were primarily related to nuclear function, transcription, splicing, and translation. However, each methylated state also showed distinct ontological representation. For instance, Rme2s-containing peptides were specifically enriched in DNA replication and helicase activity, localization to Cajal bodies, the methylosome, and SMN-Sm assembly (Figure 6F). In contrast, Rme2a was associated with nuclear export, ribonucleoprotein localization, and exosome activity (Figure 6F). Rme1 was found to have unique ontologies as well, including RNA helicase activity and cell-cell adhesion-related terms (Figure 6F). In sum, these results support that methylarginine states have a broad overlap in RNA processing and translation but also maintain specific, non-overlapping roles within the cell.
Methylarginine is enriched in charged, disordered, and LLPS susceptible proteins
To better understand the features of proteins that are methylated, we first characterized the entire human proteome in terms of molecular weight, hydrophobicity, isoelectric point, and predicted intrinsic disorder. We probed 56,392 human Uniprot sequences, covering most known human protein variants. For each sequence, we also calculated the molecular weight, the predicted hydrophobicity as determined by GRAVY (Kyte and Doolittle, 1982), and the predicted isoelectric point as calculated at isoelectricpointdb.org (Kozlowski, 2016). We plotted the percent disorder, as calculated by RAPID (Yan et al., 2013), versus these values to illustrate the characteristics of the human proteome (Figures 7A–7C and Table S4). The median predicted disorder was 18.1%, the median molecular weight was 31.4 kDa, the median hydrophobicity was −0.37, and the median isoelectric point was 7.03. These plots gave us a unique perspective on the proteome: proteins of greater predicted intrinsic disorder are smaller than the median, substantially less hydrophobic, and more highly charged.
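As a concrete reference for the hydrophobicity metric used above, GRAVY is the mean Kyte-Doolittle hydropathy value over all residues of a sequence. The sketch below is a minimal standalone version; the function name and the toy peptide are illustrative assumptions, and the study's own calculation may have handled non-standard residues differently.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean hydropathy of a protein sequence (positive = hydrophobic)."""
    seq = sequence.upper()
    unknown = set(seq) - set(KD)
    if not seq or unknown:
        # This sketch rejects empty input and non-standard residues outright.
        raise ValueError(f"cannot score sequence; unknown residues: {sorted(unknown)}")
    return sum(KD[aa] for aa in seq) / len(seq)


# Hypothetical glycine/arginine-rich toy peptide, for illustration only.
print(round(gravy("GRGGRGG"), 2))
```

A glycine- and arginine-rich stretch scores well below zero, which matches the direction of the trend noted above for highly disordered proteins.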
Figure 7
Proteome characteristics reveal the nature of the PTMScan arginine methylome
(A) Comparison of RAPID predicted intrinsic disorder percentage on x axis and the log10(molecular weight (Da)) on the y axis for 56,392 human proteins (Uniprot, 2012). Vertical dashed line denotes the median intrinsic disorder (18.1%); horizontal dashed line denotes median molecular weight (31.4 kDa).
(B) Comparison of RAPID predicted intrinsic disorder (x axis) and isoelectric point (y axis). Vertical dashed line denotes the median intrinsic disorder (18.1%); horizontal dashed line denotes median isoelectric point (7.03).
(C) Comparison of RAPID predicted intrinsic disorder (x axis) and hydrophobicity as calculated by GRAVY (y axis). Positive scores are hydrophobic, while negative scores are hydrophilic. Vertical dashed line denotes the median intrinsic disorder (18.1%); horizontal dashed line denotes median hydrophobicity (−0.37).
(D) RAPID percent disorder distribution of the proteome, PDB, Nucleus, RNA-binding, chromatin, and methylarginine (orange) sets. Vertical dashed line denotes the proteomic median intrinsic disorder (18.1%); solid line within individual plots denotes median, while dashed lines denote quartiles.
(E) The molecular weight distributions of the proteome, PDB, Nucleus, RNA-binding, chromatin, and methylarginine (orange) sets are shown as violin plots as in (D).
(F) The isoelectric point distributions of the proteome, PDB, Nucleus, RNA-binding, chromatin, and methylarginine (orange) sets are shown as violin plots as in (D).
(G) The GRAVY hydrophobicity distributions of the proteome, PDB, Nucleus, RNA-binding, chromatin, and methylarginine (orange) sets are shown as violin plots as in (D).
(H) Table showing the number of proteins in each set.
(I) Venn diagram showing the intersection between the PTMScan methylarginine-containing human proteins and those previously identified to be bound to RNA using RBR-ID.
(J) PScore distribution—indicating pi-pi mediated liquid-liquid phase separation (LLPS) propensity—for the proteome, PDB, Nucleus, RNA-binding, chromatin, and methylarginine (orange) sets. Vertical dashed line denotes the proteomic median PScore (0.69); solid line within individual plots denotes median, while dashed lines denote quartiles.
(K) Percent of residues found in intrinsically disordered regions (light) or non-disordered regions (dark) for Rme1 (86.2%) and Rme2 (85.2%).
For comparison with all 585 identified proteins containing methylarginine, we used Uniprot classifications to categorize human proteins, including proteins with solved PDB structures, those annotated to be found in the nucleus, those known to be involved in RNA binding, and proteins found associated with chromatin. In Figures 7D–7H, we compared these distributions. The methylarginine proteins had the largest degree of predicted disorder but were substantially larger in MW and less hydrophobic than most. Importantly, methylarginine-containing proteins had a wide charge distribution, consistent with enrichment of basic patches (e.g., “GAR”) and neighboring acidic stretches. To further confirm the likelihood that these methylated proteins are involved in RNA binding, we intersected the proteins with a set of human proteins positively identified through RBR-ID as interacting with RNA (He et al., 2016). As shown in Figure 7I, 346 of the proteins intersected. These observations are all consistent with arginine methylation having a major role in regulating RNA-based interactions.

As methylarginine has been hypothesized to be involved in liquid-liquid phase separation (LLPS), we further tested properties of the methylated proteins. First, we used the PScore algorithm to predict LLPS propensity; this algorithm specifically probes putative pi-pi contacts, the kind that may be directly influenced by arginine methylation (Vernon et al., 2018). As shown in Figure 7J, the methylarginine-containing proteins had a higher PScore and therefore a higher predicted propensity for LLPS.

Next, as many LLPS and nucleic acid interactions are mediated through intrinsically disordered regions (IDRs), we asked if the identified methylarginine sites were embedded in IDRs. We probed the MobiDB database (Piovesan et al., 2017) (https://mobidb.bio.unipd.it/, downloaded February 2020) of IDRs and interrogated whether each site was in an annotated IDR. As shown in Figure 7K, ∼86% of the identified sites in all three IPs were found in IDRs. We compared the sequence distributions of methylarginines in IDRs to those not in IDRs. We observed that the methylarginines within IDRs were enriched in glycine, proline, and serine residues, representing the canonical “GAR” motif, while non-IDR methylarginine had a higher probability of adjacent hydrophobic leucine residues (Figures S6D and S6E).

Lastly, to further expand our knowledge of methylarginine-containing proteins and the residues which are specifically modified, we compiled previously published, publicly available methylarginine datasets (Table S5) (Fedoriw et al., 2019; Fong et al., 2019; Guo et al., 2014; Hornbeck et al., 2015; Larsen et al., 2016; Li et al., 2021; Lim et al., 2020; Musiani et al., 2019; Wei et al., 2020). Together, these data sets, including our own, contained 5,255 unique methylarginine-modified proteins with 15,386 independent methylarginine residues (Figures S7A and S7B). Using our decision-based MS/MS approach, we identified 630 novel methylarginine sites and 106 previously unidentified methylarginine-containing proteins. Furthermore, when cross-referencing our compiled database of published methylarginines with those that had correctly aligned sequences in the corresponding MobiDB dataset, we noted that ∼55% of all methylarginines were contained within a disordered region. This is considerably less than the results reported by our data set (likely owing to the sequence bias inherent to antibody-based enrichment) yet significantly greater than the median disorder of the human proteome (Figure 7D). The amino acid sequence distributions of methylarginines contained within disordered (n = 7,300) and structured regions (n = 5,937) varied significantly. Methylarginine contained within disordered regions was enriched for adjacent glycine, proline, and serine, consistent with the majority of published methylarginine motifs (Figure S7C). In contrast, methylarginines located in structured domains had a higher incidence of neighboring leucine and alanine, as well as aspartic acid and glutamate (Figure S7D). Taken together, these results support that methylarginine decorates a diverse array of proteins, is typically enriched in IDRs, and that the sequence motifs of methylarginine in IDRs differ from those in non-IDRs.
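The site-in-IDR lookup described in this section amounts to an interval membership test. The sketch below assumes a simplified, hypothetical annotation format (a dictionary mapping each protein to 1-based, inclusive disordered intervals); it does not reflect the actual MobiDB file layout, and the protein names and coordinates are invented for illustration.

```python
def in_idr(site, regions):
    """True if a 1-based residue position lies in any (start, end) region."""
    return any(start <= site <= end for start, end in regions)


def fraction_in_idr(sites, idr_regions):
    """Fraction of (protein, position) sites that fall inside annotated IDRs.

    Sites on proteins without any annotation are skipped rather than counted
    as structured (a design choice of this sketch).
    """
    annotated = [(prot, pos) for prot, pos in sites if prot in idr_regions]
    if not annotated:
        return None
    hits = sum(in_idr(pos, idr_regions[prot]) for prot, pos in annotated)
    return hits / len(annotated)


# Hypothetical annotation and sites, for illustration only.
idr_regions = {"PROT_A": [(1, 50), (120, 200)], "PROT_B": [(10, 30)]}
sites = [("PROT_A", 25), ("PROT_A", 80), ("PROT_B", 12), ("PROT_C", 5)]
print(fraction_in_idr(sites, idr_regions))
```

Applied to the counts reported above for the compiled database, the same ratio is 7,300 / (7,300 + 5,937), or about 55%.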
FUS and TAF15 are inversely regulated by type I PRMTs and PRMT5
Now that we had built a greater understanding of the diversity of methylarginine-containing proteins and their characteristics, we wanted to dissect another phenomenon apparent in our data: substrate scavenging. As such, we selected two proteins predicted to be highly disordered, containing numerous methylarginines, and with differential enrichment in the methylarginine IPs: FUS and TAF15. We then mapped the normalized abundance of Rme1 and Rme2 detected at each residue and overlaid this with the predicted intrinsic disorder using the DISOPRED3 algorithm (Jones and Cozzetto, 2015) (Figure 8A). We found that the majority of methylarginines were enriched in IDRs, although there did appear to be a patch of methylarginine enriched in a region predicted to be structured in FUS (R216, R218, R234, R242, R244, R248, R251, and R259). Of note, this same region was predicted to be disordered in the MobiDB data set; as the MobiDB data set is a compendium of many algorithms, this illustrates discrepancies between IDR predictors. We also observed differences in the location of Rme1 and Rme2 in both FUS and TAF15 (Figure 8A).
Figure 8
PRMT inhibition promotes substrate scavenging of FUS and TAF15
(A) Rme1 and Rme2s abundance in A549 cells (lollipop height and size) juxtaposed with DISOPRED3 predicted intrinsic disorder for FUS and TAF15 (white less disordered; black more disordered).
(B) Comparison of significant GSK591- or MS023-dependent changes in Rme1 (yellow) or Rme2 (burgundy) relative to DMSO on FUS and TAF15 where each circle represents a unique residue. Circle size is proportional to -log2(P).
(C) FUS and TAF15 co-immunoprecipitation blotted for each protein and methylarginine state (Rme1, Rme2s, Rme2a, as indicated). Direct Blue 71 (DB71) total protein membrane stain is at the bottom.
Next, we looked at the residue level abundance changes in either Rme1 or Rme2 (Figure 8B). We found that in the presence of GSK591, FUS had increased Rme1 and Rme2 while TAF15 had decreased Rme1 and Rme2. When analyzing MS023-treated cells, we observed that there were many more residues with significant differences in Rme1 or Rme2 abundance. In FUS, there was a large increase in Rme1 and Rme2. In TAF15, by contrast, there was an increase in Rme1 with more balanced changes in Rme2.

To confirm these results on whole protein, after a 7-day treatment of A549 cells with either GSK591 or MS023, we performed a co-immunoprecipitation of both FUS and TAF15 (Figure 8C). Consistent with the changes detected in the total proteome analysis, we observed a decrease in FUS in MS023, while TAF15 was decreased in GSK591. Furthermore, we observed that TAF15 co-immunoprecipitated with FUS and, in support of different types of PRMTs promoting inverse consequences, co-immunoprecipitation of FUS and TAF15 was decreased in the presence of GSK591 and increased with MS023. When looking specifically at different methylarginine states, we observed, consistent with PRMT-substrate scavenging, that with MS023 treatment Rme2a on FUS was decreased while both Rme2s and Rme1 were increased. On TAF15, Rme2a was lost with a corresponding increase in Rme1. In the presence of GSK591, there were no observable changes in FUS methylation; however, TAF15 had a strong decrease in Rme1. Taken together, these results provide further evidence for PRMT-substrate scavenging of RNA-binding, intrinsically disordered proteins.
Discussion
In this study, our goal was to provide a diverse and comprehensive understanding of the consequences of PRMT activity on the transcriptome, proteome, and phenotype of human cells. As highly effective and specific chemical probes have been developed for the main classes of PRMTs, to test cellular consequences of the loss of these enzymes we used MS023 (a general inhibitor of type I enzymes) and GSK591 (a specific PRMT5 inhibitor, also known as EPZ015866 or GSK3203591) (Duncan et al., 2015; Eram et al., 2016). We developed a high-resolution, rapid, direct-injection-based mass spectrometric method for characterizing Rme2s and Rme2a relative abundance, as well as a label-free methodology for PTMScan peptide immunoaffinity analysis. In parallel, to determine PRMT substrate specificity, we utilized a new in vitro methyltransferase assay using degenerate peptide substrates. Finally, to understand how PRMT activity regulates the transcriptome, proteome, and cellular phenotype, we performed RNA seq, total proteome analysis, and characterized cellular morphology and viability in control and drug-treated conditions. We also specifically probed PRMT-mediated substrate scavenging of the most heavily arginine methylated proteins in our data, TAF15 and FUS. Altogether, our work serves to comprehensively integrate an understanding of PRMT substrates with the transcriptomic, proteomic, and phenotypic consequences of their activity.Multiple approaches have been previously used to characterize total proteome arginine methylation; most relied upon hydrolyzed-protein reversed-phase chromatography coupled with standards and peak integration (Boffa et al., 1977; Bulau et al., 2006; Dhar et al., 2013; Esse et al., 2014; Paik and Kim, 1970). More recent advances have employed the use of highly sensitive NMR (Zhang et al., 2021). Here, after total protein acid hydrolysis to produce constituent amino acids, we developed a new procedure. 
To characterize the relative abundances of Rme2s and Rme2a more rapidly and precisely, we used direct-injection mass spectrometry with the high-resolution Orbitrap Fusion Lumos. Using this new approach, we demonstrated that Rme2a was approximately 20-fold higher than Rme2s in A549 lung adenocarcinoma cells and that Rme2s was ∼25% of total Rme2 in Xenopus cell-free egg extract. Most surprising was that although Rme2a decreased with MS023, this methylarginine species was still highly abundant. Since we used both drugs at relatively high concentrations (10-100×) compared with their IC50s, there are several possible explanations for these observations. As the half-life of methylarginine is long, it is possible that even after one week of treatment some residual long-lived methylation may be present (Barth and Imhof, 2010; Zee et al., 2010). Future studies will need to use SILAC-type approaches to determine methylarginine half-lives (Zee et al., 2010). Alternatively, other PRMTs with lower sensitivity to MS023, such as PRMT2 and PRMT3, may produce these methylations. Importantly, the large and poorly documented family of methyltransferase-like (METTL) SAM-dependent methyltransferases, recently demonstrated to catalyze histone arginine methylation (Hatanaka et al., 2017), may also compensate for MS023-mediated type I PRMT inhibition. Future studies are necessary to test this possibility.
Our studies also included an advance of the previously performed OPAL technology. In this work, we used more quantitative FlashPlate scintillation counting of PRMT activity toward immobilized degenerate peptides (Cornett et al., 2018; Wu et al., 2012). As these peptides contained a fixed central arginine substrate along with a single other fixed amino acid position (excluding cysteine), we validated and solidified in vitro substrates for the three most abundant enzymes: PRMT1, PRMT4/CARM1, and PRMT5. 
Our results showed wider sequence determinants for PRMT1 and PRMT5 than previously known, including some likely modulatory consequences of the MEP50 substrate presenter (Burgos et al., 2015; Ho et al., 2013). Most strikingly, CARM1 showed remarkably distinct substrates, with increased activity in the presence of neighboring hydrophobic residues. While this was previously hinted at (Gayatri et al., 2016; Shishkova et al., 2017), our evidence should prompt a deeper search for unique CARM1 substrates in cells. Considering our analysis of IDR- and non-IDR-localized methylarginine, in which non-IDR methylarginine was enriched for adjacent hydrophobic residues, it is interesting to speculate that CARM1 may preferentially methylate arginines located in more structured regions (Price and Hevel, 2020).
To characterize the cellular PRMT substrates, we employed and advanced the widely used PTMScan approach (Stokes et al., 2012). This approach relies on proteolysis of the proteome to ensure soluble peptides for targeted immunoprecipitation. Uniquely, we performed this experiment in three conditions with three targeted antigens: in DMSO-, GSK591-, or MS023-treated cells with antibodies against Rme1, Rme2a, and Rme2s. To make the experiment time- and cost-effective, we performed successive immunoprecipitations. We developed a new approach for the detection of methylarginine-containing peptides by liquid chromatography coupled online with tandem mass spectrometry (LC-MS/MS) that relied on a decision tree-based instrument method and database search. The method, which prioritized ETD fragmentation but also used HCD-based fragmentation, allowed for confident site localization of mono- and di-methylated residues in multiply modified peptides. With this approach, peptides with up to 10 modified arginine residues could be identified. The decision tree-based approach enabled the best fragmentation method to be applied for each peptide. 
ETD, a charge-dependent ion/ion reaction, was used for arginine-rich peptides; collision-based HCD was implemented for larger peptides with lower charge. Additionally, the decision tree-based method took advantage of both mass analyzers on the Orbitrap Fusion Tribrid mass spectrometer, with the high-resolution Orbitrap reserved for calculating high accuracy precursor masses and fragments of large (arginine-rich) peptides. Furthermore, the database search allowed for up to 12 missed cleavages for trypsin and two missed cleavages for GluC. As previous methods have been limited to collision-based fragmentation and were only searched with three or four missed cleavages (Larsen et al., 2016; Musiani et al., 2019), our approach permitted identification of hypermethylated peptides with several non-cleaved arginine residues. Despite this, a frequent hurdle in proteomic analyses is that not all peptides are detected in all samples. To overcome this, we used the Differential Enrichment Analysis of Proteomics data (DEP) Bioconductor package with a mixed-imputation approach (Zhang et al., 2018). We used a combination of k-nearest neighbor imputation for values that were missing at random (missing randomly irrespective of condition) and a deterministic minimum value imputation for values missing not at random (missing consistently within a condition) (Gatto and Lilley, 2012). These combined advances allowed us to better analyze the broad spectrum of data that we gathered.
Using these advances, our study revealed important insights: inhibiting type I enzymes with MS023 increased total proteome Rme2s levels as determined by direct-injection MS/MS and promoted increased Rme2s peptide enrichment. We also noted numerous changes in Rme1 with MS023, with many substrates having increased Rme1, implying a putative role for type II and III PRMTs in catalyzing much of the first methylation step. 
Consistent with the total proteome analysis, most of the Rme2s substrate signals were lost upon GSK591 treatment, while Rme2a had evenly distributed up- and down-regulated enrichment in the presence of MS023. Increased Rme1 and Rme2s following type I PRMT inhibition was particularly evident when analyzing FUS and TAF15, the two most abundant methylarginine-containing proteins in our analyses. Importantly, changes in the methylarginine species of these two proteins due to substrate scavenging were associated with differences in their co-immunoprecipitation. Together, these results highlight the importance of established arginine methylation in both protecting and directing future methylation events.
Consistent with other studies of methylarginine-containing proteins, many of our enriched methylated peptides were found in RNA-processing factors, although each methylated state did show distinct ontological enrichments. As we had prior interest in protein intrinsic disorder (Warren and Shechter, 2017), we hypothesized that many methylarginine-containing proteins were embedded in IDRs. We produced an analysis of the total human proteome, showing unique characteristics of proteins containing predicted intrinsic disorder. Consistently, the proteins we identified as containing methylarginine were significantly enriched in intrinsic disorder. However, unlike most intrinsically disordered proteins, methylarginine-containing proteins were larger than the median and less hydrophobic. These characteristics suggest a potential unique role in charged- and disorder-based regulatory function for methylarginine. We demonstrated that the methylarginine-containing proteins were also more likely to be found in the nucleus and directly bind RNA; these features are commonly enriched in proteins driving LLPS and molecular condensates (Ditlev et al., 2018). 
Given the striking enrichment of methylarginine in RNA processing, the influence of methylarginine on LLPS (Courchaine et al., 2021; Hofweber et al., 2018; Qamar et al., 2018; Tsang et al., 2019), and the abundant literature on how LLPS is critical for nuclear function (Strom and Brangwynne, 2019), together with the gross transcriptomic changes promoted by PRMT inhibition, it is tempting to hypothesize that methylarginine broadly regulates gene expression through LLPS. Future studies are needed to directly test this hypothesis.
In line with the evidence above that protein arginine methylation is crucial for gene expression, we observed robust changes in both the transcriptome and proteome following type I PRMT and PRMT5 inhibition. Interestingly, most proteins that had changes in methylarginine levels did not have altered abundance, suggesting that arginine methylation acts indirectly to regulate gene expression. Furthermore, there was a weak correlation between significantly changing transcripts and proteins in GSK591- and MS023-treated cells. As we used random hexamers for reverse transcription, it is possible that many of the changing RNA species are not translated. Regardless, the poor correlation between transcriptomes and proteomes has been observed previously and supports that transcriptomic analyses alone are insufficient as a surrogate for protein level changes (Vogel and Marcotte, 2012).
Total proteome and transcriptome analysis also revealed largely independent roles for type I PRMTs and PRMT5 in regulating biological processes. These included pathways involved in extracellular matrix organization, metabolism, and DNA packaging. Consistent with these results, we observed diametric changes in cell morphology and strong synergy in reducing viability with GSK591 and MS023. This has been reported previously and further supports the non-overlapping influence of type I PRMTs and PRMT5 on cell viability (Fedoriw et al., 2019; Gao et al., 2019). 
As the methylation of proteins may have multiple consequences, including regulation of transcription, post-transcriptional RNA processing, and translation, this stresses both the importance and complexity of the PRMT-regulatory regime.
Lastly, we compiled all publicly available data for the human arginine methylome (Fedoriw et al., 2019; Fong et al., 2019; Guo et al., 2014; Hornbeck et al., 2015; Larsen et al., 2016; Li et al., 2021; Lim et al., 2020; Musiani et al., 2019; Wei et al., 2020) (Table S5). Together these data exemplify the diverse array of PRMT substrates. Importantly, we noted that most proteomic studies on PRMTs identified distinct substrates. For instance, in our analysis we observed 36 unique methylated residues on TAF15. However, when analyzing the compiled dataset, we noted a total of 90 unique methylarginine-containing residues in TAF15 (contributed by 7 independent studies). Taken together, this compiled data set, as well as our new approach of total dimethylarginine analysis by direct-injection mass spectrometry, in vitro PRMT activity assays, robust label-free PTMScan, proteomics, transcriptomics, and phenotypic analyses demonstrate that protein arginine methylation is more complex and more abundant than previously known. Future work will be necessary to assign specific functions to all three methylarginine states, to identify more enzymes responsible for arginine methylation, and to add resolution to the cellular role of PRMTs.
Limitations of the study
Despite the advances presented in this study, our approaches had some limitations. As we could not ensure steady-state kinetics in the OPAL array measurements, definitive substrate preference is difficult to ascertain. Furthermore, since we used degenerate peptides, it is possible that a random combination of amino acids inadvertently inhibited the PRMTs and prevented methylation of a potential substrate.
In the PTMScan studies, our approach was limited by the following: the protease used for digestion; potential bias of the IP antibodies; and the sequential immunoprecipitations. We overcame the first limitation by employing both trypsin and GluC digestions. However, these are still nonrandom ways to sample the proteome. The PTMScan antibodies were raised toward GR-rich sequences and therefore peptides not containing these sequences may be underrepresented. Furthermore, the sequential immunoprecipitation strategy biased enriched methylarginine-containing peptides to the antibodies used earlier in the immunoprecipitation.
Finally, the results returned from proteomic search algorithms include summed abundances at the protein level and at the peptide level. As the search algorithm used intact mass to distinguish peptides, the same peptide may be returned with varying numbers of methylations. Thus, the same peptide may appear depleted in a treated sample only to see the same peptide with fewer methyl groups increased compared with the control.
STAR★Methods
Key resources table
Resource availability
Lead contact
Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, David Shechter (david.shechter@einsteinmed.org).
Materials availability
XlPRMT5-MEP50 was previously described by us (Wilczek et al., 2011). PRMT4 was made as described previously (Lee and Bedford, 2002). CePRMT5 expression clone was a gift from Dr. Rui-Ming Xu (Institute of Biophysics, Chinese Academy of Sciences, Beijing, China) (Sun et al., 2011). PRMT1 DNA was cloned from the DNASU repository (HsCD00299896). All resources will be made available upon request.
Experimental model and subject details
Cell lines and animal models
The following cell lines and animals were used in this study:
A549 cells (ATCC CCL-185): Human lung carcinoma epithelial cells obtained from a 58-year-old male.
IMR-90 cells (ATCC CCL-186): Human lung fibroblast cells obtained from a 16-week gestation female.
Xenopus laevis: Female African clawed frogs obtained from Nasco.
Method details
Cell culture and PRMT inhibition
A549 cells and IMR-90 cells were both freshly purchased for this study from ATCC. The cells were cultured in DMEM (Corning) supplemented with 10% FBS (Hyclone), 100 I.U./mL Penicillin (Corning), 100 μg/mL Streptomycin (Corning) and maintained at 37°C with humidity and 5% CO2. Cell passaging was accomplished with trypsin-facilitated (Corning) dissociation followed by centrifugation at 300 x g for 3 minutes. Cells were washed with 37°C PBS prior to replating. For PRMT inhibition, cells were exposed to 1 μM MS023 or GSK591 (Cayman) in 0.01% DMSO for 7 days. 150 mm plates (30 mL medium) were seeded with 0.2 x 10^5 and 0.4 x 10^5 cells for the control and drug-treated groups, respectively.
Xenopus egg extract preparation
Xenopus laevis frogs were kept in accordance with our approved IACUC protocol (20181103). Xenopus membrane-free high-speed interphase supernatant (HSS) was prepared as described previously (Banaszynski et al., 2010). Briefly, eggs collected from four frogs were de-jellied in 2.2% L-cysteine, pH 7.7 and washed in 0.5X MMR buffer (50 mM NaCl, 1 mM KCl, 0.5 mM MgSO4, 1 mM CaCl2, 0.05 mM EDTA, and 2.5 mM HEPES pH 7.8) and then in 1X ELB (50 mM KCl, 2.5 mM MgCl2, and 10 mM HEPES pH 7.8) with 250 mM sucrose. Eggs were packed with a 200 x g spin, excess buffer was removed, and the eggs were supplemented with protease inhibitors, 2.5 μg/ml cytochalasin B, and 50 μg/ml cycloheximide. Eggs were crushed by centrifugation at 16,000 x g in a chilled rotor, the soluble middle layer was removed, and the lysate was briefly re-spun. A final ultracentrifuge spin at 260,000 x g for 1 hour in an SW-55 rotor produced HSS.
Whole protein acid hydrolysis
Per group, a total of 10^7 cells were isolated and flash frozen. On ice, cell pellets (50 mL tube) or HSS egg extract were resuspended in 4 mL freshly prepared Lysis Buffer (150 mM NaHCO3, 8 M Urea, supplemented with 1 mM DTT; pH 8.3) to obtain a clear lysate. Trichloroacetic acid (TCA) solution was made from 10 g TCA dissolved into 10 mL of water (i.e., 100% TCA solution) and added to lysate (1:4 v/v) to precipitate proteins. Whole protein pellets were isolated by centrifugation, further washed with ice-cold acetone (5X) and dried under vacuum. Dry pellets (∼25−50 mg) were transferred into glass pressure tubes (AceGlass #8648-230) and flushed with nitrogen prior to adding hydrochloric acid (2 mL of 6 M sequencing grade solution; Thermo Scientific #PI24308). Sealed tubes were kept at 125°C for 48 h (oil bath) to achieve acid hydrolysis of the whole proteome. Light yellow solutions were further diluted 1:10 in deionized (DI) H2O, lyophilized, and then resuspended in DI H2O. Immediately prior to MS analysis/quantitation, samples were diluted 1:1 with 100% acetonitrile.
Quantitation of dimethylarginine species
Direct-injection mass spectrometry was performed with a TriVersa NanoMate (Advion) coupled online with the Orbitrap Fusion Lumos mass spectrometer (Thermo Scientific). The NanoMate was programmed to pick up 5 μL of solution followed by 0.5 μL of air gap to avoid spilling. Samples were sprayed into the mass spectrometer using a gas pressure of 0.3 psi and a positive voltage set at 1.5 kV. Contact closure to start MS acquisition was 5 seconds after engaging the probe to the instrument chip nozzle. The temperature of the heated capillary in the MS source chamber was set to 180°C. The full scan range of 50 - 210 m/z was acquired in the Orbitrap at a resolution of 120,000. Targeted scans were performed for MS/MS fragmentation using an HCD energy of 30 V separately and acquired in the Orbitrap at a resolution of 120,000. Concentrations of arginine species standards (i.e., Rme2s, Rme2a) were determined by 1H NMR using an adenosine internal standard, δ 6.1 ppm, d, 1H. For the standard curve, Rme2s was fixed at 2 μM and Rme2a varied at 20 μM, 10 μM, and 2 μM. Data analysis was completed using R (version 4.0.2).
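The standard-curve step described above can be illustrated with a short sketch. It assumes a linear relationship between the measured Rme2a/Rme2s signal ratio and the known concentration ratio; the signal values and all function names are hypothetical and are not taken from the study, whose analysis was performed in R.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rme2s_fraction(conc_ratio):
    """Fraction of total Rme2 that is Rme2s, given the ratio [Rme2a]/[Rme2s]."""
    return 1.0 / (1.0 + conc_ratio)


# Standard mixtures from the text: Rme2s fixed at 2 uM, Rme2a at 20, 10, and 2 uM.
RME2S_UM = 2.0
RME2A_UM = [20.0, 10.0, 2.0]
conc_ratios = [a / RME2S_UM for a in RME2A_UM]

# Hypothetical measured signal ratios (Rme2a / Rme2s); not data from the study.
signal_ratios = [8.1, 4.1, 0.9]

slope, intercept = fit_line(conc_ratios, signal_ratios)


def estimate_conc_ratio(signal_ratio):
    """Invert the standard curve to convert a signal ratio to a concentration ratio."""
    return (signal_ratio - intercept) / slope


sample_ratio = estimate_conc_ratio(4.1)  # hypothetical sample signal ratio
print(f"[Rme2a]/[Rme2s] = {sample_ratio:.2f}; "
      f"Rme2s = {100 * rme2s_fraction(sample_ratio):.1f}% of total Rme2")
```

Sample ratios that fall outside the range spanned by the standards would rely on extrapolation and should be interpreted with corresponding caution.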
Protein purification
HsPRMT1, HsPRMT4/CARM1, CePRMT5, or XlPRMT5-MEP50 were purified as described previously (Cheng et al., 2012; Wilczek et al., 2011). Briefly, 6xHis-PRMT1, GST-PRMT4, and 6xHis-CePRMT5 expression clones were transformed into E. coli, induced with 0.4 mM IPTG for four hours, and cell pellets collected and lysed. For PRMT5-MEP50, individual baculovirus encoded 6xHis-PRMT5 and 6xHis-MEP50 were co-infected into 3 L of Sf9 cells and grown for 48 hours. For 6xHis-tagged proteins, lysed and sonicated bacteria or insect cells were applied to Nickel-NTA resin, eluted with 300 mM imidazole, and further purified by size-exclusion chromatography. Cells expressing GST-PRMT4 were lysed and sonicated and the tagged protein was purified on glutathione-sepharose resin and eluted with 10 mM reduced glutathione. All proteins were confirmed pure by Coomassie stained gels.
Oriented peptide array library PRMT screening
Oriented Peptide Array Library (OPAL) peptides were synthesized in the following fashion: A-ZXX-R-XXX-A-PEG-Biotin, A-XZX-R-XXX-A-PEG-Biotin, A-XXZ-R-XXX-A-PEG-Biotin, A-XXX-R-ZXX-A-PEG-Biotin, A-XXX-R-XZX-A-PEG-Biotin, A-XXX-R-XXZ-A-PEG-Biotin in which R is a fixed Arginine, X is any amino acid except cysteine, and Z is a fixed position. FlashPlate wells were coated with Streptavidin. Biotinylated peptide pools were treated with recombinant HsPRMT1, HsPRMT4/CARM1, CePRMT5, or XlPRMT5-MEP50. Reactions were performed with 1 μg of peptide, 50 mM Tris-HCl pH 8.8, 5 mM MgCl2, 4 mM DTT, 1 μCi 3H-SAM and ∼1 μg of enzyme for 1 hr at 30°C. Following incubation, the peptides were transferred to FlashPlates (Perkin Elmer) and incubated for 30 min for peptide capture. Polystyrene-based scintillant was added. The radioactivity of the biotinylated peptides, now bound to the wells via streptavidin, was counted, while unbound SAM was not counted. To generate the heatmap, we collected MicroBeta counts per minute (CPM). For each position (P3- to P3+), all the readings were normalized to the highest value so that the highest CPM reading has a value of one and all others are expressed as a fraction of the highest CPM. The assay was performed in triplicate.
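The per-position normalization used for the heatmap can be sketched as follows. This is an illustration only: it assumes that replicate CPM values are averaged before scaling (the text reports mean CPM but does not state the order of operations), and the CPM values and names are hypothetical.

```python
from statistics import mean


def normalize_position(cpm_replicates):
    """Average replicate CPM per fixed residue, then scale so the maximum is 1."""
    mean_cpm = {res: mean(reps) for res, reps in cpm_replicates.items()}
    top = max(mean_cpm.values())
    if top <= 0:
        raise ValueError("highest CPM must be positive to normalize")
    return {res: value / top for res, value in mean_cpm.items()}


# Hypothetical triplicate CPM readings for one fixed position; not study data.
example_position = {
    "G": [900.0, 1000.0, 1100.0],
    "A": [450.0, 500.0, 550.0],
    "D": [90.0, 100.0, 110.0],
}

for residue, fraction in normalize_position(example_position).items():
    print(residue, round(fraction, 2))
```

Because each position is scaled to its own maximum, values are comparable within a position but not directly between positions.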
PTMScan® peptides and immunoprecipitation
PTMScan was conducted entirely according to the manufacturer's instructions (Cell Signaling Technology, #13563). Briefly, 2 x 10^8 A549 cells of each treated condition were washed with PBS and scraped into freshly made room temperature urea lysis buffer (20 mM HEPES, pH 8.0, 9 M Urea, 1 mM sodium orthovanadate, 2.5 mM sodium pyrophosphate, 1 mM β-glycerophosphate). Cell lysates were sonicated and centrifuged. The soluble protein lysate was diluted to a final concentration of 2 M Urea, 20 mM HEPES pH 8.0, reduced with DTT, and alkylated with iodoacetamide. Each lysate was digested either with 50 μg trypsin in 1 mM HCl (Pierce 90305) for 24 h at room temperature or with 50 μg GluC (Roche 11420399001) for 48 h as per manufacturer recommendation, with an additional 50 μg GluC added after the first 24 h of incubation. Complete digestion was confirmed by SDS-PAGE. Digested peptides were acidified in 1% trifluoroacetic acid (TFA) and purified on SepPak C18 gravity columns (Waters WAT051910). Eluted peptides were lyophilized and dissolved in the manufacturer's IAP buffer. A fraction of the eluted peptides was reserved for the “Input Total proteome” analysis. Peptides were subjected to successive immunoprecipitations with the Rme1 (CST Kit # 12235), Rme2a (CST Kit # 13474), and Rme2s (CST Kit # 13563) prebound resins. Flowthrough from each IP was applied to the next resin. Resin was washed and eluted with two applications of 0.15% TFA.
PTMScan® mass spectrometric methods
Digested input total proteome samples and input PTMScan samples were desalted using C18 STAGE tips (3M). Proteome samples were acidified with 1% TFA prior to loading onto STAGE tip discs. The samples were washed with 0.1% TFA and eluted with 50% acetonitrile and 0.1% formic acid. Samples were dried and reconstituted with 0.1% formic acid.
Input samples were loaded into a Dionex UltiMate 3000 (Thermo Fisher Scientific) liquid chromatography system, coupled with a Q-Exactive HF instrument (Thermo Fisher Scientific). Peptides were separated on an in-house made capillary column (20 cm x 75 μm fused silica column with a laser pulled tip, packed with Dr. Maisch, Reprosil-Pur 120 Å C18-AQ, 3 μm particles). The peptides were separated at 400 nL/min in solvent A (0.1% formic acid) and solvent B (0.1% formic acid in 95% acetonitrile) as follows: 3% solvent B to 30% in 105 minutes, 50% solvent B at 120 minutes followed by column wash and re-equilibration. The instrument method had a full MS scan with 120,000 resolution at 200 m/z and a 1,000,000 AGC target, acquired from 300 to 1500 m/z. The top 20 precursors from z = +2 to +6 were selected for isolation with a 2.0 m/z window for HCD fragmentation with NCE of 28. The MS/MS had a 200,000 AGC target and a 15,000 resolution at 200 m/z with a 30 second dynamic exclusion.
PTMScan samples were loaded into a Dionex UltiMate 3000 (Thermo Fisher Scientific) liquid chromatography system, in-line with an Orbitrap Fusion Tribrid mass spectrometer with ETD capabilities (Thermo Fisher Scientific). Peptides were separated on an in-house made capillary column (20 cm x 75 μm fused silica column with a laser pulled tip, packed with Dr. Maisch, Reprosil-Pur 120 Å C18-AQ, 3 μm particles). Immunoprecipitated samples were analyzed with 300 nL/min flow rate as detailed: 2% to 40% solvent B (as above) in 65 minutes, increased to 60% solvent B in 74 minutes, followed by column wash and re-equilibration. 
The instrument method included a 60,000 resolution MS1 scan from 300 to 1500 m/z and allowed for a three second cycle time. Selection of precursors for MS/MS followed a decision-tree approach based on charge state and precursor m/z range: (1) Precursors with a z = +3 to +4, from 300 to 850 m/z were fragmented by ETD and analyzed in the ion trap. The AGC target was set to 20,000 with a maximum ion acquisition time of 100 ms; (2) Precursors with a z = +5 to +12, from 300 to 850 m/z were fragmented by ETD and analyzed in the Orbitrap with 7,500 resolution. The AGC target was set to 100,000 with a maximum ion acquisition time of 200 ms; (3) Precursors with a z = +2 to +4, from 300 to 1500 m/z were fragmented by HCD with a normalized collision energy (NCE) of 29 and analyzed in the ion trap. The AGC target was set to 20,000 with a maximum ion acquisition time of 75 ms; (4) Precursors with a z = +5 to +8, from 850 to 1500 m/z were fragmented by HCD, with NCE of 29 and acquired in the Orbitrap with 7500 resolution at 200 m/z. The AGC target was set to 100,000 with a maximum ion acquisition time of 200 ms. A dynamic exclusion of 20 seconds was used for all precursors, as well as a 2.0 m/z isolation window.
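The four branches of the decision tree can be written as an ordered rule table. The sketch below is illustrative rather than the instrument method itself: it assumes the rules are evaluated in the order listed (consistent with the stated priority of ETD), so that a precursor matching both rule 1 and rule 3 is assigned to ETD, and it assumes inclusive m/z boundaries. The function and variable names are hypothetical.

```python
# (min_z, max_z, min_mz, max_mz, fragmentation, analyzer), in priority order.
RULES = [
    (3, 4, 300.0, 850.0, "ETD", "ion trap"),
    (5, 12, 300.0, 850.0, "ETD", "Orbitrap"),
    (2, 4, 300.0, 1500.0, "HCD", "ion trap"),
    (5, 8, 850.0, 1500.0, "HCD", "Orbitrap"),
]


def choose_scan(z, mz):
    """Return (fragmentation, analyzer) for a precursor, or None if no rule applies."""
    for min_z, max_z, min_mz, max_mz, fragmentation, analyzer in RULES:
        if min_z <= z <= max_z and min_mz <= mz <= max_mz:
            return fragmentation, analyzer
    return None


# A few hypothetical precursors to show how the branches resolve.
for z, mz in [(3, 600.0), (2, 600.0), (4, 1000.0), (6, 700.0), (6, 1000.0)]:
    print(z, mz, choose_scan(z, mz))
```

Under these assumptions, highly charged precursors at low m/z go to ETD, while doubly charged or high-m/z precursors fall through to HCD.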
Database search
For PTMScan peptides, the data was analyzed using Byonic (Protein Metrics) in Proteome Discoverer 2.4 (Thermo Fisher Scientific). Peptides identified through Proteome Discoverer were quantified using the area of their respective extracted ion chromatogram. The PTMScan data was searched against the reviewed human proteome from SWISS-PROT, downloaded 17/09/2019 with 20,353 entries. Tryptic peptides were searched allowing for up to 12 missed cleavages for high resolution MS/MS (Orbitrap) and 6 missed cleavages for low resolution MS/MS (Ion Trap). The following modifications were allowed for all peptides: carbamidomethyl on cysteine as fixed; dimethylation on arginine (common, up to 10) and lysine (rare, up to 2), methylation on arginine (up to 5) and lysine (common, up to 2), acetylation on protein N-termini and lysine (rare, up to 1), oxidation on methionine (rare, up to 2). The results were filtered for a 1% false discovery rate.
For input total protein sample peptides, the data was analyzed using Byonic (Protein Metrics) in Proteome Discoverer 2.4 (Thermo Fisher Scientific). The data was analyzed with the full reviewed human proteome from SWISS-PROT. The search allowed for up to four missed cleavages and three modifications per peptide. These modifications included: carbamidomethyl on cysteine as a fixed modification, acetylation on the protein N-terminus, up to three dimethylated arginine residues, one dimethylated lysine residue, one monomethyl arginine and lysine, and two oxidized methionine residues per peptide. Identified spectra were filtered to a 1% false discovery rate using Percolator.
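To illustrate why the missed-cleavage allowance matters for arginine-rich peptides, the following sketch enumerates in silico tryptic peptides for a given limit. It assumes the common simplified trypsin rule (cleavage after K or R, except before P), which is not necessarily the rule configured in the search engine, and it uses a hypothetical GR-rich sequence rather than a real protein.

```python
def cleavage_sites(seq):
    """Indices after which trypsin cuts: after K or R, but not before P."""
    return [i + 1 for i, aa in enumerate(seq[:-1])
            if aa in "KR" and seq[i + 1] != "P"]


def digest(seq, max_missed):
    """List peptides produced with up to max_missed internal uncut sites."""
    if not seq:
        return []
    bounds = [0] + cleavage_sites(seq) + [len(seq)]
    peptides = []
    for start in range(len(bounds) - 1):
        for missed in range(max_missed + 1):
            end = start + 1 + missed
            if end >= len(bounds):
                break
            peptides.append(seq[bounds[start]:bounds[end]])
    return peptides


# Hypothetical GR-rich stretch with four internal arginines; not a real protein.
gr_rich = "GGRGGRGGRGGRGGFK"

print(gr_rich in digest(gr_rich, 3))
print(gr_rich in digest(gr_rich, 4))
```

In this toy case, the intact peptide spanning all four arginines only enters the search space once four missed cleavages are allowed, which mirrors the rationale given for searching hypermethylated peptides with a higher limit.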
Processing of database search result tables
Differential enrichment analysis was carried out using Differential Enrichment analysis of Proteomics data (DEP) (Zhang et al., 2018). To perform DEP, identified peptides were assigned unique identifiers and converted into a SummarizedExperiment object. For total proteome analysis, individual peptides were summed to determine parent protein abundance before processing with DEP. In each experiment, prior to normalization, peptides were retained only if they had abundances in all three replicates of at least one condition (for PTMScan) or in at least two out of three replicates of at least one condition (for total proteome). Sample abundances were normalized using variance stabilizing transformation (Huber et al., 2002). Following normalization, missing values were imputed using a deterministic minimum value imputation for those missing not at random (missing in all replicates of a single condition) or k-nearest neighbor imputation for values missing at random (missing sporadically, irrespective of condition) (Gatto and Lilley, 2012). Once normalization and imputation were completed, differential enrichment analysis was accomplished using empirical Bayes statistics with linear models (Ritchie et al., 2015; Zhang et al., 2018). Further downstream analyses were accomplished using R (version 4.0.2).
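The filtering rule and the distinction between the two kinds of missingness can be sketched in a few lines. This is not the DEP implementation (DEP is an R/Bioconductor package); it is a minimal illustration with hypothetical names and values, and it covers only filtering and classification, leaving out normalization and the imputation steps themselves.

```python
def keep_feature(values_by_condition, min_valid):
    """True if at least one condition has min_valid or more observed replicates."""
    return any(
        sum(v is not None for v in reps) >= min_valid
        for reps in values_by_condition.values()
    )


def classify_missing(values_by_condition):
    """Label each condition as 'complete', 'MNAR' (all missing), or 'MAR' (some missing)."""
    labels = {}
    for condition, reps in values_by_condition.items():
        n_missing = sum(v is None for v in reps)
        if n_missing == 0:
            labels[condition] = "complete"
        elif n_missing == len(reps):
            labels[condition] = "MNAR"
        else:
            labels[condition] = "MAR"
    return labels


# Hypothetical abundances for one peptide across three conditions; None = not detected.
peptide = {
    "DMSO": [21.3, 21.1, 21.6],
    "GSK591": [None, None, None],
    "MS023": [20.9, None, 21.2],
}

print(keep_feature(peptide, min_valid=3))  # PTMScan-style threshold
print(classify_missing(peptide))
```

With `min_valid=3` the function mirrors the PTMScan threshold described above, and `min_valid=2` mirrors the total proteome threshold; conditions labeled MNAR would receive the minimum-value imputation and those labeled MAR the k-nearest neighbor imputation.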
Proteome characteristic analysis
56,392 human protein sequences (Uniprot 2012) and their predicted intrinsic disorder were obtained from http://biomine.cs.vcu.edu/servers/RAPID/homosapiens.txt (Yan et al., 2013); molecular weight and isoelectric point were obtained from http://isoelectricpointdb.org/40/UP000005640_9606_all_isoelectric_point_proteome_Homo_sapiens_Human.html (Kozlowski, 2016). To determine if a given methylarginine residue was located within an IDR, a custom Python (3.7.0) script was used to search the MobiDB IDR database (Piovesan et al., 2017) (https://mobidb.bio.unipd.it/; downloaded February 2020) to compare amino acid positions.
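The position comparison performed by the custom script amounts to an interval lookup. The sketch below is not the authors' script; it assumes IDR annotations have already been parsed into 1-based, inclusive (start, end) pairs per accession, and the accessions and coordinates are placeholders.

```python
def in_idr(position, regions):
    """True if a 1-based residue position lies within any inclusive (start, end) region."""
    return any(start <= position <= end for start, end in regions)


def annotate_sites(sites, idr_regions):
    """Map each (accession, position) site to whether it falls inside an annotated IDR."""
    return {
        (accession, position): in_idr(position, idr_regions.get(accession, []))
        for accession, position in sites
    }


# Placeholder annotations and methylarginine sites; not real database entries.
idr_regions = {"PROTEIN_A": [(1, 50), (200, 260)]}
sites = [("PROTEIN_A", 25), ("PROTEIN_A", 120), ("PROTEIN_B", 10)]

for site, inside in annotate_sites(sites, idr_regions).items():
    print(site, inside)
```

Proteins without any annotation are treated here as having no IDRs, which is a simplifying choice; a real analysis might instead report them as unannotated.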
RNA sequencing
For each condition, triplicate total RNA was extracted with RNeasy kits (Qiagen), rRNA was removed with RiboErase (Kapa Biosystems), and paired-end libraries were prepared with random hexamers (Kapa). Each library was sequenced to attain approximately 40M paired-end 150 bp reads on a NextSeq 500. Sequences were mapped to hg19 using STAR (Dobin et al., 2012), fragments per kilobase of transcript per million mapped reads (FPKM) were determined with cufflinks (Trapnell et al., 2010), and differential gene expression was determined using featureCounts (Liao et al., 2014) and DESeq2 (Love et al., 2014).
RT-qPCR
RNA purification was performed using TRIzol (Thermo). Isolated total RNA was reverse transcribed with Moloney murine leukemia virus (MMLV) reverse transcriptase (Invitrogen) and random hexamer primers. LightCycler 480 Sybr Green I (Roche) master mix was used to quantitate cDNA with a LightCycler 480 (Roche). An initial 95°C for 5 minutes was followed by 45 cycles of amplification using the following settings: 95°C for 15 s, 60°C for 1 minute. Primer sequences can be found in Table S6.
Drug synergy cell viability analysis
Ninety-six well plates (Corning) were prepared containing 100 μL inhibitors at varying 2x concentrations in complete DMEM. This was followed by addition of 100 μL A549 cells at 2,000 cells/mL and incubation for 7 days at 37°C and 5% CO2 with humidity. To test viability, 20 μL CellTiter Aqueous One (Riss et al., 2004) was added to each well—including media only control—followed by incubation for 3 hrs in the dark at 37°C and 5% CO2 with humidity. Absorbance at 570 nm was then recorded on the SpectraMax Plus Microplate Spectrophotometer. All wells were normalized to the DMSO only control such that viability of those cells was presumed to be 100%. Consequent synergy and antagonism scores were determined using Combenefit (Di Veroli et al., 2016).
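The normalization to the DMSO-only control can be expressed as a one-line calculation. The sketch below additionally subtracts the media-only reading as a background blank; that subtraction is an assumption made for illustration (the text states only that a media-only control was included), and the absorbance values are hypothetical.

```python
def percent_viability(absorbance, dmso_absorbance, blank_absorbance=0.0):
    """Viability relative to the DMSO-only control, which is defined as 100%."""
    signal = absorbance - blank_absorbance
    reference = dmso_absorbance - blank_absorbance
    if reference <= 0:
        raise ValueError("DMSO control must exceed the blank")
    return 100.0 * signal / reference


# Hypothetical absorbance readings; not data from the study.
blank = 0.1
dmso = 1.1
treated_wells = {"GSK591": 0.8, "MS023": 0.6, "GSK591 + MS023": 0.3}

for label, reading in treated_wells.items():
    print(label, round(percent_viability(reading, dmso, blank), 1))
```

The resulting percentages are the kind of input a synergy tool such as Combenefit expects as a dose-response matrix, although its exact input format is not reproduced here.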
Cell migration and invasion assays
Cell migration and invasion were assayed according to manufacturer instructions (Corning). Briefly, for cell migration assay, A549 cells were starved with DMEM containing no FBS for 24 hrs. Next, Transwell inserts were coated in coating solution. A549 cells were plated either in serum-free or serum-containing media. After 24 hrs, the insert was washed with wash buffer and stained with crystal violet staining solution for 10 minutes. For cell invasion, Matrigel matrix (Corning) was used to coat cell culture permeable supports. Cells were added to the Matrigel coated invasion chambers and chemoattractant was added beneath the permeable support. Invasion chambers were incubated overnight at 37°C, 5% CO2 in a humidified incubator. Staining was performed using the Diff-Quik kit (Corning). Micrographs were captured by a Nikon Diaphot phase contrast microscope.
Cell morphology analysis
Cells were seeded on coverslips in a 6-well plate (Corning) and allowed to grow in the presence of 0.01% DMSO, 1 μM GSK591 (Cayman), or 1 μM MS023 (Cayman) for 7 days. Cells were then washed once with 37°C PBS (Hyclone) and fixed with 4% paraformaldehyde at room temperature for 10 min followed by three washes with 4°C PBS. Fixed cells were stored at 4°C. Upon processing, residual aldehyde was quenched with 0.1 M glycine in room temperature PBS for 15 minutes. For cell morphology, cells were incubated with phalloidin-rhodamine in PBS for 30 minutes at room temperature, washed three times with PBS, and mounted with DAPI prolong gold anti-fade mounting media (Thermo). Coverslips were imaged using an Olympus IX-70 inverted microscope with a 60x objective. Cell size and nuclei per cell analysis were accomplished using Fiji (Schindelin et al., 2012).
Co-immunoprecipitation
A549 cells were seeded in 2x 15 cm dishes (Corning) per condition and treated with 0.01% DMSO, 1 μM GSK591, or 1 μM MS023 for 7 days at 37°C, 5% CO2 in a humidified incubator. Approximately 30 x 106 cells were then trypsinized and washed with 1x PBS containing protease inhibitor (Pierce). Cell pellets were frozen and stored at -80°C. Upon processing, frozen cell pellets were resuspended in 1 mL of RIPA buffer (1% NP-40, 150 mM NaCl, 50 mM Tris-HCl pH 8 at 4°C, 0.25% sodium deoxycholate, 0.1% SDS, and 1 mM EDTA supplemented with protease and phosphatase inhibitors (Pierce)). Cell suspensions were incubated on ice for 30 minutes with intermittent vortexing and then centrifuged at 10,000 x g for 15 minutes at 4°C. The supernatant was transferred to a clean eppendorf. Lysates were then pre-cleared with Protein A agarose (EMD Millipore) equilibrated in lysis buffer and incubated with gentle rotation at 4°C for 60 minutes. Afterward, lysates were centrifuged at 14,000 x g for 10 minutes at 4°C. Pre-cleared lysates were transferred to fresh tubes. Protein concentrations were determined using BCA assay (Pierce) and lysates were resupended to a final 500 μL at 1.5 μg/μL (750 μg per IP). A fraction of the sample (10%) was reserved as the input. Antibodies targeting FUS (NovusBio NB100-565; 2 μg) or TAF15 (NovusBio NB100-567; 2 μg) were added to their respective tubes and incubated overnight at 4°C with gentle rotation. The next morning, 50 μL of 50% Protein A agarose equilibrated in lysis buffer was added to each tube and incubated for 2 hrs at 4°C with gentle rotation. The samples were then centrifuged at 10,000 x g for 30 seconds at 4°C and washed with 500 μL RIPA buffer—this was repeated 3x. The supernatant was removed and the Protein A agarose was resuspended in 40 μL 2x Laemmli buffer in preparation for western blot analysis. 
Antibodies used in the western blot include: FUS and TAF15 (see above), Rme1 (CST 8015S), Rme2s (CST 13222S), and Rme2a (CST 13522S). The secondary antibody was TidyBlot-HRP (Bio-Rad STAR209P).
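As a worked illustration of the lysate normalization described above (500 μL at 1.5 μg/μL, i.e. 750 μg per IP, with 10% reserved as input), the following Python sketch computes the volumes to combine. The function name, the example stock concentration, and the assumption that the input is drawn from the normalized 500 μL sample are illustrative only and are not taken from the protocol.

```python
def ip_dilution(stock_conc_ug_per_ul,
                target_conc_ug_per_ul=1.5,
                final_volume_ul=500.0,
                input_fraction=0.10):
    """Return volumes (in uL) needed to normalize a pre-cleared lysate for one IP.

    stock_conc_ug_per_ul: lysate concentration from the BCA assay (hypothetical input).
    Defaults mirror the values stated in the text: 500 uL at 1.5 ug/uL, 10% input.
    """
    # Total protein required in the normalized sample (750 ug with the defaults).
    protein_ug = target_conc_ug_per_ul * final_volume_ul
    # Volume of stock lysate that contains that much protein.
    lysate_ul = protein_ug / stock_conc_ug_per_ul
    if lysate_ul > final_volume_ul:
        raise ValueError("Lysate is too dilute to reach the target concentration.")
    return {
        "protein_ug": protein_ug,
        "lysate_ul": lysate_ul,
        # Lysis buffer added to bring the sample to the final volume.
        "buffer_ul": final_volume_ul - lysate_ul,
        # Assumption: the input is taken from the normalized sample.
        "input_ul": final_volume_ul * input_fraction,
    }


# Example with a hypothetical BCA reading of 3.0 ug/uL.
volumes = ip_dilution(3.0)
print(volumes)
```

With a stock concentration below 1.5 μg/μL the function raises an error, since such a lysate cannot be diluted to the stated target.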
Histone extraction
Acid extraction of histones was performed as described previously (Shechter et al., 2007). Briefly, nuclei were isolated by hypotonic lysis of 5 × 10^6 cells incubated in 10 mM Tris pH 8.0, 1 mM KCl, 1.5 mM MgCl2, and 1 mM DTT. Pelleted nuclei were suspended in 0.4 N H2SO4 for 30 minutes. Acid-solubilized histones were precipitated by 33% trichloroacetic acid, washed with 100% acetone, and dissolved in water.
PRMT western analysis
Lysis of A549 cells treated with 0.01% DMSO, 1 μM GSK591, or 1 μM MS023 for 7 days at 37°C, 5% CO2 in a humidified incubator was accomplished as above. Antibodies used in western analysis include: PRMT1 (Millipore 07-404), PRMT4 (CST 4438), PRMT5 (Millipore 07-405), PRMT7 (CST 14762), GAPDH (Abcam ab9484), H4R3me2s (Abcam ab5823), H4R3me2a (Active Motif 39705), and H3 (Abcam ab1791).
Bioinformatics and graphics
RNA-seq data are deposited in GEO (GSE158625). All raw mass spectrometry data are available in the Chorus repository (https://chorusproject.org) under project numbers 1671 and 1725. The following additional R packages were used in this study: ggseqlogo was used to create the weblogo plots (Wagih, 2017); clusterProfiler was used for gene ontology analysis (Yu et al., 2012); Venn diagrams were made using VennDiagram (Chen and Boutros, 2011); and UpSet plots and high-dimensional intersections were generated with ComplexHeatmap (Gu et al., 2016). All plots and histograms were generated using tidyverse (Wickham et al., 2019) and GraphPad Prism v8, while final figures were assembled in Adobe Illustrator (25.2.3). All code used to generate data in this paper can be found here: https://github.com/Shechterlab/PTMscan2021.
Quantification and statistical analysis
All western blots were independently repeated at least twice. RT-qPCR was repeated with three independent biological replicates. OPAL PRMT screening was performed with three replicates within the same plate and the mean CPM was reported. Cell size, nuclei per cell, migration, and invasion assays were repeated with three independent biological replicates and analyzed with one-way analysis of variance followed by Tukey's multiple comparisons test. Differential gene expression analysis was performed using DESeq2 (Love et al., 2014). Differential protein expression analysis was performed using Differential Enrichment analysis of Proteomics data (DEP) (Zhang et al., 2018). All error bars represent mean ± standard deviation (SD) unless otherwise indicated in the figure legend.
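To make the summary statistics above concrete, the sketch below computes mean ± SD per condition and the one-way ANOVA F statistic for three groups of three replicates. It uses only the Python standard library; the measurements and group labels are invented placeholders, not data from this study, and the sketch does not reproduce the software actually used for the analysis. The p-value and Tukey's post hoc comparisons are omitted because they require the F and studentized range distributions, which a statistics package would supply.

```python
from statistics import mean, stdev


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of replicate groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group size times squared deviation
    # of each group mean from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical measurements for three conditions, three biological replicates each.
data = {
    "DMSO": [1.0, 2.0, 3.0],
    "GSK591": [2.0, 3.0, 4.0],
    "MS023": [5.0, 6.0, 7.0],
}

for label, values in data.items():
    # stdev() is the sample standard deviation (n - 1 denominator).
    print(f"{label}: {mean(values):.2f} ± {stdev(values):.2f}")

f_stat, df_b, df_w = one_way_anova_f(list(data.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```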