
qPTMplants: an integrative database of quantitative post-translational modifications in plants.

Han Xue1, Qingfeng Zhang2, Panqin Wang1, Bijin Cao1, Chongchong Jia1, Ben Cheng1, Yuhua Shi1, Wei-Feng Guo3, Zhenlong Wang1, Ze-Xian Liu2, Han Cheng1.   

Abstract

As a crucial molecular mechanism, post-translational modifications (PTMs) play critical roles in a wide range of biological processes in plants. Recent advances in mass spectrometry-based proteomic technologies have greatly accelerated the profiling and quantification of plant PTM events. Although several databases have been constructed to store plant PTM data, a resource including more plant species and more PTM types with quantitative dynamics still remains to be developed. In this paper, we present an integrative database of quantitative PTMs in plants named qPTMplants (http://qptmplants.omicsbio.info), which hosts 1 242 365 experimentally identified PTM events for 429 821 nonredundant sites on 123 551 proteins under 583 conditions for 23 PTM types in 43 plant species from 293 published studies, with 620 509 quantification events for 136 700 PTM sites on 55 361 proteins under 354 conditions. Moreover, the experimental details, such as conditions, samples, instruments and methods, were manually curated, while a variety of annotations, including the sequence and structural characteristics, were integrated into qPTMplants. Then, various search and browse functions were implemented to access the qPTMplants data in a user-friendly manner. Overall, we anticipate that the qPTMplants database will be a valuable resource for further research on PTMs in plants.
© The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research.


Year: 2022; PMID: 34718741; PMCID: PMC8728288; DOI: 10.1093/nar/gkab945

Source DB: PubMed; Journal: Nucleic Acids Res; ISSN: 0305-1048; Impact factor: 16.971


INTRODUCTION

Post-translational modification (PTM) is a biochemical process that changes the properties and extends the chemical composition of a protein through the addition of chemical groups (such as phosphate, succinyl and sulfate) to specific amino acid residues or through proteolytic cleavage of the protein sequence backbone (1,2). More than 200 PTMs have been registered in UniProt, and they can regulate the structure, function and localization of proteins (3). To date, protein phosphorylation is the best-studied PTM in plants, followed by lysine modifications such as acetylation, ubiquitination and succinylation (4–6). As important modulators of protein function, cysteine modifications such as S-nitrosylation, S-sulfenylation and S-glutathionylation have received increasing attention in plants (7–9). Plant PTMs play critical roles in nearly all biological processes, such as signal transduction (4,10), metabolism (11,12), stress resistance (13,14), plant immunity (15,16) and other cellular events. Recently, rapid progress in high-throughput proteomic techniques has greatly advanced the profiling and quantification of plant PTMs in different tissues under various conditions (17–20). For example, based on phosphoproteome and acetylome quantification, Uhrig et al. unveiled the diurnal changes of concerted phosphorylation and acetylation in seedlings, roots, leaf rosettes, flowers and siliques of Arabidopsis thaliana (21). During seed germination, Nietzel et al. characterized the oxidation state of cysteine in Arabidopsis through thiol redox proteomics (17), while He et al. evaluated the dynamic change and extensive regulation of ubiquitylation in Oryza sativa subsp. japonica (rice) by a quantitative ubiquitylomics approach (22). Through quantitative phosphoproteome and metabolome analyses, Pi et al. revealed the regulatory mechanism of GmMYB173 in soybean under salt stress (23). Additionally, Walley et al. profiled and quantified the Zea mays (maize) acetylome during the fungal-induced immune response (24). Therefore, the identification of PTMs and their dynamic changes could be used to understand the regulatory roles of PTMs in the processes of plant growth and stress responses.
Previously, numerous informative databases have been constructed to store PTM-related data in plants. For example, some general resources, such as dbPTM (25) and SysPTM (26), contain various types of PTMs identified by experiments in eukaryotes, while several PTM-specific databases, including EPSD (27), PLMD (28) and iCysMod (29), host modified sites for phosphorylation, lysine modifications and cysteine modifications in eukaryotic proteins, respectively. However, these databases only cover a limited part of the experimentally identified PTM events in plants. Furthermore, PhosPhAt (30), P3DB (31) and dbPPT (32) are dedicated to depositing phosphorylation information in plants. Moreover, several databases, including Plant PTM Viewer (33) and FAT-PTM (34), were developed to maintain and organize a large number of modified plant substrates with accurate sites for multiple PTM types. The Plant PTM Viewer contains ∼370 000 modified sites for 19 PTM types in five plant species from 105 publications, and experimental details and available quantitative information related to PTMs are also provided (33). FAT-PTM is a database of functional analysis tools for PTMs that supports over 49 000 PTM sites for eight types of PTMs identified in Arabidopsis as well as phosphorylation dynamics data from >10 published quantitative phosphoproteomic studies (34). Although the databases mentioned above are devoted to the collection of plant PTM data, a comprehensive resource covering more plant species, more types of PTMs and their quantitative dynamics is still needed for the research community.
In this study, we developed an integrative database of quantitative PTMs in plants called qPTMplants, which hosts 123 551 published proteins with 429 821 experimentally identified nonredundant PTM sites under 583 conditions in 43 plant species for 23 types of PTMs from 293 research articles. Among them, there are 620 509 quantitative events for 136 700 PTM sites on 55 361 substrates under 354 conditions in 34 plant species from 139 published studies. In the qPTMplants database, detailed information about the PTM events was curated and provided, while annotations including the sequence and structural characteristics were also integrated. For convenient usage, qPTMplants can be searched and browsed in a user-friendly manner. Based on the curated Arabidopsis dataset, the sequence characteristics around PTM sites for different PTM types were further analyzed, while the crosstalk among diverse PTMs at the same modified residues was also studied. Taken together, qPTMplants could serve as a useful resource to further investigate the function of plant PTMs.

CONSTRUCTION AND CONTENT

To establish a comprehensive database for PTMs in plants, published PTM datasets from the literature were integrated (Figure 1). First, we retrieved the plant PTM-related literature from PubMed published before July 2021 through a number of keywords, including plant or plant species names combined with PTM types and proteome/proteomic, such as ‘plant phosphoproteome’, ‘plant acetylation proteome’, ‘plant glycosylation proteome’, ‘plant ubiquitination proteome’ and ‘plant nitrosylation proteome’. To avoid missing plant PTM data, additional keywords such as ‘plant large-scale acetylation’, ‘plant MS acetylation’, ‘plant comprehensive analysis acetylation’ and other correlative nomenclatures were further used for the literature search. The number of related studies searched for each keyword is shown in Supplementary Table S1. We initially read the abstracts to determine whether there were PTM-omic datasets; if so or if we were unsure, then we manually checked the full texts and downloaded their corresponding supplementary files to obtain the experimentally identified PTM datasets. If there were localization probabilities available in the datasets, we used a localization probability ≥0.75 as the criterion to further screen credible PTM substrates, sites and modified peptides from the datasets. In addition, we collected detailed information from the literature, including experimental conditions, sample types, enrichment methods, mass spectrometry, reference proteome and raw peptides. For the quantitative PTM events, the log2-transformed ratio (Log2Ratio) and available P value were also manually curated.
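The screening rule described above can be illustrated with a minimal Python sketch. It is not the authors' curation pipeline: the record layout, the field names (`loc_prob` and so on) and the toy values are assumptions made for illustration. Only the ≥0.75 threshold, the "apply it only when a probability is available" behaviour and the log2 transformation of the ratio come from the text.

```python
import math
from typing import List, Optional, TypedDict


class SiteRecord(TypedDict):
    protein: str                 # hypothetical protein identifier
    position: int                # 1-based position of the modified residue
    ptm: str                     # PTM type
    loc_prob: Optional[float]    # None when the study reports no probability


def keep_site(record: SiteRecord, threshold: float = 0.75) -> bool:
    """Keep a site when no localization probability is available,
    or when the reported probability is at least `threshold`."""
    prob = record["loc_prob"]
    return prob is None or prob >= threshold


def log2_ratio(ratio: float) -> float:
    """Log2-transform a quantification ratio (e.g. treatment / control)."""
    return math.log2(ratio)


# Toy records, invented for this example.
records: List[SiteRecord] = [
    {"protein": "protA", "position": 12, "ptm": "Phosphorylation", "loc_prob": 0.98},
    {"protein": "protA", "position": 40, "ptm": "Phosphorylation", "loc_prob": 0.51},
    {"protein": "protB", "position": 88, "ptm": "Acetylation", "loc_prob": None},
]

credible = [r for r in records if keep_site(r)]
```

In this toy input, the site at position 40 is dropped because its probability is below 0.75, while the site without a reported probability is retained, mirroring the conditional wording of the criterion.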
Figure 1.

The construction procedure for the qPTMplants database.

To pinpoint the precise positions of PTM sites on the protein sequences, the collected PTM peptides and modified sites were mapped to the reference proteomes acquired from the commonly used databases TAIR (Araport11) (35), RAP-DB (Release 2021_05) (36) and UniProt (Release 2021_03) (3) for Arabidopsis, rice and other species, respectively. For some species without reference proteomes in UniProt, such as Nicotiana benthamiana, Lotus japonicus and Actinidia deliciosa, we used the original sequences provided in the literature. During the data-mapping process, owing to differences among the reference proteomes employed in the source articles, some raw modified peptides could not be accurately aligned with a protein sequence in the reference proteomes, and these unmapped peptides were discarded. After the collected raw peptides were successfully matched to a protein sequence, new PTM peptides of 15 amino acids were generated, with the modified site in the center flanked by 7 amino acids upstream and 7 downstream. Furthermore, NetSurfP (37) and IUPred2A (38) were adopted to provide structural annotations for PTM substrates. Finally, the online service of qPTMplants was implemented in PHP + MySQL + JavaScript (Figure 1). In total, 1 242 365 PTM events for 429 821 PTM sites on 123 551 proteins under 583 conditions for 23 PTM types in 43 plant species from 293 published studies were collected (Supplementary Table S2). Arabidopsis, maize and rice had the most PTM sites, with 139 101 sites in 35 528 proteins for 17 PTM types, 56 879 sites in 16 297 substrates for 4 PTM types and 41 447 sites in 15 427 proteins for 11 PTM types, respectively (Figure 2A and Supplementary Table S2).
Among the types of PTMs, phosphorylation (295 559 sites, 66.45%), acetylation (40 199 sites, 9.04%), 2-hydroxyisobutyrylation (37 578 sites, 8.45%) and ubiquitination (26 796 sites, 6.02%) accounted for a large proportion of the collected PTM sites (Figure 2A and Supplementary Table S2). In addition, 620 509 quantitative events from 136 700 nonredundant PTM sites on 55 361 proteins for 13 PTM types in 34 plant species were collected, involving 43 samples and 354 different experimental conditions (Figure 2B and Supplementary Table S3). For the plant model organism Arabidopsis, there were 351 822 quantitative events for 43 289 PTM sites on 12 102 substrates for 8 PTM types, and quantitative phosphorylation events predominated, in accord with the overall distribution of detected plant PTM sites (Supplementary Table S3).
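The peptide-mapping and window-generation steps can be sketched as follows. This is a simplified illustration rather than the authors' implementation: exact substring matching, the use of the first occurrence when a peptide matches more than once, the `-` padding character near the protein termini, and the toy sequence are all assumptions. The 15-residue window with 7 flanking residues on each side and the discarding of unmapped peptides follow the text.

```python
from typing import Optional

FLANK = 7   # residues kept on each side of the modified site
PAD = "-"   # assumed padding character for sites near a terminus


def map_site(protein_seq: str, peptide: str, offset_in_peptide: int) -> Optional[int]:
    """Return the 1-based protein position of the modified residue.

    `offset_in_peptide` is the 0-based index of the modified residue
    within the peptide. Returns None when the peptide is not found,
    in which case the peptide would be discarded. If the peptide occurs
    several times, only the first occurrence is used here.
    """
    start = protein_seq.find(peptide)
    if start == -1:
        return None
    return start + offset_in_peptide + 1


def window(protein_seq: str, position: int, flank: int = FLANK) -> str:
    """Build a (2 * flank + 1)-residue window centered on `position` (1-based)."""
    idx = position - 1
    left = protein_seq[max(0, idx - flank):idx]
    right = protein_seq[idx + 1:idx + 1 + flank]
    return left.rjust(flank, PAD) + protein_seq[idx] + right.ljust(flank, PAD)


# Toy sequence, invented for this example.
protein = "MKTAYIAKQRSPSLGEVTKRLDNA"
site = map_site(protein, "QRSPSLGEVTK", 2)   # serine at index 2 of the peptide
centered = window(protein, site) if site is not None else None
```

Here the modified serine maps to position 11 and the resulting window is `AYIAKQRSPSLGEVT`; a site at position 2 would be left-padded to `------MKTAYIAKQ`.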
Figure 2.

Data statistics of qPTMplants. (A) The distribution of site and substrate numbers of different PTM types among different organisms. (B) The number distribution of quantitative sites and substrates of different PTM types among different organisms. Lys represents the lysine residue.


USAGE

For convenient usage, qPTMplants provides multiple options, including search, browse and download, to quickly access the PTM data of plants. The help page of the website provides a detailed tutorial. Here, we describe some of these functionalities. On the home page, a quick search box was provided for fast searching by protein accession, gene or protein name, function or condition (Figure 3A), while the search can be restricted by PTM type and/or organism. In addition, two search options were implemented on the search page: ‘Advanced search’ and ‘BLAST search’. In ‘Advanced search’, various keywords specified in multiple areas could be combined through the operators of ‘AND’, ‘OR’ and ‘NOT’ to perform an accurate query with up to three search keywords under specified species or PTM type of interest (Figure 3B). The option of ‘BLAST search’ was designed to find related information in qPTMplants quickly. Users can input a protein sequence in FASTA format and select the required E value to search identical or homologous proteins through the blastall program in NCBI BLAST packages (39) (Figure 3C).
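To make the ‘Advanced search’ logic concrete, the sketch below combines up to three keyword terms with ‘AND’, ‘OR’ and ‘NOT’ over a list of in-memory records. It does not reflect how the qPTMplants server (PHP + MySQL) evaluates queries: the left-to-right evaluation order, the case-insensitive substring matching, the field names and the toy records are all assumptions.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]
Term = Tuple[str, str]          # (field name, keyword)
Clause = Tuple[str, Term]       # (operator, term), operator in AND / OR / NOT


def matches(record: Record, term: Term) -> bool:
    """Case-insensitive substring match of a keyword in one field."""
    field, keyword = term
    return keyword.lower() in record.get(field, "").lower()


def advanced_search(records: List[Record], first: Term, rest: List[Clause]) -> List[Record]:
    """Combine one leading term with up to two further clauses, left to right."""
    if len(rest) > 2:
        raise ValueError("at most three search keywords are supported")
    hits = []
    for record in records:
        result = matches(record, first)
        for operator, term in rest:
            hit = matches(record, term)
            if operator == "AND":
                result = result and hit
            elif operator == "OR":
                result = result or hit
            elif operator == "NOT":
                result = result and not hit
            else:
                raise ValueError(f"unknown operator: {operator}")
        if result:
            hits.append(record)
    return hits


# Toy records with made-up identifiers.
records = [
    {"protein": "demo1", "gene": "kinaseA", "organism": "Arabidopsis thaliana", "ptm": "Phosphorylation"},
    {"protein": "demo2", "gene": "kinaseB", "organism": "Arabidopsis thaliana", "ptm": "Acetylation"},
    {"protein": "demo3", "gene": "kinaseA", "organism": "Oryza sativa", "ptm": "Phosphorylation"},
]

hits = advanced_search(
    records,
    ("gene", "kinase"),
    [("AND", ("organism", "arabidopsis")), ("NOT", ("ptm", "acetyl"))],
)
```

With these toy records, only `demo1` satisfies "gene contains kinase AND organism contains arabidopsis NOT ptm contains acetyl".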
Figure 3.

The search and browse options in qPTMplants. (A) Simple search function. (B) Advanced search function. (C) BLAST search function. (D) Browse by species. (E) Browse by the PTM types of the selected species. (F) Browse for proteins with the chosen PTM in selected organism. (G) Browse for conditions with the chosen PTM in selected organism.

On the browse page, all plant species in qPTMplants are listed in alphabetical order, and common species are highlighted (Figure 3D). After users click on the species of interest, the numbers of modified proteins and experimental conditions for the collected PTM types in the chosen organism are displayed in the diagram below. Meanwhile, the PTM types in the chosen species are organized alphabetically under the statistical chart (Figure 3E). By clicking on a certain modification, the associated protein IDs (Figure 3F) and detailed conditions (Figure 3G) are shown alphabetically in the table at the bottom. Furthermore, selecting an item from the lists of proteins or conditions takes the user to the corresponding result page (Figure 4A).
Figure 4.

The result pages for the search of a gene name. (A) The returned search results. (B) The information about the experiment. (C) The annotations about the protein. (D) The sequence and structural properties of the PTM substrate.

On the results page, the matched query results can be further filtered by condition, organism and modification. The candidate entries are displayed in a table, which includes the protein ID, gene name, position, modification type, sequence window, sample type, experimental condition, Log2Ratio and P value (Figure 4A). Moreover, by clicking the download button, users can obtain a text file of all search results, which includes the contents of the table (Supplementary Figure S1A). If users want to add labels that clearly mark PTM sites in the output file, they can check the ‘Download with labelled PTMs’ option and then click the download button (Supplementary Figure S1B). In addition, users can view more information about the PTM sites by clicking on the ‘plus’ button. The detailed information includes the following three sections: (i) ‘About experiment’, which includes the description of the source reference, details on the condition, sample type, experimental instrument and method, PRIDE accession and raw peptide (Figure 4B); (ii) ‘About protein’, which shows the protein information comprising database accessions, protein and gene names, protein sequence, organism, functional description and PTMs retrieved from UniProt (Figure 4C); and (iii) ‘Sequence and Structure’, which visualizes the sequence and structural features of the protein, including the PTM sites, secondary structure, disordered region and surface accessibility (Figure 4D). In this section, the diagrams can be scaled by the control element, while the details about the sequence and structure are shown by hovering over the diagram (Figure 4D). Moreover, users can go to the download page to acquire all the PTM data for their own analysis.
We guarantee that we will not collect, edit or disclose any private information sent by users to this website.

DISCUSSION

As a crucial molecular mechanism, PTMs greatly expand the complexity of the proteome and are highly involved in the regulation of numerous biological processes in plants (4,10–16). Thus, deciphering the biological functions of PTM dynamics is very important for understanding plant growth and development. Recent advancements in proteomic technologies have greatly accelerated the high-throughput profiling and quantification of PTM events in plants, and massive amounts of plant PTM data have been generated and accumulated (17–19,40–42). Although a number of public databases have been constructed to host PTM information in the past decade (25–29,32), a more comprehensive resource for plant PTMs with a larger number of modified sites, especially with quantitative information, more types of PTMs and a higher coverage of species is still needed for the academic community. In this work, an integrative database of quantitative PTMs in plants named qPTMplants was developed, and it contains 429 821 PTM sites on 123 551 substrates for 23 PTM types from 43 plant organisms, with 620 509 quantitative events for 136 700 PTM sites on 55 361 proteins under 354 conditions. Because different catalytic enzymes can recognize specific peptides, and different amino acid compositions can form diverse microenvironments (43–45), the sequence profile around the modified sites may affect the occurrence of the corresponding modification. Therefore, we adopted pLogo (46) to analyze the sequence preferences of different PTMs based on the Arabidopsis dataset in qPTMplants. For phosphorylation, proline (P) and serine (S) were significantly enriched at the +1 and ±2 positions of phosphoserine and phosphothreonine, whereas arginine (R) was highly enriched at the −3 position for phosphoserine. Additionally, S residues were highly abundant surrounding phosphotyrosine (Figure 5A). These results were consistent with previous studies showing that protein phosphorylation favors -RXXS-, -SP- and -TP-type motifs (47–49).
For three types of lysine modifications with the most modified sites, R was overrepresented at the +1 position for acetylation and highly abundant around the ubiquitination sites, suggesting that these two PTMs may mutually interplay at the same site, whereas glutamic acid (E) was significantly enriched at the +2 position for lysine SUMOylation (Figure 5B). For cysteine modifications, lysine (K) frequently appeared at positions ±5 and +1 surrounding S-sulfenylation sites, while E was overrepresented at positions −1, ±2 and ±3 for S-nitrosylation. Additionally, K was enriched at the −4 and −1 positions, and alanine (A) residues were distributed at the +1 and +3 positions for S-cyanylation (Figure 5C). In summary, different PTM types could recognize diverse sequence features in plants, whereas some PTMs occurring on the same residue appear to interact with each other (20,47–49).
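As a rough illustration of what a sequence-preference analysis starts from, the sketch below tallies residue counts at each position of the 15-residue windows relative to the modified site. It is deliberately simpler than pLogo, which scores over- and under-representation statistically against a background set; no such statistics are computed here. The `-` padding convention and the toy windows are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List


def position_frequencies(windows: List[str], flank: int = 7) -> Dict[int, Counter]:
    """Count residues at each offset (-flank .. +flank) around the central site.

    Offset 0 is the modified residue. Padding characters ('-') are skipped.
    """
    expected = 2 * flank + 1
    counts: Dict[int, Counter] = {off: Counter() for off in range(-flank, flank + 1)}
    for seq in windows:
        if len(seq) != expected:
            raise ValueError(f"window {seq!r} is not {expected} residues long")
        for i, residue in enumerate(seq):
            if residue != "-":
                counts[i - flank][residue] += 1
    return counts


# Toy windows, invented for this example: two carry R at -3, all carry P at +1.
toy_windows = [
    "AAAARAASPAAAAAA",
    "GGGGRGGSPGGGGGG",
    "LLLLLLLSPLLLLLL",
]

freqs = position_frequencies(toy_windows)
```

In this toy set, proline is counted three times at +1 and arginine twice at −3, the kind of raw positional signal that a tool such as pLogo then tests for significance.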
Figure 5.

Sequence characteristics around the PTM sites in Arabidopsis. (A) The sequence preferences for phosphorylation. (B) Sequence preferences for three lysine modifications. (C) Sequence preferences for three cysteine modifications.

Different PTM types can occur on the same residue or multiple residues of the same protein, and these modifications may collectively modulate various biological processes in a manner named PTM crosstalk (6,50–53). From the collected PTM data of Arabidopsis in qPTMplants, 49 types of PTM crosstalk at the same residue from 6134 PTM events in 2972 sites were identified, including 25 types of crosstalk between two different PTMs and 24 types of crosstalk among multiple PTMs (Figure 6 and Supplementary Table S4). As representative examples, there were 1481 acetylation–ubiquitination sites, 360 phosphorylation–O-GlcNAcylation sites and 348 S-sulfenylation–S-nitrosylation sites in our results (Figure 6). This suggests that competitive regulation between acetylation and ubiquitination, phosphorylation and O-GlcNAcylation, and S-sulfenylation and S-nitrosylation is widespread in plants. For instance, Xu et al. detected 31 vernalization-associated proteins with both O-GlcNAcylation and phosphorylation in wheat (54). The phosphorylation level of S350 on Fru-bisphosphate aldolase was reduced during vernalization, whereas the O-GlcNAcylation of S350 appeared after vernalization, indicating that the correlation of O-GlcNAcylation and phosphorylation may participate in vernalization regulation (54). The intensive PTM crosstalk in plants suggests that a large proportion of serine, threonine, lysine and cysteine residues could be dynamically regulated by various types of PTMs in a complicated manner.
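Same-residue crosstalk of the kind counted above can be detected by grouping PTM events by protein and position and keeping the sites that carry more than one PTM type. The sketch below shows one way to do this; the event tuple layout and the toy data are assumptions, and it is not claimed to reproduce the authors' procedure or their counts.

```python
from collections import Counter, defaultdict
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Event = Tuple[str, int, str]   # (protein ID, 1-based position, PTM type)


def crosstalk_counts(events: Iterable[Event]) -> "Counter[FrozenSet[str]]":
    """Count sites per combination of PTM types found on the same residue.

    Sites carrying a single PTM type are ignored. Repeated events of the
    same type at one site are counted once.
    """
    by_site: Dict[Tuple[str, int], Set[str]] = defaultdict(set)
    for protein, position, ptm in events:
        by_site[(protein, position)].add(ptm)

    combos: "Counter[FrozenSet[str]]" = Counter()
    for ptms in by_site.values():
        if len(ptms) >= 2:
            combos[frozenset(ptms)] += 1
    return combos


# Toy events with made-up protein identifiers.
events = [
    ("protA", 10, "Acetylation"),
    ("protA", 10, "Ubiquitination"),
    ("protA", 25, "Phosphorylation"),
    ("protB", 5, "Acetylation"),
    ("protB", 5, "Ubiquitination"),
    ("protB", 5, "Acetylation"),
    ("protC", 7, "Phosphorylation"),
    ("protC", 7, "O-GlcNAcylation"),
]

combos = crosstalk_counts(events)
```

Using a `frozenset` as the key makes the combination order-independent, so acetylation–ubiquitination and ubiquitination–acetylation are counted as the same crosstalk type.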
Figure 6.

The distribution of potential crosstalk among different PTM types in Arabidopsis and the numbers of concurrent sites are presented at the bottom.

Taken together, we constructed the qPTMplants database, which not only collected a large amount of PTM data in plants but also provided detailed annotations such as protein information, quantitative dynamics, and sequence and structural characteristics. In the future, qPTMplants will be regularly maintained and annually updated by recruiting more researchers to survey the newly published literature and manually collect the experimentally detected PTM data. In addition, more annotations, such as substrate three-dimensional structures, potential regulatory enzymes of PTMs, information on biotic and abiotic stresses, PTM networks and single-nucleotide polymorphisms, will be integrated into qPTMplants to provide more detailed and comprehensive information. We anticipate that the qPTMplants database will be a useful resource for further studies of PTMs in plants.
  54 in total

Review 1.  Protein S-nitrosylation in plants: photorespiratory metabolism and NO signaling.

Authors:  Kapuganti J Gupta
Journal:  Sci Signal       Date:  2011-01-04       Impact factor: 8.192

Review 2.  Starch phosphorylation and the in vivo regulation of starch metabolism and characteristics.

Authors:  Yuxian You; Mingyue Zhang; Wen Yang; Cheng Li; Yuntao Liu; Caiming Li; Jialiang He; Wenjuan Wu
Journal:  Int J Biol Macromol       Date:  2020-05-21       Impact factor: 6.953

3.  Quantitative Phosphoproteomic and Metabolomic Analyses Reveal GmMYB173 Optimizes Flavonoid Metabolism in Soybean under Salt Stress.

Authors:  Erxu Pi; Chengmin Zhu; Wei Fan; Yingying Huang; Liqun Qu; Yangyang Li; Qinyi Zhao; Feng Ding; Lijuan Qiu; Huizhong Wang; B W Poovaiah; Liqun Du
Journal:  Mol Cell Proteomics       Date:  2018-03-01       Impact factor: 5.911

4.  The Protein Modifications of O-GlcNAcylation and Phosphorylation Mediate Vernalization Response for Flowering in Winter Wheat.

Authors:  Shujuan Xu; Jun Xiao; Fang Yin; Xiaoyu Guo; Lijing Xing; Yunyuan Xu; Kang Chong
Journal:  Plant Physiol       Date:  2019-05-06       Impact factor: 8.340

5.  Regulation of Aluminum Resistance in Arabidopsis Involves the SUMOylation of the Zinc Finger Transcription Factor STOP1.

Authors:  Qiu Fang; Jie Zhang; Yang Zhang; Ni Fan; Harrold A van den Burg; Chao-Feng Huang
Journal:  Plant Cell       Date:  2020-10-21       Impact factor: 11.277

6.  Large-scale Arabidopsis phosphoproteome profiling reveals novel chloroplast kinase substrates and phosphorylation networks.

Authors:  Sonja Reiland; Gaëlle Messerli; Katja Baerenfaller; Bertran Gerrits; Anne Endler; Jonas Grossmann; Wilhelm Gruissem; Sacha Baginsky
Journal:  Plant Physiol       Date:  2009-04-17       Impact factor: 8.340

7.  Aurora1 phosphorylation activity on histone H3 and its cross-talk with other post-translational histone modifications in Arabidopsis.

Authors:  Dmitri Demidov; Susann Hesse; Annegret Tewes; Twan Rutten; Jörg Fuchs; Raheleh Karimi Ashtiyani; Sandro Lein; Andreas Fischer; Gunter Reuter; Andreas Houben
Journal:  Plant J       Date:  2009-07       Impact factor: 6.417

Review 8.  Protein N-Terminal Acetylation: Structural Basis, Mechanism, Versatility, and Regulation.

Authors:  Sunbin Deng; Ronen Marmorstein
Journal:  Trends Biochem Sci       Date:  2020-09-08       Impact factor: 13.807

9.  NCBI BLAST: a better web interface.

Authors:  Mark Johnson; Irena Zaretskaya; Yan Raytselis; Yuri Merezhuk; Scott McGinnis; Thomas L Madden
Journal:  Nucleic Acids Res       Date:  2008-04-24       Impact factor: 16.971

10.  P³DB 3.0: From plant phosphorylation sites to protein networks.

Authors:  Qiuming Yao; Huangyi Ge; Shangquan Wu; Ning Zhang; Wei Chen; Chunhui Xu; Jianjiong Gao; Jay J Thelen; Dong Xu
Journal:  Nucleic Acids Res       Date:  2013-11-15       Impact factor: 16.971

