Protein post-translational modifications: In silico prediction tools and molecular modeling.

Martina Audagnotto¹, Matteo Dal Peraro¹.

Abstract

Post-translational modifications (PTMs) occur in almost all proteins and play an important role in numerous biological processes by significantly affecting proteins' structure and dynamics. Several computational approaches have been developed to study PTMs (e.g., phosphorylation, sumoylation or palmitoylation), showing the importance of these techniques in predicting modified sites that can be further investigated with experimental approaches. In this review, we summarize some of the available online platforms and their contribution to the study of PTMs. Moreover, we discuss the emerging capabilities of molecular modeling and simulation that are able to complement these bioinformatics methods, providing deeper molecular insights into the biological function of post-translationally modified proteins.

Year: 2017 PMID: 28458782 PMCID: PMC5397102 DOI: 10.1016/j.csbj.2017.03.004

Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271

Introduction

Post-translational modifications (PTMs) occur on a large number of proteins, de facto increasing the actual complexity of the proteome. PTMs consist of a covalent modification of amino acids of the primary protein sequence [1] and have the effect of creating a much larger array of possible protein species. In response to specific physiological requirements, PTMs play a crucial role in regulating many biological functions [2], such as protein localization in the cell [3], [4], protein stability [5], and regulation of enzymatic activity [6]. To date, more than 90,000 individual PTMs have been detected using biochemical and biophysical analyses [7]. In particular, it was observed that almost 5% of the human genome encodes enzymes in charge of catalyzing reactions leading to PTMs [8], highlighting once more the importance of these chemical modifications of the proteome. Enzymes are often responsible for regulating these chemical modifications in proteins, as in the case of phosphorylation, acetylation, methylation, carboxylation or hydroxylation [9]. For instance, protein kinases can phosphorylate a given protein target to induce a signaling cascade, while this PTM can subsequently be removed by specific protein phosphatases. These enzymes are indeed found in important signaling pathways, like G-protein [9], [10] and Wnt signaling [11], [12]. On the other hand, PTMs not induced by specific enzymes (e.g., carbonylation or oxidation) were observed to be responsible for non-specific protein damage involved in neurodegenerative diseases, cancer and diabetes [13], [14], [15]. During the past 30 years, experimental techniques used for mapping and quantifying PTMs have seen impressive progress. In particular, liquid chromatography (LC) coupled with mass spectrometry (MS) has allowed the detection of thousands of PTMs across entire proteomes [16]. 
The study of PTMs in their biological context was achieved thanks to advancements in fluorophore chemistry, fluorescence spectrometry, and peptide and antibody synthesis [17]. However, the identification and characterization of PTMs are still limited by the poor knowledge of the underlying enzymatic reactions and their final effects on protein stability and dynamics. In this context, in silico methods, often based on the current knowledge of PTMs, are a promising strategy to perform preliminary analysis and prediction that can guide further in vivo and in vitro experiments, expanding our understanding of the role of PTMs in cellular processes. In this review, we provide an overview of some of the existing computational approaches used to study the most common PTMs, which we classified based on the covalent attachment of (i) small chemical groups, (ii) lipids or (iii) small proteins to the main peptide chain. Most of these tools are presented as online webservers, providing a user-friendly interface for PTM site identification. Although in this review we mainly focus our attention on this kind of resource, standalone software, such as PEAKS PTM [18], GlycoMaster [19] or MODa [20], is also available but will not be covered here.

Covalent attachment of small chemical groups

Phosphorylation

Phosphorylation is the most studied PTM that involves the covalent addition of a small chemical group [21]. It is a reversible enzymatic reaction, which consists of the attachment of a phosphate group to the side chain of an arginine, lysine, histidine, tyrosine, serine or threonine residue [22] (Fig. 1A). It plays a key role in almost every cellular process, including metabolism, division, organelle trafficking, membrane transport, immunity, learning and memory [23], [24], and in the function of target proteins [25]. It can activate [26], [27] and inhibit [28], [29] enzyme activity through allosteric conformational changes, facilitate the recognition of other proteins [30], [31], [32], [33], promote protein-protein association [34], [35], [36] or dissociation [37] and also induce order-to-disorder transitions [38], [39] (Table 2).

Fig. 1

Schematic representation of PTMs discussed in this review.

Table 2

Schematic relationship between PTMs and their implication in biological functions.

It was estimated that 30% of the total proteome is phosphorylated on at least one residue [40], [41]. However, this simple switch mechanism is in reality more complex, since multiple enzymes can act on multiple sites of the same protein, creating a highly connected network of interactions and modifications. For example, it was shown by high-resolution mass spectrometry that 37,248 phosphorylation sites are present on 5705 proteins in adipocyte cells [42]. Other phospho-proteomic analyses demonstrated that proteins, on average, could be phosphorylated on at least five different sites, although these results could suffer from biases coming from high stoichiometry of the complexes [43], [44], [45]. Advances in mass spectrometry, in terms of both speed and sensitivity, have allowed the identification and quantification of thousands of phosphorylation sites in different species [42], [45]. The conservation of functional phosphorylation sites in species like mice, rats and flies is a feature used by biologists for selecting specific sites of interest for functional characterization. Therefore, mapping the phosphorylation sites on proteins is an important step in understanding the catalytic process and the effects of signal transduction events. However, it is still generally difficult to identify specific phosphorylation sites, and in silico predictions play an important role in this field. Several methods (Table 1) were implemented to predict the target phosphorylation sites from sequence- and structure-based analysis of the catalytic domain of specific protein kinases, such as KinasePhos2.0 [46] or GPS [47]. In particular, GPS is a group-based phosphorylation algorithm, which predicts kinase-specific phosphorylation sites for 71 protein kinase groups, such as Aurora-A, Aurora-B and NimA-like protein kinases. 
Other methods were instead implemented to predict phosphorylation sites simply from the substrate primary sequences. For example, Scansite [48] combines experimental binding and/or substrate information to derive a weighted matrix-based scoring that predicts protein-protein and protein-phospholipid interactions, as well as phosphorylation sites. NetPhos [49] is instead based on an artificial neuronal network that allows users to choose between generic predictions based only on the substrate protein sequence and kinase-specific predictions.
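
To make the idea of weight-matrix scoring concrete, the following is a minimal sketch of how a position-specific log-odds matrix could be built from known sites and used to rank candidate serine, threonine and tyrosine residues. It is not the Scansite implementation: the training peptides, the uniform background, the pseudocount and the function names build_weight_matrix and score_sites are all illustrative assumptions.

```python
import math

# Hypothetical toy training set: 7-residue windows centred on a phosphoserine.
# These peptides are illustrative only and are not taken from Scansite.
KNOWN_SITES = ["RRASLPG", "KRNSVDE", "RRLSIAE", "RKGSLRQ"]
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_weight_matrix(windows, pseudocount=1.0):
    """Return one log-odds dict per window position (uniform background assumed)."""
    background = 1.0 / len(AMINO_ACIDS)
    matrix = []
    for pos in range(len(windows[0])):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for window in windows:
            counts[window[pos]] += 1.0
        total = sum(counts.values())
        matrix.append(
            {aa: math.log2((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return matrix


def score_sites(sequence, matrix, targets="STY"):
    """Score every S/T/Y whose full window fits inside the sequence."""
    half = len(matrix) // 2
    results = []
    for i in range(half, len(sequence) - half):
        if sequence[i] in targets:
            window = sequence[i - half:i + half + 1]
            score = sum(matrix[j][aa] for j, aa in enumerate(window))
            results.append((i + 1, sequence[i], round(score, 2)))  # 1-based position
    return results


matrix = build_weight_matrix(KNOWN_SITES)
for position, residue, score in score_sites("MGSKRRASLPGTEEL", matrix):
    print(position, residue, score)
```

In this toy example the serine at position 8 sits in a window identical to one of the training peptides and therefore scores higher than the threonine at position 12. A real predictor would use curated substrate data and a calibrated score threshold.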

Table 1

PTM prediction webservers. Abbreviations: artificial neuronal network (ANN); support vector machine (SVM); random forest method (RFM); Hidden Markov model (HMM); weight matrix (WM); group based phosphorylation scoring method (GPS); binary profile of patterns (BPP); composition profile of patterns (CPP); PSSM profile of patterns (PPM); average surface accessibility (ASA); neuronal network (NN); knowledge-based (KB); conditional random field (CRF); group-based prediction (GBP); binary profile bayesian (BPB); information gain (IG); Bayesian discriminant (BD); enrichment based method (EBM); binary-relative adaptive binomial score Bayesian (Bi-BSP); logistic regression model (LRM); synthetic minority oversampling technique (SMOT); Markov chain clustering (MCC); particle swarm optimization (PSO); genetic variability (GV); position frequency matrix (PFM); covariance discriminant algorithm (CD); machine learning (ML).

Covalent attachment of small chemical groups
PTM type	Web server	URL	Year	Description	Method	Information
Phosphorylation	NetPhos 3.1	http://www.cbs.dtu.dk/services/NetPhos/	1999	K-specific and K-independent	ANN	Prediction based on 17 different kinases
	Scansite	http://scansite.mit.edu	2003	K-specific	WM	Identification of short protein sequence motifs that are recognized by modular signaling domains or mediated specific interaction with proteins
	PhosphoSitePlus	http://www.phosphosite.org/siteSearchAction.action	2004	K-specific	–	Repository of human and mouse phosphorylation sites
	GPS	http://gps.biocuckoo.org/online.php	2005	K-specific	GPS	Prediction based on 71 PK groups (e.g. Aurora-A, Aurora-B and NIMA)
	KinasePhos 2.0	http://kinasephos2.mbc.nctu.edu.tw	2007	K-specific	SVM	SVM coupled with protein coupling pattern
	PhosphoELM	http://phospho.elm.eu.org	2010	K-independent	–	Repository of in vivo and in vitro phosphorylation sites
	PPRED	http://biomecis.uta.edu/~ashis/res/ppred/	2010	K-independent	SVM	Prediction based on evolutionary information
	PhosPhortholog	http://www.phosphortholog.com	2015	K-independent	–	Database for cross-species comparison
Glycosylation	bigPI	http://mendel.imp.ac.at/gpi/gpi_server.html	1999	GPI-anchor	KB	Prediction for protozoa and metazoa
	O-GlycBase	http://www.cbs.dtu.dk/databases/OGLYCBASE/	1999	O-glycosylated	–	Repository of O-glycosylated proteins based on protein sequence database and scientific literature
	GlycoMod	http://web.expasy.org/glycomod/	2001	N-,O-glycosylated	Experimental determined	Match between the experimentally determined masses and the predicted protease (SWISSPROT and TrEMBL databases)
	YinOYang	http://www.cbs.dtu.dk/services/YinOYang/	2001	N-,C-,O-glycosylated	NN	Prediction based on eukaryotes protein sequences
	NetNGlyC	http://www.cbs.dtu.dk/services/NetNGlyc/	2002	N-glycosylated	NN	Prediction for procaryotes
	GlyProt	http://www.glycosciences.de/glyprot/	2005	N-glycosylated	SWEET-II	3D model of glycoproteins based on a PDB structure without attached glycans
	GPP	http://comp.chem.nottingham.ac.uk/glyco/	2008	N-,C-,O-glycosylated	RF	Prediction of glycosylation sites and the propensity of association with modified residues
	NGlycPred	https://exon.niaid.nih.gov/nglycpred/	2012	N-glycosylated	RF	Combination of different structure and residues pattern information
	GLYCOPP	http://www.imtech.res.in/raghava/glycopp/submit.html	2012	N-,O-glycosylated	SVM	Prediction based on different approaches (BPP, CPP, PPP, ASA + BPP)
	NetOGlyC	http://www.cbs.dtu.dk/services/NetOGlyc/	2013	O-glycosylated	NN	Prediction for prokaryotes
	GlycoMine	http://www.structbioinfor.org/Lab/GlycoMine/#webserver	2015	N-,C-,O-glycosylated	RF	Determination of the features important for glycosylation site specificity
S-nitrosylation	GPS-SNO	http://sno.biocuckoo.org/online.php	2010	SNO sites	GBP	Prediction of putative SNO based on a database of 504 experimentally verified SNO
	iSNO-PseAAC	http://app.aporc.org/iSNO-PseAAC/	2013	SNO sites	CRF	Identification of nitrosylated protein on an independent data set (731 experimentally verified SNO and 810 experimentally non verified SNO)
Methylation	MeMo	http://www.bioinfo.tsinghua.edu.cn/~tigerchen/memo.html	2006	R-,L-methylated	SVM	Prediction based on orthogonal binary coding scheme for representing protein sequence fragments
	BPB-PPMS	http://www.bioinfo.bio.cuhk.edu.hk/bpbppms/	2009	R-,L-methylated	BPB and SVM	Prediction based on experimental data
	MASA	http://masa.mbc.nctu.edu.tw/	2009	K-,R-,E-,N-methylated	SVM	Prediction based on structural information (SASA and secondary structures)
	PMes	http://bioinfo.ncu.edu.cn/inquiries_PMeS.aspx	2012	R-,K-methylated	SVM	Prediction based on physiochemical properties (VdW volume, position weight aminoacid, composition, solvent, SASA)
	MethK	http://csb.cse.yzu.edu.tw/MethK/	2014	K-methylated histone	SVM	Differentiation between K-methylated Histone and K-methylated non-Histone
	iMethyl-PseAAC	http://www.jci-bioinfo.cn/iMethyl-PseAAC	2014	R-,K-methylated	SVM	Prediction based on physiochemical properties, sequence evolution, biochemical and structural disorder information
	PSSMe	http://bioinfo.ncu.edu.cn/PSSMe.aspx	2016	R-,L-methylated	IGF	Prediction based on species-specific models
N-acetylation	NetAcet	http://www.cbs.dtu.dk/services/NetAcet/	2004	Nα-acetylated	NN	Prediction for yeast and mammalian
	PAIL	http://bdmpail.biocuckoo.org/prediction.php	2006	Nε-,K-acetylated	BD	Prediction based on dataset of 246 acetylated substrates
	N-Ace	http://n-ace.mbc.nctu.edu.tw	2010	K-,A-,G-,M-,S- and T-acetylated	SVM	Prediction based on physiochemical properties
	ASEB	http://bioinfo.bjmu.edu.cn/huac/	2012	K-acetylated	EBM	Prediction based on protein-protein interaction information
	BRABSB-PHKA	http://www.bioinfo.bio.cuhk.edu.hk/bpbphka/	2012	K-acetylated	Bi-BSB	Prediction for human-specific lysine acetylated sites
	PSKacePred	http://bioinfo.ncu.edu.cn/inquiries_PSKAcePred.aspx	2012	K-acetylated	SVM	Prediction based on amino acid composition, evolutionary similarity and physiochemical properties
	LAceP	http://www.scbit.org/iPTM/	2014	K-acetylated	LRM	Prediction based on physiochemical properties

Recently, in order to overcome the limitations due to a training set based only on the same type of kinases, two general predictors were developed: PPRED [50], which incorporates evolutionary information, and PhosphOrtholog [51], which enables cross-species comparison of large-scale phosphorylation sites. Finally, several online databases are also available to curate and organize information about phosphorylation sites studied in vivo and in vitro in human and mouse proteomes (PhosphoSitePlus [52]), as well as rat, fly, yeast and worm (PhosphoELM [53]).

Glycosylation

Protein glycosylation is one of the most relevant and complex post-translational modifications in the cell [54], [55], which is thought to influence almost half of all proteins in nature [56]. It consists of a covalent interaction between a glycosyl donor of a glycan and a glycosyl acceptor amino acid side chain of a protein [57] (Fig. 1B). Protein glycosylation can be divided into four main categories based on the linkage between the amino acid and the sugar: N-linked glycans, O-linked glycans, GPI anchors and C-mannosylation. In N-glycosylation, a sugar is attached to the side-chain amide nitrogen of an asparagine [58], while O-glycosylation is characterized by the interaction of a sugar with the hydroxyl group of a serine or threonine [59]. GPI anchors consist of the attachment of glycosylphosphatidylinositol near the C-terminus of a protein chain, anchoring the protein to the membrane [60]. C-mannosylation occurs when an α-mannopyranosyl moiety is attached to the indole of a tryptophan via a C-C bond [61]. Glycosylation modulates several protein biophysical properties, influencing their native functions [2]. In particular, it was observed that it could alter not only protein thermodynamic and kinetic properties, but also the structural features of proteins [62]. The covalent attachment of large hydrophilic carbohydrates modulates protein stability, oligomerization and aggregation [62], [63], [64], host cell-surface interactions [65], enzyme activity [66] and protein trafficking [67] (Table 2). Several analytical tools facilitating glycan analysis were developed over the past 2–3 decades. In particular, capillary electrophoresis, liquid chromatography, mass spectrometry and microarray-based methods are extensively used in glycoproteomics [68], [69], [70]. 
None of these tools can produce, however, a detailed molecular characterization of glycosylated proteins. The high heterogeneity of glycans and the difficulty of obtaining them in large amounts still preclude investigating the role of glycosylation at the molecular level. In the past years, the number of crystallographic structures of glycoconjugate proteins has increased [71], [72], [73], [74], [75], [76]; nevertheless, a complete chemical and structural description of a glycan structure is still challenging. Mass spectrometry as well as different web-servers (Table 1) currently provide information about existing glycosylation sites. Indeed, in the last decade, several algorithms, trained with sequences or sequence-based information, have been developed to improve the prediction of glycosylation sites. Some of these resources are based on neuronal network algorithms, such as NetNGlyc, NetOGlyc [77] for prokaryotes, or YinOYang [78] for eukaryotes. Other useful tools are the GlycoMod [79] server, for the prediction of glycan structures based on experimentally determined masses, and the NGlycPred [80] server, which incorporates both structure and residue pattern information. More recent developments include the prediction of glycosylation sites based on machine learning algorithms (e.g., GlycoMine [81]), an approach that has produced a significant improvement over the prediction performance of NetNGlyc [82] and NetOGlyc [77].
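
As a point of comparison for the sequence-based predictors above, the simplest possible filter for candidate N-glycosylation sites is a motif scan. The sketch below assumes the commonly cited N-X-S/T sequon (X being any residue except proline), which is general background knowledge and is not stated in this review. The function name find_sequons and the test sequence are hypothetical, and a sequon match indicates only a candidate site, not a confirmed modification.

```python
import re

# Assumption: the canonical eukaryotic sequon N-X-S/T with X != Pro.
# A lookahead is used so that overlapping sequons (e.g. "NNSS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of asparagines found in an N-X-S/T context."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical sequence: N3 (NGT), N13 (NNS) and N14 (NSS) match; N8 (NPS) does not.
print(find_sequons("MKNGTAANPSLLNNSSQ"))  # [3, 13, 14]
```

Dedicated servers add context that a plain motif scan cannot capture, which is why the trained predictors listed in Table 1 are preferable for anything beyond a first pass.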

S-nitrosylation

S-nitrosylation (SNO) consists of the covalent attachment of a nitric oxide (NO) group to cysteine thiol moieties (Fig. 1C). Unlike phosphorylation, SNO is not catalyzed by an enzyme, but depends on the chemical reactivity between the nitrosylating agent and the target; thus the environment of the specific residues influences the reactivity of the target protein. The concentrations of the nitrosylating agent and the protein, as well as the stability of the S-NO bond under physiological conditions, in turn influence the specificity of this reaction. Over the past two decades, hundreds of soluble [83], [84], [85], [86], [87], [88], [89], [90], [91] or membrane [92], [93] proteins have been identified as S-nitrosylated. The SNO modification not only modulates protein stability and activity [94], [95], but also plays an important role in a variety of biological processes, such as cell signaling, transcriptional regulation, apoptosis and chromatin remodeling [96] (Table 2). Increasing evidence indicates that aberrant S-nitrosylation is implicated in various diseases like cancer [97], Parkinson's [98], [99], Alzheimer's [100] and amyotrophic lateral sclerosis [101]. Thus the identification of SNO sites in proteins can also be very important for drug development. Although S-NO bonds are highly labile and redox-sensitive, several techniques can detect SNO in cells. There are methods for the direct detection of S-nitrosylated sites, such as the measurement of the characteristic S-NO absorbance at 340 nm, electrospray ionization mass spectrometry (ESI-MS) [102] and 15N NMR [103]. Ozone chemiluminescence [104] and specific reduction with Cu+/cysteine [105] at pH 6 are indirect chemical methods that are instead based on the analysis of the cleavage products of SNO. 
Biotin switch assays and chemical reduction/chemiluminescence assays are specific and sensitive methods for measuring low levels of intracellular S-nitrosylated proteins. However, these experiments are laborious and low-throughput due to the labile nature and low abundance of SNO. Therefore, computational methods again represent a valid alternative for the timely and reliable identification of SNO protein sites for further experimental verification. Several benchmark datasets were developed during the past years. SNOSID [106] tests the prediction performance on 65 positive and 65 negative samples, while GPS-SNO [107] was developed based on 549 experimentally verified SNO sites. A support vector machine (SVM) algorithm [108] and a nearest neighbor algorithm (NNA) [109] were also proposed to predict SNO sites. However, no web server was later developed for any of these methods, so their current usage is quite limited. Alternative web-servers are iSNO-PseAAC [110], which identifies nitrosylated proteins on an independent dataset, predicting sites with 90% accuracy [110], and GPS-SNO [107], which also represents a valid tool for experimentalists, providing information on hundreds of potentially S-nitrosylated substrates that have not yet been experimentally determined [110] (Table 1).

Methylation

Protein methylation is a reversible PTM that modifies the nitrogen atoms of either the backbone or the side chain of several types of amino acids, such as lysine, arginine, histidine, alanine and asparagine [111], [112], [113], [114], [115], [116], [117]; methylation has also been reported at cysteine residues (S-methylation) [118] (Fig. 1D). Despite this variability, most studies have focused predominantly on lysine and arginine modifications. Methylation research dates back to 1939, but has only recently attracted growing attention [111] with the identification of new methyltransferases, like protein arginine methyltransferases (PRMTs) [119], [120], [121] or histone lysine methyltransferases (HKMTs) [122], [123], [124], which catalyze mono [125] or double [111], [126] methylation. In particular, the methylation of the N-terminal tails of histones plays an important role in gene expression regulation [127], genome stability [128] and nuclear architecture [129], influencing several biological processes such as transcription [130], [131] and chromosome maintenance [132] (Table 2). Methylation can also occur on the C-5 position of the cytosine ring of DNA (DNA methylation), and is associated with several human diseases such as cancer, mental retardation (Angelman syndrome) or diabetes mellitus [133]. Although different biological processes are linked to DNA and histone methylation, there seems to be a mutual relationship between these processes, which could play an important role in gene expression [134]. Methylated proteins, as well as methylation regulatory enzymes, are involved in several human diseases such as cancer [126], [135], [136], cardiovascular diseases [137], multiple sclerosis [138] and neurodegenerative disorders [139]. Thus, the inhibition of these enzymes with small molecules could be an effective means of therapeutic intervention [140]. 
Moreover, while identifying methylation sites is key, understanding the mechanistic and dynamic features of methylation is equally important. In the past years several experimental methods were developed to study the molecular mechanism of methylation. Mutagenesis of potential methylated residues, methylation-specific antibodies [141], as well as ChIP-chip [142] were extensively used for this purpose. Recently, mass spectrometry experiments have also been applied, allowing the identification of 249 arginine-methylated sites in 131 proteins from T cells [143]. However, these techniques are usually very expensive and laborious, limiting the search for potential methylation sites. Computational predictions of methylation sites have helped address these limitations, providing an important resource for reducing the number of experiments needed to determine protein methylation sites. Eight web-servers for the prediction of methylation sites are currently available (Table 1). MeMo [144] is one of the first online tools to become available. It uses a support vector machine (SVM) as its prediction algorithm. Its dataset is based on a curated selection of all methylated residues annotated in SWISS-PROT [145], together with 264 manually verified experimental methyl-lysine and 107 methyl-arginine sites extracted from roughly 1700 scientific articles. MeMo [144] appears to be more powerful at predicting methylated-arginine sites than methylated-lysine sites. However, its accuracy is affected by the lack of training data available at the time of development. Later, the reliability of the prediction was improved by BPB-PPMS [146], where a Bi-Profile Bayesian approach was used to define methylated and non-methylated sites based on known experimental data [147], [148]. The data set was increased to 363 candidates containing methylated arginines and 977 methylated lysine proteins. 
The combination of Bi-Profile Bayesian features with a larger data set improved the methylation prediction accuracy up to 92% for methylated lysine proteins and 88% for methylated arginine proteins [149]. It was observed that protein methylation mainly occurs in regions that are easily accessible and intrinsically disordered; thus MASA [149] used Solvent Accessible Surface Area (SASA) and secondary structure information for predicting methylated sites. This web-server allows the prediction not only of methylated lysines and methylated arginines, but also of methyl-glutamates. However, most of these methods use only primary sequence information without taking into account any physicochemical property of the residues. With the aim of improving the quality of the prediction, a novel approach called PMes [150] was introduced, which considers the physicochemical properties of amino acids surrounding methylation sites. A specific lysine-methylation prediction tool for histones was also proposed: MethK [151] uses amino acid composition, SASA, amino acid pair composition (i.e., the frequency of amino acid pairs in the primary sequence), amino acid index and protein disorder regions for discriminating between methylated lysine sites in histones and in non-histone proteins. More recently, another web-server has been introduced for the identification of in vivo or in vitro species-specific methylation sites: PSSMe [152] was tested on a large-scale dataset of experimentally methylated sites originating from different species, revealing that methylation patterns are indeed species dependent.

N-acetylation

Protein acetylation is a covalent post-translational modification where the acetyl group from acetyl coenzyme A (acetyl CoA) is transferred either to the α-amino group of terminal residues (Nα-acetylation) or to the ε-amino group of internal lysines at specific sites (Nε-acetylation) [153], [154], [155], [156], [157] (Fig. 1E). Although Nα-acetylation is more common (roughly 85% in eukaryotic proteins), Nε-lysine acetylation is more biologically important [156], [157], [158], [159], [160], [161], [162], [163]. Indeed, Nε-acetylation on internal lysines is a reversible post-translational modification involved in several biological processes, such as transcription regulation [159], [161], protein expression and stability [153], [164], [165], [166], [167], DNA repair [162], apoptosis [160], [163] and nuclear import [158] (Table 2). Aberrant lysine acetylation is linked with cancer [157], [168], [169], [170], neurodegenerative disorders [171], [172], [173] and cardiovascular diseases [174], [175], [176], [177], [178]. Thus, the identification of acetylation sites is important for shedding light on the acetylation mechanism at the basis of numerous diseases [179]. Experimentally, several techniques were applied to explore N-acetylation, such as radioactivity detection [180], immunoaffinity detection and chromatin immunoprecipitation [181]. The development of high-throughput technologies like immunoprecipitation combined with mass spectrometry also increased the number of detected acetylated proteins [182]. However, the experimental detection of acetylated sites is inefficient, expensive and inherently low-throughput [183]. Therefore, computational tools represent alternative methods for studying acetylation and provide information for further experiments. 
Some web-servers (Table 1) deal with only one specific type of acetylation, such as NetAcet [184]. NetAcet [184] attempted to predict only Nα-acetylation sites using a neuronal network trained on yeast data and extendable only to mammalian acetylated substrates. However, NetAcet [184] suffered from the limited size of the training dataset available at the time of development. Several web-servers aimed to predict acetylated lysines. PAIL [185] was the first in silico tool for the prediction of Nε-lysine acetylation sites. The Bayesian discriminant algorithm [186] was employed on a training set of 246 experimentally verified acetylated sites. Despite the small data set, PAIL [185] is able to achieve an accuracy of 85%. BRABSB-PHKA [187] is a human-specific lysine acetylation predictor, which combines a bi-relative adaptive binomial score Bayesian algorithm with a support vector machine. Another method for lysine acetylation prediction is PSKAcePred [188], where a position-specific view was considered for the characterization of acetylated proteins. Protein sequence information, evolutionary similarity and physicochemical properties can help in discriminating between acetyl-lysines and non-acetyl-lysines, improving the evaluation of lysine sites. LAceP [189] is based on a logistic regression model, where the physicochemical properties of the amino acids and the transition probability of adjacent amino acids were considered during the prediction process. The prediction of acetylated sites not only for lysines, but also for glycine, methionine, serine and threonine residues is offered by N-Ace [190], where physicochemical properties (e.g., non-bonded energy, absolute entropy) and solvent accessibility were included in the original prediction code. The status of lysine acetylation can also be influenced by the enzymes that catalyze the reaction. 
Lysine acetyltransferases (KATs) usually act within multi-subunit complexes, and it is still difficult to determine which KATs are responsible for the acetylation of a given protein. ASEB [191] was the first server for KAT-specific prediction of human acetylated lysines that not only evaluates possible lysine acetylation sites, but also provides information about the responsible KAT enzyme.

Covalent attachment of acyl chains

Protein lipidation is a unique post-translational modification, which directly controls the interaction of soluble proteins with biological membranes, affecting in turn cellular organization and trafficking. In this section we give an overview of several types of lipidation, their mechanism, their involvement in diseases and the computational resources used for predicting lipidation sites.

Palmitoylation

Palmitoylation consists of the attachment of a 16-carbon acyl chain to cysteine residues via a thioester bond [192], [193] (Fig. 1F). Among all lipid PTMs, palmitoylation is the only reversible one and can dynamically regulate protein function, as in the case of H-Ras and N-Ras [3], [194]. Two families of enzymes regulate the palmitoylation/depalmitoylation process: palmitoyltransferases (PATs), which catalyze the attachment of a palmitate from CoA to specific cysteines, and Acyl Protein Thioesterases (APTs), which remove the palmitate acyl chain. Palmitoylation occurs in both soluble and membrane proteins, playing a critical role in the regulation of key biological processes, such as protein membrane trafficking, signaling, cell growth and development [195] (Table 2). Aberrant palmitoylation is associated with a variety of human diseases including neurological disorders (e.g., Huntington's disease [196] or Alzheimer's disease [197]) and cancer [198], [199], [200], [201], [202], [203]. However, the S-palmitoylated proteome is not yet well defined and little is known about the mechanism that regulates S-palmitoylation and its consequences. In fact, the identification of palmitoylation sites is not simple due to the lack of a distinct sequence motif on the substrates [204]. Mass spectrometry allows the identification of several palmitoylated proteins in cells and tissues, which can be further experimentally characterized using Acyl Biotin Exchange (ABE) or Acyl Resin Assisted Capture (Acyl-RAC) techniques [205], [206], [207]. Metabolic labeling and click chemistry probes [208], [209] were developed to recognize palmitoylation sites in order to shed light on the molecular mechanism and dynamics of palmitoylation. All these experimental techniques are time-consuming and costly; thus computer-aided methods are a necessary alternative for predicting palmitoylation sites (Table 1). 
CSS-Palm [210] was one of the first methods developed to search for novel palmitoylated proteins in budding yeast. It is based on a clustering and scoring algorithm trained on 263 experimentally verified palmitoylation sites that were manually collected from the scientific literature. CKSAAP-PALM [211] is another computational method that predicts palmitoylation sites from protein sequences. It relies on an encoding scheme composed of k-spaced amino acid pairs [211], which improved accuracy compared with earlier strategies. The more recent SwissPalm [212] integrates information from different palmitoyl-proteomic studies and allows users to easily search for a protein of interest, determine its predicted S-palmitoylation sites, identify orthologues and compare them across palmitoyl-proteomes. SeqPalm [213] was recently developed to gain insight into the correlation between the disruption of palmitoylation sites and disease. This method identifies palmitoylation sites based on amino acid composition, autocorrelation of amino acid physicochemical properties and amino acid position-weighted matrices.
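
To make the k-spaced pair encoding concrete, the following is a minimal Python sketch of a CKSAAP-style feature vector. It is an illustration under stated assumptions, not the CKSAAP-PALM implementation: the function name `cksaap_encode`, the `k_max` parameter, the normalization by the number of pair positions and the skipping of non-standard characters are all choices made here for the example.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs


def cksaap_encode(window: str, k_max: int = 2) -> list[float]:
    """Encode a peptide window as k-spaced residue pair frequencies.

    For each gap k in 0..k_max, count every pair (window[i], window[i + k + 1])
    and divide by the number of such positions. Pairs containing characters
    outside the 20 standard residues (e.g. padding) are ignored.
    The normalization is an assumption made for this sketch.
    """
    window = window.upper()
    features: list[float] = []
    for k in range(k_max + 1):
        counts = dict.fromkeys(PAIRS, 0)
        n_positions = len(window) - k - 1
        for i in range(max(n_positions, 0)):
            pair = window[i] + window[i + k + 1]
            if pair in counts:
                counts[pair] += 1
        total = max(n_positions, 1)  # avoid division by zero on short windows
        features.extend(counts[p] / total for p in PAIRS)
    return features
```

Each value of k contributes 400 features (one per ordered residue pair), so `cksaap_encode(window, k_max=2)` returns 1200 numbers that could then be passed to any standard classifier.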

N-myristoylation

Myristoylation is the covalent and irreversible attachment of a 14-carbon fatty acid to N-terminal Gly residues [214] of eukaryotic or viral proteins (Fig. 1G). This PTM in turn facilitates interaction with membranes or with a hydrophobic protein domain [215], [216], [217], [218], [219]. Substrates of myristoylation are generally characterized by the consensus motif Met-Gly-X-X-X-Ser/Thr at the N-terminus. The modification occurs predominantly after removal of the initiator methionine, which exposes the subsequent glycine [220]. Less frequently, an internal glycine is exposed by proteolytic cleavage [221]. In both cases the reaction is catalyzed by N-myristoyl transferase (NMT), a 50 kDa enzyme expressed in most organisms [222], [223]. Myristoylation is involved in several critical cellular processes, such as signaling pathways, apoptosis [221] and extracellular protein export [224] (Table 2). It usually acts together with other post-translational modifications such as palmitoylation [225], [226], [227], or in combination with positively charged residues [228], to enhance membrane-protein interactions. Several diseases are linked to N-myristoylation, including cancer, epilepsy, Alzheimer's disease and viral and bacterial infections [229]. Experimental detection of N-myristoylation includes radioactive techniques, such as labeling with 3H or 14C myristate, which require long exposure periods (weeks to months). To the best of our knowledge, only two online web-servers are available to predict myristoylation sites (Table 1). NMT [230] uses a data set that combines experimentally verified myristoylated proteins with potential myristoylated candidates. Based on the structural and biochemical characterization of N-myristoyl transferase, a set of descriptors was proposed to better predict myristoylated sites.
This protocol improved on the earlier pattern proposed in PROSITE [231] (pattern code: PDOC00008), which gave numerous false negative predictions. The other N-myristoylation site predictor, Myristoylator [232], is based on a machine learning model that uses several combined neural networks and a test set of positive and negative sequences. Although this predictor seems to increase specificity, it was trained to predict myristoylation only on terminal glycines, so a priori knowledge of the proteolytic cleavage site is necessary when using this web-server.
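
As a simple illustration of the consensus motif Met-Gly-X-X-X-Ser/Thr, the Python sketch below checks whether a sequence carries it. This is a naive pattern match, not the scoring scheme of NMT or Myristoylator. The function name `has_nmyr_consensus` and the optional `cleavage_site` argument (the 0-based index of a glycine assumed to be exposed by proteolysis, which must be known in advance) are hypothetical, and the sequences in the usage note are toy examples.

```python
import re
from typing import Optional

# Mature N-terminus after methionine removal or proteolytic cleavage:
# Gly, three arbitrary residues, then Ser or Thr.
MATURE_MOTIF = re.compile(r"^G.{3}[ST]")


def has_nmyr_consensus(sequence: str, cleavage_site: Optional[int] = None) -> bool:
    """Return True if the sequence matches the simple N-myristoylation consensus.

    Without cleavage_site, the full Met-Gly-X-X-X-Ser/Thr motif is required
    at the N-terminus. With cleavage_site (0-based index of the newly exposed
    glycine, supplied by the user), the motif is tested on the fragment that
    starts at that position.
    """
    seq = sequence.upper()
    if cleavage_site is None:
        return seq.startswith("M") and MATURE_MOTIF.match(seq[1:]) is not None
    return MATURE_MOTIF.match(seq[cleavage_site:]) is not None
```

For example, `has_nmyr_consensus("MGAAASKK")` is true, while `has_nmyr_consensus("MAGAAASK")` is false because glycine does not directly follow the initial methionine. A match only flags a candidate; as noted above, pattern-based screening alone is known to misclassify sites.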

Prenylation

Prenylation is a PTM in which a 15-carbon (farnesylation) or a 20-carbon (geranylgeranylation) lipid is attached to cysteines, catalyzed by farnesyltransferases or by protein geranylgeranyltransferases I, respectively (Fig. 1H). These isoprenyl anchors promote not only protein-membrane [233], [234], [235], [236], [237], but also protein-protein interactions [238], [239], [240] (Table 2). Several diseases are correlated with this PTM, such as cancer [241], [242], premature aging disorders [243], [244], neurite [244] and hepatitis C and D [245]. Protein prenylation also occurs in a wide range of parasites, which has led to the use of protein farnesyltransferase inhibitors in protozoan parasitic diseases [246]. The most common approach for detecting prenylation is to use expensive radiolabeling techniques [203], [204]. Initially, the prenylation motif was suggested to be CaaX, i.e. a cysteine (C) followed by two aliphatic residues (aa) and a terminal residue X. However, further kinetic studies and mutation experiments revealed a more flexible and complex recognition motif for prenylation [247]. PrePS [248] is the only online tool available; it is based on modeling the substrate-enzyme interactions for each prenyltransferase.
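
The classical CaaX rule can be written as a short pattern match, sketched below in Python. This reflects only the original, strict definition of the motif and not the more flexible recognition described in [247] or the model used by PrePS. The set of residues treated as aliphatic (A, V, L, I) and the function name `caax_box` are assumptions made for this example.

```python
import re
from typing import Optional

ALIPHATIC = "AVLI"  # assumption: residues counted as aliphatic in this sketch

# Cysteine, two aliphatic residues and any final residue at the C-terminus.
CAAX = re.compile(rf"C[{ALIPHATIC}]{{2}}[A-Z]$")


def caax_box(sequence: str) -> Optional[str]:
    """Return the C-terminal CaaX box if present, otherwise None."""
    match = CAAX.search(sequence.upper())
    return match.group(0) if match else None
```

On the toy sequence `"MKKSGCVLS"` the function returns `"CVLS"`, whereas `"MKKSGCKLS"` yields `None` because lysine is not in the assumed aliphatic set. Widening `ALIPHATIC` is the simplest way to explore a more permissive motif.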

Small proteins' modifications

An important field in cell signaling is the characterization of the covalent and reversible attachment of ubiquitin (ubiquitylation) and small ubiquitin-related modifiers (sumoylation). This peculiar class of PTMs provides new protein-protein interfaces, remodeling the target proteins [249]. In this section we review the latest findings on sumoylation and ubiquitylation, with particular attention to the in silico tools recently developed.

Ubiquitylation

Ubiquitylation is a three-step process in which ubiquitin is first activated by a ubiquitin-activating enzyme (E1), then conjugated to a ubiquitin-conjugating enzyme (E2), and finally transferred by a ubiquitin-ligase enzyme (E3) to a substrate molecule via an isopeptide bond with an internal lysine (Fig. 1I). This reversible modification is implicated in the regulation of several cellular processes, such as protein degradation [250], [251], [252], cell cycle division, the immune response [253], lysosomal trafficking [254] and control of insulin [255] (Table 2). Aberrant ubiquitylation is linked to human pathologies ranging from inflammatory neurodegenerative diseases to different forms of cancer [253], [256]. Despite the availability of several ubiquitin-protein ligase complex structures [257], [258], [259], [260], [261], the ubiquitylation reaction mechanism is still poorly understood. It has recently been hypothesized that structural disorder in the substrate could actually facilitate this process. Analysis of sequences from mutant yeast strain experiments [262] showed that most ubiquitylation sites lie in disordered and flexible regions of a protein. On the basis of this observation UbPred was developed [262] (Table 1): a ubiquitylation site predictor based on a support vector machine (SVM) algorithm, which allows the correlation between ubiquitylation and protein half-life to be studied. To overcome the lack of accuracy and the shortage of training data, UbiProber [263] and iUbiq [264] were designed (Table 1). UbiProber predicts both general and species-specific ubiquitylation sites using large-scale experimental data as a training set, while iUbiq is based on evolutionary information incorporated into the general form of pseudo-amino acid composition.
However, none of these in silico tools accounts for E3 binding/recognition sites, although these were shown to be an important feature for ubiquitylation. UbiNet [265] (Table 1) is the first server that allows the regulatory network among E3 ligases and ubiquitylated proteins to be studied.

Sumoylation

Sumoylation is a PTM characterized by the covalent attachment of the Small Ubiquitin-like Modifier (SUMO) to specific lysine residues via an enzymatic reaction (Fig. 1L). Sumoylation sites are identified by a canonical consensus sequence Ψ-K-X-E (where Ψ is a hydrophobic amino acid, such as A, I, L, M, P, F, V or W, and X is any amino acid residue) [266], [267], and by SUMO-interacting motifs (3–4 aliphatic residues linked to acidic and/or phosphorylatable amino acids) called SIMs [268]. Both features are essential for characterizing the biological significance of sumoylation. This modification is involved in several cellular processes, such as protein binding, subcellular transport, gene expression, DNA repair, chromosome assembly and cellular signaling [269], [270], [271], [272] (Table 2). Aberrant sumoylation is correlated not only with Alzheimer's and Parkinson's diseases [273], but also with cancer [274] and diabetes [275], highlighting the importance of detecting sumoylation sites. Mass spectrometry-based proteomic studies have mapped hundreds of proteins, identifying different sumoylation and SIM sites [276], [277], [278], [279]. However, the reversibility of this modification and the difficulty of identifying peptides from trypsin digestion limit the use of this technique. Computer-aided prediction represents a good alternative for reducing the number of potential targets to explore in further experimental verification (Table 1). While the only web-servers available for SIM prediction are GPS-SUMO [280] and JASSA [281], several online methods for predicting sumoylation sites are currently available.
SUMOhydro [282] is based on a support vector machine (SVM) combined with amino acid hydrophobicity, while SUMOAMVR [283] considers also other structural features, like average accessible surface area (AASA), secondary structure and evolutionary information of amino acids. Recently, a new in silico tool based on the covariance discriminant (CD) algorithm was developed in order to avoid errors caused by disparity in training data sets [284].
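
For reference, the canonical Ψ-K-X-E consensus can be scanned with a few lines of Python, as sketched below. This is only a motif search using the hydrophobic residue set listed in the text (A, I, L, M, P, F, V, W); it does not reproduce the SVM or covariance discriminant models of the predictors discussed here. The function name `sumo_consensus_sites` and the toy sequence in the usage note are hypothetical.

```python
import re

PSI = "AILMPFVW"  # hydrophobic residues accepted at the Ψ position (from the text)

# A lookahead is used so that overlapping motif occurrences are all reported.
SUMO_MOTIF = re.compile(rf"(?=([{PSI}]K[A-Z]E))")


def sumo_consensus_sites(sequence: str) -> list[int]:
    """Return 1-based positions of lysines inside Ψ-K-X-E matches."""
    seq = sequence.upper()
    # m.start() is the 0-based index of Ψ; the lysine is the next residue.
    return [m.start() + 2 for m in SUMO_MOTIF.finditer(seq)]
```

On the toy sequence `"MAVKDEQLKQE"` the function returns `[4, 9]`, i.e. the lysines in `VKDE` and `LKQE`. Lysines outside the consensus can also be sumoylated, so a scan of this kind gives candidates rather than a complete list.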

PTMs cross-talk

The hypothesis of a correlation between PTMs within the same protein (PTM cross-talk) [285] has emerged in the proteomic field in recent years. For instance, regulatory interplay among PTMs was observed for histones [286] and other proteins such as p53 [287], [288], RNA polymerase II [288] or β-tubulin [289]. In particular, the importance of PTM cross-talk was recognized in several biological pathways (e.g. DNA damage response [290] and protein stability regulation [291], [292], [293]), pointing to a strong relationship between PTM cross-talk and protein function. While the structural and functional understanding of combinatorial PTMs was initially held back by technological limitations, recent advances in proteomics have allowed information on different types of modifications to be integrated [294], [295]. However, even with the latest experimental methods it remains difficult to identify the whole set of PTM sites in a protein. In this emerging context, computational methods are poised to support the study of PTM cross-talk. The first unified tool for the simultaneous prediction of PTM sites was ModPred [296], which predicts and analyzes multiple types of PTM sites at once in order to gain structural and functional information on the regulatory mechanism of multiple PTMs. More recently, the PTM-X web-server [297] was introduced to predict PTM cross-talk based on published experimental data. Unlike ModPred, it requires the candidate PTM sites to be known a priori.

Structural and dynamical characterization of PTMs

Despite the important role played by PTMs, their structural and dynamical effects on protein function remain poorly understood from a molecular point of view. This is due to the labile, transient nature of most of these modifications and to the lack of adequate experimental techniques able to detect and characterize them and the underlying chemical mechanism of their formation. The online tools discussed previously are valid methods to overcome some of these experimental limitations and predict putative PTM sites, but they do not usually provide any information about the impact of post-translational modifications from a mechanistic point of view. Molecular modeling and molecular simulation (such as molecular dynamics, MD), which are based on empirical atomistic force fields [298], [299], [300], [301], are a powerful strategy for studying biological systems at single-molecule resolution and on nanosecond-to-millisecond time scales. Although this computational approach now allows the study of protein processes and properties that are not easily accessible experimentally, there are still clear limitations regarding the availability of accurate parameters for investigating PTMed proteins. In recent years several improvements have been made to extend this approach to non-standard biomolecules. Within the AMBER force field, atomic charges and parameters were developed for phosphorylated residues such as phosphoserine, phosphothreonine, phosphotyrosine and phosphohistidine [302], for S-nitrosylated residues (S-nitrosocysteine [303]) and for methylated residues (trimethyllysine [304], [305]). Similarly, within the CHARMM force field there are parameters for methylated lysines and arginines, as well as acetylated lysines and palmitoylated cysteines [306].
There are also ad hoc comprehensive atomistic force field parameters for describing the link between carbohydrates and proteins, such as GLYCAM for AMBER [307]/CHARMM [308] and a modified version of GROMOS [309], [310]. In theory, these schemes offer strategies to parameterize virtually any kind of non-standard amino acid, including PTMs; in practice, the development of new force field models always involves time-consuming parameterization protocols and rigorous a posteriori validation of the quality and robustness of the new models. Moving to lower resolution, coarse-grained force fields can also be very useful for studying the impact of PTMs on protein function. In this domain there are no specific parameters for the description of PTMed residues. The Martini force field [311], for instance, provides parameters for treating non-covalently bound sugar molecules or phosphate groups, but a complete general representation of modified residues is not yet available. However, a recent work described new parameters for modeling palmitoylated cysteines [312] that were used to study H-Ras and helped to show the influence of this PTM in regulating the membrane partitioning of the protein. While for a long time PTMs were not usually considered in modeling and molecular simulation studies, the recent availability of more comprehensive experimental data, along with accurate force field parameters, has allowed protein properties to be investigated while also taking into account the effect of PTMs on structure and stability. Recent examples of this approach have revealed the impact of PTMs on the HIV-1 fusion peptide structure [313], rhodopsin [314], calnexin [315] and phosphatidylinositol 4-kinase [316].
In response to the need for new and better molecular models that describe proteins more realistically, some automatic tools for generating force field parameters for new molecular species have become available, such as ParamChem or SwissParam [317], compatible with the CHARMM force field, q4md-forcefieldtools for AMBER [318] and ATB for GROMOS [319]. However, none of them focuses directly on the parameterization of PTMs, likely because of the complexity of developing the parameters required for most PTMs. The need for computational tools that automatically parameterize PTMed protein structures for use in MD simulations therefore led to the development of new web-servers, such as FF_PTM (http://selene.princeton.edu/FFPTM/) and Vienna-PTM (http://vienna-ptm.univie.ac.at). FF_PTM expands the existing AMBER force field (i.e., ff03) with parameters for 32 PTMs. In particular, it includes parameters that describe the attachment of small molecules (e.g., phosphorylation, methylation or acetylation) and the covalent interaction with acyl chains such as palmitic acid (palmitoylation) and geranylgeranyl pyrophosphate (geranylgeranylation). Vienna-PTM, on the other hand, is a web platform designed for introducing PTMs into PDB structures in order to run simulations using the GROMOS 54A7 and 45A3 force fields.

Summary and outlook

The chemical modification of amino acids plays an important role in a myriad of cellular processes that range from protein localization to disease development and aging. Enormous efforts, combining the development of both experimental and computational methods, have been made over the past two decades to understand PTM mechanisms and their effects on protein structure, dynamics and function. In this review we summarized the main in silico tools for studying PTMs, most of which are available as online web-servers (Table 1). Recently, an integrative platform (dbPTM: http://dbptm.mbc.nctu.edu.tw/) has also become available. Originally developed as a comprehensive database of experimentally verified PTMs, dbPTM collects all available databases and web-server resources and also covers other PTMs, such as succinylation and S-glutathionylation. Although this platform does not provide an exhaustive description of lipidation, it also includes standalone software (not discussed here), thus offering a source of information complementary to this review. With the amount of experimental data increasing every day, we expect that existing methods will keep improving their performance and that many new ones will emerge. Although most of the available web-servers rely on sequence-based analysis of training data sets, some have also started to take into account other properties, such as evolutionary information (e.g., PhosphoOrtholog, iUbiq), SASA (e.g., MASA, METhK, SUMOAMVR) and physicochemical properties (e.g., Pmes, SUMOhydro). Overall, these approaches only rarely consider the molecular features associated with PTMs and the molecular impact they have on protein function in general.
Within this context, molecular modeling and simulations can complement experimental and bioinformatics techniques by providing a molecular description of the effect of PTMs on protein structure and stability. However, the lack of suitable tools and parameters for treating PTMs in proteins has so far limited the characterization of these covalent modifications. Although some automatic tools (e.g., FF_PTM and Vienna-PTM) have recently appeared, providing online platforms to parameterize post-translationally modified proteins for atomistic MD simulations with AMBER or GROMOS force fields, ad hoc parameterizations still need to be developed for most PTMs. Similarly, coarse-grained force fields still lack reliable and robust models for dealing with PTMs, as well as systematic protocols to produce accurate parameters. Today, with the steady increase in computing power and the availability of increasingly accurate force fields for biomolecules, which accompany the tireless advances on the experimental side, it is possible to achieve a more precise and faithful description of biological systems in their physiological conditions using molecular modeling and simulation. For instance, advances in lipidomic analyses have provided a much more detailed view of membrane composition, allowing the construction of more realistic membrane models [320], [321] to better investigate membrane biophysical properties and their interplay with integral and peripheral membrane proteins [322], [323]. Several computational tools have been developed with this aim in mind, such as CHARMM-GUI [308] and LipidBuilder [324].
Along the same lines, it is also clear that protein PTMs are another important layer of complexity that integrally defines and modulates protein function and, for this reason, needs to be considered at all levels of both experimental and computational investigation. Advances in bioinformatics and in physics-based molecular modeling and simulation techniques therefore appear to be a fundamental requirement for complementing the experimental investigation of the impact of PTMs on cellular processes.
