Model for vaccine design by prediction of B-epitopes of IEDB given perturbations in peptide sequence, in vivo process, experimental techniques, and source or host organisms.

Humberto González-Díaz, Lázaro G Pérez-Montoto, Florencio M Ubeira.

Abstract

Perturbation methods add variation terms to a known experimental solution of one problem to approach a solution for a related problem without a known exact solution. One problem of this type in immunology is predicting whether a peptide will act as an epitope after a perturbation or variation in the structure of a known peptide and/or in other boundary conditions (host organism, biological process, and experimental assay). However, to the best of our knowledge, there are no reports of general-purpose perturbation models to solve this problem. In a recent work, we introduced a new quantitative structure-property relationship theory for the study of perturbations in complex biomolecular systems. In this work, we developed the first model able to classify more than 200,000 cases of perturbations with accuracy, sensitivity, and specificity >90% in both training and validation series. The perturbations include structural changes in >50,000 peptides determined in experimental assays with boundary conditions involving >500 source organisms, >50 host organisms, >10 biological processes, and >30 experimental techniques. The model may be useful for the prediction of new epitopes or the optimization of known peptides towards computational vaccine design.

Year:  2014        PMID: 24741624      PMCID: PMC3987976          DOI: 10.1155/2014/768515

Source DB:  PubMed          Journal:  J Immunol Res        ISSN: 2314-7156            Impact factor:   4.818


1. Introduction

The National Institute of Allergy and Infectious Diseases (NIAID) supported the launch, in 2004, of the Immune Epitope Database (IEDB), http://www.iedb.org/ [1-4]. The IEDB system extracted information from approximately 99% of all papers published to date that describe immune epitopes. In doing so, the IEDB system analyzed over 22 million PubMed abstracts and subsequently curated ≈13 K references, including ≈7 K manuscripts about infectious diseases, ≈1 K about allergy topics, ≈4 K about autoimmunity, and 1 K about transplant/alloantigen topics [5]. IEDB lists a huge amount of information about the molecular structure as well as the experimental conditions (c_j) in which different molecules m_i were determined to be immune epitopes or not. This explosion of information makes necessary both query/display functions for retrieval of known data from IEDB and predictive tools for new epitopes. Salimi et al. [5] reviewed advances in epitope analysis and predictive tools available in the IEDB. In fact, the IEDB analysis resource (IEDB-AR: http://tools.iedb.org/) is a collection of tools for prediction of molecular targets of T- and B-cell immune responses (i.e., epitopes) [6, 7]. On the other hand, Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) techniques are useful tools to predict new drugs, RNA, drug-protein complexes, and protein-protein complexes. In general, QSAR/QSPR-like methods transform molecular structures into numeric molecular descriptors (λ) in a first stage and later fit a model to predict the biological process. For example, DRAGON [8-10], CODESSA [11, 12], MOE [13], TOPS-MODE [14-17], TOMOCOMD [18, 19], and MARCH-INSIDE [20] are among the most used software packages to calculate molecular descriptors based on quantum mechanics (QM) and/or graph theory [21-27].
The software packages STATISTICA [28] and WEKA [29] are often used to perform multivariate statistics and/or machine learning (ML) analysis in order to preprocess data and later fit the final QSAR/QSPR model using techniques like principal component analysis (PCA), linear discriminant analysis (LDA), support vector machines (SVM), or artificial neural networks (ANN) [28]. QSAR/QSPR models are also important in immunoinformatics to predict the propensity of different molecular structures to play different roles in immunological processes. Applications include skin vaccine adjuvants and sensitizers [30-38], drugs and their activity/toxicity protein targets in the immune system [39], and epitopes [40-49]. Moreover, Reche and Reinherz [50] implemented PEPVAC (promiscuous epitope-based vaccine), a web server for the formulation of multiepitope vaccines that predicts peptides binding to five distinct HLA class I supertypes (A2, A3, B7, A24, and B15). PEPVAC can also identify conserved MHC ligands, as well as those with a C-terminus resulting from proteasomal cleavage. The Dana-Farber Cancer Institute hosted the PEPVAC server at the site http://immunax.dfci.harvard.edu/PEPVAC/. To close with a last example, Lafuente and Reche [51] reviewed the available methods for predicting MHC-peptide binding and discussed their most relevant advantages and drawbacks. In many complex QSPR-like problems in immunoinformatics, as in other areas, we know the exact experimental result (known solution) of the problem, but we are interested in the possible result obtained after a change (perturbation) in one or multiple values of the initial conditions of the experiment (new solution). For instance, we often know, for large collections of molecules m_i (organic compounds, drugs, xenobiotics, and/or peptide sequences), the efficiency ε_ij(c_j) of the compound as adjuvant, its action as epitope, its immunotoxicity, and/or its interaction (affinity, inhibition, etc.) with immunological targets.
In addition, we often know for each molecule the exact conditions (c_j) of assay for the initial experiment, including the structure of the molecule m_i (drug, adjuvant, or peptide sequence), source organism (so), host organism (ho), immunological process (ip), experimental technique (tq), concentration, temperature, time, solvents, and coadjuvants. This is the case for big data retrieved from very large databases like IEDB [1-4] and CHEMBL [52]. However, we do not know the possible result of the experiment if we change at least one of these conditions (perturbation). We refer to small changes or perturbations in both structure and conditions for input or output variables. This means that we include changes in ho, so, ip, and tq; replacement of the compound by an analogue with similar structure; changes in the sequence of the epitope (artificial, by organic synthesis, or natural mutations); and changes in the polarity of the solvent or in the coadjuvants. In these cases, we could use a perturbation theory model to solve the QSAR/QSPR problem. Perturbation theory includes methods that add “small” terms to a known solution of a problem in order to approach a solution to a related problem without a known solution. Perturbation models have been widely used in all branches of science, from QM to astronomy and the life sciences, including chaos or the “butterfly effect,” Bohr's atomic theory, Heisenberg's mechanics, the Zeeman and Stark effects, and other models with applications such as protein spectroscopy [53-57]. In a very recent work, Gonzalez-Diaz et al. [58] formulated a general-purpose perturbation theory or model for multiple-boundary QSPR/QSAR problems. However, there is no report in the immunoinformatics literature of a general QSPR perturbation model for IEDB B-epitopes.
Here we report the first example of a QSPR-perturbation model for the B-epitopes reported in IEDB. The model is able to predict the probability of occurrence of an epitope after a perturbation in the sequence, the experimental technique, the exposure process, and/or the source or host organisms.

2. Materials and Methods

2.1. Molecular Descriptors for Peptides

We calculated the molecular descriptors of the structure of peptides using the software MARCH-INSIDE (MI), based on the algorithm of the same name [59]. The MI approach uses a Markov chain method to calculate the kth mean values λ_k(m_i) of different physicochemical molecular properties for the ith molecule (m_i). The overall λ(m_i) values are calculated as an average of the λ_k(m_i) values for all atoms placed at topological distance d ≤ k, which are in turn the means of the atomic properties (λ_j) for all atoms in the molecule and their neighbors placed at d = k. For instance, it is possible to derive average estimations of molecular refractivities MR(m_i), partition coefficients P(m_i), and hardness η(m_i) for atoms placed at different topological distances d ≤ k. In this first work, we calculated only one type of λ(m_i) values. We calculated for all peptides the average value χ(m_i) of all the atomic electronegativities χ_j for all δ_i atoms connected to the ith atom (i → j) and their neighbors placed at a distance d ≤ 5 [59]. We calculate the probabilities p(λ_j) for any atomic property, including p(χ_j), using a Markov chain model for the gradual effects of the neighboring atoms at different distances in the molecular backbone. This method has been explained in detail in many previous works, so we omit the details here [59].
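The averaging idea above can be illustrated with a small Python sketch. It is not the MARCH-INSIDE implementation: the transition rule (probability of stepping to a neighbor proportional to that neighbor's electronegativity, with each atom counted as its own neighbor), the initial distribution, and the function names are all assumptions made for illustration, and the toy molecule uses Pauling electronegativities for C and O.

```python
def transition_matrix(chi, bonds):
    """Row-stochastic matrix; assumed rule: p_ij = chi_j / sum of chi over i and its neighbors."""
    n = len(chi)
    neighbors = [{i} for i in range(n)]  # assumption: each atom is its own neighbor
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        total = sum(chi[j] for j in neighbors[i])
        for j in neighbors[i]:
            matrix[i][j] = chi[j] / total
    return matrix


def mean_electronegativity(chi, bonds, k_max=5):
    """Average of the k-step mean electronegativities for k = 0..k_max."""
    n = len(chi)
    matrix = transition_matrix(chi, bonds)
    total = sum(chi)
    dist = [c / total for c in chi]  # assumed initial probabilities
    step_means = []
    for _ in range(k_max + 1):
        # Expected electronegativity under the current probability distribution.
        step_means.append(sum(p * c for p, c in zip(dist, chi)))
        # Advance the Markov chain by one step (one more bond of topological distance).
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return sum(step_means) / len(step_means)


if __name__ == "__main__":
    # Toy chain C-C-O (atoms 0, 1, 2) with Pauling electronegativities.
    print(mean_electronegativity([2.55, 2.55, 3.44], [(0, 1), (1, 2)]))
```

Because every step is a probability-weighted mean, the result always lies between the smallest and largest atomic electronegativity in the molecule.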

2.2. Electronegativity Perturbation Model for Prediction of B-Epitopes

Very recently, Gonzalez-Diaz et al. [58] formulated a general-purpose perturbation theory or model for multiple-boundary QSPR/QSAR problems. Here we adapted this modeling method to approach the peptide prediction problem from the point of view of perturbation theory. Let there be a set of peptide molecules, denoted m_i, each with a value of efficiency ε_ij as epitope experimentally determined under a set of boundary conditions c_j ≡ (c_0, c_1, c_2, c_3, …, c_n). We put the main emphasis here on peptides reported in the IEDB database. In this sense, the boundary conditions c_j used here are the same as those reported in this database: c_0 is the specific peptide, c_1 = so, c_2 = ho, c_3 = ip, and c_4 = tq. In general, so is the organism that expresses the peptide (but it can also include artificial peptides, cell lines, etc.), and ho is the host organism exposed to the peptide by means of the immunological process ip, detected with the technique tq. Because our analysis is based on the data reported by IEDB, we are unable to work with continuous values of epitope activity ε_ij. Consequently, we have to predict the discrete function of B-epitope efficiency: λ(ε_ij) = 1 for epitopes reported in the conditions c_j and λ(ε_ij) = 0 otherwise. Our main aim is to predict the shift or change in the output efficiency function, Δλ(ε_ij) = λ(ε_qr)ref − λ(ε_ij)new, that takes place after a change, variation, or perturbation (ΔV) in the structure and/or boundary conditions of a peptide of reference. However, we know the efficiency of the process of reference, λ(ε_qr)ref, in addition to the molecular structure and the set of conditions for the initial (reference) and final (new) processes. Consequently, to predict Δλ(ε_ij) we only have to predict λ(ε_ij)new, the efficiency function of the new state obtained by a change in the structure of the peptide and/or the boundary conditions. Let ΔV be a perturbation in a function λ; we can define V as the state information function for the reference and new states.
According to our recent model [58], we can write V as a function of the conditions and the structure of the peptide m_i. The variational state functions V have to be written in pairs in order to describe the initial (reference) and final (new) states of a perturbation. The state function V_ij is for the ith peptide measured under a set of boundary conditions c_j in the output, final, or new state. The conjugated state function V_qr is for the qth peptide measured under a set of boundary conditions c_r in the input, initial, or reference state. The difference ΔV between the new (output) state and the reference (input) state is the additive perturbation [58]. This relation, Equation (3), opens the door to testing different hypotheses. A simple hypothesis is H0: the existence of one small and constant value of the perturbation function, ΔV = e_0, for all the pairs of peptides, and a linear relationship between the perturbations of input/output boundary conditions with coefficients a, b, c, and d. We can use elementary algebraic operations to obtain from these equations an expression for the efficiency as epitope of the new peptide, λ(ε_ij)new. In this case, considering b ≈ d, we can obtain different expressions; the last may be very useful to solve the QSPR problem for the large datasets formed by IEDB B-epitopes.
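The relations described in this section can be sketched in a generic linear form. The LaTeX below is an assumption that is consistent with the definitions in the text and with the input terms described in Section 3 (the reference score, the sequence term, and one term per boundary condition); it is not the fitted model. The coefficient names a_0, a_1, and a_c are placeholders introduced here, and e_0 stands for the constant perturbation assumed under H0.

```latex
\begin{align}
\Delta V &= V_{ij}^{\mathrm{new}} - V_{qr}^{\mathrm{ref}} = e_0 \\
\Delta\chi_{\mathrm{seq}} &= \chi(m_q)_{\mathrm{ref}} - \chi(m_i)_{\mathrm{new}} \\
\Delta\Delta\chi_{c} &= \left[\chi(m_q)_{\mathrm{ref}} - {}^{*}\chi(c_r)_{\mathrm{ref}}\right]
                       - \left[\chi(m_i)_{\mathrm{new}} - {}^{*}\chi(c_j)_{\mathrm{new}}\right] \\
\lambda(\varepsilon_{ij})_{\mathrm{new}} &= a_0\,\lambda(\varepsilon_{qr})_{\mathrm{ref}}
   + a_1\,\Delta\chi_{\mathrm{seq}}
   + \sum_{c \in \{ho,\,so,\,ip,\,tq\}} a_c\,\Delta\Delta\chi_{c} + e_0
\end{align}
```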

3. Results and Discussion

We propose herein, for the first time, a QSPR-perturbation model able to predict variations in the propensity of a peptide to act as a B-epitope, taking into consideration the propensity of a peptide of reference and the changes in peptide sequence, immunological process, host organism, source organism, and experimental technique used. The best QSPR-perturbation model found here was fitted with LDA. Its first input term, λ(ε_qr)ref, is the scoring function λ of the efficiency ε_qr of the initial process (known solution). The function λ(ε_qr)ref = 1 if the peptide could be experimentally demonstrated to be a B-epitope in the assay of reference carried out in the conditions c_r, and λ(ε_qr)ref = 0 otherwise. The variational-perturbation terms ΔΔχ are at the same time terms typical of perturbation theory and moving average (MA) functions used in Box-Jenkins models in time series [60]. These new types of terms account for the deviation of the electronegativity of all amino acids in the sequence of the new peptide both with respect to the peptide of reference and with respect to all boundary conditions. In Table 1, we give the overall classification results obtained with this model. Speck-Planche et al. [61-63] introduced different multitarget/multiplexing QSAR models that incorporate this type of information based on MAs. The results obtained with the present model are excellent compared with those of similar models in the literature used for other problems, including moving average models [64, 65] and perturbation models [58]. Notably, this is also the first model combining both perturbation theory and MAs in a QSPR context.
Table 1

Results of QSPR-perturbation model for IEDB B-Epitopes.

| Data subset | Stat. param. | Pred. % | Predicted λ(ε_ij) = 1 | Predicted λ(ε_ij) = 0 |
|---|---|---|---|---|
| Train, λ(ε_ij) = 1 | Sp | 97.0 | **84607** | 2660 |
| Train, λ(ε_ij) = 0 | Sn | 93.6 | 4354 | **63548** |
| Total train | Ac | 95.5 | | |
| CV, λ(ε_ij) = 1 | Sp | 97.1 | **28060** | 840 |
| CV, λ(ε_ij) = 0 | Sn | 93.3 | 1485 | **20641** |
| Total cv | Ac | 95.4 | | |

Bold font is used to highlight the number of cases correctly classified by the model.
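The percentages in Table 1 follow directly from the reported counts. As a quick consistency check, the following minimal Python sketch recomputes them; the function and variable names are illustrative and not taken from the paper, and the per-class rates are reported by observed class rather than by the Sp/Sn labels.

```python
# Counts from Table 1, keyed by (observed class, predicted class).
TRAIN = {(1, 1): 84607, (1, 0): 2660, (0, 1): 4354, (0, 0): 63548}
CV = {(1, 1): 28060, (1, 0): 840, (0, 1): 1485, (0, 0): 20641}


def class_rate(counts, observed):
    """Percentage of cases of one observed class that were predicted correctly."""
    correct = counts[(observed, observed)]
    total = counts[(observed, 1)] + counts[(observed, 0)]
    return 100.0 * correct / total


def accuracy(counts):
    """Percentage of all cases that were predicted correctly."""
    correct = counts[(1, 1)] + counts[(0, 0)]
    return 100.0 * correct / sum(counts.values())


def summary(counts):
    """(rate for observed class 1, rate for observed class 0, accuracy), one decimal."""
    values = (class_rate(counts, 1), class_rate(counts, 0), accuracy(counts))
    return tuple(round(v, 1) for v in values)


if __name__ == "__main__":
    print("train:", summary(TRAIN))
    print("cv:", summary(CV))
    print("total cases:", sum(TRAIN.values()) + sum(CV.values()))
```

The total number of cases in the two series is consistent with the ">200,000 cases of perturbations" stated in the abstract.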

The other input terms are the following. The first, Δχ_seq = χ(m_q)ref − χ(m_i)new, is the perturbation term for the variation in the mean value of electronegativity for all amino acids in the sequence with respect to the peptide of reference. The remaining input variables of the model, ΔΔχ_j = Δχ_ref − Δχ_new = [χ(m_q)ref − *χ(c_r)ref] − [χ(m_i)new − *χ(c_j)new], quantify how the conditions of the new assay c_j-new represent perturbations with respect to the initial conditions c_r-ref of the assay of reference. The quantities *χ(c_j) and *χ(c_r) are the averages of the mean electronegativity values χ(m_i) and χ(m_q) for all new and reference peptides in IEDB that are epitopes under the jth or rth boundary condition. The values of these terms have been tabulated for >500 source organisms, >50 host organisms, >10 biological processes, and >30 experimental techniques. To predict the perturbations of the action as epitope of peptides, we must substitute the values of χ(m_i) and χ(m_q) of the new and reference peptides and the tabulated values of *χ(c_j) and *χ(c_r) for all combinations of boundary conditions. In doing so we can find the optimal sequence and boundary conditions for the use of the peptide in the development of a vaccine. In Table 2 we give some of these values of *χ(c_j) and *χ(c_r).
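The perturbation terms defined above are simple differences, which the following Python sketch makes explicit. It implements the formulas as written in the text; the function names are illustrative, the peptide electronegativities 2.70 and 2.69 are hypothetical values chosen only for the example, and the two host averages are the Table 2 entries for Mus musculus and Homo sapiens. Values listed in Table 3 may follow a different ordering of the two states, so this sketch does not attempt to reproduce them.

```python
# Tabulated average electronegativities *chi for two host organisms (Table 2).
AVG_CHI_HOST = {"Homo sapiens": 2.6856, "Mus musculus": 2.6873}


def delta_chi_seq(chi_ref, chi_new):
    """Sequence term: chi(m_q)ref - chi(m_i)new."""
    return chi_ref - chi_new


def delta_delta_chi(chi_ref, avg_ref, chi_new, avg_new):
    """Boundary-condition term: [chi_ref - *chi(c_r)] - [chi_new - *chi(c_j)]."""
    return (chi_ref - avg_ref) - (chi_new - avg_new)


if __name__ == "__main__":
    chi_ref, chi_new = 2.70, 2.69  # hypothetical peptide mean electronegativities
    d_seq = delta_chi_seq(chi_ref, chi_new)
    dd_ho = delta_delta_chi(
        chi_ref, AVG_CHI_HOST["Mus musculus"],  # reference assay in mouse
        chi_new, AVG_CHI_HOST["Homo sapiens"],  # new assay in human
    )
    print(f"delta chi (seq) = {d_seq:.4f}")
    print(f"delta-delta chi (ho) = {dd_ho:.4f}")
```

When the reference and new assays share the same boundary condition, the two tabulated averages cancel and the term reduces to the sequence term.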
Table 2

Average values and count of input-output cases for different organisms, processes, and techniques.

| Source organism (so) | N in | N out | *χ |
|---|---|---|---|
| Homo sapiens | 38920 | 39274 | 2.685 |
| Plasmodium falciparum | 10669 | 9446 | 2.704 |
| Hepatitis C virus | 9935 | 10239 | 2.683 |
| Bos taurus | 5671 | 5780 | 2.690 |
| Canine parvovirus | 5655 | 5637 | 2.693 |
| Foot-mouth disease virus | 4062 | 4176 | 2.676 |
| Triticum aestivum | 3769 | 3887 | 2.703 |
| Bacillus anthracis | 3602 | 3600 | 2.699 |
| Human papillomavirus | 3316 | 3414 | 2.693 |
| Human herpesvirus | 3026 | 3132 | 2.684 |
| Gallus gallus | 2850 | 2829 | 2.689 |
| Arachis hypogaea | 2648 | 2670 | 2.687 |
| Mycobacterium tuberculosis | 2637 | 2593 | 2.688 |
| Clostridium botulinum | 2588 | 2722 | 2.685 |
| SARS coronavirus | 2550 | 2704 | 2.686 |
| Mus musculus | 2334 | 2287 | 2.682 |
| Hepatitis B virus | 2007 | 2066 | 2.680 |
| Helicobacter pylori | 1958 | 1796 | 2.695 |
| Hevea brasiliensis | 1938 | 1958 | 2.697 |
| Hepatitis E virus | 1928 | 1941 | 2.685 |
| Shigella flexneri | 1878 | 1701 | 2.699 |
| Dengue virus 2 | 1767 | 1828 | 2.679 |
| Staphylococcus aureus | 1757 | 1661 | 2.694 |
| Treponema pallidum | 1739 | 1755 | 2.691 |
| Escherichia coli | 1721 | 1678 | 2.689 |
| Murine hepatitis virus | 1575 | 1603 | 2.692 |
| Haemophilus influenzae | 1545 | 1587 | 2.695 |
| Streptococcus mutans | 1523 | 1537 | 2.697 |
| Puumala virus (strain) | 1505 | 1574 | 2.689 |
| Chlamydia trachomatis | 1402 | 1546 | 2.704 |
| Human respiratory virus | 1347 | 1398 | 2.682 |
| Borrelia burgdorferi | 1228 | 1237 | 2.698 |
| Hepatitis delta virus | 1182 | 1199 | 2.690 |
| Streptococcus pyogenes | 1181 | 1251 | 2.697 |
| Porphyromonas gingivalis | 1143 | 1085 | 2.688 |
| Human enterovirus | 1106 | 1132 | 2.689 |
| Influenza A virus | 1085 | 1086 | 2.687 |
| Mycoplasma hyopneumoniae | 1044 | 1024 | 2.695 |
| Rattus norvegicus | 1025 | 1039 | 2.689 |
| Bordetella pertussis | 1011 | 960 | 2.685 |
| Human T-lymphotropic virus | 996 | 1031 | 2.680 |
| Anaplasma marginale | 977 | 857 | 2.707 |
| Measles virus strain | 804 | 810 | 2.688 |
| Fasciola hepatica | 803 | 857 | 2.685 |
| Neisseria meningitidis | 789 | 853 | 2.696 |
| Human poliovirus | 766 | 780 | 2.690 |
| Tityus serrulatus | 764 | 775 | 2.680 |
| Torpedo californica | 752 | 788 | 2.687 |
| Cryptomeria japonica | 719 | 794 | 2.680 |
| Mycobacterium bovis | 717 | 733 | 2.688 |
| Trypanosoma cruzi | 691 | 777 | 2.704 |
| Andes virus CHI-7913 | 679 | 687 | 2.690 |
| Bovine papillomavirus | 672 | 665 | 2.692 |
| Human hepatitis | 670 | 696 | 2.688 |
| Leishmania infantum | 659 | 735 | 2.688 |
| Human parvovirus | 649 | 691 | 2.683 |
| Poa pratensis | 648 | 664 | 2.692 |
| Aspergillus fumigatus | 642 | 709 | 2.677 |
| Duck hepatitis | 587 | 603 | 2.688 |
| Olea europaea | 571 | 577 | 2.692 |
| Porcine reproductive | 515 | 514 | 2.681 |
| Fagopyrum esculentum | 509 | 497 | 2.685 |
| Juniperus ashei | 505 | 568 | 2.672 |
| Mycobacterium leprae | 489 | 542 | 2.690 |
| Glycine max | 477 | 509 | 2.685 |
| D. pteronyssinus | 455 | 464 | 2.680 |
| Plasmodium vivax | 453 | 446 | 2.690 |
| Chlamydophila pneumoniae | 446 | 462 | 2.690 |
| Pseudomonas aeruginosa | 443 | 454 | 2.691 |
| Vibrio cholera | 427 | 426 | 2.694 |
| Streptococcus sp. | 426 | 425 | 2.691 |
| Mycobacterium avium | 425 | 415 | 2.689 |
| Dermatophagoides farinae | 410 | 390 | 2.693 |
| Human coxsackievirus | 406 | 392 | 2.694 |
| Equine infectious virus | 404 | 419 | 2.688 |
| Babesia equi | 383 | 371 | 2.696 |
| Prunus dulcis | 383 | 379 | 2.708 |
| Human adenovirus | 375 | 405 | 2.686 |
| Theileria parva | 366 | 371 | 2.713 |
| Candida albicans | 365 | 370 | 2.690 |
| Porcine endogenous | 355 | 351 | 2.692 |
| Ovis aries | 352 | 350 | 2.683 |
| Chironomus thummi | 347 | 338 | 2.691 |
| Sus scrofa | 343 | 362 | 2.686 |
| Bovine leukemia virus | 333 | 329 | 2.676 |
| Ricinus communis | 329 | 314 | 2.692 |
| Androctonus australis | 322 | 357 | 2.685 |
| Renibacterium salmoninarum | 319 | 350 | 2.690 |
| Orientia tsutsugamushi | 309 | 372 | 2.705 |
| Anacardium occidentale | 293 | 306 | 2.693 |
| Conus geographus | 289 | 295 | 2.660 |

| Host organism (ho) | N in | N out | *χ |
|---|---|---|---|
| Homo sapiens | 257293 | 91093 | 2.6856 |
| Mus musculus | 107867 | 51466 | 2.6873 |
| Oryctolagus cuniculus | 65053 | 31433 | 2.6900 |
| Bos taurus | 15333 | 2072 | 2.6909 |
| Rattus norvegicus | 9450 | 3562 | 2.6876 |
| Aotus sp. | 9044 | 3933 | 2.6879 |
| Sus scrofa | 7725 | 3464 | 2.6873 |
| Gallus gallus | 7507 | 997 | 2.6790 |
| Canis lupus | 6604 | 3334 | 2.6906 |
| Macaca mulatta | 5261 | 2569 | 2.6993 |
| Ovis aries | 3953 | 1653 | 2.6836 |
| Equus caballus | 3943 | 2099 | 2.6842 |
| Cavia porcellus | 3458 | 1688 | 2.6833 |
| Capra hircus | 2182 | 1127 | 2.6830 |
| Aotus nancymaae | 1659 | 852 | 2.6837 |
| Pan troglodytes | 1614 | 732 | 2.6757 |
| Marmota monax | 1100 | 509 | 2.7011 |
| Felis catus | 901 | 279 | 2.6838 |
| Myodes glareolus | 814 | 388 | 2.6863 |
| Anas platyrhynchos | 688 | 342 | 2.6880 |
| Homo sapiens (human) | 508 | 270 | 2.6851 |
| Trichosurus vulpecula | 456 | 126 | 2.6921 |
| Mesocricetus auratus | 438 | 104 | 2.6909 |
| Macaca cyclopis | 382 | 193 | 2.6871 |
| O. tshawytscha | 333 | 159 | 2.6929 |
| Macaca fuscata | 188 | 100 | 2.6667 |
| Cricetulus migratorius | 171 | 142 | 2.7008 |
| Camelus dromedarius | 171 | 89 | 2.6886 |
| Dicentrarchus labrax | 121 | 55 | 2.6759 |
| Macaca fascicularis | 96 | 52 | 2.6793 |
| Saimiri sciureus | 92 | 44 | 2.6900 |
| Canis familiaris | 77 | 42 | 2.6850 |
| Rattus rattus | 72 | 31 | 2.6760 |
| Callithrix pygmaea | 67 | 30 | 2.6920 |
| Chinchilla lanigera | 41 | 24 | 2.6729 |
| Aotus lemurinus | 30 | 19 | 2.6860 |
| Papio cynocephalus | 27 | 13 | 2.7267 |
| Aotus griseimembra | 26 | 12 | 2.7000 |
| Mustela vison | 18 | 10 | 2.7000 |
| Chlorocebus aethiops | 15 | 10 | 2.6875 |
| Bos indicus | 13 | 4 | 2.6925 |
| Oncorhynchus mykiss | 10 | 4 | 2.6700 |
| M. macquariensis | 9 | 6 | 2.6600 |
| Cricetulus griseus | 8 | 4 | 2.6900 |
| Aotus trivirgatus | 7 | 4 | 2.7000 |

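| Process type (pt) | N in | N out | *χ |
|---|---|---|---|
| AID | 111197 | 108536 | 2.6876 |
| OOID | 32419 | 32617 | 2.6868 |
| OAID | 19210 | 18954 | 2.6801 |
| OOA | 15863 | 16303 | 2.6902 |
| NI | 13430 | 15206 | 2.6845 |
| EWEIR | 4818 | 4864 | 2.6843 |
| EEE | 3113 | 3546 | 2.6906 |
| OOD | 2806 | 2799 | 2.6887 |
| AICD | 1077 | 1095 | 2.6812 |
| EWED | 696 | 686 | 2.6879 |
| DEWED | 280 | 337 | 2.6804 |
| TT | 260 | 215 | 2.6806 |
| OOC | 153 | 137 | 2.6800 |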

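| Technique (tq) | N in | N out | *χ |
|---|---|---|---|
| ELISA | 133458 | 135109 | 2.6871 |
| WI | 33627 | 33292 | 2.6887 |
| ACAbB | 7780 | 9068 | 2.6862 |
| PhDIP | 7450 | 4496 | 2.6771 |
| RIA | 5241 | 5218 | 2.6858 |
| IFAIH | 4454 | 4581 | 2.6879 |
| NIAA | 4222 | 4316 | 2.6892 |
| FIA | 2255 | 2276 | 2.6897 |
| PAC | 1312 | 1219 | 2.6837 |
| IP | 1127 | 1089 | 2.6886 |
| SPR | 758 | 639 | 2.6860 |
| FACS | 608 | 647 | 2.6907 |
| Other | 502 | 495 | 2.6813 |
| SAC | 484 | 393 | 2.6878 |
| ELISPOT | 396 | 412 | 2.6979 |
| RDAT | 366 | 323 | 2.6859 |
| EDAT | 284 | 330 | 2.6800 |
| XRC | 231 | 227 | 2.6880 |
| MS | 209 | 179 | 2.6849 |
| PFF | 171 | 153 | 2.6820 |
| AbDPO | 162 | 295 | 2.6968 |
| CdC | 146 | 205 | 2.6895 |
| IAbBA | 144 | 183 | 2.6940 |
| IOT | 124 | 106 | 2.6835 |
| HAGGI | 115 | 122 | 2.6834 |
| IgMHR | 89 | 90 | 2.6929 |
| EAAA | 84 | 139 | 2.6922 |
| HS | 82 | 67 | 2.6791 |
| AbdCC | 73 | 118 | 2.6897 |
| AGG | 50 | 60 | 2.6980 |
| CM | 50 | 57 | 2.6863 |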

The asterisk (*) indicates that quantities like *χ are the average value of the mean electronegativity χ(m_i) for all the peptides in IEDB that are epitopes for the same boundary condition.

In Table 3 we depict the sequences and input-output boundary conditions for the top perturbations present in IEDB. All these perturbations have an observed value of λ(ε_ij)new = 1 and a predicted value also equal to 1 with a high probability. The Supplementary Material, available online at http://dx.doi.org/10.1155/2014/768515, contains a full list of >200,000 cases of perturbations.
Table 3

Top100 values of p1 for positive perturbations in training series.

Columns 1-6 describe the new experiment, columns 7-12 the experiment of reference, and columns 13-17 the input perturbation terms.

| New IDE | New sequence | New ho | New so | New ip | New tq | Ref IDE | Ref sequence | Ref ho | Ref so | Ref ip | Ref tq | Δχ | ΔΔχ ho | ΔΔχ so | ΔΔχ ip | ΔΔχ tq |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 115153 | MKGVVCTRIYEKV | Homo sapiens | Homo sapiens | OAID | ELISA | 115155 | NNQRKKAKNTPFNMLKRERN | Mus musculus | Dengue virus 2 | AID | WI | 0.01 | 0.012 | 0.004 | 0.017 | 0.012 |
| 52124 | QQQPP | Homo sapiens | Triticum aestivum | OOA | ELISA | 52128 | QQQQGGSQSQKGKQQ | Homo sapiens | Glycine max | OOA | WI | 0 | 0 | −0.018 | 0 | 0.002 |
| 3639 | APLGVT | Homo sapiens | Hepatitis E virus | EWEIR | ELISA | 3652 | APLTRGSCRKRNRSPER | Homo sapiens | Human herpesvirus | OOID | ELISA | 0.04 | 0.04 | 0.039 | 0.043 | 0.04 |
| 135959 | LTRAYAKDVKFG | Homo sapiens | Homo sapiens | OAID | ELISA | 135963 | NGQEEKAGVVSTGLIGGG | Mus musculus | MD | AID | ELISA | −0.04 | −0.038 | −0.047 | −0.033 | −0.04 |
| 108075 | PREPQVY | Homo sapiens | Homo sapiens | OAID | ELISA | 108078 | PTSPSGVEEWIVTQVVPGVA | Oryctolagus cuniculus | Homo sapiens | AID | ACAbB | −0.01 | −0.006 | −0.01 | −0.003 | −0.011 |
| 25113 | HVVDLP | Homo sapiens | Hepatitis E virus | EWEIR | ELISA | 25126 | HWGNHSKSHPQR | Mus musculus | MD | AID | ELISA | 0.02 | 0.022 | 0.013 | 0.023 | 0.02 |
| 48780 | PPFSPQ | Homo sapiens | Hepatitis delta virus | OOID | ELISA | 48782 | PPFTSAVGGVDHRS | Mus musculus | MD | AID | SAC | 0.02 | 0.022 | 0.008 | 0.021 | 0.021 |
| 40988 | LYVVAYQA | Mus musculus | Viscum album | AID | ELISA | 41004 | MAARLCCQLDPARDV | Homo sapiens | Hepatitis B virus | OOID | ELISA | 0.02 | 0.018 | 0.006 | 0.019 | 0.02 |
| 50439 | QDAYNAAG | Mus musculus | Mycobacterium scrofulaceum | AID | ELISA | 50445 | QDCNCSIYPGHASGHRMAWD | Homo sapiens | Hepatitis C virus | OOID | ELISA | 0.04 | 0.038 | 0.028 | 0.039 | 0.04 |
| 98849 | KIPAVFKIDA | Homo sapiens | Bos taurus | DEWED | WI | 98850 | KKGSEEEGDITNPIN | Homo sapiens | Arachis hypogaea | OOA | IFAIH | −0.05 | −0.05 | −0.053 | −0.04 | −0.051 |
| 116171 | TQDQDPBBHFFKNIVTPR | Homo sapiens | Homo sapiens | OAID | ACAbB | 116286 | CGKGLSATVTGGQKGRGSR | Oryctolagus cuniculus | Mus musculus | AID | MS | 0.01 | 0.014 | 0.008 | 0.017 | 0.009 |
| 123442 | LLKDLRKN | Homo sapiens | Borna disease virus | EWEIR | WI | 123443 | LLTEHRMTWDPAQPPRDLTE | Homo sapiens | Homo sapiens | OOID | ELISA | −0.02 | −0.02 | −0.025 | −0.017 | −0.022 |
| 47858 | PHVVDL | Homo sapiens | Hepatitis E virus | EWEIR | ELISA | 47860 | PHWIKKPNRQGLGYYS | Capra hircus | Human T-lymphotropic virus | AID | ACAbB | 0.01 | 0.007 | 0.005 | 0.013 | 0.009 |
| 61783 | STNKAVVSLS | Bos taurus | Bovine respiratory | AID | ELISA | 61792 | STNPKPQRKTKRNTNRRPQD | Homo sapiens | Hepatitis C virus | OOID | ELISA | 0.03 | 0.025 | 0.019 | 0.029 | 0.03 |
| 118210 | VMLYQISEE | Homo sapiens | Homo sapiens | OAID | WI | 118217 | VTKYITKGWKEVH | Oryctolagus cuniculus | Homo sapiens | AID | ELISA | 0.05 | 0.054 | 0.05 | 0.057 | 0.048 |
| 130944 | LFKHS | Oryctolagus cuniculus | Rattus norvegicus | AID | ELISA | 130956 | LPPRVTPKWSLDAWSTWR | Homo sapiens | MD | OOD | WI | 0.01 | 0.006 | −0.002 | 0.011 | 0.012 |
| 23028 | GVKYA | Homo sapiens | MD | OAID | WI | 23032 | GVLAKDVRFSQV | Homo sapiens | MD | OOID | ELISA | 0 | 0 | 0 | 0.007 | −0.002 |
| 51199 | QKKAIE | Oryctolagus cuniculus | Vibrio cholerae | AID | ELISA | 51204 | QKKNKRNTNRRPQDV | Homo sapiens | Hepatitis C virus | OOID | ELISA | 0.03 | 0.026 | 0.018 | 0.029 | 0.03 |
| 144783 | SHVVTMLDNF | Homo sapiens | Homo sapiens | NI | ELISA | 144786 | SMNRGRGTHPSLIWM | Mus musculus | MD | AID | ACAbB | 0.03 | 0.032 | 0.023 | 0.033 | 0.029 |
| 134343 | DLYIK | Mus musculus | Human papillomavirus | AID | NIAA | 134344 | DMAQVTVGPGLLGVSTL | Mus musculus | Homo sapiens | AID | WI | 0 | 0 | −0.009 | 0 | 0 |
| 38321 | LNQLAGRM | Anas platyrhynchos | Duck hepatitis | AID | ELISA | 38323 | LNQTARAFPDCAICWEPSPP | Oryctolagus cuniculus | Bovine leukemia virus | AID | ACAbB | −0.01 | −0.008 | −0.022 | −0.01 | −0.011 |
| 144657 | GQITVDMMYG | Homo sapiens | Homo sapiens | OAID | ELISA | 144661 | GREGYPADGGCAWPACYC | Oryctolagus cuniculus | MD | AID | WI | 0.02 | 0.024 | 0.013 | 0.027 | 0.022 |
| 21084 | GLQN | Mus musculus | Chlamydia trachomatis | AID | ELISA | 21093 | GLRAQDDFSGWDINTPAFEW | Mus musculus | Mycobacterium tuberculosis | AID | WI | 0.03 | 0.03 | 0.013 | 0.03 | 0.032 |
| 98453 | SGFSGSVQFV | Oryctolagus cuniculus | Neisseria meningitidis | AID | ELISA | 98456 | SICSNNPTCWAICKRIPNKK | Mus musculus | Human respiratory virus | AID | IFAIH | 0.04 | 0.037 | 0.027 | 0.04 | 0.041 |
| 98453 | SGFSGSVQFV | Mus musculus | Neisseria meningitidis | AID | ELISA | 98456 | SICSNNPTCWAICKRIPNKK | Mus musculus | Human respiratory virus | AID | IFAIH | 0.04 | 0.04 | 0.027 | 0.04 | 0.041 |
| 107107 | EAIQP | Rattus norvegicus | Homo sapiens | AID | ELISA | 107110 | EKERRPSPIGTATLL | Homo sapiens | MD | OOA | ELISA | 0.05 | 0.048 | 0.043 | 0.053 | 0.05 |
| 110857 | FTGEAYSYWSAK | Homo sapiens | Mycoplasma penetrans | EWEIR | ELISA | 110859 | GEESRISLPLPNFSSLNLRE | Mus musculus | Homo sapiens | AID | FIA | 0 | 0.002 | −0.017 | 0.003 | 0.003 |
| 36315 | LGSAYP | Mus musculus | Mycobacterium leprae | MD | ELISA | 36317 | LGSGAFGTIYKG | Mus musculus | Avian erythroblastosis virus | AID | ACAbB | 0.01 | 0.01 | 0 | 0.011 | 0.009 |
| 122034 | WNPAD | Rattus norvegicus | Torpedo californica | AID | ELISA | 122035 | WNPADYGGIKWNPADYGGIK | Rattus norvegicus | MD | AID | RIA | 0.01 | 0.01 | 0.001 | 0.01 | 0.009 |
| 25013 | HVADIDKLID | Mus musculus | Puumala virus Kazan | AID | ELISA | 25021 | HVAPTHYVTESDASQRVTQL | Homo sapiens | Hepatitis C virus | OOID | ELISA | 0 | −0.002 | −0.014 | −0.001 | 0 |
| 36162 | LGIHE | Oryctolagus cuniculus | Candida albicans | AID | ELISA | 36166 | LGIMGEYRGTPRNQDLYDAA | Mus musculus | Human respiratory virus | AID | RIA | 0 | −0.003 | −0.007 | 0 | −0.001 |
| 67253 | TWEVLH | Mus musculus | Plasmodium vivax | AID | ELISA | 67257 | TWGENETDVLLLNNTRPPQ | Homo sapiens | Hepatitis C virus | OOID | ACAbB | −0.02 | −0.022 | −0.027 | −0.021 | −0.021 |
| 50990 | QGYRVSSYLP | Homo sapiens | Hevea brasiliensis | OOA | WI | 50998 | QHEQDRPTPSPAPSRPFSVL | Homo sapiens | Hepatitis E virus | OOID | ELISA | 0.01 | 0.01 | −0.002 | 0.007 | 0.008 |
| 100458 | RDVLQLYAPE | Mus musculus | Bacillus anthracis | AID | ELISA | 100462 | RFSTRYGNQNGRIRVLQRFD | Homo sapiens | Arachis hypogaea | EWED | ELISA | 0.03 | 0.028 | 0.018 | 0.03 | 0.03 |
| 111036 | TESTFTGEAYSV | Homo sapiens | Mycoplasma penetrans | EWEIR | ELISA | 111039 | TGVPIDPAVPDSSIVPLLES | Bos taurus | Bovine papillomavirus | AID | ELISA | 0.03 | 0.035 | 0.02 | 0.033 | 0.03 |
| 117919 | IFIEME | Homo sapiens | Homo sapiens | OAID | WI | 117921 | IGIIDLIEKRKFNQ | Mus musculus | Homo sapiens | AID | WI | 0.03 | 0.032 | 0.03 | 0.037 | 0.03 |
| 7127 | CTDTDKLF | Oryctolagus cuniculus | Shigella flexneri | AID | ELISA | 7128 | CTDVSTAIHADQLTPAW | Homo sapiens | SARS coronavirus | OOID | ELISA | 0.01 | 0.006 | −0.003 | 0.009 | 0.01 |
| 112253 | PGQSPKL | Homo sapiens | Homo sapiens | OAID | ELISA | 112255 | PIRALVGDEVELPCRISPGK | Mus musculus | Homo sapiens | AID | ELISA | 0.01 | 0.012 | 0.01 | 0.017 | 0.01 |
| 122034 | WNPAD | Rattus norvegicus | Torpedo californica | AID | ELISA | 122038 | WNPDDYGGVKWNPDDYGGVK | Rattus norvegicus | MD | AID | RIA | 0 | 0 | −0.009 | 0 | −0.001 |
| 131878 | FLMLVGGSTL | Homo sapiens | Homo sapiens | OAID | ACAbB | 131879 | FLVAHTRARAPSAGERARRS | Mus musculus | Mus musculus | AID | NIAA | 0.03 | 0.032 | 0.028 | 0.037 | 0.033 |
| 70664 | VQVVYDYQ | Homo sapiens | Treponema pallidum | OOID | ELISA | 70667 | VQWMNRLIAFAFAGNHVSP | Homo sapiens | Hepatitis C virus | OOID | ELISA | 0.05 | 0.05 | 0.041 | 0.05 | 0.05 |
| 71545 | VTV | Homo sapiens | Helicobacter pylori | OOID | ELISA | 71559 | VTVRGGLRILSPDRK | Homo sapiens | Arachis hypogaea | OOA | WI | 0.04 | 0.04 | 0.032 | 0.043 | 0.042 |
| 127856 | TDVRYKD | Mus musculus | Mus musculus | AID | ACAbB | 127857 | TDVRYKDDMYHFFCPAIQAQ | Mus musculus | Mus musculus | AID | PFF | 0.01 | 0.01 | 0.01 | 0.01 | 0.006 |
| 112149 | GVGWIRQ | Homo sapiens | Homo sapiens | OAID | ELISA | 112152 | HHPARTAHYGSLPQKSHGRT | Homo sapiens | Homo sapiens | AID | ELISA | 0 | 0 | 0 | 0.007 | 0 |
| 119581 | FSCSVMHE | Homo sapiens | Homo sapiens | OAID | ELISA | 119582 | GLQLIQLINVDEVNQI | Mus musculus | Homo sapiens | AID | RIA | −0.01 | −0.008 | −0.01 | −0.003 | −0.011 |
| 144657 | GQITVDMMYG | Homo sapiens | Homo sapiens | OAID | ELISA | 144659 | GREGYPADGGAAGYCNTE | Oryctolagus cuniculus | MD | AID | WI | −0.01 | −0.006 | −0.017 | −0.003 | −0.008 |
| 25013 | HVADIDKLID | Mus musculus | Puumala virus Kazan | AID | ELISA | 25022 | HVAPTHYVVESDASQRVTQV | Homo sapiens | Hepatitis C virus | OOID | ELISA | 0 | −0.002 | −0.014 | −0.001 | 0 |
| 144652 | GMRGMKGLVY | Homo sapiens | Homo sapiens | OAID | ELISA | 144654 | GPHPTLEVVPMGRGS | Mus musculus | MD | AID | ELISA | −0.02 | −0.018 | −0.027 | −0.013 | −0.02 |
| 104515 | HDCRPKKI | Mus musculus | La Crosse virus | AID | IFAIH | 104521 | IGTLKKILDETVKDKIAKEQ | Rattus norvegicus | Streptococcus pyogenes | AID | ELISA | −0.04 | −0.04 | −0.053 | −0.04 | −0.041 |
| 7367 | CYGDWA | Homo sapiens | Triticum aestivum | OOA | ELISA | 7374 | CYGLPDSEPTKTNGK | Mus musculus | Tityus serrulatus | AID | WI | −0.02 | −0.018 | −0.043 | −0.023 | −0.018 |
| 7367 | CYGDWA | Homo sapiens | Triticum aestivum | OOA | ELISA | 7374 | CYGLPDSEPTKTNGK | Mus musculus | Tityus serrulatus | AID | WI | −0.02 | −0.018 | −0.043 | −0.023 | −0.018 |
| 144610 | DFFTYK | Mus musculus | Porcine transmissible | AID | WI | 144611 | DFNGSFDMNGTITA | Oryctolagus cuniculus | Escherichia coli | AID | ELISA | −0.01 | −0.007 | −0.018 | −0.01 | −0.012 |
| 112047 | ASTRESG | Homo sapiens | Homo sapiens | OAID | ELISA | 112048 | ATASTMDHARHGFLPRHRDT | Homo sapiens | Homo sapiens | AID | ELISA | 0.05 | 0.05 | 0.05 | 0.057 | 0.05 |
| 36136 | LGGVFT | Homo sapiens | Dengue virus 2 | OOID | ELISA | 36137 | LGGWKLQSDPRAYAL | Homo sapiens | Ambrosia artemisiifolia | OOA | RIA | 0.01 | 0.01 | 0.007 | 0.013 | 0.009 |
| 115256 | FRELKDLKGY | Homo sapiens | Bos taurus | DEWED | WI | 115261 | GDLEILLQKWENGECAQKKI | Homo sapiens | Bos taurus | OOA | FIA | −0.01 | −0.01 | −0.01 | 0 | −0.009 |
| 129024 | KADQLYK | Homo sapiens | Homo sapiens | OAID | ELISA | 129026 | KAKKPAAAAGAKKAKS | Oryctolagus cuniculus | Homo sapiens | AID | ELISA | 0.03 | 0.034 | 0.03 | 0.037 | 0.03 |
| 148481 | YTRDLVYK | Rattus norvegicus | Homo sapiens | AID | WI | 148483 | YVPIVTFYSEISMHSSRAIP | Oryctolagus cuniculus | MD | AID | ELISA | 0 | 0.002 | −0.007 | 0 | −0.002 |
| 150850 | GY | Mus musculus | Human papillomavirus | AID | ELISA | 150853 | HIGGLSILDPIFGVL | Homo sapiens | Dermatophagoides farinae | OOA | ACAbB | 0.04 | 0.038 | 0.04 | 0.043 | 0.039 |
| 107366 | FPPKPKD | Homo sapiens | Homo sapiens | OAID | ELISA | 107376 | GDRSGYSSPGSPG | Mus musculus | Homo sapiens | AID | ACAbB | −0.03 | −0.028 | −0.03 | −0.023 | −0.031 |
| 114859 | ICGTDGVTYT | Homo sapiens | Gallus gallus | OOA | WI | 114865 | IVERETRGQSENPLWHALRR | Rattus norvegicus | Human herpesvirus | AID | ELISA | 0.04 | 0.042 | 0.035 | 0.037 | 0.038 |
| 62149 | SVHLF | Homo sapiens | MD | OAID | WI | 62150 | SVIALGSQEGALHQALAGAI | Equus caballus | West Nile virus | AID | IFAIH | −0.02 | −0.021 | −0.012 | −0.013 | −0.021 |
| 98455 | SGSVQFVPIQ | Mus musculus | Neisseria meningitidis | AID | ELISA | 98456 | SICSNNPTCWAICKRIPNKK | Mus musculus | Human respiratory virus | AID | IFAIH | 0.04 | 0.04 | 0.027 | 0.04 | 0.041 |
| 61783 | STNKAVVSLS | Bos taurus | Bovine respiratory | AID | ELISA | 61791 | STNPKPQRKTKRNTNRRPQ | Homo sapiens | Hepatitis C virus | EWEIR | ACAbB | 0.04 | 0.035 | 0.029 | 0.037 | 0.039 |
| 100318 | NAPKTFQFIN | Mus musculus | Bacillus anthracis | AID | ELISA | 100319 | NASSELHLLGFGINAENNHR | Homo sapiens | Arachis hypogaea | EWED | ELISA | 0 | −0.002 | −0.012 | 0 | 0 |
| 118947 | PFSAPPPA | Homo sapiens | Homo sapiens | OAID | ELISA | 118948 | PGAIEQGPADDPGEGPSTGP | Homo sapiens | Human herpesvirus | NI | ACAbB | −0.04 | −0.04 | −0.041 | −0.036 | −0.041 |
| 78323 | YSFRD | Mus musculus | Bluetongue virus 1 | AID | ELISA | 78341 | AALTAENTAIKKRNADAKA | Homo sapiens | Streptococcus mutans | EEE | ELISA | 0.01 | 0.008 | 0.007 | 0.013 | 0.01 |
| 145831 | IPLGTRP | Mus musculus | Human papillomavirus | AID | ELISA | 145841 | KEDFRYAISSTNEIGLLGA | Sus scrofa | Classical swine | AID | PAC | −0.04 | −0.04 | −0.045 | −0.04 | −0.043 |
| 119592 | HTFPAVLQ | Homo sapiens | Homo sapiens | OAID | ELISA | 119596 | IHIPSEKIWRPDLVLY | Mus musculus | Homo sapiens | AID | RIA | 0.01 | 0.012 | 0.01 | 0.017 | 0.009 |
| 58780 | SKAANLSIIK | MD | Beet necrotic | MD | WI | 58783 | SKAFSNCYPYDVPDYASL | Oryctolagus cuniculus | Influenza A virus | AID | RIA | −0.01 | −0.004 | −0.014 | −0.009 | −0.013 |
| 115153 | MKGVVCTRIYEKV | Homo sapiens | Homo sapiens | NI | ELISA | 115155 | NNQRKKAKNTPFNMLKRERN | Mus musculus | Dengue virus 2 | AID | ELISA | 0.01 | 0.012 | 0.004 | 0.013 | 0.01 |
| 133629 | LPLRF | Oryctolagus cuniculus | Gallus gallus | AID | ACAbB | 133630 | LPPGLHVFPLASNRS | Mus musculus | MD | AID | SPR | −0.01 | −0.013 | −0.021 | −0.01 | −0.01 |
| 96215 | EEEEAEDKED | Homo sapiens | Homo sapiens | OAID | ELISA | 96216 | EEEGLLKKSADTLWNMQK | Mus musculus | Mus musculus | AID | ELISA | 0.08 | 0.082 | 0.078 | 0.087 | 0.08 |
| 39782 | LTAASV | Homo sapiens | Triticum aestivum | OOA | ELISA | 39788 | LTAELKIYSVIQAEINKHL | Oryctolagus cuniculus | Yersinia pestis | AID | ACAbB | 0.01 | 0.014 | −0.001 | 0.007 | 0.009 |
| 107479 | KFNWYVD | Homo sapiens | Homo sapiens | OAID | ELISA | 107482 | KGEPGLPGRGFPGFP | Mus musculus | Homo sapiens | AID | ACAbB | 0 | 0.002 | 0 | 0.007 | −0.001 |
| 63569 | TETVNSDI | Macaca mulatta | Shigella flexneri | AID | ELISA | 63573 | TEVELKERKHRIEDAVRNAK | Homo sapiens | Mycobacterium leprae | OOID | ELISA | 0.05 | 0.036 | 0.041 | 0.049 | 0.05 |
| 134028 | DDTIS | Mus musculus | Homo sapiens | AID | NIAA | 134029 | DEDENQSPRSFQKKTR | Oryctolagus cuniculus | Homo sapiens | AID | ELISA | 0.05 | 0.053 | 0.05 | 0.05 | 0.048 |
| 23028 | GVKYA | Homo sapiens | MD | OAID | WI | 23032 | GVLAKDVRFSQV | Homo sapiens | MD | EWEIR | ACAbB | 0 | 0 | 0 | 0.004 | −0.003 |
| 115293 | IMCVKKILDK | Homo sapiens | Bos taurus | DEWED | WI | 115295 | INPSKENLCSTFCKEVVRNA | Homo sapiens | Bos taurus | OOA | FIA | −0.03 | −0.03 | −0.03 | −0.02 | −0.029 |
| 65105 | TLTPENTL | Mus musculus | Shigella flexneri | AID | ELISA | 65110 | TLTSGSDLDRCTTFDDV | Oryctolagus cuniculus | SARS coronavirus | AID | ELISA | 0.01 | 0.013 | −0.003 | 0.01 | 0.01 |
| 134471 | PKPEQ | Mus musculus | Streptococcus pneumoniae | AID | FACS | 134472 | PLLPGTSTTSTGPCKT | Homo sapiens | Hepatitis B virus | AID | ELISA | 0.02 | 0.018 | 0.019 | 0.02 | 0.016 |
| 142228 | LYCYEQLNDSSE | Homo sapiens | Human papillomavirus | NI | ELISA | 142250 | NWGDEPSKRRDRSNSRGRKN | Felis catus | Feline infectious | AID | ELISA | 0.07 | 0.068 | 0.071 | 0.073 | 0.07 |
| 21549 | GNYDFWYQS | Homo sapiens | Staphylococcus aureus | OOID | ELISA | 21553 | GNYNYKYRYLRHGKLRPFER | Mus musculus | SARS coronavirus | AID | ELISA | 0.03 | 0.032 | 0.021 | 0.031 | 0.03 |
| 21549 | GNYDFWYQS | Homo sapiens | Staphylococcus aureus | OOID | ELISA | 21553 | GNYNYKYRYLRHGKLRPFER | Oryctolagus cuniculus | SARS coronavirus | AID | ELISA | 0.03 | 0.034 | 0.021 | 0.031 | 0.03 |
| 34908 | LAPLGE | Homo sapiens | Hepatitis E virus | EWEIR | ELISA | 34914 | LAPSTLRSLRKRRLSSP | Homo sapiens | Human herpesvirus | OOID | ELISA | 0.06 | 0.06 | 0.059 | 0.063 | 0.06 |
| 130454 | CLFPNNSYC | Mus musculus | MD | AID | NIAA | 130456 | CRPQVNNPKEWSCAAC | Homo sapiens | MD | OOD | ACAbB | 0.01 | 0.008 | 0.01 | 0.011 | 0.007 |
| 19644 | GFVPSM | Homo sapiens | Hepatitis delta virus | OOID | ELISA | 19647 | GFVSASIFGFQAEVGPNNTR | Oryctolagus cuniculus | Vaccinia virus WR | AID | ELISA | −0.01 | −0.006 | −0.013 | −0.009 | −0.01 |
| 20678 | GKRPE | Mus musculus | Streptococcus pyogenes | AID | WI | 20680 | GKSKRDAKNNAAKLAVDKLL | Mus musculus | Vaccinia virus WR | AID | WI | 0.01 | 0.01 | 0.001 | 0.01 | 0.01 |
| 123278 | GYLKDLPTT | Ovis aries | Fasciola hepatica | AID | ELISA | 123282 | HACQKKLLKFEALQQEEGEE | Rattus norvegicus | Gallus gallus | AID | PFF | −0.02 | −0.016 | −0.016 | −0.02 | −0.025 |
| 141067 | PLSLEPDP | Mus musculus | Homo sapiens | AID | FACS | 141073 | REGVRWRVMAIQ | Mus musculus | Homo sapiens | AID | ELISA | 0.06 | 0.06 | 0.06 | 0.06 | 0.056 |
| 139248 | SFAGTVIE | Mus musculus | Classical swine | AID | WI | 139305 | TAAQITQRKWEAAREAEQRR | Oryctolagus cuniculus | Homo sapiens | AID | IP | 0.03 | 0.033 | 0.026 | 0.03 | 0.03 |
104515HDCRPKKI Mus musculus La Crosse virus AIDIFAIH104520IAKEQENKETIGTLKKILDE Rattus norvegicus Streptococcus pyogenes AIDELISA−0.06−0.06−0.073−0.06−0.061
113517HLYADGLTD Mus musculus Human papillomavirus AIDIFAIH113518HNKIQAIELEDLLRYSKLYR Mus musculus Homo sapiens AIDELISA0.020.020.0110.020.019
156970VERHQ Homo sapiens Homo sapiens OAIDELISA156975WSSTVLRVSPTRTVP Mus musculus MDAIDELISA0.010.0120.0030.0170.01
78252PVQNLT Mus musculus Porphyromonas gingivalis AIDWI78253QGGCGRGWAFSATGAIEA Mus musculus Glycine max AIDELISA0.020.020.0170.020.018
6068CCPDKNKS Mus musculus Human herpesvirus AIDWI6074CCRHKQKDVGDVKQTLPPS Ovis aries MDAIDELISA0.010.0060.0040.010.008
53109RAGVCY Homo sapiens Triticum aestivum OOAELISA53116 RAILTAFSPAQDIWGTS Oryctolagus cuniculus SARS coronavirus AIDELISA−0.02−0.016−0.037−0.023−0.02
147041IPEQ Homo sapiens Triticum aestivum OOAWI147064KHQGAQYVWNRTA Homo sapiens Bos taurus OOIDWI0.060.060.0470.0570.06
70664VQVVYDYQ Oryctolagus cuniculus Treponema pallidum AIDELISA70667VQWMNRLIAFAFAGNHVSP Homo sapiens Hepatitis C virus OOIDELISA0.050.0460.0410.0490.05
134343DLYIK Mus musculus Human papillomavirus AIDPhDIP134344DMAQVTVGPGLLGVSTL Mus musculus Homo sapiens AIDPhDIP00−0.00900

4. Conclusions

It is possible to develop general models for vaccine design that predict the results of multiple input-output perturbations in peptide sequence and in the boundary conditions of the experimental assay by combining ideas from QSPR analysis, perturbation theory, and Box and Jenkins moving average (MA) operators. The electronegativity values calculated with MARCH-INSIDE seem to be good molecular descriptors for this type of QSPR-perturbation model.
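
As a rough illustration of how a moving average operator can enter a perturbation term, the Python sketch below computes the deviation of a peptide descriptor from the mean of all peptides measured under the same boundary condition, and then the difference between those deviations for a reference state and a new state. The records, descriptor values, condition labels, and function names are hypothetical and chosen only for illustration. They are not MARCH-INSIDE outputs, and the sketch does not reproduce the model fitted in this work.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records: (peptide id, descriptor value, boundary condition).
# The descriptor values are made up; they are NOT MARCH-INSIDE electronegativities.
records = [
    ("p1", 2.50, "ELISA"),
    ("p2", 2.70, "ELISA"),
    ("p3", 2.60, "WI"),
    ("p4", 2.80, "WI"),
]


def condition_means(rows):
    """Average the descriptor over all peptides sharing a boundary condition."""
    grouped = defaultdict(list)
    for _, value, condition in rows:
        grouped[condition].append(value)
    return {condition: mean(values) for condition, values in grouped.items()}


def ma_deviation(value, condition, means):
    """Moving-average style term: descriptor minus the mean for its condition."""
    return value - means[condition]


def perturbation_term(reference, new, means):
    """Deviation of the new state minus deviation of the reference state."""
    _, ref_value, ref_condition = reference
    _, new_value, new_condition = new
    return (ma_deviation(new_value, new_condition, means)
            - ma_deviation(ref_value, ref_condition, means))


means = condition_means(records)
# Perturbation from reference peptide p1 (ELISA) to new peptide p4 (WI).
delta = perturbation_term(records[0], records[3], means)
print(round(delta, 3))
```

In a full model, terms of this kind would be computed for each boundary condition (source organism, host organism, biological process, and technique) and passed to a classifier together with the known outcome of the reference peptide. That step is omitted here.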

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Supplementary Material

The Supplementary Material includes the peptide sequences obtained from IEDB, the boundary conditions (source organisms, host organisms, techniques, and biological processes), and the values of the input/output variables of the models calculated for all the cases present in the dataset used. These values were obtained for all the input-output boundary conditions of the perturbations.