Gandharva Nagpal, Kumardeep Chaudhary, Piyush Agrawal, Gajendra P S Raghava.
Abstract
BACKGROUND: Evidence in the literature strongly supports the potential of immunomodulatory peptides as vaccine adjuvants. All mechanisms by which vaccine adjuvants produce immunostimulatory effects directly or indirectly stimulate antigen presenting cells (APCs). While numerous methods have been developed for predicting B-cell and T-cell epitopes, no method is available for predicting peptides that can modulate APCs.
Keywords: A-cell epitopes; Adjuvants; Antigen presenting cells; Immunomodulatory peptide; Support vector machine
Year: 2018 PMID: 29970096 PMCID: PMC6029051 DOI: 10.1186/s12967-018-1560-1
Source DB: PubMed Journal: J Transl Med ISSN: 1479-5876 Impact factor: 5.531
Fig. 1 An illustrative mechanism of antigen presenting cell (APC) activation by immunomodulatory peptides through innate immune receptors, leading to the induction of adaptive immune cells. The immunomodulatory peptides are ligands of innate immune receptors that evoke cytokine expression through cellular signaling pathways. The cytokines lead to the maturation of naïve cells into mature adaptive immune cells such as various types of T-lymphocytes. Since the immunomodulatory peptides activate the APCs, leading to the activation of the adaptive immune cells, they may be used as vaccine adjuvants and be called ‘A-cell epitopes’. The figure was drawn using ScienceSlides, made available at http://www.scienceslides.com/ by VisiScience
Fig. 2 Bar plots comparing the percent average amino acid composition of A-cell epitopes (blue) with that of non-epitopes (red) and Swiss-Prot human proteins (green)
Fig. 3 Two-sample logo of the 3 residue positions at the (a) N-terminus and (b) C-terminus of the A-cell epitopes and non-epitopes. The enriched label represents the positive dataset, whereas the depleted label represents the negative dataset. In a two-sample logo, the height of a symbol at a residue position is proportional to the difference in symbol frequency between the positive and the negative datasets at that residue position. With A-cell epitopes as positives and non-epitopes as negatives, R is a preferred amino acid at terminal positions, in addition to I and V. The symbol colors follow the WebLogo default color scheme provided by the server available at http://www.twosamplelogo.org/cgi-bin/tsl/tsl.cgi. In the default WebLogo color scheme, residues G, S, T, Y and C appear in green; N and Q in purple; K, R and H in blue; D and E in red; and P, A, W, F, L, I, M and V in black
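The quantity that drives symbol height in Fig. 3, the per-position difference in residue frequency between positives and negatives, can be sketched as follows. This is a minimal illustration, not the Two Sample Logo server's implementation: the function names and the toy sequences are hypothetical, and any statistical filtering the server applies is not reproduced.

```python
from collections import Counter

def position_frequencies(peptides, position):
    """Fraction of peptides carrying each residue at a 0-based position."""
    counts = Counter(p[position] for p in peptides if len(p) > position)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

def frequency_difference(positives, negatives, position):
    """Positive-minus-negative residue frequency at one position."""
    pos = position_frequencies(positives, position)
    neg = position_frequencies(negatives, position)
    residues = set(pos) | set(neg)
    return {aa: pos.get(aa, 0.0) - neg.get(aa, 0.0) for aa in residues}

# Toy sequences invented for illustration; they are not from the paper's datasets.
positives = ["RIVKA", "RLLGA", "IRKLA", "RVKAA"]
negatives = ["GSTDE", "ASDEE", "RSTGE", "GDEST"]

diff = frequency_difference(positives, negatives, 0)
# Residues with a positive difference would be drawn on the "enriched" side.
enriched = sorted((aa for aa, d in diff.items() if d > 0), key=lambda aa: -diff[aa])
```

To inspect C-terminal positions with the same helper, the peptides can be reversed first so that position 0 is the last residue.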
Fig. 4 Comparison of the occurrence of (a) tripeptides, (b) tetrapeptides, (c) pentapeptides and (d) hexapeptides, divided into 8 bins in ascending order of occurrence (rarest to most abundant) in Swiss-Prot proteins
Performance of SVM-based models developed using various features; models were evaluated on the training dataset using fivefold cross-validation (internal cross-validation)
| Features | Threshold | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC | AUROC | Parameters |
|---|---|---|---|---|---|---|---|
| AAC | −0.1 | 94.49 ± 0.80 | 92.38 ± 1.33 | 93.30 ± 0.84 | 0.87 ± 0.01 | 0.98 ± 0.00 | g = 0.001, c = 3, j = 3 |
| N5 AAC | 0 | 88.54 ± 0.75 | 90.25 ± 1.87 | 89.44 ± 1.26 | 0.79 ± 0.02 | 0.94 ± 0.00 | g = 0.0005, c = 2, j = 1 |
| C5 AAC | 0 | 91.13 ± 1.42 | 92.94 ± 1.20 | 92.08 ± 1.05 | 0.84 ± 0.02 | 0.97 ± 0.00 | g = 0.001, c = 9, j = 1 |
| N5C5 AAC | −0.2 | 93.73 ± 0.60 | 92.83 ± 0.76 | 93.26 ± 0.40 | 0.87 ± 0.00 | 0.98 ± 0.00 | g = 0.0005, c = 1, j = 1 |
| DPC | 0 | 93.79 ± 1.12 | 95.68 ± 0.78 | 94.84 ± 0.72 | 0.90 ± 0.01 | 0.99 ± 0.01 | g = 0.0005, c = 1, j = 2 |
| N5 DPC | −0.1 | 83.42 ± 1.77 | 87.73 ± 2.00 | 85.69 ± 1.10 | 0.71 ± 0.02 | 0.93 ± 0.00 | g = 1e−05, c = 9, j = 1 |
| C5 DPC | −0.1 | 90.21 ± 0.91 | 93.62 ± 0.96 | 92.00 ± 0.50 | 0.84 ± 0.01 | 0.97 ± 0.00 | g = 0.0005, c = 1, j = 2 |
| N5C5 DPC | −0.2 | 93.60 ± 0.72 | 92.67 ± 1.16 | 93.11 ± 0.70 | 0.86 ± 0.01 | 0.98 ± 0.00 | g = 0.0001, c = 1, j = 1 |
| N5 bin | −0.1 | 86.91 ± 0.82 | 88.81 ± 1.47 | 87.91 ± 0.73 | 0.76 ± 0.01 | 0.94 ± 0.00 | g = 0.5, c = 2, j = 1 |
| C5 bin | −0.2 | 91.18 ± 0.92 | 86.61 ± 1.68 | 88.80 ± 1.14 | 0.78 ± 0.02 | 0.96 ± 0.00 | g = 0.5, c = 1, j = 2 |
| N5C5 bin | 0.2 | 89.20 ± 1.11 | 91.14 ± 1.61 | 90.22 ± 1.05 | 0.80 ± 0.02 | 0.96 ± 0.00 | g = 0.05, c = 1, j = 4 |
| N10 bin | −0.2 | 86.39 ± 2.73 | 89.68 ± 1.79 | 88.42 ± 1.05 | 0.76 ± 0.02 | 0.94 ± 0.01 | g = 0.1, c = 2, j = 2 |
| C10 bin | −0.2 | 79.87 ± 2.30 | 86.49 ± 2.43 | 83.96 ± 1.91 | 0.66 ± 0.03 | 0.90 ± 0.01 | g = 0.05, c = 3, j = 1 |
| N10C10 bin | −0.4 | 86.89 ± 2.70 | 91.62 ± 2.92 | 89.83 ± 1.31 | 0.79 ± 0.02 | 0.96 ± 0.00 | g = 0.1, c = 1, j = 1 |
| AAC + motif | −0.1 | 95.51 ± 0.86 | 95.35 ± 0.85 | 95.42 ± 0.77 | 0.91 ± 0.01 | 0.99 ± 0.00 | g = 0.001, c = 6, j = 1 |
| DPC + motif | 0 | 94.15 ± 0.92 | 96.94 ± 0.49 | 95.71 ± 0.38 | 0.91 ± 0.00 | 0.99 ± 0.00 | g = 0.0005, c = 1, j = 2 |
This table shows average performance (mean ± standard deviation) of models on randomly generated training datasets (bagging)
MCC: Matthews correlation coefficient; AAC: amino acid composition; DPC: dipeptide composition; N5: first 5 residues from the N terminus; C5: first 5 residues from the C terminus; N5C5: first 5 residues from the N and C termini, respectively; bin: binary profile; AAC + motif: amino acid composition with MERCI motif score; DPC + motif: dipeptide composition with MERCI motif score. SVM parameters: g, gamma parameter of the radial basis function; c, trade-off between training error and margin; j, regularization parameter (cost factor by which training errors on positive examples outweigh errors on negative examples)
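The composition features named in this legend can be illustrated with a short sketch. It assumes the usual definitions (AAC as the percentage of each of the 20 standard residues, DPC as the percentage of each of the 400 overlapping dipeptides) and reads "first 5 residues from the C terminus" as the last five residues of the peptide; the function names and the example sequence are hypothetical and not taken from the paper.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]

def aac(seq):
    """Percent composition of the 20 standard residues (20 values)."""
    return [100.0 * seq.count(aa) / len(seq) for aa in AMINO_ACIDS]

def dpc(seq):
    """Percent composition of the 400 overlapping dipeptides (400 values)."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [100.0 * pairs.count(dp) / len(pairs) for dp in DIPEPTIDES]

def n5c5(seq):
    """First five and last five residues joined into one 10-residue string."""
    return seq[:5] + seq[-5:]

peptide = "RIVKAGLLRRV"           # hypothetical sequence, for illustration only
whole_aac = aac(peptide)          # "AAC" feature vector
terminal_aac = aac(n5c5(peptide)) # "N5C5 AAC" feature vector
whole_dpc = dpc(peptide)          # "DPC" feature vector
```

Under these assumptions, a row such as "N5C5 AAC" corresponds to computing the composition on the terminal residues only, whereas "AAC" and "DPC" use the full peptide.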
Performance of SVM-based models developed using various features; models were evaluated on the independent dataset (external validation)
| Features | Threshold | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC | AUROC | Parameters |
|---|---|---|---|---|---|---|---|
| AAC* | −0.1 | 94.10 ± 2.70 | 93.77 ± 3.28 | 93.91 ± 2.00 | 0.88 ± 0.03 | 0.98 ± 0.00 | g = 0.001, c = 3, j = 3 |
| N5 AAC | 0 | 89.75 ± 3.61 | 90.88 ± 3.67 | 90.32 ± 2.27 | 0.81 ± 0.04 | 0.95 ± 0.01 | g = 0.0005, c = 2, j = 1 |
| C5 AAC | 0 | 91.12 ± 3.53 | 91.64 ± 2.90 | 91.40 ± 2.13 | 0.83 ± 0.04 | 0.97 ± 0.01 | g = 0.001, c = 9, j = 1 |
| N5C5 AAC | −0.2 | 94.61 ± 3.35 | 93.59 ± 3.23 | 94.07 ± 2.16 | 0.88 ± 0.04 | 0.98 ± 0.00 | g = 0.0005, c = 1, j = 1 |
| DPC | 0 | 93.77 ± 2.76 | 95.32 ± 1.40 | 94.64 ± 1.24 | 0.89 ± 0.02 | 0.99 ± 0.00 | g = 0.0005, c = 1, j = 2 |
| N5 DPC | −0.1 | 81.68 ± 4.25 | 87.36 ± 2.78 | 84.62 ± 2.65 | 0.69 ± 0.05 | 0.93 ± 0.01 | g = 1e−05, c = 9, j = 1 |
| C5 DPC | −0.1 | 92.31 ± 3.36 | 94.71 ± 2.45 | 93.55 ± 1.40 | 0.87 ± 0.02 | 0.98 ± 0.01 | g = 0.0005, c = 1, j = 2 |
| N5C5 DPC | −0.2 | 94.10 ± 3.06 | 93.75 ± 2.33 | 93.90 ± 1.49 | 0.88 ± 0.03 | 0.98 ± 0.01 | g = 0.0001, c = 1, j = 1 |
| N5 bin | −0.1 | 88.46 ± 2.90 | 89.43 ± 3.32 | 88.98 ± 2.36 | 0.78 ± 0.04 | 0.95 ± 0.01 | g = 0.5, c = 2, j = 1 |
| C5 bin | −0.2 | 93.70 ± 3.03 | 87.88 ± 4.25 | 90.63 ± 2.43 | 0.82 ± 0.04 | 0.97 ± 0.01 | g = 0.5, c = 1, j = 2 |
| N5C5 bin | 0.2 | 90.95 ± 3.18 | 91.13 ± 3.44 | 91.03 ± 2.77 | 0.82 ± 0.05 | 0.97 ± 0.01 | g = 0.05, c = 1, j = 4 |
| N10 bin | −0.2 | 89.38 ± 6.68 | 90.46 ± 4.67 | 90.01 ± 3.26 | 0.79 ± 0.06 | 0.95 ± 0.03 | g = 0.1, c = 2, j = 2 |
| C10 bin | −0.2 | 85.02 ± 8.02 | 85.24 ± 5.15 | 85.19 ± 4.09 | 0.69 ± 0.09 | 0.93 ± 0.03 | g = 0.05, c = 3, j = 1 |
| N10C10 bin | −0.4 | 88.73 ± 5.95 | 92.33 ± 5.69 | 91.04 ± 2.52 | 0.81 ± 0.05 | 0.97 ± 0.02 | g = 0.1, c = 1, j = 1 |
| AAC + motif | −0.1 | 93.11 ± 1.86 | 95.33 ± 3.13 | 94.35 ± 1.67 | 0.89 ± 0.03 | 0.99 ± 0.00 | g = 0.001, c = 6, j = 1 |
| DPC + motif | 0 | 93.28 ± 2.38 | 96.36 ± 1.70 | 95.00 ± 1.25 | 0.90 ± 0.02 | 0.99 ± 0.00 | g = 0.0005, c = 1, j = 2 |
This table shows average performance (mean ± standard deviation) of models on randomly generated independent datasets (bagging)
MCC: Matthews correlation coefficient; AAC: amino acid composition; DPC: dipeptide composition; N5: first 5 residues from the N terminus; C5: first 5 residues from the C terminus; N5C5: first 5 residues from the N and C termini, respectively; bin: binary profile; AAC + motif: amino acid composition with MERCI motif score; DPC + motif: dipeptide composition with MERCI motif score. SVM parameters: g, gamma parameter of the radial basis function; c, trade-off between training error and margin; j, regularization parameter (cost factor by which training errors on positive examples outweigh errors on negative examples)
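The "bin" rows rely on a binary profile of the terminal residues. A minimal sketch is given below, assuming the common encoding in which each residue becomes a 20-element indicator vector, so that N5C5 yields 10 × 20 = 200 values; the exact encoding used by the authors is not stated in this excerpt, and the function names and example sequence are hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def binary_profile(segment):
    """Concatenate one 20-element indicator vector per residue."""
    vec = []
    for aa in segment:
        vec.extend(1 if aa == ref else 0 for ref in AMINO_ACIDS)
    return vec

def n5c5_bin(seq):
    """Binary profile of the first five plus the last five residues."""
    return binary_profile(seq[:5] + seq[-5:])

profile = n5c5_bin("RIVKAGLLRRV")  # hypothetical sequence, for illustration only
```

Unlike the composition features, this encoding preserves residue order, which is why it is applied to fixed-length terminal windows rather than to whole peptides of varying length.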
Fig. 5 Comparison of the support vector machine-based prediction models on the training–testing and the independent datasets. The striped bars correspond to the Matthews correlation coefficient (MCC) values obtained for the models on the training–testing dataset, and the solid line joins the MCC values of the models on the independent dataset. For each model, the MCC values for the training–testing dataset and the independent dataset are comparable, indicating the reliable prediction capabilities of the models
Average performance of the A-cell epitope prediction models on the training and independent datasets
| Features | Threshold | Sensitivity (%) | Specificity (%) | Accuracy (%) | MCC | Parameters |
|---|---|---|---|---|---|---|
| Internal validation: performance on training dataset, evaluated using fivefold cross-validation | ||||||
| DPC | −0.2 | 87.49 ± 1.41 | 98.70 ± 0.16 | 97.68 ± 0.22 | 0.86 ± 0.01 | g: 0.0005, c: 1, j: 4 |
| DPC + Motif | −0.2 | 87.81 ± 1.01 | 99.30 ± 0.10 | 98.25 ± 0.17 | 0.89 ± 0.01 | g: 0.0005, c: 1, j: 4 |
| External validation: performance on independent dataset | ||||||
| DPC | −0.2 | 87.54 ± 4.31 | 98.87 ± 0.28 | 97.84 ± 0.41 | 0.87 ± 0.02 | g: 0.0005, c: 1, j: 4 |
| DPC + Motif | −0.2 | 77.86 ± 5.84 | 99.28 ± 0.30 | 97.33 ± 0.58 | 0.83 ± 0.04 | g: 0.0005, c: 1, j: 4 |
These training and independent datasets were created from the alternate dataset using bagging. In the alternate dataset, the negatives (non-epitopes) were derived from human proteins. The performance values are reported as mean ± standard deviation for each model
MCC: Matthews correlation coefficient; DPC: dipeptide composition; DPC + motif: dipeptide composition with MERCI motif score. SVM parameters: g, gamma parameter of the radial basis function; c, trade-off between training error and margin; j, regularization parameter (cost factor by which training errors on positive examples outweigh errors on negative examples)
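The reported columns follow from thresholding the SVM score and counting the four confusion-matrix cells. The sketch below uses the standard formulas for sensitivity, specificity, accuracy and MCC; treating a score at or above the threshold as a predicted epitope is an assumption, and the scores and labels are invented for illustration.

```python
import math

def confusion(scores, labels, threshold):
    """Count TP, TN, FP, FN; a score >= threshold is called an epitope."""
    tp = tn = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted and not label:
            fp += 1
        elif not predicted and label:
            fn += 1
        else:
            tn += 1
    return tp, tn, fp, fn

def metrics(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy in percent, plus MCC."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, accuracy, mcc

# Toy SVM scores and true labels (1 = epitope), invented for illustration.
scores = [0.9, 0.3, -0.15, -0.5, -0.3, 0.1, -0.8, -1.2]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
sens, spec, acc, mcc = metrics(*confusion(scores, labels, threshold=-0.2))
```

When negatives greatly outnumber positives, accuracy is dominated by specificity, which is consistent with the table above showing accuracy near 98% alongside sensitivity near 87%; MCC is less affected by that imbalance.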