Alvaro Ras-Carmona, Alexander A Lehmann, Paul V Lehmann, Pedro A Reche.
Abstract
Prediction of B cell epitopes that can replace the antigen for antibody production and detection is of great interest for research and the biotech industry. Here, we developed a novel BLAST-based method to predict linear B cell epitopes. To that end, we generated a BLAST-formatted database from a dataset of 62,730 known linear B cell epitope sequences and considered as a B cell epitope any peptide sequence producing ungapped BLAST hits to this database with identity ≥ 80% and length ≥ 8. We evaluated B cell epitope predictions by this method in tenfold cross-validations against various types of non-B cell epitopes, including 62,730 peptide sequences with verified negative B cell assays. With these negative-assay peptides, we obtained accuracy, specificity and sensitivity of 72.54 ± 0.27%, 81.59 ± 0.37% and 63.49 ± 0.43%, respectively. In an independent dataset incorporating 503 B cell epitopes, this method reached accuracy, specificity and sensitivity of 74.85%, 99.20% and 50.50%, respectively, outperforming state-of-the-art methods to predict linear B cell epitopes. We implemented this BLAST-based approach to predict B cell epitopes at http://imath.med.ucm.es/bepiblast.
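The decision rule stated in the abstract (ungapped hit, identity ≥ 80%, length ≥ 8) can be illustrated with a short filter over BLAST tabular output. This is a minimal sketch, not the authors' pipeline: it assumes the standard 12-column tabular format (BLAST+ `-outfmt 6`), treats "ungapped" as zero gap openings, and interprets "length" as the alignment length. The helper names and the example hits are hypothetical.

```python
from typing import Iterable, Iterator, NamedTuple


class Hit(NamedTuple):
    query: str        # query peptide identifier
    subject: str      # identifier of the matched epitope in the database
    identity: float   # percent identity of the alignment
    length: int       # alignment length
    gap_opens: int    # number of gap openings
    bit_score: float


def parse_hit(line: str) -> Hit:
    # Standard tabular columns: qseqid sseqid pident length mismatch gapopen
    # qstart qend sstart send evalue bitscore
    f = line.rstrip("\n").split("\t")
    return Hit(f[0], f[1], float(f[2]), int(f[3]), int(f[5]), float(f[11]))


def epitope_hits(lines: Iterable[str],
                 min_identity: float = 80.0,
                 min_length: int = 8) -> Iterator[Hit]:
    """Yield hits that satisfy the thresholds quoted in the abstract."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        hit = parse_hit(line)
        if (hit.gap_opens == 0
                and hit.identity >= min_identity
                and hit.length >= min_length):
            yield hit


# Hypothetical hits, for illustration only.
example = [
    "pep1\tepi_001\t100.000\t9\t0\t0\t1\t9\t3\t11\t1e-04\t25.0",  # passes
    "pep2\tepi_002\t77.778\t9\t2\t0\t1\t9\t1\t9\t0.5\t15.0",     # identity < 80%
    "pep3\tepi_003\t100.000\t6\t0\t0\t2\t7\t1\t6\t0.8\t14.0",    # length < 8
    "pep4\tepi_004\t87.500\t8\t0\t1\t1\t8\t1\t9\t0.1\t18.0",     # gapped
]
kept = [hit.query for hit in epitope_hits(example)]
print(kept)
```

Under this rule, a query peptide counts as a predicted epitope as soon as at least one of its hits passes the filter.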
Year: 2022 PMID: 35962028 PMCID: PMC9374694 DOI: 10.1038/s41598-022-18021-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Absolute and relative amino acid frequencies in B-cell epitopes. (a) Frequency, in percent (Y axis), of each of the 20 distinct amino acids (X axis) in the B cell epitopes included in BEPIBD. (b) The same amino acid frequencies relative to those in SWISSPROT, represented as log2 values.
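The two quantities plotted in Figure 1 can be sketched as follows. This is an illustrative example only: it assumes the relative value is log2(epitope frequency / background frequency), which the caption does not spell out, and it uses toy peptides and a uniform toy background rather than BEPIBD or SWISSPROT data. The function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_frequencies(peptides):
    """Percent frequency of each of the 20 amino acids over all peptides."""
    counts = Counter(aa for p in peptides for aa in p.upper() if aa in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard amino acids found")
    return {aa: 100.0 * counts[aa] / total for aa in AMINO_ACIDS}


def log2_enrichment(freqs, background):
    """log2 ratio of observed to background frequency (assumed definition).

    Amino acids with zero frequency on either side are skipped, since the
    logarithm is undefined for them.
    """
    return {
        aa: math.log2(freqs[aa] / background[aa])
        for aa in AMINO_ACIDS
        if freqs.get(aa, 0) > 0 and background.get(aa, 0) > 0
    }


# Toy inputs, for illustration only.
toy_peptides = ["AAAA", "AAKK"]
toy_background = {aa: 5.0 for aa in AMINO_ACIDS}  # uniform 5% per residue

freqs = aa_frequencies(toy_peptides)
enrichment = log2_enrichment(freqs, toy_background)
print(freqs["A"], freqs["K"], sorted(enrichment))
```

Positive values indicate residues over-represented in epitopes relative to the background, and negative values indicate under-represented ones.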
Performance of BLAST-based discrimination of B and non-B cell epitopes.
| Negative dataset | % SE | % SP | % ACC | MCC |
|---|---|---|---|---|
| RANDPEP | 63.49 ± 0.43 | 99.15 ± 0.15 | 81.32 ± 0.20 | 0.67 ± 0.01 |
| IEDBNB | 63.49 ± 0.43 | 81.59 ± 0.37 | 72.54 ± 0.27 | 0.46 ± 0.01 |
Table reports the sensitivity (% SE), specificity (% SP), accuracy (% ACC), and Matthews correlation coefficient (MCC) of BLAST-based discrimination of B cell epitopes in BEPIBD from non-B cell epitopes included in the RANDPEP and IEDBNB datasets. Values were obtained in tenfold cross-validation experiments.
Comparative performance of B cell epitope prediction methods.
| Negative dataset | Method/tool | % SE | % SP | % ACC | MCC |
|---|---|---|---|---|---|
| IRPEP | BLAST | 50.50 | 99.20 | 74.85 | 0.57 |
| IRPEP | BepiPred | 37.60 | 65.01 | 51.35 | 0.03 |
| IRPEP | LBtope | 42.21 | 76.34 | 59.32 | 0.20 |
| IRPEP | IBCE-EL | 77.80 | 14.91 | 46.26 | −0.09 |
| INB | BLAST | 50.50 | 88.47 | 69.48 | 0.42 |
| INB | BepiPred | 37.60 | 66.60 | 52.14 | 0.04 |
| INB | LBtope | 42.41 | 77.73 | 60.02 | 0.20 |
| INB | IBCE-EL | 77.80 | 82.11 | 79.96 | 0.60 |
Table reports the sensitivity (% SE), specificity (% SP), accuracy (% ACC) and Matthews correlation coefficient (MCC) of the BLAST-based method, BepiPred, LBtope and IBCE-EL in discriminating B cell epitopes in the BECIP dataset from non-B cell epitopes in two different datasets (IRPEP and INB). B cell epitope predictions with LBtope and IBCE-EL were carried out at the relevant web sites, and BepiPred predictions were carried out using the standalone version of BepiPred (details in “Methods”).
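The four measures reported in these tables follow from the confusion matrix in the usual way. The sketch below uses the standard textbook definitions and is not taken from the authors' code. The counts in the example are an assumption: they suppose a negative set of the same size as the 503 epitopes (which this excerpt does not state) and were chosen so that the output matches the BLAST/IRPEP row.

```python
import math


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Sensitivity, specificity, accuracy (in %) and MCC from raw counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "SE": 100.0 * sensitivity,
        "SP": 100.0 * specificity,
        "ACC": 100.0 * accuracy,
        "MCC": mcc,
    }


# Assumed counts: 503 epitopes and, hypothetically, 503 non-epitopes.
metrics = classification_metrics(tp=254, fn=249, tn=499, fp=4)
print({name: round(value, 2) for name, value in metrics.items()})
```

With equally sized positive and negative sets, accuracy is simply the mean of sensitivity and specificity, which is consistent with the values shown in the tables.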
Figure 2. BepiBlast web server. (a) BepiBlast interface. (b) Representative BepiBlast output obtained with default settings. The results shown were obtained for hemagglutinin from Influenza A virus (UniProt Id: P03437). The main BepiBlast result is a table displaying the following information (from left to right): peptide starting position; peptide ending position; predicted B cell epitope; bit score; accessibility value; and flexibility value.
Comparison of available web-based tools for predicting linear B cell epitopes.
| Tool | Algorithm | Training B cell epitopes | Training non-B cell epitopes | Validation | URL | Reference |
|---|---|---|---|---|---|---|
| BepiBlast | BLAST | 62,730 | – | X, I | | – |
| Bceps | Support vector machine | 555 | 555 (a) | X, I, E | | [ |
| BepiPred 2.0^a | Random forest | 3542 | 36,785 | X, I, E | | [ |
| LBtope^b | Support vector machine | 14,876 | 23,321 (b) | X, I | | [ |
| IBCE-EL | Random tree with boosting | 4440 | 5485 (b) | X, I | | [ |
| DLBEpitope | Deep neural network | 22,012 | 201,563 (b) | X, I | | [ |
| ILBE | Random forest | 4440 | 5485 (b) | X, I | | [ |
| ABCPred | Neural network | 700 | 700 (a) | X, I | | [ |
| BCPREDS | Support vector machine | 701 | 701 (a) | X, I, E | | [ |
| SVMtrip | Support vector machine | 4925 | 4925 (b) | X | | [ |
For each tool, the table reports the underlying algorithm; the number of B and non-B cell epitopes used for model building; the method used for validation (X: cross-validation; I: independent dataset; E: case example); the URL of the tool; and the reference. The letter in parentheses indicates the type of non-B cell epitopes in the training dataset: (a) random peptide sequences; (b) peptide sequences with reported negative B cell epitope assays. ^a For BepiPred, B and non-B cell epitope figures correspond to antigen residues that, in the tertiary structure of antibody-antigen complexes, do or do not contact the antibody, respectively. ^b Data for the default model in LBtope.