Ying-Tsang Lo, Tun-Wen Pai, Wei-Kuo Wu, Hao-Teng Chang.
Abstract
BACKGROUND: A conformational epitope (CE) in an antigenic protein is composed of amino acid residues that are spatially near each other on the antigen's surface but are separated in sequence; CEs bind their complementary paratopes in B-cell receptors and/or antibodies. CE prediction is used during vaccine design and in immuno-biological experiments. Here, we develop a novel system, CE-KEG, which predicts CEs based on knowledge-based energy and geometrical neighboring residue contents. The workflow applied grid-based mathematical morphological algorithms to efficiently detect the surface atoms of the antigens. After extracting surface residues, we ranked CE candidate residues first according to their local average energy distributions. Then, the frequencies at which geometrically related neighboring residue combinations occurred in the potential CEs were incorporated into our workflow, and weighted combinations of the average energies and neighboring residue frequencies were used to assess the sensitivity, accuracy, and efficiency of our prediction workflow.

RESULTS: To test our workflow, we prepared a database containing 247 antigen structures and a second database containing the 163 non-redundant antigen structures from the first database. Our predictive workflow performed better than algorithms reported in the literature in terms of accuracy and efficiency. For the non-redundant dataset, our workflow achieved an average of 47.8% sensitivity, 84.3% specificity, and 80.7% accuracy under 10-fold cross-validation, with performance evaluated on the top three predicted CE candidates for each antigen.

CONCLUSIONS: Our method combines an energy profile for surface residues with the frequency at which each geometrically related amino acid residue pair occurs to identify possible CEs in antigens. This combination of features facilitates improved identification for immuno-biological studies and synthetic vaccine design.
CE-KEG is available at http://cekeg.cs.ntou.edu.tw.
Year: 2013 PMID: 23514199 PMCID: PMC3599093 DOI: 10.1186/1471-2105-14-S4-S3
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. CE prediction workflow.
Figure 2. A cartoon of protein surface representation.
Figure 3. 3D morphology operations used for surface rate calculations. Shown in the figure are the original, dilated, and eroded structures, the difference between the dilated and eroded structures, and the final atomic surface region.
Figure 4. The distribution of surface rates for residues in known CE epitopes and all surface residues in the antigen dataset.
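
The morphological operations named in the Figure 3 caption can be illustrated on a small voxel grid. The following is a minimal sketch, not CE-KEG's implementation: it assumes a 6-connected structuring element, stores the occupied grid as a set of integer voxel coordinates, and takes the final surface to be the part of the dilated-minus-eroded shell that lies inside the original structure. The function names `dilate`, `erode`, and `surface_shell` are hypothetical.

```python
from itertools import product

# ASSUMPTION: 6-connected structuring element plus the origin.
OFFSETS = [(0, 0, 0), (1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def dilate(voxels):
    """Add every voxel reachable from an occupied voxel by one offset."""
    return {(x + dx, y + dy, z + dz) for (x, y, z) in voxels for (dx, dy, dz) in OFFSETS}


def erode(voxels):
    """Keep only voxels whose whole neighborhood is occupied."""
    return {
        (x, y, z)
        for (x, y, z) in voxels
        if all((x + dx, y + dy, z + dz) in voxels for (dx, dy, dz) in OFFSETS)
    }


def surface_shell(voxels):
    """Difference between the dilated and eroded structures."""
    return dilate(voxels) - erode(voxels)


# Toy structure: a solid 3 x 3 x 3 cube of occupied voxels.
cube = set(product(range(3), repeat=3))
shell = surface_shell(cube)
# ASSUMPTION: the surface region is the shell restricted to the original voxels.
surface_voxels = shell & cube
```

For the toy cube, only the central voxel survives erosion, so every other voxel of the cube is reported as surface.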
Variables used in the statistical analysis of geometrically related amino acid pairs (GAAP).
| Variables | Description |
|---|---|
|  | The number of times a geometrically related amino acid pair occurs in the known CE epitope dataset. |
|  | The number of times a geometrically related amino acid pair occurs in the non-CE epitope dataset. |
|  | The frequency (%) that a geometrically related amino acid pair occurs in the known CE epitope dataset. |
|  | The frequency (%) that a geometrically related amino acid pair occurs in the non-CE epitope dataset. |
|  | The total number of times that all geometrical amino acid pairs occur in the known CE epitope dataset. |
|  | The total number of times that all geometrical amino acid pairs occur in the non-CE epitope dataset. |
|  | CEI for a geometrically related amino acid pair. |
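
The counts, totals, and percentage frequencies described in the table can be computed as in the sketch below. It is illustrative only: the pair lists are made-up toy data, the helper names are hypothetical, and pairs are treated as unordered. The table does not give the CEI formula, so `log_ratio_index` is an assumed stand-in (a log10 ratio of the two frequencies), not the definition used by CE-KEG.

```python
import math
from collections import Counter


def pair_frequencies(pairs):
    """Return (counts, total, frequency in %) for unordered residue pairs."""
    counts = Counter(tuple(sorted(p)) for p in pairs)
    total = sum(counts.values())
    freqs = {pair: 100.0 * n / total for pair, n in counts.items()}
    return counts, total, freqs


def log_ratio_index(freq_ce, freq_non_ce):
    """ASSUMED index: log10 ratio of CE to non-CE frequency (not from the source)."""
    return math.log10(freq_ce / freq_non_ce)


# Made-up toy data, not taken from the paper's datasets.
ce_pairs = [("K", "D"), ("D", "K"), ("K", "E"), ("G", "S")]
non_ce_pairs = [("K", "D"), ("L", "V"), ("L", "V"), ("A", "L"),
                ("G", "S"), ("L", "I"), ("V", "I"), ("A", "V")]

ce_counts, ce_total, ce_freq = pair_frequencies(ce_pairs)
non_counts, non_total, non_freq = pair_frequencies(non_ce_pairs)

# A positive value means the pair is relatively more frequent in the CE set.
index_dk = log_ratio_index(ce_freq[("D", "K")], non_freq[("D", "K")])
```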
Figure 5. Example of predicted CE clusters and true CE. (A) Protein surface of the KvAP potassium channel membrane protein (PDB ID: 1ORS:C). (B) Surface seed residues with energies within the top 20%. (C) Top three predicted CEs for 1ORS:C. Predicted CEs were obtained by filtering, region growing, and CE cluster ranking. The filtering step removed neighboring residues located within 12 Å of each energy-ranked seed. Region growing formed each CE cluster by extending the filtered seed residues to neighboring residues within a 10 Å radius. CE clusters were ranked by a weighted combination of CEI and energy scores. (D) Experimentally determined CE residues.
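
The seed selection, filtering, and region-growing steps in the Figure 5 caption can be sketched as follows. This is one reading of the caption, not the published code: it assumes that a higher `score` means a better-ranked seed, that filtering drops a seed lying within 12 Å of a better-ranked seed already kept, and that each residue is represented by a single coordinate. The dictionary layout and function names are hypothetical, and the coordinates are toy values.

```python
import math


def select_seeds(residues, top_fraction=0.20):
    """Return the residues in the top fraction by score, best first."""
    ranked = sorted(residues, key=lambda r: r["score"], reverse=True)
    keep = max(1, round(len(ranked) * top_fraction))
    return ranked[:keep]


def filter_seeds(seeds, min_separation=12.0):
    """Drop any seed within `min_separation` angstroms of a better-ranked kept seed."""
    kept = []
    for seed in seeds:
        if all(math.dist(seed["xyz"], k["xyz"]) > min_separation for k in kept):
            kept.append(seed)
    return kept


def grow_region(seed, residues, radius=10.0):
    """Names of all residues within `radius` angstroms of the seed (seed included)."""
    return sorted(r["name"] for r in residues if math.dist(seed["xyz"], r["xyz"]) <= radius)


# Toy surface residues placed along one axis (coordinates in angstroms).
residues = [
    {"name": "R1", "xyz": (0.0, 0.0, 0.0), "score": 0.9},
    {"name": "R2", "xyz": (5.0, 0.0, 0.0), "score": 0.8},
    {"name": "R3", "xyz": (20.0, 0.0, 0.0), "score": 0.7},
    {"name": "R4", "xyz": (26.0, 0.0, 0.0), "score": 0.2},
    {"name": "R5", "xyz": (40.0, 0.0, 0.0), "score": 0.1},
]

# A larger fraction than 20% is used here only because the toy set is tiny.
seeds = select_seeds(residues, top_fraction=0.6)
kept = filter_seeds(seeds)
clusters = [grow_region(seed, residues) for seed in kept]
```

In this toy case R2 is discarded as a seed because it lies 5 Å from the better-ranked R1, but it is still picked up again when the R1 region is grown.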
Average performance of CE-KEG using the average energy function of local neighboring residues.
| Weighting Combinations | SE | SP | PPV | ACC |
|---|---|---|---|---|
| 0%EG + 100% GAAP | 0.478 | 0.831 | 0.266 | 0.796 |
| 10%EG + 90% GAAP | 0.490 | 0.831 | 0.273 | 0.797 |
| 20%EG + 80% GAAP | 0.492 | 0.831 | 0.275 | 0.797 |
| 30%EG + 70% GAAP | 0.497 | 0.831 | 0.277 | 0.798 |
| 40%EG + 60% GAAP | 0.493 | 0.832 | 0.280 | 0.799 |
| 50%EG + 50% GAAP | 0.503 | 0.834 | 0.284 | 0.801 |
| 60%EG + 40% GAAP | 0.504 | 0.834 | 0.284 | 0.801 |
| 70%EG + 30% GAAP | 0.519 | 0.839 | 0.294 | 0.808 |
| 80%EG + 20% GAAP |  | 0.840 | 0.300 | 0.811 |
| 90%EG + 10% GAAP | 0.521 | 0.839 | 0.294 | 0.809 |
| 100%EG + 0% GAAP | 0.496 | 0.837 | 0.279 | 0.805 |
Performance was measured for combinations of weighting coefficients for the average energy (EG) and the frequency of geometrically related pairs of predicted CE residues (GAAP) within an 8-Å radius sphere. The highest SE is denoted by a bold-italic face.
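
The weighting scheme in the table above can be read as a convex combination of two per-cluster scores, swept from 0% to 100% EG in steps of 10%. The sketch below illustrates that idea under stated assumptions: both scores are taken to be already normalized to comparable scales with higher meaning better, the candidate values are made up, and `combined_score` and `rank_clusters` are hypothetical names rather than CE-KEG functions.

```python
def combined_score(eg_score, gaap_score, eg_weight):
    """Weighted sum of an energy score (EG) and a pair-frequency score (GAAP)."""
    if not 0.0 <= eg_weight <= 1.0:
        raise ValueError("eg_weight must lie in [0, 1]")
    return eg_weight * eg_score + (1.0 - eg_weight) * gaap_score


def rank_clusters(clusters, eg_weight, top_n=3):
    """Return the names of the top_n clusters by combined score, best first."""
    ranked = sorted(
        clusters,
        key=lambda c: combined_score(c["eg"], c["gaap"], eg_weight),
        reverse=True,
    )
    return [c["name"] for c in ranked[:top_n]]


# Made-up candidate clusters with ASSUMED normalized scores.
clusters = [
    {"name": "A", "eg": 0.9, "gaap": 0.1},
    {"name": "B", "eg": 0.4, "gaap": 0.8},
    {"name": "C", "eg": 0.6, "gaap": 0.5},
    {"name": "D", "eg": 0.2, "gaap": 0.3},
]

# Sweep the EG weight from 0% to 100% in 10% steps, as in the table rows.
weights = [i / 10 for i in range(11)]
rankings = {w: rank_clusters(clusters, w) for w in weights}
```

Keeping only the top three clusters mirrors the evaluation setting stated in the abstract, where the top three predicted CE candidates are reported for each antigen.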
Average performance of CE-KEG using the energy function of a single residue.
| Weighting Combinations | SE | SP | PPV | ACC |
|---|---|---|---|---|
| 0%EG + 100% GAAP | 0.478 | 0.831 | 0.266 | 0.796 |
| 10%EG + 90% GAAP | 0.463 | 0.827 | 0.260 | 0.790 |
| 20%EG + 80% GAAP | 0.473 | 0.827 | 0.265 | 0.791 |
| 30%EG + 70% GAAP | 0.476 | 0.828 | 0.268 | 0.792 |
| 40%EG + 60% GAAP | 0.483 | 0.832 | 0.275 | 0.796 |
| 50%EG + 50% GAAP | 0.466 | 0.831 | 0.273 | 0.795 |
| 60%EG + 40% GAAP | 0.476 | 0.833 | 0.280 | 0.797 |
| 70%EG + 30% GAAP |  | 0.832 | 0.281 | 0.797 |
| 80%EG + 20% GAAP | 0.480 | 0.830 | 0.278 | 0.796 |
| 90%EG + 10% GAAP | 0.481 | 0.831 | 0.275 | 0.797 |
| 100%EG + 0% GAAP | 0.463 | 0.830 | 0.265 | 0.795 |
Performance was measured for combinations of weighting coefficients for the energy (EG) of individual residues and the frequency of occurrence of geometrically related pairs (GAAP). The highest SE is denoted by a bold-italic face.
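
The column headings SE, SP, PPV, and ACC are read here as sensitivity, specificity, positive predictive value, and accuracy. A minimal sketch of their standard definitions follows, assuming residue-level counting against a set of experimentally determined CE residues; the residue sets are toy values and the function names are hypothetical.

```python
def confusion_counts(predicted, actual, all_residues):
    """Count true/false positives and negatives at the residue level."""
    tp = len(predicted & actual)
    fp = len(predicted - actual)
    fn = len(actual - predicted)
    tn = len(all_residues - predicted - actual)
    return tp, fp, fn, tn


def metrics(tp, fp, fn, tn):
    """Standard definitions of the four reported measures."""
    return {
        "SE": tp / (tp + fn),                    # sensitivity
        "SP": tn / (tn + fp),                    # specificity
        "PPV": tp / (tp + fp),                   # positive predictive value
        "ACC": (tp + tn) / (tp + fp + fn + tn),  # accuracy
    }


# Toy example: 20 surface residues, 4 true CE residues, 5 predicted.
all_residues = set(range(1, 21))
actual = {1, 2, 3, 4}
predicted = {3, 4, 5, 6, 7}

tp, fp, fn, tn = confusion_counts(predicted, actual, all_residues)
result = metrics(tp, fp, fn, tn)
```

Because true CE residues are a small fraction of the surface, a predictor can reach high SP and ACC while SE and PPV stay modest, which is the pattern seen in both tables.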