Katherine Li, Connor Lowey, Paul Sandstrom, Hezhao Ji.
Abstract
In silico methods for immune epitope prediction have become essential for vaccine and therapeutic design, but manual intra-species comparison of putative epitopes remains challenging and subject to human error. Created initially for analyzing SARS-CoV-2 variants of concern, comparative analysis of variant epitope sequences (CAVES) is a novel tool designed to carry out rapid comparative analyses of epitopes amongst closely related pathogens, substantially reducing the required time and user workload. CAVES applies a two-level analysis approach. The Level-one (L1) analysis compares two epitope prediction files, and the Level-two (L2) analysis incorporates search results from the IEDB database of experimentally confirmed epitopes. Both L1 and L2 analyses sort epitopes into categories of exact matches, partial matches, or novel epitopes based on the degree to which they match with peptides from the compared file. Furthermore, CAVES uses positional sequence data to improve its accuracy and speed, taking only a fraction of the time required by manual analyses and minimizing human error. CAVES is widely applicable for evolutionary analyses and antigenic comparisons of any closely related pathogen species. CAVES is open-source software that runs through a graphical user interface on Windows operating systems, making it widely accessible regardless of coding expertise. The CAVES source code and test dataset presented here are publicly available on the CAVES GitHub page.
Keywords: antigenic variation; bioinformatics; comparative genomics; computational biology; epitope prediction; evolution; sequence analysis
Year: 2022 PMID: 35746624 PMCID: PMC9227564 DOI: 10.3390/v14061152
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.818
Figure 1. CAVES user workflow. CAVES performs comparative analyses between epitopes from two closely related sequences (sequences A and B). Target sequences are used to generate epitope predictions and database search results from the Immune Epitope Database (IEDB) resources and for multiple sequence alignment. The resulting files are input directly into the CAVES user interface for multi-level comparisons, and CAVES sorted results are output in xlsx format.
Figure 2. CAVES graphical user interface (GUI). CAVES opens as a single page and is compatible with Windows OS. The required fields are the epitope prediction files, database search files, and multiple sequence alignment files under the Input File Paths header. The Minimum Peptide Length, Level Selection, and Results File fields are optional parameters set to defaults unless otherwise specified.
Figure 3. CAVES matching criteria. (a) An exact match occurs when two epitopes have identical amino acid characters over the entire length of at least one of the two epitopes being compared. (b) A partial match occurs when two epitopes share some identical amino acid characters but not enough to cover the full length of either epitope. (c) Novel epitopes occur either when two epitopes would form an exact or partial match but contain a mutation (substitution, insertion, or deletion) in at least one position, or when an epitope has no match in the opposing file.
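The matching criteria in Figure 3 can be illustrated with a minimal Python sketch. This is not the CAVES source code: the `Epitope` class, the `classify_pair` function, and the example peptides are hypothetical, and the sketch assumes that both epitopes carry start positions in one shared, ungapped coordinate system, so only substitutions (not insertions or deletions) are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Epitope:
    seq: str    # amino acid sequence of the predicted epitope
    start: int  # 1-based start position in shared coordinates (assumption)

    @property
    def end(self) -> int:
        return self.start + len(self.seq) - 1


def classify_pair(a: Epitope, b: Epitope) -> str:
    """Classify two epitopes as 'exact', 'partial', 'novel', or 'none'."""
    lo = max(a.start, b.start)
    hi = min(a.end, b.end)
    if lo > hi:
        return "none"  # no positional overlap, so no match between this pair
    # Residues of each epitope that fall inside the overlapping positions.
    sub_a = a.seq[lo - a.start: hi - a.start + 1]
    sub_b = b.seq[lo - b.start: hi - b.start + 1]
    if sub_a != sub_b:
        return "novel"  # overlapping positions differ: a substitution
    overlap = hi - lo + 1
    if overlap == len(a.seq) or overlap == len(b.seq):
        return "exact"  # identical over the full length of at least one epitope
    return "partial"    # identical overlap that covers neither epitope fully


# Hypothetical peptides and positions, for illustration only.
print(classify_pair(Epitope("YNYLYRLF", 11), Epitope("YNYLYRLFRK", 11)))
print(classify_pair(Epitope("NYNYLYRLF", 10), Epitope("YNYLYRLFRK", 11)))
print(classify_pair(Epitope("YNYLYRLF", 11), Epitope("YNYLYKLF", 11)))
```

Under these assumptions the three calls fall into the exact, partial, and novel cases, respectively. An epitope whose every comparison returns `"none"` would also count as novel under criterion (c).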
Figure 4. CAVES two-level comparison approach. (a) CAVES Level-one (L1) compares two files of epitope predictions, sorting them into categories of exact matches (L1E), partial matches (L1P), and novel epitopes (L1N). (b) CAVES Level-two (L2) compares the epitopes from the three resulting L1 categories to files of database search results, sorting each L1 category into an additional triplet of exact (L2E), partial (L2P), and novel (L2N) categories.
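The nesting of the two levels can be sketched as follows. This is only an illustration under stated assumptions, not the CAVES algorithm: `toy_compare` is a hypothetical, position-free stand-in for the real pairwise comparison (substring means exact, a shared 4-mer means partial), and letting an exact match take precedence over a partial one is a simplification, since CAVES reports matching pairs and may list one epitope under several matches.

```python
from collections import defaultdict


def toy_compare(a: str, b: str) -> str:
    """Hypothetical stand-in rule: substring = exact, shared 4-mer = partial."""
    if a in b or b in a:
        return "exact"
    kmers = {a[i:i + 4] for i in range(len(a) - 3)}
    if any(b[i:i + 4] in kmers for i in range(len(b) - 3)):
        return "partial"
    return "none"


def level(query: str, targets: list[str]) -> str:
    """Return 'E', 'P', or 'N' for one epitope against a whole file."""
    outcomes = {toy_compare(query, t) for t in targets}
    if "exact" in outcomes:
        return "E"
    if "partial" in outcomes:
        return "P"
    return "N"


def two_level(queries: list[str], other_file: list[str],
              database: list[str]) -> dict[str, list[str]]:
    """Sort each query into one of the nine L1x_L2y categories."""
    buckets = defaultdict(list)
    for ep in queries:
        l1 = level(ep, other_file)  # Level-one: prediction file vs. prediction file
        l2 = level(ep, database)    # Level-two: same epitope vs. database results
        buckets[f"L1{l1}_L2{l2}"].append(ep)
    return dict(buckets)


# Hypothetical peptides, for illustration only.
preds_a = ["YNYLYRLF", "QSYGFQPT", "AAAAKKKK"]
preds_b = ["NYNYLYRLFR", "GFQPTYGV", "DDDDEEEE"]
database = ["YNYLYRLF", "SYGFQPTN"]
result = two_level(preds_a, preds_b, database)
print(result)
```

The category keys follow the naming used in the results table (for example `L1E_L2P`). Running the same function with the two prediction files swapped would sort the epitopes derived from the second sequence.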
Number of epitopes in each category as determined by CAVES comparison.
| CAVES Results | Number of Epitopes b | Number Derived from Reference Sequence | Number Derived from Alpha VOC Sequence |
|---|---|---|---|
| L1E | 93 | | |
| L1P | 159 | | |
| L1N | 25 | 12 | 13 |
| L1E_L2E | 82 | 42 | 40 |
| L1E_L2P | 444 | 229 | 215 |
| L1E_L2N | 15 | 7 | 8 |
| L1P_L2E | 85 | 44 | 41 |
| L1P_L2P | 462 | 229 | 233 |
| L1P_L2N | 16 | 7 | 9 |
| L1N_L2E | 6 | 6 | 0 |
| L1N_L2P | 49 | 28 | 21 |
| L1N_L2N | 3 | 0 | 3 |
CAVES result categories are named as follows: L1/L2, Level-one or Level-two analysis; E, exact matches; P, partial matches; N, novel epitopes. b For the exact and partial match categories, counts refer to pairs of matching epitopes, including duplicate epitopes that found multiple matches; for the novel categories, counts refer to individual unique epitopes.
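As a quick consistency check on the table, the short Python snippet below confirms that each reported total equals the sum of the counts derived from the reference and Alpha VOC sequences. The values are copied from the table; L1E and L1P are omitted because no per-sequence counts are given for them. The variable names are illustrative only.

```python
# (total, derived from reference sequence, derived from Alpha VOC sequence)
rows = {
    "L1N": (25, 12, 13),
    "L1E_L2E": (82, 42, 40),
    "L1E_L2P": (444, 229, 215),
    "L1E_L2N": (15, 7, 8),
    "L1P_L2E": (85, 44, 41),
    "L1P_L2P": (462, 229, 233),
    "L1P_L2N": (16, 7, 9),
    "L1N_L2E": (6, 6, 0),
    "L1N_L2P": (49, 28, 21),
    "L1N_L2N": (3, 0, 3),
}

# Collect any category whose total is not the sum of its two sources.
mismatches = [name for name, (total, ref, alpha) in rows.items()
              if total != ref + alpha]
print(mismatches)  # prints [] because every row is internally consistent
```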