Q fever is a globally prevalent zoonotic disease caused by Coxiella burnetii, a gram-negative, obligate intracellular bacterium identified by Harold Cox and MacFarlane Burnet in 1935 [1, 2]. This bacterium belongs to the Coxiellaceae family and is capable of intravacuolar persistence in both invertebrate and vertebrate hosts. C. burnetii tolerates a variety of harsh conditions, including acidity down to about pH 4.5, temperatures of up to 62°C for 30 min, UV irradiation, and pressures of up to 300 000 kPa [3, 4]. Domestic animals such as cattle, sheep, and goats are the main reservoirs of C. burnetii. The infection may be transmitted to humans through infected arthropods such as ticks and mosquitoes, through direct contact with infected animals, or through the consumption of meat and other food products from infected animals [3, 5]. Symptoms of Q fever in humans initially resemble influenza but may later lead to secondary chronic conditions such as hepatitis, acute endocarditis, vasculitis, and lymphadenitis [6]. So far, this bacterial zoonosis has caused three major outbreaks. In 1955, the first cases of Q fever were reported in nine African countries. Between 2007 and 2010, the Netherlands faced a large wave of Q fever infections. The largest zoonotic outbreak of Q fever took place in Cayenne, the capital of French Guiana [6, 7].
As mentioned above, Q fever may be transmitted to humans or other animals by direct contact or through infected dairy products. However, horizontal (human-to-human) transmission of this disease has not been reported [1]. Given the major outbreaks reported, it is important to reduce the infection rate and to develop vaccine-based prevention. So far, only one vaccine against Q fever, Q-VAX, has been available; it is a whole-cell formalin-inactivated preparation of the phase I Henzerling strain of C. burnetii [8].
Recent advances in genomic sequencing have improved our understanding of microorganisms [9, 10] and have led to the creation of organism-specific protein and nucleotide sequence databases and of servers for epitope prediction [11, 12]. B cell epitopes and T cell epitopes have been predicted separately [13, 14]. In the present reverse vaccinology study, several databases and servers were used to design immunogenic peptides/proteins as vaccine candidates against C. burnetii.
EXPERIMENTAL
Epitope data collection. Candidate epitopes were retrieved from the IEDB database (https://www.iedb.org/), and surface-protein epitopes were chosen from among them. The search filters were adjusted for a human vaccine against Coxiella burnetii (strain RSA 493) and included all linear and structural epitopes. The search returned about 62 epitopes, from which the surface protein epitopes were selected. In addition, the primary amino acid sequence of the Vibrio cholerae toxin B subunit was obtained from NCBI (https://www.ncbi.nlm.nih.gov/protein/) under accession number AAV67882.1. The three-dimensional structures of MHC I and MHC II were retrieved from the Protein Data Bank (https://www.rcsb.org/) in PDB format under entries 4UQ3 and 1DLH, respectively.
Multiple sequence alignment and antigen selection. The complete sequence of each target surface protein was retrieved from the UniProt database (https://www.uniprot.org/) in FASTA format [15]. The sequences were then compared with human proteins using NCBI BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi), and the results confirmed that there were no significant similarities between them.
Antigenicity of predicted epitopes. The antigenicity of both B cell and T cell epitopes was predicted using the VaxiJen 2.0 server (http://www.ddg-pharmfac.net/vaxijen/VaxiJen/), which has a reported prediction accuracy of 70 to 89% [16].
Allergenicity and toxicity prediction. Allergenicity and toxicity are crucial properties in peptide-based vaccines, because some candidate epitopes are allergenic or toxic and may cause cross-reactions by the immune system. Each peptide was checked for allergenicity using the AllerTOP server (https://www.ddg-pharmfac.net/AllerTOP/) [17], and the toxicity of the epitopes was analyzed using the ToxinPred server (https://webs.iiitd.edu.in/raghava/toxinpred/algo.php).
Peptide design for the chimeric protein. The candidate epitopes were arranged in a single peptide sequence and joined by rigid KK linkers, as shown in Fig. 1. Next, the adjuvant sequence of the Vibrio cholerae toxin B subunit (AAV67882.1) was added to the beginning and the end of the sequence to improve the immunogenicity of the designed peptide. PAPAP linkers were used to connect the adjuvant sequences to the epitope sequence.
Fig. 1.
The designed peptide includes sequences of adjuvant and linkers. Adjuvant sequence is colored in blue, epitopes are in black, KK-linkers that connect the epitopes are in green, and PAPAP-linkers that are attached to adjuvant sequence are in red.
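The layout described for the chimeric peptide (adjuvant, PAPAP linker, KK-linked epitopes, PAPAP linker, adjuvant) can be sketched in a few lines of Python. This is a minimal illustration only: the function name is hypothetical, the adjuvant is a placeholder string rather than the real CTxB sequence (AAV67882.1), and the two epitopes are taken from Table 1 purely to show the joining order, not as the final selection.

```python
# Linkers named in the text.
EPITOPE_LINKER = "KK"       # joins neighbouring epitopes
ADJUVANT_LINKER = "PAPAP"   # joins each adjuvant copy to the epitope block


def assemble_construct(adjuvant: str, epitopes: list[str]) -> str:
    """Return adjuvant-PAPAP-(epitopes joined by KK)-PAPAP-adjuvant."""
    if not epitopes:
        raise ValueError("at least one epitope is required")
    epitope_block = EPITOPE_LINKER.join(epitopes)
    return ADJUVANT_LINKER.join([adjuvant, epitope_block, adjuvant])


# Placeholder only (assumption): stands in for the CTxB sequence AAV67882.1.
PLACEHOLDER_ADJUVANT = "X" * 8
# Two Table 1 epitopes, used here only to illustrate the layout.
example_epitopes = ["SEQITLQTAEKVGLNVA", "TPTFVIGNKALTKFGF"]

construct = assemble_construct(PLACEHOLDER_ADJUVANT, example_epitopes)
print(construct)
print(len(construct))
```

Replacing the placeholder with the actual adjuvant sequence and passing the full list of selected epitopes would give a sequence with the same segment order as Fig. 1.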
Physicochemical properties and stability of the designed protein. To study physicochemical properties such as molecular weight, net charge, and half-life, the protein sequence was analyzed with the PepCalc (https://pepcalc.com) [18] and ProtParam (https://web.expasy.org/cgi-bin/protparam/protparam) servers. Protein stability was assessed using the IUPred 2.0 (https://iupred2a.elte.hu/) [19], IsUnstruct (v2.02) (http://bioinfo.protres.ru/IsUnstruct/) [20], and FoldUnfold (http://bioinfo.protres.ru/ogu/) [21] servers.
Secondary and tertiary structure prediction. The secondary structure of the designed peptide was predicted on the PRABI server using the GOR IV method (https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_gor4.html). The three-dimensional structure of the protein was predicted using the I-TASSER server (https://zhanglab.ccmb.med.umich.edu/I-TASSER/) [22]. This server predicted the three-dimensional structure of the protein using de novo modeling. I-TASSER is a hierarchical approach to protein structure and function prediction based on the level of similarity between the input and the template structures available in the PDB.
Model refinement and quality assessment. The predicted model was refined using the 3Drefine server (http://sysbio.rnet.missouri.edu/3Drefine/) to reduce possible structural errors in the predicted tertiary structure of the vaccine protein. The geometric quality of the vaccine model was validated with a Ramachandran plot using the RAMPAGE server (https://servicesn.mbi.ucla.edu/SAVES/) and with the Z-score from the ProSA server (https://prosa.services.came.sbg.ac.at/prosa.php) [23].
Molecular docking study. Molecular docking was performed to evaluate the binding affinity of the designed protein sequence to MHC I (HLA-A0201; PDB entry 4UQ3) and MHC II (HLA-DR; PDB entry 1DLH). Docking was carried out using the ClusPro online server (https://cluspro.bu.edu/login.php?redir=/queue.php) with the default complex type [24‒26]. The docking was then repeated using the HEX 6.0 software with the following parameters: FFT mode, 3D fast lite; distance range, 40; twist range, 360; correlation type, shape only; grid dimension, 0.6; receptor range, 180; and ligand range, 180.
Peptide reverse translation and ORF checking. Finally, the peptide sequence was reverse translated from amino acid sequence to nucleotide sequence using the Sequence Manipulation Suite reverse-translation tool (https://www.bioinformatics.org/sms2/rev_trans.html). The open reading frame (ORF) of the resulting sequence was checked, with the default Escherichia coli settings, in ORFfinder (https://www.ncbi.nlm.nih.gov/orffinder).
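The reverse-translation and ORF-checking step can be illustrated with a small Python sketch. It is not the algorithm of the web tools cited above: the one-codon-per-residue table is an illustrative assumption (codons commonly described as frequent in E. coli), the added start methionine is also an assumption, and all function names are hypothetical.

```python
# One codon per amino acid; the specific choices are an illustrative assumption.
CODON_FOR = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
STOP_CODONS = {"TAA", "TAG", "TGA"}
AMINO_ACID_FOR = {codon: aa for aa, codon in CODON_FOR.items()}


def split_codons(dna: str) -> list[str]:
    """Split a DNA string into consecutive triplets."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def reverse_translate(protein: str, stop: str = "TAA") -> str:
    """Return a coding sequence for `protein`, terminated by a stop codon."""
    return "".join(CODON_FOR[aa] for aa in protein.upper()) + stop


def is_single_orf(dna: str) -> bool:
    """True if dna starts with ATG, is in frame, and its only stop codon is the last codon."""
    if len(dna) % 3 != 0 or not dna.startswith("ATG"):
        return False
    codons = split_codons(dna)
    return codons[-1] in STOP_CODONS and not any(c in STOP_CODONS for c in codons[:-1])


def translate(dna: str) -> str:
    """Translate a sequence produced by reverse_translate back into protein."""
    return "".join(AMINO_ACID_FOR[c] for c in split_codons(dna) if c not in STOP_CODONS)


# A Table 1 epitope with an added start methionine (assumption, for illustration).
peptide = "M" + "GNVTLVEFFDY"
dna = reverse_translate(peptide)
print(dna)
print(is_single_orf(dna), translate(dna) == peptide)
```

The round trip through `translate` is a quick sanity check that the nucleotide sequence still encodes the intended peptide before it is submitted to an ORF finder.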
RESULTS
Epitope Selection and Sequence Alignment
Table 1 shows the epitopes selected from the IEDB server. The epitope sequences for the C. burnetii vaccine were selected and analyzed with protein BLAST. To confirm that there is no similarity between the peptides and human protein sequences, the complete sequences of the surface proteins containing the candidate epitopes were matched against human proteins. The results of protein BLAST are shown in Table 2.
Table 1.
The selected epitopes in this study
UniProt (gene name) | Protein | Epitopes
H7C7D7 (CBU_1910) | Com1 | DIQSIVHHYLVNHPEVL; GNVTLVEFFDY; KYYAFHDALLS; SEQITLQTAEKVGLNVA; TPTFVIGNKALTKFGF
Q83DK8 (CBU_0718) | Hypothetical membrane associated protein | DDVAKLRGDLSSIIHKLTSFSKTEASM
Q83AL4 (CBU_1869) | Hypothetical exported protein | PITKKQLKTMSNYEVIAK; IKLPRNRYRLVFTQQ; GKHFDGIKVLKLSPQNTI
Q83F71 (CBU_0077) | Hypothetical membrane spanning protein | EVLTLLLNWVNYHE
Q83EL2 (CBU_0307) | Outer membrane protein | GVAYTYNRANAGLPTNK; VPGYRNASSKRFVAP
Q83DT1 (OmpH) | OmpH | QELFVAQNKAMSDFM
Q83CG1 (CBU_1157) | Hypothetical exported protein | ISLLVFKNSHRVQLWAK; RFDLSLMLNYPNSADRY
Table 2.
The results of C. burnetii proteins BLAST and antigenicity prediction
NCBI BLAST
UniProt (gene name) | Minimum identity (%), Homo sapiens | Maximum identity (%), Homo sapiens | Minimum identity (%), Coxiella burnetii | Maximum identity (%), Coxiella burnetii
H7C7D7 (CBU_1910) | 35.14 | 35.90 | 96.92 | 100.00
Q83DK8 (CBU_0718) | 0 | 0 | 98.94 | 100.00
Q83AL4 (CBU_1869) | 0 | 0 | 98.62 | 100.00
Q83F71 (CBU_0077) | 0 | 0 | 99.24 | 100.00
Q83EL2 (CBU_0307) | 0 | 0 | 35.11 | 100.00
Q83DT1 (ompH) | 0 | 0 | 99.14 | 100.00
Q83CG1 (CBU_1157) | 0 | 0 | 99.14 | 100.00
Prediction and Selection of T cell Epitopes
The intracellular nature of C. burnetii infection prompted us to limit our candidate peptides to T cell epitopes only. To select T cell epitopes, a three-step screening of antigenicity (threshold: 0.4), allergenicity, and toxicity was implemented (Table 3). All epitopes were predicted to be non-toxic.
Table 3.
Predicted properties of selected T cell epitopes
Epitopes | Allergenicity (AllerTOP) | Antigenicity (VaxiJen 2.0)
DIQSIVHHYLVNHPEVL | ALLERGEN | ANTIGEN
GNVTLVEFFDY | ALLERGEN | ANTIGEN
KYYAFHDALLS | NON-ALLERGEN | NON-ANTIGEN
SEQITLQTAEKVGLNVA | NON-ALLERGEN | ANTIGEN
TPTFVIGNKALTKFGF | NON-ALLERGEN | ANTIGEN
DDVAKLRGDLSSIIHKLTSFSKTEASM | NON-ALLERGEN | NON-ANTIGEN
PITKKQLKTMSNYEVIAK | NON-ALLERGEN | NON-ANTIGEN
IKLPRNRYRLVFTQQ | NON-ALLERGEN | NON-ANTIGEN
GKHFDGIKVLKLSPQNTI | ALLERGEN | ANTIGEN
EVLTLLLNWVNYHE | NON-ALLERGEN | NON-ANTIGEN
GVAYTYNRANAGLPTNK | NON-ALLERGEN | NON-ANTIGEN
VPGYRNASSKRFVAP | NON-ALLERGEN | ANTIGEN
ISLLVFKNSHRVQLWAK | ALLERGEN | NON-ANTIGEN
RFDLSLMLNYPNSADRY | NON-ALLERGEN | ANTIGEN
QELFVAQNKAMSDFM | NON-ALLERGEN | NON-ANTIGEN
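The label-based part of the screening in Table 3 can be expressed as a simple filter. The sketch below transcribes the table and keeps the epitopes labeled both NON-ALLERGEN and ANTIGEN; the variable and function names are hypothetical, and the sketch covers only these two labels, so it is not necessarily the complete selection procedure used in the study.

```python
# (epitope, AllerTOP label, VaxiJen 2.0 label), transcribed from Table 3.
TABLE_3 = [
    ("DIQSIVHHYLVNHPEVL", "ALLERGEN", "ANTIGEN"),
    ("GNVTLVEFFDY", "ALLERGEN", "ANTIGEN"),
    ("KYYAFHDALLS", "NON-ALLERGEN", "NON-ANTIGEN"),
    ("SEQITLQTAEKVGLNVA", "NON-ALLERGEN", "ANTIGEN"),
    ("TPTFVIGNKALTKFGF", "NON-ALLERGEN", "ANTIGEN"),
    ("DDVAKLRGDLSSIIHKLTSFSKTEASM", "NON-ALLERGEN", "NON-ANTIGEN"),
    ("PITKKQLKTMSNYEVIAK", "NON-ALLERGEN", "NON-ANTIGEN"),
    ("IKLPRNRYRLVFTQQ", "NON-ALLERGEN", "NON-ANTIGEN"),
    ("GKHFDGIKVLKLSPQNTI", "ALLERGEN", "ANTIGEN"),
    ("EVLTLLLNWVNYHE", "NON-ALLERGEN", "NON-ANTIGEN"),
    ("GVAYTYNRANAGLPTNK", "NON-ALLERGEN", "NON-ANTIGEN"),
    ("VPGYRNASSKRFVAP", "NON-ALLERGEN", "ANTIGEN"),
    ("ISLLVFKNSHRVQLWAK", "ALLERGEN", "NON-ANTIGEN"),
    ("RFDLSLMLNYPNSADRY", "NON-ALLERGEN", "ANTIGEN"),
    ("QELFVAQNKAMSDFM", "NON-ALLERGEN", "NON-ANTIGEN"),
]


def passes_screen(allergenicity: str, antigenicity: str) -> bool:
    """Keep an epitope only if it is predicted antigenic and non-allergenic."""
    return allergenicity == "NON-ALLERGEN" and antigenicity == "ANTIGEN"


screened = [epitope for epitope, allergen, antigen in TABLE_3
            if passes_screen(allergen, antigen)]

for epitope in screened:
    print(epitope)
```

Toxicity is omitted from the filter because all epitopes were reported as non-toxic; a third label could be added to each tuple in the same way.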
Vaccine Engineering and Physicochemical Properties
According to the immunoinformatics analysis, five T cell epitopes were selected. The designed vaccine candidate comprised 344 amino acids divided into the following segments: CTxB as an adjuvant, T cell epitopes, and appropriate linkers. The physicochemical properties of the final construct were predicted using the PepCalc server. The results indicated that the vaccine protein, with a molecular mass of about 38 261.89 Da, is a stable soluble protein with a pI of 9.92 and an estimated net charge of about 14.7 at pH 7 (Fig. 2a).
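For orientation, the molecular mass and amino acid composition reported by servers of this kind can be approximated from the sequence alone. The sketch below uses the standard average residue masses plus one water molecule per chain; it is an assumption that this matches the servers' internal calculations, the function names are hypothetical, and the example sequence is a single Table 1 epitope rather than the full 344-residue construct.

```python
from collections import Counter

# Standard average residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the free N- and C-termini


def molecular_mass(sequence: str) -> float:
    """Average molecular mass of an unmodified linear peptide, in Da."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MASS


def composition(sequence: str) -> list[tuple[str, int]]:
    """Residue counts, most frequent first."""
    return Counter(sequence.upper()).most_common()


example = "SEQITLQTAEKVGLNVA"  # a Table 1 epitope, for illustration only
print(round(molecular_mass(example), 2))
print(composition(example)[:3])
```

Applied to the full construct sequence, `composition` gives the kind of residue ranking summarized in the ProtParam output.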
Fig. 2.
Characteristics of the designed protein. (a) Protein sequence. The protein comprises 344 amino acid residues, has a molecular mass of 38281.89 Da, and is water-soluble; the isoelectric point was 9.92, and the net charge at pH 7 was estimated to be about 14.7. (b) Plot of the protein stability obtained with the IUPred 2.0 server. This plot supports the stability of the protein because the predicted disorder scores are below the threshold of 0.5. (c) The IsUnstruct results, with the prediction of disordered residues based on the Ising model. (d) The FoldUnfold results for the amino acid sequence. The FoldUnfold server examines each amino acid in the sequence; folded and unfolded regions are shown in blue and red, respectively.
The Protein Stability
The stability of the designed protein was assessed using the IUPred 2.0, IsUnstruct (v2.02), and FoldUnfold (scale: expected number of contacts within 8 Å; threshold: 20.4; averaging frame: 11) servers. All three analyses supported the stability of the designed protein (Figs. 2b–2d).
Next, the amino acid composition, protein stability, and half-life were predicted using ProtParam. According to the results illustrated in Fig. 3, the protein was in a stable form, and its half-life was estimated at about 30 h in mammalian cells, more than 20 h in yeast, and more than 10 h in E. coli.
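The disorder-based reading of Fig. 2b rests on comparing per-residue scores with the 0.5 threshold. A minimal sketch of that comparison follows; the scores are hypothetical values chosen for illustration (the actual IUPred output is not reproduced here), and the function names are assumptions.

```python
DISORDER_THRESHOLD = 0.5  # threshold quoted in the Fig. 2 caption


def disordered_positions(scores: list[float],
                         threshold: float = DISORDER_THRESHOLD) -> list[int]:
    """Return 1-based residue positions whose score is at or above the threshold."""
    return [pos for pos, score in enumerate(scores, start=1) if score >= threshold]


def fraction_disordered(scores: list[float],
                        threshold: float = DISORDER_THRESHOLD) -> float:
    """Fraction of residues predicted as disordered."""
    return len(disordered_positions(scores, threshold)) / len(scores)


# Hypothetical per-residue scores, for illustration only.
hypothetical_scores = [0.12, 0.35, 0.48, 0.61, 0.22, 0.09]

print(disordered_positions(hypothetical_scores))
print(round(fraction_disordered(hypothetical_scores), 3))
```

A protein whose scores stay mostly below the threshold, as described for the designed construct, would give a small disordered fraction.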
Fig. 3.
The ProtParam results. Alanine and lysine are the most abundant amino acids in the protein sequence, and the estimated half-life of the protein in mammalian cells under in vitro conditions is about 30 h.
Prediction of the Secondary and Tertiary Structure
The predicted secondary and 3D structures of the chimeric peptide are illustrated in Fig. 4. The results showed that 31.58, 19.30, and 49.12% of the total 344 amino acids were organized in alpha helix, extended strand, and random coil, respectively. Furthermore, the initial 3D model of the proposed molecule was predicted by the I-TASSER online server.
Fig. 4.
Secondary structure of the protein. Of a total of 344 amino acid residues, 110, 66, and 168 form α-helix, β-strand, and random coil, respectively (a). Tertiary structure of the designed protein predicted using the I-TASSER server (b).
Model Refinement and Quality Assessment
The selected model of the designed immunogen was refined with the 3Drefine server. The whole protein structure, including secondary structure elements, loop regions, and side chains, was refined. Six parameters form the basis for evaluating the refinement: the 3Drefine score and RWplus (potential energy), GDT-TS and GDT-HA (similarity), RMSD (deviation), and MolProbity (physical realism). The refined model with the most suitable values of these parameters was selected for further evaluation. The refined model had GDT-TS: 1.0000, GDT-HA: 0.9644, RMSD: 0.375, MolProbity: 3.395, 3Drefine score: 22028.8, and RWplus: ‒63478.84. Next, the geometric quality of the initial and refined models was analyzed using the Ramachandran plot (Fig. 5).
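To make the similarity and deviation scores above more concrete, the following sketch computes RMSD, GDT-TS, and GDT-HA from per-residue distances between two models. It is a simplified illustration under stated assumptions: the two coordinate sets are taken to be already superposed (the full GDT procedure searches for the best superposition per cutoff), the coordinates are hypothetical, and the function names are not from any server's API. The cutoffs are the conventional ones (1, 2, 4, 8 Å for GDT-TS; 0.5, 1, 2, 4 Å for GDT-HA).

```python
import math

GDT_TS_CUTOFFS = (1.0, 2.0, 4.0, 8.0)  # angstroms
GDT_HA_CUTOFFS = (0.5, 1.0, 2.0, 4.0)  # angstroms


def residue_distances(model, reference):
    """Distances between matching atoms of two already superposed models."""
    if len(model) != len(reference):
        raise ValueError("models must have the same number of residues")
    return [math.dist(a, b) for a, b in zip(model, reference)]


def rmsd(distances):
    """Root-mean-square deviation over all residues."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def gdt(distances, cutoffs):
    """Mean over cutoffs of the fraction of residues within each cutoff (0 to 1)."""
    n = len(distances)
    fractions = [sum(d <= cutoff for d in distances) / n for cutoff in cutoffs]
    return sum(fractions) / len(cutoffs)


# Hypothetical C-alpha coordinates, for illustration only.
initial_model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
refined_model = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.6), (2.0, 0.0, 3.0)]

distances = residue_distances(initial_model, refined_model)
print(round(rmsd(distances), 3))
print(round(gdt(distances, GDT_TS_CUTOFFS), 3), round(gdt(distances, GDT_HA_CUTOFFS), 3))
```

On this scale a GDT-TS of 1.0, as reported for the refined model, means every residue lies within 1 Å of its position in the reference model.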
Fig. 5.
The Ramachandran plot analysis of the peptide sequence.
Molecular Docking Studies
The molecular docking study indicated that the chimeric peptide has affinity for both MHC class I and MHC class II molecules. The chimeric peptide showed a higher affinity for MHC II, with an E-value of ‒920.88, than for MHC I (‒772.51). In this analysis, albumin was used as a neutral protein (Table 5). The binding energy of the proposed vaccine to MHC I and MHC II was also investigated using the ClusPro online server (https://cluspro.bu.edu/login.php). The results are shown in Tables 4 and 5.
Table 4.
The results of molecular docking study using the ClusPro server for proposed vaccine sequence with MHC class I
MHC I selected model (cluster) | Representative | Weighted score
0 | Center | –829.0
0 | Lowest energy | –829.0
1 | Center | –805.6
1 | Lowest energy | –805.6
2 | Center | –659.0
2 | Lowest energy | –745.5
Table 5.
The results of molecular docking study using the ClusPro server for proposed vaccine sequence with MHC class II
MHC II selected model (cluster) | Representative | Weighted score
0 | Center | –824.2
0 | Lowest energy | –824.2
1 | Center | –631.3
1 | Lowest energy | –708.2
2 | Center | –648.4
2 | Lowest energy | –737.4
Protein Reverse Translation and Construct Design
To construct an expression cassette in a plasmid vector for protein production, the amino acid sequence must be reverse translated into a nucleotide sequence. To this end, the final peptide sequence was converted into a nucleotide sequence using the reverse-translation tool at bioinformatics.org, and the ORF of the sequence was checked with ORFfinder. Finally, the nucleotide sequence was used to simulate restriction cloning into the pET-21 vector using the SnapGene software. The results are shown in Fig. 6.
Fig. 6.
Simulated DNA construct encoding the designed peptide in the pET-21 expression vector. In silico cloning was done through the NdeI and HindIII restriction sites of the pET system.
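Cloning through NdeI and HindIII works cleanly only if the insert itself contains neither recognition site. The check below is a minimal sketch of that precondition, not a reproduction of the SnapGene workflow; the function names and the example sequences are hypothetical. The recognition sequences (CATATG for NdeI, AAGCTT for HindIII) are palindromic, so scanning one strand is sufficient.

```python
# Recognition sequences of the two enzymes named in the Fig. 6 caption.
RESTRICTION_SITES = {"NdeI": "CATATG", "HindIII": "AAGCTT"}


def internal_sites(insert: str) -> dict[str, list[int]]:
    """Map each enzyme to the 0-based start positions of its site within the insert."""
    insert = insert.upper()
    hits = {}
    for enzyme, site in RESTRICTION_SITES.items():
        hits[enzyme] = [i for i in range(len(insert) - len(site) + 1)
                        if insert[i:i + len(site)] == site]
    return hits


def is_compatible(insert: str) -> bool:
    """True if the insert contains no internal NdeI or HindIII site."""
    return all(not positions for positions in internal_sites(insert).values())


# Hypothetical short inserts, for illustration only.
clean_insert = "ATGGGCAACGTGACCCTGTAA"
problem_insert = "ATGAAGCTTTAA"  # contains an internal HindIII site

print(internal_sites(clean_insert), is_compatible(clean_insert))
print(internal_sites(problem_insert), is_compatible(problem_insert))
```

If an internal site is found, a synonymous codon change at that position is the usual remedy before the flanking sites are added.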
DISCUSSION
Recently, several methods have been developed to enhance vaccine immunogenicity and reduce associated risks. Q fever, a common zoonotic disease, has only one commercial vaccine available, and this vaccine is a formalin-inactivated form of the bacterium [8]. Here we describe a peptide-based candidate vaccine against Q fever, which we developed by predicting functional and immunodominant epitopes and then evaluating the vaccine candidate in silico. Several databases were used for epitope prediction [27-29]. In peptide vaccines, these epitopes may be fused with protein tags, for example, Arg-tag, calmodulin-binding peptide, cellulose-binding domain, DsbA, c-myc-tag, glutathione S-transferase, FLAG-tag, HAT-tag, His-tag, maltose-binding protein, NusA, and S-tag [30]. Further protein engineering may be applied to solubilize insoluble proteins or to stabilize unstable proteins [31]. After selection of the epitopes, the protein is reverse translated into a nucleic acid sequence, and the resulting construct is then synthesized and cloned into a suitable host for protein expression [32].
Recently, many studies on peptide-based vaccines against infectious diseases have been reported. For example, Farhadi et al. [33] presented a peptide-based vaccine that consisted of B cell and linear CD4+ T cell epitopes selected from outer membrane proteins (Omps) of Klebsiella pneumoniae. In another study, Nosrati et al. [34] designed a multi-epitope recombinant vaccine based on linear B cell and T cell binding epitopes from the Gc and Gn glycoproteins of the Crimean-Congo hemorrhagic fever virus. Their final optimized peptide comprised 382 amino acids organized in four domains, including linear B cell epitopes, T cell epitopes, and adjuvants. In 2020, Aryanzad and co-workers [35] designed and manufactured an immunogenic chimeric protein against the IpaD and IpaB antigens of Shigella dysenteriae. In 2021, Sohali et al. [36] described an in silico procedure for predicting T cell epitopes of SARS-CoV-2. In another work, Jaydari et al. [37] reported B and T cell epitopes of C. burnetii. In their study, Com1 and OmpH epitopes were arranged in various orders in three groups (T cell, B cell, and common T and B cell epitopes), and the data indicated that the scaffold made from the B cell epitopes had the highest antigenicity for both the Com1 and OmpH antigens.
In the present work, we studied all T cell epitopes from seven antigens of C. burnetii, including the Com1 and OmpH proteins (Table 1). In designing peptide vaccines, the determination of immunogenic B cell and T cell epitopes is paramount. T cells play a significant role in the immune response to intracellular infections. As C. burnetii is an obligate intracellular pathogen, the immune response to it is based mostly on T cell-mediated immunity [38]. For this reason, we concentrated on T cell epitopes with a high level of antigenicity and affinity for MHC class I and II. After evaluation of the antigenicity, allergenicity, physicochemical characteristics, and interaction between the candidate vaccine and MHC molecules, the suitability of the candidate vaccine was confirmed.
CONCLUSION
The immunoinformatics approach is a cost-effective, protective, and fast method for the development of vaccines. This work aimed at designing a multi-epitope vaccine against the zoonotic bacterium C. burnetii, which causes Q fever. We have designed a peptide vaccine with high antigenicity and stability; this vaccine candidate is suitable for further development.