Nastaran Khodadad, Seyed Saeed Seyedian, Afagh Moattari, Somayeh Biparva Haghighi, Roya Pirmoradi, Samaneh Abbasi, Manoochehr Makvandi.
Abstract
OBJECTIVE: Chronic hepatitis B (CHB) virus infection is the most prevalent chronic liver disease and has become a serious threat to human health. In this study, we attempted to characterize and predict several properties of the S protein isolated from Ahvaz, including physicochemical properties, mutation sites, B-cell epitopes, phosphorylation sites, N-linked and O-linked glycosylation sites, and protein structures.
Keywords: Bioinformatics; Chronic hepatitis B; Genetics; Immunology; In silico; Iran; Microbiology; Molecular biology; Mutations
Year: 2020 PMID: 32695898 PMCID: PMC7365991 DOI: 10.1016/j.heliyon.2020.e04332
Source DB: PubMed Journal: Heliyon ISSN: 2405-8440
Figure 1. The phylogenetic tree was constructed using the maximum likelihood method and the Tamura-Nei model. The preS/S gene sequences of the HBV isolates from Ahvaz (accession numbers MK355500 and MK355501, marked with red diamonds) cluster with other HBV genotype D isolates from different regions of the world. Bootstrap support was estimated from 1000 replicates. The scale bar indicates 2% nucleotide sequence divergence.
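The scale bar in Figure 1 expresses divergence as a fraction of differing nucleotide sites. As a rough illustration only, the sketch below computes the uncorrected p-distance between two aligned sequences and draws one bootstrap replicate by resampling alignment columns. This is a simpler stand-in, not the Tamura-Nei model or the maximum likelihood procedure used for the figure; the function names and the toy sequences are assumptions, not data from the study.

```python
import random

def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore alignment columns where either sequence has a gap.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)

def bootstrap_columns(alignment: list[str], rng: random.Random) -> list[str]:
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(alignment[0])
    cols = [rng.randrange(length) for _ in range(length)]
    return ["".join(seq[c] for c in cols) for seq in alignment]

# Toy, made-up sequences (not the Ahvaz isolates).
toy = ["ACGTACGTAC", "ACGTACGTAT"]
print(p_distance(*toy))  # one differing site out of ten
replicate = bootstrap_columns(toy, random.Random(0))
```

On this scale, a 2% scale bar corresponds to a distance of 0.02. Model-based distances such as Tamura-Nei additionally correct for multiple substitutions at the same site.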
Figure 2. Amino acid changes in the preS1/preS2/S regions of the Ahvaz isolates relative to RefSeq, identified using CLC Sequence software. §: N-glycosylation site; ∗: mutations in the MHR region; †: immune escape mutations.
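Amino acid changes of the kind summarized in Figure 2 can be listed by comparing aligned sequences position by position. The sketch below is a minimal illustration, not the CLC workflow. It uses the three 22-residue peptides reported for positions 241–262 in the antigenicity column of the B-cell epitope table. The source does not label which row belongs to which sequence, so the first peptide is treated as the baseline purely for illustration, and the helper name `list_substitutions` is hypothetical.

```python
def list_substitutions(baseline: str, query: str, start: int = 1) -> list[str]:
    """Return substitutions such as 'L250P' between two aligned peptides.

    `start` is the 1-based position of the first residue in the full protein.
    """
    if len(baseline) != len(query):
        raise ValueError("peptides must be aligned to the same length")
    return [
        f"{b}{start + i}{q}"
        for i, (b, q) in enumerate(zip(baseline, query))
        if b != q
    ]

# Peptides as printed in the table (positions 241-262, assumed 1-based inclusive).
row1 = "RRFIIFLFILLLCLIFLLVLLD"
row2 = "RRFIIFLFIPLLCLIFLLVLLD"
row3 = "RRFIIFLFILLLCLTFLLVLLD"

print(list_substitutions(row1, row2, start=241))  # ['L250P']
print(list_substitutions(row1, row3, start=241))  # ['I255T']
```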
Physicochemical properties of the Ahvaz sequences predicted by the ProtParam tool.
| Properties | pI | Half-life in | Instability index | Class | Aliphatic index | GRAVY |
|---|---|---|---|---|---|---|
| RefSeq | 8.40 | 10h | 54.98 | Unstable | 82.24 | 0.146 |
| MK355500 | 8.98 | 10h | 54.81 | Unstable | 80.46 | 0.081 |
| MK355501 | 8.40 | 10h | 58.83 | Unstable | 78.95 | 0.065 |
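Several of the quantities in this table follow simple published formulas. The sketch below is an independent reimplementation, not ProtParam itself: GRAVY as the mean Kyte-Doolittle hydropathy, the aliphatic index from the mole percentages of Ala, Val, Ile, and Leu, and the stable/unstable class from the usual instability-index cutoff of 40. The function names are assumptions, and the example peptide is the 303–309 epitope from the epitope table, used only as a short input.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(seq: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

def aliphatic_index(seq: str) -> float:
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), X in mole %."""
    n = len(seq)

    def mole_pct(aa: str) -> float:
        return 100.0 * seq.count(aa) / n

    return mole_pct("A") + 2.9 * mole_pct("V") + 3.9 * (mole_pct("I") + mole_pct("L"))

def stability_class(instability_index: float, cutoff: float = 40.0) -> str:
    """Classify by instability index; values below the cutoff count as stable."""
    return "Stable" if instability_index < cutoff else "Unstable"

print(round(gravy("TKPSDGN"), 3))   # negative, i.e. hydrophilic
print(stability_class(54.98))        # 'Unstable', as in the RefSeq row
```

A positive GRAVY, as reported for all three sequences, indicates a protein that is hydrophobic on average.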
B-cell epitope prediction using five methods from the Immune Epitope Database and Analysis Resource (IEDB).
| Hydrophobicity prediction | Flexibility prediction | Accessibility prediction | Beta-Turn prediction | Antigenicity prediction |
|---|---|---|---|---|
| 303–309 (TKPSDGN) | 133–139 (GGSSSGT) | 21–45 (PAFRANTANPDWD | 133–139 (GGSSSGT) | 241–262 (RRFIIFLFILLLCLIFLLVLLD) |
| 133–139 (GGSSSGT) | 133–139 (GGSSSGT) | 23–45 (FKANTANPDWD | 133–139 (GGSSSGT) | 241–262 (RRFIIFLFIPLLCLIFLLVLLD) |
| 303–309 (TKPSDGN) | 133–139 (GGSSSGT) | 21–45 (PAFRANTANPDWD | 133–139 (GGSSSGT) | 241–262 (RRFIIFLFILLLCLTFLLVLLD) |
The epitopes selected at positions 21–45, 23–45, 133–139, 241–262, and 303–309 scored higher than the other predicted B-cell epitopes.
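One quick consistency check on epitope entries of this form is that the residue range and the peptide length agree. The sketch below parses entries such as `303–309 (TKPSDGN)` and compares `end - start + 1` with the peptide length. It is an illustrative helper, not part of any IEDB tool, and `check_entry` is a hypothetical name. Only entries whose peptide is printed in full are used; the accessibility entries are truncated in the table and are left out.

```python
import re

# Matches "start–end (PEPTIDE)" with either an en dash or a hyphen in the range.
ENTRY = re.compile(r"(\d+)\s*[–-]\s*(\d+)\s*\(([A-Z]+)\)")

def check_entry(entry: str) -> tuple[int, int, str, bool]:
    """Parse one table entry and report whether range and peptide length agree."""
    match = ENTRY.fullmatch(entry.strip())
    if match is None:
        raise ValueError(f"cannot parse entry: {entry!r}")
    start, end, peptide = int(match.group(1)), int(match.group(2)), match.group(3)
    return start, end, peptide, (end - start + 1) == len(peptide)

entries = [
    "303–309 (TKPSDGN)",
    "133–139 (GGSSSGT)",
    "241–262 (RRFIIFLFILLLCLIFLLVLLD)",
]
for entry in entries:
    print(check_entry(entry))
```

All three complete entries pass the check, which supports reading the ranges as 1-based and inclusive.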
Results of Bcepred, Bepipred, Algpred, ABCpred, and VaxiJen for Ahvaz sequences.
| Sequences | Bcepred | Bepipred | ABCpred | VaxiJen | Algpred |
|---|---|---|---|---|---|
| RefSeq | 236–242 | 4–166, 197–227, | 374 | 0.5333 | NON ALLERGEN |
| MK355500 | 60–66, 236–242, 321–332 | 4–166, 197–228, | 374 | 0.5066 | NON ALLERGEN |
| MK355501 | 236–242 | 4–166, 197–227, | 374 | 0.5383 | NON ALLERGEN |
Based on the default threshold in Algpred, a score below -0.4 indicates a non-allergenic protein; both Ahvaz isolates (MK355500 and MK355501) and RefSeq were therefore predicted to be non-allergens.
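The differences between the sequences in the table above can be pulled out with plain set operations. The sketch below is a small illustration: it finds the Bcepred ranges shared by all three sequences and those unique to one, and compares the VaxiJen scores against a cutoff. The cutoff of 0.4 is an assumption made here for illustration; the source does not state which threshold was applied.

```python
# Bcepred ranges and VaxiJen scores copied from the table above.
bcepred = {
    "RefSeq": {"236–242"},
    "MK355500": {"60–66", "236–242", "321–332"},
    "MK355501": {"236–242"},
}
vaxijen = {"RefSeq": 0.5333, "MK355500": 0.5066, "MK355501": 0.5383}

# Assumed cutoff, not taken from the source.
ASSUMED_ANTIGEN_THRESHOLD = 0.4

# Ranges predicted for every sequence, and ranges specific to each sequence.
shared = set.intersection(*bcepred.values())
unique = {name: ranges - shared for name, ranges in bcepred.items()}
antigenic = {name: score >= ASSUMED_ANTIGEN_THRESHOLD for name, score in vaxijen.items()}

print(shared)
print(unique["MK355500"])
print(antigenic)
```

Under these inputs, only MK355500 carries Bcepred ranges that the other two sequences lack (60–66 and 321–332).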
Prediction of disulfide bond positions for Ahvaz sequences using SCRATCH and DiANNA servers.
| Sequences | SCRATCH | DiANNA |
|---|---|---|
| RefSeq | 211–232, 228–239, 284–301, 287–302, 300–310, 312–384 | 211–239, 228–232, 253–300, 270–384, 284–287, 301–310, 302–312 |
| MK355500 | 211–232, 228–239, 284–301, 287–302, 300–310, 312–384 | 211–284, 228–384, 232–287, 239–270, 253–300, 301–310, 302–312 |
| MK355501 | 228–239, 270–287, 284–301, 232–253, 300–310, 312–384 | 211–239, 228–284, 232–287, 253–300, 270–384, 301–310, 302–312 |
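The two servers can be compared directly by treating each predicted bond as an unordered pair of cysteine positions. The sketch below does this for the RefSeq row: it computes the bonds both tools agree on, the cysteines each tool uses, and whether any residue is assigned to more than one bond. The helper names are hypothetical and the code only rearranges the values printed in the table.

```python
def parse_bonds(text: str) -> set[frozenset[int]]:
    """Parse '211–232, 228–239' into a set of unordered position pairs."""
    bonds = set()
    for item in text.split(","):
        first, second = item.strip().replace("–", "-").split("-")
        bonds.add(frozenset((int(first), int(second))))
    return bonds

def cysteines(bonds: set[frozenset[int]]) -> set[int]:
    """All positions that take part in at least one predicted bond."""
    return set().union(*bonds)

def each_residue_once(bonds: set[frozenset[int]]) -> bool:
    """True if no position appears in more than one bond."""
    positions = [p for bond in bonds for p in bond]
    return len(positions) == len(set(positions))

# RefSeq row of the table above.
scratch = parse_bonds("211–232, 228–239, 284–301, 287–302, 300–310, 312–384")
dianna = parse_bonds("211–239, 228–232, 253–300, 270–384, 284–287, 301–310, 302–312")

consensus = scratch & dianna
print(consensus)                                   # bonds predicted by both tools
print(cysteines(dianna) - cysteines(scratch))      # cysteines used only by DiANNA
print(each_residue_once(scratch), each_residue_once(dianna))
```

For RefSeq the two tools pair the same cysteines differently and share no bond, and DiANNA additionally pairs positions 253 and 270.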
Predicted N-linked and O-linked glycosylation sites for the selected sequences using NetNGlyc and GlycoEP.
| NetNGlyc | GlycoEP N-link | GlycoEP O-link |
|---|---|---|
| 4, 112, 166, 309 | 4, 112, 166, 222, 309 | 6, 7, 76, 85, 86, 90, 95, 104, 135, 137, 139, 145, 146, 148, 190, 200, 209, 216, 220, 221, 224, 226, 227, 231, 276–281, 286, 289, 290, 294, 295, 303, 311, 352 |
| 4, 112, 166, 309 | 4, 112 | 6, 7, 27, 76, 79, 85, 86, 90, 95, 104, 135, 137, 139, 145, 146, 148, 190, 200, 209, 216, 220, 221, 224, 226, 227, 231, 276–281, 286, 288, 289, 294, 295, 303, 311, 352 |
| 4, 112, 166, 309 | 4, 112 | 6, 7, 76, 79, 85, 86, 90, 95, 104, 135, 137, 139, 145, 146, 148, 151, 152, 157, 190, 200, 209, 216, 220, 221, 224, 226, 227, 231, 276–281, 286, 288, 294, 295, 303, 311, 352 |
NetNGlyc and GlycoEP N-link assigned the highest N-glycosylation scores to positions 309 and 4, respectively. In addition, GlycoEP O-link assigned the highest O-glycosylation score to position 226 of the amino acid sequence encoded by the preS/S gene.
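Candidate N-glycosylation sites are conventionally defined by the sequon N-X-S/T, where X is any residue except proline. The sketch below scans a sequence for that pattern only. It does not reproduce the scoring performed by NetNGlyc or GlycoEP, and the input is a made-up toy peptide, not an Ahvaz sequence.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead keeps overlapping sequons from being skipped.
SEQUON = re.compile(r"N(?=[^P][ST])")

def find_sequons(seq: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(seq)]

# Toy peptide: NGT and NAS qualify, NPS does not because X is proline.
print(find_sequons("ANGTNPSQNAS"))  # [2, 9]
```

A sequon is necessary but not sufficient for glycosylation, which is why dedicated predictors score each candidate rather than reporting every match.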
Secondary structure percentages predicted by SOPMA.
| Sequences | Alpha helix % | Beta turn % | Extended strand % | Random coil % |
|---|---|---|---|---|
| RefSeq | 22.37% | 5.40% | 11.31% | |
| MK355500 | 22.37% | 5.14% | 11.31% | |
| MK355501 | 23.39% | 4.63% | 10.28% | |
The results showed that random coil made up most of the predicted structure.
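The random coil column is empty in the table above. If one assumes that the four listed states are the only ones present and sum to 100%, which the source does not state, the missing share can be estimated as the remainder. The sketch below does that arithmetic; the resulting values are derived estimates under this assumption, not figures reported by the authors.

```python
# (alpha helix %, beta turn %, extended strand %) as printed in the table.
sopma = {
    "RefSeq": (22.37, 5.40, 11.31),
    "MK355500": (22.37, 5.14, 11.31),
    "MK355501": (23.39, 4.63, 10.28),
}

def remainder_pct(parts: tuple[float, ...]) -> float:
    """Share left over after the listed states, assuming they total 100%."""
    return round(100.0 - sum(parts), 2)

estimated_coil = {name: remainder_pct(parts) for name, parts in sopma.items()}
for name, coil in estimated_coil.items():
    print(name, coil)
```

Under this assumption the remainder exceeds 60% for all three sequences, consistent with random coil being the dominant state.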
Figure 3. Secondary structure prediction for the Ahvaz sequences and RefSeq. Blue: alpha helix; green: beta turn; purple: random coil; red: extended strand.
Figure 4. Tertiary structure prediction for the Ahvaz sequences using the I-TASSER online server. A. I-TASSER produced more reliable 3D structures than the other software tested (A1: 3D structure of MK355500; A2: 3D structure of MK355501). Pink: alpha helix; yellow: beta sheet; blue: coil; white: extended strand. B. Positions of N-glycosylation sites (B1: MK355500 isolate; B2: MK355501 isolate) and a predicted B-cell epitope (B3: predicted pre-S1 epitope of the MK355501 isolate). The selected epitope spans amino acids 21–45 and is marked in yellow.
Figure 5. Ramachandran plot results for the tertiary structure of the S protein, generated with the SWISS-MODEL online server. Most residues (69.2%) were located in favored regions, while a smaller share (19.6%) fell in allowed regions. (A: MK355500; B: MK355501; C: RefSeq).
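A Ramachandran plot places each residue by its backbone dihedral angles: phi (atoms C of the previous residue, N, CA, C) and psi (atoms N, CA, C, N of the next residue). As a minimal sketch of the underlying geometry, the function below computes the dihedral angle defined by four points. It is not the SWISS-MODEL procedure, it does not classify residues into favored or allowed regions, and the coordinates are toy values rather than atoms from the modeled structures.

```python
import math

Vec = tuple[float, float, float]

def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a: Vec, b: Vec) -> Vec:
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def dihedral(p0: Vec, p1: Vec, p2: Vec, p3: Vec) -> float:
    """Dihedral angle in degrees about the p1-p2 axis, in the range [-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    axis = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Remove the components of b0 and b2 that lie along the central axis.
    d0, d2 = _dot(b0, axis), _dot(b2, axis)
    v = (b0[0] - d0 * axis[0], b0[1] - d0 * axis[1], b0[2] - d0 * axis[2])
    w = (b2[0] - d2 * axis[0], b2[1] - d2 * axis[1], b2[2] - d2 * axis[2])
    x = _dot(v, w)
    y = _dot(_cross(axis, v), w)
    return math.degrees(math.atan2(y, x))

# Toy coordinates: cis (0 degrees), trans (180 degrees), and a right angle.
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

Applying such a function to consecutive backbone atoms yields one (phi, psi) pair per residue, which validation tools then compare against reference regions.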