Lincon Mazumder, Md Rakibul Hasan, Kanij Fatema, Md Zahirul Islam, Sanjida Khanam Tamanna.
Abstract
Background: Worldwide, sexually transmitted infections (STIs) caused by Neisseria gonorrhoeae remain a significant public health concern. This obligate human pathogen has developed a number of defenses against both innate and adaptive immune responses during infection, some of which are mediated by the pathogen's proteins. Annotating the uncharacterized proteins of N. gonorrhoeae can therefore give insight into the functions that underlie its pathogenicity and help identify more efficient therapeutic targets.
Year: 2022; PMID: 36105928; PMCID: PMC9467719; DOI: 10.1155/2022/4302625
Source DB: PubMed; Journal: Biomed Res Int; Impact factor: 3.246
Figure 1. Complete flowchart of the hypothetical protein (HP) annotation process used in this study.
List of bioinformatics tools and databases used in this study for structural and functional analysis of the HP.
| S.N. | Tools/server | URL | Function | References |
|---|---|---|---|---|
| (A) Sequence similarity search | | | | |
| 1. | BLAST | | Find similar sequences in protein databases | 27 |
| 2. | MUSCLE | | Multiple sequence alignment | 28 |
| 3. | MEGA X | | Phylogenetic tree analysis | 29 |
| (B) Physicochemical characterization | | | | |
| 4. | ExPASy – ProtParam | | Predict physicochemical properties | 30 |
| (C) Subcellular localization identification | | | | |
| 5. | PSORT B v3.0 | | Predict subcellular localization | 35 |
| 6. | PSLpred | | Predict subcellular localization | 36 |
| 7. | CELLO | | Predict subcellular localization | 34 |
| (D) Secondary structure prediction | | | | |
| 8. | SOPMA | | Predict the secondary structure of the protein | 37 |
| 9. | PSIPRED | | Predict secondary structure | 38 |
| (E) 3D structure prediction and quality assessment | | | | |
| 10. | HHpred | | Detect protein homology | 39 |
| 11. | YASARA | | Energy minimization to increase the stability of the 3D model structure | 40 |
| 12. | PROCHECK | | Ramachandran plot analysis | 42 |
| 13. | Verify3D | | Structure verification | 44 |
| 14. | ERRAT | | Analyze the statistics of nonbonded interactions between different atoms and verify protein structures | 43 |
| (F) Functional characterization | | | | |
| 15. | Conserved Domain Database | | Search for functional domains in a sequence | 48 |
| 16. | Pfam | | Family relationship identification | 47 |
| 17. | InterPro | | Search InterPro for motif discovery | 45 |
| 18. | MOTIF | | Motif discovery | 46 |
| (G) Active site identification | | | | |
| 19. | CASTp | | Find, outline, and measure interior surface regions (pockets) on the protein 3D structure | 49 |
Similar proteins obtained from the non-redundant protein sequences (nr) database.
| Description | Scientific name | Max score | Total score | E value | Percent identity | Accession |
|---|---|---|---|---|---|---|
| MobA/MobL family protein [Proteobacteria] | Proteobacteria | 984 | 984 | 0 | 100 | WP_032490546.1 |
| MobA/MobL family protein [Haemophilus parainfluenzae] | Haemophilus parainfluenzae | 978 | 978 | 0 | 99.37 | WP_197561055.1 |
| MobA/MobL family protein [Haemophilus haemolyticus] | Haemophilus haemolyticus | 977 | 977 | 0 | 99.16 | WP_140450219.1 |
| MobA/MobL family protein [Neisseria gonorrhoeae] | Neisseria gonorrhoeae | 936 | 936 | 0 | 96.86 | WP_127514845.1 |
| MobA/MobL family protein [Haemophilus parainfluenzae] | Haemophilus parainfluenzae | 907 | 907 | 0 | 99.11 | MBS6191364.1 |
Similar proteins obtained from the UniProt/Swiss-Prot (SwissProt) database.
| Description | Scientific name | Max score | Total score | E value | Percent identity | Accession |
|---|---|---|---|---|---|---|
| | | 219 | 219 | 1.00E-62 | 46.96 | P07112.4 |
| [Salmonella enterica subsp. enterica serovar Typhimurium] | Salmonella enterica subsp. enterica serovar Typhimurium | 154 | 154 | 2.00E-41 | 41.01 | P14492.1 |
| | | 86.7 | 86.7 | 3.00E-17 | 27.91 | P20085.1 |
| | | 73.2 | 73.2 | 2.00E-12 | 26.32 | Q8GN32.1 |
| | | 65.9 | 65.9 | 5.00E-10 | 24.58 | Q44363.1 |
Figure 2. Phylogenetic relationship between the hypothetical protein and similar proteins obtained from the non-redundant database by BlastP search. The evolutionary distances were computed using the Poisson correction method and are expressed as the number of amino acid substitutions per site.
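The Poisson correction mentioned in the Figure 2 caption converts the observed proportion of differing amino acid sites, p, into an estimated number of substitutions per site, d = -ln(1 - p). The following is a minimal sketch of that formula only; the function name and the example value of p are hypothetical and are not taken from the study, whose distances were computed by the phylogenetics software on the aligned sequences.

```python
import math

def poisson_distance(p: float) -> float:
    """Poisson-corrected distance d = -ln(1 - p).

    p is the proportion of aligned amino acid sites that differ
    between two sequences (0 <= p < 1).
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return -math.log(1.0 - p)

# Hypothetical example: 5% of aligned sites differ.
p_example = 0.05
d_example = poisson_distance(p_example)
print(f"p = {p_example:.3f} -> d = {d_example:.4f} substitutions per site")
```

The corrected distance is always at least as large as p, because it accounts for multiple substitutions that may have occurred at the same site.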
ProtParam tool analysis result for the HP of Neisseria gonorrhoeae F0T10 13280.
| Parameter | Value |
|---|---|
| Number of amino acids | 478 |
| Molecular weight (Da) | 56206.84 |
| Theoretical pI | 8.07 |
| Total number of negatively charged residues (Asp + Glu) | 81 |
| Total number of positively charged residues (Arg + Lys) | 83 |
| Formula | C2461H3884N716O774S10 |
| Instability index (II) | 45.45 |
| Aliphatic index | 63.37 |
| Grand average of hydropathicity (GRAVY) | -1.179 |
| Estimated half-life | 30 hours (mammalian reticulocytes, in vitro) |
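As a rough cross-check, the molecular weight above can be recomputed from the reported elemental formula, and the usual reading of the instability index and GRAVY can be spelled out. The sketch below is illustrative only: the atomic masses are standard rounded average values, the helper names are hypothetical, and the cut-off of 40 for the instability index is the conventional ProtParam interpretation rather than something stated in this table.

```python
import re

# Standard average atomic masses (rounded); an assumption of this sketch.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

def formula_mass(formula: str) -> float:
    """Sum average atomic masses for a formula such as 'C2461H3884N716O774S10'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total

def is_predicted_stable(instability_index: float) -> bool:
    """Conventional reading: an instability index below 40 suggests a stable protein."""
    return instability_index < 40.0

def is_hydrophilic(gravy: float) -> bool:
    """A negative GRAVY value indicates an overall hydrophilic protein."""
    return gravy < 0.0

# Values copied from the ProtParam table.
reported_mw = 56206.84
computed_mw = formula_mass("C2461H3884N716O774S10")
net_charge_estimate = 83 - 81  # (Arg + Lys) - (Asp + Glu)

print(f"computed MW: {computed_mw:.2f} Da (reported {reported_mw} Da)")
print(f"predicted stable: {is_predicted_stable(45.45)}")
print(f"hydrophilic: {is_hydrophilic(-1.179)}")
print(f"(Arg + Lys) - (Asp + Glu) = {net_charge_estimate}")
```

Under these assumptions the recomputed mass agrees with the reported value to within about 1 Da, the instability index of 45.45 falls on the unstable side of the cut-off, and the slight excess of basic residues is consistent with a theoretical pI above 7.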
Figure 3. Secondary structure model predicted by the SOPMA server.
Figure 4. Secondary structure model predicted by the PSIPRED server.
Figure 5. Predicted 3D structure of the hypothetical protein visualized in PyMOL (before and after energy minimization).
Figure 6. (a) Ramachandran plot of the predicted structure, validated with PROCHECK. (b) ERRAT output with an overall quality factor of 95.556. The two lines on the error axis indicate the confidence with which regions exceeding that error value can be rejected. (c) Verify3D result showing that 96.30% of the residues have an averaged 3D-1D score >= 0.2.
Ramachandran plot statistics of the predicted 3D model of the studied protein.
| Ramachandran plot analysis | No. (%) |
|---|---|
| Residues in the most favored regions [A, B, L] | 159 (91.9%) |
| Residues in the additional allowed regions [a, b, l, p] | 13 (7.5%) |
| Residues in the generously allowed regions [-a, -b, -l, -p] | 1 (0.6%) |
| Residues in the disallowed regions | 0 (0.0%) |
| No. of non-glycine and non-proline residues | 173 (100.0%) |
| No. of end-residues (excl. Gly and Pro) | 2 |
| No. of glycine residues (shown in triangles) | 8 |
| No. of proline residues | 6 |
| Total no. of residues | 189 |
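The counts in the Ramachandran table can be checked for internal consistency: the four region counts should sum to the number of non-glycine, non-proline residues, and adding the end-residues, glycines, and prolines should give the total. The sketch below simply redoes that arithmetic with the values from the table; the variable names are hypothetical.

```python
# Region counts for non-glycine, non-proline residues, copied from the table.
region_counts = {
    "most favored": 159,
    "additional allowed": 13,
    "generously allowed": 1,
    "disallowed": 0,
}
end_residues = 2
glycines = 8
prolines = 6

# Percentages are taken relative to the non-Gly, non-Pro residues only.
non_gly_pro = sum(region_counts.values())
percentages = {
    region: round(100.0 * count / non_gly_pro, 1)
    for region, count in region_counts.items()
}
total_residues = non_gly_pro + end_residues + glycines + prolines

for region, pct in percentages.items():
    print(f"{region}: {region_counts[region]} ({pct}%)")
print(f"non-Gly/non-Pro residues: {non_gly_pro}")
print(f"total residues: {total_residues}")
```

The recomputed values reproduce the table: 173 non-glycine, non-proline residues and 189 residues in total, with the percentages rounding to 91.9%, 7.5%, 0.6%, and 0.0%.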
Quality assessment score before and after energy minimization.
| Criteria | Before energy minimization | After energy minimization |
|---|---|---|
| Energy | -48361.0 kJ/mol | -11487.9 kJ/mol |
| Quality factor (ERRAT) | 78.453 | 95.5556 |
| Ramachandran plot (PROCHECK) | 90.8% | 93.6% |
| Verify3D | 98.41% of the residues have averaged 3D-1D score >= 0.2 | 96.30% of the residues have averaged 3D-1D score >= 0.2 |
Figure 7. Active site (red) of the studied hypothetical protein.
T cell epitopes predicted by NetCTL server along with their MHC I binding alleles.
| Epitope | Interacting MHC I alleles |
|---|---|
| QSAQAKNDY | HLA-A∗30:02 |
| LTDKNQGFL | HLA-A∗01:01 |
| GMEVEITQY | HLA-A∗30:02 |
| DSGSNKLPY | HLA-B∗35:01 |
| HTDKNNHNP | None |
| QANQALEQY | HLA-B∗35:01, HLA-B∗58:01 |
| KQAQGMGKY | HLA-A∗30:02, HLA-B∗15:01 |
| FAEDNPQEF | HLA-B∗35:01, HLA-B∗53:01 |
| NQALEQYGY | HLA-A∗30:02, HLA-B∗15:01 |
| LDDLQFSGY | HLA-A∗01:01 |
| AIYHLNVRY | HLA-A∗30:02, HLA-A∗32:01, HLA-B∗15:01, HLA-A∗03:01, HLA-A∗11:01 |
| DLQRIQGDY | HLA-A∗30:02 |
| TVDSGSNKL | None |
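One simple way to read the epitope table is to rank epitopes by the number of MHC I alleles they are predicted to interact with and to count how often each allele appears. The sketch below does this with the table's values transcribed by hand (allele names written with a plain ASCII asterisk); it is only a tabulation aid with hypothetical variable names and does not reproduce the NetCTL prediction itself or the authors' selection procedure.

```python
from collections import Counter

# Epitope -> predicted interacting MHC I alleles, transcribed from the table.
EPITOPES = {
    "QSAQAKNDY": ["HLA-A*30:02"],
    "LTDKNQGFL": ["HLA-A*01:01"],
    "GMEVEITQY": ["HLA-A*30:02"],
    "DSGSNKLPY": ["HLA-B*35:01"],
    "HTDKNNHNP": [],
    "QANQALEQY": ["HLA-B*35:01", "HLA-B*58:01"],
    "KQAQGMGKY": ["HLA-A*30:02", "HLA-B*15:01"],
    "FAEDNPQEF": ["HLA-B*35:01", "HLA-B*53:01"],
    "NQALEQYGY": ["HLA-A*30:02", "HLA-B*15:01"],
    "LDDLQFSGY": ["HLA-A*01:01"],
    "AIYHLNVRY": ["HLA-A*30:02", "HLA-A*32:01", "HLA-B*15:01",
                  "HLA-A*03:01", "HLA-A*11:01"],
    "DLQRIQGDY": ["HLA-A*30:02"],
    "TVDSGSNKL": [],
}

# Rank epitopes by how many alleles they are predicted to interact with.
ranked = sorted(EPITOPES, key=lambda e: len(EPITOPES[e]), reverse=True)
best = ranked[0]

# Count how many epitopes each allele appears with.
allele_counts = Counter(a for alleles in EPITOPES.values() for a in alleles)

print(f"broadest epitope: {best} ({len(EPITOPES[best])} alleles)")
for allele, n in allele_counts.most_common(3):
    print(f"{allele}: {n} epitopes")
```

In this tabulation AIYHLNVRY interacts with the largest number of alleles (five), and HLA-A∗30:02 is the allele that appears most often across the 13 nonamer epitopes.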
Figure 8. Docking analysis performed with AutoDock Vina. (a) Three-dimensional structure of the predicted epitope "AIYHLNVRY" and (b) visualization of the binding interactions and interacting residues after docking of "AIYHLNVRY" with HLA-B∗15:01.