Proteome based mapping and molecular docking revealed DnaA as a potential drug target against Shigella sonnei.

Farah Shahid¹, Youssef Saeed Alghamdi², Mutaib Mashraqi³, Mohsin Khurshid⁴, Usman Ali Ashfaq¹.

Abstract

Shigella sonnei is one of the major causes of diarrhea and remains a critical pathogen responsible for high morbidity and mortality from dysentery worldwide every year. Antibiotic therapy for Shigella infections plays a critical role in decreasing both the prevalence and the fatality rate of the disease. However, managing these infections remains challenging owing to the overall increase in resistance to many antimicrobials. The situation necessitates the rapid development of effective and feasible S. sonnei treatments. In the present study, a subtractive genomics approach was used to find potential drug targets for S. sonnei strain Ss046. Various bioinformatics tools were used to remove human-homologous and pathogen-specific paralogous sequences from the bacterial proteome. Metabolic pathway and subcellular localization analyses of the essential bacterial proteins were then performed to describe their roles in various cellular processes. Only one essential protein, the chromosomal replication initiator protein DnaA, was found in the proteome of the pathogen that could be used as a potent target for designing new drugs. The 3D structure of DnaA was predicted using Phyre 2. Molecular docking of 5000 phytochemicals against DnaA identified four top-ranked phytochemicals (Riccionidin A, Dothistromin, Fustin, and Morin) based on scoring functions and interactions with the active site. This study suggests that these phytochemicals could be used as antibacterial drugs to treat S. sonnei infections in the future. Further in vitro analyses are required to confirm their efficacy and evaluate their drug potency.

Keywords: Bioinformatics; Chromosomal replication initiator protein DnaA; Drug target; Molecular docking; Phytochemicals; Shigella sonnei

Year: 2021; PMID: 35241965; PMCID: PMC8886675; DOI: 10.1016/j.sjbs.2021.09.051

Source DB: PubMed; Journal: Saudi J Biol Sci; ISSN: 2213-7106; Impact factor: 4.219

Introduction

Shigellosis is widespread internationally and highly endemic in the least developed countries, with almost 0.25 billion cases every year (Khalil et al., 2018). Approximately 13% of deaths caused by diarrhea are due to Shigella infection (Khalil et al., 2018). About 1 million (10 lakh) cases of shigellosis arise every year in highly industrialized countries, where infants in crowded rooms and childcare centers, soldiers in the field, and people traveling to endemic countries are at high risk of developing the disease (Pires et al., 2015, Porter et al., 2017). Shigella sonnei prevails as a major source of this infection in high-income countries (Shad & Shad, 2021). S. sonnei, for example, has played a substantial role in foodborne outbreaks in both Canada and the United States, emphasizing the importance of food as a source of infection (Kimura et al., 2004). However, owing to evolving mechanisms and diverse modes of infection, the global distribution of S. sonnei infections has shifted dramatically over time from industrialized to poor countries (Torraca et al., 2020b). The frequency and intensity of outbreaks vary by region owing to a variety of factors, including geography, climate, host–pathogen interactions, and control strategies. S. sonnei has three primary lineages (I, II, and III) that have recently been identified. S. sonnei evolved quickly in Europe and spread to other continents as a single multidrug-resistant (MDR) lineage (Anandan et al., 2017, Holt et al., 2012).

Shigella sonnei is a Gram-negative, rod-shaped, facultative intracellular pathogen that evolved from Escherichia coli to specialize in intracellular infection of the human gut epithelium. Its genome comprises a circular chromosome of 4.99 Mbp and a 216 kbp invasion plasmid (pINV) required for virulence (Torraca et al., 2020a). Within 1 to 7 days of infection with S. sonnei, a person shows symptoms such as cramping rectal pain, acute abdominal pain, acute fever, watery diarrhea, cramps, nausea, and blood, mucus, or pus in the stool. Usually, the infection clears up without complications, but serious complications can arise if diagnosis is delayed or the infection remains untreated. The most common of these are severe dehydration (which can lead to seizures), reactive arthritis, shock, toxic megacolon, hemolytic uremic syndrome (HUS), and death (Kotloff et al., 2018). No vaccines or drugs are available for S. sonnei. The best way to prevent shigellosis is to wash hands thoroughly and repeatedly with soap and water before handling food and before and after using the washroom. Strict compliance with standard food and water safety precautions is also important. Water from lakes, ponds, or untreated swimming pools should not be swallowed, and sexual contact with people who have diarrhea should be avoided (Prabhurajeshwar, 2018).

S. sonnei is resistant to almost all antimicrobials, and the management of these infections has become expensive, time-consuming, and sometimes challenging, specifically in regions with limited healthcare facilities (Qiu et al., 2013, Taneja and Mewara, 2016). Several mechanisms of antimicrobial resistance have recently been identified, limiting the treatment options for these diseases (Qiu et al., 2013). Given the significant global burden, the medical severity, and increasing reports of resistance to first- and second-line treatments, an effective drug for shigellosis has become a growing requirement (Gu et al., 2012). Approaches such as comparative and subtractive genomics have been used to find targets in several human pathogens (Jamal et al., 2017, Mahmud et al., 2019).
The main objective is to identify essential target genes that show no homology with Homo sapiens, so that the drug targets can be used with minimal off-target effects in humans. Here, we applied an integrated in silico method to the entire proteome of S. sonnei strain Ss046 to relate the genomic data to presumed therapeutic targets based on their 3D structure. The approach can also be exploited to detect powerful inhibitors, which could potentially lead to the development of compounds capable of inhibiting bacterial growth. The proteome of S. sonnei (Ss046) was retrieved to apply subtractive proteomics. The computational tool Geptop was used to detect proteins considered essential for pathogen survival. Metabolic pathway and subcellular localization analyses were carried out to exclude bacterial proteins involved in human metabolism and to avoid drug cross-reactivity with humans, respectively. The chromosomal replication initiator protein DnaA was identified as a drug target. DnaA is responsible for initiating chromosome replication in bacteria, and its activity is limited to once per cell division cycle. DnaA binds to DnaA boxes, which are high-affinity sites on oriC (the origin of replication), and to other sites on the chromosome with lower affinity. Binding to oriC sites causes the adjacent AT-rich region to unwind, opening the double helix. DnaA then recruits the DnaB helicase, and the replisome is assembled, which effectively initiates DNA replication. In bacteria, initiation of chromosome replication by DnaA is a highly regulated event because the timing of DNA replication is critical for steady-state cell growth (Katayama et al., 2010, Regev et al., 2012). The research was further extended to model the 3D structure of the probable drug target DnaA via Phyre 2 and to identify selective and potent inhibitors using docking studies.
The present study would be useful in designing effective drugs against S. sonnei (strain Ss046). Fig. 1 illustrates the overall strategy used to identify putative drug targets in Shigella sonnei (strain Ss046).

Fig. 1

A schematic representation of the identification of novel drug targets in Shigella sonnei.

Methodology

Data collection of proteome

The complete proteome of Shigella sonnei (strain Ss046) was retrieved from UniProt (https://www.uniprot.org/). UniProt (Universal Protein Resource) is a comprehensive, open-access resource for protein sequences and genome annotations. More than 95% of the protein sequences provided by UniProt are obtained by translating coding sequences (Renaux, 2018).

Paralogs removal

The whole proteome of Shigella sonnei (strain Ss046) was submitted to the CD-HIT suite (http://weizhong-lab.ucsd.edu/cdhit_suite/cgi-bin/index.cgi?cmd=cd-hit). The sequence identity threshold was set at 60%. CD-HIT is widely used for comparing and clustering protein and nucleotide sequences; here it was used to remove paralogous or redundant proteins (Huang et al., 2010).
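The idea behind this step can be illustrated with a toy version of greedy, longest-first clustering, in which each sequence either joins the first representative it resembles at or above the threshold or becomes a new representative. This is only a minimal sketch: `difflib.SequenceMatcher.ratio()` is used as a crude stand-in for sequence identity (an assumption, not CD-HIT's actual measure), and the function names and sequences are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude stand-in for pairwise sequence identity (NOT CD-HIT's measure)."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def greedy_cluster(seqs: dict, threshold: float = 0.60) -> dict:
    """Longest-first greedy clustering; returns representative -> members."""
    clusters = {}
    # Process the longest sequences first so they become representatives.
    for name in sorted(seqs, key=lambda n: len(seqs[n]), reverse=True):
        for rep in clusters:
            if similarity(seqs[name], seqs[rep]) >= threshold:
                clusters[rep].append(name)
                break
        else:
            # No representative was similar enough: start a new cluster.
            clusters[name] = [name]
    return clusters


# Toy, made-up sequences (hypothetical; not taken from the Ss046 proteome).
toy = {
    "protA": "MKTAYIAKQRQISFVKSHFSRQ",
    "protA_paralog": "MKTAYIAKQRQISFVKSHFSR",
    "protB": "GGGGSLPWWCCDDEEHHNNYYT",
}
clusters = greedy_cluster(toy)
non_redundant = list(clusters)  # one representative per cluster
print(non_redundant)
```

Keeping one representative per cluster is what reduces the proteome to a non-redundant set; the real tool reaches the same goal with far faster word-based filtering.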

Screening of essential genes

Essential proteins are required for an organism's survival and are considered the basis of life. Essential proteins were predicted using the Geptop 2.0 online server (http://guolab.whu.edu.cn/geptop/) (Wei et al., 2013). Geptop detects essential genes in pathogenic species by comparing the orthology and phylogeny of query proteins with experimentally established essential gene datasets.

Identification of non-homologous essential genes

The screened essential proteins should not be homologous to those of Homo sapiens. For this purpose, these proteins were searched with BLASTp (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PAGE=Proteins) against the host proteome with an E-value threshold of 10⁻⁴ and query coverage and identity cutoffs of more than 70% and 30%, respectively (Altschul et al., 1990).
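One way to apply such cutoffs to tabular BLAST output is sketched below. It assumes a custom tab-separated layout with the columns `qseqid sseqid pident evalue qcovs`, and it assumes that a protein counts as human-homologous when at least one hit satisfies all three criteria; both are illustrative assumptions, and the hits shown are made up.

```python
import csv
import io

# Assumed column order of the tabular hits (a hypothetical layout choice).
COLUMNS = ["qseqid", "sseqid", "pident", "evalue", "qcovs"]


def human_homologs(tabular_text, max_evalue=1e-4, min_cov=70.0, min_ident=30.0):
    """Return query IDs with at least one hit passing all three cutoffs."""
    homologs = set()
    reader = csv.DictReader(
        io.StringIO(tabular_text), fieldnames=COLUMNS, delimiter="\t"
    )
    for row in reader:
        if (
            float(row["evalue"]) <= max_evalue
            and float(row["qcovs"]) > min_cov
            and float(row["pident"]) > min_ident
        ):
            homologs.add(row["qseqid"])
    return homologs


# Made-up hits for hypothetical essential proteins P1 and P2.
hits = (
    "P1\thumanX\t45.2\t1e-50\t92\n"  # strong hit: treated as homologous
    "P2\thumanY\t28.0\t2e-3\t40\n"   # weak hit: fails every cutoff
)
essential = ["P1", "P2", "P3"]  # P3 has no hit at all
non_homologous = [p for p in essential if p not in human_homologs(hits)]
print(non_homologous)
```

Proteins without any qualifying hit, including those with no hit at all, are the ones carried forward as non-homologous.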

Analysis of metabolic pathways

The non-homologous essential proteins were then subjected to comparative analysis of metabolic pathways to determine the pathways with which they are associated. This analysis identifies drug targets based on pathway enzymes that are common and essential in bacteria (Anishetty et al., 2005). The metabolic pathways of the non-human-homologous essential proteins of Shigella sonnei (strain Ss046) were analyzed with KEGG (https://www.genome.jp/kegg/) (Aoki-Kinoshita & Kanehisa, 2007). Pathways that were unique to Shigella sonnei (strain Ss046) and not found in humans were selected, and the proteins in these unique pathways were retained for further assessment.

Subcellular localization analysis

The prediction of subcellular localization provides a fast and relatively cost-effective way to obtain information about a protein's function. Moreover, proteins can localize to multiple sites, so localization is a critical aspect of designing any therapeutic agent (Goyal & Citu, 2018). The subcellular localization of the target proteins of Shigella sonnei (strain Ss046) was predicted with PSORTb (https://www.psort.org/psortb/) (Yu et al., 2010), an online tool that predicts whether proteins are periplasmic, cytoplasmic membrane, or cytoplasmic proteins.

Structural analysis of target protein

ProtParam (https://web.expasy.org/protparam/) and GOR4 (https://npsa-prabi.ibcp.fr/cgi-bin/npsa_automat.pl?page=/NPSA/npsa_gor4.html) were used for primary and secondary structure analysis, respectively (Garnier et al., 1996, ProtParam, 2017). The presence and location of a signal peptide were predicted with the SignalP-5.0 server (Armenteros et al., 2019). The transmembrane topology of the protein was checked with the TMHMM tool (http://www.cbs.dtu.dk/services/TMHMM/) (Krogh et al., 2001).

Structure prediction and validation

Several online tools, namely Phyre 2, SWISS-MODEL, and RaptorX, were employed to predict the 3D protein structure (Källberg et al., 2014, Kelley et al., 2015, Schwede et al., 2003). The predicted structure of the target protein was visualized in Chimera (Pettersen et al., 2004). The 3D model was refined using the GalaxyWEB server (Ko et al., 2012). To confirm the quality and reliability of the 3D structure, several online tools were applied: ProSA-web to provide a quality score for the structure, the ERRAT server to analyze non-bonded interactions, OPM-PPM to check the spatial orientation in the cell membrane, and RAMPAGE to construct the Ramachandran plot of the 3D structure (Lengths and Angles, 2018, Lomize et al., 2012).
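A Ramachandran plot places each residue according to its backbone dihedral angles phi and psi, so the underlying calculation is a dihedral angle from four atom positions. The following is a minimal, generic sketch of that calculation; it is not the code used by RAMPAGE, and the coordinates in the example are made up.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four 3D points."""
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project the two outer bonds onto the plane perpendicular to the
    # central bond, then measure the signed angle between the projections.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


# Made-up coordinates: a planar cis arrangement and a planar trans one.
cis = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, 1, 0))
trans = dihedral((0, 1, 0), (0, 0, 0), (1, 0, 0), (1, -1, 0))
print(cis, trans)
```

For phi, the four points would be C of the previous residue, N, CA, and C; for psi, they would be N, CA, C, and N of the next residue.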

Molecular docking

The refined structure of the target protein was docked against a phytochemical library. For docking, 5000 phytochemicals were retrieved in 2D conformation from PubChem, MPD3, the ZINC database, and the MAPS database in SDF file format (Ashfaq et al., 2013, Irwin and Shoichet, 2005, Kim et al., 2016, Mumtaz et al., 2017). The ligand library of 5000 phytochemicals was optimized for docking. Optimization entailed adding partial charges with Protonate-3D and minimizing the energy of each compound with the MMFF94X force field. Optimized ligand files were saved in a ligand database, which was later used as the input for docking studies. The Site Finder tool in MOE was used to locate the active sites on the target protein (Inc., 2016). Docking was performed after the solvent and additional ligands were removed from the receptor. The docking parameters were placement: Triangle Matcher; refinement: Forcefield; retain: 10; and rescoring: London dG. Ligands were screened out after confirmation with a receptor molecule.

Physiochemical property profile and toxicity prediction

Molecular descriptors and drug-likeness properties of the phytochemicals with the best docking scores were analyzed using the Molinspiration server, which provides a prediction based on the ‘rule of five’ (Lipinski et al., 1997). The criteria are a logP value below 5, no more than 10 hydrogen bond acceptors, no more than 5 hydrogen bond donors, and a molecular mass below 500 Da. Pharmacokinetic characteristics, i.e., absorption, distribution, metabolism, excretion, and toxicity, were predicted with admetSAR (Yang et al., 2019). In addition, various toxicity models, such as oral acute toxicity, immunotoxicity, organ toxicity, and genetic toxicity endpoints, were applied to the selected molecules using the ProTox-II web server, which predicts the toxicity of chemical compounds across different toxicological endpoints (Banerjee et al., 2018).

Results

This study was designed to identify novel drug targets against Shigella sonnei (strain Ss046). A subtractive genomics approach was used to identify therapeutic proteins necessary for bacterial survival but absent from the host (Fig. 2).

Fig. 2

Summary of subtractive genomics approach employed in this study from core proteome retrieval to essential drug target prediction.

Identification of essential non-homologous genes

The whole proteome of Shigella sonnei (Ss046) retrieved from UniProt contained 4072 proteins. After paralog removal using the CD-HIT suite with a 60% threshold, 3773 non-redundant proteins out of 4072 were obtained. Geptop 2.0 was employed to predict the essential proteins and identified 394 essential proteins out of 3773. To avoid drug interaction with human proteins, the essential proteins were submitted to BLASTp with an E-value threshold of 10⁻⁴, and only 51 proteins were screened as non-homologous to humans.
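The size of each subtraction can be made explicit with a little arithmetic on the reported counts. The sketch below only restates the numbers given above; the step labels and the helper name are illustrative.

```python
# Protein counts reported at each subtractive step of the study.
funnel = [
    ("Whole proteome (UniProt)", 4072),
    ("Non-redundant (CD-HIT, 60%)", 3773),
    ("Essential (Geptop 2.0)", 394),
    ("Non-homologous to human (BLASTp)", 51),
]


def removed_per_step(steps):
    """Number of proteins discarded between consecutive steps."""
    counts = [count for _, count in steps]
    return [before - after for before, after in zip(counts, counts[1:])]


total = funnel[0][1]
for step, count in funnel:
    # Share of the starting proteome that survives up to this step.
    print(f"{step}: {count} proteins ({100 * count / total:.1f}% of proteome)")
print("Removed at each step:", removed_per_step(funnel))
```

On these figures, paralog removal discards 299 sequences, the essentiality filter 3379, and the human-homology filter 343.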

Identification of essential non-homologous proteins involved in unique metabolic pathways

Metabolic pathway analysis showed that these 51 proteins were involved in 29 pathways. Comparison of the S. sonnei pathways with the H. sapiens pathways revealed that 7 metabolic pathways are specific to S. sonnei, while the remaining 22 pathways are common to pathogen and host. Sixteen essential proteins of S. sonnei were associated with these 7 pathways. The distribution of proteins in each unique metabolic pathway is shown in Fig. 3, and the classification of the seven unique metabolic pathways according to biochemical processes is shown in Fig. 4.

Fig. 3

Frequency Distribution of proteins involved in unique metabolic pathways of Shigella sonnei (Strain Ss046).

Fig. 4

Classification of seven identified unique metabolic pathways according to biochemical processes.

These 16 proteins involved in unique metabolic processes were studied individually using the KEGG database. Of these, 15 proteins were involved in one or more unique pathways as well as in pathways common to H. sapiens and S. sonnei (Table 1) and were therefore not considered for further analysis. Proteins present only in unique metabolic pathways can be considered pathogen-specific and may serve as potential vaccine and drug targets. Only one protein, the chromosomal replication initiator protein DnaA, was involved exclusively in a unique metabolic pathway, i.e., the two-component regulatory system, so this protein was considered a potential drug target.

Table 1

Unique metabolic pathways of essential nonhomologous proteins.

Protein Name (ID)	Pathway ID	Common Pathway	Unique Pathway
3-methyl-2-oxobutanoate hydroxymethyltransferase (Q3Z5M6)	ssn00770; ssn01100; ssn01110	Pantothenate and CoA biosynthesis; Metabolic pathways	Biosynthesis of secondary metabolites
Glutamate-1-semialdehyde 2,1-aminomutase (Q3Z5K3)	ssn00860; ssn01100; ssn01110; ssn01120	Porphyrin and chlorophyll metabolism; Metabolic pathways	Biosynthesis of secondary metabolites; Microbial metabolism in diverse environments
Acetyl-coenzyme A carboxylase carboxyl transferase subunit alpha (Q3Z5H3)	ssn01100; ssn01110; ssn00061; ssn01200; ssn01212; ssn00620; ssn00640; ssn01120; ssn01130	Metabolic pathways; Fatty acid biosynthesis; Carbon metabolism; Fatty acid metabolism; Pyruvate metabolism; Propanoate metabolism	Biosynthesis of secondary metabolites; Microbial metabolism in diverse environments; Biosynthesis of antibiotics
1-deoxy-D-xylulose-5-phosphate synthase (Q3Z4Y9)	ssn01100; ssn01120; ssn01130; ssn00730; ssn00900	Metabolic pathways; Thiamine metabolism; Terpenoid backbone biosynthesis	Microbial metabolism in diverse environments; Biosynthesis of antibiotics
Ferrochelatase (Q3Z4S4)	ssn00860; ssn01100; ssn01110	Porphyrin and chlorophyll metabolism; Metabolic pathways	Biosynthesis of secondary metabolites
Phosphate acyltransferase (Q3Z327)	ssn00561; ssn01100; ssn01110	Glycerolipid metabolism; Metabolic pathways	Biosynthesis of secondary metabolites
Enoyl-[acyl-carrier-protein] reductase [NADH] (Q3Z136)	ssn00061; ssn01212; ssn01100; ssn00780; ssn01110; ssn01130	Fatty acid biosynthesis; Fatty acid metabolism; Metabolic pathways; Biotin metabolism	Biosynthesis of secondary metabolites; Biosynthesis of antibiotics
Glutamyl-tRNA reductase (Q3Z0S8)	ssn00860; ssn01100; ssn01110; ssn01120	Porphyrin and chlorophyll metabolism; Metabolic pathways	Biosynthesis of secondary metabolites; Microbial metabolism in diverse environments
Glutamate--tRNA ligase (Q3YZD6)	ssn00860; ssn00970; ssn01100; ssn01110; ssn01120	Porphyrin and chlorophyll metabolism; Aminoacyl-tRNA biosynthesis; Metabolic pathways	Biosynthesis of secondary metabolites; Microbial metabolism in diverse environments
4-hydroxy-tetrahydrodipicolinate synthase (Q3YZ74)	ssn01230; ssn00261; ssn00300; ssn01100; ssn01110; ssn01120; ssn01130	Biosynthesis of amino acids; Lysine biosynthesis; Metabolic pathways	Monobactam biosynthesis; Biosynthesis of secondary metabolites; Microbial metabolism in diverse environments; Biosynthesis of antibiotics
Succinyl-diaminopimelate desuccinylase (Q3YZ81)	ssn01230; ssn00300; ssn01100; ssn01120	Biosynthesis of amino acids; Lysine biosynthesis; Metabolic pathways	Microbial metabolism in diverse environments
Transketolase (Q3YZ88)	ssn01200; ssn01230; ssn01100; ssn01110; ssn01120; ssn01130; ssn00030	Carbon metabolism; Biosynthesis of amino acids; Metabolic pathways; Pentose phosphate pathway	Biosynthesis of secondary metabolites; Microbial metabolism in diverse environments; Biosynthesis of antibiotics
Chromosomal replication initiator protein DnaA (Q3YWB2)	ssn02020		Two-component system
Membrane protein insertase YidC (Q3YWA8)	ssn02024; ssn03070; ssn03060	Protein export	Quorum sensing; Bacterial secretion system
Glycerol-3-phosphate dehydrogenase [NAD(P)+] (Q3YVX3)	ssn00564; ssn01110	Glycerophospholipid metabolism	Biosynthesis of secondary metabolites
Glutamine synthetase (Q3YVA3)	ssn01230; ssn01100; ssn02020; ssn00220; ssn00250; ssn00630; ssn00910; ssn01120	Biosynthesis of amino acids; Metabolic pathways; Arginine biosynthesis; Alanine, aspartate and glutamate metabolism; Glyoxylate and dicarboxylate metabolism; Nitrogen metabolism	Two-component system; Microbial metabolism in diverse environments
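
The selection rule applied to Table 1 can be sketched as a simple filter: keep only the proteins that take part in no pathway shared with the host. The rows below are transcribed from Table 1 by UniProt ID; the constant names are abbreviations introduced here for readability, and the code is an illustration rather than the authors' workflow.

```python
# Pathway names used in Table 1 (abbreviated as constants for readability).
MP = "Metabolic pathways"
PCM = "Porphyrin and chlorophyll metabolism"
FAB = "Fatty acid biosynthesis"
FAM = "Fatty acid metabolism"
CM = "Carbon metabolism"
BAA = "Biosynthesis of amino acids"
LYS = "Lysine biosynthesis"
BSM = "Biosynthesis of secondary metabolites"
MMDE = "Microbial metabolism in diverse environments"
BOA = "Biosynthesis of antibiotics"
TCS = "Two-component system"

# (UniProt ID, common pathways, unique pathways), transcribed from Table 1.
TABLE_1 = [
    ("Q3Z5M6", {"Pantothenate and CoA biosynthesis", MP}, {BSM}),
    ("Q3Z5K3", {PCM, MP}, {BSM, MMDE}),
    ("Q3Z5H3", {MP, FAB, CM, FAM, "Pyruvate metabolism",
                "Propanoate metabolism"}, {BSM, MMDE, BOA}),
    ("Q3Z4Y9", {MP, "Thiamine metabolism",
                "Terpenoid backbone biosynthesis"}, {MMDE, BOA}),
    ("Q3Z4S4", {PCM, MP}, {BSM}),
    ("Q3Z327", {"Glycerolipid metabolism", MP}, {BSM}),
    ("Q3Z136", {FAB, FAM, MP, "Biotin metabolism"}, {BSM, BOA}),
    ("Q3Z0S8", {PCM, MP}, {BSM, MMDE}),
    ("Q3YZD6", {PCM, "Aminoacyl-tRNA biosynthesis", MP}, {BSM, MMDE}),
    ("Q3YZ74", {BAA, LYS, MP},
     {"Monobactam biosynthesis", BSM, MMDE, BOA}),
    ("Q3YZ81", {BAA, LYS, MP}, {MMDE}),
    ("Q3YZ88", {CM, BAA, MP, "Pentose phosphate pathway"}, {BSM, MMDE, BOA}),
    ("Q3YWB2", set(), {TCS}),
    ("Q3YWA8", {"Protein export"},
     {"Quorum sensing", "Bacterial secretion system"}),
    ("Q3YVX3", {"Glycerophospholipid metabolism"}, {BSM}),
    ("Q3YVA3", {BAA, MP, "Arginine biosynthesis",
                "Alanine, aspartate and glutamate metabolism",
                "Glyoxylate and dicarboxylate metabolism",
                "Nitrogen metabolism"}, {TCS, MMDE}),
]

common_pathways = set().union(*(common for _, common, _ in TABLE_1))
unique_pathways = set().union(*(unique for _, _, unique in TABLE_1))
# Candidate targets: proteins that appear in no pathway shared with the host.
pathogen_specific = [uid for uid, common, _ in TABLE_1 if not common]
print(len(TABLE_1), len(common_pathways), len(unique_pathways), pathogen_specific)
```

Run on these rows, the filter recovers the 16 proteins, 22 common pathways, and 7 unique pathways reported in the text, and leaves Q3YWB2 (DnaA) as the only protein without a common pathway.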

A two-component regulatory system consists of a histidine kinase and a response regulator. Bacteria detect and respond to extracellular signals through this system. Such systems enable the cell to adapt its physiology to the prevailing conditions, including by initiating gene expression programs, catalyzing reactions, and altering protein–protein interactions. Such pathways are also involved in regulating the growth and cell cycle of bacteria (Skerker et al., 2005). This pathway is not found in metazoans, including Homo sapiens, so proteins associated with it can be regarded as significant targets for antifungal and antibacterial drug design (Barrett & Hoch, 1998).

Subcellular localization prediction

The subcellular location of DnaA was predicted using PSORTb. The subcellular location of a protein helps in determining its function as well as its potential as a drug target (Goyal & Citu, 2018). PSORTb predicted DnaA to be a cytoplasmic protein. The sequence of DnaA was obtained from UniProt (ID: Q3YWB2), and its primary structure was analyzed with ProtParam. DnaA had a molecular mass of 52550.81 Da and an isoelectric point (pI) of 8.77. Because the isoelectric point is above 7, the protein is basic and carries a net positive charge at neutral pH. The grand average of hydropathy (GRAVY), which describes the hydropathy of the protein, was −0.364; the negative value indicates a hydrophilic protein. The N-terminal amino acid is methionine (Met). The protein is classified as unstable because its instability index (II) is 47.21, above the threshold of 40. The most abundant amino acids in this protein are alanine (A), leucine (L), arginine (R), and asparagine (N). The secondary structure was predicted by GOR4 and PSIPRED and comprised 50.32% alpha helices, 0% beta turns, 9.64% extended strands, and 40.04% random coils (Fig. 6A). The signal peptide probability obtained from SignalP was 0.0127, so no signal peptide was found in the target protein. TMHMM showed that DnaA does not have any transmembrane helices. Thus, DnaA is not an integral membrane protein, and any association with the inner face of the cell membrane would be peripheral.
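
The GRAVY value is the mean Kyte-Doolittle hydropathy over all residues of a sequence, so it can be reproduced in a few lines. The sketch below uses the standard Kyte-Doolittle scale; the peptide in the example is a made-up toy sequence, not the DnaA sequence.

```python
# Kyte-Doolittle hydropathy value for each of the 20 standard residues.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean hydropathy over all residues."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


# Toy peptide (hypothetical); a negative result indicates hydrophilicity.
toy_peptide = "MKRDE"
print(round(gravy(toy_peptide), 3))
```

A negative GRAVY, such as the −0.364 reported for DnaA, means that hydrophilic residues outweigh hydrophobic ones on average.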

Fig. 6

Structural analysis of the DnaA protein: (A) the DnaA protein contains α-helix (50.32%, 235), β-strand (9.64%, 45) and random coil (40.04%, 187); (B) the Ramachandran plot of the refined structure shows 98.1%, 1.9% and 0.0% of residues in the favored, allowed and disallowed regions, respectively; (C) the z-score (−8.04) of the DnaA protein.

As no 3D structure was found in the Protein Data Bank, Phyre 2, RaptorX, and SWISS-MODEL were used to predict the 3D structure of the protein. The structures obtained from these tools were analyzed with Ramachandran plots and compared to select the best model. The Phyre 2 model was selected because it was better than the other models; its confidence level was 100% and its coverage was 66% (Fig. 5). The structure was refined using the Galaxy server, which generated five models. Model 3 was considered the best and was selected for further analysis. The refined model had a Ramachandran favored region of 95.3%, poor rotamers of 0.4%, an RMSD of 0.416, a clash score of 18.5, and a MolProbity score of 2.092. The RAMPAGE server was used to analyze and validate the refined structure by constructing a Ramachandran plot: 98.1% of residues fell in the favored region, 1.9% in the allowed region, and none in the outlier region (Fig. 6B). ProSA-web gave a Z-score of −8.04, which is within the acceptable range (Fig. 6C). In addition, the refined model showed 3 errors with PROCHECK and an overall quality factor (ERRAT score) of 90.3846. These results show that the refined model is of good quality.

Fig. 5

The three-dimensional structure of the Chromosomal replication initiator protein. Orange color represents alpha helixes, purple color represents beta sheets, and grey color represents loops.

The OPM-PPM server predicted that the chromosomal replication initiator protein DnaA of Shigella sonnei (Ss046) is a peripheral protein. Peripheral membrane proteins interact with the lipid bilayer of the cell membrane but do not span it, and therefore do not enter the hydrophobic core of the bilayer. They associate temporarily with the membrane, either by attaching to integral membrane proteins or by penetrating the peripheral regions of the bilayer (Fig. 7). The calculated depth/hydrophobic thickness was 0.9 ± 0.6 Å, the ΔG of transfer was −2.6 kcal/mol, and the tilt angle was 45 ± 7°.

Fig. 7

Topology of DnaA protein predicted by OPM-PPM server.

Molecular docking was carried out in MOE to screen for compounds with the best binding interactions with the receptor. Compounds were analyzed based on RMSD value, ligand-binding residues, and docking score. Out of 5000 docked molecules, the four top-ranked docking poses were chosen: compounds with low scores, an RMSD < 3, and the largest number of residue interactions were selected. The selected phytochemicals exhibited minimum binding energies in the range of −17 to −16 kcal/mol. The minimum binding energy and scoring function of each docked ligand are shown in Table 2. To examine the receptor–ligand interactions of the top-scoring complexes more precisely, the MOE LigX tool was used to analyze 2D plots of these interactions. Docked complexes were then visualized in UCSF Chimera. The LigX interaction diagram of the Riccionidin A/DnaA pocket complex showed good binding with Glu 335 and Thr 174 via hydrogen bonds, with a score of −17.6 kcal/mol (Fig. 8A). LigX interaction diagrams showed that Dothistromin, Fustin, and Morin bound to DnaA with scores of −16.9, −16.6, and −16.4 kcal/mol, respectively, forming hydrogen bonds with the side chains of Ser 331, Arg 432, and Asp 427 (Dothistromin; Fig. 8B), Ser 331 and Arg 432 (Fustin; Fig. 8C), and Glu 335, Ser 331, Arg 432, and Asp 427 (Morin; Fig. 8D).

Table 2

Interaction detail of top four bioactive phytochemicals in the proposed site of DnaA protein.

Sr no	PubChem Id	Chemical name	Docking score (kcal/mol)	RMSD value	Interacting residues
1	441775	Riccionidin A	−17.6	0.8	Glu 335; Thr 174
2	108014	Dothistromin	−16.9	1.3	Ser 331; Arg 432; Asp 427
3	5317435	Fustin	−16.6	1.5	Ser 331; Arg 432
4	5281670	Morin	−16.4	1.3	Glu 335; Ser 331; Arg 432; Asp 427
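
The two screening criteria described for the docking results, a low RMSD and a favorable score, together with the residue tally, can be sketched from the rows of Table 2. This is only an illustration of the filtering logic on the four reported poses, not the MOE workflow itself; the variable names are hypothetical.

```python
from collections import Counter

# Rows transcribed from Table 2: (name, docking score, RMSD, H-bonded residues).
poses = [
    ("Riccionidin A", -17.6, 0.8, ["Glu 335", "Thr 174"]),
    ("Dothistromin", -16.9, 1.3, ["Ser 331", "Arg 432", "Asp 427"]),
    ("Fustin", -16.6, 1.5, ["Ser 331", "Arg 432"]),
    ("Morin", -16.4, 1.3, ["Glu 335", "Ser 331", "Arg 432", "Asp 427"]),
]

# Keep poses under the RMSD cutoff, then rank by score (more negative first).
ranked = sorted((p for p in poses if p[2] < 3.0), key=lambda p: p[1])

# Count how many ligands form a hydrogen bond with each residue.
residue_counts = Counter(res for _, _, _, residues in ranked for res in residues)

for name, score, rmsd, residues in ranked:
    print(f"{name}: score {score}, RMSD {rmsd}, {len(residues)} residues")
print(residue_counts.most_common())
```

In this tally, Ser 331 and Arg 432 are each contacted by three of the four ligands, while Glu 335 and Asp 427 are contacted by two and Thr 174 by one.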

Fig. 8

A: Docked Riccionidin A in complex with DnaA; side-chain atoms of Glu 335 and Thr 174 form hydrogen bonds, shown as green lines. The docked pose of the compound visualized in Chimera is shown at right. B: Docked Dothistromin in complex with DnaA; side-chain atoms of Ser 331, Arg 432, and Asp 427 form hydrogen bonds, shown as green lines. The docked pose of the compound visualized in Chimera is shown at right. C: Docked Fustin in complex with DnaA; side-chain atoms of Ser 331 and Arg 432 form hydrogen bonds, shown as green lines. The docked pose of the compound visualized in Chimera is shown at right. D: Docked Morin in complex with DnaA; side-chain atoms of Glu 335, Ser 331, Arg 432, and Asp 427 form hydrogen bonds, shown as green lines. The docked pose of the compound visualized in Chimera is shown at right.

However, Dothistromin was ranked top owing to its maximum binding score and good binding affinity for the identified binding residues of DnaA. Three of the four phytochemicals, i.e., Dothistromin, Fustin, and Morin, appear to bind the Ser 331 and Arg 432 residues with high affinity, indicating that these are the most active residues.

ADMET/Drug scan results

The drug-likeness of the four selected compounds was predicted with the Molinspiration server, based on Lipinski's rule of five. The selected candidates showed zero violations of the ‘rule of five’ and exhibited drug-like properties (Table 3). All candidate compounds were also assessed for pharmacokinetic properties with the admetSAR server to further validate their drug-likeness (Table 4).

Table 3

Results of compounds examined for Lipinski rule.

Compound	Molecular weight (g/mol)	Number of HBA	Number of HBD	MLogP
Lipinski rule of five	<500	≤10	≤5	<5
Riccionidin A	285.23	6	4	−0.67
Dothistromin	372.29	9	5	1.49
Fustin	288.25	6	4	0.80
Morin	302.24	6	5	1.88
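
The rule-of-five check on the values in Table 3 can be sketched as follows. The sketch assumes the conventional inclusive form of the limits (at most 5 hydrogen bond donors and at most 10 acceptors); under a strict "fewer than 5 donors" reading, Dothistromin and Morin would sit exactly on the boundary. The function name is hypothetical, and the values are transcribed from the table.

```python
# Rows transcribed from Table 3: (name, MW in g/mol, HBA, HBD, MLogP).
compounds = [
    ("Riccionidin A", 285.23, 6, 4, -0.67),
    ("Dothistromin", 372.29, 9, 5, 1.49),
    ("Fustin", 288.25, 6, 4, 0.80),
    ("Morin", 302.24, 6, 5, 1.88),
]


def lipinski_violations(mw, hba, hbd, logp):
    """Count rule-of-five violations using inclusive limits (assumed)."""
    return sum([mw > 500, hba > 10, hbd > 5, logp > 5])


violations = {name: lipinski_violations(*props) for name, *props in compounds}
for name, count in violations.items():
    print(f"{name}: {count} violation(s)")
```

With these limits, all four compounds show zero violations, in line with the result stated for Table 3.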

Table 4

ADMET Profiling Enlisting Absorption, Metabolism and Toxicity related drug-like parameters of candidate compounds.

ADMET Profiling
Compounds	Riccionidin A	Dothistromin	Fustin	Morin
Absorption
Blood-Brain Barrier	+	−	−	−
Human Intestinal Absorption	+	+	+	+
P-glycoprotein substrate	−	−	−	−
Metabolism
CYP450 1A2 Inhibitor	Yes	No	Yes	Yes
CYP450 2C9 Inhibitor	No	No	Yes	Yes
CYP450 2D6 Inhibitor	No	No	No	No
CYP450 2C19 Inhibitor	No	No	No	Yes
CYP450 3A4 Inhibitor	No	No	No	Yes
Distribution
Subcellular localization	Mitochondria	Mitochondria	Mitochondria	Mitochondria
Toxicity
AMES Toxicity	No	No	No	No


Toxicity assessment results

After docking, the four selected molecules were run through various toxicity prediction modules. Table 5 lists the predicted rat oral acute toxicity (LD50, in mg/kg), the predicted toxicity class (I–VI) with its prediction accuracy (%), and the organ toxicity prediction for the liver. The predicted dose-value distributions for the candidate compounds are shown in Fig. 9A–D. Table 6 lists the genotoxicity predictions for cytotoxicity and mutagenicity. All compounds were predicted to be inactive for cytotoxicity, with probability scores of 0.89, 0.76, 0.98, and 0.98 for Riccionidin A, Dothistromin, Fustin, and Morin, respectively. For the mutagenicity endpoint, all compounds were predicted to be inactive, with probability scores of 0.55, 0.56, 0.53, and 0.52, respectively.

Table 5

Prediction of oral acute toxicity, class and accuracy, organ toxicity and genetic toxicity endpoints of candidate compounds (IA: Inactive).

Sr.	Compounds Name	Oral LD50 Value (mg/kg)	Predicted Toxicity Class	Prediction Accuracy (%)	Hepatotoxicity	Probability	Cytotoxicity	Probability
1	Riccionidin A	2991	IV	67.38%	IA	0.77	IA	0.89
2	Dothistromin	3000	V	68.07%	IA	0.75	IA	0.76
3	Fustin	2000	IV	72.9%	IA	0.7	IA	0.98
4	Morin	3919	V	69.26%	IA	0.68	IA	0.98

Fig. 9

Graphical representation of predicted dose value distribution for candidate compounds (A = Riccionidin A; B = Dothistromin; C = Fustin; D = Morin).

Table 6

Prediction of genetic toxicity endpoints of candidate compounds (IA: Inactive).

Sr. No	Compounds name	Cytotoxicity	Probability	Mutagenicity	Probability
1	Riccionidin A	IA	0.89	IA	0.55
2	Dothistromin	IA	0.76	IA	0.56
3	Fustin	IA	0.98	IA	0.53
4	Morin	IA	0.98	IA	0.52

Among the four compounds, Riccionidin A and Fustin were assigned to class IV, i.e., described as toxic if consumed, with prediction accuracies of 67.38% and 72.9%, respectively (Fig. 9A & C). The other two compounds, Dothistromin and Morin, were assigned to class V, i.e., described as possibly toxic if consumed, with prediction accuracies of 68.07% and 69.26%, respectively (Fig. 9B & D). For organ toxicity, all compounds were predicted to be inactive for hepatotoxicity, with probability scores of 0.77, 0.75, 0.7, and 0.68, respectively.

Discussion

Shigella sonnei is a multidrug-resistant (MDR) bacterium (Abbasi et al., 2019). It causes shigellosis, which affects millions of people annually and is responsible for numerous deaths. Mild cases of shigellosis are managed without antimicrobial agents, and recovery is quite rapid. Antibiotics control the spread of the disease within the intestine and shorten its duration. The use of anti-diarrheal medications is not recommended, as they may aggravate the disease (Sati et al., 2019). There is currently no effective vaccine for shigellosis, but frequent and thorough hand washing can prevent person-to-person transmission (Sybilski et al.). Given this life-threatening situation, there is a pressing need to develop drugs against Shigella sonnei. In our study, subtractive proteomics was employed to screen for drug targets against MDR Shigella sonnei. This approach identifies targets by selecting proteins that are both essential to the pathogen and non-homologous to host proteins. Identifying drug targets is a critical step in computer-based drug design (Hosen et al., 2014). Recent advances in bioinformatics and computational biology have produced a variety of approaches to drug design and in silico analysis, reducing the time and expense of the trial-and-error experimentation devoted to drug development (Barh et al., 2011). The whole proteome of Shigella sonnei (Ss046), comprising 4072 proteins, was analyzed with CD-HIT, which eliminated all redundant proteins and yielded a set of 3773 non-redundant proteins. Essential proteins are necessary for bacterial survival: if they are degraded or mutated, the bacteria cannot survive. Targeting these proteins can therefore kill the bacteria and cure the disease, which makes essential proteins preferred targets for vaccine development and antibacterial drugs.
Thus, 394 essential proteins were screened from the non-redundant proteins. Using this method, Shiragannavar et al. identified 807 essential proteins in Eubacterium nodatum, Sakharkar et al. found 306 essential genes in P. aeruginosa, and Chong et al. identified 312 essential proteins in Burkholderia pseudomallei (Chong et al., 2006, Sakharkar et al., 2004, Shiragannavar et al., 2019). Some essential proteins may be homologous to human proteins, and targeting them could interfere with human metabolism and prove fatal. The possibility of cross-reactivity and adverse events can be reduced by selecting non-homologous proteins that are not found in Homo sapiens (Barh et al., 2011). To avoid such undesirable effects and toxicity, we screened 51 non-homologous proteins. A comparative study of human and pathogen metabolic pathways through the KEGG database showed that 7 pathways are specific to the pathogen and 22 pathways are common to both pathogen and host. These 7 pathways include the two-component system, bacterial secretion system, quorum sensing, monobactam biosynthesis, microbial metabolism in diverse environments, and biosynthesis of secondary metabolites. The identification of pathogen-specific pathways is in line with the results reported for L. interrogans, A. baumannii, and S. saprophyticus (Amineni et al., 2010, Goyal et al., 2018, Shahid et al., 2020). A total of 16 essential proteins of S. sonnei were associated with these 7 pathways. Of these 16, 15 proteins were also involved in pathways common to H. sapiens and S. sonnei. Only one protein, DnaA, was involved exclusively in a unique metabolic pathway. DnaA is a ubiquitous protein that initiates the replication of chromosomal DNA at specific sites by unwinding the double-stranded DNA helix. It is crucial for DNA replication in bacteria (Mott and Berger, 2007, Zakrzewska-Czerwińska et al., 2007).
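
The successive filtering steps described in this Discussion can be tallied in a few lines. The following Python sketch is illustrative only: the protein counts are those stated in the text, while the stage labels and the `funnel` function are hypothetical shorthand, and the sketch assumes that each stage is a subset of the previous one, which the text implies but does not state for every step.

```python
# Protein counts reported in the text for S. sonnei strain Ss046.
# Stage labels are shorthand introduced here, not the authors' wording.
STAGES = [
    ("whole proteome", 4072),
    ("non-redundant (CD-HIT)", 3773),
    ("essential", 394),
    ("non-homologous to human", 51),
    ("in pathogen-specific pathways", 16),
    ("in a unique pathway only (DnaA)", 1),
]


def funnel(stages):
    """Return (label, count, removed_at_this_step, percent_of_proteome) rows."""
    total = stages[0][1]
    previous = total
    rows = []
    for label, count in stages:
        if count > previous:
            raise ValueError(f"stage '{label}' is larger than the previous stage")
        rows.append((label, count, previous - count, 100.0 * count / total))
        previous = count
    return rows


for label, count, removed, percent in funnel(STAGES):
    print(f"{label:<34}{count:>6}  removed: {removed:>5}  ({percent:.2f}% of proteome)")
```

Laid out this way, the largest reduction occurs at the essentiality filter, and the final candidate is a single protein out of the 4072 in the starting proteome.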
DnaA is evidently crucial for chromosome replication and for the survival of Shigella sonnei (Ss046), so it could be a potential drug target against this pathogen. DNA replication proteins are necessary for cell viability and are therefore attractive targets for drug design. Different tools were applied to determine the sequence and structural features as well as the function and localization of the protein. DnaA was predicted by PSORTb to be a cytoplasmic protein. Its crucial role in maintaining cell viability makes it an attractive drug target. The molecular weight of the protein was 55426.35 Da (about 55.4 kDa). According to its theoretical pI, DnaA is basic, which may favor stable interactions at physiological pH. The GRAVY value was −0.364; the negative value indicates a hydrophilic nature, which favors interaction with neighboring water molecules. The tertiary structure of the putative drug target protein was predicted, evaluated, and validated. Molecular docking was carried out to find the compounds exhibiting the best residue interactions with the target protein. Out of 5000 docked molecules, the four top-ranked molecules (Riccionidin A, Dothistromin, Fustin, and Morin) were selected based on a low score (i.e., RMSD < 3) and a large number of interacting residues. These phytochemicals exhibited minimum binding energies of approximately −17 to −16 kcal/mol. The molecular profile and drug-likeness of these four compounds were assessed using Lipinski's rule of five. All of them satisfied the rule with no violations. The compounds were then evaluated for blood–brain barrier (BBB) penetration, human intestinal absorption (HIA), and Ames mutagenicity. Predicted ADMET properties are an important indicator of the behavior, toxicity, and fate of a drug candidate in the human body.
ADMET prediction estimates a candidate's intestinal absorption, metabolism, blood–brain barrier penetration, and subcellular localization and, most importantly, the level of harm it can cause in the body (Lin et al., 2003). The cytochrome P450 superfamily comprises isoforms such as CYP2A6, CYP1A2, CYP2C9, CYP2D6, CYP2C19, CYP3A4, and CYP2E1, which are involved in drug metabolism and hepatic clearance (Vasanthanathan et al., 2009). Inhibiting cytochrome P450 isoforms can therefore cause drug-drug interactions, in which the metabolism of concomitant drugs is hindered and they accumulate to toxic levels (Lynch & Price, 2007). The ADMET profiles indicate that none of these compounds has adverse effects on absorption. In addition, all compounds were predicted to be non-toxic and non-mutagenic in the Ames test. The four hit compounds obtained from virtual screening were also run through various toxicity modules, and none was predicted to be cytotoxic, hepatotoxic, or mutagenic. Our study thus identified four lead compounds that could serve as therapeutic inhibitors of DnaA by targeting the initiation of chromosomal replication.
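
The GRAVY value mentioned above is the mean Kyte-Doolittle hydropathy of all residues in a sequence, so a negative value indicates a predominantly hydrophilic protein. As a minimal sketch in Python (the function name `gravy` and the example peptide are hypothetical, and this is not the tool used in the study), it can be computed as follows:

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: sum of residue hydropathies / length."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence is empty")
    unknown = sorted(set(seq) - set(KYTE_DOOLITTLE))
    if unknown:
        raise ValueError(f"non-standard residues: {unknown}")
    return sum(KYTE_DOOLITTLE[residue] for residue in seq) / len(seq)


# Hypothetical short peptide used only to exercise the function;
# it is NOT the DnaA sequence analyzed in the study.
example_peptide = "MKTAYIAKQR"
print(f"GRAVY = {gravy(example_peptide):.3f}")
```

Applying the same function to the full DnaA sequence of strain Ss046 would be expected to give a value close to the reported −0.364, provided the same hydropathy scale was used.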

Conclusion

In recent years, diarrhea-causing S. sonnei has become multidrug-resistant. Recognizing the importance of identifying potent drug targets, this study carried out a computational analysis of the human pathogen S. sonnei (strain Ss046) to find a potential target using several computational tools. In the first phase of the study, one protein target was identified based on pathways that are unique to the pathogen and absent from H. sapiens. In the second phase, structural analysis of the putative target DnaA was performed, and the resulting structure was used for a docking study. Four compounds, namely Riccionidin A, Dothistromin, Fustin, and Morin, exhibited high binding affinity for the binding pocket of the DnaA protein. Thus, this study represents a significant advance in the design of new, potent compounds against S. sonnei.

Funding

Taif University Researchers Supporting Project number (TURSP-2020/258), Taif University, Taif, Saudi Arabia

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

53 in total

1. A novel genomics approach for the identification of drug targets in pathogens, with special reference to Pseudomonas aeruginosa.

Authors: Kishore R Sakharkar; Meena K Sakharkar; Vincent T K Chow
Journal: In Silico Biol Date: 2004

2. In silico analysis of Burkholderia pseudomallei genome sequence for potential drug targets.

Authors: Chan-Eng Chong; Boon-San Lim; Sheila Nathan; Rahmah Mohamed
Journal: In Silico Biol Date: 2006

3. RaptorX server: a resource for template-based protein structure modeling.

Authors: Morten Källberg; Gohar Margaryan; Sheng Wang; Jianzhu Ma; Jinbo Xu
Journal: Methods Mol Biol Date: 2014

4. SignalP 5.0 improves signal peptide predictions using deep neural networks.

Authors: José Juan Almagro Armenteros; Konstantinos D Tsirigos; Casper Kaae Sønderby; Thomas Nordahl Petersen; Ole Winther; Søren Brunak; Gunnar von Heijne; Henrik Nielsen
Journal: Nat Biotechnol Date: 2019-02-18 Impact factor: 54.908

5. Application of a subtractive genomics approach for in silico identification and characterization of novel drug targets in Mycobacterium tuberculosis F11.

Authors: Md Ismail Hosen; Arif Mohammad Tanmoy; Deena-Al Mahbuba; Umme Salma; Mohammad Nazim; Md Tariqul Islam; Sharif Akhteruzzaman
Journal: Interdiscip Sci Date: 2014-01-28 Impact factor: 2.233

6. Classification of cytochrome P450 1A2 inhibitors and noninhibitors by machine learning techniques.

Authors: Poongavanam Vasanthanathan; Olivier Taboureau; Chris Oostenbrink; Nico P E Vermeulen; Lars Olsen; Flemming Steen Jørgensen
Journal: Drug Metab Dispos Date: 2008-12-04 Impact factor: 3.922

Review 7. Regulation of the initiation of chromosomal replication in bacteria.

Authors: Jolanta Zakrzewska-Czerwińska; Dagmara Jakimowicz; Anna Zawilak-Pawlik; Walter Messer
Journal: FEMS Microbiol Rev Date: 2007-04-25 Impact factor: 16.408

8. Travelers' Diarrhea: An Update on the Incidence, Etiology, and Risk in Military Deployments and Similar Travel Populations.

Authors: Chad K Porter; Scott Olson; Alexis Hall; Mark S Riddle
Journal: Mil Med Date: 2017-09 Impact factor: 1.437

9. Identification of novel drug targets for humans and potential vaccine targets for cattle by subtractive genomic analysis of Brucella abortus strain 2308.

Authors: Araf Mahmud; Md Tahsin Khan; Asif Iqbal
Journal: Microb Pathog Date: 2019-09-08 Impact factor: 3.738

10. PubChem Substance and Compound databases.

Authors: Sunghwan Kim; Paul A Thiessen; Evan E Bolton; Jie Chen; Gang Fu; Asta Gindulyte; Lianyi Han; Jane He; Siqian He; Benjamin A Shoemaker; Jiyao Wang; Bo Yu; Jian Zhang; Stephen H Bryant
Journal: Nucleic Acids Res Date: 2015-09-22 Impact factor: 16.971