Bg_9562 is a potential broad-spectrum antifungal effector protein derived from the bacterium Burkholderia gladioli strain NGJ1 and is effective against Rhizoctonia solani, the causal agent of sheath blight in rice. In the present study, in vitro antifungal assays showed that Bg_9562 was effective at 35 °C and 45 °C but ineffective under both highly acidic (pH 3.0) and alkaline (pH 9.5) conditions. Compatibility studies with the native bioagents Trichoderma asperellum TAIK1 and Bacillus subtilis BIK3 indicated that Bg_9562 was compatible with both. A field study using foliar spray of the Bg_9562 protein indicated the need to formulate the protein before application. In silico analysis predicted that Bg_9562 has 111 amino acid residues (46 hydrophobic, 12 negatively charged and 8 positively charged residues) with a high aliphatic index of 89.92, consistent with its thermostability, and an estimated half-life of 30 h. Bg_9562 (C491H813N137O166S5) had a protein-binding potential of 1.27 kcal/mol, with a good possibility of interacting with and perturbing the membrane, the main target of antimicrobial proteins. The secondary structure was predominantly random coil, and the best 3D model of Bg_9562 was predicted using an ab initio method with Robetta and AlphaFold 2. The predicted binding ligands were nucleic acids and zinc, with confidence scores of 0.07 and 0.05, respectively. The N-terminal region (residues 1 to 14) and C-terminal region (residues 101 to 111) of Bg_9562 were predicted to be disordered. The stability and binding properties identified in these studies would help to encapsulate Bg_9562 in a suitable carrier to maintain its efficacy and improve delivery against Rhizoctonia solani in the challenging rice ecosphere.
Keywords:
antifungal protein; biological control; in silico analysis; protein modeling; sheath blight
Managing plant pathogens with alternatives to chemical pesticides is, in the present era of consumer awareness and climate change, a necessity rather than an option. Pesticide residues are a major concern in domestic and export markets, affecting the overall income and livelihood of farmers. In this context, antagonistic microbes offer a wide range of inputs for alternative, eco-friendly strategies of pest and disease management [1]. Antimicrobial metabolites and proteins from microbes offer numerous opportunities to manage plant pathogens effectively, provided they are used appropriately [2]. These compounds are highly sensitive to environmental conditions, in particular temperature, light and moisture, and therefore need special formulations to remain viable and effective during storage, handling and field application [3]. There are several reports on the use of formulated antifungal proteins for the treatment of human infections, mostly from the perspective of drug discovery and targeted delivery against human pathogens [4]. Antifungal (AF) proteins are generally classified into groups based on their mode of action. The first group, which is generally amphipathic in nature, acts by lysis [5], while the second group interferes with cell wall synthesis or the biosynthesis of essential components such as glucan or chitin [6,7]. The use of fungal- and bacterial-derived antifungal proteins for the management of plant pathogens has been extensively reviewed [8].
Rice (Oryza sativa L.) is one of the two principal food crops in India. India is the second-largest producer, consumer and exporter, with an average production of 117.47 mT in an area of 43.79 mha [9]. The average yield losses in rice due to pests and diseases vary between 10–30% depending on the severity of the stress factors involved [10].
Among other diseases, sheath blight is one of the most destructive, leading to a significant yield reduction in rice [11]. Sheath blight of rice is caused by the soil-borne fungus Rhizoctonia solani (RS) (Basidiomycetes; teleomorph: Thanatephorus cucumeris), which attacks more than 100 host plants [11]. The fungus is divided into 14 anastomosis groups (AG1 to AG13 and AGBI), among which the subgroup AG1 IA is the most common cause of disease in at least 27 families of monocots and dicots [12,13]. Currently, the disease is largely managed by the application of chemical fungicides [14]. With increasing concern over pesticide residues in crop products and soils, non-chemical disease control is becoming increasingly popular [15]; in addition, an integrated approach using biocontrol agents would reduce the chances of pathogens evolving resistance to fungicides [16].
Recently, a prophage tail-like protein produced by the bacterium Burkholderia gladioli strain NGJ1, named Bg_9562, was demonstrated to have mycophagous activity against RS, acting as a Type III secretion system (T3SS; injectisome) effector protein for entry into the fungal hyphae [17]. Bg_9562 and its various orthologs were found to harbor a conserved phage_TAC_7 superfamily domain, and no known toxic or lytic domain was detected in any of them [17]. Although the protein is very effective against RS, its physicochemical properties, 3D structure, mode of action as an antifungal agent and translocation in the plant system remain to be investigated. To address this, the current study characterizes Bg_9562 through wet-lab experiments with RS, coupled with in silico approaches to predict the 3D structure of the protein and gain insight into its structural and functional properties.
We obtained the purified protein from NIPGR (National Institute of Plant Genome Research, New Delhi, India) and conducted the wet-lab in vitro assays and field studies. The physicochemical properties, transmembrane regions, and thermal stability-related residues of the protein were determined in silico and correlated with wet-lab experiments and bioactivity assays of Bg_9562 on RS. The interaction of Bg_9562 with potential native biocontrol agents Trichoderma asperellum TAIK1 and Bacillus subtilis BIK3 [18] was studied, which helps in the use of this protein along with the biocontrol agents. In summary, this study aims to identify the antimicrobial properties of Bg_9562 for its effective usage as a biocontrol agent to fight against sheath blight disease of rice.
2. Results
2.1. Antifungal Efficacy of Bg_9562 on RS
The purified protein suppressed the growth of RS in vitro compared with the control. When the protein and R. solani were tested under different temperature and pH conditions, the protein remained effective at both temperatures (35 °C and 45 °C), restricting mycelial growth even at the higher temperature, which indicates its thermal stability (Figure 1). However, it was inactive and unable to restrict mycelial growth at either acidic (pH 3.0) or alkaline (pH 9.5) conditions, while remaining effective at neutral pH (Table S1).
Figure 1
Antifungal activity of Bg_9562 protein at varied temperatures and incubation periods (R. solani mycelial discs): (A,B) 6 h incubation at 45 °C, 35 °C; (C,D) 12 h incubation at 45 °C, 35 °C; (E,F) 24 h incubation at 45 °C, 35 °C; (G) untreated protein (positive control); (H) R. solani (negative control). Observations taken after 24 h.
2.2. In Vitro Compatibility of Bioagents, Trichoderma asperellum TAIK1 and Bacillus subtilis BIK3, with Bg_9562
In vitro studies indicated that Bg_9562 had no adverse effect on the growth of these bioagents at either temperature (35 °C, 45 °C) or at any incubation period (6 h, 12 h and 24 h); both bioagents grew normally (Figure 2 and Figure 3). No growth inhibition was observed in the antagonistic microbes tested, even after 7 days of incubation.
Figure 2
Interaction of Bg_9562 protein and TAIK1 mycelial discs at varied temperatures and incubation periods: (A,B) 6 h incubation at 45 °C, 35 °C; (C,D) 12 h incubation at 45 °C, 35 °C; (E,F) 24 h incubation at 45 °C, 35 °C; (G) untreated protein (positive control); (H) TAIK1 (negative control); no effect of protein on mycelial growth. Observations taken after 24 h.
Figure 3
Interaction of Bg_9562 protein and BIK3 colony discs at varied temperatures and incubation periods: (A,B) 6 h incubation at 45 °C, 35 °C; (C,D) 12 h incubation at 45 °C, 35 °C; (E,F) 24 h incubation at 45 °C, 35 °C; (G) untreated protein (positive control); (H) BIK3 (negative control); no effect of protein on bacterial growth. Observations taken after 24 h.
2.3. Assessment of Bg_9562 Protein Activity under Field Conditions
Field data revealed that direct foliar treatment of plants with Bg_9562 did not manage disease incidence effectively, measured as percentage disease index (PDI), in comparison to the positive control (Table S2). These results suggest that direct foliar spray of the unformulated protein may be ineffective in controlling the disease; however, formulating the protein along with compatible biocontrol agents may be effective against RS.
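The PDI calculation can be sketched as follows. This is a minimal illustration, assuming the commonly used form of the index (sum of all disease ratings divided by the number of observations times the maximum grade, expressed as a percentage); the rating scale, the function name `percent_disease_index` and the example scores are hypothetical and are not taken from Table S2.

```python
def percent_disease_index(ratings, max_grade):
    """Percentage disease index from per-plant disease ratings.

    Assumed formula: 100 * sum(ratings) / (n_observations * max_grade).
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 0 or r > max_grade for r in ratings):
        raise ValueError("ratings must lie between 0 and max_grade")
    return 100.0 * sum(ratings) / (len(ratings) * max_grade)


# Hypothetical scores for five plants on an assumed 0-9 rating scale.
example_ratings = [0, 1, 3, 5, 9]
example_pdi = percent_disease_index(example_ratings, max_grade=9)
print(f"PDI = {example_pdi:.1f}%")
```

A lower PDI in a treated plot than in the untreated control would indicate disease suppression under this convention.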
2.4. Amino Acid Composition in Bg_9562
The Bg_9562 protein sequence contains 111 amino acids (Table S3), as predicted using the Expasy-translate online application (Table S4). In silico analysis of the amino acid composition revealed that alanine is the most abundant residue in Bg_9562 (14.4%), while cysteine (0.9%) and histidine (0.9%) are the least abundant (Table S5). Two aromatic amino acids, tryptophan and tyrosine, were completely absent. The total number of negatively-charged residues (Asp + Glu) was 12 and the total number of positively-charged residues (Arg + Lys) was 8, while the total number of hydrophobic amino acids (Ile + Val + Leu + Phe + Cys + Met + Ala + Tyr) was 46.
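The composition figures above are simple residue counts. The sketch below shows how such percentages and charged-residue totals can be derived from a one-letter sequence; the function names and the toy peptide are hypothetical, and the actual Bg_9562 sequence (Table S3) is not reproduced here. For a 111-residue protein, 14.4% corresponds to 16 residues and 0.9% to a single residue.

```python
from collections import Counter


def composition_percent(seq):
    """Percentage of each amino acid in a one-letter protein sequence."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def charged_counts(seq):
    """Return (negative, positive) counts as Asp + Glu and Arg + Lys."""
    seq = seq.upper()
    negative = seq.count("D") + seq.count("E")
    positive = seq.count("R") + seq.count("K")
    return negative, positive


# Toy peptide used only for illustration (not Bg_9562).
toy = "MADEKRAAL"
toy_composition = composition_percent(toy)
toy_negative, toy_positive = charged_counts(toy)
print(round(toy_composition["A"], 2), toy_negative, toy_positive)
```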
2.5. Physiochemical Properties of Bg_9562 Determined Using ProtParam Tool
The molecular weight of Bg_9562 was predicted to be 11.5 kDa and its theoretical isoelectric point was calculated as 4.65. The total net charge was estimated as −3.75 at neutral pH, indicating that Bg_9562 is acidic in nature. Further, Bg_9562 was predicted to be a stable protein, with an instability index of 21.19 (values below 40 are classified as stable) (Table 1). The aliphatic index of Bg_9562 was estimated to be as high as 89.92. The GRAVY (grand average of hydropathicity) index of Bg_9562 was predicted to be −0.012, indicating the solubility of the protein and its hydrophilic nature. The estimated half-life of Bg_9562 was 30 h (mammalian reticulocytes, in vitro). The extinction coefficient at 280 nm in water was estimated as 0 M−1 cm−1, consistent with the absence of tryptophan and tyrosine. Because Bg_9562 contains no tryptophan residues, the computed extinction coefficient could be in error by more than 10%.
Table 1
List of physicochemical properties of Bg_9562 protein using ProtParam tool.
| S. No | Parameters | Values/Scores |
| --- | --- | --- |
| 1 | Number of amino acids | 111 |
| 2 | Molecular weight (Da) | 11,451.97 |
| 3 | Theoretical isoelectric point | 4.65 |
| 4 | Total number of negatively-charged residues (Asp + Glu) | 12 |
| 5 | Total number of positively-charged residues (Arg + Lys) | 8 |
| 6 | Total number of hydrophobic amino acids (Ile + Val + Leu + Phe + Cys + Met + Ala + Tyr) | 46 |
| 7 | Total number of atoms | 1612 |
| 8 | Formula | C491H813N137O166S5 |
| 9 | Instability index | 21.19 |
| 10 | Aliphatic index | 89.92 |
| 11 | Grand average of hydropathicity index (GRAVY) | −0.012 |
| 12 | Estimated half-life (mammalian reticulocytes, in vitro) | 30 h |
| 13 | Estimated half-life (yeast, in vitro) | >20 h |
| 14 | Estimated half-life (Escherichia coli, in vivo) | >10 h |
| 15 | Extinction coefficient | 0 M−1 cm−1 |
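Two of the indices in Table 1 follow directly from the amino acid composition. The sketch below assumes the standard definitions: the aliphatic index as X(Ala) + 2.9·X(Val) + 3.9·(X(Ile) + X(Leu)), with X in mole percent, and GRAVY as the mean Kyte–Doolittle hydropathy over all residues. The function names and the toy peptide are hypothetical; reproducing the reported values of 89.92 and −0.012 would require the full Bg_9562 sequence, which is not given here.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean Kyte-Doolittle value."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = seq.upper()

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Toy peptide for illustration only (not Bg_9562).
toy = "AVIL"
toy_gravy = gravy(toy)
toy_aliphatic = aliphatic_index(toy)
print(round(toy_gravy, 3), round(toy_aliphatic, 1))
```

A sequence rich in Ala, Val, Ile and Leu yields a high aliphatic index, which is the property the text associates with thermostability.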
2.6. Physiochemical Properties of Bg_9562 Determined Using Protscale
The details of other important physicochemical properties of Bg_9562 are provided in Table S6. These were predicted using the Expasy-Protscale tool (Table S4) with the default window size of 9, and the values obtained were normalized to fit in the range of 0 to 1.
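The rescaling to the 0 to 1 range can be illustrated with a generic min–max normalization. This is only a sketch: the text does not state whether the rescaling was applied to the per-residue scale or to the windowed profile, and the function name `min_max_normalize` and the example values are hypothetical.

```python
def min_max_normalize(values):
    """Linearly rescale a list of numbers so that min -> 0 and max -> 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant profile carries no contrast; map everything to 0.
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical raw scores spanning a symmetric range.
raw_scores = [-4.5, 0.0, 4.5]
normalized_scores = min_max_normalize(raw_scores)
print(normalized_scores)
```

Normalization of this kind makes profiles computed on different scales comparable on a common axis, at the cost of losing the original units.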
2.7. Bg_9562 Properties Determined Using APD
Using the predictive tools available at the Antimicrobial Peptide Database v2.34 (APD), Bg_9562 was predicted to have antimicrobial activity (Table 2). The Boman index [19] estimated the protein-binding potential of Bg_9562 as 1.27 kcal/mol, which indicates the potential affinity of Bg_9562 for other proteins in the cell. A higher Boman index value (>2.48) indicates that a given antimicrobial protein will be multifunctional or will play a variety of roles within the cell due to its ability to interact with a wide range of proteins, while a low or negative Boman index value indicates a less effective antimicrobial protein. Another APD tool uses the Wimley–White scale for interfacial insertion to estimate how favorably the amino acids contribute to the partitioning of a peptide into the membrane interface. Interestingly, Bg_9562 displayed a high Wimley–White whole-residue hydrophobicity value of 26.18 kcal/mol, indicating that it is hydrophobic and has a good possibility of interacting with and perturbing the membrane, which is the main target of many antimicrobial proteins. A hydropathy plot drawn using the Kyte and Doolittle scale (Table S4; Figure 4A) predicted hydrophobic regions with a single transmembrane region in Bg_9562. Similar results were obtained with the Kyte and Doolittle hydrophobicity estimated using the Protscale tool (Table S6), indicating the hydrophobic nature of Bg_9562. The transmembrane nature of Bg_9562 was assessed using the TMPred (Transmembrane Predictor) software and the TMHMM-2.0 server (prediction of transmembrane helices in proteins). Both tools detected a single transmembrane region in Bg_9562 between amino acid positions 83 and 102 (Figure 4B,C).
Table 2
List of various parameters of Bg_9562 predicted using Antimicrobial Peptide Database (APD)-based prediction.
| S. No. | Parameter | Values |
| --- | --- | --- |
| 1 | Protein binding potential (Boman index) | 1.27 kcal/mol |
| 2 | The Wimley–White whole-residue hydrophobicity of the peptide | 26.18 kcal/mol |
| 3 | APD defined total hydrophobic ratio of protein | 41% |
| 4 | Total net charge | −3.75 |
| 5 | Antimicrobial activity | Yes |
Figure 4
Hydropathy plot constructed for Bg_9562 protein and prediction of the transmembrane region of Bg_9562. (A) Kyte–Doolittle hydropathy plot constructed with 103 effective amino acids and a window size of 9. The peaks above the red line (~1.8) in the graph indicate possible transmembrane regions in the protein. (B) The output of TMPred analysis shows the presence of a single strong transmembrane helix spanning residues 83–102 of Bg_9562 protein. (C) Prediction of the transmembrane region with high probability using TMHMM software.
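The windowing behind a plot such as Figure 4A can be sketched as follows. This is a minimal illustration, assuming the profile is a plain unweighted mean of Kyte–Doolittle values over each 9-residue window; the function name `hydropathy_profile` and the placeholder sequence are hypothetical, and the actual Bg_9562 sequence is not reproduced here. A sequence of length 111 yields 111 − 9 + 1 = 103 windows, which matches the 103 effective amino acids noted in the caption.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean Kyte-Doolittle hydropathy for every full window along seq."""
    seq = seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


# Placeholder sequence of the same length as Bg_9562 (111 residues).
placeholder = "A" * 111
profile = hydropathy_profile(placeholder, window=9)
print(len(profile))
```

Each profile value is conventionally plotted at the central residue of its window, so the first and last four positions of the sequence carry no score at a window size of 9.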
2.8. Propensity of Crystallization
Bg_9562 was predicted to crystallize with a confidence score of 0.507 by the CRYSTALP2 webserver. This server bases its predictions on the composition and collocation of amino acids, the isoelectric point and the hydrophobicity, all estimated from the primary sequence of the given protein. The PPCpred webserver (predictor of protein production, purification and crystallization) returned a crystallization propensity of 0.23. Because a PPCpred score above 0.4 indicates a protein that is likely to crystallize, this value suggests that Bg_9562 would be difficult to crystallize.
2.9. Sub-Program Sorting Using PSORT Tool
The possible cleavage site of Bg_9562 was estimated between amino acid residues 56 and 57. No N-terminal signal peptide and no endoplasmic reticulum retention site were detected, and no peroxisomal targeting signal was present in the C-terminal region. Likewise, there was no possible vacuolar targeting motif and no RNA-binding motif. Similarly, there was no motif for transport from the cell surface to the Golgi, and no N-myristoylation pattern was found. Therefore, the κ-nearest neighbor (κ-NN) algorithm was used to assess the probability of localization at each candidate site, where κ is a predefined integer parameter. The PSORT tool (Protein Subcellular Localization Prediction tool) employs a two-fold κ-NN, in which two different κ values (κ1 < κ2) are used, localization sites are classified into two categories according to their data size, and the prediction probability for each localization corresponds to its share of the κ nearest data points. The κ nearest data points comprised nuclear proteins (47.8%), cytoplasmic proteins (34.8%), mitochondrial proteins (8.7%) and cytoskeletal proteins (8.7%). Bg_9562 was therefore predicted to be localized in the nucleus with a probability of 47.8%.
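The voting step of a κ-NN classifier can be sketched as follows. This is a simplified, single-κ illustration rather than the two-fold scheme used by PSORT, and the function name `knn_class_probabilities` is hypothetical. The neighbor counts (11, 8, 2 and 2 out of 23) are an assumption chosen only because their shares reproduce the reported percentages; the actual κ used is not stated in the text.

```python
from collections import Counter


def knn_class_probabilities(neighbor_labels):
    """Class probabilities as the share of each label among k neighbors."""
    k = len(neighbor_labels)
    if k == 0:
        raise ValueError("at least one neighbor is required")
    counts = Counter(neighbor_labels)
    return {label: n / k for label, n in counts.most_common()}


# Assumed labels of the nearest neighbors (illustrative counts only).
neighbors = (
    ["nuclear"] * 11
    + ["cytoplasmic"] * 8
    + ["mitochondrial"] * 2
    + ["cytoskeletal"] * 2
)
probs = knn_class_probabilities(neighbors)
predicted_site = max(probs, key=probs.get)
for site, p in probs.items():
    print(f"{site}: {100 * p:.1f}%")
```

Because the top class holds less than half of the votes, a prediction of this kind is best read as the most likely of several candidate sites rather than a confident assignment.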
2.10. Secondary Structure of Bg_9562
Secondary structure prediction of Bg_9562 using various methods of the NPS@ (Network Protein Sequence @nalysis) server estimated that there is a dominance of coiled structural components followed by helices and extended strands in Bg_9562 (Table 3). Similar results were also obtained using PSIPRED 4.0 workbench (PSIBLAST-based secondary structure PREDiction) (Figure 5A,B).
Table 3
Secondary structure of Bg_9562 predicted using NPS@ server.
| Secondary Structure | DSC * | HNN * | MLRC * | PHD * | PREDATOR | Secondary Consensus |
| --- | --- | --- | --- | --- | --- | --- |
| Alpha helix | 55.86% | 27.3% | 26.13% | 34.23% | 22.52% | 36.04% |
| Extended strand | 3.6% | 7.21% | 9.91% | 16.22% | 3.6% | 9.91% |
| Random coil | 40.54% | 65.77% | 63.96% | 49.55% | 73.87% | 52.25% |
* DSC: Discrimination of Protein Secondary structure Class. * HNN: Hierarchical Neural Network method. * MLRC: Multivariate Linear Regression Combination. * PHD: Prediction Heidelberg secondary structure prediction method.
Figure 5
(A) Prediction of secondary structure of Bg_9562 using the PSIPRED 4.0 workbench. (B) Annotation grid showing the structure and function of each residue of Bg_9562 protein using PSIPRED 4.0. (C) 3D structure model of Bg_9562 developed using Robetta. (D) Validation of the 3D structure model using a Ramachandran plot.
2.11. 3D Structure Modeling of Bg_9562
The 3D structure of Bg_9562 is not available in the Protein Data Bank (PDB) (http://www.rcsb.org/pdb, accessed on 24 January 2022). Because no suitable structural template was available, homology modeling was not successful: the comparative/homology modeling web servers PHYRE 2.0 (Protein Homology/analogY Recognition Engine), HHpred, RAPTORX, (PS)2-v2 and SWISS MODEL generated low-confidence models, and their Verify 3D values and overall quality factors (ERRAT) were not within the accepted range (Table S7).
To obtain a high-quality structure of Bg_9562, ab initio and threading approaches were deployed. Using these approaches, five models were generated through LOMETS (Local Meta-Threading Server), ten through SPARKS, five through QUARK and five through Robetta. The models generated through LOMETS and SPARKS did not pass validation with Verify 3D, ERRAT and the Ramachandran plot. The most appropriate approach was ab initio modeling using Robetta and QUARK. The models generated by Robetta and QUARK were similar in Verify 3D and ERRAT scores but differed in the Ramachandran plot evaluation (Table S7). Model four of Robetta displayed acceptable values for Z-score (−7.4), QMEAN4 (−1.42), Verify 3D (100%) and ERRAT (98.05%), with 87.4% of residues in the favored regions of the Ramachandran plot. We selected this as the best model for the 3D structure of Bg_9562 (Figure 5C,D). For this predicted 3D structure, the COACH tool (consensus approach to protein–ligand binding site prediction) predicted nucleic acids and zinc as binding ligands, with low confidence scores (C-scores) of 0.07 and 0.05, respectively. The C-score defines the confidence of a COACH prediction and ranges from 0 to 1, where a higher score indicates a more reliable prediction.
The consensus binding residues of Bg_9562 were 38, 67, 78, 90 and 91 for nucleic acids and 82 and 85 for zinc.
AlphaFold 2 also predicted a high-confidence structure of Bg_9562. The three outputs of AlphaFold 2 are shown in Figure 6. The first output is the set of 3D coordinates of Bg_9562 (Figure 6A). The second output is a per-residue confidence metric, called pLDDT, which corresponds to the model’s predicted score on the LDDT-Cα metric and ranges from 0 to 100 (Figure 6B). The Local Distance Difference Test (LDDT) score is computed here using only distances between Cα atoms in the protein model. The model obtained for Bg_9562 was of high confidence, possessing regions with pLDDT > 90. The third output is the Predicted Aligned Error, which is used to assess confidence in the domain packing and large-scale topology of the protein. The dark green color indicated low error in the predicted domains and relative orientations of Bg_9562 (Figure 6C).
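Per-residue pLDDT values can be summarized directly from a predicted model file. The sketch below assumes a PDB-format file in which pLDDT is written to the B-factor column (columns 61 to 66) of each ATOM record, which is the usual convention for AlphaFold output. The function names, the helper `make_ca_line` and the three demo residues are hypothetical and do not represent the Bg_9562 model.

```python
def ca_plddt(pdb_text):
    """Collect the B-factor field (pLDDT) of every C-alpha ATOM record."""
    values = []
    for line in pdb_text.splitlines():
        # Atom name occupies columns 13-16, B-factor columns 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            values.append(float(line[60:66]))
    return values


def summarize_plddt(values, threshold=90.0):
    """Return (mean pLDDT, fraction of residues above the threshold)."""
    mean = sum(values) / len(values)
    fraction = sum(v > threshold for v in values) / len(values)
    return mean, fraction


def make_ca_line(serial, res_name, res_seq, plddt):
    """Build a fixed-width PDB ATOM line for a C-alpha atom (demo only)."""
    return (
        "ATOM  {:5d}  CA  {:3s} A{:4d}    {:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}"
        .format(serial, res_name, res_seq, 0.0, 0.0, 0.0, 1.0, plddt)
    )


# Three made-up residues used only to exercise the parser.
demo_pdb = "\n".join([
    make_ca_line(1, "MET", 1, 95.5),
    make_ca_line(2, "ALA", 2, 92.0),
    make_ca_line(3, "GLY", 3, 70.25),
])
values = ca_plddt(demo_pdb)
mean_plddt, frac_high = summarize_plddt(values)
print(values, round(mean_plddt, 2), round(frac_high, 2))
```

Reporting the fraction of residues above a threshold alongside the mean helps distinguish a uniformly confident model from one with confident cores and low-confidence termini.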
Figure 6
The three outputs of the AlphaFold v.2.1.0 online tool. (A) Predicted 3D structure of Bg_9562 using AlphaFold 2. The predicted model has very high confidence, with pLDDT > 90. (B) The predicted LDDT (pLDDT) plot gives the best information on intra-domain confidence, with values as high as pLDDT > 90. (C) Interactive 2D plot of Predicted Aligned Error (PAE); the dark green color at (x, y) indicates that AlphaFold 2 predicted low error, with well-defined relative positions between two different domains.
2.12. Functional Analysis of Conserved and Disordered Regions of Bg_9562
The pure form of the protein Bg_9562 was obtained from the National Institute of Plant Genome Research laboratory (NIPGR, Delhi, India). The laboratory procedure for protein expression and purification is described in [17]. After confirming the presence of a ~13 kDa band in a Western blot, the purified Bg_9562 protein was used for all further experiments.
4.2. Broad-Spectrum Bioactivity Assay of Bg_9562 on Fungal and Bacterial Growth
Individual axenic cultures of RS and TAIK1 were grown on Potato Dextrose Agar (PDA) and BIK3 on Luria Broth Agar (LBA) media, and incubated at room temperature. Using a cork borer, mycelial discs were collected, dipped in 1 mL of protein (100 µg/mL) and incubated at two temperatures (35 °C and 45 °C) and two pH values (3.0 and 9.5) (n = 3) for 6, 12 and 24 h, along with positive controls (purified Bg_9562 protein at standard temperature and neutral pH) and negative controls (axenic cultures of RS, TAIK1 and BIK3). After treatment, fungal/bacterial discs were washed individually in sterile distilled water, placed on fresh PDA and LBA plates and incubated at their respective growth temperatures. Growth was recorded after 6, 12 and 24 h as mycelial lawn area (RS, TAIK1) and bacterial colonies (BIK3) in treated and control plates. The experiment was repeated three times.
4.3. Efficacy of Bg_9562 Protein under Field Conditions
To test the efficacy of Bg_9562 protein, pilot field trials were taken up in a rice variety TN1. The field trials were laid out in a Randomized Block Design with three replications and three treatments (pre-infection, post-infection and co-inoculation of RS and protein) against rice sheath blight disease at varying concentrations of the protein, viz., 2.5 ppm, 5.0 ppm, 7.5 ppm and 10 ppm, and with a spray volume of 833 mL/T maintaining positive (Carbendazim 50% WP) and negative (untreated) controls (Table S9).
4.4. Sequence Retrieval
Nucleotide sequence of the Bg_9562 gene of B. gladioli strain NGJ1 was retrieved from the NCBI database using accession code KX620741. It was then translated to polypeptide sequence using the online tool Expasy-translate [63] and it was used for further in silico analysis in the study. The computational approaches utilized in every step of the in silico study are described briefly as a flowchart (Figure S3). Various online software/servers deployed at each step in the in silico analysis are listed and their URL details are given in Table S4.
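The translation step can be sketched without external libraries. This is a minimal illustration, assuming the standard genetic code (the bacterial code differs only in alternative start codons, not in codon meanings) and a coding sequence given in frame; the names `CODON_TABLE` and `translate` and the toy sequence are hypothetical, and the KX620741 sequence itself is not reproduced here.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate an in-frame coding sequence, stopping at the first stop."""
    nucleotides = nucleotides.upper().replace("U", "T")
    protein = []
    for i in range(0, len(nucleotides) - 2, 3):
        amino_acid = CODON_TABLE[nucleotides[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Toy open reading frame for illustration only (not Bg_9562).
toy_orf = "ATGGCTGATTAA"
toy_protein = translate(toy_orf)
print(toy_protein)
```

The resulting one-letter protein string is the form expected as input by the downstream composition and property tools.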
4.5. Prediction of Physiochemical Properties of Bg_9562 Protein
The amino acid composition and various other physicochemical properties of Bg_9562, including theoretical pI, molecular weight, bulkiness, polarity, aliphatic index and instability index, were estimated using the online tools Expasy-ProtParam [63] and Protscale [63], which are freely available in the public domain. The hydrophobicity plot of the contiguous amino acid residues of Bg_9562 was determined by constructing the Kyte–Doolittle hydropathy graph [64]. Other parameters, i.e., Wimley–White whole-residue hydrophobicity of the peptide, protein binding potential (Boman index) and antimicrobial potential, were calculated using the predictive online tools of the Antimicrobial Peptide Database v.2.34 (APD2; http://aps.unmc.edu/AP/main.php, accessed on 31 August 2020) [65]. The Boman index is an estimate of protein-binding potential, calculated from the cyclohexane-to-water partition coefficients of the respective amino acid side chains, summed and divided by the total number of amino acid residues in the peptide [19]. The Wimley–White whole-residue hydrophobicity of the peptide (the sum of the whole-residue free energies of transfer of the peptide from water to the phosphatidylcholine interface) estimates protein hydrophobicity; the higher its value, the more likely the protein is to be hydrophobic [66].
4.6. Prediction of Functional Properties of Bg_9562 Protein
The motifs in the Bg_9562 protein were predicted using the default parameters in the MEME Suite 5.4.1 (Multiple Em for Motif Elicitation) [67]. MEME represents motifs as position-dependent letter–probability matrices that describe the probability of each possible letter at each position in the pattern. The prediction of transmembrane-spanning regions and their orientation in Bg_9562 was performed using the TMPred online software. The algorithm is based on the statistical analysis of TMbase, a database of naturally occurring transmembrane proteins [68]. The TMHMM-2.0 server, which predicts the topology of membrane proteins using a hidden Markov model, was also employed [69]. The localization of Bg_9562 within cells was predicted using the PSORT program [70].
The online server CRYSTALP2 [71] was used to predict the propensity of the protein to crystallize. Fully or partially disordered proteins have little tendency to crystallize. The sequence-based propensity for production of diffraction-quality crystals, production of crystals, purification and production of the protein material was predicted using the online server PPCpred [72]. The ConSurf server was employed to identify the conserved and biologically important residues in the amino acid sequence of Bg_9562 [73]. The degree to which an amino acid position is evolutionarily conserved (i.e., its evolutionary rate) is strongly dependent on its structural and functional importance. Thus, conservation analysis of positions among members of the same family can often reveal the importance of each position for protein structure or function. In ConSurf, the evolutionary rate is estimated based on the evolutionary relatedness between the protein and its homologues, taking into account the similarity between amino acids as reflected in the substitution matrix.
The multiple alignment was built using MAFFT, and UNIREF90 was used for the collection of homologs with the HMMER homology search algorithm of the ConSurf database. The resulting sequences that were closest to the query, i.e., Bg_9562, were selected for phylogenetic tree representation. The phylogenetic tree was constructed using Phylogeny.fr, an online tool [74].
4.7. Secondary Structure Prediction
The secondary structure of Bg_9562 protein was predicted using the PSIPRED 4.0 workbench [75] and, in addition, the NPS@ server. Network Protein Sequence @nalysis (NPS@) is an interactive webserver dedicated to protein sequence analysis [76]. Secondary structure prediction using the NPS@ server was performed with various methods, viz., DSC [77], HNN [78], PHD [79], PREDATOR [80], MLRC [81], SIMPA96 [82] and secondary consensus [83], keeping default parameters for 4-state predictions and an output width of 70.
4.8. 3D Structure Prediction
To obtain a high-quality model of Bg_9562 protein, its tertiary structure was modeled using in silico methods including homology modeling, threading and ab initio modeling. For comparative homology modeling, the web-based server HHpred [84] was used, taking the best alignment it suggested, which is typically collected using the PSIBLAST program. The atomic coordinates for this target–template alignment were generated using MODELLER v. 9.2 [85]. Another web-based server, Phyre2, was used; it applies hidden Markov models to generate alignments of a submitted protein sequence against proteins with published structures [86]. The resulting alignments were then used to produce homology-based models of the query sequence to predict its three-dimensional structure. RaptorX uses a non-linear scoring function to combine homologous information with structural information for a given template–sequence alignment. It uses NEFF to adjust the relative importance of homology and structural information. RaptorX uses a combination of RaptorX-Boost and RaptorX-MSA to build 3D models for a target–template alignment [87]. SWISS MODEL was used for predicting and analyzing homology-based 3D protein structure [88]. The web-based threading servers LOMETS (which builds 3D models by collecting high-scoring target-to-template alignments from the locally installed threading programs FFAS3D, HHsearch, MUSTER, pGenTHREADER, PPAS, PRC, PROSPECT2, SP3 and SPARKS-X) [89] and I-TASSER, which generates 3D models based on a hierarchical method for protein structure and function prediction [90], were employed. Further, SPARKS-X, which constructs 3D models by probabilistic matching between the predicted primary structural properties of the query and the corresponding native properties of templates, was also used in the study [91]. For proteins such as Bg_9562, which have very little homology with known templates, another method, ab initio modeling, was deployed.
Ab initio modeling is a de novo protein structure prediction approach that builds 3D models of proteins from their primary structures, whether or not homologs of the query protein are available. Robetta from the Baker lab, whose protein structure predictions are continuously evaluated using CAMEO, was used for the ab initio method [92], along with QUARK [93]. As existing methods fall far short of atomic accuracy, especially when no homologous template is available, AlphaFold 2 [94], a powerful neural network-based tool, was used to build a highly accurate model of Bg_9562.
AlphaFold 2 on a Google Colab notebook [95] was used to model Bg_9562. This Colab notebook allows easy prediction of protein structure using a slightly simplified version of AlphaFold v.2.1.0. Colab notebooks are coupled to a special sequence search program, MMseqs2, thus making this technology more powerful.
4.9. 3D Structure Validation
The 3D structure of the modeled protein was validated using the PROSA, QMEAN4 and Ramachandran plot webservers. Ramachandran plots were drawn using PROCHECK [96] to visualize the backbone torsional angles ψ and φ of the residues in the protein structures. This is a widely used way to count the residues in the favored, allowed and outlier regions of a given protein. The PROSA webserver is a frequently employed tool for the validation of protein structures obtained from X-ray analysis, NMR spectroscopy and theoretical calculations. QMEAN4 is a linear combination of four statistical potential terms. It predicts a global lDDT score in the range (0,1), and the value calculated by the webserver is transformed into a Z-score [97]. After selecting the best evaluated 3D structure of Bg_9562, ligands binding to it were predicted using the COACH server [98]. COACH is a consensus approach to protein–ligand binding site prediction. When a structure is submitted to the COACH server, it predicts complementary ligand binding sites by combining the prediction results of the TM-SITE, S-SITE, COFACTOR and FINDSITE algorithms [98].
The in silico analysis of physicochemical properties in the current study, coupled with wet-lab experiments on Bg_9562, helps in better understanding the protein and its functional characterization, and further facilitates the identification of the fungal–protein, membrane and nucleic acid interactions associated with the broad-spectrum antifungal activity of the protein. Furthermore, considering its broad-spectrum antifungal activity, the protein may be useful in biotechnological applications to control fungal diseases. According to our results, Bg_9562 is a stable protein with a single transmembrane region, has suitable aliphatic index and hydropathicity index values, and is essentially nontoxic for use as an antifungal agent. In short, bioinformatics analysis would be helpful in choosing the functional and targeted segments of the protein for the preparation of specific formulations for controlling sheath blight disease of rice.