Literature DB >> 35474779

Deciphering Isoniazid Drug Resistance Mechanisms on Dimeric Mycobacterium tuberculosis KatG via Post-molecular Dynamics Analyses Including Combined Dynamic Residue Network Metrics.

Victor Barozi¹, Thommas Mutemi Musyoka¹, Olivier Sheik Amamuddy¹, Özlem Tastan Bishop¹.

Abstract

Resistance mutations in Mycobacterium tuberculosis (Mtb) catalase peroxidase protein (KatG), an essential enzyme in isoniazid (INH) activation, reduce the sensitivity of Mtb to first-line drugs, hence presenting challenges in tuberculosis (TB) management. Thus, understanding the mutational imposed resistance mechanisms remains of utmost importance in the quest to reduce the TB burden. Herein, effects of 11 high confidence mutations in the KatG structure and residue network communication patterns were determined using extensive computational approaches. Combined traditional post-molecular dynamics analysis and comparative essential dynamics revealed that the mutant proteins have significant loop flexibility around the heme binding pocket and enhanced asymmetric protomer behavior with respect to wild-type (WT) protein. Heme contact analysis between WT and mutant proteins identified a reduction to no contact between heme and residue His270, a covalent bond vital for the heme-enabled KatG catalytic activity. Betweenness centrality calculations showed large hub ensembles with new hubs especially around the binding cavity and expanded to the dimerization domain via interface in the mutant systems, providing possible compensatory allosteric communication paths for the active site as a result of the mutations which may destabilize the heme binding pocket and the loops in its vicinity. Additionally, an interesting observation came from Eigencentrality hubs, most of which are located in the C-terminal domain, indicating relevance of the domain in the protease functionality. Overall, our results provide insight toward the mechanisms involved in KatG-INH resistance in addition to identifying key regions in the enzyme functionality, which can be used for future drug design.

Entities: Chemical

Year: 2022 PMID： 35474779 PMCID： PMC9025985 DOI： 10.1021/acsomega.2c01036

Source DB: PubMed Journal: ACS Omega ISSN： 2470-1343

Introduction

Mycobacterium tuberculosis (Mtb) is the causative agent of tuberculosis (TB), and the disease presents a major global health problem with an estimated 1.3 million deaths worldwide comprising 214,000 HIV-positive cases.[1,2] Pathogen resistance against individual and combination use of first-line drugs (multi-drug-resistant, MDR) as well as first-line and second-line TB drugs (extensively drug resistant, XDR) present the greatest hurdles in disease management.[3] Inaccessibility to treatment and poor drug adherence among patients are the major factors toward Mtb drug resistance.[4] Besides the present challenges, the outbreak of the coronavirus disease 2019 (COVID-19) pandemic worsened TB management as a result of increased household exposure not to mention the increase in poverty levels especially in vulnerable populations.[1] Resistance to the most effective first-line TB drugs,[5] isoniazid or isonicotinic acid hydrazide (INH), rifampicin (RIF), ethambutol (EMB), and pyrazinamide (PZA), is mostly as a consequence of missense mutations within the Mtb drug targets.[6,7] The prodrug, INH, which has a simple structure consisting of a pyridine ring and an hydrazide group,[8] is activated by the Mtb catalase peroxidase (KatG) enzyme in the presence of nicotinamide adenine diphosphate (NAD), forming an INH-NAD adduct. The resulting adduct binds tightly to the enoyl acyl carrier protein (ACP) reductase (InhA), inhibiting the synthesis of mycolic acid, an integral component of the Mtb cell wall. The 80 kDa homodimeric Mtb KatG enzyme is a bifunctional protein possessing both catalase and peroxidase activity,[9] which is ascribed to a case of gene duplication. It is categorized under class I of the superfamily of plant, fungal, and bacterial peroxidases due to its high sequence similarity in relation to cytochrome c peroxidase from yeast and ascorbate peroxidase.[10] Each protomer contains N-terminal and C-terminal domains (Figure ). A heme group (cofactor) is located in the active site pocket of the N-terminal domain and sits at the end of the substrate access channel marked by residues Asp137 and Ser315.[11] As shown in Figure , the heme binding area is surrounded by two pockets and conserved residues: Arg104, Trp107, and His108 are in the distal pocket, and His270, Trp321, and Asp381 are in the proximal pocket.[12] A recent cryogenic electron microscopy (cryo-EM) article reported three potential binding sites for the pro-drug, two of which are in the vicinity of heme and the third one is near to the protomer interface area.[12] Next to the active site is an adduct of Trp107, Tyr229, and Met255 (MYW) that is characteristic of most catalase peroxidases (Figure ).[13] Previous studies have indicated that the adduct facilitates the catalase and not the peroxidase activity of the KatG enzyme.[14,15] Besides lacking the heme cofactor, truncation of the C-terminal domain from the KatG structure by Baker et al.[16] resulted in diminished catalase and peroxidase activity of the enzyme. Similarly, DeVito et al. demonstrated that removal of at least 42 C-terminal residues compromised the KatG ability to activate INH.[17] As a result, the C-terminal domain is believed to offer architectural stability to the KatG active site. The protease functions as a dimer with the help of the dimerization domain (first 100 residues), which hooks around the same region in the opposite monomer.[17−19] Besides fortuitously activating INH, KatG is known for protecting Mtb from toxic hydroperoxides and hydroxyl radicals present in the aerobic environment.[20] This characteristic is one of the virulence factors associated with Mtb that has seen it survive hostile host environments. Through its catalase activity, the KatG enzyme takes hydrogen peroxide as a substrate, forming an intermediate compound (oxyferryl porphyrin π cation radical) which subsequently reacts with another hydrogen peroxide molecule converting KatG back to its natural state while releasing water and oxygen.[21]

Figure 1

Structure of the wild-type KatG enzyme. The dimeric structure of KatG with protomers A and B shown in cartoon and surface presentations, respectively. The heme group is depicted in a magenta stick presentation, with the binding pocket made up of both six conserved residues located in the proximal pocket (red sticks) and distal pocket (blue sticks). The MYW triad (Met255, Tyr229 in green sticks, and Trp107 in blue) is conserved in all catalase peroxidases and is vital for catalase activity. Due to the multistaged mechanism to action of INH, mutations in the katG(22) and inhA(23) genes are the main drivers of phenotypic Mtb-INH drug resistance, in which 60–90% of the resistance is due to KatG protein variations.[24−27] Even though there are more than 230 mutations reported to date,[28] computational studies to decode the resistance mechanisms behind these mutations are very limited and are mostly focused on the analysis of mutations on single static structures via mutating residues.[29−34] The recent study,[12] perhaps, is one of the first that looks more deeply at the two mutations, W107R and T275P, to understand the change in structure and activity of KatG. To further uncover drug resistance mechanisms of INH as a poorly understood area, we applied computational approaches that we have recently developed.[35−38] Our objective was to discern the structural and functional effects of 11 high confidence (HC) mutations (S140N, S140R, G279D, G285D, S315I, S315N, S315R, S315T, G316D, S457I, and G593D), so as to guide the process of designing next generation antitubercular drugs. Previous in silico studies, for instance, using INH derivatives illustrated that INH’s binding energy in the mutants can be refined through INH redesign, hence improving the INH binding ability in mutants.[39] To this effect, we modeled 3D structures of KatG mutant proteins and investigated their conformational changes over time through molecular dynamics (MD) trajectory analysis. Post-MD analysis revealed a diverse array of effects of the mutations on the KatG protein structure characterized with changes in protein compactness and flexibility of the mutant systems in comparison to the wild-type (WT). In addition, instability and increased conformational variation of the proposed INH binding pocket was observed in the mutants through one of our novel approaches, comparative essential dynamics (ED) analysis[36] Residues critical in the communication network of the WT and mutant proteins were identified using combined dynamic residue network (DRN) metric analysis for the first time in KatG as initially done by other studies within our research group.[40,41] Additionally, through alanine scanning, key interface residues contributing the bulk of the binding energy and with a high centrality in the mutant structures were identified. Collectively, this study employs a combination of computational approaches to provide insight into the structural and residue communication changes caused by HC drug resistance mutations in Mtb KatG.

Results and Discussion

High Confidence KatG-INH Resistance Mutations Retrieved

Eleven high confidence missense mutations, associated with phenotypic INH drug resistance, were retrieved from the TBDReaMDB.[42] These mutations (S140N, S140R, G279D, G285D, S315I, S315N, S315R, S315T, G316D, S457I, and G593D) were previously identified through minimum inhibitory concentration and genomic sequencing studies all around the world.[43−45] A majority of the mutations (S140N, S140R, G279D, G285D, S315I, S315N, S315R, and S315T) are located in the N-terminal domain, whereas S457I and G593D are in the C-terminal domain (Figure ). Residue 315, positioned at the entrance of the active site access channel, has the highest number of mutations, including the most prevalent INH resistance conferring mutation, S315T.[46−48]

Figure 2

High confidence KatG mutations. Shown are the positions of the 11 KatG-INH HC missense mutations in the N- and C-terminal domains.

High confidence KatG mutations. Shown are the positions of the 11 KatG-INH HC missense mutations in the N- and C-terminal domains. In this study, we modeled the mutant proteins by introducing each of the mutations to both protomers. In each model, the z-DOPE score was less than −1.1.[49] The potential effect of resistance mutations on protein structure and function was then analyzed in two parts: global analysis to identify the structural changes in the enzyme as a whole and local analysis to pinpoint mutation related changes at residue level.

Part I: Global Analysis

Mutation-Induced Global Structural Changes Identified

To follow the global conformational changes over the course of 300 ns MD simulations, we analyzed the WT and mutant protein structures via root-mean-square deviation (RMSD) and radius of gyration (Rg) calculations. At first, commonly used RMSD versus timeline plots were determined to evaluate the positional divergence of a structure compared to the starting conformation over the simulation (Figure A). The results indicated that WT and mutant proteins equilibrated very quickly and attained structural stability during the simulations. Of these systems, G279D, G285D, S315N, S457I, and G593D had RMSD values slightly higher than those of WT as an indication of increased conformational variation. We next presented the RMSD data as violin plots per protomer (Figure B). This analysis revealed that WT and a majority of the mutant proteins visited multimodel conformations and showed asymmetric protomer behavior. Even though it is not widely studied, asymmetric behavior of homodimer proteins is observed and analyzed in other cases.[37,50] S457I presented the most variability in both protomers. Whereas protomer A of S140R, G279D, and G285D had multiple conformations, protomer B displayed unimodal RMSD distribution as an indication for the protomer visiting a single conformational space during the simulation. Only two systems, S140N and S315T, demonstrated symmetric protomer behavior in unimodal shape.

Figure 3

KatG WT and mutant RMSD and Rg analysis. (A) WT (blue) and mutant (orange) line plots showing RMSD progression in reference to the initial structure over the simulation period. The x-axis shows the time (ns) and y-axis the RMSD (nm). (B) Split violin plots of the RMSD and Rg distribution in both protomers (protomer A is in green, and protomer B is in orange) of all systems sorted in ascending order of the median RMSD and Rg. The x-axis is for the protein systems, and the y-axis is for the RMSD and Rg. (C) All versus all RMSD data shown as a heat map. The x- and y-axes represent the frames at different time stamps (ns). The color gradient from white to dark red indicates the degree of variation. Next, for every system, all versus all frame RMSD analysis was performed for a comparison of each protein MD frame to itself and to all the other frames (Figure C). WT is noted to sample two major conformational ensembles: (1) from 0 to 200 ns and (2) from 200 ns onward. This observation was also common for most mutants, however, at different time stamps in each system. Based on the RMSD scale, most systems experienced less conformational diversity between frames (RMSD ≤ 4 Å) throughout the simulation except for S457I, where an RMSD of ≥4 Å relative to the initial conformation was noted. As previously observed (Figure A,B), S457I again demonstrated the highest variation. To determine the quantitative distribution of the system conformations, similar structures were clustered in each trajectory using the GROMACS geometric clustering approach. In clustering, similar conformations are grouped together based on their geometric similarity. The method uses a partitioning algorithm, and hence, a conformation can only be present in one unique cluster ensemble.[51] The clustering calculations showed that, for the majority of the systems, most protein structures were distributed within two main cluster ensembles (Table S1) with one of the ensembles (C1) having more structures than the other. Superimpositions of system structures from the different cluster ensembles confirmed that the difference in the clusters is mostly due to loop region dynamics (Figure S1). Furthermore, clustering results were in agreement with all versus all RMSD analysis, which also showed that most systems sampled two main structural conformations. Rg was calculated to investigate the effect of mutations on the overall compactness of the structure. As shown in Figure B, Rg analysis presented minimal differences in the overall structural compactness of the mutants compared to the WT (all within the range of 2.775–2.975 nm Rg). Protomer A of most mutants showed a degree of gyration slightly higher than that of protomer B.

Mutations Caused Asymmetric Protomer Behavior and Large Loop Fluctuations in the Vicinity of the Heme Binding Pocket While C-Terminal Domain Stayed Stable

Here, we used one of our recently developed tools from MDM-TASK-web,[36] comparative ED, to analyze mutation caused conformational changes in the protein as a whole (Figure S2) as well as at the protomer level (Figure ). In this approach, the trajectories are aligned with each other, and a single covariance matrix is calculated to avoid a potential problem, which is the generation of a trajectory-specific covariance matrix that might not be accurately comparable. Even though the total variance from the first two PCs might be lower than that for traditional PCA calculations, the single covariance matrix represents the total variability shared across the trajectories, hence we believe it reliably represents the differences between the systems.[35]

Figure 4

Comparative essential dynamics analysis of the WT and mutant proteins at protomer level. For each subplot, the WT is at the top and mutant at the bottom with protomer A to the left and protomer B to the right. The x- and y-axes show the variance in percentage as explained by PC1 and PC2, respectively. The plot color code from dark blue to yellow shows the simulation progression in nanoseconds. From Figure S2, protein essential motions for each mutant were comparatively determined in relation to the WT along PC1 and PC2, in which the total variance ranged between ∼40 and 60%. A majority of the mutant systems, viz. S140N, S140R, G285D, S315I, S315R, G316D, S457I, and G593D, sampled a diverse conformational space in relation to the WT across PC1 and PC2 axes. PC1 consistently accounted for most conformational variance with the least percentage at 29.55% in G316D and the highest in S315N (54.43%). Comparative ED features automated conformation extraction from the lowest energy basins.[35] Superimposition of the lowest energy conformation from each mutant to that of the WT revealed a low RMSD variation range between 2.175 Å in S315T and 3.396 Å in S315N (Table S2). Interestingly, the most diverse conformational space sampled with respect to the WT was seen in S457I, even though the lowest energy structure had an RMSD of 3.326 Å compared to that in the WT. This could mean that the mutant systems with a diverse conformation space to the WT (S140N, S140R, G285D, S315I, S315R, G316D, S457I, and G593D) sampled more conformations before equilibration, and this is in agreement with both the RMSD and geometric clustering results. At the protomer level, comparative analysis was done between WT and each mutant protein separately, and within each paired system, each protomer of a mutant was compared to the protomers of WT (or vice versa); this was done since protomer dynamics can switch between identical copies of a protomer in a homodimer[37] (Figure ). We further supported this analysis by the root-mean-square fluctuation (RMSF) calculations (Figure S3). In the S140N-WT pair, less conformational variation was observed in both WT protomers compared to protomer B of S140N along PC1. This was attributed to high loop region fluctuations at positions 359–375 observed in S140N protomer B (Figure S3). In S140R, both mutant protomers A and B explored more conformational space along PC1 and PC2, respectively, compared to the WT. Here, besides the disordered loop (position: 359–375) in both protomers of S140R, there was a marked increase in residue fluctuation at positions 245–350, more so in protomer A (Figure S3). In G279D, both WT protomers shared a more compact distribution compared to the mutant, especially protomer A of G279D. Asymmetry in conformational variation was noted between protomers A and B of G285D. Protomer A had a spread-out distribution compared to B. In relation to the WT, mutant protomer A had more variation, which concurs with RMSD violin plot results that indicated more structural variation for protomer A than for protomer B. The variance in protomer A eigenspace could be explained by the marked residue flexibility noted at positions 405–415 of G285D compared to that of the WT (Figure S3). A switch in protomer behavior where WT protomer A behaved somewhat like S315I protomer B and vice versa was observed in the S315I-WT pair, and this behavior is supported in the RMSF analysis too. We recently reported a similar switch behavior between the protomers in the presence of mutations in the homodimeric SARS-CoV-2 Mpro protein, too.[37] Like in most mutant systems, protomer A of S315N explored more conformational space than the WT along PC2, whereas in protomer B it was along PC1. A distinctively spread-out conformational space in contrast to the WT was noted in S315R, S457I, and G593D in both protomers and along both PCs. In all of these systems, increased RMSF distribution compared to the WT was noted at different regions of the structure, significantly so in S457I (Figure S3). S315T and G316D also experienced a diverse protomer A distribution in contrast to the WT along PC2, which made up 19.98 and 13.03% of the motions, respectively. Similarly, protomer B sampled more conformations predominately along PC1 (24.86% for S315T and 25.87% for G316D). Altogether, S457I had the most diverse conformational space congruent to RMSD and RMSF analysis which showed more structural variation and residue flexibility in the mutant, respectively. Across all systems, PC1 accounted for most of the essential dynamics. Another interesting observation from RMSF analysis was that the C-terminal domain of both protomers stayed similar to WT (Figure S3). Previously, cryo-EM studies of W107R and T275P mutant proteins showed significant structural disorder in the vicinity of the heme binding pocket and the loops surrounding the binding pocket, asymmetric behavior of mutant proteins, and stable C-terminal domain in both WT and mutants.[12] Collectively, our results on the RMSD, RMSF clustering, and principal component analysis (PCA) agree with cryo-EM findings and indicate that the HC mutations studied here, also, affect conformational evolution of the KatG structure with more structural variability. Considering the link between protein structural dynamics and function,[52] the mutation-imposed changes could, in turn, affect KatG’s sensitivity to INH through changes in the binding pocket environment. To that effect, a comparative analysis of the binding pocket motions of the mutant systems to the WT was investigated in the next section.

Comparative Essential Dynamics Revealed Diverse Conformational Changes in Each Protomer Binding Pocket

As a next step, we further zoomed into the INH binding pocket of each protomer per protein. Inferring from literature about the possible location of the INH binding pocket[12] and using the Computed Atlas of Surface Topology of proteins (CASTp) Web server[53] for protein pocket detection, the potential KatG-INH binding pocket residues were identified as 91–95, 97, 98, 100, 101, 103, 104, 107, 108, 136, 137, 139, 140, 205, 224, 227–233, 248, 252, 265, 266, 269, 270, 272–276, 281, 309, 312–315, 317, 321, 326, 350, 378, 380, 381, 408, 412, and 415. Some of the binding pocket residues are located in the loops surrounding the pocket. These loops are located in residues 134–141, 189–240, 271–330, and 355–380. In the previous section, we showed that these loops have increased flexibility in the presence of resistance mutations. According to comparative ED results of the potential KatG-INH binding pocket (Figure ), the majority of the mutants showed more conformational variability in the prodrug binding pocket compared to the WT. Asymmetric behavior was also noted per system protomer in (1) the diversity of the conformational space, viz. S140R, G285D, S315I, S315R, G316D, and G593D, and (2) in the shift of conformation distribution in the majority of the proteases. Furthermore, a correlation between binding pocket and protomer behavior was observed across the systems where highly diverse protomer conformation distributions in comparison to the WT resulted in assorted motions of the binding pocket. It is likely that the highly flexible regions observed in the N-terminal domain that were noted to affect general protomer dynamics equally affect the binding pocket kinetics based on their location in the N-terminal region. G285D showed the most spread-out distribution along PC1 in protomer A, even though it contributed 34.48% of the motions, and this can be linked to the highly flexible helix region at positions 405–415. From our analysis, we observed that the total variance based on PC1 and PC2 was slightly above 50%, and this was attributed to the diverse conformational states sampled between the WT and mutants. This can also be seen from the Rg and RMSD results of the binding pocket (Figure B), where all of the mutant systems show higher pocket RMSDs compared to that of the WT, signifying a higher degree of variability. Further, from Rg, all mutants showed a higher degree of gyration of the binding pocket residues compared to that the WT except for S315N.

Figure 5

Potential KatG-INH binding pocket analysis. (A) Scatter plots of comparative essential dynamics of the INH binding pocket in the WT and of mutant proteins. For each subplot, the WT is at the top and the mutant is at the bottom with protomer A to the left and protomer B to the right. The x- and y-axes show the variance in percentage as explained by PC1 and PC2, respectively. (B) Split violin plots of the potential INH binding pocket RMSD and Rg distribution for protomers (protomer A is presented in green, and protomer B is in orange) arranged in ascending order of the median.

Noticeable Increase in the Center of Mass Distances and Decrease in Hydrogen Bond Distribution Were Observed in Some Mutants

As a next step in analyzing the observed pronounced asymmetric behavior of the protomers and of the binding pocket in the presence of the mutations as well as increased loop fluctuations in the vicinity of binding pocket, we further explored the heme environment. One of the identified INH binding sites is near the δ-meso edge of the heme, in which the recent cryo-EM study combined with wet lab experiments showed reduced heme uptake and retention in KatG W107R and T275P mutants due to variation caused disorder in the heme environment, which consequently affected INH activation.[12] The heme makes covalent bond linkages, van der Waals interactions, and the hydrogen bonds between the proportionate heme side chains and the protein residues.[54] Here, we investigated the heme behavior through (1) analyses of center of mass (COM) distance, (2) hydrogen bonds (H-bonds), and (3) the short-range heme contact interactions for the last 50 ns of MD trajectories. First, the positional stability of the heme was explored through measurement of distance between the COM of heme and the COM of each protomer active site per protein system (Figure A). We believe this measurement is more accurate compared to that of heme RMSD with respect to active site of the reference structure, as the active site demonstrated large variability in different mutant cases (Figure ). The same active site residues previously identified using CASTp Web server[53] in the comparative essential dynamics section were used here for active site COM measurement. We observed a noticeable increase in the COM distance of S315I (both protomers) and S315R (protomer B).

Figure 6

Comparative analysis of heme environment in WT (in blue) and mutants (in orange). (A) Line plots of center of mass (COM) measurements between heme and active site pocket per protomer. (B) Line plots of the time evolution of the hydrogen bond (H-bond) formed between the heme and the protein systems in protomers A and B, respectively, for the last 50 ns. The x-axis shows the time, and the y-axis shows COM distance or the number of the H-bonds formed, respectively. Next, we analyzed the propensity of heme to form H-bonds in the active site pocket of KatG protomers in the presence of resistance mutations (Figure B). As recently reported by Chaplin and team,[12] the tendency of heme to form H-bonds was different for each protomer in the majority of the systems. In the WT, fewer heme H-bonds were formed in protomer B than in A. Furthermore, in comparison to the WT, generally, a fewer number of H-bonds were noted in protomer A of the mutants, whereas in protomer B, heme H-bond distribution was more or less the same in the WT and mutants. Furthermore, protomer A of G279D, G285D, and G316D formed the fewest number of H-bonds. This could potentially be because of the reduction in the contact frequency between the heme and coordinating residues in the active pocket, as discussed in the next section. Interestingly, we observed a similar trend between COM and H-bond measurements in protomer B of S315I and S315R mutants and the opposite in protomer A of S315I.

Part II: Local Analysis

Enhanced Asymmetric Behavior of Heme Atomic Interactions per Protomer Was Observed in the Presence of Resistance Mutations

To further zoom into cofactor–protein interactions at the atomic level and to identify immediate heme contacts over the MD simulation, a modified version of the weighted residue contact map tool[35,36] that ranks the heme contact residues based on atomic contact frequency was used. The script ranked the heme atomic contacts per residue and based on the frequency of contact on a scale of 0 to 1. The heme–protein residue contact frequencies were then normalized across all systems, and the results are presented as a heat map (Figure A).

Figure 7

Heme atomic interactions in the protein systems. (A) Heat map of heme atomic interactions in protomers A and B for all protein systems. Interaction frequency is normalized and ranked between 0 and 1. (B) Heat map illustrating asymmetric behavior in heme–protein interactions between protomers A and B. The color scale indicates the difference in heme interactions between the protomers (A minus B). (C) Illustration of the heme (orange), hydrogen bond (blue), hydrophobic (dotted line), π-stacking (green), and water-bridged (gray) interactions in WT PDB structure (2CCA) as viewed using the protein ligand interaction profiler PLIP.[55] According to Figure A, in the protomer A of the WT, the cofactor maintained main contacts with seven residues, five of which were stronger than the other two. These included Pro100, His276, Thr314; distal pocket residues Arg104 and Trp107; proximal pocket residue His270; and gating channel residue Ser315. In protomer B, on the other hand, only five of the same interactions were observed. Arg104 interaction was significantly weaker, and interaction with His276 was lost. In general, heme has a hydrophobic interaction with Arg104, and hydrophobic and π-stacking interactions with Trp107 on the distal side (Figure C). On the proximal side, heme maintains metal complexes with His270 and hydrogen bonds with His276, whereas laterally, heme interacted with Ser315 through hydrogen bonds, more so in protomer A. These key interactions were reduced and, in some cases, lost completely in the mutant systems; for example, in protomer A, π-stacking interactions with Trp107 reduced 3-fold in S315R compared to the WT, whereas that of His270 was completely lost in S140R, G285D, and S315I. Heme metal complexation with His270 was also measurably reduced in S140N, G316D, S457I, and G593D compared to the WT. Similarly, a majority of mutants showed reduced/lost interaction with residues Arg104, His276, and Ser315, which predominantly coordinate heme in the WT. This could explain previously observed reduction in heme hydrogen bonds within the mutants. Comparatively, in protomer B, the WT maintained a high contact frequency with Trp107, His270, and Ser315, whereas a majority of the mutants lost these interactions. The mostly affected interactions were with His270, which were completely lost in S315I, S315N, and S315R. An equally affected interaction was the H-bond with Ser315, which was reduced in G279D, G316D, and S457I and completely lost in the rest of the mutants except G593D. From these observations, the protease mutational effect on heme coordination and binding could be one way of inducing resistance, as similarly observed by Chaplin;[12] a disordered heme environment in KatG mutants W107R and T275P ultimately affected heme binding to the protomer. In order to better visualize the asymmetric behavior of heme–protein residue interactions per protomer, a difference heat map (protomer A minus protomer B) was calculated using the data presented in Figure A. The results (Figure B) were compelling and clearly summarized interaction loss and gain in one or the other protomers in the presence of mutations with respect to the WT. Heme–protein residue interactions of WT protomers were almost symmetrical with slightly enhanced heme interaction on residues Arg104, His276, and Ser315 of protomer A. His270 is vital in heme coordination and stabilization, where its nitrogen side chain forms electrostatic interactions with heme iron,[9,12] and showed hardly any protomer differences in WT. On the other hand, asymmetric interaction behavior was observed on S315N and S315R with increase/retention in protomer A with respect to protomer B, and this was inversed in S140N, S140R, and G593D. As G593D is located in the C-terminal domain of the protein, the change in heme–protein interaction on protomer B compared to that on protomer A might be an indication of allosteric signaling. In addition to His270, one of the other residues that demonstrated interaction differences among two protomers was His276. In the presence of S140R, G279D, G285D, and G316D mutations, protomer A heme lost contact to His276; hence, the difference heat map of protomer B presented increased interactions with heme compared to those with protomer A. S315R was the one mutant that behaved similarly to the WT on His276–heme interactions. The last example of distinct asymmetric behavior can be given from residue Thr314, which showed retained/enhanced heme interactions on protomer B of S315I, S315R, G316D, and G593D. Among the S315 mutants, S315T, which is the most prevalent one, presented a fairly symmetric interaction profile in both protomers, except for His276. S457I was the other mutant with neutral promoter behavior.

Averaged DRN Metric Hub Changes Were Investigated in the Presence of Resistance Mutations

Mutations can have a plethora of effects on structural behavior of a protein ranging from global structural changes, as previously seen in the RMSD, RMSF, and essential dynamics analyses, to changes at residue level in the protein residue–residue interactions and communication networks. In a protein communication network, residues are commonly referred to as nodes and non-covalent interactions of nodes as edges.[56] The term “centrality” is used to deduce the importance of a node in a communication network. The change in residue side chains caused by mutations may lead to rearrangement of the protein network patterns and hence change node centrality. In our recent studies,[37,38] a new approach to analyze DRN centrality nodes was used where the global top 5% of residues across the WT and mutant protein systems were identified for five different averaged centrality metrics; Betweenness centrality (BC), Closeness centrality (CC), Degree centrality (DC), Eigencentrality (EC), and Katz centrality (KC). We also demonstrated that each metric provides a different perspective to the network individually; hence, collective information has the utmost importance. In the very same studies, we also introduced some new terminology. We defined the “hub” as any node that formed part of the set of highest centrality nodes and specified these hubs as the global top 5% centrality nodes measured across all related samples for any given averaged centrality metric. We also introduced the term “persistent hub”; if a hub exists across all systems compared, then the hub is called persistent. In this study, we applied our approaches to identify metric specific hubs in KatG and to decipher the effects of resistance mutations on its communication profile. Factoring in the size of the protein, we calculated and analyzed the hubs for each metric for 2, 3, 4, and 5% and identified that the global top 4% residue data were the most informative. As previously established,[38] 4% hub data for each metric are presented as a heat map. Analysis of each of the heat maps is detailed in the following subsections. To gain insights on changes in hubs due to mutations, for each metric, we mapped the uniquely observed hubs of each mutant protein with respect to the WT hubs (ΔBC, ΔCC, ΔDC, ΔEC, ΔKC) as well as the rest of the commonly shared hubs with the WT.

Betweenness Centrality (BC)

BC provides a measure of usage frequency for each node via the calculations of the number of shortest paths passing through a node for a given residue interaction network. Over the years, BC has proven to be one of the most informative metrics. It is used to identify key allosteric residues[57] as well as mutation and ligand-induced residue network changes in a variety of proteins.[38,41,58,59] Heat map presentation of the 4% averaged BC calculations showed intersystem and interprotomer differences (Figure ). Asymmetric protomer behavior of the dimeric proteins was discussed previously.[37] Here, we observed only minor differences. Yet, due to these differences, we have not identified any persistent hubs.[37,38] The following hubs were present in at least six of the studied systems: Pro29 (protomer A), Leu43, Asn44 (protomer B), Leu48, His49, Gln190 (protomer A), Glu195, Tyr197 (protomer B), Phe483 (protomer B), and Leu616. Residues Pro29, Leu43, Leu48, and His49 form part of the dimerization domain (residues 1–99)[19] that interacts with itself (1–99 from the other protomer), forming the dimeric structure. Interestingly, residues Asn35, Tyr197, and Phe483 were hubs in the mutant systems exclusively, especially in protomer B (Asn35: A140N, G279D, G285D, G316D, and S457I; Tyr197: G285D, S315I, S315N, S315T, S457I, and G593D; Phe483: S140R, G279D, G285D, S315N, S315R, G316D, and G593D). It is important to note that residue Tyr197, through hydrophobic interactions, forms interlocking hooks with Tyr28 from the opposite protomer, facilitating dimer formation in KatG.[9] Trp107 is also implicated as the site for the cation radical formation following porphyrin π-cation radical reduction from hydrogen peroxide.[60] Short-range contact analysis using contact maps showed that residue Tyr197 gains new contact with Pro219, Lue220, Ala221, and Ala222, while Phe483 gains new contact with Asp736 in the mutant systems, which explains their hub status (Figure S4). The noted increase in centrality for these residues could be a compensatory mechanism to maintain functionality in the mutants.

Figure 8

Heat maps for the potential hubs according to the global top 4% for averaged BC metric. The x-axis shows the protein residues, and the y-axis is for the WT and mutant proteins. A and B are for protomers A and B, respectively. Detected hubs are annotated with their centrality values, whereas their homologous residues in alternate samples are not, but are only shown for the sake of comparison. Low to high centrality values are colored white to black. Visual inspection of the residue mapping to the 3D KatG WT structure, unexpectedly, showed that the identified BC hubs form large hub ensembles as communication regions rather than clearly defined single communication paths, as identified in our previous studies.[37,38] These hubs are mostly located at or around the interface region (Figure S5). In addition to conferring structural stability,[61] protein interface residues are also known to be involved in inter- and intraprotein communication.[62] Interestingly, in most of the mutant systems, these large hub ensembles lost some of the hubs that were detected in WT but gained new ones especially around the binding cavity (heme, proximal and distal pocket and covalent triad residues) and expanded to the hook region (dimerization domain) via the interface. Figure depicts three mutant systems as examples; the others are presented in Figure S5. Some of the common hubs forming these ensembles from the active site to the protein core included Ala109, Trp135, Ala221, Thr251, Phe252, Met255, Thr275, and Thr314. Of interest are residues Thr275 and Thr314, which showed an increase in interaction frequency with heme (Figure ) in protomer A for S140N and protomer B for S315R and S315T. A similar trend was observed for the resistance mutations located in the C-terminal. These allosteric mutations also introduced new BC hubs around the heme binding pocket (i.e., Asn138 and Ala139 of protomer A; Phe272 of protomer B of G539D mutant protein). The status of these residues and increased interaction with heme compared to the WT could suggest a compensatory allosteric communication path for the active site as a result of mutations which destabilize the heme binding pocket and the loops in its vicinity.

Figure 9

Cartoon representation of KatG WT (A) and mutants (B–D) showing the distribution of the BC hubs. Protomer A N-terminal is shown as pale cyan and C-terminal as pale yellow, whereas protomer B N- and C-terminal domains are sky blue and lime, respectively. Dimerization domain is shown as yellow and bright orange for protomers A and B, respectively. Hubs for WT and hubs common to both WT and mutants are shown in salmon (protomer A) and gray (protomer B); hubs unique to each system (Δhubs: mutant proteins hubs – WT hubs) are shown in the respective domain colors, whereas the mutation positions are shown as firebrick spheres. The heme group and its proximal and distal residues are shown as sticks. Mutant unique hubs forming a path to and from the active site of the protein core are labeled.

Closeness Centrality (CC)

CC identifies the central nodes which are closer to other nodes. Heat map presentation of the data showed that the protomers have a slightly asymmetric behavior (Figure S6), as we also observed in the BC data (Figure ). Previously, we showed that high centrality nodes (hubs) of averaged CC values occur within the protein core.[37,38] Mapping the hubs to the 3D protein structures exhibited that all global top 4% averaged CC hubs are primarily located in the center of the protein, specifically in the interface area and hook region of the dimerization domain (Figure ). The persistent CC hubs are identified as Asn44, Leu45, Val47, Leu48, His49, Ala621, and Glu703. CC analysis further highlighted the centrality of the dimerization domain, especially the residues between 40 and 50, in the functionality of the KatG enzyme irrespective of drug resistance mutations. More so, as per TBDReaMDB, drugs resistance gene associated database (DRAGdb),[63] and genome-wide Mycobacterium tuberculosis variation (GMTV)[64] databases, there are no documented mutations located at the dimerization domain positions Pro29, Leu43, Asn44, Leu45, Lys46, and Gln50. Krishnamoorthy and team refer to the important protein regions not susceptible to mutations as mutational cold spots.[65] Alternatively, we proposed to use persistent hub data for the identification of cold spot residues.[37] The absence of mutations at these residue positions coupled with our marked centrality values emphasizes the importance of these residues in the functionality of KatG. Collectively, BC and CC calculations identified which residues in the interface region influence communication and facilitate the protease dimerization in the studied systems. These regions present new target areas for drug development and innovation in relation to KatG and Mtb drug resistance.

Figure 10

Cartoon presentation of KatG depicting the positions of the global top 4% CC hubs as spheres for each protein system. Protomer A N-terminal is shown as pale cyan, and C-terminal as pale yellow, whereas protomer B N- and C-terminal domains as sky blue and lime, respectively. Dimerization domain is shown as yellow and bright orange for protomers A and B, respectively. Hubs for WT and hubs common to both WT and mutants are shown in salmon (protomer A) and gray (protomer B); hubs unique to each system (Δhubs: mutant proteins hubs – WT hubs) are shown in the respective domain colors, whereas the mutation positions are shown as firebrick spheres. The heme group and its proximal and distal residues are shown as sticks. One interesting observation was the correlation between the interprotomer distance measurements (Figure S7) and the number of CC hubs (Figure ). From the results, steady interprotomer distances were observed in WT and in the majority of the mutant systems except for S315N and S457I, where minimal (<0.3 nm) increase in distance compared to the WT was noted toward the end of the simulations. These two protein systems had also fewer CC hubs compared to the other system (Figure G,K).

Degree Centrality (DC), Eigen Centrality (EC), and Katz Centrality (KC)

DC is defined as the number of neighboring nodes around a given node; hence DC hubs can be regarded as functionally important at the local level. On the other hand, EC identifies high connectivity nodes, especially when surrounded by other high connectivity nodes, due to their dependence on the residue neighborhood.[35,37,38] As indicated previously, averaged EC is more stringent in assigning centrality than averaged DC, whereas KC is typically between.[38] From the averaged DC results, there was a small margin between the highest and smallest values per system, implying a similar level of connectivity in all systems (Figure S8). Global top 4% DC analysis identified only one persistent hub: Ala555 (Figure ). Generally, less DC hubs were noted in N-terminal domain of the mutant systems compared to the WT especially in protomer A (Figure S9).

Figure 11

Cartoon presentation of KatG displaying the positions of the concerted persistent hubs from BC, CC, DC, EC, and KC metric calculations. Protomer A N-terminal is shown as pale cyan and C-terminal as pale yellow, whereas protomer B N- and C-terminal domains are sky blue and lime, respectively. Dimerization domain is shown as yellow and bright orange for protomer A and B, respectively. Heme is shown as purple sticks. Interestingly, the results from the global top 4% EC calculations showed not only asymmetric protomer behavior but also a clear difference between the WT and mutants (Figure ). In protomer A, the WT protease only shared five EC hub residues with the mutants, Leu472, Val473, Ala476, Gly547, and Gly548. Further, only mutants S140R, G279D, S315I, and S315T had EC hubs at positons S140R (105, 106, 109–127, 162, 165, 166, 187, 190–194, 418, 419), G279D (102–106, 109, 110, 121, 122, 165, 166, 169), S315I (102–113, 118, 120–189, 256, 415, 416, 418, 419), and S315T (106, 109–126, 166, 164, 196, 418, 419) in protomer A as per Figure . The WT had the least number of hubs in protomer A: 5. However, in protomer B, the WT had the most hubs compared to all the mutant systems, highlighting the asymmetric protease behavior. Further asymmetricity was observed in S140N, G285D, S315I, and G593D which had no EC hubs in protomer B.

Figure 12

Heat maps for the potential hubs according to the global top 4% for averaged EC metric. The x-axis shows the protein residues, and the y-axis is for the WT and mutant proteins. A and B are for protomers A and B, respectively. Detected hubs are annotated with their centrality values, while their homologous residues in alternate samples are not but are only shown for the sake of comparison. Low to high centrality values are colored white to black. Mapping of the hub residues of the respective variant proteases showed that (1) irrespective of the presence of mutations, there is asymmetric behavior of the protease protomers, and (2) a majority of the EC hub residues are in the C-terminal domain, indicating its relevance in the protease functionality (Figure ). On a global scale, a majority of the most influential residues as per EC computations were in the helix regions of 468–479 and 545–557 of the C-terminal domain. Previous studies have shown that there is crosstalk between the N- and C-terminal domains, and that is vital for active site structuring.[66] Regardless of the asymmetric protomer behavior, the EC results show that the C-terminal domain is critical in the functionality of the KatG protease.

Figure 13

Cartoon presentation of KatG depicting the positions of the global top 4% EC hubs as spheres for each protein system. Protomer A N-terminal is shown as pale cyan and C-terminal as pale yellow, whereas protomer B N- and C-terminal domains are sky blue and lime, respectively. Dimerization domain is shown as yellow and bright orange for protomers A and B, respectively. Hubs for WT and hubs common to both WT and mutants are shown in salmon (protomer A) and gray (protomer B); hubs unique to each system (Δhubs: mutant proteins hubs – WT hubs) are shown in the respective domain colors, whereas the mutation positions are shown as firebrick spheres. The heme group and its proximal and distal residues are shown as sticks. For Katz centrality, a more uniform distribution of DRN values was observed compared to EC (Figure ). Furthermore, from the global top 4% calculation, Leu472, Val473, Gly547, Gly548, and Ala551 residues were identified as persistent hubs (Figure ). It is important to note that all of the persistent hubs are in the C-terminal domain of the protease. Again, a majority of the residues with influence on the systems’ network were in the C-terminal domain (Figure S10). These findings further support the vitality of the C-terminal and identify specific regions of interest in domain. A high correlation (Pearson correlation coefficient) between the overall calculated mutant and WT DRN metrices (Table S3) was observed except for EC, as previously shown. This illustrates that even with the observed global and local changes, the mutations do not drastically change the protein’s communication network.

Mutational Effects on Residue Interactions as Identified Using Weighted Contact Maps

Using the MDM-TASK’s weighted contact map tool, the degree of residue interactions with immediate neighbors (within 6.7 Å Euclidean distance) ranked from 0 to 1 (presence or absence) for (1) the mutated residues and (2) persistent hubs (Figure ) was determined. Previous studies have used this approach to study the short-range changes in residue interactions in mutants.[38,41,67,68] From Figure , slight differences in residue contacts between protomers were noted, and this was attributed to local asymmetry in side chain conformations, common in complexes.[69] In protomer A, S140R showed a 2-fold reduced interaction with His276 compared to the WT; however, we noted a compensatory gain of hydrogen bonds with Lys143, Trp300, and Ser315. S140N gained H-bonds with Lys143, Trp300, and Ala131. Consequent to the gains in interactions at position 140, higher centrality was noted at this position in S140N/R compared to the WT. G279D lost contact with Leu283 in addition to a 10-fold and 30-fold decrement in interaction with Tyr304 and Ile313, respectively. In G285D, more than a 50-fold decrement in main chain H-bond with Ser303 was noted together with slight compensatory gains in contact with Ala291 and Gln295, and this resulted in increased residue fluctuation around this position in comparison to the WT in protomer A. At position 315, mutants S315R/T lost contact with Ala139, S315I/N lost H-bonds with Thr275 and His276, and all mutants lost interaction with Ala350 except for S315T that had a 10-fold contact reduction compared to the WT. Despite the lost interaction at 315, a higher degree of centrality was observed at this position in the mutants. In G316D, Asp316 lost the H-bond interaction with Pro232, His276, and Ile313 in addition to a 3-fold contact reduction with Ala348 in comparison to the WT; however, we noted compensatory gains with Gly279 and Pro280. G593D had a complete loss of interaction of main chain H-bonds of Arg595, Asn596, Ala606, Glu607, and Val628 while also having a 10-fold reduced interaction with Val694. G593D also gained contacts with Glu588 and Asn602. Due to the asymmetric protomer behavior noted from RMSD and ED calculations, subtle differences in mutant contacts between protomers B and A were noted here.

Figure 14

Residue contacts in the WT and mutation positions in the mutants. A and B are heat maps of the weighted residue contacts at the WT and mutation residue positions in protomers A and B, respectively. The degree of contact is shown by the color scale from 0–1. WT and mutant residue positions are on the y-axis and their contacts on the x-axis. In protomer B, S140N/R gained contact with His276 and lost interaction with Ala312. G279D also gained contact with Asp311 and Ala312 while losing the interaction with Gly316, Ala348, and Gly349. At position 315, significant contact gains with Thr275, His276, Gly277, Gly349, and Ala350 were noted for S315T. This gain in interaction by S315T could mean a more compact assortment of residues around the INH access channel. Studies have shown that the size of the threonine side chain of S315T mutant results in narrowing of the binding pocket access channel, hindering INH access to the active pocket.[11,29,70] G316D and G593D showed a loss and 6-fold reduction in the H-bond with Ile313 and Asn596, respectively. Contact fingerprint analysis for the persistent hubs (Tables S4 and S5) elucidated that the high centrality noted in the dimerization hook region is because of new/increased interaction between the dimerization domain with hub and neighboring residues, as described by Okeke and team.[38] Collectively, these results highlight the effects mutations impose on structural networks which, in turn, affect the protein interactions and communication patterns.

Mutation-Induced Effects on Systems’ Binding Energy

Here, changes in the systems’ binding energies as an effect of mutations were studied using alanine scanning and vital residues contributing to the bulk binding energy. In alanine scanning, protein interface residues were mutated to alanine through targeted mutation to identify the destabilizing and stabilizing residues or “hot spot” residues as coined by Clackson and Wells.[71] Destabilizing residues are the residues with binding energies ≥1 kcal/mol, whereas stabilizing residues are those with binding energies <−0.8 kcal/mol.[72] Since majority of the residues with high BC and CC values were in the interface region, alanine scanning was applied to assess their effects on the dimeric KatG binding energy. Alanine scanning identified several destabilizing residues that are common across multiple mutants especially in the dimerization domain (Table S4). The observed commonality of destabilizing residues across the mutants suggests a similar pattern of residue interactions in the KatG variants. Of the destabilizing interface residues, particular residues in the dimerization domain region, viz. Val30, Asn35, Gln36, Asn44, Lys46, His49, Glu192, and Glu195, also had significantly high centrality values (BC and CC) as previously observed. The consistency of importance of the dimerization domain across the WT and mutant systems speaks to the key structural regions of KatG. These residues present new areas of interest in as far as understanding KatG functionality and new drug design and discovery in relation to drug resistant TB.

Conclusion

TB persists as one of the leading causes of morbidity and mortality globally with continually emerging resistant strains of bacteria to current antibiotics. In order to curb the TB drug resistance issue, constant efforts have been put on identifying new drugs and drug targets; yet this might not be an ultimate solution, as it is highly likely that bacteria would develop resistance to new drugs, too. Winning against bacteria’s defense mechanisms might only be achieved by understanding the underlying causes. With that in mind, in this study, we performed a comprehensive computational analysis of the KatG protein in the presence of 11 HC INH resistance mutations. We combined all-atom MD simulations with post-MD trajectory calculations to identify structural changes due to missense mutations. Besides traditional trajectory analysis approaches (RMSD, RMSF, Rg, H-bond), we also applied some of the methods that we, recently, developed including comparative ED[36,38] and combined DRN metric analysis.[36−38,50] We further supported our findings with alanine scanning. A majority of the mutations studied here (S140N, S140R, G279D, G285D, S315I, S315N, S315R, and S315T) are located in the N-terminal domain, some of which are in the vicinity of the heme binding site, whereas S457I and G593D are in the C-terminal. Thus, we observed mutation-induced local effects at the residue level as well as long distance allosteric effects, which might explain the cause of resistance to prodrug, INH. The specific key observations in this study can be summarized as follows: (1) our results on RMSD, structural clustering, RMSF, and comparative ED identified that the mutant proteins have significant loop flexibility around the heme binding pocket and enhanced asymmetric protomer behavior with respect to WT protein. Similar results were also observed on a recent cryo-EM article[12] for two other resistance mutations. Further to these observations, we also showed drastic (asymmetric) changes in the binding pockets per mutant systems with respect to WT. Thus, we conclude that these mutations, regardless of their locations, affect the sensitivity of KatG toward INH through changes in the binding pocket environment. (2) RMSF results revealed that the C-terminal domain of both protomers in each mutant system remained similar to WT. Previous wet lab results demonstrated that removal of this domain, as a whole or partially, diminishes the ability of the protein to activate INH.[16,17] Thus, our results support the idea of a C-terminal domain offering architectural stability to the KatG active site by displaying no structural changes due to mutations. Additionally, an interesting observation came from the global top 4% averaged EC hub calculations. Mapping of these hub residues of the respective mutant proteases showed that majority of the EC hubs are in the C-terminal domain, indicating its relevance in the protease functionality. (3) We also looked at the heme contact residues per protomer based on atomic contact frequency over the MD simulations. The results were compelling and clearly show interaction loss and gain in one or the other protomers in the presence of mutations with respect to the WT. On the other hand, heme–protein residue interactions of WT protomers were almost symmetrical. In particular, His270, which is vital in heme coordination and stabilization, and His276 demonstrated interaction differences among two protomers in various mutants, and in some cases, His270 interaction was lost. (4) In the context of DRN analysis, we had some relatively unusual observations compared to our previous two studies, in which we identified allosteric communication paths formed by BC(38) or EC(37) between allosteric and active sites. Here, BC calculations gave large hub ensembles as communication regions rather than clearly defined single communication paths. The mutant systems gave us interesting observations as these large hub ensembles lost some of the hubs that were detected in WT but gained new ones especially around the binding cavity (heme, proximal and distal pocket, and covalent triad residues) and expanded to the hook region (dimerization domain) via the interface. This might be a compensatory allosteric communication path for the active site as a result of the mutations which destabilize the heme binding pocket and the loops in its vicinity. (5) One interesting observation was the correlation between the interprotomer distance measurements and the number of CC hubs. From the results, steady interprotomer distances were observed in WT and in majority of the mutant systems except for S315N and S457I, where minimal (<0.3 nm) increase in distance compared to the WT was noted toward the end of the simulations. These two protein systems had also fewer CC hubs compared to the other system. (6) Alanine scanning identified several destabilizing residues that are common across multiple mutants especially in the dimerization domain, some of which also had significantly high BC and CC values. Collectively, our results highlighted important resistance mechanisms against prodrug, INH, and identified functionally important regions of the protein that should be considered for future rational drug design.

Methods

Retrieval of the Mtb KatG WT and Its Mutant Sequences

The Mtb WT KatG protein sequence (UniProt ID: P9WIE5) and crystal structure (PDB ID: 2CCA) were retrieved from the Universal Protein Resource (UniProt) database[73] and the RCSB Protein Data Bank (PDB),[74] respectively. Further, 11 high confidence single point mutations previously identified through INH minimum inhibitory concentration tests and genomic sequencing were obtained from the TB Drug Resistance Mutation database (TBDReaMDB).[42]

Homology Modeling of Mutant KatG Proteins

Each of the mutations was introduced simultaneously to each KatG protomer via an ad hoc Python script. Using Auto Model and slow refinement options in MODELLER v9.23,[75] 100 3D mutant homodimeric structures per mutant were generated based on the WT KatG crystal structure (PDB ID: 2CCA) as the template. The choice of template was guided by its qualities including high resolution (2.0 Å) and the cocrystallized heme groups. The models with the lowest score of the normalized discrete optimized protein energy (z-DOPE)[49] were selected for additional downstream calculations. Molecular kinetic studies of KatG indicate that the enzyme interacts with INH at a pH of 7.2.[11] Thus, all titratable residues of each structure were protonated at pH 7.0 to match the neutral environment using the PROPKA tool from PDB2QR (version 2.1.1).[76]

All-Atom Molecular Dynamics Simulations and Trajectory Analysis

To determine the effects of mutations on protease conformation, the modeled holo-3D mutant structures together with that of the WT were subjected to 300 ns all-atom MD simulation runs in GROMACS[77] v2019.4. First, the gro and top files were generated using GROMOS54a7 force field[78] under default settings. The chosen force field also included parameters for the KatG heme cofactor. The generated topologies were then solvated using a single point charge 216 (SPC216) water model in a cubic box with at least 1.0 nm distance between the protein structures and box edges. The system charge was neutralized using NaCl ions of 0.15 M concentration and minimized via the steepest descent energy minimization algorithm with an energy step size of 0.01 without constraints until a tolerance limit of 1000.0 kJ/mol/nm was reached. After energy minimization, temperature equilibration (NVT ensemble: constant number of particles, volume, and temperature) was implemented applying Berendsen temperature coupling at 300 K for 100 ps. This was followed with pressure equilibration (NPT ensemble: constant number of particles, pressure, and temperature) using Parrinello–Rahman barostat[79] at 1 atm and 300 K also for 100 ps. With equilibration complete, 300 ns production runs with a time step of 2 fs were performed for both the WT and 11 mutant systems. LINCS algorithm[80] was used for equilibration and production runs where all bonds were constrained. Particle mesh Ewald electrostatics[81] were used for long-range electrostatic calculations with a Fourier spacing of 0.16 nm. For the short-range Coulomb and van der Waals interactions, a cutoff distance of 1.4 nm was applied. MD simulations were run at the Centre for High Performance Computing (CHPC) utilizing 240 cores and approximately 8735 CPU time for each of the 12 systems (∼104,820 total CPU time). Lastly, periodic boundary conditions were removed and the trajectories fitted to the reference structure using the gmx trjconv tool. Subsequently, calculation of ensemble conformational properties was implemented using GROMACS built-in tools: gmx rms, gmx rmsf, gmx gyrate, and gmx cluster to calculate the RMSD, RMSF, Rg, and geometric clustering. Data from these tools was analyzed and presented using RStudio and different Python libraries, viz. Seaborn,[82] Pandas,[83] pytraj,[84] matplotlib,[85] Numpy,[86] and NGLview.[87] Further, all versus all RMSD calculations using only Cα atoms were computed for all systems by an ad hoc Python script. The script first concatenated all system trajectories before computing the RMSD for each frame while comparing it to all other frames and itself using the Python pytraj package and a step size of 100 frames. The visual molecular dynamics tool[88] was used to observe the dynamics of each system including the heme conformational behavior. Further structural changes were accessed through measuring the changes in interprotomer center of mass (COM) distance using the GROMACS gmxdistance tool.

Essential Dynamics Analysis

The most dominant and collective motions of the protein systems were studied using the comparative essential dynamics tool, compare_essential_dynamics.py, from the MDM-TASK web.[36] With comparative ED, the WT trajectory was paired with each mutant trajectory (11 pairs in total) where each mutant trajectory was aligned to that of the WT (reference structure) via the Cα atoms before the computation and decomposition of the covariance matrix. As a result, conformational sampling of the mutants was comparatively accessed within the same eigen subspace as that of the WT. Herein, the last three C-terminal residues were excluded from the ED calculations due to the increased flexibility. The approach was also applied to study binding pocket dynamics by specifying the residues of interest in each ensemble. Protein motions as described by first and second principal components (PC1 and PC2) were plotted as scatter plots for each system with the percentage variance as explained by each PC shown on the axes. The time stamps in picoseconds for the lowest energy conformations as calculated from 2D kernel density estimates were indicated on the PC1 and PC2 scatter plots.

Protein–Heme Hydrogen Bond Interactions, Short-Range Contacts, and COM Distance

Hydrogen bond interactions between cofactors/ligands and the proteins are cardinal as they ensure stability and sustainability of the protein cofactor/ligand interactions.[89] Here, due to the significance of the heme in the KatG functionality, the frequency of H-bonds formed between the heme and the protein systems was analyzed using the GROMACS gmx hbond tool in each system and presented as line graphs. Additionally, short-range cofactor (heme) interactions over the course of the established equilibrated region of the trajectory (last 50 ns) were investigated using an ad hoc Python script. Using the system trajectory and topology files as inputs, the script employed MDTraj,[90] NetworkX,[91] Numpy, Pandas, and matplotlib Python libraries to establish edges between the protein and ligand atoms and then weighed the frequency of contact over the course of the simulation. The normalized heme contacts were then presented as heat maps using Seaborn Python library. The difference in normalized heme contacts between the protomers (heme contacts in protomer A minus protomer B) was also calculated and presented as a heatmap. Further, in order to describe heme positional stability in the mutant systems, the heme center of mass (COM) in relation to the COM of the active site pocket resides was also calculated using the gmx distance tool for each system’s protomers.

Dynamic Residue Network Analysis

Dynamic residue network analysis was used to identify the key residues in intraprotein communication in the WT and mutant systems. Using the last 50 ns of the simulations, the Cβ atoms (Cα for glycine) of each residue were treated as nodes and, where a connection between the nodes within a cutoff distance of ≤6.7 Å existed, it was treated as an edge.[92] A pentad of DRN metrics, viz. averaged betweenness centrality, averaged closeness of centrality, averaged degree of centrality, averaged eigenvector centrality, and averaged katz centrality were calculated for every snapshot of the last 50 ns of the systems’ trajectories using the MDM-TASK-web server scripts.[36,93]BC measures the frequency of a node’s involvement in the protein’s shortest paths and hence measures the residue importance in a protein network.[94]BC is calculated from eq , where V is the complete set of nodes; m is the number of frames; σ(s,t) is the number of shortest paths connecting nodes s and t; σ(s,t|v) is the number of these paths passing another node v; and i is the frame number.CC is a measure of the proximity of a node to all the other nodes in a network. Here, the CC of a node is calculated by computing the average of the shortest paths between the node in question and all the other nodes in the network.[95]CC informs on a node’s centrality in the network communication and is calculated from eq , where d(v,u) is the shortest-path distance between v and u, and n is the number of nodes in the graph.The DC metric informs on the extent to which a given node is connected to its neighbors in a network.[95] In other words, DC is a measure of the number of unique edges a node has, and it is calculated from eq where n is the number of nodes; A is the jkth adjacency for the ith frame.EC is used to measure the node’s influence in a network based on the centrality of its neighboring nodes (high-scoring and low scoring nodes).[95,96] Here, an adjacency matrix is used to calculate EC of a node while considering the centrality of the neighboring nodes using eqs and 4b, where (a) EC is the eigenvector and lambda is the eigenvalue for the eigen decomposition of adjacency matrix A. In NetworkX, this is obtained by power iteration. (b) Averaged EC is computed for the ith residue by computing the vector for each MD frame and averaging.In graph theory, the KC metric informs on the relative influence of a node in a network while considering not only its immediate neighboring nodes but also the neighbors’ neighbors.[96]KC is computed from eqs and 5b.

Identification of DRN Centrality Hubs

Applying the recently developed algorithm for DRN analysis,[37,38] metric-specific results per protomer for all systems were merged in a vector before sorting them in descending order while keeping track of the residue numbers (indices). Afterward, a threshold for DRN values within the top 4% of the whole set was determined by getting the product of the total number of protomer residues (715) by the number of systems (12) and percentage cut off (0.04); [715 × 12 × 0.04]. The 4% cutoff was selected to include all hub residues while limiting noise in the data based on the size of the protein. Lastly, the threshold was used to create a binary matrix with dimensions similar to those of the original set. Residues with hub status together with homologous residues from each system were selected and presented as a heat map for each metric.

Residue Contact Map Analysis

Residue contact maps were calculated to illustrate the degree of interaction at (1) mutation positions and (2) persistent hubs from DRN analysis for each system trajectory. This analysis was performed to identify the changes in residue–residue interactions resulting from mutations in the equilibrated portion of the simulation. Using the MDM-TASK-web’s[36] weighted residue contact map tool and a cutoff Euclidean distance of 6.7 Å between the target residue and its immediate contacts, weighted contacts were calculated for the last 50 ns using a step size of 1 frame. Contact heat maps were then generated using the contact_heatmap.py tool from MDM-TASK-web.[36]

Analysis of the Dimer Binding Energy through Alanine Scanning

Alanine scanning was performed via the ROBETTA Web server.[97] To this effect, individual essential dynamics otherwise referred to as PCA was computed for each system separately using the MDM-TASK web to obtain the time stamps of the low energy/stable conformations as calculated by the standard PCA tool of MDM-TASK web.[35,36] Stable conformations were then extracted from the respective trajectories at these time stamps using the gmx trjconv tool and submitted to the ROBETTA Web server. Here, structural interface residues were comprehensively mutated to alanine to evaluate (1) the change in the residue binding energy contribution in the protein and (2) the change in protein stability. An interface residue was defined as one having one or more atoms within ≤4 Å of a residue atom from the other monomer. Destabilizing, stabilizing, and neutral residues were defined as having binding energies of ≥1, −0.8, and −0.8 to 0.99 kcal/mol, respectively.

79 in total

Review 1. Structural symmetry and protein function.

Authors: D S Goodsell; A J Olson
Journal: Annu Rev Biophys Biomol Struct Date: 2000

2. Small-world communication of residues and significance for protein dynamics.

Authors: Ali Rana Atilgan; Pelin Akan; Canan Baysal
Journal: Biophys J Date: 2004-01 Impact factor: 4.033

3. VMD: visual molecular dynamics.

Authors: W Humphrey; A Dalke; K Schulten
Journal: J Mol Graph Date: 1996-02

4. Protein structure modeling with MODELLER.

Authors: Benjamin Webb; Andrej Sali
Journal: Methods Mol Biol Date: 2014

5. Modulation of the activities of catalase-peroxidase HPI of Escherichia coli by site-directed mutagenesis.

Authors: A Hillar; B Peters; R Pauls; A Loboda; H Zhang; A G Mauk; P C Loewen
Journal: Biochemistry Date: 2000-05-16 Impact factor: 3.162

6. Analysis of interactions of clinical mutants of catalase-peroxidase (KatG) responsible for isoniazid resistance in Mycobacterium tuberculosis with derivatives of isoniazid.

Authors: Ameeruddin Nusrath Unissa; George Priya Doss C; Thirumal Kumar; Sukumar Swathi; Appisetty Ramya Lakshmi; Luke Elizabeth Hanna
Journal: J Glob Antimicrob Resist Date: 2017-07-23 Impact factor: 4.035

7. Access channel residues Ser315 and Asp137 in Mycobacterium tuberculosis catalase-peroxidase (KatG) control peroxidatic activation of the pro-drug isoniazid.

Authors: Xiangbo Zhao; Hans-Petter Hersleth; Janan Zhu; K Kristoffer Andersson; Richard S Magliozzo
Journal: Chem Commun (Camb) Date: 2013-12-25 Impact factor: 6.222

8. Computational modeling and bioinformatic analyses of functional mutations in drug target genes in Mycobacterium tuberculosis.

Authors: Pooja Singh; Salma Jamal; Faraz Ahmed; Najumu Saqib; Seema Mehra; Waseem Ali; Deodutta Roy; Nasreen Z Ehtesham; Seyed E Hasnain
Journal: Comput Struct Biotechnol J Date: 2021-04-19 Impact factor: 7.271

9. Tuberculosis drug resistance mutation database.

Authors: Andreas Sandgren; Michael Strong; Preetika Muthukrishnan; Brian K Weiner; George M Church; Megan B Murray
Journal: PLoS Med Date: 2009-02-10 Impact factor: 11.069

1 in total

1. Evolutionary progression of collective mutations in Omicron sub-lineages towards efficient RBD-hACE2: Allosteric communications between and within viral and human proteins.

Authors: Victor Barozi; Adrienne L Edkins; Özlem Tastan Bishop
Journal: Comput Struct Biotechnol J Date: 2022-08-17 Impact factor: 6.155

1 in total