Literature DB >> 28575169

MD-TASK: a software suite for analyzing molecular dynamics trajectories.

David K Brown1, David L Penkler1, Olivier Sheik Amamuddy1, Caroline Ross1, Ali Rana Atilgan2, Canan Atilgan2, Özlem Tastan Bishop1.   

Abstract

SUMMARY: Molecular dynamics (MD) determines the physical motions of atoms of a biological macromolecule in a cell-like environment and is an important method in structural bioinformatics. Traditionally, measurements such as root mean square deviation, root mean square fluctuation, radius of gyration, and various energy measures have been used to analyze MD simulations. Here, we present MD-TASK, a novel software suite that employs graph theory techniques, perturbation response scanning, and dynamic cross-correlation to provide unique ways for analyzing MD trajectories.
AVAILABILITY AND IMPLEMENTATION: MD-TASK has been open-sourced and is available for download from https://github.com/RUBi-ZA/MD-TASK , implemented in Python and supported on Linux/Unix. CONTACT: o.tastanbishop@ru.ac.za.
© The Author(s) 2017. Published by Oxford University Press.

Entities:  

Mesh:

Year:  2017        PMID: 28575169      PMCID: PMC5860072          DOI: 10.1093/bioinformatics/btx349

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Molecular dynamics (MD) is used to understand the movement of atoms of macromolecules in a simulated cell environment. MD simulations produce trajectories depicting the motions of atoms by presenting the atomic coordinates at specified time intervals, allowing for investigation into changes over time. Traditionally, these trajectories are used to analyze macromolecule dynamics by calculating well-established measures such as root mean square deviation, root mean square fluctuation (RMSF), radius of gyration and energy-based approaches including the molecular mechanics/Poisson Boltzmann surface area (MM/PBSA) and the molecular mechanics/generalized Born surface area (MM/GBSA) (Kollman ). Applications of MD simulations are broad, ranging from determining the stability of macromolecular complexes to understanding allosteric behavior of proteins. Although the measures mentioned above are informative, sometimes additional approaches are required. For example, changes in residue interaction networks (RINs) have been investigated in the context of mutation analysis (Bhakat ; Doshi ; Brown ). Another method, perturbation response scanning (PRS), following MD simulations, is used to identify residues important for controlling conformational changes (Atilgan and Atilgan, 2009). Although traditional measures are incorporated into MD programs, they can be limited depending on the research questions. Hence, individual research labs often write custom scripts to further analyze trajectories. This might be challenging to some, especially if it requires mathematical knowledge and complex scripting. To serve this need, we present MD-TASK, an easy-to-use tool suite with detailed documentation, for analyzing MD trajectories. MD-TASK includes what we call dynamic residue network (DRN) (which combines the RINs of the frames in an MD trajectory) analysis, PRS, and dynamic cross-correlation (DCC), none of which are found in commonly used MD packages.

2 Materials and methods

2.1 Implementation

MD-TASK was developed using Python for Linux/Unix-based systems. Various non-standard Python libraries, including NumPy, SciPy, Matplotlib (Hunter, 2007), MDTraj (McGibbon ) and NetworkX, were used in the suite. Thus, MD-TASK supports any formats the underlying Python libraries support including .binpos (AMBER), LH5 (MSMBuilder2), PDB, XML (OpenMM, HOOMD-Blue), .arc (TINKER), .dcd (NAMD), .dtr (DESMOND), hdf5, NetCDF (AMBER), .trr (Gromacs), .xtc (Gromacs), .xyz (VMD) and LAMMPS. Further, the igraph package for R was used to generate residue contact maps.

2.2 Network analysis

RINs can be analyzed using graph theory. In a RIN, each residue in the protein is a node in the network. An edge between two nodes exists if the Cβ atoms (Cα for Glycine) of the residues are within a user-defined cut-off distance (usually 6.5–7.5 Å) of each other. MD-TASK constructs a DRN and uses this to calculate the changes in betweenness centrality (BC) and average shortest path (L) to residues over the trajectory.

2.2.1 Betweenness centrality

BC is a measure of how important a residue is for communication within a protein. The BC of a node is equal to the number of shortest paths from all nodes to all others that pass through that node. MD-TASK uses an implementation of Ulrik Brandes algorithm in the NetworkX library for calculating BC (Brandes, 2001).

2.2.2 Average shortest path

For a given residue, L is the sum of the shortest paths to that residue, divided by the total number of residues less one. MD-TASK uses a custom algorithm in the NetworkX library for quickly calculating the shortest path between all residues (NetworkX Developers, 2017). The average shortest path describes how accessible a residue is within the protein.

2.2.3 Residue contact map

Residue contact maps are generated by monitoring the interactions of a residue throughout a simulation, yielding a network diagram with the residue of interest [e.g. single nucleotide polymorphism (SNP)] at the center, and residues that it interacts with arranged around it. Edges between the residue of interest and the other residues are weighted based on how often the interaction exists.

2.3 Perturbation response scanning

Given the atomic coordinates for initial and target states, together with an equilibrated MD trajectory of the initial structure, the algorithm performs a residue-by-residue scan of the initial conformation, exerting external forces on each residue, and records the subsequent displacement of other residues using linear response theory and a variance-covariance matrix obtained from the MD trajectory (Atilgan ). The quality of the predicted displacement is assessed by correlating the predicted and experimental difference between the initial and target states. This results in a correlation coefficient for each residue, where a value close to one implies good agreement with the known experimental change. Residues whose perturbation invokes a conformational displacement closest to the target structure are reported as hot residues.

2.4 Dynamic cross-correlation

DCC is calculated using the following formula: with the displacement from the average position of atom i, and ⟨⟩ the time average over the whole trajectory (Di Marino ). MD-TASK generates a heat-map depicting the DCC between the Cα atoms of selected frames in a trajectory to identify relative residue movements.

3 Performance

The results of a performance test are presented in Table 1. Tests were conducted on the ‘example_small.dcd’ trajectory, a 599 residue, 2000 frame trajectory provided in the ‘example’ sub-directory of the MD-TASK Github repository. Tools were set to iterate through the trajectory at 100 frame intervals and then executed 10 times. These executions were timed using the Linux ‘time’ utility, which includes the time to start up the Python interpreter, providing an accurate measure of what users will experience in practice. An average time was then calculated for each tool. The PRS script used 50 random forces per residue.
Table 1.

Results of MD-TASK performance tests

ScriptAverage time (s)
calc_network.py (–calc-L)37 298
calc_network.py (–calc-BC)62 109
calc_delta_BC.py16 852
calc_delta_L.py1864
avg_network.py1713
compare_networks.py4230
delta_networks.py2289
contact_map.py19 806
calc_correlation.py39 703
prs.py95 480
Results of MD-TASK performance tests Tests were conducted on a PC running Ubuntu 16.04 with an Intel Core i5-6300U CPU with 4 logical cores running at 2.4 GHz and 8 GB RAM.

4 Applications

The network measures, BC and L, have been used in minimized protein structures (Ozbaykal ) for Alanine scanning. One suggested use, here, is SNP analysis over MD simulations. The network analysis scripts of MD-TASK were applied to analyze renin-angiotensinogen system (Brown ). The data indicated that combination of RMSF values with network analysis can be informative. Further, a SNP analysis protocol combining traditional MD analysis tools with DRN is proposed in a recent review article (Brown and Tastan Bishop, 2017). PRS identifies single residues playing an active role in protein conformational changes between an initial and target state (Atilgan and Atilgan, 2009). Perturbative methods are instrumental in uncovering different protein functions such as the effect of pH on the distribution of conformations of calmodulin (Atilgan ) and ferric binding protein (Atilgan and Atilgan, 2009). By disclosing nonevident relationships, PRS was used to suggest new experiments to explore allosteric communication, e.g. in HSP70 (Penkler ). As an example of MD-TASK outputs, we used HIV protease. Sequence of closed conformation of crystal structure (4ZIP) was used as wild type (WT) target sequence to model its structure in open conformation by means of homology modeling. Two major (V32I, I47V) and one minor (V82I) mutations were then introduced, also using homology modeling, to create the open conformation mutant structure. In both cases 1TW7 was used as the template. Network analysis was performed for 40 ns MD runs for both WT and mutant proteins in open conformation (Fig. 1A). DCC was calculated for the mutant protein (Fig. 1B). For PRS (Fig. 1C), an equilibrated 20 ns section of the mutant trajectory was used. The PDB structure, 3S54, which represents the closed conformation with the same mutations, was used as the end state during PRS calculations. Residues having the highest change in reachability (highest ΔL) during the course of the trajectory are 46–56, comprising the flap. This property gives HIV-protease the flexibility to expand the active site cavity and diminish the effect of inhibitors while staying functional (Martin ). The peaks in ΔBC correspond to active site residues 25–27, supporting the hypothesis that high BC positions are responsible for interdomain communication (Ozbaykal ). While these properties are similar in both WT and mutant, the neighborhood structure of position 47 slightly shifts, as shown in the residue contact maps. DCC (Fig. 1B) demonstrates that the intrachain motions are highly correlated, while the motions of the chains with respect to each other are anticorrelated (residues 1–99 versus 100–198). The PRS results (Fig. 1C) display that there are no single residues whose perturbation directly leads to the conformational change between the equilibrated WT structure at the 20 ns snapshot and the triple mutant, as the maximum correlations are ∼0.6. Nevertheless, highest correlations span residues 60–72 implying an allosteric communication between this region and the flap motions. The specific residues are mapped onto the protein structure in Figure 1D. Figure 1 summarizes how MD-TASK provides a means to analyze the trajectories, and gives a bird’s eye view of various factors that may be effective in the dynamics of a protein.
Fig. 1

Outputs for (A) Network analysis: average ΔL; average ΔBC; residue contact maps (from top to bottom); (B) DCC; (C) PRS (plot not generated by MD-TASK); (D) HIV-protease with significant regions highlighted

Outputs for (A) Network analysis: average ΔL; average ΔBC; residue contact maps (from top to bottom); (B) DCC; (C) PRS (plot not generated by MD-TASK); (D) HIV-protease with significant regions highlighted

5 Conclusion

MD simulations have become an important tool in structural bioinformatics. Here, we present a new tool suite for analyzing MD trajectories using DRN analysis, PRS, and DCC. To the best of our knowledge, MD-TASK is the first downloadable tool suite for analysis of different properties along MD simulations not commonly found in other MD packages.
  12 in total

1.  Calculating structures and free energies of complex molecules: combining molecular mechanics and continuum models.

Authors:  P A Kollman; I Massova; C Reyes; B Kuhn; S Huo; L Chong; M Lee; T Lee; Y Duan; W Wang; O Donini; P Cieplak; J Srinivasan; D A Case; T E Cheatham
Journal:  Acc Chem Res       Date:  2000-12       Impact factor: 22.384

Review 2.  Network-based models as tools hinting at nonevident protein functionality.

Authors:  Canan Atilgan; Osman Burak Okan; Ali Rana Atilgan
Journal:  Annu Rev Biophys       Date:  2012-02-23       Impact factor: 12.981

3.  In silico mutational studies of Hsp70 disclose sites with distinct functional attributes.

Authors:  Gizem Ozbaykal; Ali Rana Atilgan; Canan Atilgan
Journal:  Proteins       Date:  2015-09-28

4.  MDTraj: A Modern Open Library for the Analysis of Molecular Dynamics Trajectories.

Authors:  Robert T McGibbon; Kyle A Beauchamp; Matthew P Harrigan; Christoph Klein; Jason M Swails; Carlos X Hernández; Christian R Schwantes; Lee-Ping Wang; Thomas J Lane; Vijay S Pande
Journal:  Biophys J       Date:  2015-10-20       Impact factor: 4.033

5.  "Wide-open" 1.3 A structure of a multidrug-resistant HIV-1 protease as a drug target.

Authors:  Philip Martin; John F Vickrey; Gheorghe Proteasa; Yurytzy L Jimenez; Zdzislaw Wawrzak; Mark A Winters; Thomas C Merigan; Ladislau C Kovari
Journal:  Structure       Date:  2005-12       Impact factor: 5.006

6.  Perturbation-Response Scanning Reveals Key Residues for Allosteric Control in Hsp70.

Authors:  David Penkler; Özge Sensoy; Canan Atilgan; Özlem Tastan Bishop
Journal:  J Chem Inf Model       Date:  2017-06-12       Impact factor: 4.956

7.  Dynamical network of residue-residue contacts reveals coupled allosteric effects in recognition, catalysis, and mutation.

Authors:  Urmi Doshi; Michael J Holliday; Elan Z Eisenmesser; Donald Hamelberg
Journal:  Proc Natl Acad Sci U S A       Date:  2016-04-11       Impact factor: 11.205

8.  Structure-Based Analysis of Single Nucleotide Variants in the Renin-Angiotensinogen Complex.

Authors:  David K Brown; Olivier Sheik Amamuddy; Özlem Tastan Bishop
Journal:  Glob Heart       Date:  2017-03-13

9.  An integrated molecular dynamics, principal component analysis and residue interaction network approach reveals the impact of M184V mutation on HIV reverse transcriptase resistance to lamivudine.

Authors:  Soumendranath Bhakat; Alberto J M Martin; Mahmoud E S Soliman
Journal:  Mol Biosyst       Date:  2014-08

10.  Perturbation-response scanning reveals ligand entry-exit mechanisms of ferric binding protein.

Authors:  Canan Atilgan; Ali Rana Atilgan
Journal:  PLoS Comput Biol       Date:  2009-10-23       Impact factor: 4.475

View more
  46 in total

1.  NAPS update: network analysis of molecular dynamics data and protein-nucleic acid complexes.

Authors:  Broto Chakrabarty; Varun Naganathan; Kanak Garg; Yash Agarwal; Nita Parekh
Journal:  Nucleic Acids Res       Date:  2019-07-02       Impact factor: 16.971

2.  An Interhelical Salt Bridge Controls Flexibility and Inhibitor Potency for Regulators of G-protein Signaling Proteins 4, 8, and 19.

Authors:  Vincent S Shaw; Mohammadjavad Mohammadi; Josiah A Quinn; Harish Vashisth; Richard R Neubig
Journal:  Mol Pharmacol       Date:  2019-09-22       Impact factor: 4.436

3.  Structural Analysis of the Regulatory GAF Domains of cGMP Phosphodiesterase Elucidates the Allosteric Communication Pathway.

Authors:  Richa Gupta; Yong Liu; Huanchen Wang; Christopher T Nordyke; Ryan Z Puterbaugh; Wenjun Cui; Krisztina Varga; Feixia Chu; Hengming Ke; Harish Vashisth; Rick H Cote
Journal:  J Mol Biol       Date:  2020-09-06       Impact factor: 5.469

4.  Exploring the conformational dynamics and flexibility of intrinsically disordered HIV-1 Nef protein using molecular dynamic network approaches.

Authors:  Anil Bhattarai; Isaac Arnold Emerson
Journal:  3 Biotech       Date:  2021-03-04       Impact factor: 2.406

5.  Molecular simulation studies of the interactions between the human/pangolin/cat/bat ACE2 and the receptor binding domain of the SARS-CoV-2 spike protein.

Authors:  Shaojie Ma; Hui Li; Jun Yang; Kunqian Yu
Journal:  Biochimie       Date:  2021-05-11       Impact factor: 4.079

6.  Identification of Selective Novel Hits against Plasmodium falciparum Prolyl tRNA Synthetase Active Site and a Predicted Allosteric Site Using in silico Approaches.

Authors:  Dorothy Wavinya Nyamai; Özlem Tastan Bishop
Journal:  Int J Mol Sci       Date:  2020-05-27       Impact factor: 5.923

7.  webPSN v2.0: a webserver to infer fingerprints of structural communication in biomacromolecules.

Authors:  Angelo Felline; Michele Seeber; Francesca Fanelli
Journal:  Nucleic Acids Res       Date:  2020-07-02       Impact factor: 16.971

8.  Dynamic residue interaction network analysis of the oseltamivir binding site of N1 neuraminidase and its H274Y mutation site conferring drug resistance in influenza A virus.

Authors:  Mohini Yadav; Manabu Igarashi; Norifumi Yamamoto
Journal:  PeerJ       Date:  2021-06-02       Impact factor: 2.984

9.  Site-Mutation of Hydrophobic Core Residues Synchronically Poise Super Interleukin 2 for Signaling: Identifying Distant Structural Effects through Affordable Computations.

Authors:  Longcan Mei; Yanping Zhou; Lizhe Zhu; Changlin Liu; Zhuo Wu; Fangkui Wang; Gefei Hao; Di Yu; Hong Yuan; Yanfang Cui
Journal:  Int J Mol Sci       Date:  2018-03-20       Impact factor: 5.923

10.  Structural Characterization of Carbonic Anhydrase VIII and Effects of Missense Single Nucleotide Variations to Protein Structure and Function.

Authors:  Taremekedzwa Allan Sanyanga; Özlem Tastan Bishop
Journal:  Int J Mol Sci       Date:  2020-04-16       Impact factor: 6.208

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.