Sheng Ye, Guozhen Zhang, Jun Jiang.
Abstract
The novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), enters human cells via human angiotensin-converting enzyme 2 (hACE2), causing the severe coronavirus disease (COVID-19). The interactions between hACE2 and the spike glycoprotein (S protein) of SARS-CoV-2 hold the key to understanding the molecular mechanism and to developing treatments and vaccines, yet the dynamic nature of these interactions in fluctuating surroundings is very challenging to probe with structure determination techniques that require the sample structure to be fixed. Here we demonstrate, by a proof-of-concept simulation of infrared (IR) spectra of S protein and hACE2, that time-resolved spectroscopy, with the help of machine learning, may monitor real-time structural information of the protein-protein complexes of interest. Our machine learning protocol is able to identify fine changes in IR spectra associated with variation of the secondary structures of the S protein of the coronavirus. Further, it is three to four orders of magnitude faster than conventional quantum chemistry calculations. We expect our machine learning protocol to accelerate the development of real-time spectroscopy studies of protein dynamics.
Keywords: IR spectroscopy; SARS-CoV-2; neural networks; protein dynamics
Year: 2021 PMID: 34185681 PMCID: PMC8256048 DOI: 10.1073/pnas.2025879118
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Fig. 1. ML protocol for the IR spectra of proteins.
Fig. 2. ML-predicted IR spectra of SARS-CoV-1, SARS-CoV-2, SARS-CoV-1-hACE2, and SARS-CoV-2-hACE2. (A) Comparison of experimental (30) (black line) and ML-predicted (red line: single crystal structure [PDB ID code 2AMQ]; blue line: average of 1,000 configurations) spectra of SARS-CoV-1. (B) ML-predicted IR spectra of SARS-CoV-2 based on a single crystal structure (red lines, PDB ID code 6LU7) and 2,000 MD configurations (blue lines). (C) ML-predicted IR spectra of SARS-CoV-1-hACE2 (PDB ID code 2AJF) during a 10 μs MD simulation (nine trajectories; 1,000 snapshots each for trajectories 1 to 8 and 334 snapshots for trajectory 9). (D) Same as C but for SARS-CoV-2-hACE2 (PDB ID code 6M17). Within each panel, spectra are scaled to the same maximum intensity.
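The Fig. 2 caption refers to two post-processing steps: averaging spectra over many MD configurations and scaling intensities to a common maximum. The following is a minimal sketch of those two steps only, not the authors' code. The function names, the frequency window, and the synthetic Lorentzian "snapshots" are illustrative assumptions standing in for real ML-predicted spectra.

```python
import random
from statistics import fmean

def ensemble_average(spectra):
    """Average equal-length spectra point by point over all snapshots."""
    return [fmean(column) for column in zip(*spectra)]

def scale_to_unit_max(spectrum):
    """Scale a spectrum so that its maximum intensity equals 1."""
    peak = max(spectrum)
    if peak <= 0:
        return list(spectrum)  # nothing to scale; return an unchanged copy
    return [value / peak for value in spectrum]

def lorentzian(freq, center, width=10.0):
    """Synthetic single-peak line shape (assumption, for demonstration only)."""
    return 1.0 / (1.0 + ((freq - center) / width) ** 2)

# Synthetic stand-in data: 1,000 snapshots whose peak position fluctuates.
random.seed(0)
freqs = [1550.0 + i for i in range(201)]          # cm^-1, assumed window
centers = [random.gauss(1650.0, 5.0) for _ in range(1000)]
snapshots = [[lorentzian(f, c) for f in freqs] for c in centers]

single = scale_to_unit_max(snapshots[0])           # one configuration
averaged = scale_to_unit_max(ensemble_average(snapshots))  # ensemble average
```

Scaling both curves to a unit maximum makes their line shapes directly comparable, which appears to be the purpose of the per-panel scaling described in the caption.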
Average secondary structure content (computed by the Stride program) of various coronaviruses, and comparison of the time required to compute the IR spectrum of a single structure by Density Functional Theory (DFT) and by our ML model in the framework of the vibrational exciton model.
| System | β-Strands (%) | β-Turns (%) | α-Helix (%) | 3₁₀-Helices (%) | Coil (%) | Bridge (%) | DFT (s) | ML (s) |
| SARS-CoV-1 | 30.1 | 19.9 | 23.9 | 2.5 | 21.0 | 2.5 | 1,165,320 | 70.69 |
| SARS-CoV-2 | 28.3 | 25.5 | 20.3 | 2.6 | 20.4 | 2.9 | 1,173,000 | 72.68 |
| SARS-CoV-1-hACE2 | 7.6 | 23.2 | 45.2 | 3.9 | 18.0 | 2.2 | 1,482,120 | 100.80 |
| SARS-CoV-2-hACE2 | 7.0 | 21.2 | 45.6 | 3.2 | 21.8 | 1.2 | 1,474,440 | 98.68 |
| Trimeric SARS-CoV-2 S protein (closed state) | 30.7 | 25.6 | 18.0 | 1.9 | 21.9 | 1.7 | 6,068,100 | 5,295.60 |
| Trimeric SARS-CoV-2 S protein (open state) | 30.6 | 25.1 | 18.7 | 1.9 | 22.1 | 1.6 | 6,068,100 | 4,613.40 |
| RBD/hACE2 binding (S1 state) | 32.3 | 22.1 | 9.4 | 7.8 | 27.7 | 0.8 | 370,440 | 20.64 |
| RBD/hACE2 binding (S2 state) | 31.8 | 21.5 | 12.1 | 6.2 | 27.3 | 1.2 | 370,440 | 20.64 |
| RBD/hACE2 binding (S3 state) | 33.5 | 25.5 | 12.1 | 6.2 | 21.5 | 1.2 | 370,440 | 20.64 |
| RBD/hACE2 binding (S4 state) | 33.0 | 21.4 | 9.4 | 7.8 | 27.3 | 1.2 | 370,440 | 20.64 |
| RBD/hACE2 binding (S5 state) | 33.0 | 21.9 | 11.6 | 4.7 | 27.6 | 1.2 | 370,440 | 20.64 |
All reported times refer to calculations on an eight-core Intel(R) Xeon(R) CPU (E5-2683v4 at 2.1 GHz). DFT, Density Functional Theory.
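The abstract's claim of a speedup of three to four orders of magnitude can be checked against the DFT and ML timings in the table. The sketch below does only that arithmetic. The timings are copied from the table, while the dictionary layout and function names are illustrative choices.

```python
import math

# (DFT seconds, ML seconds) per system, copied from the table above.
timings = {
    "SARS-CoV-1": (1_165_320, 70.69),
    "SARS-CoV-2": (1_173_000, 72.68),
    "SARS-CoV-1-hACE2": (1_482_120, 100.80),
    "SARS-CoV-2-hACE2": (1_474_440, 98.68),
    "Trimeric S protein (closed)": (6_068_100, 5_295.60),
    "Trimeric S protein (open)": (6_068_100, 4_613.40),
    "RBD/hACE2 binding (S1 to S5)": (370_440, 20.64),
}

def speedup(dft_s, ml_s):
    """Ratio of DFT wall time to ML wall time."""
    return dft_s / ml_s

def orders_of_magnitude(ratio):
    """Base-10 logarithm of a speedup ratio."""
    return math.log10(ratio)

for name, (dft_s, ml_s) in timings.items():
    ratio = speedup(dft_s, ml_s)
    print(f"{name}: {ratio:,.0f}x "
          f"({orders_of_magnitude(ratio):.1f} orders of magnitude)")
```

With these numbers the monomeric proteins, the hACE2 complexes, and the RBD/hACE2 states come out slightly above four orders of magnitude, while the two trimeric S protein structures come out slightly above three.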
Fig. 3. ML-predicted IR spectra of trimeric SARS-CoV-2 S protein. (A) Closed state (PDB ID code 6VXX). (B) Open state (PDB ID code 6VYB).
Fig. 4. Five representative states of the receptor-binding domain of the SARS-CoV-2 spike (S protein) and the human ACE2 (hACE2) receptor were selected from the combination trajectory.