Thanh Thi Nguyen (1), Pubudu N Pathirana (2), Thin Nguyen (3), Quoc Viet Hung Nguyen (4), Asim Bhatti (5), Dinh C Nguyen (2), Dung Tien Nguyen (6), Ngoc Duy Nguyen (5), Douglas Creighton (5), Mohamed Abdelrazek (6)

1. School of Information Technology, Deakin University, Victoria, Australia. thanh.nguyen@deakin.edu.au
2. School of Engineering, Deakin University, Victoria, Australia.
3. Applied Artificial Intelligence Institute (A2I2), Deakin University, Victoria, Australia.
4. School of Information and Communication Technology, Griffith University, Queensland, Australia.
5. Institute for Intelligent Systems Research and Innovation (IISRI), Deakin University, Victoria, Australia.
6. School of Information Technology, Deakin University, Victoria, Australia.
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a highly pathogenic virus that has caused the global COVID-19 pandemic. Tracing the evolution and transmission of the virus is crucial for responding to and controlling the pandemic through appropriate intervention strategies. This paper reports and analyses genomic mutations in the coding regions of SARS-CoV-2 and the changes they are likely to cause in protein secondary structure and solvent accessibility, as predicted by deep learning models. The predictions suggest that mutation D614G in the viral spike protein, which has attracted much attention from researchers, is unlikely to change protein secondary structure or relative solvent accessibility. Based on 6324 viral genome sequences, we create a spreadsheet dataset of point mutations that can facilitate the investigation of SARS-CoV-2 from many perspectives, especially in tracing the evolution and worldwide spread of the virus. Our analysis also shows that the coding genes E, M, ORF6, ORF7a, ORF7b and ORF10 are the most stable and are therefore potentially suitable targets for vaccine and drug development.