Emre Aktas. Faculty of Art and Science, Department of Molecular Biology and Genetics, Bioinformatics Section, Afyon Kocatepe University, Afyonkarahisar, Turkey.
Abstract
Certain mutations of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) are already known. In addition to these known mutations, this study found other new mutations across regions. Based on the results, for which 4,326 SARS-CoV-2 whole sequences were used, some mutations are peculiar to certain regions, while others are found in all regions. In Asia, multiple mutations were found in the same sequence (3 different mutations in QLA46612, isolated from South Korea). Although a large number of mutations were detected by region (more than 70 in Asia), bioinformatics tools predicted that some of them damage the spike protein structure: G75V (isolated from North America), T95I (isolated from South Korea), G143V (isolated from North America), M177I (isolated from Asia), L293M (isolated from Asia), P295H (isolated from Asia), T393P (isolated from Europe), P507S (isolated from Asia), and D614G (isolated from all regions). Furthermore, this study aimed to reveal how the ligand-binding sites change if the spike protein structure is damaged, and whether more than one mutation affects ligand binding. Mutations that were predicted to damage the structure did not affect the ligand-binding sites, whereas the ligand-binding sites were affected in sequences with multiple mutations. This study may offer a different perspective on both SARS-CoV-2 vaccine studies and the changes that mutations cause in the structure of the spike protein of this virus.
In the past 2 decades, mankind has faced at least one lethal outbreak caused by betacoronaviruses.
The first was severe acute respiratory syndrome coronavirus (SARS-CoV) in 2002, which infected more than 8,000 people, with nearly 800 deaths.
In 2012, Middle East Respiratory Syndrome (MERS)-CoV resulted in 2,294 cases.
The most recent is severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which causes the contagious disease COVID-19 (coronavirus disease 2019), first reported in Wuhan in December 2019. Despite wide efforts to control the disease, COVID-19 has now spread to more than 100 countries and resulted in a worldwide pandemic.
To date, more than 119,267,000 cases have been recorded, and the number of deaths has exceeded 2,647,000 (https://covid19.who.int/, March 15, 2020). Information about viral mutations for COVID-19 will give important insights into assessing viral drug resistance, immune escape, and pathogenesis-related mechanisms.
Moreover, this information may play a vital role in the design of new vaccines, antiviral drugs, and diagnostic assays. However, the mutagenic process is complex, and many factors are implicated in it, such as nucleic acid replication with little or no proofreading capability and/or postreplicative nucleic acid repair, host enzymes, spontaneous nucleic acid damage due to physical and chemical mutagens, recombination events, and other particular genetic elements.
Some mutations in different proteins of SARS-CoV have already been found.
Along with these mutations, some combined factors are thought to make COVID-19 dangerous. One of these factors may be that humanity has no direct immunological experience with SARS-CoV-2, making humans prone to the infection.
The rapid global spread of COVID-19 may give the virus a higher chance for natural selection of mutations. As in the case of influenza (where mutations slowly accumulate in the hemagglutinin protein), there is a complex interplay between mutations that can confer immune resistance to the virus and the fitness landscape of the particular variant in which they arise. SARS-CoV-2, which has a remarkably high mutation rate and many characterized variations, has been shown to have undergone certain mutations in its structural and nonstructural proteins within several months of its global spread.[8-11] Virus-related mutations are a concern because they can affect both the transmission rate of the virus and possible vaccine studies, and because mutations can be specific to regions such as Europe and North America.[5,12] For example, SARS-CoV-2 variants with G614 in the S protein have replaced the original D614 variants and have become the dominant form circulating globally.
Like some of the aforementioned studies,[8-12] this study focused on mutations and on the characteristics that may lead to spike structure damage. It aimed to determine the mutations that occurred in each region and to evaluate whether these new mutations affect the structure of the spike protein. In addition, this study predicts how mutations affect the ligand-binding site of SARS-CoV-2. Characterization of the detected variants may give new insight for designing new candidate vaccines, treatments, and diagnostic approaches for SARS-CoV-2.
Materials and Methods
Data set construction
The NCBI Virus website (www.ncbi.nlm.nih.gov/labs/virus) was used to obtain 4,526 whole surface glycoprotein sequences of SARS-CoV-2 (taxid: 2697049) isolated from humans, and the website's filters were set to group the sequences by geographic region.
Amino acid substitution analysis
The data set downloaded from the NCBI Virus website was aligned using the MEGAX program (alignment by MUSCLE). Geographical regions were evaluated separately, and amino acid substitutions were identified manually.
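The manual comparison described above could also be scripted. The following is a minimal Python sketch, not the procedure used in this study. It assumes two already aligned sequences of equal length (for example, exported after MUSCLE alignment) with gaps written as "-"; the function name `find_substitutions` is hypothetical.

```python
from typing import List


def find_substitutions(reference: str, sample: str) -> List[str]:
    """Return substitutions such as 'L5F' between two aligned sequences."""
    if len(reference) != len(sample):
        raise ValueError("aligned sequences must have equal length")
    substitutions = []
    ref_pos = 0  # 1-based position in the ungapped reference sequence
    for ref_aa, sample_aa in zip(reference, sample):
        if ref_aa != "-":
            ref_pos += 1
        # Skip gaps and ambiguous residues (X); report only true substitutions.
        if "-" in (ref_aa, sample_aa) or "X" in (ref_aa, sample_aa):
            continue
        if ref_aa != sample_aa:
            substitutions.append(f"{ref_aa}{ref_pos}{sample_aa}")
    return substitutions


# Toy example: leucine at position 5 replaced by phenylalanine.
print(find_substitutions("MFVFLVLLPL", "MFVFFVLLPL"))
```

Positions are counted on the reference sequence, so an insertion in the sample does not shift the numbering of later substitutions.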
Predicted protein structure
Phyre2 (www.sbg.bio.ic.ac.uk/phyre2/html), a suite of tools available on the web, was used to predict and analyze the protein structure, function, and mutations. All the predicted structures were obtained using this tool one by one.
Predicted structure models
The MISSENSE3D online tool was used to predict the structural effect of missense variants relative to the wild-type structure. All the structures obtained from Phyre2 were analyzed one by one.
Predicted phylogenetic clusters and genotypes
The Genome Detective Coronavirus Typing Tool was used to predict the phylogenetic clusters of the virus. This application identifies the phylogenetic clusters and genotypes from assembled genomes in amino acid FASTA format.
Prediction of ligand site
3DLigandSite online method was used for an automated prediction of the ligand-binding sites.
Results and Discussion
Finding mutations based on regions and discussion
Eighty-four whole spike protein sequences isolated from Africa were used. The most common mutations were Q667H (5 occurrences), D614G (3 occurrences), and R408I (2 occurrences); the others occurred once each, giving 8 different mutations (Table 1). One of these sequences, QJX45344, isolated from Tunisia, has 2 mutations, A288T and Q314R (Table 1). This result suggests that a patient may be infected with a virus carrying 2 different mutations at the same time. Neither of the two mutations was predicted to damage the spike protein structure, based on the result of the MISSENSE3D online tool.
In this region, only the D614G mutation was predicted to damage the structure; no structure damage was predicted for the others (Table 1). It is known that higher viral loads were found in patients infected with the G614 variant than in those infected with the D614 variant; clinical data, however, suggested no significant link between the D614G alteration and disease severity, while also suggesting that the alteration may have increased the infectivity of SARS-CoV-2.
Given this, it may not be possible to draw a clear conclusion about how the other mutations found will affect the severity of the epidemic, because even though the D614G mutation was predicted to damage the structure (Table 1), no significant link between the D614G alteration and disease severity was found.
Table 1.
Eight different mutations found in whole surface glycoprotein sequences from the Africa region (data taken from the NCBI Virus website; results obtained manually using MEGAX).
Accession number  Mutation
QKR84285          S12F
QJX45356          T29I
QJX45344          A288T
QJX45344          Q314R
QKT21014          R408I
QKR84321          A570S
QJX45321          D614G
QKW95051          S640A
In total, 347 whole sequences of the spike protein isolated from Europe were used to predict possible mutations in SARS-CoV-2 surface proteins. The most common mutations were D614G (39 occurrences), H49Y (3 occurrences), Y453F (8 occurrences), G261D (6 occurrences), A845S (4 occurrences), T676I (2 occurrences), S254F (2 occurrences), and I197V (2 occurrences), while the others, each found only once, are shown in Table 2. According to Table 2, the same type of substitution can occur at different positions. For instance, alanine changes to serine at 2 different positions, A845S and A892S (Table 2). In addition, threonine (T) changes to isoleucine at 3 different positions: T22I, T240I, and T676I. Only 2 of these mutations (T393P and D614G) were predicted to damage the structure (Table 2). Korber et al suggested that the D614G alteration may have increased the infectivity of SARS-CoV-2 and that higher viral loads were found in patients infected with the G614 variant, and Toyoshima et al reported that this variant also has a higher fatality rate. By analogy, when T393P occurs in the spike protein, it may affect both the infectivity of SARS-CoV-2 and viral load. A study showed that 4 mutations (at the nucleotide level) are common in the genomes of SARS-CoV-2 European isolates, where the infection is mostly more severe than in other geographical regions.
The T393P mutation, the other mutation predicted to damage the structure, was found only in Europe (Table 2), so it is conceivable that this mutation is more likely to occur in Europe. Although F486L and N501T were not predicted to damage the structure of the spike protein (Table 2), it has been stated that the N501T and F486L mutations affect the stability of the spike protein.
Stability is a fundamental property affecting the function, activity, and regulation of biomolecules, and it is also very important for vaccine studies.[12,21]
Table 2.
Thirty-five different mutations (obtained using the MEGAX program) for the Europe region only.
Accession number  Mutation  Accession number  Mutation
QKM76366          T22I      QJS39507          N501T
QJT72134          L5F       QJT73034          T553N
QHU79173          H49Y      QJC19455          K558R
QJD23141          Q115R     QJT72470          T572I
QJT72086          M153I     QJT72278          L611F
QJT72350          L176I     QKM76846          D614G
QJS53410          N188D     QJT72614          T676I
QJS53494          I197V     QJZ28203          M740I
QKJ68364          V213L     QJS54286          G769V
QJT73010          T240I     QJS53386          Y789D
QKM76906          S254F     QIC53204          F797C
QJS39543          G261D     QJT72710          A845S
QJS39627          V367F     QJS53578          A892S
QJT72806          V382E     QJT72242          A1020V
QJT72386          C379F     QJS53506          H1101Y
QJS54106          T393P     QJS53398          V1122L
QJS39603          Y453F     QJZ28203          D1260N
QJS39567          F486L
Based on 760 whole sequences from Oceania and South America, the most common mutations were G1124V (25 occurrences) and D614G (20 occurrences), while other mutations appear to be increasing, such as S50L (10 occurrences), A262T (11 occurrences), L5F (5 occurrences), D138H (3 occurrences), S221L (3 occurrences), and G485R (3 occurrences) (Table 3). As in Europe and Africa, the same type of substitution occurred at different positions, such as T29I, T76I, and T791I. In addition, sample QKV37632 has 2 mutations, T29I and S704L (Table 3). As in the sequences from all regions, the D614G mutation was predicted to damage the structure in this region. Although the D936Y mutation was not predicted to damage the spike protein structure, it is predicted to reduce the stability of the spike protein.
As already mentioned, stability is quite important for function and for vaccine studies.[12,21]
Table 3.
Thirty-one different mutations found in 760 whole spike protein sequences from Oceania and South America (20 of them from South America).
Accession number  Mutation  Accession number  Mutation
QJR90681          L5F       QHR84449          D614G
QKV37632          T29I      QJR87501          P621S
QKV38004          H49Y      QJR93417          A626V
QJR87081          S50L      QKR84925          Q675H
QJR88113          T76I      QJR87477          Q701H
QJR92637          I128F     QKV37632          S704L
QJR93237          D138H     QJR85593          M731I
QJR93801          L176F     QJR88113          T791I
QJR89217          S221L     QKV38208          P812S
QHR84449          S247R     QJR87261          A846V
QJR87129          W258L     QJR93861          D936Y
QJR87465          A262T     QJR88221          P1079S
QJR86937          I468T     QKV37548          G1124V
QKR86245          G485R     QJR85701          D1163G
QKR85081          H519Q     QJR85833          D1260N
QJR85965          P561L
The highest D614G mutation rate was found in North America, among 2,700 complete spike protein sequences. Based on the results (some are shown in Table 4), more than 255 occurrences of D614G were identified. In the sequences sampled for this study, other mutations were also found, such as L5F (19 occurrences), D138H (18 occurrences), E554D (13 occurrences), and P631L (10 occurrences). As in Tables 2 and 3, two different mutations were found at the same position. For instance, QKG89654 (A845D) and QKV35819 (A845V) have different mutations at the same position. Other examples are QKG91034 (Q836P) and QKG81751 (Q836L) (Table 4). These 2 examples may indicate that some positions are more vulnerable to mutation. For both, no structure damage was predicted by the MISSENSE3D online tool.
As in Europe and Africa, threonine (T) changed to isoleucine (I) at 3 different positions; however, no structure damage was predicted (Table 4). The interaction of mutations in the spike protein with antibodies has been examined, and it was determined that possible mutations affect the function of the antibodies.
Although not all mutations have a negative effect on the spike protein, some are known to affect it negatively. Some of the mutations listed in Table 4 may therefore adversely affect the spike protein.[12,21,22] The presence of some mutations only in a particular region may reflect a regional effect on the formation of mutations (Tables 1–5).
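The observation that one position can carry different substitutions (A845D/A845V, Q836P/Q836L) can be checked programmatically. The sketch below is illustrative only and was not part of the study's workflow; the function name and the regular expression for the mutation notation are assumptions, and the input list contains just a few entries copied from Table 4.

```python
import re
from collections import defaultdict

# Matches notation such as 'A845D': wild-type residue, position, mutant residue.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def positions_with_multiple_substitutions(mutations):
    """Map each position to its mutations when more than one distinct change occurs there."""
    by_position = defaultdict(set)
    for mutation in mutations:
        match = MUTATION_PATTERN.match(mutation)
        if match is None:
            continue  # ignore entries that do not follow the 'A845D' notation
        by_position[int(match.group(2))].add(mutation)
    return {
        position: sorted(found)
        for position, found in by_position.items()
        if len(found) > 1
    }


# A few entries copied from Table 4 (North America).
north_america = ["L5F", "D614G", "Q836P", "Q836L", "A845D", "A845V"]
print(positions_with_multiple_substitutions(north_america))
```

Grouping by position rather than by accession number keeps the check independent of how many isolates carry each substitution.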
Table 4.
Fifty-two different mutations found in whole spike protein sequences (2,500 sequences) from North America.
Accession number  Mutation  Accession number  Mutation
QKG81847          L5F       QKG90866          A570V
QKG81475          S12C      QKE61636          D614G
QKG90662          Q14H      QKG81571          P631L
QKV07471          T29I      QKG89666          A647V
QKG90530          F32L      QLC93320          Q677R
QKV38905          S50L      QKV39263          T732A
QKG89918          H69Y      QKV35279          N751D
QLA47679          G75V      QKG90590          A783S
QKG27877          T95I      QKG90614          P812S
QKW89191          E132D     QKG91034          Q836P
QKG86505          D138H     QKG81751          Q836L
QKV38905          G143V     QLB39201          G838D
QLB39236          R158S     QKG89654          A845D
QLC91400          R214L     QKV35819          A845V
QLC47920          F220L     QKV38964          L922F
QKY77964          L229F     QKS65656          S922F
QKS65788          H245R     QLC92852          A1078V
QKX46227          D253G     QLC47920          R1091L
QLC48052          A262S     QKG90434          T1120I
QKG90986          V267L     QKG86529          V1129A
QKV35267          R273S     QKV35279          L1141F
QKV37031          P330S     QLC91196          P1162S
QKV39455          T345S     QKG91082          E1195Q
QLC48016          N354K     QKS65584          G1219V
QKV08239          P384L     QLC93524          V1228L
QKI30376          E554D     QLC92372          P1263L
Table 5.
Seventy-six mutation results based on whole spike protein sequences (635 sequences) from the Asia region.
Accession number  Mutation  Accession number  Mutation
QJX44586          F2L       QJD23249          H519Q
QIT07011          L8V       QIU81885          A570V
QJX44430          S13I      QJT43608          T572I
QKO25614          Q14H      QKJ68545          D574Y
QJQ84843          T22I      QJR84537          E583D
QIA20044          Y28N      QKJ68497          Q613H
QKO25770          H49Y      QIT06999          D614G
QIU80913          S50L      QIU81873          A653V
QKO25770          T76I      QKT20894          H655Y
QLA46612          L54F      QKW92184          Q675H
QJY40517          R78M      QLA10116          Q677H
QLA46612          F86S      QKV49386          R682Q
QLA46612          T95I      QKN61217          R682W
QKO25758          D138H     QKU37093          A684V
QKE61684          N148Y     QJX44634          A706S
QKV27551          W152      QJD47800          R765L
QKQ30162          M153I     QIZ16509          V772I
QJT43452          E156D     QJD20632          T791I
QJY40469          S162I     QKY60177          K786N
QKJ68737          Q173H     QKY65277          K795Q
QJW00291          M177I     QKO00486          P809S
QLA09870          K188N     QJQ84831          A829T
QKO25794          N211Y     QJT43584          T827I
QHZ00379          S221W     QJX44466          A879S
QLA10140          W258L     QJD47718          S884F
QKY60121          A262S     QJT43572          A892V
QKX47933          G261R     QIA98583          A930V
QJC19491          Q271R     QKK12815          S939Y
QJD23249          L293M     QKF95522          Q1002E
QJD23249          D294I     QJY40517          H1083Q
QJD23249          P295H     QKI31226          F1109L
QKV49386          V367F     QKO25782          V1104L
QJX44562          E471Q     QJR84369          K1181R
QKY60177          Q506H     QKJ68545          D1153Y
QKY60177          Y508N     QKO25674          K1191N
QKY60177          P507S     QKJ68605          Q1201K
QKY60189          P507H     QJR84429          C1243F
Although many identical results were obtained, only representative examples are shown.
Asia is the region where the most mutation types were seen (Table 5). As in all regions, D614G was the most common variant (Tables 1 to 5); 240 isolated samples had this variant in Asia (Table 5). In addition, mutations that occurred more than 3 times were found, such as L54F (40 occurrences), R78M (15 occurrences), V367F (5 occurrences), A829T (10 occurrences), H1083Q (4 occurrences), T791I (12 occurrences), Q677H (4 occurrences), E583D (15 occurrences), T572I (10 occurrences), and L8V (4 occurrences). Moreover, some other mutations occurred only 1 or 2 times. QLA46612, isolated from South Korea, has 3 different mutations (L54F, F86S, and T95I), whereas QKY60177, isolated from India, has 4 mutations (Q506H, P507S, Y508N, and K786N) (Table 5). None of these mutations was predicted to damage the structure according to the MISSENSE3D online tool.
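Isolates that carry several mutations, such as QLA46612 and QKY60177, can be picked out by grouping the table rows by accession number. The following Python sketch is a hypothetical illustration rather than the method used here; the function name and the `minimum` parameter are assumptions, and the records are a small subset copied from Table 5.

```python
from collections import defaultdict


def isolates_with_multiple_mutations(records, minimum=2):
    """Group (accession, mutation) pairs and keep isolates with at least `minimum` mutations."""
    by_accession = defaultdict(list)
    for accession, mutation in records:
        by_accession[accession].append(mutation)
    return {
        accession: mutations
        for accession, mutations in by_accession.items()
        if len(mutations) >= minimum
    }


# Subset of Table 5 (Asia): accession number and mutation per row.
asia_records = [
    ("QLA46612", "L54F"), ("QLA46612", "F86S"), ("QLA46612", "T95I"),
    ("QKY60177", "Q506H"), ("QKY60177", "P507S"),
    ("QKY60177", "Y508N"), ("QKY60177", "K786N"),
    ("QJD23249", "L293M"), ("QJD23249", "D294I"),
    ("QJD23249", "P295H"), ("QJD23249", "H519Q"),
    ("QIT06999", "D614G"),
]

for accession, mutations in isolates_with_multiple_mutations(asia_records).items():
    print(accession, len(mutations), ", ".join(mutations))
```

Raising `minimum` to 3 would restrict the output to isolates with three or more mutations.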
Also, some substitution types were found at more than one position. As shown in Table 5, threonine (T) changes to isoleucine (I) at different positions, such as T22I, T76I, T95I, T572I, T791I, and T827I, whereas glutamine (Q) changes to histidine (H) in QLA10116 and QKW92184. Other mutations found in this region include C1243F, Q1201K, K1191N, D1153Y, and P507S, among others. Another example in which 3 mutations occurred in the same isolated sequence is QKY60177, which has the Q506H, Y508N, and P507S mutations. As with QLA46612 isolated from South Korea, QJD23249, isolated from Wilayah Persekutuan, Malaysia, carries multiple mutations (4 in total): L293M, D294I, P295H, and H519Q. Interestingly, no structure damage was predicted for the mutations of the QJD23249 sample (Table 5). Some situations are anticipated to increase the frequency of encounters between SARS-CoV-2 and antibodies, which could affect the emergence of antibody resistance. Millions of individuals have already been infected with SARS-CoV-2, and among them, neutralizing antibody titers are highly variable.[23,24] In addition, mutations may be expected to worsen this situation. It will be important to identify mutations and monitor their prevalence in a way that is analogous to antiviral and antibiotic resistance monitoring.
It has been stated that environmental factors also affect the spread of SARS-CoV. In the same study, it was determined that both the temperature and the environment affected the spread of the virus.
Viral factors might contribute to transmissibility too. For example, a distinct rise in the prevalence of SARS-CoV-2 bearing a D614G mutation has been noted over time.
Although whether this mutation provides a selective advantage to the virus has been debated, it is now known that this variant infects human ACE2 cell lines more efficiently than wild-type virus, that progeny virus has increased expression of S protein, and that the S protein has a higher rate of binding to ACE2.[27,28] As in these studies, the mutations obtained here may affect the interaction of the spike protein with ACE2 and may affect the transmission rate of the virus under certain environmental conditions. In a study of household transmission in China, opening windows to allow better air movement led to lower secondary household transmission.
Poor ventilation has been implicated in numerous transmission clusters, including those in bars, churches, and other locations.[30,31] Just as such specific settings can affect the distribution of the virus, special climates and conditions may affect its distribution geographically.
Predicted reasons for structure damages
All missense mutations were used to predict structure damage, and the results are shown in Figure 1. The structure damage predicted for the D614G mutation (found in all regions) is due to the substitution, which replaces glycine originally located in a bend curvature in this area (Figure 1A). T393P, isolated from Europe, introduces a buried proline and triggers a disallowed phi/psi alert: the phi/psi angles are in the favored region for the wild-type residue but in the outlier region for the mutant residue (Figure 1B). For M177I, isolated from Asia, the substitution results in a change between the buried and exposed state of the target variant residue. Methionine is buried (relative solvent accessibility (RSA) = 1.0%) and isoleucine is exposed (RSA = 16.9%). A residue counts as buried when its RSA is <9%, and the difference in RSA has to be ⩾5% (Figure 1C). The substitution in the P507S mutant sequence isolated from Asia replaces a buried uncharged residue (Proline, RSA 0.0%) with a charged residue Histidine (Figure 1D). The substitution in the P295H mutant sequence isolated from Asia replaces a buried uncharged residue (Proline, RSA 0.7%) with a charged residue Histidine and leads to the expansion of cavity volume by 142.128 Å^3 (Figure 1E). The substitution in the L293M mutant sequence resulted in a change between the buried and exposed state of the target variant residue. Leucine was buried (RSA 2.4%) and methionine was exposed (RSA 13.2%) (Figure 1F). The substitution in the G75V mutant sequence isolated from North America replaces a buried glycine residue (RSA 3.5%) with a buried valine residue (RSA 0.0%) (Figure 1G). The G143V substitution triggers a disallowed phi/psi alert. The phi/psi angles are in the allowed region for the wild-type residue but in the outlier region for the mutant residue, and the substitution replaces a glycine originally located in a bend curvature (Figure 1H).
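The buried/exposed criterion quoted above (buried means RSA <9%, and the RSA difference has to be ⩾5%) can be written as a small check. This is a sketch of the rule as stated in the text, not the MISSENSE3D implementation; the function and constant names are hypothetical.

```python
BURIED_RSA_MAX = 9.0      # a residue is treated as buried when RSA < 9%
MIN_RSA_DIFFERENCE = 5.0  # required RSA difference between wild type and mutant (%)


def buried_exposed_switch(wild_type_rsa: float, mutant_rsa: float) -> bool:
    """Return True when the residue switches between buried and exposed states."""
    wild_type_buried = wild_type_rsa < BURIED_RSA_MAX
    mutant_buried = mutant_rsa < BURIED_RSA_MAX
    large_enough = abs(wild_type_rsa - mutant_rsa) >= MIN_RSA_DIFFERENCE
    return wild_type_buried != mutant_buried and large_enough


# RSA values (%) taken from the text for three of the reported mutations.
for name, wt_rsa, mut_rsa in [("M177I", 1.0, 16.9), ("L293M", 2.4, 13.2), ("G75V", 3.5, 0.0)]:
    print(name, buried_exposed_switch(wt_rsa, mut_rsa))
```

With these values, M177I and L293M meet the switch criterion, whereas G75V does not, because both the wild-type and the mutant residue remain buried.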
The substitution in the T95I mutant sequence, isolated from Asia and North America, disrupts all side-chain/side-chain H-bond(s) and/or side-chain/main-chain H-bond(s) formed by a buried threonine residue (RSA 0.0%) (Figure 1I). The phylogenetic tree of the mutations according to bioinformatics tools is shown in Figure 2. The phi (φ) and psi (ψ) values of amino acid residues and the H-bond(s) are important for creating homology models and 3D structures of the envelope protein.
Therefore, these results may be used by bioinformaticians in possible vaccine studies and in obtaining predicted protein structures. According to the Genome Detective Coronavirus Typing Tool, which takes assembled genomes in FASTA format, the sequences tend to be closely related to both bat SARS-CoV and the outgroup.
Amino acid sequences of the spike protein were used to obtain the phylogenetic placement of some samples. This allows proper identification of other coronavirus types and the tracking of new viral mutations as the outbreak expands globally.
Interestingly, mutations that damage the structure did not affect the ligand-binding sites (Figure 3); however, the ligand-binding sites were affected in sequences with multiple mutations (Figure 4). The results for all mutations predicted to affect the structure were the same and are shown in Figure 3. For example, the same source structures (PDB 2dd8_S,2ajf_E) were used for the structure predicted for all ligand-binding sites. Moreover, all amino acids were the same (Figure 3). QJX45344, isolated from Africa, has 2 mutations in the same sequence, and the first panel of Figure 4 shows the result for this sequence. As in Figure 3, the source used to predict the structure was 2dd8_S,2ajf_E; however, the predicted binding sites were different from those in Figure 3. These binding sites include 338 Phenylalanine (contact: 1, Av distance: 0.00), 339 Glycine (contact: 1, Av distance: 0.00), 342 Phenylalanine (contact: 2, Av distance: 0.18), and 343 Asparagine (contact: 2, Av distance: 0.00). The second panel of Figure 4 is QLA46612 (which has 3 mutations), isolated from South Korea. The source used to predict the structure was 1ww6_A,1ulf_A,1ulc_B, and the predicted binding-site residues were 118 Leucine (contact: 3, Av distance: 0.27), 120 Valine (contact: 3, Av distance: 0.16), 127 Valine (contact: 3, Av distance: 0.05), 129 Lysine (contact: 3, Av distance: 0.00), 157 Phenylalanine (contact: 3, Av distance: 0.169), 159 Valine (contact: 3, Av distance: 0.00), 160 Tyrosine (contact: 2, Av distance: 0.00), and 169 Glutamic Acid (contact: 2, Av distance: 0.54). These predicted binding sites are also different from those in Figure 3. It can be said that more than one mutation affects the ligand-binding site based on 3DLigandSite analysis. The phi (φ) and psi (ψ) values of amino acid residues and the H-bond(s) are important for creating homology models and the 3D structure of the envelope protein.
Therefore, these results may be used by bioinformaticians in possible vaccine studies and in obtaining predicted protein structures.
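One simple way to quantify how different two predicted binding sites are is to compare their residue numbers as sets. The sketch below is illustrative only and assumes that the residue numbers reported for the two predictions refer to the same numbering scheme, which may not hold because different template structures were used; the function name is hypothetical.

```python
def jaccard_similarity(site_a: set, site_b: set) -> float:
    """Jaccard index of two residue-number sets (1.0 = identical, 0.0 = disjoint)."""
    union = site_a | site_b
    if not union:
        return 1.0  # two empty sites are treated as identical
    return len(site_a & site_b) / len(union)


# Predicted binding-site residue numbers reported in the text.
qjx45344_site = {338, 339, 342, 343}
qla46612_site = {118, 120, 127, 129, 157, 159, 160, 169}

shared = sorted(qjx45344_site & qla46612_site)
print("shared residues:", shared)
print("Jaccard similarity:", jaccard_similarity(qjx45344_site, qla46612_site))
```

For the two residue lists given above there is no overlap, so the similarity is 0.0 under the stated numbering assumption.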
Figure 1.
All mutations found in Tables 1-5 were analyzed one by one, by region, using bioinformatics tools, and the mutations predicted to affect the structure of the spike protein are shown. These mutations are D614G (A), T393P (B), M177I (C), P507S (D), P295H (E), L293M (F), G75V (G), G143V (H), and T95I (I), respectively. According to bioinformatics analysis, these mutations might affect the structure of the spike protein.
Both structures are shown and distinguished by color. Yellow shows wild-type chains, and dark green shows mutant chains. Light green shows the wild-type residue, and red shows the mutant residue. The light green color is not visible because the residue lies inside the structure.
Figure 2.
The phylogenetic tree of one mutation (T393P, in blue) predicted to play a role in structure damage, according to the Genome Detective Coronavirus Typing Tool.
All mutations have the same location.
Figure 3.
Ligand-binding site results obtained with 3DLigandSite [17] for the mutations predicted to possibly affect the ligand-binding site. They are QJX45344 (A, B), QKY60177 (C, D), QLA46612 (E), QJD23249 (F), and QKV37632 (G). In addition, the mutations shown in Figure 1 were predicted to damage the structure but did not affect the ligand-binding sites.
All results (predicted ligand-binding sites) were the same, but the structures are different. Blue represents predicted residues, and cyan represents heterogens based on 3DLigandSite analysis.
Figure 4.
The ligand-binding site and some varying features when 2 or more mutations occur. In these two samples, it was determined that when two mutations occur at the same time, both the structure of the spike surface protein and the ligand-binding sites may be affected (QLA46612 (a), QJX45344 (b)). Blue represents predicted residues, and cyan represents heterogens based on 3DLigandSite analysis.
Conclusion
In this study, it was determined that some of the mutations obtained affect both the structure of the spike protein and the ligand-binding sites, and some did not. In addition, some of these mutations were found in all regions, while others were found only in a certain region. According to this result, mutations may be region specific and may be affected by environmental factors belonging to that region. Another important result of the study is that more than one mutation was seen in a single sample. It can be concluded that more than one mutant may be found in an individual at the same time. Therefore, attention should be paid to travel, as there may be a risk of different mutations depending on the region. It should also be noted that one mutation may have different forms in a single person, and this possibility should be considered in possible vaccine studies.