Abstract
Human spike protein sequences from Asia, Africa, Europe, North America, South America, and Oceania were compared with the reference severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) protein sequence from Wuhan-Hu-1, China. Of the 10333 spike protein sequences analyzed, 8155 contained one or more mutations. In total, 9654 mutations were observed at 400 distinct mutation sites. The receptor binding domain (RBD), which interacts with the human angiotensin-converting enzyme-2 (ACE-2) receptor to initiate the infection that leads to COVID-19, contained 44 mutations, including residues within the 3.2 Å interacting distance of the ACE-2 receptor. The mutations observed in the spike proteins are discussed in terms of their geographical distribution, mutation sites, mutation types, the number of mutations at each mutation site, and mutations at the glycosylation sites. The density of mutations in different regions of the spike protein sequence and the location of the RBD mutations in the three-dimensional protein structure are also discussed. The mutations identified in the present work are important considerations for antibody, vaccine, and drug development.
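The comparison against the reference sequence can be illustrated with a minimal sketch. It assumes each spike sequence has already been pairwise-aligned to the reference so that both strings have equal length, and that a deleted residue is written as an ASCII `-`. The function name `call_mutations` and the 10-residue toy reference are illustrative assumptions, not the pipeline used in the study.

```python
def call_mutations(reference: str, aligned: str) -> list[str]:
    """Label every position where an aligned sequence differs from the reference.

    Labels follow the pattern <reference residue><position><observed residue>,
    for example "D614G"; positions are 1-based. A "-" in `aligned` marks a
    deletion. Insertions relative to the reference are not handled here.
    """
    if len(reference) != len(aligned):
        raise ValueError("sequences must be aligned to the same length")
    mutations = []
    for position, (ref, obs) in enumerate(zip(reference, aligned), start=1):
        if ref != obs:
            mutations.append(f"{ref}{position}{obs}")
    return mutations


# Toy 10-residue reference (hypothetical, for illustration only).
toy_reference = "MFVFLVLLPL"
print(call_mutations(toy_reference, "MLVFFVLLPL"))  # substitutions at positions 2 and 5
print(call_mutations(toy_reference, "MFVF-VLLPL"))  # deletion at position 5
```

Counting the sequences for which this list is non-empty would give the number of proteins with one or more mutations, and pooling the labels would give the total number of mutations and the distinct mutation sites.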
Keywords: SARS-CoV-2; mutations; receptor binding domain; sequence and structural mapping; spike proteins
Year: 2021 PMID: 33423311 PMCID: PMC8014176 DOI: 10.1002/prot.26042
Source DB: PubMed Journal: Proteins ISSN: 0887-3585
Geographical distribution of human SARS‐CoV‐2 spike proteins and their associated number of mutations
| Continent | Number of spike proteins | Number of mutations |
|---|---|---|
| Africa | 103 | 121 |
| Asia | 996 | 1169 |
| Europe | 370 | 360 |
| North America | 8268 | 7453 |
| South America | 29 | 26 |
| Oceania | 567 | 525 |
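As a quick consistency check on the table above, the per-continent counts can be summed and turned into an average number of mutations per analyzed sequence. This is a sketch only; the ratio is not a statistic reported in the source, and the names `CONTINENT_COUNTS` and `mutations_per_sequence` are illustrative.

```python
# (number of spike proteins, number of mutations) per continent, copied from the table.
CONTINENT_COUNTS = {
    "Africa": (103, 121),
    "Asia": (996, 1169),
    "Europe": (370, 360),
    "North America": (8268, 7453),
    "South America": (29, 26),
    "Oceania": (567, 525),
}


def mutations_per_sequence(counts):
    """Average mutations per analyzed sequence (sequences without mutations included)."""
    return {continent: muts / seqs for continent, (seqs, muts) in counts.items()}


total_sequences = sum(seqs for seqs, _ in CONTINENT_COUNTS.values())
total_mutations = sum(muts for _, muts in CONTINENT_COUNTS.values())
ratios = mutations_per_sequence(CONTINENT_COUNTS)

print(total_sequences, total_mutations)
for continent, ratio in ratios.items():
    print(f"{continent}: {ratio:.2f} mutations per sequence")
```

The two totals match the 10333 sequences and 9654 mutations stated in the abstract.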
Distribution of mutations in the different regions of human SARS‐CoV‐2 spike proteins
| Regions | Total number of mutations | Number of distinct mutation types |
|---|---|---|
| S1A domain (1‐302) | 759 | 196 |
| S1A‐S1B linker (303‐332) | 31 | 11 |
| S1B domain (333‐527) | 204 | 52 |
| S1B‐S1C linker (528‐533) | 1 | 1 |
| S1C domain (534‐589) | 75 | 15 |
| S1C‐S1D linker (590‐593) | 0 | 0 |
| S1D domain (594‐674) | 7915 | 34 |
| Protease cleavage site (675‐692) | 126 | 35 |
| S1‐S2 subunits linker (693‐710) | 13 | 7 |
| Central β‐strand (711‐737) | 13 | 7 |
| Downward helix (738‐782) | 24 | 18 |
| S2’ cleavage site (783‐815) | 27 | 13 |
| Fusion peptide (816‐828) | 6 | 3 |
| Connecting region (829‐911) | 111 | 26 |
| Heptad repeat region (912‐983) | 73 | 18 |
| Central helix (984‐1034) | 8 | 6 |
| β‐hairpin (1035‐1068) | 7 | 4 |
| β‐sheet domain (1069‐1133) | 79 | 27 |
| Heptad repeat region (1134‐1213) | 65 | 22 |
| Transmembrane region (1214‐1236) | 28 | 7 |
| Cytoplasmic region (1237‐1273) | 89 | 15 |
FIGURE 1. Mutation density in human SARS‐CoV‐2 spike protein regions [Color figure can be viewed at wileyonlinelibrary.com]
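One way to compute a per-region mutation density from the region table is sketched below. The source does not define density, so this assumes it means total mutations divided by the number of residues in the region. The region boundaries and totals are copied from the table; the names `REGIONS` and `mutation_density` are illustrative.

```python
# (region name, first residue, last residue, total number of mutations), from the table.
REGIONS = [
    ("S1A domain", 1, 302, 759),
    ("S1A-S1B linker", 303, 332, 31),
    ("S1B domain", 333, 527, 204),
    ("S1B-S1C linker", 528, 533, 1),
    ("S1C domain", 534, 589, 75),
    ("S1C-S1D linker", 590, 593, 0),
    ("S1D domain", 594, 674, 7915),
    ("Protease cleavage site", 675, 692, 126),
    ("S1-S2 subunits linker", 693, 710, 13),
    ("Central beta-strand", 711, 737, 13),
    ("Downward helix", 738, 782, 24),
    ("S2' cleavage site", 783, 815, 27),
    ("Fusion peptide", 816, 828, 6),
    ("Connecting region", 829, 911, 111),
    ("Heptad repeat region", 912, 983, 73),
    ("Central helix", 984, 1034, 8),
    ("Beta-hairpin", 1035, 1068, 7),
    ("Beta-sheet domain", 1069, 1133, 79),
    ("Heptad repeat region", 1134, 1213, 65),
    ("Transmembrane region", 1214, 1236, 28),
    ("Cytoplasmic region", 1237, 1273, 89),
]


def mutation_density(start: int, end: int, total_mutations: int) -> float:
    """Mutations per residue (assumed definition): total / region length."""
    return total_mutations / (end - start + 1)


densities = [
    (name, start, end, mutation_density(start, end, total))
    for name, start, end, total in REGIONS
]
densest = max(densities, key=lambda row: row[3])
print(f"{densest[0]} ({densest[1]}-{densest[2]}): {densest[3]:.1f} mutations per residue")
```

Under this definition the S1D domain (594‐674) dominates, which is consistent with its 7915 mutations arising from only 34 distinct mutation types.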
Mutation sites and mutation types observed in human SARS‐CoV‐2 spike proteins according to geographical locations
| Continent | Mutation types |
|---|---|
| North America | F2L, L5F, L5I, V6F, L7V, P9L, S12C, Q14H, C15F, N17K, L18F, T20I, T22N, T22A, T22I, Q23K, P25S, A27S, A27V, T29I, F32L, R34C, H49Y, S50L, T51I, Q52L, Q52H, L54F, L54W, F55I, P57L, H69Y, S71F, G72V, T73I, G75V, T76I, F79L, D80N, D80Y, N87Y, D88N, D88E, D88Y, D88A, V90F, T95A, T95I, E96D, K97T, S98F, R102I, I105L, D111N, K113R, L118F, V130A, E132D, C136R, D138H, L141‐, L141F, G142V, G142‐, V143F, V143‐, Y144‐, Y144V, Y145H, H146Y, K147E, N148S, S151I, M153T, M153V, M153I, E154V, F157L, R158S, L176F, M177I, D178N, G181V, L189F, R190K, I203M, I210‐, R214L, D215Y, D215G, L216F, Q218L, F220L, S221L, A222V, A222P, D228H, L229F, Q239R, T240I, L242F, A243S, A243V, H245R, H245Y, R246K, D253Y, D253G, S254F, S256L, W258L, G261V, G261R, A262S, Y265C, V267L, R273S, E281Q, A288S, L293V, D294E, P295S, E298G, T307I, V308L, E309Q, Q314K, Q314L, Q314H, T315I, Q321L, T323I, P330S, A344S, T345S, A348T, A348S, N354K, R357K, V367F, V382L, P384L, V395I, R403K, V407I, A411S, G413R, L441I, R457K, K458Q, G476S, S477N, P479L, V483A, E484Q, Q493L, S494P, Y508H, H519Q, A520S, A522V, K529E, G545S, T547I, L552F, T553I, E554D, K558N, A570V, T572I, D574Y, E583D, I584V, S596I, I598V, N603H, Q613H, D614G, V615F, T618A, P621S, V622F, V622I, V622A, A623S, H625Y, A626V, P631S, W633R, G639V, S640F, A647S, A647V, E654Z, E654K, H655Y, N658Y, A668S, A672V, Q675R, Q675H, T676S, T676I, Q677H, Q677R, T678I, P681L, P681H, R682W, A684S, A684T, V687L, A688V, A688S, S689I, S691F, S698L, N703D, S704L, V705F, A706V, I714M, T716I, I720V, T724A, M731I, T732A, T732I, G744V, D745G, N751D, L754F, R765S, R765H, T768I, G769A, A771S, T778I, Q779H, E780Q, A783S, D808V, D808G, P809S, I818V, L822F, D830H, D830Y, Q836P, Q836L, G838D, A845S, A845D, A845V, A846V, R847I, K854R, N856S, T859I, D867N, A879V, A879S, A893E, A893V, E918V, L922F, A924S, A924V, S929I, D936H, D936Y, L938F, S939F, T941I, G946V, A958S, N969S, L981F, T1006I, V1008T, T1009I, A1016S, A1020V, F1052L, P1053T, L1063F, V1065L, A1070V, Q1071H, E1072V, K1073N, A1078V, A1078S, G1085R, K1086N, R1091L, H1101Y, V1104L, P1112L, D1118Y, T1120I, V1122L, S1123P, G1124V, G1124C, V1129A, I1130M, T1136I, D1139H, L1141F, D1146H, S1147L, D1153Y, P1162Q, P1162S, D1163Y, G1167V, D1168H, V1176F, N1187Y, K1191N, N1192T, E1195Q, L1203F, K1205N, E1207A, G1219V, G1219S, I1221T, V1228L, M1229I, V1230L, T1231I, T1231A, C1235F, M1237I, M1237V, M1237T, T1238I, K1245N, C1247F, G1251R, D1259H, S1261F, P1263L, V1264L |
| South America | N74K, I197V, D614G, V1176F |
| Europe | L5F, T22I, H49Y, Q115R, M153I, L176I, L176F, F186S, N188D, I197V, V213L, T240I, S254F, G261D, V367F, C379F, V382E, T393P, Y453F, F486L, N501T, T553N, K558R, T572I, L611F, D614G, T676I, S686G, M740I, G769V, Y789D, F797C, D839Y, A845S, A1020V, H1101Y, V1122L, P1162L, K1191N, M1229I, D1260N, D1260H, P1263L |
| Africa | L5F, S12F, T29I, H49Y, V70F, Y144‐, L242F, A288T, Q314R, R408I, A570S, D614G, S640A, A653V, Q677H, P812L |
| Asia | F2L, L5F, L8V, S12F, S13I, Q14H, T22I, P25L, Y28H, Y28N, T29A, G35V, Y38C, H49Y, S50L, L54F, A67V, A67S, I68‐, I68R, H69‐, V70‐, S71‐, G72‐, T73‐, N74‐, N74K, G75V, G75‐, T76I, T76‐, R78M, F86S, T95I, E96G, K97Q, S98F, V127F, D138Y, D138H, F140L, L141‐, G142‐, V143‐, Y144‐, H146Y, H146R, N148Y, S151G, W152L, M153I, S155I, E156D, S162I, Q173H, M177I, G181A, N185K, R190S, N211Y, V213L, S221W, Y248H, S255F, W258L, G261R, G261S, A262S, V267L, G268D, Q271R, A292V, L293M, D294I, P295H, L296F, S297W, C301F, P337R, V367F, L368P, V382L, R408I, A411D, E471Q, S477N, E484Q, P491L, Q506H, P507H, P507S, Y508N, L518I, H519Q, A520S, A570V, T572I, D574Y, E583D, G594S, Q607L, Q613H, D614G, V622F, A653V, E654Q, H655Y, Y660F, A672D, Q675‐, Q675H, Q675R, T676‐, Q677H, Q677‐, T678‐, N679‐, S680‐, P681‐, R682Q, R682W, R682‐, A684V, A684‐, R685‐, S686‐, V687‐, A688V, A688‐, Q690H, A701V, A706S, M731I, L754F, R765L, A771S, V772I, Q774R, E780D, A783S, K786N, T791I, K795Q, G798A, P809S, T827I, A829T, I834V, A879S, S884F, A892V, M900I, A930V, D936Y, L938F, S939F, S939Y, S974P, Q1002E, L1063F, T1077I, H1083Q, D1084Y, H1088R, F1089V, V1104L, F1109L, D1139Y, S1147L, D1153Y, G1167S, K1181R, R1185H, N1187K, K1191N, E1195Q, Q1201K, V1230E, C1243F, G1246A, D1259Y |
| Oceania | L5F, T22I, T29I, H49Y, S50L, T76I, S98F, I128F, D138H, M153I, L176F, D178N, E180K, I210‐, S221L, S247R, D253G, W258L, A262T, G283V, I468T, E471Z, S477N, V483F, G485R, Q498Z, T500I, N501Y, H519Q, P561L, E583D, D614G, P621S, A626V, Q675K, Q675H, S704L, M731I, T791I, D808B, P812S, D839N, A846V, I931V, D936Y, K1073N, 1079S, G1124V, D1163G, C1254F, D1260N |
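The mutation labels in this table follow the pattern reference residue, position, observed residue, with a trailing hyphen marking a deletion. A minimal sketch of parsing such labels and finding mutation types shared between continents is given below. The helper names are illustrative; the two rows are copied from the table, and the parser accepts both the ASCII hyphen and the Unicode hyphen (U+2010) that the table uses.

```python
import re

# One-letter residue, 1-based position, then a residue letter or a deletion mark.
_LABEL = re.compile(r"([A-Z])(\d+)([A-Z\-\u2010])")


def parse_mutation(label: str):
    """Return (reference residue, position, observed residue), or None if malformed.

    Deletions are normalized to "-" whichever hyphen character was used.
    """
    match = _LABEL.fullmatch(label.strip())
    if match is None:
        return None
    ref, position, obs = match.groups()
    if obs in "-\u2010":
        obs = "-"
    return ref, int(position), obs


def split_row(cell: str) -> list[str]:
    """Split a comma-separated table cell into individual labels."""
    return [item.strip() for item in cell.split(",") if item.strip()]


south_america = split_row("N74K, I197V, D614G, V1176F")
africa = split_row(
    "L5F, S12F, T29I, H49Y, V70F, Y144\u2010, L242F, A288T, Q314R, "
    "R408I, A570S, D614G, S640A, A653V, Q677H, P812L"
)

shared = set(south_america) & set(africa)
print(shared)
print(parse_mutation("D614G"))
print(parse_mutation("1079S"))  # label without a reference residue is reported as None
```

Returning `None` for labels that do not match the pattern is a deliberate choice here: it flags entries such as `1079S` for manual review instead of guessing the missing residue.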
FIGURE 2. The 44 mutations (red spheres) mapped onto the crystal structure of the spike protein RBD (cyan) complexed with the ACE‐2 receptor (green) (PDB code: 6LZG). PDB, Protein Data Bank; RBD, receptor binding domain
FIGURE 3. Mutations in human SARS‐CoV‐2 spike protein RBD. RBD, receptor binding domain
FIGURE 4. Interactions of spike protein residues (cyan) with ACE‐2 (green) side‐chain residues (yellow) that are within 3.2 Å in the crystal structure of the human SARS‐CoV‐2 spike protein RBD complexed with the ACE‐2 receptor (PDB code: 6LZG). The mutated spike protein residues are shown in red. PDB, Protein Data Bank; RBD, receptor binding domain
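The 3.2 Å criterion can be sketched as a simple distance filter over atomic coordinates. The example below is an assumption-laden illustration: it treats "within 3.2 Å" as any atom pair at a distance of at most 3.2 Å, it does not read the 6LZG structure file, and the residue names and coordinates are made-up placeholders, not real contacts.

```python
import math


def residues_within_cutoff(spike_atoms, ace2_atoms, cutoff=3.2):
    """Return spike residues with at least one atom within `cutoff` Å of any ACE-2 atom.

    Each atom is a (residue label, (x, y, z)) pair with coordinates in Å.
    Residues are returned once, in order of first appearance.
    """
    close = []
    for residue, coord in spike_atoms:
        if residue in close:
            continue
        if any(math.dist(coord, other) <= cutoff for _, other in ace2_atoms):
            close.append(residue)
    return close


# Hypothetical coordinates, for illustration only.
toy_ace2 = [("ACE2_RES_1", (3.0, 0.0, 0.0))]
toy_spike = [
    ("SPIKE_RES_1", (0.0, 0.0, 0.0)),   # 3.0 Å from the ACE-2 atom
    ("SPIKE_RES_2", (10.0, 0.0, 0.0)),  # 7.0 Å away
    ("SPIKE_RES_3", (3.0, 4.0, 0.0)),   # 4.0 Å away
]
print(residues_within_cutoff(toy_spike, toy_ace2))
```

Intersecting the residues returned by such a filter with the observed mutation sites would give the mutated interface residues highlighted in red in the figure.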