Nikita Zrelovs1, Monta Ustinova1, Ivars Silamikelis1, Liga Birzniece1, Kaspars Megnis1, Vita Rovite1, Lauma Freimane1, Laila Silamikele1, Laura Ansone1, Janis Pjalkovskis1, Davids Fridmanis1, Baiba Vilne2, Marta Priedite3, Anastasija Caica3, Mikus Gavars4, Dmitry Perminov4,5, Jelena Storozenko2,6, Oksana Savicka2,6, Elina Dimina7, Uga Dumpis8,9, Janis Klovins1.
Abstract
Remaining a major healthcare concern, with nearly 29 million confirmed cases worldwide at the time of writing, the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused more than 920 thousand deaths since its outbreak in China in December 2019. The first case of a person testing positive for SARS-CoV-2 infection within the territory of the Republic of Latvia was registered on 2nd of March 2020, 9 days before the WHO declared a pandemic. Since then, more than 277,000 tests have been carried out, confirming a total of 1,464 cases of coronavirus disease 2019 (COVID-19) in the country as of 12th of September 2020. In rapid response to the spread of the infection, an ongoing sequencing campaign was started in mid-March in collaboration with the local testing laboratories, with the ultimate goal of sequencing as many local viral isolates as possible. As a result, the first full-length SARS-CoV-2 isolate genome sequences from the Baltic region were made publicly available in early April. With 133 viral isolates, representing ~9.1% of the total COVID-19 cases during the "first coronavirus wave" in the country (early March 2020 to mid-September 2020), completely sequenced to date, we provide here a first report on the genetic diversity of Latvian SARS-CoV-2 isolates.
Keywords: 2019-nCoV; COVID-19; HCoV-19; Latvia; SARS-CoV-2; genetic diversity; next-generation sequencing
Year: 2021 PMID: 33889583 PMCID: PMC8055824 DOI: 10.3389/fmed.2021.626000
Source DB: PubMed Journal: Front Med (Lausanne) ISSN: 2296-858X
Major SARS-CoV-2 clade-defining genetic markers and their occurrence in Latvia, Europe, and worldwide (as of 14 September 2020).
| Clade | 241 | 3037 | 23403 | 8782 | 11083 | 25563 | 26144 | 28144 | 28882 | Worldwide (%) | Europe (%) | Latvia (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| S | C | C | A | T | G | G | G | C | G | 5.79 | 2.82 | 0 |
| L | C | C | A | C | G | G | G | T | G | 4.31 | 5.78 | 3.01 |
| V | C | C | A | C | T | G | T | T | G | 5.16 | 8.46 | 0 |
| G | T | T | G | C | G | G | G | T | G | 22.59 | 28.54 | 16.54 |
| GH | T | T | G | C | G | T | G | T | G | 22.14 | 10.56 | 30.83 |
| GR | T | T | G | C | G | G | G | T | A | 34.92 | 41.79 | 48.12 |
| Other | X | X | X | X | X | X | X | X | X | 5.09 | 2.05 | 1.5 |
X denotes any nucleotide.
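The marker table can be read as a simple lookup: the nucleotides observed at nine genome positions are compared against each clade's pattern, and anything that matches no pattern falls into "Other". The following is a minimal sketch of that reading, not the authors' pipeline. The function names are hypothetical, and the genome positions are an assumption based on the GISAID clade nomenclature (1-based Wuhan-Hu-1 coordinates); the patterns themselves are copied from the table rows.

```python
# Hypothetical helpers for illustration; not the authors' pipeline.
# ASSUMPTION: the nine marker columns correspond, in order, to these
# 1-based Wuhan-Hu-1 genome positions (GISAID clade nomenclature).
MARKER_POSITIONS = (241, 3037, 23403, 8782, 11083, 25563, 26144, 28144, 28882)

# Marker nucleotides per clade, copied from the table rows.
CLADE_PATTERNS = {
    "S": "CCATGGGCG",
    "L": "CCACGGGTG",
    "V": "CCACTGTTG",
    "G": "TTGCGGGTG",
    "GH": "TTGCGTGTG",
    "GR": "TTGCGGGTA",
}


def marker_string(genome: str) -> str:
    """Collect the nucleotides at the marker positions.

    ASSUMPTION: the genome is already in reference coordinates
    (no insertions or deletions shifting the positions).
    """
    return "".join(genome[pos - 1].upper() for pos in MARKER_POSITIONS)


def assign_clade(genome: str) -> str:
    """Return the clade whose pattern matches exactly, else 'Other'."""
    observed = marker_string(genome)
    for clade, pattern in CLADE_PATTERNS.items():
        if observed == pattern:
            return clade
    return "Other"


def toy_genome(pattern: str, length: int = 29903) -> str:
    """Build a placeholder genome of N's carrying only the marker bases."""
    bases = ["N"] * length
    for pos, base in zip(MARKER_POSITIONS, pattern):
        bases[pos - 1] = base
    return "".join(bases)


print(assign_clade(toy_genome("TTGCGGGTA")))  # GR
```

Exact matching is a deliberate simplification: an isolate with an ambiguous base at any marker position would be reported as "Other" here, which may differ from how a real clade-assignment tool treats ambiguity.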
Figure 1. Daily numbers of positive COVID-19 cases (A) and tests performed (B) in Latvia. The x-axis is the same for both panels and represents a daily time series from 28th of February, 2020 to 11th of September, 2020. The red vertical line indicates the date of the first COVID-19 case registered in Latvia. (A) The y value represents the total number of positive cases registered on a given day. The blue area shows the number of successfully sequenced isolates only, while the red area represents the positive cases not sequenced during this study. (B) The y value represents the number of tests carried out on a given date in Latvia.
Figure 2. Distribution of sequenced SARS-CoV-2 isolates by clade in major regions of the world, worldwide, and in Latvia. The y-axis depicts the cumulative count of complete SARS-CoV-2 genomes (with an unambiguous collection date) from a particular region and has a different scale in each subplot. The x-axis is the same for all subplots and depicts the sampling time series from 24th of December, 2019 to 12th of September, 2020.
Ten most frequently mutated genome positions among Latvian SARS-CoV-2 isolates (n = 133).
| Genome position | Reference | Variant name | Alternate | Variant class | Region | Change | Annotation | Isolates (n) | Frequency (%) | Isolates at position, all variants (n) | Frequency at position, all variants (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 241 | C | 5′UTR:241 | T | Extragenic | 5′UTR | 241 | N/A | 129 | 96.99% | | |
| 3,037 | C | NSP3:F106F | T | Silent | NSP3 | F106F | Predicted phosphoesterase, papain-like proteinase | 128 | 96.24% | | |
| 14,408 | C | NSP12b:P314L | T | SNP | NSP12b | P314L | RNA-dependent RNA polymerase, post-ribosomal frameshift | 128 | 96.24% | | |
| 23,403 | A | S:D614G | G | SNP | S | D614G | Spike | 128 | 96.24% | | |
| 28,881 | GGG | N:RG203KR | AAC | SNP* | N | RG203KR | Nucleocapsid protein | 59 | 44.36% | 96 | 72.18% |
| 28,881 | G | N:R203K | A | SNP* | N | R203K | Nucleocapsid protein | 32 | 24.06% | 96 | 72.18% |
| 28,881 | GGGG | N:RG203KL | AACT | SNP* | N | RG203KL | Nucleocapsid protein | 5 | 3.76% | 96 | 72.18% |
| 25,563 | G | ORF3a:Q57H | T | SNP | ORF3a | Q57H | ORF3a protein | 41 | 30.83% | | |
| 18,877 | C | NSP14:C279C | T | Silent | NSP14 | C279C | 3′-to-5′ exonuclease | 36 | 27.07% | | |
| 1,202 | A | NSP2:N133D | G | SNP | NSP2 | N133D | Non-Structural protein 2 | 34 | 25.56% | | |
| 12,513 | C | NSP8:T141M | T | SNP | NSP8 | T141M | Non-Structural Protein 8 | 34 | 25.56% | | |
| 25,710 | C | ORF3a:L106L | T | Silent | ORF3a | L106L | ORF3a protein | 33 | 24.81% | | |
Color coding is based on the variant class, as follows: red represents extragenic variants; green, silent variants; and blue, single nucleotide polymorphisms. The asterisk in the "Variant class" column indicates that multiple variants are present at the given genome position (28,881); some of them are polynucleotide variants spanning neighboring loci rather than SNPs.
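The percentages in this table follow from dividing each isolate count by the 133 sequenced Latvian isolates, and the three variants reported at position 28,881 add up to the combined figure for that position. A minimal sketch of that arithmetic, with a hypothetical helper name and the counts copied from the table:

```python
# Total number of completely sequenced Latvian isolates (from the table caption).
N_ISOLATES = 133


def frequency_percent(count: int, total: int = N_ISOLATES) -> float:
    """Share of isolates carrying a variant, in percent, rounded to 2 decimals."""
    return round(100 * count / total, 2)


# Isolate counts for the three variants reported at genome position 28,881.
variants_at_28881 = {"N:RG203KR": 59, "N:R203K": 32, "N:RG203KL": 5}

for name, count in variants_at_28881.items():
    print(name, count, frequency_percent(count))

# Combined count and frequency for the position as a whole.
combined = sum(variants_at_28881.values())
print("28,881 (all variants)", combined, frequency_percent(combined))
```

The same helper reproduces the other rows, for example 129 isolates with the 5′UTR:241 variant give 96.99%.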
Figure 3. Maximum clade credibility tree (mean node heights) estimated from the completely sequenced Latvian isolates (n = 133) and the Wuhan-Hu-1 isolate. Node labels are colored according to the GISAID major clade of the particular isolate, as follows: green, GR; yellow, GH; red, G; blue, L; purple, O (other); black, Wuhan-Hu-1 reference sequence. The tree is time scaled and the axis represents time in decimal year notation (1 month is ~0.08333 of a year and 1 day is ~0.00274 of a year). Nodes are colored according to their respective posterior probabilities on a gradient from blue (lowest value) to red (highest value). Dated node bars represent 95% highest posterior density intervals and are shown for the selected nodes.
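To read dates off a decimal-year axis, a calendar date can be converted into a year plus the elapsed fraction of that year. The sketch below is one common convention and an assumption here, since the caption does not state which conversion the tree-building software used; the function name is hypothetical.

```python
import calendar
from datetime import date


def decimal_year(d: date) -> float:
    """Convert a calendar date to decimal-year notation.

    ASSUMPTION: the fraction is (days elapsed since 1 January) divided by
    the number of days in that year; other tools may use a different rule.
    """
    days_in_year = 366 if calendar.isleap(d.year) else 365
    days_elapsed = d.timetuple().tm_yday - 1  # 1 January maps to .0
    return d.year + days_elapsed / days_in_year


# Approximate step sizes quoted in the caption.
print(round(1 / 12, 5))   # 0.08333 (one month)
print(round(1 / 365, 5))  # 0.00274 (one day)

# Date of the first registered case in Latvia.
print(round(decimal_year(date(2020, 3, 2)), 4))  # 2020.1667
```

Because 2020 is a leap year, one day is 1/366 of the year under this convention, slightly less than the ~0.00274 quoted for a 365-day year.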
Figure 4. Evolutionary relationships of the 133 sequenced Latvian isolates and the Wuhan-Hu-1 SARS-CoV-2 isolate. The evolutionary history was inferred using the maximum-likelihood method, allowing for polytomies. The tree is rooted at the Wuhan-Hu-1 reference sequence. The tree is drawn to scale; branch lengths correspond to nucleotide substitutions. The analysis involved 134 nucleotide sequences (133 Latvian SARS-CoV-2 isolates and the Wuhan-Hu-1 reference sequence). There were a total of 29,903 positions in the final dataset. Node labels are colored according to the GISAID major clade of the particular isolate, as follows: green, GR; yellow, GH; red, G; blue, L; purple, O (other); black, Wuhan-Hu-1 reference sequence.