Qingtian Guan1, Mukhtar Sadykov1, Sara Mfarrej1, Sharif Hala1, Raeece Naeem1, Raushan Nugmanova1, Awad Al-Omari2, Samer Salih3, Abbas Al Mutair3, Michael J Carr4, William W Hall5, Stefan T Arold6, Arnab Pain7.
Abstract
OBJECTIVE: The SARS-CoV-2 pathogen has established endemicity in humans. This necessitates the development of rapid genetic surveillance methodologies to serve as an adjunct to existing comprehensive, albeit slower, genome sequencing-driven approaches.
Keywords: SARS-CoV-2; barcoding; genetic surveillance; genome variation
Year: 2020 PMID: 32841689 PMCID: PMC7443060 DOI: 10.1016/j.ijid.2020.08.052
Source DB: PubMed Journal: Int J Infect Dis ISSN: 1201-9712 Impact factor: 3.623
Primer sets for targeted multiplex PCR
| Clade | Position in the genome | Primer sequence | Amplicon size (bp) | Melting temperature Tm (°C) |
|---|---|---|---|---|
| G614 | 241 | GF-1: 5’-TGTCGTTGACAGGACACGAG-3’ | 228 | 60.94 |
| G614 | 3037 | GF-2: 5’-ATGAGTTCGCCTGTGTTGTG-3’ | 392 | 58.77 |
| G614 | 14408 | GF-3: 5’-TGGGATCAGACATACCACCCA-3’ | 334 | 60.27 |
| G614 | 23403 | GF-4: 5’-CTGATGCTGTCCGTGATCCA-3’ | 302 | 59.82 |
| S84 | 8782 | SF-1: 5’-GCGTCATATTAATGCGCAGGT-3’ | 663 | 59.47 |
| S84 | 28144 | SF-2: 5’-CGTGGATGAGGCTGGTTCTA-3’ | 300 | 59.18 |
| V251 | 26144 | VF-1: 5’-TCAGGTGATGGCACAACAAGT-3’ | 468 | 60.13 |
| I378 | 1397 | IF-1: 5’-GAAACTTCATGGCAGACGGG-3’ | 303 | 59.20 |
| I378 | 28688 | IF-2: 5’-ACCGCTCTCACTCAACATGG-3’ | 632 | 60.04 |
| D392 | 1440 | DF-1: 5’-AGGTGCCACTACTTGTGGTT-3’ | 607 | 59.16 |
| D392 | 2891 | DF-2: 5’-CGGTGCACCAACAAAGGTTAC-3’ | 450 | 60.27 |
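
For readers who want to work with the primer table programmatically, the following is a minimal sketch in Python. It only re-encodes the primer names, clades, positions and sequences listed above; the helper names `gc_content` and `primers_by_clade` are hypothetical and are not part of the published method. GC content is shown as a simple sanity check and is not a substitute for the melting temperatures reported in the table, which were presumably computed with a dedicated primer-design tool.

```python
# Primers copied from the table above: name -> (clade, genome position, sequence).
PRIMERS = {
    "GF-1": ("G614", 241, "TGTCGTTGACAGGACACGAG"),
    "GF-2": ("G614", 3037, "ATGAGTTCGCCTGTGTTGTG"),
    "GF-3": ("G614", 14408, "TGGGATCAGACATACCACCCA"),
    "GF-4": ("G614", 23403, "CTGATGCTGTCCGTGATCCA"),
    "SF-1": ("S84", 8782, "GCGTCATATTAATGCGCAGGT"),
    "SF-2": ("S84", 28144, "CGTGGATGAGGCTGGTTCTA"),
    "VF-1": ("V251", 26144, "TCAGGTGATGGCACAACAAGT"),
    "IF-1": ("I378", 1397, "GAAACTTCATGGCAGACGGG"),
    "IF-2": ("I378", 28688, "ACCGCTCTCACTCAACATGG"),
    "DF-1": ("D392", 1440, "AGGTGCCACTACTTGTGGTT"),
    "DF-2": ("D392", 2891, "CGGTGCACCAACAAAGGTTAC"),
}


def gc_content(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def primers_by_clade(primers: dict) -> dict:
    """Group primer names by the clade they target (hypothetical helper)."""
    grouped = {}
    for name, (clade, _position, _seq) in primers.items():
        grouped.setdefault(clade, []).append(name)
    return grouped


if __name__ == "__main__":
    for name, (clade, position, seq) in PRIMERS.items():
        print(f"{name}\t{clade}\t{position}\t{len(seq)} nt\tGC={gc_content(seq):.0%}")
```
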
Fig. 1. Clades of SARS-CoV-2. (A) A global SNP-based radial phylogeny of SARS-CoV-2 genomes defining five major clades (G614, S84, V251, I378 and D392) and several subclades based on nucleotide substitution events. (B) A simplified phylogenetic tree illustrating the evolutionary relationship of the clades/subclades, based on random sampling of complete genomes from each subclade. (C) The clade- and subclade-defining SNPs for each clade and subclade. *These SNPs arose independently in more than one clade and hence are not clade-defining (refer to Fig. S2). (D) A comparative guide to the clades defined in our study and the lineages recently defined by Rambaut et al. (2020) and GISAID (Shu and McCauley 2017).
Fig. 2. Global distribution of the major and minor clades of SARS-CoV-2 genomes and their relative prevalence over a 6-month period from December 24, 2019 to June 30, 2020, covering the outbreak and early stages of the pandemic. The size of each pie chart is proportional to the numbers within each respective clade. The cumulative trend of the clades is shown on the right, and the time span indicates the first and last observed case in each clade.
Fig. 3. Workflow from clinical sample collection through next-generation sequencing to SARS-CoV-2 clade assignment. (A) Schematic representation of the genotyping method described in this study. Positive samples were subjected to RNA extraction and multiplex RT-PCR. The amplicons were purified and used for Illumina library preparation. Sequencing was performed using the MiSeq 600-cycle V3 kit, and results were analyzed using our clade-defining script. (B) Boxplot of the coverage showing the log fold depth of the 11 clade-defining positions across the 24 SARS-CoV-2 genomes in multiplex sequencing-based genotyping. The primer sequences and the PCR products of each primer pair are shown in Table 1 and Fig. S3.
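
The clade-defining script itself is not reproduced in this record. As a rough illustration of the idea only, the sketch below assigns a clade when a sample carries non-reference alleles at that clade's defining positions. The positions are taken from the primer table; the function name `assign_clades`, its input format and the `min_fraction` threshold are assumptions, and the sketch further assumes that any non-reference call at a listed position is the clade-defining allele, since the alleles themselves are not listed here.

```python
# Clade-defining genome positions, copied from the primer table in this record.
CLADE_POSITIONS = {
    "G614": {241, 3037, 14408, 23403},
    "S84": {8782, 28144},
    "V251": {26144},
    "I378": {1397, 28688},
    "D392": {1440, 2891},
}


def assign_clades(variant_positions, min_fraction=1.0):
    """Return the clades supported by a sample's variant calls.

    variant_positions: iterable of 1-based genome positions at which the
        sample carries a non-reference allele (hypothetical input format).
    min_fraction: fraction of a clade's defining positions that must be
        variant for the clade to be reported (assumed default: all of them).
    """
    observed = set(variant_positions)
    hits = []
    for clade, positions in CLADE_POSITIONS.items():
        fraction = len(positions & observed) / len(positions)
        if fraction >= min_fraction:
            hits.append(clade)
    return hits


if __name__ == "__main__":
    # A sample with variants at all four G614 positions.
    print(assign_clades([241, 3037, 14408, 23403]))
```

Lowering `min_fraction` would tolerate amplicon dropout at some positions, at the cost of less specific assignments; a sample with no hits would correspond to none of the five clades under this sketch.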
Fig. 4. Mapping of SARS-CoV-2 clade-defining mutations onto the proteins. Nonsynonymous mutations are shown for proteins whose 3D structure was experimentally determined (spike, nsp12/7/8) or can be inferred with reasonable confidence. Mutations are colour-coded as for the corresponding clades in Fig. 3 (D: magenta; G: light green; I: blue; V: orange). For a detailed analysis, see Fig. S5-11. (A) The structure of the SARS-CoV-2 spike trimer in its open conformation (chains are cyan, magenta and grey) bound to the human receptor ACE2 (black), modelled based on PDB accessions 6m17 and 6vyb. Identified nonsynonymous mutations are shown as spheres in the model. For visibility, only mutations on two of the three spike chains are labelled. memb. indicates the plasma membrane. (B) Fragment comprising residues 180-534 of nsp2, modelled by AlphaFold35. Both clade-defining mutations are located in solvent-exposed regions and would not lead to steric clashes. (C) The substitution A876T (corresponding to residue A58 in the nsp3 cleavage product numbering) is situated in the N-terminal ubiquitin-like domain of nsp3. The structure of this domain can be inferred from the 79% identical structure of residues 1-112 from SARS-CoV (PDB id 2idy). The substitution A876T can be accommodated with only minor structural adjustments and is not expected to have a substantial influence on protein stability or function. (D) The structure shows nsp12 in complex with nsp7 (magenta) and nsp8 (cyan and teal), based on PDB 7btf. P4720 (P323 in nsp12 numbering) is located in the ‘interface domain’ (black). In this position, the P323L substitution is not predicted to disrupt folding or protein interactions and hence is not expected to have strong effects. (E) A theoretical model for the Orf3a monomer has been proposed by AlphaFold36. The structure-function relationship of this protein remains to be clarified. The mutation G251V is located C-terminal to the β-sandwich domain and the tail (marked by an asterisk).