Oscar González-Recio, Mónica Gutiérrez-Rivas, Ramón Peiró-Pastor, Pilar Aguilera-Sepúlveda, Cristina Cano-Gómez, Miguel Ángel Jiménez-Clavero, Jovita Fernández-Pinero.
Abstract
Nanopore sequencing has emerged as a rapid and cost-efficient tool for diagnostic and epidemiological surveillance of SARS-CoV-2 during the COVID-19 pandemic. This study compared the results of sequencing the SARS-CoV-2 genome using R9 vs R10 flow cells and a Rapid Barcoding Kit (RBK) vs a Ligation Sequencing Kit (LSK). The R9 chemistry provided a lower error rate (3.5%) than the R10 chemistry (7%). The SARS-CoV-2 genome includes few homopolymeric regions; the longest homopolymers were composed of 7 (TTTTTTT) and 6 (AAAAAA) nucleotides. The R10 chemistry resulted in a lower rate of deletions in thymine and adenine homopolymeric regions than the R9, at the expense of a higher rate (~10%) of mismatches in these regions. The LSK gave a larger yield and longer reads than the RBK. It also resulted in a larger percentage of aligned reads (99 vs 93%) and in a complete consensus genome. The results of this study suggest that the LSK library preparation provided longer DNA fragments, which contributed to a better assembly of the SARS-CoV-2 genome, despite impaired detection of variants in an R10 flow cell. Nanopore sequencing could be used in epidemiological surveillance of SARS-CoV-2.
KEY POINTS:
• Sequencing the SARS-CoV-2 genome is of great importance for pandemic surveillance.
• Nanopore offers a low-cost and accurate method to sequence the SARS-CoV-2 genome.
• Ligation sequencing is preferred over the rapid kit, which uses transposases.
Keywords: COVID-19; Flow cell; Genome assembly; Nanopore; SARS-CoV-2; Sequencing
Year: 2021 PMID: 33792750 PMCID: PMC8014908 DOI: 10.1007/s00253-021-11250-w
Source DB: PubMed Journal: Appl Microbiol Biotechnol ISSN: 0175-7598 Impact factor: 4.813
Minimap2 alignment summary results
| Flow cell and sequencing kit | Reads | Aligned reads | Unaligned reads | Aligned (%) | Non-sense read fraction (%) |
|---|---|---|---|---|---|
| R9-RBK004 | 16,991 | 15,827 | 1,164 | 93.15 | 42 |
| R10-LSK009 | 9,658 | 9,548 | 110 | 98.86 | 18 |
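The percentage of aligned reads in the table follows directly from the read counts: aligned reads divided by total reads. The sketch below recomputes it from the table values as a consistency check. The dictionary layout and the function name `percent_aligned` are illustrative and do not come from the study.

```python
# Read counts copied from the Minimap2 summary table; the structure and
# names here are illustrative assumptions, not the authors' code.
runs = {
    "R9-RBK004": {"reads": 16991, "aligned": 15827},
    "R10-LSK009": {"reads": 9658, "aligned": 9548},
}


def percent_aligned(reads: int, aligned: int) -> float:
    """Return aligned reads as a percentage of all reads, rounded to 2 decimals."""
    if reads <= 0:
        raise ValueError("reads must be positive")
    return round(100.0 * aligned / reads, 2)


for name, counts in runs.items():
    # Unaligned reads are simply the remainder of the total.
    unaligned = counts["reads"] - counts["aligned"]
    pct = percent_aligned(counts["reads"], counts["aligned"])
    print(f"{name}: unaligned={unaligned}, aligned={pct}%")
```

Run on the table values, this reproduces the reported 1164 and 110 unaligned reads and the 93.15 and 98.86 percentages.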
Fig. 1 GC content distribution from ONT sequences from (a) R9 set, (b) R10 flow cells. The blue bars come from entire reads, and the red ones were computed from chunked (150 bp) subsequences
Fig. 2 Fraction of mismatches per read against the SARS-CoV-2 reference genome
Fig. 3 Read accuracy at homopolymeric sites in the SARS-CoV-2 genome for each FC type
Fig. 4 Smoothed normalized coverage of reads by position from each type of FC (R9 and R10) against the SARS-CoV-2 genome (smoothing window width = 200 bp)
Fig. 5 Common and unique SNPs (a) for R9 set, (b) for R10 set
Fig. 6 Common and unique SNPs detected by (a) LoFreq, (b) Pilon, (c) VarScan
Fig. 7 Allele frequencies of SNPs located in the SARS-CoV-2 genome detected by VarScan, LoFreq and Pilon
Fig. 8 Allele frequencies of INDELs detected by VarScan located in the SARS-CoV-2 genome
Fig. 9 Dotplot comparison of SARS-CoV-2 reference (x-axis) vs. R9 assembly (y-axis)
Fig. 10 Dotplot comparison of SARS-CoV-2 reference (x-axis) vs. R10 assembly (y-axis)
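Two quantities in the abstract and captions are simple to compute from a sequence string: the GC content of 150 bp chunks (Fig. 1) and the location of homopolymer runs such as the 7-nucleotide T run and the 6-nucleotide A run. The sketch below is one possible way to compute them and is not the authors' pipeline. The function names, the choice to drop a trailing partial chunk, and the default `min_len` of 6 are all assumptions made for illustration.

```python
from itertools import groupby


def gc_fraction(seq: str) -> float:
    """Fraction of G/C bases in seq (case-insensitive); 0.0 for an empty string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def chunked_gc(seq: str, chunk: int = 150) -> list[float]:
    """GC fraction of consecutive, non-overlapping chunks of `chunk` bases.

    A trailing partial chunk is dropped; that choice is an assumption.
    """
    return [
        gc_fraction(seq[i:i + chunk])
        for i in range(0, len(seq) - chunk + 1, chunk)
    ]


def homopolymer_runs(seq: str, min_len: int = 6) -> list[tuple[int, str, int]]:
    """Return (0-based start, base, length) for each single-base run >= min_len."""
    runs = []
    pos = 0
    # groupby yields maximal stretches of identical consecutive characters.
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


# Toy sequence (not SARS-CoV-2 data) containing one T run of 7 and one A run of 6.
toy = "ACG" + "T" * 7 + "GC" + "A" * 6 + "G"
print(homopolymer_runs(toy))
print(chunked_gc("GC" * 75 + "AT" * 75))
```

On real data, `seq` would be a read or the reference genome loaded from a FASTA or FASTQ file; parsing is left out here to keep the example self-contained.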