Hosoon Choi (1), Munok Hwang (1), Dhammika H Navarathna (2), Jing Xu (1), Janell Lukey (2), Chetan Jinadatha (3, 4). 1. Department of Research, Central Texas Veterans Health Care System, Temple, Texas, USA. 2. Department of Pathology and Laboratory Medicine Services, Central Texas Veterans Health Care System, Temple, Texas, USA. 3. Department of Medicine, Central Texas Veterans Health Care System, Temple, Texas, USA. 4. College of Medicine, Texas A&M University, Bryan, Texas, USA.
Keywords:
COVID; COVIDSeq; CT value; SARS-CoV-2; clade; lineage; sequencing; swift
Along with real-time reverse transcriptase PCR (RT-PCR) diagnostic testing, whole-genome sequencing (WGS) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been an irreplaceable tool for epidemiological evaluation, and genomic sequence analysis has provided essential information for the development of antiviral therapeutics and vaccines (1, 2). Informative analysis requires good sequencing coverage: the default threshold for a lineage call by Pangolin is at least 50% non-N bases (3). Obtaining sufficient WGS coverage from low-viral-load samples is challenging, and such samples are often not subjected to WGS in many labs. However, WGS of low-viral-load patient samples will enable a more complete picture of viral transmission and viral evolution (4).
Here, we report the evaluation of two SARS-CoV-2 amplicon-based library prep kits, COVIDSeq with ARTIC v3 primers (Illumina) and Swift Normalase amplicon SARS-CoV-2 panels (SNAP) (Swift Biosciences) (5–7), with digital PCR and WGS using a total of 121 COVID-19-positive samples with threshold cycle (CT) values between 11 and 45 (BD Max SARS-CoV-2 reverse transcriptase quantitative PCR [RT-qPCR] assay; please refer to the supplementary material for library preparation and sequencing). About one-third (42) of the samples had a low viral load (CT > 30). Out of 121 samples, >95% genome coverage was obtained for 107 samples by SNAP and 89 samples by COVIDSeq. This sequence yield exceeds that of many previous SARS-CoV-2 sequencing reports, especially for low-viral-load samples.
Through testing of samples with various viral loads along with digital PCR, we were able to obtain a tentative cutoff value of sample viral load above which one can expect to acquire an informative sequencing outcome. Sequence coverage of >95% was obtained by using SNAP for all of the samples with CT ≤ 35 and by COVIDSeq for 97% of samples with CT ≤ 30. Sample RNA quantitation obtained using digital PCR provided more precise cutoff values. The quantitative digital PCR (TaqPath COVID combo kit [Thermo Fisher] analysis with QIAcuity [Qiagen]) cutoff values for obtaining >95% coverage are 10.5 copies/μL for SNAP and 147 copies/μL for COVIDSeq (Fig. 1). The median sequence read depth was 2,913× and 3,578× with SNAP and COVIDSeq, respectively. All SNAP samples with a read depth of >750× achieved >95% sequence coverage. For COVIDSeq, a read depth of >2,800× was sufficient to achieve >95% sequence coverage.
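As a rough illustration of the coverage figures discussed above, the following sketch computes the fraction of non-N bases in a consensus sequence and compares it with the 50% Pangolin threshold quoted in the text and the 95% level used in this report. It assumes that percent coverage can be approximated as the non-N fraction of the consensus FASTA record; the function names and the toy sequences are hypothetical and are not taken from the authors' pipeline.

```python
# Minimal sketch: non-N fraction of consensus sequences (assumed proxy for percent coverage).
PANGOLIN_MIN_NON_N = 0.50  # default lineage-call threshold as quoted in the text
STUDY_MIN_COVERAGE = 0.95  # coverage level treated as informative in this report


def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {record name: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split()
            name = parts[0] if parts else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(chunks) for key, chunks in records.items()}


def non_n_fraction(sequence):
    """Return the fraction of bases that are not 'N' (0.0 for an empty sequence)."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return (len(seq) - seq.count("N")) / len(seq)


def coverage_report(fasta_text):
    """Summarize each consensus record against the two thresholds."""
    report = {}
    for name, seq in parse_fasta(fasta_text).items():
        frac = non_n_fraction(seq)
        report[name] = {
            "non_n_fraction": frac,
            "meets_pangolin_default": frac >= PANGOLIN_MIN_NON_N,
            "meets_study_level": frac > STUDY_MIN_COVERAGE,
        }
    return report


if __name__ == "__main__":
    toy = ">sampleA\nACGTNNNNNN\n>sampleB\nACGTACGTAC\nACGTACGTAN\n"
    for sample, stats in coverage_report(toy).items():
        print(sample, stats)
```

In practice, coverage may instead be computed against the reference genome length, so the numbers from this sketch should be read as an approximation.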
FIG 1
Correlation between viral loads and percent coverage. (A) Correlation between CT values and percent coverage for the SNAP protocol. 100% of samples above the cutoff value had >95% coverage. (B) Correlation between CT values and percent coverage for the COVIDSeq protocol. 97% of samples above the cutoff value had >95% coverage. (C) Correlation between copy number obtained by digital PCR and percent coverage for the SNAP protocol. 100% of samples above the cutoff value had >95% coverage. (D) Correlation between copy number obtained by digital PCR and percent coverage for the COVIDSeq protocol. 100% of samples above the cutoff value had >95% coverage. Red bars are proposed CT cutoff values with which all of the samples located to the right side produced >95% genome coverage. (E) Summary of sequencing coverage for the SNAP protocol and the COVIDSeq protocol.
In addition, combining FASTQ files obtained from the two kits improved the sequencing coverage and read depth significantly (please refer to the supplementary material for the FASTQ-combining strategy). FASTQ files were combined for 52 samples that had <95% coverage or a lineage call discrepancy with either kit. By combining FASTQ files, >95% coverage was obtained from all 21 samples with CT ≤ 30 and from 22 out of 31 samples with CT > 30 (CT range, ∼31 to ∼42.7) (Fig. 2). The strategy of combining FASTQ files could also be effective in the case of amplicon loss (8), which could occur due to novel mutations in emerging variants like Omicron.
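The authors' actual combining strategy is described in their supplementary material and is not reproduced here. As a minimal sketch of one simple way to pool reads, assuming that plain concatenation of per-kit FASTQ files for the same sample is acceptable, the hypothetical helper below merges plain or gzip-compressed FASTQ files into a single gzip-compressed file. Because the two kits use different primer schemes, primer trimming may need to be handled per kit before or after pooling; that step is outside this sketch.

```python
import gzip
from pathlib import Path


def combine_fastq(inputs, output):
    """Concatenate FASTQ files (plain or .gz) into one gzip-compressed FASTQ file.

    Hypothetical helper: file naming and the decision to pool reads by simple
    concatenation are assumptions, not the published protocol.
    """
    output = Path(output)
    with gzip.open(output, "wb") as out:
        for path in inputs:
            path = Path(path)
            opener = gzip.open if path.suffix == ".gz" else open
            with opener(path, "rb") as src:
                last = b"\n"
                # Copy in 1 MiB chunks so large files are not loaded into memory.
                for chunk in iter(lambda: src.read(1 << 20), b""):
                    out.write(chunk)
                    last = chunk[-1:]
                # Guard against a missing final newline fusing two records.
                if last != b"\n":
                    out.write(b"\n")
    return output


def count_reads(path):
    """Count records in a gzip-compressed FASTQ file (4 lines per record)."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in handle) // 4
```

For paired-end data, the R1 files and the R2 files would be combined separately and in the same input order so that read pairs stay aligned between the two output files.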
FIG 2
Increase in percent coverage with combined FASTQ files. (A) Comparison of percent coverage between the SNAP protocol and combined FASTQ for low-CT samples. (B) Comparison of percent coverage between the SNAP protocol and combined FASTQ for high-CT samples. (C) Comparison of percent coverage between the COVIDSeq protocol and combined FASTQ for low-CT samples. (D) Comparison of percent coverage between the COVIDSeq protocol and combined FASTQ for high-CT samples. (E) Summary of sequencing coverage for the SNAP protocol, the COVIDSeq protocol, and combined FASTQ.
As a result, all 79 samples with CT ≤ 30 achieved genome coverage of >95%. For samples with a low viral load (CT > 30; CT range, ∼31 to ∼42.7), genome coverage of >95% was obtained for 33 of 42 samples with SNAP alone or by combining COVIDSeq and SNAP. Overall, SNAP produced better coverage and depth on moderate- and low-titer samples; all of the samples with CT ≤ 35 or >10.5 copies/μL achieved >95% sequence coverage. Combining FASTQ files from COVIDSeq and SNAP also increased sequence coverage and read depth for low-titer samples.
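The tentative cutoffs reported above can be expressed as a small lookup for triaging samples before sequencing. This is only a sketch: the function and dictionary names are hypothetical, the strict comparison at the copy-number boundary is an assumption (the text does not say whether the boundary is inclusive), and the COVIDSeq CT cutoff corresponds to 97% rather than all of the samples in this data set.

```python
from typing import Optional

# Tentative cutoffs taken from the text; the data structure itself is illustrative.
CUTOFFS = {
    "SNAP": {"max_ct": 35.0, "min_copies_per_ul": 10.5},
    "COVIDSeq": {"max_ct": 30.0, "min_copies_per_ul": 147.0},
}


def expect_high_coverage(
    kit: str,
    ct: Optional[float] = None,
    copies_per_ul: Optional[float] = None,
) -> Optional[bool]:
    """Return True if >95% coverage would be expected under the reported cutoffs.

    The digital PCR copy number is preferred when available, since the text
    describes it as giving more precise cutoffs. Returns None if no viral-load
    measurement is supplied.
    """
    cutoff = CUTOFFS[kit]
    if copies_per_ul is not None:
        # Assumption: strict inequality at the boundary.
        return copies_per_ul > cutoff["min_copies_per_ul"]
    if ct is not None:
        return ct <= cutoff["max_ct"]
    return None


if __name__ == "__main__":
    for kit in CUTOFFS:
        print(kit, expect_high_coverage(kit, ct=33.0))
```

A sample with CT 33, for example, falls within the SNAP cutoff but outside the COVIDSeq cutoff, which mirrors the observation that SNAP performed better on moderate- and low-titer samples.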
Authors: Richard A Teran; Kelly A Walblay; Elizabeth L Shane; Shannon Xydis; Stephanie Gretsch; Alexandra Gagner; Usha Samala; Hyeree Choi; Christy Zelinski; Stephanie R Black Journal: Am J Transplant Date: 2021-06 Impact factor: 8.086
Authors: Andrew Rambaut; Edward C Holmes; Áine O'Toole; Verity Hill; John T McCrone; Christopher Ruis; Louis du Plessis; Oliver G Pybus Journal: Nat Microbiol Date: 2020-07-15 Impact factor: 17.745
Authors: Jacob Kames; David D Holcomb; Ofer Kimchi; Michael DiCuccio; Nobuko Hamasaki-Katagiri; Tony Wang; Anton A Komar; Aikaterini Alexaki; Chava Kimchi-Sarfaty Journal: Sci Rep Date: 2020-09-24 Impact factor: 4.996