Harshavardhan Doddapaneni, Sara Javornik Cregeen, Richard Sucgang, Qingchang Meng, Xiang Qin, Vasanthi Avadhanula, Hsu Chao, Vipin Menon, Erin Nicholson, David Henke, Felipe-Andres Piedra, Anubama Rajan, Zeineen Momin, Kavya Kottapalli, Kristi L Hoffman, Fritz J Sedlazeck, Ginger Metcalf, Pedro A Piedra, Donna M Muzny, Joseph F Petrosino, Richard A Gibbs.
Abstract
The newly emerged and rapidly spreading SARS-CoV-2 causes coronavirus disease 2019 (COVID-19). To facilitate a deeper understanding of the viral biology, we developed a capture sequencing methodology to generate SARS-CoV-2 genomic and transcriptome sequences from infected patients. We utilized an oligonucleotide probe-set representing the full-length genome to obtain both genomic and transcriptome (subgenomic open reading frames [ORFs]) sequences from 45 SARS-CoV-2 clinical samples with varying viral titers. For samples with higher viral loads (cycle threshold value under 33, based on the CDC qPCR assay), complete genomes were generated. Analysis of junction reads revealed regions of differential transcriptional activity and provided evidence of expression of ORF10. Heterogeneous allelic frequencies along the 20 kb ORF1ab gene suggested the presence of a defective interfering viral RNA species subpopulation in one sample. The associated workflow is straightforward, and hybridization-based capture offers an effective and scalable approach for sequencing SARS-CoV-2 from patient samples.
Year: 2020 PMID: 33330863 PMCID: PMC7743067 DOI: 10.1101/2020.12.11.421057
Source DB: PubMed Journal: bioRxiv
Fig 1. Schematic workflow.
The workflow shows the steps involved in the SARS-CoV-2 capture and sequencing methodology.
Fig 2. Sequence data.
Ct value vs. percentage of raw sequencing reads mapped to SARS-CoV-2 in (a) capture-enriched samples; (b) pre-capture samples; (c) positive and negative controls. Each panel plots the percentage of reads mapped to the ‘SARS-CoV-2’ genome, the percentage mapped to the ‘human’ reference genome, and a third category, ‘reads others’, which combines trimmed reads with reads that fall into neither of the other two categories. Ct values in bold indicate samples that provided full-length genome assemblies.
Fig 3. Scatter plot showing genome completeness as a function of Ct value.
Pink circles represent post-capture samples and black asterisks represent pre-capture samples.
Fig 4. Schematic representation of the 192000051B assembly.
Black bars represent loci where the assembly called alleles different from the NCBI reference sequence NC_045512. Green bars represent mixed loci where both reference and alternate alleles were called. All mixed loci are in the ORF1ab gene and are listed in the table, along with the frequency of the alternate allele at each position and its predicted effect on translation.
Fig 5. SARS-CoV-2 subgenomic mRNAs.
(a) Junction read quantification per gene, estimated as the number of junction reads per million (log transformed), for five pre-capture and 17 capture samples. Samples chosen for this analysis have above 95% genome completeness. The coverage level per sample is shown below the gene heatmap. Sample names in bold denote the same sample sequenced both as pre-capture and as capture. (b) ORF read coverage shown as normalized read counts (RPKM) per gene for 17 capture samples.
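The two normalizations named in this caption can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record does not state the logarithm base, whether a pseudocount was used, or which read total serves as the denominator. The base-10 logarithm, the pseudocount of 1, the use of total mapped reads as the denominator, and all function names below are assumptions made for illustration only.

```python
import math


def junction_reads_per_million(junction_reads: int, total_mapped_reads: int) -> float:
    """Scale a junction-read count to reads per million mapped reads."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    return junction_reads * 1_000_000 / total_mapped_reads


def log_junction_reads_per_million(
    junction_reads: int,
    total_mapped_reads: int,
    pseudocount: float = 1.0,  # assumption: keeps zero counts finite
) -> float:
    """Log-transform junction reads per million (log10 is an assumption)."""
    jrpm = junction_reads_per_million(junction_reads, total_mapped_reads)
    return math.log10(jrpm + pseudocount)


def rpkm(gene_reads: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Standard RPKM: reads per kilobase of gene per million mapped reads."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return gene_reads * 1e9 / (gene_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical counts, not values from the study.
    print(junction_reads_per_million(50, 2_000_000))
    print(log_junction_reads_per_million(50, 2_000_000))
    print(rpkm(100, 1_000, 1_000_000))
```

Dividing by gene length in RPKM makes coverage comparable between ORFs of different sizes. The per-million scaling alone makes junction counts comparable between samples sequenced to different depths.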