Bingyu Yan1, Srishti Chakravorty1, Carmen Mirabelli2, Christiane E Wobus2, Behdad Afzali3, Majid Kazemian1,4, Luopin Wang4, Jorge L Trujillo-Ochoa3, Daniel Chauss3, Dhaneshwar Kumar1,3, Michail S Lionakis5, Matthew R Olson6. 1. Department of Biochemistry, Purdue University, West Lafayette, Indiana, USA. 2. Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, USA. 3. Immunoregulation Section, Kidney Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), NIH, Bethesda, Maryland, USA. 4. Department of Computer Science, Purdue University, West Lafayette, Indiana, USA. 5. Fungal Pathogenesis Section, Laboratory of Clinical Immunology and Microbiology, National Institute of Allergy and Infectious Diseases, Bethesda, Maryland, USA. 6. Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA.
High-throughput sequencing reads from virally infected cells provide detailed information about both the infected host cells and the invading viruses (1). For example, RNA-sequencing data from infected cells contain reads that unequivocally align to either the host or the viral transcriptome, enabling quantification of host and viral gene expression (2). Occasionally, there are reads with split characteristics, having one part (e.g., the 5′ end) unambiguously matching the host genome and another part (e.g., the 3′ end) clearly matching the viral genome. The split characteristic with unambiguous matching on each part is the key here, typically requiring convincing stretches of sequence matches, such as the >30 bp that we used in our analysis (3). Such reads are termed host-virus chimeric reads (HVCRs). Indeed, HVCRs that surpass statistical reproducibility and signal-to-noise standards might carry novel insights into the biology of host-virus interactions (4, 5). Thus, it is important to unambiguously detect statistically rigorous and biologically relevant HVCRs. We and others have shown that detection of relevant HVCRs is complicated by unfaithful reverse transcriptase and polymerase enzymes that template-switch during typical high-throughput sequencing library preparation protocols (6–9).
The conventional HVCRs with split characteristics that we and others used in our studies should not be confused with what we term “composite” reads: host reads that contain short matches to the viral genome or, vice versa, viral reads that contain short sequence matches to the host genome in the middle of the reads. Such “composite” viral reads seem to be the subject of the letter contributed by Grigoriev et al. Our work evaluated the biological relevance of conventional HVCRs only and showed that, in the context of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, they are most likely artifacts of library construction.
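The distinction drawn above between conventional HVCRs and “composite” reads can be illustrated with a minimal sketch. Only the >30 bp threshold comes from the text; the `Segment` type, the `classify_read` function, and the label names are hypothetical and are not the pipeline used in our analysis. The sketch assumes that each read has already been reduced to a list of aligned segments labeled by source.

```python
from dataclasses import dataclass

# Threshold taken from the text (">30 bp"); all names below are illustrative.
MIN_MATCH_BP = 30


@dataclass
class Segment:
    source: str  # "host" or "virus"
    start: int   # 0-based start position on the read
    end: int     # exclusive end position on the read

    @property
    def length(self) -> int:
        return self.end - self.start


def classify_read(segments: list[Segment]) -> str:
    """Label a read from its aligned segments (hypothetical scheme)."""
    segs = sorted(segments, key=lambda s: s.start)
    if len({s.source for s in segs}) < 2:
        return "single-source"

    # Conventional HVCR: two parts from different sources,
    # each matching over more than MIN_MATCH_BP bases.
    if (
        len(segs) == 2
        and segs[0].source != segs[1].source
        and all(s.length > MIN_MATCH_BP for s in segs)
    ):
        return "HVCR"

    # "Composite" read: a short match to the other genome in the
    # middle of a read that otherwise comes from a single source.
    if (
        len(segs) == 3
        and segs[0].source == segs[2].source
        and segs[1].source != segs[0].source
        and segs[1].length <= MIN_MATCH_BP
    ):
        return "composite"

    return "ambiguous"
```

For example, a 100-base read with a 60-base host part followed by a 40-base viral part would be labeled `HVCR`, whereas a viral read carrying a 20-base host match in its middle would be labeled `composite`.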
Due to the short nature of sequence matches within “composite” reads (such as those identified by Grigoriev et al.), they are more prone to statistical anomalies and alignment errors and are likely to align by chance to at least some regions of the 3.2 billion base pairs encoded in the human genome. Thus, any analysis of “composite” events would need to include empirical or theoretical probabilities of such observations under rigorous control experiments to rule out template switching, alignment errors, or statistical anomalies. Nevertheless, to avoid any misinterpretation, it is important to note that the observations of composite reads by Grigoriev et al. have no bearing on our original findings (3) and follow-up studies by others (10, 11) that HVCRs in SARS-CoV-2-infected cells do not support integration events and are infrequent and artifactual.
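The chance-alignment concern raised above for short matches can be made concrete with a back-of-the-envelope calculation. The sketch below is an illustration under strong simplifying assumptions, not an analysis we performed: it treats the genome as a uniform random sequence of independent bases, counts exact matches only, and considers both strands. Real genomes are repetitive and aligners tolerate mismatches, so actual chance-match rates for short sequences may be higher. The function name and parameters are hypothetical; only the 3.2 billion bp genome size comes from the text.

```python
# Genome size from the text; the uniform-random model is an assumption.
HUMAN_GENOME_BP = 3_200_000_000


def expected_chance_hits(k: int, genome_bp: int = HUMAN_GENOME_BP) -> float:
    """Expected number of exact occurrences of one specific k-mer.

    Assumes independent, equiprobable bases (probability 4**-k per
    position) and counts start positions on both strands.
    """
    positions = 2 * (genome_bp - k + 1)  # both strands
    return positions / 4**k


if __name__ == "__main__":
    for k in (12, 15, 20, 25, 31):
        print(f"k = {k:2d} bp: expected chance hits = {expected_chance_hits(k):.3g}")
```

Under this simplified model, a 15-base match is expected to occur several times in a genome of this size by chance alone, whereas a match longer than 30 bases is expected far less than once, which is consistent with requiring long, unambiguous matches on each part of a conventional HVCR.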
Authors: Bingyu Yan; Tilo Freiwald; Daniel Chauss; Luopin Wang; Erin West; Claudia Kemper; Behdad Afzali; Majid Kazemian; Carmen Mirabelli; Charles J Zhang; Eva-Maria Nichols; Nazish Malik; Richard Gregory; Marcus Bantscheff; Sonja Ghidelli-Disse; Martin Kolev; Tristan Frum; Jason R Spence; Jonathan Z Sexton; Konstantinos D Alysandratos; Darrell N Kotton; Stefania Pittaluga; Jack Bibby; Nathalie Niyonzima; Matthew R Olson; Shahram Kordasti; Didier Portilla; Christiane E Wobus; Arian Laurence; Michail S Lionakis Journal: Sci Immunol Date: 2021-04-07
Authors: Marc Zapatka; Ivan Borozan; Daniel S Brewer; Murat Iskar; Adam Grundhoff; Malik Alawi; Nikita Desai; Holger Sültmann; Holger Moch; Colin S Cooper; Roland Eils; Vincent Ferretti; Peter Lichter Journal: Nat Genet Date: 2020-02-05 Impact factor: 38.330
Authors: Dave T P Tang; Charles Plessy; Md Salimullah; Ana Maria Suzuki; Raffaella Calligaris; Stefano Gustincich; Piero Carninci Journal: Nucleic Acids Res Date: 2012-11-24 Impact factor: 16.971