Nasir Riaz1,2, Preston Leung1, Kirston Barton3, Martin A Smith3, Shaun Carswell3, Rowena Bull1,4, Andrew R Lloyd1, Chaturaka Rodrigo5,6. 1. Kirby Institute, UNSW Sydney, Sydney, NSW, 2052, Australia. 2. Department of Microbiology, Hazara University, KPK, Maneshra, 21120, Pakistan. 3. Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Sydney, Australia. 4. Department of Pathology, School of Medical Sciences, UNSW Sydney, Sydney, NSW, 2052, Australia. 5. Kirby Institute, UNSW Sydney, Sydney, NSW, 2052, Australia. c.rodrigo@unsw.edu.au. 6. Department of Pathology, School of Medical Sciences, UNSW Sydney, Sydney, NSW, 2052, Australia. c.rodrigo@unsw.edu.au.
Abstract
BACKGROUND: Hepatitis C virus (HCV) and many other RNA viruses exist as rapidly mutating quasi-species populations within a single infected host. High throughput characterization of full genome, within-host variants is still not possible despite advances in next generation sequencing. This limitation constrains viral genomic studies that depend on accurate identification of hemi-genome or whole genome, within-host variants, especially those occurring at low frequencies. With the advent of third generation long read sequencing technologies, including Oxford Nanopore Technology (ONT) and PacBio platforms, this problem is potentially surmountable. ONT is particularly attractive in this regard because the MinION sequencer is portable, which makes real-time sequencing possible in remote and resource-limited locations. However, this technology (termed here 'nanopore sequencing') has a comparatively high technical error rate. The present study aimed to assess the utility, accuracy and cost-effectiveness of nanopore sequencing for HCV genomes. We also introduce a new bioinformatics tool (Nano-Q) to differentiate within-host variants in nanopore sequencing data.
RESULTS: When coverage exceeded 300 reads, the nanopore platform generated consensus sequences comparable to those from Illumina sequencing. Using HCV Envelope plasmids (~1800 nt) mixed in known proportions, nanopore sequencing was shown to reliably identify variants with an abundance as low as 0.1%, provided the autologous reference sequence was available to identify the matching reads. Successful pooling and nanopore sequencing of 52 samples from patients with HCV infection demonstrated its cost effectiveness (AUD$43 per sample with nanopore sequencing versus $100 with paired-end short read technology). The Nano-Q tool successfully separated between-host sequences, including those from the same subtype, by bulk sorting and phylogenetic clustering without an autologous reference sequence (using only a subtype-specific generic reference). The pipeline also identified within-host viral variants and their abundance when the parameters were appropriately adjusted.
CONCLUSION: Cost effective HCV whole genome sequencing and within-host variant identification without haplotype reconstruction are potential advantages of nanopore sequencing.
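As a rough illustration of why read depth matters for the low-frequency result above, the sketch below estimates how many reads would be needed to sample a variant present at a given abundance. It is not part of the study or of Nano-Q: it assumes reads are drawn independently (a simple binomial model) and that every variant read is classified correctly, which ignores the technical error rate of nanopore sequencing. The function names and the thresholds `k` and `confidence` are hypothetical choices made for this example.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p).

    n is the total number of reads, p the variant abundance
    (e.g. 0.001 for 0.1%), k the minimum number of variant reads
    we want to observe. Assumes independent, error-free reads.
    """
    below_k = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k))
    return 1.0 - below_k


def min_reads(p: float, k: int = 5, confidence: float = 0.95) -> int:
    """Smallest read count n with P(at least k variant reads) >= confidence.

    k and confidence are illustrative thresholds, not values from the study.
    """
    n = k
    while prob_at_least(k, n, p) < confidence:
        n += 1
    return n


if __name__ == "__main__":
    abundance = 0.001  # 0.1%, the lowest abundance reported in the abstract
    for k in (1, 5):
        print(f"k={k}: need about {min_reads(abundance, k=k)} reads")
```

Under this simplified model the required depth grows roughly in inverse proportion to the variant abundance, so reliable detection at 0.1% needs coverage in the thousands of reads rather than the few hundred that suffice for a consensus sequence.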
Keywords:
Haplotypes; Hepatitis C virus; Nano-Q; Oxford Nanopore technology; Third generation sequencing