Joshua B Singer, Emma C Thomson, Joseph Hughes, Elihu Aranday-Cortes, John McLauchlan, Ana da Silva Filipe, Lily Tong, Carmen F Manso, Robert J Gifford, David L Robertson, Eleanor Barnes, M Azim Ansari, Jean L Mbisa, David F Bibby, Daniel Bradshaw, David Smith.
Abstract
Using deep sequencing technologies such as Illumina's platform, it is possible to obtain reads from the viral RNA population, revealing the viral genome diversity within a single host. A range of software tools and pipelines can transform raw deep sequencing reads into Sequence Alignment/Map (SAM) files. We propose that interpretation tools should process these SAM files directly, translating individual reads to amino acids in order to extract statistics of interest such as the proportion of different amino acid residues at specific sites. This preserves per-read linkage between nucleotide variants at different positions within a codon location. The samReporter is a subsystem of the GLUE software toolkit which follows this direct read translation approach in its processing of SAM files. We tested samReporter on a deep sequencing dataset obtained from a cohort of 241 UK HCV patients for whom prior treatment with direct-acting antivirals had failed; deep sequencing and resistance testing have been suggested to be of clinical use in this context. We compared the polymorphism interpretation results of the samReporter against an approach that does not preserve per-read linkage. We found that the samReporter was able to properly interpret the sequence data at resistance-associated locations in nine patients where the alternative approach was equivocal. In three cases, the samReporter confirmed that resistance or an atypical substitution was present at NS5A position 30. In three further cases, it confirmed that the sofosbuvir-resistant NS5B substitution S282T was absent. This suggests that the direct read translation approach, as implemented in samReporter, is of value for interpreting viral deep sequencing data.
Keywords: bioinformatics; deep sequencing; drug resistance; hepatitis C virus; sequence interpretation; variant calling; virus genomics
Year: 2019 PMID: 30987147 PMCID: PMC6520954 DOI: 10.3390/v11040323
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
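To illustrate why per-read linkage matters, the following minimal Python sketch contrasts the two interpretation strategies described in the abstract. It is not GLUE code: the function names and the read counts are hypothetical, and the input is assumed to be a list of codon triplets already extracted from reads at one codon location. The two codons are chosen to match the ambiguous triplet RMG listed for NS5A position 30 in the table of resolved locations.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def residues_by_read(codon_reads):
    """Translate each read's codon individually (linkage preserved)."""
    return Counter(CODON_TABLE[codon] for codon in codon_reads)


def residues_by_position(codon_reads):
    """Pool nucleotides per position first (linkage lost), then list
    every residue encoded by any combination of the observed bases."""
    per_position = [{codon[i] for codon in codon_reads} for i in range(3)]
    return {CODON_TABLE["".join(c)] for c in product(*per_position)}


# Hypothetical read counts: two codons present in the viral population.
reads = ["GCG"] * 60 + ["AAG"] * 40

print(residues_by_read(reads))              # counts for A and K only
print(sorted(residues_by_position(reads)))  # ['A', 'E', 'K', 'T']
```

In this sketch the per-position view cannot rule out E (GAG) or T (ACG), because it no longer knows which first base occurred with which second base, whereas translating each read reports only the residues actually observed.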
Ambiguous resistance-associated locations resolved using GLUE samReporter.
| Sequencing Facility | Sample ID | Subtype | Virus Protein | Codon Location | Ambiguous Triplet | Typical Residue(s) | Possible Residues Set | Confirmed Residues Set |
|---|---|---|---|---|---|---|---|---|
| Glasgow | HCV294 | 3b | NS5B | 282 | WSY | S | CST | S |
| Glasgow | HCV300 | 3a | NS5A | 30 | RMG | A | AEKT | AK |
| PHE | R127 | 1a | NS5A | 24 | RSG | K | AGRT | GT |
| PHE | R164 | 3a | NS5A | 30 | RMG | A | AEKT | AK |
| PHE | R25 | 4r | NS5B | 159 | YTM | L | FL | L |
| PHE | R25 | 4r | NS5B | 282 | WSC | S | CST | S |
| PHE | R36 | 4r | NS5B | 282 | WSC | S | CST | S |
| PHE | R67 | 1a | NS5A | 30 | YAW | Q | HQY | QY |
| PHE | R91 | 1a | NS5A | 28 | RYG | M | AMTV | MV |
| Oxford | 7444 | 3a | NS5A | 62 | SYA | ST | ALPV | AL |
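The "Possible Residues Set" column can be reproduced by expanding each IUPAC ambiguity code in the triplet and translating every resulting codon. The sketch below is an illustration of that expansion, not part of GLUE; the function name `possible_residues` is hypothetical, and dropping stop codons is an assumption made so that the output matches the table (for example, YAW also expands to the stop codon TAA).

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def possible_residues(triplet):
    """Return the sorted residues encoded by any codon matching the triplet.

    Stop codons ('*') are omitted (an assumption, to mirror the table).
    """
    codons = product(*(IUPAC[base] for base in triplet.upper()))
    residues = {CODON_TABLE["".join(codon)] for codon in codons}
    return "".join(sorted(residues - {"*"}))


print(possible_residues("RMG"))  # AEKT
print(possible_residues("WSY"))  # CST
```

The "Confirmed Residues Set" column cannot be derived this way, since it depends on which codons actually occur together on individual reads.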
Figure 1. The chain of pairwise homology relationships between reads and the master reference sequence (H77 for HCV), established during the operation of GLUE samReporter.
GLUE samReporter commands.
| Command | Description |
|---|---|
| | Generate a table of nucleotide frequencies within a specific genome region. |
| | Generate a table of read depths within a specific genome region. |
| | Generate a FASTA consensus file, optionally using ambiguity codes. |
| | Generate a table of amino acid residue frequencies within a specific protein-coding region. |
| | Generate a table of codon frequencies within a specific protein-coding region. |
| | Scan for the presence or absence of GLUE |
| | Export a specific part of the SAM alignment as a FASTA file. |