Somsubhra Barik, Shreepriya Das, Haris Vikalo.
Abstract
RNA viruses are characterized by high mutation rates that give rise to populations of closely related genomes, known as viral quasispecies. This underlying heterogeneity enables the quasispecies to adapt to changing conditions and proliferate over the course of an infection. Determining the genetic diversity of a virus (i.e., inferring haplotypes and their proportions in the population) is essential for understanding its mutation patterns and for effective drug development. Here, we present QSdpR, a method and software for the reconstruction of quasispecies from short sequencing reads. The reconstruction is achieved by solving a correlation clustering problem on a read-similarity graph, and the results of the clustering are used to estimate the frequencies of sub-species; the number of sub-species is determined using the pseudo-F index. Extensive tests on both synthetic datasets and experimental HIV-1 and Zika virus data demonstrate that QSdpR compares favorably to existing methods in terms of various performance metrics.
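To make the model-selection step concrete, the sketch below scores a candidate clustering of reads with a pseudo-F index and reads off sub-species frequencies from cluster sizes. It is an illustration only, not QSdpR's code: it assumes the standard Calinski-Harabasz form of the index, equal-length reads over the alphabet ACGT, a one-hot encoding, and cluster labels that are already given. The function names (`one_hot`, `pseudo_f`, `cluster_frequencies`) are hypothetical, and the abstract does not specify the exact definition used in the paper.

```python
from collections import Counter, defaultdict

ALPHABET = "ACGT"  # assumed alphabet; gaps and ambiguous bases are not handled


def one_hot(read):
    """Encode a read as a flat 0/1 vector, four entries per position."""
    return [1.0 if base == letter else 0.0 for base in read for letter in ALPHABET]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def pseudo_f(reads, labels):
    """Pseudo-F (Calinski-Harabasz) index of a clustering of reads.

    Ratio of between-cluster to within-cluster dispersion, each divided
    by its degrees of freedom. Larger values mean tighter, better
    separated clusters.
    """
    vectors = [one_hot(r) for r in reads]
    clusters = defaultdict(list)
    for vec, lab in zip(vectors, labels):
        clusters[lab].append(vec)

    n, k = len(vectors), len(clusters)
    if k < 2 or n <= k:
        raise ValueError("need at least 2 clusters and more reads than clusters")

    overall = centroid(vectors)
    between = 0.0
    within = 0.0
    for members in clusters.values():
        c = centroid(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(v, c) for v in members)

    if within == 0.0:
        return float("inf")  # every cluster is perfectly homogeneous
    return (between / (k - 1)) / (within / (n - k))


def cluster_frequencies(labels):
    """Estimate sub-species frequencies as the fraction of reads per cluster."""
    counts = Counter(labels)
    total = len(labels)
    return {lab: count / total for lab, count in counts.items()}


if __name__ == "__main__":
    reads = ["AAAA", "AAAT", "CCCC", "CCCG"]  # toy data, not from the paper
    labels = [0, 0, 1, 1]
    print(pseudo_f(reads, labels), cluster_frequencies(labels))
```

In this setting one would compute the index for clusterings obtained with different numbers of clusters and keep the number that maximizes it; how the candidate clusterings themselves are produced (the correlation clustering step) is outside the scope of this sketch.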
Keywords: Clustering; Max K-cut; Next generation sequencing; Quasispecies; RNA viruses
Year: 2017 PMID: 29268961 DOI: 10.1016/j.ygeno.2017.12.007
Source DB: PubMed Journal: Genomics ISSN: 0888-7543 Impact factor: 5.736