A novel alignment-free method for HIV-1 subtype classification.

Lily He, Rui Dong, Rong Lucy He, Stephen S-T Yau.

Abstract

HIV-1, the most common and pathogenic strain of human immunodeficiency virus, comprises many subtypes. To study how HIV-1 subtypes differ in infection, diagnosis and drug design, it is important to identify the subtypes present in clinical HIV-1 samples. In this work, we propose an effective numeric representation called Subsequence Natural Vector (SNV) to encode HIV-1 sequences. Using this representation, we introduce an improved linear discriminant analysis method to classify HIV-1 viruses correctly. SNV is based on the distribution of nucleotides in HIV-1 viral sequences. It captures not only the number of nucleotides but also their position and variance in the viral sequence. To validate our alignment-free method, 6,902 complete genomes and 11,668 pol gene sequences of HIV-1 subtypes were collected from the up-to-date Los Alamos HIV database. SNV outperforms three popular methods, Kameris, Comet and REGA, achieving almost 100% sensitivity and specificity in much less time. Our subtyping algorithm works especially well for circulating recombinant forms (CRFs) that are represented by only a few sequences. Our approach also separates unique recombinant forms (URFs) from other subtypes with 100% sensitivity and specificity. Moreover, phylogenetic trees based on the SNV representation were constructed from full-length HIV-1 genomes and from pol genes; in both, viruses of the same subtype cluster together correctly.
Copyright © 2019 Elsevier B.V. All rights reserved.
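
The abstract describes SNV only in words: a count, a position and a variance per nucleotide. The Python sketch below illustrates that kind of encoding. It is not the authors' implementation. The variance normalization (dividing by the nucleotide count times the sequence length) follows the commonly used natural vector definition and is an assumption here, as are the equal-length splitting scheme, the function names and the `pieces` parameter.

```python
from typing import List

NUCLEOTIDES = "ACGT"


def natural_vector(seq: str) -> List[float]:
    """Counts, mean positions and normalized position variances for A, C, G, T.

    Returns 12 numbers: 4 counts, then 4 means, then 4 variances.
    Positions are 1-based. A nucleotide that does not occur gets 0.0
    for its mean and variance (an assumption made for this sketch).
    """
    seq = seq.upper()
    n = len(seq)
    counts, means, variances = [], [], []
    for base in NUCLEOTIDES:
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]
        n_k = len(positions)
        if n_k == 0:
            mu_k, d2_k = 0.0, 0.0
        else:
            mu_k = sum(positions) / n_k
            # Second central moment, scaled by n_k * n (assumed normalization).
            d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu_k)
        variances.append(d2_k)
    return counts + means + variances


def subsequence_natural_vector(seq: str, pieces: int = 3) -> List[float]:
    """Hypothetical SNV: split seq into near-equal pieces, concatenate their vectors."""
    n = len(seq)
    bounds = [(i * n) // pieces for i in range(pieces + 1)]
    vector: List[float] = []
    for start, end in zip(bounds, bounds[1:]):
        vector.extend(natural_vector(seq[start:end]))
    return vector
```

Each sequence maps to a fixed-length vector of `12 * pieces` numbers regardless of its length, which is what lets a classifier such as linear discriminant analysis compare genomes without aligning them.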

Entities:  

Keywords:  Alignment-free; Classification; HIV-1; SNV

Year:  2019        PMID: 31683009     DOI: 10.1016/j.meegid.2019.104080

Source DB:  PubMed          Journal:  Infect Genet Evol        ISSN: 1567-1348            Impact factor:   3.342


  2 in total

1.  4D-Dynamic Representation of DNA/RNA Sequences: Studies on Genetic Diversity of Echinococcus multilocularis in Red Foxes in Poland.

Authors:  Dorota Bielińska-Wąż; Piotr Wąż; Anna Lass; Jacek Karamon
Journal:  Life (Basel)       Date:  2022-06-10

2.  Classification of genomic components and prediction of genes of Begomovirus based on subsequence natural vector and support vector machine.

Authors:  Shaojun Pei; Rui Dong; Yiming Bao; Rong Lucy He; Stephen S-T Yau
Journal:  PeerJ       Date:  2020-08-03       Impact factor: 2.984

