Nucleotide composition string selection in HIV-1 subtyping using whole genomes.

Xiaomeng Wu, Zhipeng Cai, Xiu-Feng Wan, Tin Hoang, Randy Goebel, Guohui Lin.

Abstract

MOTIVATION: The availability of the whole genomic sequences of HIV-1 viruses provides an excellent resource for studying the HIV-1 phylogenies using all the genetic materials. However, such huge volumes of data create computational challenges in both memory consumption and CPU usage.
RESULTS: We propose the complete composition vector representation for an HIV-1 strain, and a string scoring method to extract the nucleotide composition strings that contain the richest evolutionary information for phylogenetic analysis. In this way, a large-scale whole genome phylogenetic analysis for thousands of strains can be done both efficiently and effectively. By using 42 carefully curated strains as references, we apply our method to subtype 1156 HIV-1 strains (10.5 million nucleotides in total), which include 825 pure subtype strains and 331 recombinants. Our results show that our nucleotide composition string selection scheme is computationally efficient, and is able to define both pure subtypes and recombinant forms for HIV-1 strains using the 5000 top ranked nucleotide strings.
AVAILABILITY: The Java executable and the HIV-1 datasets are accessible through http://www.cs.ualberta.ca/~ghlin/src/WebTools/hiv.php.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
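
The general idea of composition-based subtyping can be illustrated with a minimal Python sketch. It is not the authors' Java implementation: the abstract does not specify the scoring function or the distance measure, so this sketch assumes a single string length k, ranks strings by the variance of their frequency across reference strains (an assumed stand-in for the paper's string scoring method), and assigns a query to its nearest reference by Euclidean distance. All function names are hypothetical.

```python
from collections import Counter
from itertools import product


def composition_vector(seq, k):
    """Relative frequency of every length-k nucleotide string in seq."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(len(seq) - k + 1, 1)
    return {
        "".join(p): counts["".join(p)] / total
        for p in product("ACGT", repeat=k)
    }


def top_strings(vectors, n):
    """Return the n strings whose frequency varies most across strains.

    Variance is an ASSUMED score, used here only for illustration;
    ties are broken alphabetically so the ranking is deterministic.
    """
    def variance(s):
        vals = [v[s] for v in vectors]
        mean = sum(vals) / len(vals)
        return sum((x - mean) ** 2 for x in vals) / len(vals)

    return sorted(vectors[0], key=lambda s: (-variance(s), s))[:n]


def distance(u, v, strings):
    """Euclidean distance restricted to the selected strings (assumed metric)."""
    return sum((u[s] - v[s]) ** 2 for s in strings) ** 0.5


def nearest_reference(query, references, k, n):
    """Label of the reference strain closest to the query sequence."""
    ref_vecs = {name: composition_vector(seq, k) for name, seq in references.items()}
    selected = top_strings(list(ref_vecs.values()), n)
    q = composition_vector(query, k)
    return min(ref_vecs, key=lambda name: distance(q, ref_vecs[name], selected))
```

With real data, `references` would map subtype labels to the curated reference genomes and `n` would play the role of the number of top ranked strings; restricting the distance to the selected strings is what keeps memory and CPU use manageable for thousands of genomes.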

Year:  2007        PMID: 17495995     DOI: 10.1093/bioinformatics/btm248

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  18 in total

1.  New powerful statistics for alignment-free sequence comparison under a pattern transfer model.

Authors:  Xuemei Liu; Lin Wan; Jing Li; Gesine Reinert; Michael S Waterman; Fengzhu Sun
Journal:  J Theor Biol       Date:  2011-06-25       Impact factor: 2.691

2.  Large local analysis of the unaligned genome and its application.

Authors:  Lianping Yang; Xiangde Zhang; Tianming Wang; Hegui Zhu
Journal:  J Comput Biol       Date:  2013-01       Impact factor: 1.479

3.  DIME: a novel framework for de novo metagenomic sequence assembly.

Authors:  Xuan Guo; Ning Yu; Xiaojun Ding; Jianxin Wang; Yi Pan
Journal:  J Comput Biol       Date:  2015-02       Impact factor: 1.479

4.  Phylogenetic analysis of protein sequences based on distribution of length about common sub-string.

Authors:  Guisong Chang; Tianming Wang
Journal:  Protein J       Date:  2011-03       Impact factor: 2.371

5.  A classification approach for genotyping viral sequences based on multidimensional scaling and linear discriminant analysis.

Authors:  Jiwoong Kim; Yongju Ahn; Kichan Lee; Sung Hee Park; Sangsoo Kim
Journal:  BMC Bioinformatics       Date:  2010-08-21       Impact factor: 3.169

6.  Reliability of rapid subtyping tools compared to that of phylogenetic analysis for characterization of human immunodeficiency virus type 1 non-B subtypes and recombinant forms.

Authors:  Africa Holguín; Marisa López; Vincent Soriano
Journal:  J Clin Microbiol       Date:  2008-10-08       Impact factor: 5.948

7.  Image correlation method for DNA sequence alignment.

Authors:  Millaray Curilem Saldías; Felipe Villarroel Sassarini; Carlos Muñoz Poblete; Asticio Vargas Vásquez; Iván Maureira Butler
Journal:  PLoS One       Date:  2012-06-27       Impact factor: 3.240

8.  A protein domain co-occurrence network approach for predicting protein function and inferring species phylogeny.

Authors:  Zheng Wang; Xue-Cheng Zhang; Mi Ha Le; Dong Xu; Gary Stacey; Jianlin Cheng
Journal:  PLoS One       Date:  2011-03-24       Impact factor: 3.240

9.  ComPhy: prokaryotic composite distance phylogenies inferred from whole-genome gene sets.

Authors:  Guan Ning Lin; Zhipeng Cai; Guohui Lin; Sounak Chakraborty; Dong Xu
Journal:  BMC Bioinformatics       Date:  2009-01-30       Impact factor: 3.169

10.  An evolutionary model-based algorithm for accurate phylogenetic breakpoint mapping and subtype prediction in HIV-1.

Authors:  Sergei L Kosakovsky Pond; David Posada; Eric Stawiski; Colombe Chappey; Art F Y Poon; Gareth Hughes; Esther Fearnhill; Mike B Gravenor; Andrew J Leigh Brown; Simon D W Frost
Journal:  PLoS Comput Biol       Date:  2009-11-26       Impact factor: 4.475
