jpHMM at GOBICS: a web server to detect genomic recombinations in HIV-1.

Ming Zhang, Anne-Kathrin Schultz, Charles Calef, Carla Kuiken, Thomas Leitner, Bette Korber, Burkhard Morgenstern, Mario Stanke.

Abstract

Detecting recombinations in the genome sequence of human immunodeficiency virus (HIV-1) is crucial for epidemiological studies and for vaccine development. Herein, we present a web server for subtyping and localization of phylogenetic breakpoints in HIV-1. Our software is based on a jumping profile Hidden Markov Model (jpHMM), a probabilistic generalization of the jumping-alignment approach proposed by Spang et al. The input data for our server is a partial or complete genome sequence from HIV-1; our tool assigns regions of the input sequence to known subtypes of HIV-1 and predicts phylogenetic breakpoints. jpHMM is available online at http://jphmm.gobics.de/.

Year:  2006        PMID: 16845050      PMCID: PMC1538796          DOI: 10.1093/nar/gkl255

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Currently, more than 150 000 partial or complete HIV genome sequences are available in the central HIV database at Los Alamos National Laboratory (1); these data are crucial for the development of drugs against AIDS. Analysis of HIV sequence data is challenging, however, since HIV is among the most genetically variable organisms known and recombination between different HIV subtypes is very common (2). HIV-1 is divided into three major phylogenetic groups, one of which, the M group, is responsible for the AIDS pandemic (3,4). This group is classified into ten subtypes, some of which are further divided into sub-subtypes. Accurate classification of HIV-1 subtypes and recombinants is of crucial importance for epidemiological monitoring and drug development. Therefore, a number of software tools have been developed to classify HIV genome sequences and to identify phylogenetic breakpoints and subtypes in recombinant strains (5,6). We recently developed an HMM-based method to compare nucleic acid sequences to a given multiple alignment A of a sequence family S for which a classification into subclasses is available (7). We called this method jumping profile Hidden Markov Model (jpHMM) since our approach is a probabilistic generalization of the jumping-alignment (JALI) algorithm proposed by Spang et al. (8,9). In JALI, a query sequence s is aligned to a multiple alignment A of a sequence family S = {s_1, …, s_n}. However, s is not aligned to A as a whole; instead, different parts of s can be aligned to different individual sequences s_i from A. Within an alignment of the query s to the sequence family S, ‘jumps’ are allowed between different sequences from S depending on where the strongest degree of similarity is found. For a jump between two sequences s_i and s_j, a penalty is imposed, similar to the familiar gap penalty used in standard sequence alignment.
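
The jumping-alignment idea can be illustrated with a deliberately simplified dynamic program. The sketch below is not the JALI implementation: it assumes that the query and all family sequences have the same length and are compared column by column without gaps, and the function name and the scoring parameters (match, mismatch, jump_penalty) are hypothetical choices made only for illustration.

```python
def jumping_alignment(query, family, match=1.0, mismatch=-1.0, jump_penalty=2.0):
    """Gapless toy version of a jumping alignment.

    Returns the best score and, for every query position, the index of the
    family sequence that the position is aligned to.
    """
    n, k = len(query), len(family)
    if n == 0:
        return 0.0, []
    if any(len(row) != n for row in family):
        raise ValueError("this sketch assumes equal-length, gap-free sequences")

    def s(r, i):
        # Substitution score of query position i against family row r.
        return match if family[r][i] == query[i] else mismatch

    # score[r]: best score of an alignment of query[:i+1] that ends in row r.
    score = [s(r, 0) for r in range(k)]
    back = []  # back[i-1][r]: row used at position i-1 when position i uses r
    for i in range(1, n):
        best_prev = max(range(k), key=lambda r: score[r])
        new_score, pointers = [], []
        for r in range(k):
            stay = score[r]                         # continue in the same row
            jump = score[best_prev] - jump_penalty  # jump in from the best row
            if stay >= jump:
                pointers.append(r)
                new_score.append(stay + s(r, i))
            else:
                pointers.append(best_prev)
                new_score.append(jump + s(r, i))
        back.append(pointers)
        score = new_score

    # Trace back from the best final row.
    r = max(range(k), key=lambda row: score[row])
    best = score[r]
    path = [r]
    for pointers in reversed(back):
        r = pointers[r]
        path.append(r)
    path.reverse()
    return best, path
```

With the toy family ["AAAAAAAA", "CCCCCCCC"] and the query "AAAACCCC", the best path uses the first sequence for the first four positions and the second sequence for the rest, paying the jump penalty once.
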
This approach is particularly useful if the query sequence s is a result of phylogenetic recombinations such that different parts of s are related to different sequences from the family S. JALI has been shown to perform well if an alignment A is to be searched against a sequence database (9). In our jpHMM approach, we assume that a partition of the sequences from the family S into subclasses is given. Each subclass is modeled as a profile Hidden Markov Model (10). Within a subclass, the usual transitions between match, insert and delete states are possible, as in standard profile HMM theory—but in addition, our model allows transitions between profile HMMs corresponding to different subclasses, so a path through our model can switch back and forth between different subclasses. Jumps between subclasses are associated with so-called jump probabilities. A detailed description of this approach is given in Schultz et al. (7).
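
To make the role of the jump probabilities concrete, the following sketch decodes the most probable subclass path through a heavily reduced model. It is an illustration under stated assumptions, not the jpHMM of (7): each subclass is represented only by match states built from a gap-free toy alignment (insert and delete states are omitted), the query is assumed to have exactly one base per alignment column, and a single uniform jump probability is used. All names (build_profiles, viterbi_subtypes, segments, jump_prob, pseudocount) are hypothetical.

```python
import math

ALPHABET = "ACGT"

def build_profiles(alignment_by_subtype, pseudocount=1.0):
    """Per-subtype, per-column log emission probabilities (match states only)."""
    profiles = {}
    for subtype, rows in alignment_by_subtype.items():
        columns = []
        for col in zip(*rows):
            total = len(col) + pseudocount * len(ALPHABET)
            columns.append({b: math.log((col.count(b) + pseudocount) / total)
                            for b in ALPHABET})
        profiles[subtype] = columns
    return profiles

def viterbi_subtypes(query, profiles, jump_prob=0.01):
    """Most probable subtype at each query position (needs >= 2 subtypes)."""
    subtypes = sorted(profiles)
    k = len(subtypes)
    log_stay = math.log(1.0 - jump_prob)
    log_jump = math.log(jump_prob / (k - 1))  # uniform over the other subtypes

    # Uniform start distribution over subtypes.
    v = {s: -math.log(k) + profiles[s][0][query[0]] for s in subtypes}
    back = []
    for i in range(1, len(query)):
        new_v, ptr = {}, {}
        for s in subtypes:
            def arrive(t):
                # Log probability of being in t at i-1 and moving to s.
                return v[t] + (log_stay if t == s else log_jump)
            best = max(subtypes, key=arrive)
            ptr[s] = best
            new_v[s] = arrive(best) + profiles[s][i][query[i]]
        back.append(ptr)
        v = new_v

    state = max(subtypes, key=lambda s: v[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

def segments(path):
    """Collapse a per-position path into (start, end, subtype), 1-based inclusive."""
    out, start = [], 0
    for i in range(1, len(path) + 1):
        if i == len(path) or path[i] != path[start]:
            out.append((start + 1, i, path[start]))
            start = i
    return out
```

In this reduced model a jump is taken only when the emission evidence for another subclass outweighs the cost of the jump probability, which is the trade-off the full model makes between profile HMMs.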

PREDICTION OF PHYLOGENETIC RECOMBINATION POINTS IN HIV-1 AT GOBICS

In (7), we found that jpHMM is a useful tool to predict phylogenetic breakpoints and subtypes in recombinant HIV and hepatitis C sequences (11). For HIV subtyping, we start with a pre-calculated multiple alignment of HIV-1 genome sequences consisting of all major subtypes and sub-subtypes; these (sub-)subtypes are modeled as profile HMMs in our jpHMM approach. It turned out that ‘jumps’ between these (sub-)subtypes correspond quite well to known phylogenetic breakpoints, and that the (sub-)subtypes to which a query sequence s is aligned reliably indicate the real (sub-)subtypes in recombinant HIV sequences. To evaluate our tool and to compare its prediction accuracy to competing methods such as Simplot (12) and RDP (13), we used a large set of real and simulated data from HIV-1 and hepatitis C. These test runs demonstrated that jpHMM is far more accurate than existing tools for phylogenetic breakpoint detection. Details of this program evaluation are described in (7). To make jpHMM available to the HIV research community, we set up an easy-to-use WWW interface at the Göttingen Bioinformatics Compute Server (GOBICS). At our server, the user can paste or upload up to 5 full-length HIV-1 genome sequences that are to be searched for phylogenetic breakpoints and subtypes. Our server uses a pre-calculated multiple alignment of 309 HIV sequences from the major HIV (sub-)subtypes obtained from the HIV database at . These sequences cover nine subtypes (A–D, F, G, H, J and K) and a presumed recombinant, 01_AE. Subtype A has two sub-subtypes, A1 and A2; similarly, F has two sub-subtypes, F1 and F2. B and D could be regarded as sub-subtypes, because their relative distance and relation are similar to those of A1 and A2 and of F1 and F2. For historical reasons, however, we still treat B and D as subtypes rather than sub-subtypes (14). Although 01_AE is called a recombinant, it contains the only available information on subtype E; we therefore include 01_AE in the alignment.
The alignment of these sequences has been carried out using HMMER (15) and subsequent manual improvement. A hyperlink to the results of the program run is returned to the user by e-mail. The result file contains a list of fragments of the input sequence that are assigned to different subtypes and sub-subtypes, including predicted breakpoints between these fragments. In addition, the output file contains a graphical representation of the predicted recombinant fragments within the HIV-1 genome. A sample output file is shown in Figure 1. The predicted breakpoint positions are provided in two ways. One is based on the original sequence position, and the other is based on HXB2 numbering. HXB2 (GenBank accession number K03455) is the most commonly used reference strain for many different kinds of HIV-1 functional studies. The HXB2 numbering provided for the output breakpoints is especially useful to facilitate the identification of the precise location of interest in HIV sequences.
Figure 1

Sample output from our jpHMM web server. The output file contains a list of fragments from the input HIV-1 sequences that are assigned to different HIV subtypes, including predicted breakpoints. At the bottom of the file, a graphical representation of the input sequence is given where recombinant subtypes are color coded. Gray regions denote missing subtype information due to uninformative subtype models.
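
Reporting a breakpoint in HXB2 numbering amounts to translating a position in the query into the coordinate of the reference base it is aligned to. The sketch below shows one way this translation could be done from a pairwise alignment given as two gapped strings. It is an assumption-laden illustration rather than the server's code: the function name, the ref_start parameter and the convention for insertions (a query base aligned to a reference gap receives the coordinate of the preceding reference base) are all hypothetical, and the sequences in the usage note are toy strings, not HXB2.

```python
def query_to_reference(aligned_query, aligned_ref, ref_start=1):
    """Map 1-based query positions to 1-based reference coordinates.

    Both arguments are rows of one pairwise alignment, with '-' for gaps.
    ref_start is the reference coordinate of the first reference base in
    the alignment. Query bases inserted relative to the reference get the
    coordinate of the last reference base before the insertion.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned rows must have equal length")
    mapping = {}
    q_pos = 0
    r_pos = ref_start - 1
    for q_char, r_char in zip(aligned_query, aligned_ref):
        if r_char != "-":
            r_pos += 1
        if q_char != "-":
            q_pos += 1
            mapping[q_pos] = r_pos
    return mapping
```

For example, query_to_reference("AC-GTT", "ACCG-T", ref_start=100) maps query positions 1 to 5 onto reference coordinates 100, 101, 103, 103 and 104; the fourth query base is an insertion and inherits coordinate 103.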

PROGRAM LIMITATIONS AND FUTURE WORK

It should be mentioned that our tool is not always sensitive enough to detect HIV-1 subtypes H, J and K, as only a few full-length genome sequences of these subtypes are available to train our model. For these subtypes, we recommend comparing the results of jpHMM with those of other HIV-1 subtyping tools, for example, RIP (). As shown in (7), the overall prediction accuracy of our method is high compared with alternative approaches. Nevertheless, it would be useful for the user to assess the relative reliability of individual predicted breakpoints. In principle, this is possible by using posterior probabilities that can be calculated using the Forward and Backward algorithms as explained in (16). We are currently implementing these algorithms to estimate the (local) reliability of our predictions. This feature will be available on our web site in the near future. For predicted recombinants, users of our software may want to know putative parental sequences. Our method cannot provide this information directly, since jpHMM compares input sequences to a model derived from a pre-calculated alignment of representative sequences. It is possible, however, to search predicted recombinant segments of input sequences against the HIV-1 database to retrieve potential parent sequences. We are planning to add this functionality to our web server soon.
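
As a rough illustration of how posterior probabilities can indicate local reliability, the sketch below runs a scaled Forward-Backward pass over a reduced model in which each position has one state per subtype. This is not the planned server feature: the model ignores insert and delete states, uses a uniform start distribution and a single hypothetical jump_prob, and takes the per-position emission probabilities as given input.

```python
def posterior_subtypes(emissions, jump_prob=0.01):
    """Posterior probability of each subtype at each position.

    emissions[i][s] is the probability of the observed base at position i
    under subtype s. Requires at least two subtypes.
    """
    subtypes = sorted(emissions[0])
    k, n = len(subtypes), len(emissions)

    def trans(a, b):
        # Stay in the same subtype, or jump uniformly to one of the others.
        return 1.0 - jump_prob if a == b else jump_prob / (k - 1)

    def normalize(d):
        z = sum(d.values())
        return {s: d[s] / z for s in subtypes}

    # Forward pass, rescaled at every position to avoid underflow.
    fwd = [normalize({s: emissions[0][s] / k for s in subtypes})]
    for i in range(1, n):
        fwd.append(normalize({
            s: emissions[i][s] * sum(fwd[-1][t] * trans(t, s) for t in subtypes)
            for s in subtypes}))

    # Backward pass, rescaled in the same way.
    bwd = [None] * n
    bwd[n - 1] = {s: 1.0 for s in subtypes}
    for i in range(n - 2, -1, -1):
        bwd[i] = normalize({
            s: sum(trans(s, t) * emissions[i + 1][t] * bwd[i + 1][t]
                   for t in subtypes)
            for s in subtypes})

    # The posterior is proportional to forward times backward.
    return [normalize({s: f[s] * b[s] for s in subtypes})
            for f, b in zip(fwd, bwd)]
```

In a toy input where the first half of the positions favors one subtype and the second half another, the posteriors are close to 1 far from the switch and drop near it, which is the kind of local uncertainty a user would want to see around a predicted breakpoint.
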

REFERENCES

1.  HIV-1 nomenclature proposal.

Authors:  D L Robertson; J P Anderson; J A Bradac; J K Carr; B Foley; R K Funkhouser; F Gao; B H Hahn; M L Kalish; C Kuiken; G H Learn; T Leitner; F McCutchan; S Osmanov; M Peeters; D Pieniazek; M Salminen; P M Sharp; S Wolinsky; B Korber
Journal:  Science       Date:  2000-04-07       Impact factor: 47.728

2.  RDP: detection of recombination amongst aligned sequences.

Authors:  D Martin; E Rybicki
Journal:  Bioinformatics       Date:  2000-06       Impact factor: 6.937

3.  Detection of HIV-1 subtypes, recombinants, and dual infections in east Africa by a multi-region hybridization assay.

Authors:  Michael Hoelscher; William E Dowling; Eric Sanders-Buell; Jean K Carr; Matthew E Harris; Angelika Thomschke; Merlin L Robb; Deborah L Birx; Francine E McCutchan
Journal:  AIDS       Date:  2002-10-18       Impact factor: 4.177

4.  A novel approach to remote homology detection: jumping alignments.

Authors:  Rainer Spang; Marc Rehmsmeier; Jens Stoye
Journal:  J Comput Biol       Date:  2002       Impact factor: 1.479

5.  RDP2: recombination detection and analysis from sequence alignments.

Authors:  D P Martin; C Williamson; D Posada
Journal:  Bioinformatics       Date:  2004-09-17       Impact factor: 6.937

6.  A computer program designed to screen rapidly for HIV type 1 intersubtype recombinant sequences.

Authors:  A C Siepel; A L Halpern; C Macken; B T Korber
Journal:  AIDS Res Hum Retroviruses       Date:  1995-11       Impact factor: 2.205

7.  Hidden Markov models in computational biology. Applications to protein modeling.

Authors:  A Krogh; M Brown; I S Mian; K Sjölander; D Haussler
Journal:  J Mol Biol       Date:  1994-02-04       Impact factor: 5.469

8.  Full-length human immunodeficiency virus type 1 genomes from subtype C-infected seroconverters in India, with evidence of intersubtype recombination.

Authors:  K S Lole; R C Bollinger; R S Paranjape; D Gadkari; S S Kulkarni; N G Novak; R Ingersoll; H W Sheppard; S C Ray
Journal:  J Virol       Date:  1999-01       Impact factor: 5.103

Review 9.  Simian immunodeficiency virus infection of chimpanzees.

Authors:  Paul M Sharp; George M Shaw; Beatrice H Hahn
Journal:  J Virol       Date:  2005-04       Impact factor: 6.549

10.  A jumping profile Hidden Markov Model and applications to recombination sites in HIV and HCV genomes.

Authors:  Anne-Kathrin Schultz; Ming Zhang; Thomas Leitner; Carla Kuiken; Bette Korber; Burkhard Morgenstern; Mario Stanke
Journal:  BMC Bioinformatics       Date:  2006-05-22       Impact factor: 3.169
