B Gaschen¹, C Kuiken, B Korber, B Foley. ¹ HIV Database and Analysis Group (T10), Los Alamos National Laboratory, Mail stop K710, Los Alamos, NM 87545, USA. bkg@lanl.gov
Abstract
MOTIVATION: The amount of HIV-1 sequence data generated (presently around 42,000 sequences, of which more than 22,000 are from the V3 region of the viral envelope) presents a challenge for anyone analyzing these data. A major problem is extracting the region of interest from the stored sequences, which often contain that region but are not limited to it. In addition, multiple alignment programs generally cannot handle the large numbers of sequences available for many HIV-1 regions. We set out to provide our users with a tool that retrieves the HIV sequences available for a given genomic region and creates an initial alignment of them.
RESULTS: The MPAlign (Multiple Pairwise Alignment) web interface is a collection of Perl scripts that retrieves sequences from the Los Alamos HIV sequence database based on a number of search parameters. All sequences are pairwise-aligned to a model sequence using the Hidden Markov Model-based program HMMER. The HMMER model is general enough to accommodate virtually all HIV-1 sequences stored in the database. To create a multiple sequence alignment, gaps are inserted into the sequences during retrieval so that they are aligned to one another. Retrieving and aligning the almost 560 gp120 sequences (each longer than about 1500 nt) stored in the database is at least 1500 times faster than a similar Clustal alignment.
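The gap-insertion step described in RESULTS can be illustrated with a minimal Python sketch. It is not the MPAlign Perl code and does not call HMMER: it assumes the pairwise alignments against the model already exist as pairs of gapped strings, and the function names (`split_by_model_column`, `merge_pairwise`) are hypothetical. The idea shown is only that sequences aligned to a common model can be aligned to one another by padding each insertion site to the longest insertion seen at that site.

```python
from typing import List, Tuple

GAP = "-"


def split_by_model_column(aligned_model: str, aligned_seq: str,
                          model_len: int) -> Tuple[List[str], List[str]]:
    """Split one pairwise alignment into per-model-column pieces.

    Returns (inserts, matches):
      inserts[i] holds residues inserted before model column i
                 (inserts[model_len] holds trailing insertions);
      matches[i] holds the character aligned to model column i
                 (a residue, or GAP for a deletion).
    """
    if len(aligned_model) != len(aligned_seq):
        raise ValueError("aligned strings must have equal length")
    inserts = [""] * (model_len + 1)
    matches: List[str] = []
    col = 0
    for m, s in zip(aligned_model, aligned_seq):
        if m == GAP:
            # Gap in the model: the sequence has an insertion here.
            if s != GAP:
                inserts[col] += s
        else:
            # Model residue: record what the sequence has in this column.
            matches.append(s)
            col += 1
    if col != model_len:
        raise ValueError("aligned model does not cover the full model")
    return inserts, matches


def merge_pairwise(pairs: List[Tuple[str, str]], model_len: int) -> List[str]:
    """Combine pairwise alignments to one model into a multiple alignment."""
    parsed = [split_by_model_column(m, s, model_len) for m, s in pairs]
    # Widest insertion observed at each insertion site across all sequences.
    widths = [max((len(ins[i]) for ins, _ in parsed), default=0)
              for i in range(model_len + 1)]
    rows = []
    for inserts, matches in parsed:
        parts = []
        for i in range(model_len):
            parts.append(inserts[i].ljust(widths[i], GAP))  # pad insertion site
            parts.append(matches[i])
        parts.append(inserts[model_len].ljust(widths[model_len], GAP))
        rows.append("".join(parts))
    return rows


if __name__ == "__main__":
    # Toy model "ACGT" and three toy pairwise alignments (illustrative data).
    toy_pairs = [("ACGT", "ACGT"), ("AC-GT", "ACTGT"), ("ACGT", "A-GT")]
    for row in merge_pairwise(toy_pairs, model_len=4):
        print(row)
```

Because each sequence is processed against the model only, the work in this sketch grows linearly with the number of sequences, which is consistent with the speed advantage over an all-against-all progressive alignment reported above. Residues inside an insertion site are left-justified rather than aligned to each other, so the result is best treated as an initial alignment.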