
NGlyAlign: an automated library building tool to align highly divergent HIV envelope sequences.

Elma H Akand, John M Murray.

Abstract

BACKGROUND: The high variability in envelope regions of some viruses such as HIV allows the virus to establish infection and to escape subsequent immune surveillance. This variability, as well as increasing incorporation of N-linked glycosylation sites, is fundamental to this evasion. It also creates difficulties for multiple sequence alignment (MSA) methods, which provide the first step in analyzing these sequences. Existing MSA tools often fail to properly align highly variable HIV envelope sequences, requiring extensive manual editing that is impractical with even a moderate number of these variable sequences.
RESULTS: We developed an automated library-building tool, NGlyAlign, that organizes similar N-linked glycosylation sites as block constraints and statistically conserved global sites as single-site constraints to automatically enforce partial columns in consistency-based MSA methods such as Dialign. This combined method accurately aligns variable HIV-1 envelope sequences. We tested the method on two datasets: a set of 156 founder and chronic gp160 HIV-1 subtype B sequences, and a set of gp120 reference sequences in the highly variable region 1. On measures such as entropy scores, sum-of-pairs scores, column scores, and similarity heat maps, NGlyAlign+Dialign proved superior to methods such as T-Coffee, ClustalOmega, ClustalW, Praline, HIValign and Muscle. The method scales to large sequence sets, producing accurate alignments without manual editing. Beyond HIV, the method can be applied to other highly variable glycoproteins such as the hepatitis C virus envelope.
CONCLUSIONS: NGlyAlign is an automated tool for mapping and building glycosylation motif libraries to accurately align highly variable regions in HIV sequences. It can provide the basis for many studies reliant on single robust alignments. NGlyAlign has been developed as an open-source tool and is freely available at https://github.com/UNSW-Mathematical-Biology/NGlyAlign_v1.0 .
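
ILLUSTRATION: Two ingredients named in the abstract can be sketched in a few lines of Python: locating N-linked glycosylation sequons (the standard N-X-S/T motif, where X is any residue except proline) and scoring an alignment column by Shannon entropy. This is a minimal sketch and not NGlyAlign's own code; the function names `find_sequons` and `column_entropy` are hypothetical, the input is assumed to be an ungapped amino acid string, and counting gaps as an ordinary symbol in the entropy is an assumption rather than the authors' stated convention.

```python
import math
import re
from collections import Counter

# N-X-[ST] with X != P. The lookahead lets overlapping sequons
# (e.g. "NNTS") each be reported. Assumes ungapped input.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(seq):
    """Return 0-based start positions of N-X-S/T sequons in seq."""
    return [m.start() for m in SEQUON.finditer(seq.upper())]


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column.

    Every character, including a gap, is treated as a symbol
    (an assumption made for this sketch).
    """
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Positions returned by `find_sequons` are the kind of site that could be grouped into block constraints for an anchored aligner, while a lower `column_entropy` indicates a more conserved column.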

Keywords:  Anchored alignment; Glycosylation; HIV; Sequence alignment

Year: 2021; PMID: 33557755; PMCID: PMC7869453; DOI: 10.1186/s12859-020-03901-y

Source DB: PubMed; Journal: BMC Bioinformatics; ISSN: 1471-2105; Impact factor: 3.169


