
HSMotifDiscover: identification of motifs in sequences composed of non-single-letter elements.

Vinod Kumar Singh, Rohan Misra, Steven C Almo, Ulrich G Steidl, Hannes E Bülow, Deyou Zheng.

Abstract

SUMMARY: The functional substrings of a biopolymer sequence, often referred to as motifs, define the specificity of its interactions with other biomolecules. Computational algorithms and software have been widely developed for finding such motifs in sequences whose individual elements are single characters, such as DNA and protein sequences. However, in more complex scenarios the motifs exist in non-single-letter contexts, for example, preferred patterns of chemical modifications on proteins, DNAs, RNAs, or polysaccharides. To search for such motifs, we describe a new method that converts the modified sequence elements to representative single-letter codes and then uses a modified Gibbs-sampling algorithm to define the position-specific scoring matrix (PSSM) representing the motif(s). As a proof of principle, we describe the implementation and application of an R package for discovering heparan sulfate (HS) motifs in glycan sequences, which are important in regulating protein-protein interactions. This software can be valuable for analyzing high-throughput glycoprotein binding data from microarrays of HS oligosaccharides or other biological polymers.
AVAILABILITY AND IMPLEMENTATION: HSMotifDiscover is freely available as an open-source R package released under an MIT license at https://github.com/bioinfoDZ/HSMotifDiscover and is also available as an app at https://hsmotifdiscover.shinyapps.io/HSMotifDiscover_ShinyApp/.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) (2022). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
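
The two steps summarized in the abstract (recoding multi-character elements as single letters, then Gibbs sampling to obtain a PSSM) can be illustrated with a minimal Python sketch. This is not the HSMotifDiscover implementation, which is an R package using a modified Gibbs sampler whose details are not given here. The sketch uses a plain textbook site sampler with one motif occurrence per sequence, and every function name, parameter, separator, and input string below is a hypothetical choice made for illustration.

```python
import math
import random
from collections import Counter


def encode(sequences, sep="-"):
    """Recode multi-character elements as single letters (assumes <= 26 distinct elements)."""
    codes = {}
    encoded = []
    for seq in sequences:
        letters = []
        for token in seq.split(sep):
            if token not in codes:
                codes[token] = chr(ord("A") + len(codes))
            letters.append(codes[token])
        encoded.append("".join(letters))
    return encoded, codes


def build_pssm(sites, alphabet, background, pseudo=0.5):
    """Log2-odds PSSM from equal-length sites: one dict (letter -> score) per column."""
    total = len(sites) + pseudo * len(alphabet)
    pssm = []
    for j in range(len(sites[0])):
        counts = Counter(site[j] for site in sites)
        pssm.append({a: math.log2(((counts[a] + pseudo) / total) / background[a])
                     for a in alphabet})
    return pssm


def score(pssm, window):
    """Sum of per-column log-odds scores for one candidate site."""
    return sum(col[ch] for col, ch in zip(pssm, window))


def gibbs_motif(seqs, width, iterations=200, seed=0):
    """Plain Gibbs site sampler; needs >= 2 sequences, each at least `width` long."""
    rng = random.Random(seed)
    joined = "".join(seqs)
    alphabet = sorted(set(joined))
    counts = Counter(joined)
    background = {a: counts[a] / len(joined) for a in alphabet}
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        for i, s in enumerate(seqs):
            # Build the PSSM from every sequence except the one being resampled.
            others = [t[p:p + width]
                      for k, (t, p) in enumerate(zip(seqs, starts)) if k != i]
            pssm = build_pssm(others, alphabet, background)
            # Sample a new start in proportion to the likelihood ratio 2**score.
            weights = [2 ** score(pssm, s[p:p + width])
                       for p in range(len(s) - width + 1)]
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    sites = [s[p:p + width] for s, p in zip(seqs, starts)]
    return starts, build_pssm(sites, alphabet, background)


if __name__ == "__main__":
    # Toy input invented for this sketch, not data from the paper.
    toy = [
        "GlcNAc-GlcA-GlcNS6S-IdoA2S-GlcNS",
        "GlcNS6S-IdoA2S-GlcNS-GlcA-GlcNAc",
        "GlcA-GlcNAc-GlcA-GlcNS6S-IdoA2S",
    ]
    encoded, codes = encode(toy)
    starts, pssm = gibbs_motif(encoded, width=2)
    print(codes, encoded, starts)
```

Recoding first lets the sampler treat each modified element as one symbol, so the PSSM columns correspond to whole elements rather than to individual characters of their names.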


Year: 2022 | PMID: 35771633 | PMCID: PMC9364371 | DOI: 10.1093/bioinformatics/btac437

Source DB: PubMed | Journal: Bioinformatics | ISSN: 1367-4803 | Impact factor: 6.931


