Finding significant matches of position weight matrices in linear time.

Cinzia Pizzi, Pasi Rastas, Esko Ukkonen.

Abstract

Position weight matrices are an important method for modeling signals or motifs in biological sequences, both in DNA and protein contexts. In this paper, we present fast algorithms for the problem of finding significant matches of such matrices. Our algorithms are of the online type, and they generalize classical multipattern matching, filtering, and superalphabet techniques of combinatorial string matching to the problem of weight matrix matching. Several variants of the algorithms are developed, including multiple matrix extensions that perform the search for several matrices in one scan through the sequence database. Experimental performance evaluation is provided to compare the new techniques against each other as well as against some other online and index-based algorithms proposed in the literature. Compared to the brute-force O(mn) approach, our solutions can be faster by a factor that is proportional to the matrix length m. Our multiple-matrix filtration algorithm had the best performance in the experiments. On a current PC, this algorithm finds significant matches (p = 0.0001) of the 123 JASPAR matrices in the human genome in about 18 minutes.
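For orientation, the brute-force O(mn) baseline that the abstract compares against can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `brute_force_pwm_scan`, the toy matrix, and the threshold value are assumptions made for the example. In practice the score threshold would be derived from the chosen p-value (for example p = 0.0001) rather than set by hand.

```python
from typing import Dict, List


def brute_force_pwm_scan(
    seq: str, pwm: List[Dict[str, float]], threshold: float
) -> List[int]:
    """Return start positions where the window score reaches the threshold.

    pwm[j][c] is the score of symbol c at matrix column j (hypothetical
    layout chosen for this sketch). Every symbol in seq must be a key of
    each column, otherwise a KeyError is raised.
    """
    m = len(pwm)
    hits = []
    # Slide a window of length m over the sequence: n - m + 1 windows,
    # each scored in O(m) time, giving O(mn) overall.
    for i in range(len(seq) - m + 1):
        score = sum(pwm[j][seq[i + j]] for j in range(m))
        if score >= threshold:
            hits.append(i)
    return hits


# Toy 3-column matrix that favours the motif "ACG" (illustrative values only).
toy_pwm = [
    {"A": 2.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 2.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 2.0, "T": -1.0},
]

hits = brute_force_pwm_scan("TTACGACGA", toy_pwm, threshold=6.0)
print(hits)  # [2, 5]
```

The inner sum over all m columns at every position is the cost that the multipattern, filtering, and superalphabet techniques described in the abstract aim to reduce.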

Year:  2011        PMID: 21071798     DOI: 10.1109/TCBB.2009.35

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  11 in total

1.  Fast matching of transcription factor motifs using generalized position weight matrix models.

Authors:  Emanuele Giaquinta; Szymon Grabowski; Esko Ukkonen
Journal:  J Comput Biol       Date:  2013-08-06       Impact factor: 1.479

2.  Quorum Sensing Regulators Are Required for Metabolic Fitness in Vibrio parahaemolyticus.

Authors:  Sai Siddarth Kalburge; Megan R Carpenter; Sharon Rozovsky; E Fidelma Boyd
Journal:  Infect Immun       Date:  2017-02-23       Impact factor: 3.441

3.  A protein activity assay to measure global transcription factor activity reveals determinants of chromatin accessibility.

Authors:  Bei Wei; Arttu Jolma; Biswajyoti Sahu; Lukas M Orre; Fan Zhong; Fangjie Zhu; Teemu Kivioja; Inderpreet Sur; Janne Lehtiö; Minna Taipale; Jussi Taipale
Journal:  Nat Biotechnol       Date:  2018-05-21       Impact factor: 54.908

4.  MEPP: more transparent motif enrichment by profiling positional correlations.

Authors:  Nathaniel P Delos Santos; Sascha Duttke; Sven Heinz; Christopher Benner
Journal:  NAR Genom Bioinform       Date:  2022-10-17

5.  MOODS: fast search for position weight matrix matches in DNA sequences.

Authors:  Janne Korhonen; Petri Martinmäki; Cinzia Pizzi; Pasi Rastas; Esko Ukkonen
Journal:  Bioinformatics       Date:  2009-09-22       Impact factor: 6.937

6.  DEEP: a general computational framework for predicting enhancers.

Authors:  Dimitrios Kleftogiannis; Panos Kalnis; Vladimir B Bajic
Journal:  Nucleic Acids Res       Date:  2014-11-05       Impact factor: 16.971

7.  Predicting physiologically relevant SH3 domain mediated protein-protein interactions in yeast.

Authors:  Shobhit Jain; Gary D Bader
Journal:  Bioinformatics       Date:  2016-02-09       Impact factor: 6.937

8.  Short DNA sequence patterns accurately identify broadly active human enhancers.

Authors:  Laura L Colbran; Ling Chen; John A Capra
Journal:  BMC Genomics       Date:  2017-07-17       Impact factor: 3.969

9.  Efficient computation of spaced seed hashing with block indexing.

Authors:  Samuele Girotto; Matteo Comin; Cinzia Pizzi
Journal:  BMC Bioinformatics       Date:  2018-11-30       Impact factor: 3.169

10.  DNA binding specificities of the long zinc-finger recombination protein PRDM9.

Authors:  Timothy Billings; Emil D Parvanov; Christopher L Baker; Michael Walker; Kenneth Paigen; Petko M Petkov
Journal:  Genome Biol       Date:  2013-04-24       Impact factor: 13.583
