FastMotif: spectral sequence motif discovery.

Nicoló Colombo, Nikos Vlassis.

Abstract

MOTIVATION: Sequence discovery tools play a central role in several fields of computational biology. In transcription factor binding studies, most existing motif finding algorithms are computationally demanding and may be unable to handle the increasingly large datasets produced by modern high-throughput sequencing technologies.
RESULTS: We present FastMotif, a new motif discovery algorithm built on a recent machine learning technique known as the Method of Moments. Because it is based on spectral decompositions, our method is robust to model misspecification and is not prone to locally optimal solutions. The resulting algorithm is extremely fast and is designed for the analysis of big sequencing data. On HT-SELEX data, FastMotif extracts motif profiles that match those computed by various state-of-the-art algorithms, but runs one order of magnitude faster. We provide a theoretical and numerical analysis of the algorithm's robustness and discuss its sensitivity to the free parameters.
AVAILABILITY AND IMPLEMENTATION: The Matlab code of FastMotif is available from http://lcsb-portal.uni.lu/bioinformatics.
CONTACT: vlassis@adobe.com
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
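
The abstract does not spell out the algorithm, so the following is only a generic illustration of the kind of low-rank moment structure that spectral method-of-moments estimators exploit. It is not the FastMotif procedure. It assumes a hypothetical toy model in which pairs of nucleotides at two positions are drawn from a mixture of two components. All parameter values and the names `sample_pairs` and `cross_moment` are invented for this sketch.

```python
import numpy as np


def sample_pairs(n, weights, probs_i, probs_j, rng):
    """Draw n nucleotide pairs (integer codes 0..3) from a mixture model.

    Each pair first picks a hidden component, then draws the two
    positions independently from that component's distributions.
    """
    comp = rng.choice(len(weights), size=n, p=weights)
    xi = np.empty(n, dtype=int)
    xj = np.empty(n, dtype=int)
    for k in range(len(weights)):
        mask = comp == k
        m = int(mask.sum())
        xi[mask] = rng.choice(4, size=m, p=probs_i[k])
        xj[mask] = rng.choice(4, size=m, p=probs_j[k])
    return xi, xj


def cross_moment(xi, xj):
    """Empirical second-order moment E[e(x_i) e(x_j)^T] of one-hot codes."""
    onehot = np.eye(4)
    return onehot[xi].T @ onehot[xj] / len(xi)


# Hypothetical toy parameters (not taken from the paper).
weights = np.array([0.6, 0.4])
probs_i = np.array([[0.7, 0.1, 0.1, 0.1],
                    [0.1, 0.1, 0.1, 0.7]])
probs_j = np.array([[0.1, 0.7, 0.1, 0.1],
                    [0.1, 0.1, 0.7, 0.1]])

rng = np.random.default_rng(0)
xi, xj = sample_pairs(20000, weights, probs_i, probs_j, rng)

# In the population, this 4x4 matrix equals sum_k w_k * mu_k,i mu_k,j^T,
# so its rank is at most the number of mixture components (here 2).
M = cross_moment(xi, xj)
singular_values = np.linalg.svd(M, compute_uv=False)
print(singular_values)
```

With two components, only the first two singular values should be clearly separated from zero. The rest reflect sampling noise. Estimators of this family recover model parameters from such decompositions in closed form instead of through iterative local search.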

Year: 2015    PMID: 25886979    DOI: 10.1093/bioinformatics/btv208

Source DB: PubMed    Journal: Bioinformatics    ISSN: 1367-4803    Impact factor: 6.937


  3 in total

1.  ProSampler: an ultrafast and accurate motif finder in large ChIP-seq datasets for combinatory motif discovery.

Authors:  Yang Li; Pengyu Ni; Shaoqiang Zhang; Guojun Li; Zhengchang Su
Journal: Bioinformatics    Date: 2019-11-01    Impact factor: 6.937

2.  Fast Moment Estimation for Generalized Latent Dirichlet Models.

Authors:  Shiwen Zhao; Barbara E Engelhardt; Sayan Mukherjee; David B Dunson
Journal: J Am Stat Assoc    Date: 2018-11-13    Impact factor: 4.369

3.  Modular discovery of monomeric and dimeric transcription factor binding motifs for large data sets.

Authors:  Jarkko Toivonen; Teemu Kivioja; Arttu Jolma; Yimeng Yin; Jussi Taipale; Esko Ukkonen
Journal: Nucleic Acids Res    Date: 2018-05-04    Impact factor: 16.971
