
Fitting a mixture model by expectation maximization to discover motifs in biopolymers.

T L Bailey, C Elkan.

Abstract

The algorithm described in this paper discovers one or more motifs in a collection of DNA or protein sequences by using the technique of expectation maximization to fit a two-component finite mixture model to the set of sequences. Multiple motifs are found by fitting a mixture model to the data, probabilistically erasing the occurrences of the motif thus found, and repeating the process to find successive motifs. The algorithm requires only a set of unaligned sequences and a number specifying the width of the motifs as input. It returns a model of each motif and a threshold which together can be used as a Bayes-optimal classifier for searching for occurrences of the motif in other databases. The algorithm estimates how many times each motif occurs in each sequence in the dataset and outputs an alignment of the occurrences of the motif. The algorithm is capable of discovering several different motifs with differing numbers of occurrences in a single dataset.
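The core fitting step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a DNA alphabet, treats every overlapping window of the given width as an independent sample, starts from a randomly chosen window, and leaves out the probabilistic erasing and the search for successive motifs. The names `windows`, `fit_two_component_mixture`, `pseudo`, and `lam` are hypothetical and chosen only for this illustration.

```python
import math
import random

ALPHABET = "ACGT"  # assumption: DNA input; a protein alphabet would work the same way


def windows(seqs, w):
    """Return all overlapping width-w substrings of the input sequences."""
    return [s[i:i + w] for s in seqs for i in range(len(s) - w + 1)]


def fit_two_component_mixture(seqs, w, n_iter=50, pseudo=0.1, seed=0):
    """EM for a motif/background mixture over width-w windows (illustrative sketch).

    Returns (motif, background, lam, z): motif is a list of w dicts of letter
    probabilities, background is one such dict, lam is the mixing weight of
    the motif component, and z holds the per-window motif posteriors.
    """
    rng = random.Random(seed)
    xs = windows(seqs, w)
    # Assumed starting point: one random window, smoothed toward uniform.
    start = rng.choice(xs)
    motif = [{a: (0.7 if a == c else 0.1) for a in ALPHABET} for c in start]
    background = {a: 0.25 for a in ALPHABET}
    lam = 0.1
    z = []
    for _ in range(n_iter):
        # E-step: posterior probability that each window came from the motif.
        z = []
        for x in xs:
            p_m = lam * math.prod(motif[j][c] for j, c in enumerate(x))
            p_b = (1 - lam) * math.prod(background[c] for c in x)
            z.append(p_m / (p_m + p_b))
        # M-step: re-estimate all parameters from expected counts.
        lam = sum(z) / len(z)
        m_counts = [{a: pseudo for a in ALPHABET} for _ in range(w)]
        b_counts = {a: pseudo for a in ALPHABET}
        for x, zi in zip(xs, z):
            for j, c in enumerate(x):
                m_counts[j][c] += zi
                b_counts[c] += 1 - zi
        motif = [{a: col[a] / sum(col.values()) for a in ALPHABET}
                 for col in m_counts]
        total = sum(b_counts.values())
        background = {a: b_counts[a] / total for a in ALPHABET}
    return motif, background, lam, z
```

Under these assumptions, a window would be called a motif occurrence when its motif-to-background likelihood ratio exceeds (1 - lam) / lam, which is the usual Bayes decision rule for a two-component mixture. Windows with high posteriors in `z` give a rough alignment of the occurrences.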

Year:  1994        PMID: 7584402

Source DB:  PubMed          Journal:  Proc Int Conf Intell Syst Mol Biol        ISSN: 1553-0833



1.  Cloning, characterization and mapping of the human ATP5E gene, identification of pseudogene ATP5EP1, and definition of the ATP5E motif.

Authors:  Q Tu; L Yu; P Zhang; M Zhang; H Zhang; J Jiang; C Chen; S Zhao
Journal:  Biochem J       Date:  2000-04-01       Impact factor: 3.857

2.  Relationships within the aldehyde dehydrogenase extended family.

Authors:  J Perozich; H Nicholas; B C Wang; R Lindahl; J Hempel
Journal:  Protein Sci       Date:  1999-01       Impact factor: 6.725

3.  Cloning and mapping of human PKIB and PKIG, and comparison of tissue expression patterns of three members of the protein kinase inhibitor family, including PKIA.

Authors:  L Zheng; L Yu; Q Tu; M Zhang; H He; W Chen; J Gao; J Yu; Q Wu; S Zhao
Journal:  Biochem J       Date:  2000-07-15       Impact factor: 3.857

4.  EBP2 is a member of the yeast RRB regulon, a transcriptionally coregulated set of genes that are required for ribosome and rRNA biosynthesis.

Authors:  C Wade; K A Shea; R V Jensen; M A McAlear
Journal:  Mol Cell Biol       Date:  2001-12       Impact factor: 4.272

5.  Exploiting transcription factor binding site clustering to identify cis-regulatory modules involved in pattern formation in the Drosophila genome.

Authors:  Benjamin P Berman; Yutaka Nibu; Barret D Pfeiffer; Pavel Tomancak; Susan E Celniker; Michael Levine; Gerald M Rubin; Michael B Eisen
Journal:  Proc Natl Acad Sci U S A       Date:  2002-01-22       Impact factor: 11.205

6.  Global regulation by gidA in Pseudomonas syringae.

Authors:  Thomas G Kinscherf; David K Willis
Journal:  J Bacteriol       Date:  2002-04       Impact factor: 3.490

7.  The evolution of DNA regulatory regions for proteo-gamma bacteria by interspecies comparisons.

Authors:  Nikolaus Rajewsky; Nicholas D Socci; Martin Zapotocky; Eric D Siggia
Journal:  Genome Res       Date:  2002-02       Impact factor: 9.043

8.  ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites.

Authors:  O Emanuelsson; H Nielsen; G von Heijne
Journal:  Protein Sci       Date:  1999-05       Impact factor: 6.725

9.  Additivity in protein-DNA interactions: how good an approximation is it?

Authors:  Panayiotis V Benos; Martha L Bulyk; Gary D Stormo
Journal:  Nucleic Acids Res       Date:  2002-10-15       Impact factor: 16.971

10.  The let-7-Imp axis regulates ageing of the Drosophila testis stem-cell niche.

Authors:  Hila Toledano; Cecilia D'Alterio; Benjamin Czech; Erel Levine; D Leanne Jones
Journal:  Nature       Date:  2012-05-23       Impact factor: 49.962
