An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences.

Abstract

Statistical methodology for the identification and characterization of protein binding sites in a set of unaligned DNA fragments is presented. Each sequence must contain at least one common site. No alignment of the sites is required. Instead, the uncertainty in the location of the sites is handled by employing the missing information principle to develop an "expectation maximization" (EM) algorithm. This approach allows for the simultaneous identification of the sites and characterization of the binding motifs. The reliability of the algorithm increases with the number of fragments, but the computations increase only linearly. The method is illustrated with an example, using known cyclic adenosine monophosphate receptor protein (CRP) binding sites. The final motif is utilized in a search for undiscovered CRP binding sites.
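The idea of treating the unknown site positions as missing data can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes exactly one site per sequence (the abstract only requires at least one), a fixed site width, a DNA alphabet of A, C, G, T, and a background composition estimated once from all positions. The function name `em_motif` and its parameters are hypothetical.

```python
import numpy as np

ALPHABET = "ACGT"
IDX = {c: i for i, c in enumerate(ALPHABET)}

def em_motif(seqs, width, n_iter=50, pseudo=0.5, seed=0):
    """Illustrative EM for one motif occurrence per sequence (an assumption)."""
    rng = np.random.default_rng(seed)
    X = [np.array([IDX[c] for c in s]) for s in seqs]

    # Simplification: background frequencies from overall base composition.
    counts = np.bincount(np.concatenate(X), minlength=4) + pseudo
    bg = counts / counts.sum()

    # Random starting motif: one probability row per motif position.
    pwm = rng.dirichlet(np.ones(4), size=width)

    for _ in range(n_iter):
        new_counts = np.full((width, 4), pseudo)
        posteriors = []
        for x in X:
            # All candidate sites, shape (n_starts, width).
            windows = np.lib.stride_tricks.sliding_window_view(x, width)

            # E-step: posterior probability that the site starts at each offset,
            # from the log-likelihood ratio of motif versus background.
            log_ratio = (np.log(pwm[np.arange(width), windows])
                         - np.log(bg[windows])).sum(axis=1)
            z = np.exp(log_ratio - log_ratio.max())
            z /= z.sum()
            posteriors.append(z)

            # M-step accumulation: expected base counts per motif position.
            for j in range(width):
                new_counts[j] += np.bincount(windows[:, j], weights=z, minlength=4)

        # M-step: renormalize expected counts into probabilities.
        pwm = new_counts / new_counts.sum(axis=1, keepdims=True)

    return pwm, posteriors
```

Under these assumptions, `np.argmax` of each returned posterior vector gives the most probable site start in that sequence, and the rows of `pwm` describe the estimated motif. Each iteration visits every candidate window once, so the work per iteration grows linearly with the total sequence length, in line with the abstract's remark on computational cost. EM can converge to a local optimum, so several random starts (different `seed` values) are usually worth trying.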

Year: 1990 PMID: 2184437 DOI: 10.1002/prot.340070105

Source DB: PubMed Journal: Proteins ISSN: 0887-3585

Cited by

92 in total

1. Discovering regulatory elements in non-coding sequences by analysis of spaced dyads.

Authors: J van Helden; A F Rios; J Collado-Vides
Journal: Nucleic Acids Res Date: 2000-04-15 Impact factor: 16.971

2. In silico identification of metazoan transcriptional regulatory regions. (Review)

Authors: Wyeth W Wasserman; William Krivan
Journal: Naturwissenschaften Date: 2003-03-27

3. Additivity in protein-DNA interactions: how good an approximation is it?

Authors: Panayiotis V Benos; Martha L Bulyk; Gary D Stormo
Journal: Nucleic Acids Res Date: 2002-10-15 Impact factor: 16.971

4. Finding important sites in protein sequences.

Authors: Peter J Bickel; Katherina J Kechris; Philip C Spector; Gary J Wedemayer; Alexander N Glazer
Journal: Proc Natl Acad Sci U S A Date: 2002-11-04 Impact factor: 11.205

5. Statistical significance of clusters of motifs represented by position specific scoring matrices in nucleotide sequences.

Authors: Martin C Frith; John L Spouge; Ulla Hansen; Zhiping Weng
Journal: Nucleic Acids Res Date: 2002-07-15 Impact factor: 16.971

6. Gibbs Recursive Sampler: finding transcription factor binding sites.

Authors: William Thompson; Eric C Rouchka; Charles E Lawrence
Journal: Nucleic Acids Res Date: 2003-07-01 Impact factor: 16.971

7. Computational approaches to identify promoters and cis-regulatory elements in plant genomes. (Review)

Authors: Stephane Rombauts; Kobe Florquin; Magali Lescot; Kathleen Marchal; Pierre Rouzé; Yves van de Peer
Journal: Plant Physiol Date: 2003-07 Impact factor: 8.340

8. Characterization of a new tissue-specific transcription factor binding to the simian virus 40 enhancer TC-II (NF-kappa B) element.

Authors: A L Lattion; E Espel; P Reichenbach; C Fromental; P Bucher; A Israël; P Baeuerle; N R Rice; M Nabholz
Journal: Mol Cell Biol Date: 1992-11 Impact factor: 4.272

9. MotifPrototyper: a Bayesian profile model for motif families.

Authors: Eric P Xing; Richard M Karp
Journal: Proc Natl Acad Sci U S A Date: 2004-07-13 Impact factor: 11.205

10. Discovery of sequence motifs related to coexpression of genes using evolutionary computation.

Authors: Gary B Fogel; Dana G Weekes; Gabor Varga; Ernst R Dow; Harry B Harlow; Jude E Onyia; Chen Su
Journal: Nucleic Acids Res Date: 2004-07-20 Impact factor: 16.971