DASS: efficient discovery and p-value calculation of substructures in unordered data.

Jens Hollunder, Maik Friedel, Andreas Beyer, Christopher T Workman, Thomas Wilhelm.

Abstract

MOTIVATION: Pattern identification in biological sequence data is one of the main objectives of bioinformatics research. However, few methods are available for detecting patterns (substructures) in unordered datasets. Data mining algorithms mainly developed outside the realm of bioinformatics have been adapted for that purpose, but typically do not determine the statistical significance of the identified patterns. Moreover, these algorithms do not exploit the often modular structure of biological data.
RESULTS: We present the algorithm DASS (Discovery of All Significant Substructures) that first identifies all substructures in unordered data (DASS(Sub)) in a manner that is especially efficient for modular data. In addition, DASS calculates the statistical significance of the identified substructures, for sets with at most one element of each type (DASS(P(set))), or for sets with multiple occurrences of elements (DASS(P(mset))). The power and versatility of DASS are demonstrated by four examples: combinations of protein domains in multi-domain proteins, combinations of proteins in protein complexes (protein subcomplexes), combinations of transcription factor target sites in promoter regions, and evolutionarily conserved protein interaction subnetworks.
AVAILABILITY: The program code and additional data are available at http://www.fli-leibniz.de/tsb/DASS
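
To make the two steps named in the abstract more concrete (enumerating substructures shared by unordered records, then attaching a p-value to each), the following is a minimal Python sketch. It is not the DASS implementation: the function names `closed_substructures` and `binomial_p_value` are hypothetical, substructures are taken here to be the non-empty intersections of records, and the p-value assumes a simple null model in which elements occur independently across records, which may differ from the statistics used by DASS(P(set)).

```python
from math import comb, prod


def closed_substructures(records, min_support=2):
    """Return {substructure: support} for every non-empty intersection of records.

    Assumption: a "substructure" is a set of elements shared by one or more
    records; support is the number of records that contain it.
    """
    sets = [frozenset(r) for r in records]
    found = set(sets)
    frontier = set(sets)
    # Keep intersecting newly found sets with the original records
    # until no new intersection appears.
    while frontier:
        new = set()
        for a in frontier:
            for b in sets:
                c = a & b
                if c and c not in found:
                    new.add(c)
        found |= new
        frontier = new
    support = {s: sum(1 for r in sets if s <= r) for s in found}
    return {s: k for s, k in support.items() if k >= min_support}


def binomial_p_value(substructure, support, records):
    """P(X >= support) under an independence null model (an assumption).

    Each element is assumed to appear in a record independently with its
    observed frequency; the substructure then appears in a record with the
    product of those frequencies, and X is binomial over all records.
    """
    n = len(records)
    p = prod(sum(1 for r in records if e in r) / n for e in substructure)
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(support, n + 1))


# Toy data: each record is an unordered set, e.g. the domains of one protein.
records = [{"A", "B", "C"}, {"A", "B"}, {"A", "B", "D"}, {"C", "D"}]
subs = closed_substructures(records)
for sub, k in sorted(subs.items(), key=lambda item: sorted(item[0])):
    print(sorted(sub), k, binomial_p_value(sub, k, records))
```

The closure over pairwise intersections is chosen for brevity; it scales poorly with the number of records and does not exploit modularity, which is the efficiency property the abstract attributes to DASS(Sub).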

Year:  2006        PMID: 17032678     DOI: 10.1093/bioinformatics/btl511

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  5 in total

1.  DASS-GUI: a user interface for identification and analysis of significant patterns in non-sequential data.

Authors:  Jens Hollunder; Maik Friedel; Martin Kuiper; Thomas Wilhelm
Journal:  Bioinformatics       Date:  2010-02-19       Impact factor: 6.937

2.  Identifying the topology of protein complexes from affinity purification assays.

Authors:  Caroline C Friedel; Ralf Zimmer
Journal:  Bioinformatics       Date:  2009-06-08       Impact factor: 6.937

3.  One Hand Clapping: detection of condition-specific transcription factor interactions from genome-wide gene activity data.

Authors:  Sebastian Dümcke; Martin Seizl; Stefanie Etzold; Nicole Pirkl; Dietmar E Martin; Patrick Cramer; Achim Tresch
Journal:  Nucleic Acids Res       Date:  2012-07-25       Impact factor: 16.971

4.  On the detection of functionally coherent groups of protein domains with an extension to protein annotation.

Authors:  William A McLaughlin; Ken Chen; Tingjun Hou; Wei Wang
Journal:  BMC Bioinformatics       Date:  2007-10-16       Impact factor: 3.169

5.  High-resolution analysis of condition-specific regulatory modules in Saccharomyces cerevisiae.

Authors:  Hun-Goo Lee; Hyo-Soo Lee; Sang-Hoon Jeon; Tae-Hoon Chung; Young-Sung Lim; Won-Ki Huh
Journal:  Genome Biol       Date:  2008-01-03       Impact factor: 13.583
