
SPLASH: structural pattern localization analysis by sequential histograms.

A Califano

Abstract

MOTIVATION: The discovery of sparse amino acid patterns that match repeatedly in a set of protein sequences is an important problem in computational biology. Statistically significant patterns, that is, patterns that occur more frequently than expected, may identify regions that have been preserved by evolution and that may therefore play a key functional or structural role. Sparseness can be important because a handful of non-contiguous residues may play a key role, while the residues in between may be changed without significant loss of function or structure. Similar arguments may be applied to conserved DNA patterns. Available sparse pattern discovery algorithms are either inefficient or impose limitations on the type of patterns that can be discovered.
RESULTS: This paper introduces a deterministic pattern discovery algorithm, called Splash, which can find sparse amino or nucleic acid patterns matching identically or similarly in a set of protein or DNA sequences. Sparse patterns of any length, up to the size of the input sequence, can be discovered without significant loss in performance. Splash is extremely efficient and embarrassingly parallel by nature. Large databases, such as a complete genome or the non-redundant SWISS-PROT database, can be processed in a few hours on a typical workstation. Alternatively, a protein family or superfamily with low overall homology can be analyzed to discover common functional or structural signatures. Some examples of biologically interesting motifs discovered by Splash are reported for the histone I and for the G-Protein Coupled Receptor families. Due to its efficiency, Splash can be used to systematically and exhaustively identify conserved regions in protein family sets. These can then be used to build accurate and sensitive PSSM or HMM models for sequence analysis.
AVAILABILITY: Splash is available to non-commercial research centers upon request, conditional on the signing of a test field agreement.
CONTACT: acal@us.ibm.com, Splash main page http://www.research.ibm.com/splash
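
To make the notion of a sparse pattern concrete, the following minimal sketch checks where a pattern with "don't care" positions matches identically in a small set of sequences and counts how many sequences contain it. It is a brute-force illustration only and is not the Splash algorithm, whose internals the abstract does not describe. The `.` wildcard notation, the function names, and the toy sequences are assumptions made for this example.

```python
from typing import List, Tuple

WILDCARD = "."  # assumed notation for a position where any residue is allowed


def matches_at(pattern: str, sequence: str, offset: int) -> bool:
    """True if every non-wildcard pattern position equals the residue
    at the corresponding position of the sequence, starting at offset."""
    if offset + len(pattern) > len(sequence):
        return False
    return all(
        p == WILDCARD or p == sequence[offset + i]
        for i, p in enumerate(pattern)
    )


def find_occurrences(pattern: str, sequences: List[str]) -> List[Tuple[int, int]]:
    """Return (sequence index, offset) for every identical match."""
    return [
        (seq_idx, offset)
        for seq_idx, seq in enumerate(sequences)
        for offset in range(len(seq) - len(pattern) + 1)
        if matches_at(pattern, seq, offset)
    ]


def support(pattern: str, sequences: List[str]) -> int:
    """Number of distinct sequences that contain the pattern at least once."""
    return len({seq_idx for seq_idx, _ in find_occurrences(pattern, sequences)})


# Invented toy sequences; three of them share the sparse pattern C..C.H.
toy_sequences = ["MKCAACLHQ", "TTCGSCWHA", "CPPCNHGGG", "MKLVVAAGG"]
hits = find_occurrences("C..C.H", toy_sequences)
print(hits)
print(support("C..C.H", toy_sequences))
```

Only three of the six pattern positions are fixed here, so the residues in between can differ across sequences while the pattern still matches. A discovery algorithm has the harder task of finding such patterns without being given them in advance.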

Year:  2000        PMID: 10869032     DOI: 10.1093/bioinformatics/16.4.341

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  23 in total

1.  Discovering patterns in microarray data. (Review)

Authors:  H B Burke
Journal:  Mol Diagn       Date:  2000-12

2.  Serum proteome profiling detects myelodysplastic syndromes and identifies CXC chemokine ligands 4 and 7 as markers for advanced disease.

Authors:  Manuel Aivado; Dimitrios Spentzos; Ulrich Germing; Gil Alterovitz; Xiao-Ying Meng; Franck Grall; Aristoteles A N Giagounidis; Giannoula Klement; Ulrich Steidl; Hasan H Otu; Akos Czibere; Wolf C Prall; Christof Iking-Konert; Michelle Shayne; Marco F Ramoni; Norbert Gattermann; Rainer Haas; Constantine S Mitsiades; Eric T Fung; Towia A Libermann
Journal:  Proc Natl Acad Sci U S A       Date:  2007-01-12       Impact factor: 11.205

3.  Discovering transcriptional regulatory regions in Drosophila by a nonalignment method for phylogenetic footprinting.

Authors:  Alona Sosinsky; Barry Honig; Richard S Mann; Andrea Califano
Journal:  Proc Natl Acad Sci U S A       Date:  2007-03-29       Impact factor: 11.205

4.  Gene expression profiling of B cell chronic lymphocytic leukemia reveals a homogeneous phenotype related to memory B cells.

Authors:  U Klein; Y Tu; G A Stolovitzky; M Mattioli; G Cattoretti; H Husson; A Freedman; G Inghirami; L Cro; L Baldini; A Neri; A Califano; R Dalla-Favera
Journal:  J Exp Med       Date:  2001-12-03       Impact factor: 14.307

5.  Gene expression analysis of peripheral T cell lymphoma, unspecified, reveals distinct profiles and new potential therapeutic targets.

Authors:  Pier Paolo Piccaluga; Claudio Agostinelli; Andrea Califano; Maura Rossi; Katia Basso; Simonetta Zupo; Philip Went; Ulf Klein; Pier Luigi Zinzani; Michele Baccarani; Riccardo Dalla Favera; Stefano A Pileri
Journal:  J Clin Invest       Date:  2007-02-15       Impact factor: 14.808

6.  Detection and preliminary analysis of motifs in promoters of anaerobically induced genes of different plant species.

Authors:  Bijayalaxmi Mohanty; S P T Krishnan; Sanjay Swarup; Vladimir B Bajic
Journal:  Ann Bot       Date:  2005-07-18       Impact factor: 4.357

7.  Gene expression profiling suggests primary central nervous system lymphomas to be derived from a late germinal center B cell.

Authors:  M Montesinos-Rongen; A Brunn; S Bentink; K Basso; W K Lim; W Klapper; C Schaller; G Reifenberger; J Rubenstein; O D Wiestler; R Spang; R Dalla-Favera; R Siebert; M Deckert
Journal:  Leukemia       Date:  2007-11-08       Impact factor: 11.528

8.  Iron-dependent regulation of MDM2 influences p53 activity and hepatic carcinogenesis.

Authors:  Paola Dongiovanni; Anna Ludovica Fracanzani; Gaetano Cairo; Chiara Paola Megazzini; Stefano Gatti; Raffaela Rametta; Silvia Fargion; Luca Valenti
Journal:  Am J Pathol       Date:  2009-12-17       Impact factor: 4.307

9.  A systems biology approach to transcription factor binding site prediction.

Authors:  Xiang Zhou; Pavel Sumazin; Presha Rajbhandari; Andrea Califano
Journal:  PLoS One       Date:  2010-03-26       Impact factor: 3.240

10.  Gene expression in Wilms' tumor mimics the earliest committed stage in the metanephric mesenchymal-epithelial transition.

Authors:  Chi-Ming Li; Meirong Guo; Alain Borczuk; Charles A Powell; Michelle Wei; Harshwardhan M Thaker; Richard Friedman; Ulf Klein; Benjamin Tycko
Journal:  Am J Pathol       Date:  2002-06       Impact factor: 4.307
