Identification and analysis of co-occurrence networks with NetCutter.

Heiko Müller, Francesco Mancuso.

Abstract

BACKGROUND: Co-occurrence analysis is a technique often applied in text mining, comparative genomics, and promoter analysis. The methodologies and statistical models used to evaluate the significance of association between co-occurring entities are quite diverse, however.
METHODOLOGY/PRINCIPAL FINDINGS: We present a general framework for co-occurrence analysis based on a bipartite graph representation of the data, a novel co-occurrence statistic, and software performing co-occurrence analysis as well as generation and analysis of co-occurrence networks. We show that the overall stringency of co-occurrence analysis depends critically on the choice of the null model used to evaluate the significance of co-occurrence, and find that random sampling from a complete permutation set of the bipartite graph permits co-occurrence analysis with optimal stringency. We show that the Poisson-binomial distribution is the most natural co-occurrence probability distribution when vertex degrees of the bipartite graph are variable, which is usually the case. Calculation of Poisson-binomial P-values is difficult, however. Therefore, we propose a fast bi-binomial approximation for calculation of P-values and show that this statistic is superior to other measures of association such as the Jaccard coefficient and the uncertainty coefficient. Furthermore, co-occurrence analysis of more than two entities can be performed using the same statistical model, which leads to increased signal-to-noise ratios, robustness towards noise, and the identification of implicit relationships between co-occurring entities. Using NetCutter, we identify a novel protein biosynthesis-related set of genes that are frequently coordinately deregulated in human cancer-related gene expression studies. NetCutter is available at http://bio.ifom-ieo-campus.it/NetCutter/.
CONCLUSION: Our approach can be applied to any set of categorical data where co-occurrence analysis might reveal functional relationships such as clinical parameters associated with cancer subtypes or SNPs associated with disease phenotypes. The stringency of our approach is expected to offer an advantage in a variety of applications.
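
To illustrate the Poisson-binomial model named in the abstract, the following minimal Python sketch computes an exact upper-tail P-value for the number of co-occurrences when each container (for example, each gene list) has its own co-occurrence probability. It is not NetCutter's implementation and does not reproduce the bi-binomial approximation. The function names and the example probabilities are hypothetical, and the per-container probabilities are assumed to be given and independent.

```python
def poisson_binomial_pmf(probs):
    """Return [P(K=0), ..., P(K=n)] for K = sum of independent Bernoulli(p_j).

    Uses the standard dynamic-programming recursion: after adding one
    trial with success probability p, each count k either stays at k
    (failure) or moves to k + 1 (success).
    """
    pmf = [1.0]  # with zero trials, K = 0 with certainty
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this trial is not a co-occurrence
            nxt[k + 1] += mass * p       # this trial is a co-occurrence
        pmf = nxt
    return pmf


def upper_tail_p_value(probs, observed):
    """P(K >= observed): chance of at least `observed` co-occurrences."""
    return sum(poisson_binomial_pmf(probs)[observed:])


# Hypothetical per-container probabilities that two given entities
# co-occur in the same container; the values are made up for illustration.
example_probs = [0.01, 0.05, 0.20, 0.02, 0.10]
example_p_value = upper_tail_p_value(example_probs, 2)
```

When all probabilities are equal, the result reduces to the ordinary binomial tail. The recursion costs on the order of n² operations for n containers, which is one reason a faster approximation is attractive for large data sets.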

Year:  2008        PMID: 18781200      PMCID: PMC2526157          DOI: 10.1371/journal.pone.0003178

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


