
Boosting association rule mining in large datasets via Gibbs sampling.

Guoqi Qian, Calyampudi Radhakrishna Rao, Xiaoying Sun, Yuehua Wu.

Abstract

Current algorithms for association rule mining from transaction data are mostly deterministic and enumerative. They can be computationally intractable even for mining a dataset containing just a few hundred transaction items, if no action is taken to constrain the search space. In this paper, we develop a Gibbs-sampling-induced stochastic search procedure to randomly sample association rules from the itemset space, and perform rule mining from the reduced transaction dataset generated by the sample. Also a general rule importance measure is proposed to direct the stochastic search so that, as a result of the randomly generated association rules constituting an ergodic Markov chain, the overall most important rules in the itemset space can be uncovered from the reduced dataset with probability 1 in the limit. In the simulation study and a real genomic data example, we show how to boost association rule mining by an integrated use of the stochastic search and the Apriori algorithm.
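The stochastic search described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a fixed consequent item, uses rule confidence as a stand-in for the paper's general importance measure, and assumes a target distribution proportional to exp(xi * importance), where `xi` is a hypothetical tuning parameter. The names `importance` and `gibbs_rule_search` are illustrative only.

```python
import math
import random
from collections import Counter


def importance(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent.

    Assumption: confidence stands in for the paper's importance measure.
    Returns 0.0 for an empty antecedent or one that no transaction contains.
    """
    if not antecedent:
        return 0.0
    covered = [t for t in transactions if antecedent <= t]
    if not covered:
        return 0.0
    return sum(1 for t in covered if consequent in t) / len(covered)


def gibbs_rule_search(transactions, items, consequent, xi=10.0, sweeps=200, seed=0):
    """Gibbs-sample antecedent itemsets; return visit counts per itemset.

    Each item has a 0/1 inclusion indicator. One sweep resamples every
    indicator from its conditional distribution given all the others,
    under the assumed target P(J) proportional to exp(xi * importance(J)).
    """
    rng = random.Random(seed)
    state = {item: 0 for item in items}
    visits = Counter()
    for _ in range(sweeps):
        for item in items:
            scores = []
            for value in (0, 1):
                state[item] = value
                antecedent = frozenset(i for i in items if state[i])
                scores.append(xi * importance(antecedent, consequent, transactions))
            # P(item included | rest) = e^s1 / (e^s0 + e^s1)
            p_in = 1.0 / (1.0 + math.exp(scores[0] - scores[1]))
            state[item] = 1 if rng.random() < p_in else 0
        visits[frozenset(i for i in items if state[i])] += 1
    return visits


# Toy transaction data (hypothetical): "a" and "b" together always come with "y".
transactions = [
    frozenset({"a", "b", "y"}),
    frozenset({"a", "b", "y"}),
    frozenset({"a"}),
    frozenset({"b"}),
    frozenset({"c"}),
]
visits = gibbs_rule_search(transactions, ["a", "b", "c"], "y")
for itemset, count in visits.most_common(3):
    print(sorted(itemset), count)
```

Items that appear in frequently visited itemsets could then be used to form a reduced transaction dataset for a deterministic miner such as Apriori, in the spirit of the integrated use the abstract mentions.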

Keywords:  Gibbs sampling; association rule; genomic data; transaction data

Year:  2016        PMID: 27091963      PMCID: PMC4983808          DOI: 10.1073/pnas.1604553113

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


  2 in total

1.  Common genetic variants associated with breast cancer and mammographic density measures that predict disease.

Authors:  Fabrice Odefrey; Jennifer Stone; Lyle C Gurrin; Graham B Byrnes; Carmel Apicella; Gillian S Dite; Jennifer N Cawson; Graham G Giles; Susan A Treloar; Dallas R English; John L Hopper; Melissa C Southey
Journal:  Cancer Res       Date:  2010-02-09       Impact factor: 12.701

2.  Familial risks, early-onset breast cancer, and BRCA1 and BRCA2 germline mutations.

Authors:  Gillian S Dite; Mark A Jenkins; Melissa C Southey; Jane S Hocking; Graham G Giles; Margaret R E McCredie; Deon J Venter; John L Hopper
Journal:  J Natl Cancer Inst       Date:  2003-03-19       Impact factor: 13.506

  3 in total

1.  Triglycerides as Biomarker for Predicting Systemic Lupus Erythematosus Related Kidney Injury of Negative Proteinuria.

Authors:  Mingjun Si; Danyang Li; Ting Liu; Yuanyan Cai; Jingyu Yang; Lili Jiang; Haitao Yu
Journal:  Biomolecules       Date:  2022-07-05

2.  The Impact of the Association between Cancer and Diabetes Mellitus on Mortality.

Authors:  Sung-Soo Kim; Hun-Sung Kim
Journal:  J Pers Med       Date:  2022-07-01

3.  A Novel Risk prediction Model for Patients with Combined Hepatocellular-Cholangiocarcinoma.

Authors:  Meng-Xin Tian; Wen-Jun He; Wei-Ren Liu; Jia-Cheng Yin; Lei Jin; Zheng Tang; Xi-Fei Jiang; Han Wang; Pei-Yun Zhou; Chen-Yang Tao; Zhen-Bin Ding; Yuan-Fei Peng; Zhi Dai; Shuang-Jian Qiu; Jian Zhou; Jia Fan; Ying-Hong Shi
Journal:  J Cancer       Date:  2018-02-28       Impact factor: 4.207

