A statistical methodology for analyzing co-occurrence data from a large sample.

Hui Cao, George Hripcsak, Marianthi Markatou.

Abstract

Determining important associations among items in a large database is challenging because many hypotheses are tested simultaneously and because it is easy to select weak associations that are statistically but not clinically significant. Simply applying the χ² test to all possible pairs of items results in mostly inappropriate associations surpassing the traditional threshold (α = .05, χ² = 3.84). One can choose a stricter threshold to find stronger associations, but the choice may be arbitrary. We combined the volume test of Diaconis and Efron with a p-value plot to select a more rigorous and less arbitrary threshold. The volume test adjusts the p-value of the χ² statistic. A plot of the adjusted p-values (1 - p versus N(p)), where N(p) is the number of test statistics with a p-value greater than p, should be linear if there are no true associations. The point where the plot deviates from a line can be used as a threshold. We used linear regression to select the threshold in a reproducible fashion. In one experiment, the method selected a threshold similar to one previously obtained by manually reviewing associations.
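
The p-value plot and regression step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the volume-test adjustment of Diaconis and Efron is omitted (plain 1-degree-of-freedom chi-square p-values stand in for the adjusted ones), and the function names, the `fit_below` baseline region, and the `tolerance` deviation rule are all assumptions made for illustration.

```python
import math


def chi2_pvalue_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def p_value_plot(pvalues):
    """Return (p, 1 - p, N(p)) triples, ordered from the largest p downward.

    N(p) is the number of p-values greater than p; ties are ignored here.
    """
    ordered = sorted(pvalues, reverse=True)
    return [(p, 1.0 - p, rank) for rank, p in enumerate(ordered)]


def fit_line(points):
    """Ordinary least-squares fit y = a + b * x; returns (a, b).

    Needs at least two points with distinct x values.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def select_threshold(pvalues, fit_below=0.5, tolerance=5.0):
    """Return the first p-value at which the plot leaves the fitted line.

    Assumed rule (not taken from the paper): fit the line on points with
    1 - p <= fit_below, where true associations should be rare, then scan
    toward small p until N(p) exceeds the line by more than `tolerance`.
    Returns None if no point deviates.
    """
    points = p_value_plot(pvalues)
    baseline = [(x, n) for _, x, n in points if x <= fit_below]
    intercept, slope = fit_line(baseline)
    for p, x, n in points:
        if x > fit_below and n - (intercept + slope * x) > tolerance:
            return p
    return None


# Synthetic illustration: evenly spread "null" p-values plus a cluster of
# very small p-values standing in for true associations.
nulls = [(k + 0.5) / 1000 for k in range(1000)]
signals = [1e-6 * (j + 1) for j in range(50)]
threshold = select_threshold(nulls + signals)
```

With purely null p-values the points stay on the fitted line and the function returns `None`; adding the cluster of small p-values lifts the right-hand end of the plot above the line, and the p-value at which that happens is reported as the candidate threshold. On real data the `tolerance` would need to reflect the sampling noise in N(p).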

Year:  2006        PMID: 17197246      PMCID: PMC2041889          DOI: 10.1016/j.jbi.2006.11.003

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  1 in total

1.  Mining a clinical data warehouse to discover disease-finding associations using co-occurrence statistics.

Authors:  Hui Cao; Marianthi Markatou; Genevieve B Melton; Michael F Chiang; George Hripcsak
Journal:  AMIA Annu Symp Proc       Date:  2005
  19 in total

1.  Detection of practice pattern trends through Natural Language Processing of clinical narratives and biomedical literature.

Authors:  Elizabeth S Chen; Peter D Stetson; Yves A Lussier; Marianthi Markatou; George Hripcsak; Carol Friedman
Journal:  AMIA Annu Symp Proc       Date:  2007-10-11

2.  Active computerized pharmacovigilance using natural language processing, statistics, and electronic health records: a feasibility study.

Authors:  Xiaoyan Wang; George Hripcsak; Marianthi Markatou; Carol Friedman
Journal:  J Am Med Inform Assoc       Date:  2009-03-04       Impact factor: 4.497

3.  Automated knowledge acquisition from clinical narrative reports.

Authors:  Xiaoyan Wang; Amy Chused; Noémie Elhadad; Carol Friedman; Marianthi Markatou
Journal:  AMIA Annu Symp Proc       Date:  2008-11-06

4.  Large datasets in biomedicine: a discussion of salient analytic issues.

Authors:  Anshu Sinha; George Hripcsak; Marianthi Markatou
Journal:  J Am Med Inform Assoc       Date:  2009-08-28       Impact factor: 4.497

5.  A flexible framework for deriving assertions from electronic medical records.

Authors:  Kirk Roberts; Sanda M Harabagiu
Journal:  J Am Med Inform Assoc       Date:  2011-07-01       Impact factor: 4.497

6.  PubMedMiner: Mining and Visualizing MeSH-based Associations in PubMed.

Authors:  Yucan Zhang; Indra Neil Sarkar; Elizabeth S Chen
Journal:  AMIA Annu Symp Proc       Date:  2014-11-14

7.  Exploring generalized association rule mining for disease co-occurrences.

Authors:  Rhonda Kost; Benjamin Littenberg; Elizabeth S Chen
Journal:  AMIA Annu Symp Proc       Date:  2012-11-03

8.  Using classification models for the generation of disease-specific medications from biomedical literature and clinical data repository.

Authors:  Liqin Wang; Peter J Haug; Guilherme Del Fiol
Journal:  J Biomed Inform       Date:  2017-04-20       Impact factor: 6.317

9.  Exploring clinical associations using '-omics' based enrichment analyses.

Authors:  David A Hanauer; Daniel R Rhodes; Arul M Chinnaiyan
Journal:  PLoS One       Date:  2009-04-13       Impact factor: 3.240

10.  Characterizing environmental and phenotypic associations using information theory and electronic health records.

Authors:  Xiaoyan Wang; George Hripcsak; Carol Friedman
Journal:  BMC Bioinformatics       Date:  2009-09-17       Impact factor: 3.169
