Diagnostic analysis of patients with essential hypertension using association rule mining.

A Mi Shin, In Hee Lee, Gyeong Ho Lee, Hee Joon Park, Hyung Seop Park, Kyung Il Yoon, Jung Jeung Lee, Yoon Nyun Kim.

Abstract

OBJECTIVES: The purpose of this study was to analyze the records of patients diagnosed with essential hypertension using association rule mining (ARM).
METHODS: Records of patients with essential hypertension (ICD code, I10) were extracted from a hospital's data warehouse, and a data mart was constructed for analysis. Apriori modeling, an ARM method, and the web node in the Clementine 12.0 program were used to analyze the patient data.
RESULTS: A total of 5,022 patients were diagnosed with essential hypertension, and 53,994 diagnostic records were extracted for those patients. The web node showed that essential hypertension, non-insulin dependent diabetes mellitus (NIDDM), and cerebral infarction were associated. Based on the results of ARM, NIDDM (support, 35.15%; confidence, 100%) and cerebral infarction (support, 21.21%; confidence, 100%) were determined to be important diseases associated with essential hypertension.
CONCLUSIONS: Essential hypertension was strongly associated with NIDDM and cerebral infarction. This study demonstrated the practicality of ARM in co-morbidity studies using a large clinic database.

Keywords:  Data Mining; Diagnosis; Hypertension

Year:  2010        PMID: 21818427      PMCID: PMC3089860          DOI: 10.4258/hir.2010.16.2.77

Source DB:  PubMed          Journal:  Healthc Inform Res        ISSN: 2093-3681


I. Introduction

Cardiovascular and cerebrovascular diseases, along with cancer, are the three major causes of death. The mortality rate of diseases of the circulatory system is 117.2 per 10,000. Among diseases of the circulatory system, the mortality rates per 10,000 are, in descending order, cerebrovascular disease (59.6), cardiovascular disease (43.7), and hypertensive disease (11.0) [1]. Moreover, hypertension has the highest prevalence among diseases of the circulatory system. However, over one-half of patients with hypertension are not aware of their disease, and even those who are diagnosed often do not comply with the recommended management. Indeed, after being diagnosed with hypertension, only approximately 20% of patients continue the recommended treatment as prescribed, and over 65% discontinue treatment against medical advice [2-4]. The clinical importance of hypertension lies less in the condition itself than in its co-morbidities, such as stroke, myocardial infarction, congestive heart failure, and peripheral vascular disease. Most importantly, hypertension contributes to the occurrence of cerebrovascular disease (35%) and ischemic heart disease (21%) [5]. Various conventional studies have shown that hypertension is related to other diseases, such as cerebrovascular and cardiovascular diseases. However, studies demonstrating associations among the co-morbidities of hypertension have not been reported. Therefore, in this study, we determined the relationships among co-morbidities of hypertension based on association rule mining (ARM). ARM is a powerful method for analyzing associations among three or more co-morbidities for the following reasons: 1) ARM can handle relationships among several items, and 2) the confidence value can be used in arithmetic operations [6].

II. Methods

1. Subject of Investigation

In this study, data from inpatients over 18 years of age with essential hypertension at A hospital in D city were used. The data were collected from electronic medical records covering the period from May 2005 to December 2007.

2. The Process of Study and Data Collection

The ARM-based process used to analyze patients diagnosed with essential hypertension is shown in Figure 1. From the data warehouse (D/W), we collected the diagnostic data of patients with essential hypertension, classified as I10 according to the International Classification of Diseases (ICD) and the Korean Classification of Diseases (KCD). Personal information, such as name, resident registration number, and telephone number, was removed from the data.
Figure 1

The analysis process.

3. Constructing Data Mart for Patients with Essential Hypertension

A total of 5,022 patients were diagnosed with essential hypertension, and the diagnostic data totaled 53,994 records. Because repeated diagnoses of the same disease in one patient would inflate the support for that disease, we removed duplicated data by comparing the registration number and diagnosis code. Diagnoses related to external factors, such as injury, poisoning, and certain other consequences of external causes (S00-T98), external causes of morbidity and mortality (V01-Y98), factors influencing health status and contact with health services (Z00-Z99), and codes for special purposes (U00-U99), were also removed from the data mart. The resulting data mart of 26,823 cases was used for correlation analysis.
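The two cleaning steps described above (dropping repeated patient-diagnosis pairs and excluding the external-cause and special-purpose chapters) can be sketched as follows. This is a minimal illustration, not the authors' procedure: the record layout `(patient_id, icd_code)` and the function name `build_data_mart` are hypothetical, and the exclusion is approximated by the first letter of the code (S, T, U, V, W, X, Y, Z), which is assumed to cover the ranges S00-T98, U00-U99, V01-Y98, and Z00-Z99.

```python
# Hypothetical sketch of the data mart cleaning step; names and record
# layout are assumptions for illustration, not taken from the paper.

# First letters assumed to cover S00-T98, U00-U99, V01-Y98 and Z00-Z99.
EXCLUDED_LETTERS = set("STUVWXYZ")


def build_data_mart(records):
    """Return unique (patient_id, icd_code) pairs outside the excluded chapters.

    `records` is an iterable of (patient_id, icd_code) tuples. The first
    occurrence of each pair is kept, in input order.
    """
    seen = set()
    mart = []
    for patient_id, code in records:
        code = code.strip().upper()
        if not code or code[0] in EXCLUDED_LETTERS:
            continue  # external causes, health-service contacts, special codes
        key = (patient_id, code)
        if key in seen:
            continue  # same patient, same diagnosis: count only once
        seen.add(key)
        mart.append(key)
    return mart


if __name__ == "__main__":
    toy_records = [
        ("p1", "I10"), ("p1", "I10"), ("p1", "E11"), ("p1", "S72"),
        ("p2", "I10"), ("p2", "Z00"), ("p2", "I63"),
    ]
    for row in build_data_mart(toy_records):
        print(row)
```

Keeping one row per patient and diagnosis means that each patient contributes at most once to the support of any disease, which is the stated purpose of the deduplication.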

4. Analysis Method

The statistical analysis program, SPSS Clementine 12.0 (SPSS Inc., Chicago, IL, USA), was used. Frequency analysis was performed on gender, age, and other diseases of the patients with hypertension. Moreover, Apriori modeling and the web node were used to analyze the strengths of the associations between hypertension and other diseases. The web node is a visualization tool that represents the relationships between items, and Apriori modeling is an ARM modeling method that can be applied to binominal or multi-nominal data types. ARM is used to analyze how often item A and item B occur together. The support is defined as the percentage of transactions that contain both diagnosis case 1 (Dx1) and diagnosis case 2 (Dx2); it may be regarded as P (Dx1∪Dx2), where Dx1∪Dx2 denotes the item set containing both diagnoses, and it is direction-independent. The confidence is defined as the ratio of the support of the item set (Dx1∪Dx2) to the support of the item set Dx1, which roughly corresponds to the conditional probability P (Dx2|Dx1), and it is direction-dependent. In terms of epidemiology, the support resembles the prevalence of co-occurring Dx1 and Dx2 within a certain period of time. The confidence, the ratio of the co-occurrence rate of Dx1 and Dx2 to the prevalence of Dx1, resembles the co-morbidity of Dx2 with Dx1 within the same period of time, but is direction-dependent. In Apriori modeling, association rules are evaluated on the values of support and confidence [6-9].
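Written compactly, the two measures for a rule "Dx1 to Dx2" can be expressed as below. The symbols N (total number of patients, i.e., transactions) and n(·) (number of patients whose records contain every diagnosis in the item set) are notation introduced here for illustration; the union symbol follows the item-set convention, so Dx1∪Dx2 means that both diagnoses are present.

```latex
\begin{align}
\mathrm{support}(Dx_1 \rightarrow Dx_2)
  &= \frac{n(Dx_1 \cup Dx_2)}{N} = P(Dx_1 \cup Dx_2), \\
\mathrm{confidence}(Dx_1 \rightarrow Dx_2)
  &= \frac{\mathrm{support}(Dx_1 \cup Dx_2)}{\mathrm{support}(Dx_1)}
   = \frac{n(Dx_1 \cup Dx_2)}{n(Dx_1)} \approx P(Dx_2 \mid Dx_1).
\end{align}
```

Swapping Dx1 and Dx2 leaves the support unchanged but generally changes the confidence, which is why only the confidence is direction-dependent.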

III. Results

1. Patient Gender and Age Distribution

The data consisted of 2,508 males (49.94%) and 2,514 females (50.06%), for a total of 5,022 patients. In the age distribution, patients over 70 years of age were the most frequent (1,882; 37.48%), and patients between 18 and 29 years of age were the least frequent (35; 0.70%), as shown in Figure 2. The mean age was 65 ± 11 years (mean ± standard deviation).
Figure 2

The distribution of patients according to age.

2. Distribution of Other Diseases in Patients with Essential Hypertension

The frequency of other diseases in patients with essential hypertension is shown in Table 1. Non-insulin-dependent diabetes mellitus (E11) was the most frequent disease (1,765 patients), followed in order by cerebral infarction (I63), angina pectoris (I20), and chronic renal failure (N18). In both genders, non-insulin-dependent diabetes mellitus was the most frequent disease, and cerebral infarction and angina pectoris also showed high frequencies. Cerebral infarction (I63), angina pectoris (I20), chronic renal failure (N18), acute myocardial infarction (I21), gastric ulcer (K25), and prostatic hyperplasia (N40) were more frequent in males than in females, although the difference was statistically significant (p < 0.05) only for acute myocardial infarction (I21) and gastric ulcer (K25). Non-insulin-dependent diabetes mellitus (E11), gastritis and duodenitis (K29), disorders of lipoprotein metabolism and other lipidaemias (E78), hemiplegia (G81), heart failure (I50), and osteoporosis without pathologic fractures (M81) were more frequent in females than in males; of these, gastritis and duodenitis (K29), heart failure (I50), and osteoporosis without pathologic fractures (M81) showed a statistically significant difference (p < 0.05).
Table 1

The distribution of other diseases in the patients with hypertension

3. Result Visualization by Web Node

Figure 3 shows the relationships among essential hypertension and the high-frequency diseases listed in Table 1, as visualized with the web node. Co-morbid diseases are linked with each other. In Figure 3, essential hypertension was linked with non-insulin-dependent diabetes mellitus and cerebral infarction, and non-insulin-dependent diabetes mellitus was linked with cerebral infarction. Thus, non-insulin-dependent diabetes mellitus and cerebral infarction were shown to be related to essential hypertension. Other diseases, such as disorders of lipoprotein metabolism and other lipidaemias (E78) and acute myocardial infarction (I21), did not show a relationship with essential hypertension.
Figure 3

Association graph by using web node. E11: non-insulin-dependent diabetes mellitus, E87: other disorders of fluid, electrolytes and acid-base balance, I10: essential hypertension, I21: acute myocardial infarction, I50: heart failure, I63: cerebral infarction, K21: gastrooesophageal reflux disease, N18: chronic renal failure.
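The linking of co-morbid diseases in a web-style graph can be approximated by counting how many patients carry each pair of diagnoses and keeping only the pairs above a cutoff. The sketch below illustrates that idea only; it is not the Clementine web node itself. The function names, the toy data, and the cutoff `min_count` are all assumptions made for this example.

```python
# Illustrative pair co-occurrence counting; not the Clementine web node.
from collections import Counter
from itertools import combinations


def pair_counts(diagnoses_by_patient):
    """Count patients per unordered pair of diagnosis codes.

    `diagnoses_by_patient` maps a patient id to an iterable of ICD codes.
    """
    counts = Counter()
    for codes in diagnoses_by_patient.values():
        # sorted(set(...)) gives each pair one canonical order, counted once
        for a, b in combinations(sorted(set(codes)), 2):
            counts[(a, b)] += 1
    return counts


def strong_links(counts, min_count):
    """Return the pairs seen in at least `min_count` patients (hypothetical cutoff)."""
    return sorted(pair for pair, n in counts.items() if n >= min_count)


if __name__ == "__main__":
    toy = {
        "p1": ["I10", "E11", "I63"],
        "p2": ["I10", "E11"],
        "p3": ["I10", "I21"],
    }
    counts = pair_counts(toy)
    print(strong_links(counts, min_count=2))
```

In such a graph, each retained pair would be drawn as an edge, with the count available as an edge weight.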

4. Results of ARM Using the Apriori Modeling

Based on the results of the Apriori modeling, the association rules among essential hypertension and specific diseases are shown in Table 2. We extracted 8 association rules using the following threshold values: support, ≥5%; and confidence, ≥15%. The rule with the highest support and confidence was 'non-insulin-dependent diabetes mellitus to essential hypertension', which had confidence and support values of 100% and 35.15%, respectively. The second rule was 'cerebral infarction to essential hypertension', which had confidence and support values of 100% and 21.19%, respectively. The third rule was 'essential hypertension and cerebral infarction to non-insulin-dependent diabetes mellitus', which had confidence and support values of 37.31% and 7.91%, respectively. The rule 'essential hypertension and non-insulin-dependent diabetes mellitus to cerebral infarction' had confidence and support values of 22.49% and 7.91%, respectively. The remaining rules, 'essential hypertension and non-insulin-dependent diabetes mellitus to angina pectoris' and 'essential hypertension and non-insulin-dependent diabetes mellitus to chronic renal failure', had a confidence of less than 20%.
Table 2

The results of applying Apriori modeling

E11: non-insulin-dependent diabetes mellitus, I10: essential hypertension, I63: cerebral infarction, I20: angina pectoris, N18: chronic renal failure.
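As a rough consistency check, the reported confidences can be recomputed from the reported supports using confidence = support of the whole rule / support of the antecedent. The sketch below uses only figures quoted in the text; the helper `confidence` and the variable names are introduced here for illustration. Because the inputs are rounded to two decimals, the recomputed values are expected to agree with the reported ones only approximately. Since every patient in the data mart carries I10, the support of {I10, E11} is assumed to equal the share of patients with E11, and likewise for I63; for the same reason, any rule with I10 as the consequent reaches 100% confidence by construction.

```python
# Consistency check on the reported rule statistics; helper names are
# hypothetical, the numbers are the ones quoted in the text.

def confidence(support_rule, support_antecedent):
    """Confidence in percent, given the two supports in percent."""
    return 100.0 * support_rule / support_antecedent


N_PATIENTS = 5022          # patients with essential hypertension (I10)
N_E11 = 1765               # patients who also have E11 (Table 1 text)

SUPPORT_I10_E11 = 35.15    # reported support, rule 'E11 to I10'
SUPPORT_I10_I63 = 21.19    # reported support, rule 'I63 to I10'
SUPPORT_ALL_THREE = 7.91   # reported support, rules with I10, E11 and I63

# Support of E11 recomputed from the patient counts.
support_e11_from_counts = 100.0 * N_E11 / N_PATIENTS

# 'I10 and I63 to E11' (reported confidence 37.31%)
conf_to_e11 = confidence(SUPPORT_ALL_THREE, SUPPORT_I10_I63)

# 'I10 and E11 to I63' (reported confidence 22.49%)
conf_to_i63 = confidence(SUPPORT_ALL_THREE, SUPPORT_I10_E11)

if __name__ == "__main__":
    print(f"support(E11) from counts: {support_e11_from_counts:.2f}%")
    print(f"confidence(I10, I63 -> E11): {conf_to_e11:.2f}%")
    print(f"confidence(I10, E11 -> I63): {conf_to_i63:.2f}%")
```

The same support (7.91%) yields two different confidences depending on which diagnosis pair is the antecedent, which illustrates the direction dependence of confidence.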

IV. Discussion

This study aimed to analyze the associations between essential hypertension and other diseases using Apriori modeling, which is a popular and powerful method in data mining [10]. We used 53,994 diagnostic records extracted from the D/W, which had been accumulated from electronic medical records. Using the D/W made it possible to analyze a massive volume of data, in contrast to an epidemiologic study based on a review of paper-based medical records or a prospective study [11]. Moreover, this study is meaningful in that it analyzed the associations between essential hypertension and various co-morbid diseases. Hypertension is known as a risk factor for diabetes mellitus, cardiovascular disease, and cerebrovascular disease. In this study, the results based on the web node showed that essential hypertension, non-insulin-dependent diabetes mellitus, and cerebral infarction are related to one another. Based on the results of the Apriori modeling, the association rule 'non-insulin-dependent diabetes mellitus to essential hypertension' had the highest confidence and support. Thus, essential hypertension and non-insulin-dependent diabetes mellitus were associated with one another. Lee and Park [12] stated that 39% of patients with newly diagnosed diabetes mellitus had co-morbid hypertension. Patients with hypertension had a 2.5-fold higher prevalence of diabetes mellitus than people with normal blood pressure, and hypertension occurred 3-fold more often in patients with diabetes mellitus. Therefore, patients with either hypertension or diabetes mellitus need to manage both blood pressure and blood glucose, because the co-morbidity of hypertension and diabetes mellitus could underlie an increased clinical attack rate of cardiovascular disease, cerebral infarction, myocardial infarction, heart failure, and renal failure [13].
The same study [12] reported that >80% of patients with diabetic microangiopathy or diabetic nephropathy had hypertension as a co-morbidity, and that patients with hypertension had a 2-fold higher attack rate for coronary artery disease and a 2-6-fold higher attack rate for cerebrovascular disease than non-diabetics of the same age group. In this study, we investigated the relationships among essential hypertension, non-insulin-dependent diabetes, and other diseases based on ARM. The results showed that essential hypertension and non-insulin-dependent diabetes were associated with co-morbid cerebral infarction, angina pectoris, and chronic renal failure. We applied ARM to a large electronic medical record database of patients with hypertension to analyze the associations with co-morbid diseases. However, the data used in this study were the clinical records of inpatients at a single hospital located in D city. Moreover, it was difficult to analyze sequential patterns because, in some cases, patients were diagnosed with several diseases at the same time. Therefore, further studies based on data collected from multiple hospitals are needed to identify general and sequential rules.
References

1.  Smoking and atherosclerotic cardiovascular disease in men with low levels of serum cholesterol: the Korea Medical Insurance Corporation Study.

Authors:  S H Jee; I Suh; I S Kim; L J Appel
Journal:  JAMA       Date:  1999-12-08       Impact factor: 56.272

2.  Comorbidity study of ADHD: applying association rule mining (ARM) to National Health Insurance Database of Taiwan.

Authors:  Yueh-Ming Tai; Hung-Wen Chiu
Journal:  Int J Med Inform       Date:  2009-10-22       Impact factor: 4.046

3.  Seventh report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure.

Authors:  Aram V Chobanian; George L Bakris; Henry R Black; William C Cushman; Lee A Green; Joseph L Izzo; Daniel W Jones; Barry J Materson; Suzanne Oparil; Jackson T Wright; Edward J Roccella
Journal:  Hypertension       Date:  2003-12-01       Impact factor: 10.190
