
Interobserver reliability of four diagnostic methods using traditional Korean medicine for stroke patients.

Ju Ah Lee, Mi Mi Ko, Byoung-Kab Kang, Terje Alraek, Stephen Birch, Myeong Soo Lee.

Abstract

Objective. The aim of this study is to evaluate the consistency of pattern identification (PI), a set of diagnostic indicators used by traditional Korean medicine (TKM) clinicians. Methods. A total of 168 stroke patients who were admitted into oriental medical university hospitals from June 2012 through January 2013 were included in the study. Using the PI indicators, each patient was independently diagnosed by two experts from the same department. Interobserver consistency was assessed by simple percentage agreement as well as by kappa and AC1 statistics. Results. Interobserver agreement on the PI indicators (for all patients) was generally high: pulse diagnosis signs (AC1 = 0.66-0.89); inspection signs (AC1 = 0.66-0.95); listening/smelling signs (AC1 = 0.67-0.88); and inquiry signs (AC1 = 0.62-0.94). Conclusion. In four examinations, there was moderate agreement between the clinicians on the PI indicators. To improve clinician consistency (e.g., in the diagnostic criteria used), it is necessary to analyze the reasons for inconsistency and to improve clinician training.


Year:  2014        PMID: 25574181      PMCID: PMC4276114          DOI: 10.1155/2014/465471

Source DB:  PubMed          Journal:  Evid Based Complement Alternat Med        ISSN: 1741-427X            Impact factor:   2.629


1. Introduction

In traditional Korean medicine (TKM) and traditional Chinese medicine (TCM), the diagnostic process is called pattern identification (PI) or syndrome differentiation [1]. TKM and TCM clinicians use the PI system to diagnose the cause, nature, and location of the illness and the patient's physical condition, and to determine the appropriate treatment (e.g., acupuncture, herbal medicine, and moxibustion) [2]. The PI system therefore plays an important role in TCM and TKM. It is a synthetic and analytical process that draws on information obtained from the four examinations, a general term that covers visual inspection, listening and smelling, inquiry, and pulse diagnosis [1]. To perform PI successfully, the four examinations must be carried out objectively and precisely. However, how well they are performed depends on the experience and knowledge of the clinician. Several environmental factors, such as differences in light source and brightness, can significantly influence visual inspection. Subjective factors, such as the patient's emotional state and the clinician's questioning approach or technical skill, can also significantly influence the examination. Pulse diagnosis likewise depends on the clinician's experience and knowledge [3]. Furthermore, much of the accumulated experience with the traditional four examinations has not been scientifically or quantitatively verified. Additional studies are therefore required to improve the reproducibility and objectivity of the TCM and TKM diagnostic processes. Interobserver reproducibility is regarded as one of the foundations of high-quality research design [4]. Many common clinical symptoms and signs show limited reliability when they are subjected to an interobserver study [5].
Previous reports have described the interobserver reliability of pulse diagnosis, tongue diagnosis, and PI for stroke patients [5-9]. However, actual diagnoses are made by pooling information from all four diagnostic methods [9]. Therefore, in this study, we investigated the reliability of the four TKM examinations in stroke patients by evaluating interobserver agreement on the signs and symptoms that TKM clinicians recorded for each indicator.

2. Methods

2.1. Participants

Data for this analysis were collected from a multicenter study of the standardization and objectification of pattern identification in traditional Korean medicine for stroke (SOPI-Stroke) [6, 10, 11]. Stroke patients were admitted between June 2012 and January 2013 to the following oriental medical university hospitals: Kyung Hee Oriental Medical Center (Seoul), Kang dong Kyung Hee Medical Center (Seoul), Daejeon Oriental Medical Hospital (Daejeon), and Dong-eui Oriental Medical Hospital (Pusan) (Figure 1). All patients provided informed consent, according to the procedures that were approved by the institutional review boards (IRBs) at the participating institutions. The following inclusion criteria were applied. The participants had to be enrolled in the study as stroke patients within 30 days of the onset of their symptoms, as confirmed by imaging diagnosis, such as computerized tomography (CT) or magnetic resonance imaging (MRI). Traumatic stroke patients, such as those with subarachnoid, subdural, or epidural hemorrhage, were excluded from the study. The present study was approved by the IRB of the Korean Institute of Oriental Medicine (KIOM) and by each of the oriental medical university hospitals.
Figure 1

Flow diagram of patients enrolled in the study.

In particular, the clinicians had to determine the stroke PI of each patient according to the four patterns suggested by the KIOM: the fire-heat pattern, the phlegm-dampness pattern, the qi deficiency pattern, and the yin deficiency pattern [5].

2.2. Data Processing and Analysis

All patients were examined by two experts (from the same TKM department) who were well trained in the standard operating procedures (SOPs). The patients underwent the following examinations: pulse diagnosis (pulse location: floating or sunken; pulse rate: slow or rapid; pulse force: strong or weak; and pulse shape: slippery, fine, or surging); inspection (tongue color, fur color, fur quality, special tongue appearance, facial complexion, abnormal eye appearance, body type, mouth, and vigor); listening and smelling (vocal sound energy, sputum, and tongue and mouth, particularly fetid mouth odor); and inquiry (headache, tongue and mouth: dry mouth and thirst in the mouth, temperature, chest, sleep, sweating, urine, and vigor). The examination parameters were extracted from portions of a case report form (CRF) for the PI for stroke, which was developed by an expert committee organized by the KIOM. The assessments were conducted individually and independently, without discussion among the clinicians. The severity of each variable was graded as follows: 1 = very significant; 2 = significant; and 3 = not significant. Interobserver reliability was measured using the simple percentage agreement, Cohen's kappa coefficient, and Gwet's AC1 statistic [12], together with the corresponding confidence intervals (CI). For most purposes, kappa values ≤0.40 represent poor agreement, values between 0.40 and 0.75 represent moderate-to-good agreement, and values ≥0.75 indicate excellent agreement [13]. The AC1 statistic is not vulnerable to the well-known paradoxes that can make kappa appear ineffective [12, 14, 15]. Data were statistically analyzed using SAS software, version 9.1.3 (SAS Institute Inc., Cary, NC, USA).
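The three agreement measures named above can be illustrated with a minimal Python sketch. This is not the authors' SAS code: the function name `agreement_stats`, the example ratings, and the decision to treat the 1/2/3 grades as three unordered categories are assumptions made here for illustration, and confidence intervals are omitted.

```python
from collections import Counter


def agreement_stats(rater_a, rater_b, categories=(1, 2, 3)):
    """Return (percent agreement, Cohen's kappa, Gwet's AC1) for two raters.

    Hypothetical helper: grades are treated as unordered categories.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(rater_a)
    q = len(categories)

    # Observed agreement: share of patients given the same grade by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Marginal proportion of each grade for each rater.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    prop_a = {k: count_a[k] / n for k in categories}
    prop_b = {k: count_b[k] / n for k in categories}

    # Cohen's kappa: chance agreement is the product of the two raters' marginals.
    p_e_kappa = sum(prop_a[k] * prop_b[k] for k in categories)
    kappa = (p_o - p_e_kappa) / (1 - p_e_kappa) if p_e_kappa < 1 else float("nan")

    # Gwet's AC1: chance agreement is built from the mean marginal of each category.
    pi = {k: (prop_a[k] + prop_b[k]) / 2 for k in categories}
    p_e_ac1 = sum(pi[k] * (1 - pi[k]) for k in categories) / (q - 1)
    ac1 = (p_o - p_e_ac1) / (1 - p_e_ac1)

    return 100 * p_o, kappa, ac1


# Made-up grades for ten patients (1 = very significant, 2 = significant, 3 = not significant).
expert_1 = [3, 3, 3, 1, 2, 3, 3, 1, 3, 3]
expert_2 = [3, 3, 3, 1, 3, 3, 3, 2, 3, 3]
pct, kappa, ac1 = agreement_stats(expert_1, expert_2)
print(f"agreement = {pct:.1f}%, kappa = {kappa:.2f}, AC1 = {ac1:.2f}")
```

Both coefficients share the form (observed agreement minus chance agreement) divided by (1 minus chance agreement); they differ only in how chance agreement is estimated, which is why they can diverge when one grade dominates.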

3. Results

The general characteristics of the study subjects are shown in Table 1. The interobserver reliability results for the pulse diagnosis domain for all subjects (n = 168) are shown in Table 2. The kappa measures of agreement for the two experts ranged from “poor” (κ = 0.37) to “moderate” (κ = 0.61). The AC1 measures of agreement for the two experts were generally high for the pulse diagnosis domain and ranged from 0.66 to 0.89.
Table 1

Demographic parameters of study subjects.

| Characteristics | Value |
| --- | --- |
| N | 168 |
| Sex (M/F) | 75/93 |
| Age (mean ± SD) | 68.89 ± 10.92 |
| Weight (kg) (mean ± SD) | 61.02 ± 11.07 |
| Height (cm) (mean ± SD) | 161.15 ± 9.04 |
| BMI (mean ± SD) | 23.41 ± 3.26 |
| WHR (mean ± SD) | 0.93 ± 0.07 |
| WC (cm) (mean ± SD) | 85.76 ± 10.08 |
| HC (cm) (mean ± SD) | 92.50 ± 7.28 |
| TOAST classification: LAA | 46 |
| TOAST classification: CE | 6 |
| TOAST classification: SVO | 113 |
| TOAST classification: SOE | 1 |
| TOAST classification: Others | 2 |
| Hypertension (yes/no) | 103/65 |
| Hyperlipidemia (yes/no) | 25/143 |
| DM (yes/no) | 47/121 |
| Smoking (none/stop/active) | 109/18/39 |
| Drinking (none/stop/active) | 104/8/55 |

BMI: body mass index. WHR: waist hip ratio. WC: waist circumference. HC: hip circumference. TOAST: Trial of ORG 10172 in Acute Stroke Treatment. LAA: large-artery atherosclerosis. CE: cardioembolism. SVO: small-vessel occlusion. SOE: stroke of other etiology. SUE: stroke of undetermined etiology. DM: diabetes mellitus.

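Table 2

Agreement between raters in total subjects (diagnosis by palpation; pulse diagnosis).

| Category | Variable | % Agreement | Kappa (K) | CI of K | AC1 | CI of AC1 |
| --- | --- | --- | --- | --- | --- | --- |
| Pulse location | Floating | 88.02 | 0.53 | (0.35, 0.71) | 0.84 | (0.77, 0.91) |
| Pulse location | Sunken | 85.02 | 0.56 | (0.41, 0.72) | 0.77 | (0.68, 0.87) |
| Pulse rate | Slow | 90.41 | 0.50 | (0.29, 0.72) | 0.88 | (0.82, 0.94) |
| Pulse rate | Rapid | 80.83 | 0.56 | (0.43, 0.70) | 0.66 | (0.55, 0.78) |
| Pulse force | Strong | 81.92 | 0.51 | (0.35, 0.66) | 0.72 | (0.61, 0.82) |
| Pulse force | Weak | 86.14 | 0.61 | (0.47, 0.76) | 0.78 | (0.69, 0.88) |
| Pulse shape | Slippery pulse | 77.84 | 0.51 | (0.38, 0.65) | 0.71 | (0.63, 0.80) |
| Pulse shape | Fine pulse | 73.65 | 0.37 | (0.22, 0.52) | 0.67 | (0.58, 0.76) |
| Pulse shape | Surging pulse | 90.36 | 0.52 | (0.32, 0.72) | 0.89 | (0.84, 0.95) |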

CI: confidence interval.
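As an illustration of how the interpretation scale from the Methods maps onto the pulse diagnosis results, the following sketch labels each kappa value from Table 2. The function name `interpret_kappa` is hypothetical, and because the quoted scale lists 0.75 under both the middle and the top band, assigning exactly 0.75 to "excellent" is an assumption made here.

```python
def interpret_kappa(kappa):
    """Label a kappa value with the scale quoted in the Methods section.

    Assumption: a value of exactly 0.75 is treated as "excellent".
    """
    if kappa <= 0.40:
        return "poor"
    if kappa < 0.75:
        return "moderate-to-good"
    return "excellent"


# Kappa values copied from Table 2 (pulse diagnosis).
table2_kappa = {
    "Floating": 0.53,
    "Sunken": 0.56,
    "Slow": 0.50,
    "Rapid": 0.56,
    "Strong": 0.51,
    "Weak": 0.61,
    "Slippery pulse": 0.51,
    "Fine pulse": 0.37,
    "Surging pulse": 0.52,
}

labels = {name: interpret_kappa(k) for name, k in table2_kappa.items()}
poor_items = [name for name, label in labels.items() if label == "poor"]
print(poor_items)
```

Under these thresholds, fine pulse is the only pulse indicator that falls in the "poor" band.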

The interobserver reliability results for the visual inspection domain for all subjects are shown in Table 3. The kappa measures of agreement for the two experts ranged from “poor” (κ = 0.26) to “excellent” (κ = 0.84). The AC1 measures of agreement for the two experts were generally high for the inspection signs and ranged from 0.66 to 0.95. The interobserver agreement was nearly perfect for several signs (e.g., the mirror tongue indicator, AC1 = 0.95, and the aphtha and sores of tongue/mouth indicator, AC1 = 0.91).
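Table 3

Agreement between raters in total subjects (diagnosis by visual inspection).

| Category | Variable | % Agreement | Kappa (K) | CI of K | AC1 | CI of AC1 |
| --- | --- | --- | --- | --- | --- | --- |
| Tongue color | Pale | 78.31 | 0.54 | (0.41, 0.67) | 0.72 | (0.63, 0.80) |
| Tongue color | Red | 77.84 | 0.64 | (0.54, 0.74) | 0.68 | (0.59, 0.77) |
| Fur color | White fur | 76.64 | 0.58 | (0.46, 0.70) | 0.68 | (0.59, 0.77) |
| Fur color | Yellow fur | 79.51 | 0.57 | (0.45, 0.70) | 0.73 | (0.65, 0.82) |
| Fur quality | Thick fur | 83.83 | 0.54 | (0.39, 0.68) | 0.80 | (0.73, 0.88) |
| Fur quality | Dry fur | 77.24 | 0.33 | (0.16, 0.50) | 0.73 | (0.64, 0.81) |
| Special tongue appearance | Teeth marked | 84.93 | 0.26 | (0.04, 0.48) | 0.83 | (0.77, 0.90) |
| Special tongue appearance | Enlarged | 84.43 | 0.41 | (0.22, 0.60) | 0.82 | (0.75, 0.89) |
| Special tongue appearance | Mirror | 95.8 | 0.76 | (0.60, 0.93) | 0.95 | (0.92, 0.99) |
| Facial complexion | Reddened complexion | 84.33 | 0.70 | (0.60, 0.80) | 0.79 | (0.71, 0.87) |
| Facial complexion | Dark face discoloration | 83.83 | 0.68 | (0.57, 0.79) | 0.78 | (0.71, 0.86) |
| Facial complexion | White complexion | 83.13 | 0.48 | (0.32, 0.64) | 0.80 | (0.73, 0.87) |
| Facial complexion | Pale face and red zygomatic site | 87.95 | 0.58 | (0.42, 0.75) | 0.86 | (0.80, 0.92) |
| Facial complexion | Dark inferior palpebra | 84.43 | 0.47 | (0.30, 0.64) | 0.82 | (0.75, 0.89) |
| Eye's abnormal condition | Congestive eyes | 86.82 | 0.65 | (0.52, 0.79) | 0.84 | (0.77, 0.90) |
| Body type | Underweight | 87.42 | 0.69 | (0.57, 0.81) | 0.79 | (0.70, 0.88) |
| Body type | Overweight | 93.41 | 0.84 | (0.75, 0.93) | 0.89 | (0.82, 0.96) |
| Tongue and mouth | Aphtha and tongue sores | 92.16 | 0.55 | (0.34, 0.77) | 0.91 | (0.87, 0.96) |
| Vigor | Look powerless and lazy | 77.24 | 0.65 | (0.55, 0.75) | 0.66 | (0.57, 0.76) |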

CI: confidence interval.

The interobserver reliability results for the listening and smelling domain for all subjects are shown in Table 4. The kappa measures of agreement for the two experts were “moderate” (κ = 0.60 to 0.74). The AC1 measures of agreement for the two experts were generally high for the listening and smelling signs and ranged from 0.67 to 0.88.
Table 4

Agreement between raters in total subjects (diagnosis by listening and smelling).

| Category | Variable | % Agreement | Kappa (K) | CI of K | AC1 | CI of AC1 |
| --- | --- | --- | --- | --- | --- | --- |
| Vocal sound energy | Disinclined to speak or speaking at a low volume | 76.64 | 0.61 | (0.50, 0.71) | 0.67 | (0.57, 0.76) |
| Sputum | Phlegm rale | 90.41 | 0.74 | (0.63, 0.86) | 0.88 | (0.83, 0.94) |
| Tongue and mouth | Fetid mouth odor | 84.93 | 0.60 | (0.47, 0.74) | 0.81 | (0.74, 0.89) |

CI: confidence interval.

The interobserver reliability results for the inquiry domain for all subjects are shown in Table 5. The kappa measures of agreement for the two experts ranged from “poor” (κ = 0.27) to “excellent” (κ = 0.76). The AC1 measures of agreement for the two experts were generally high for the inquiry signs and ranged from 0.62 to 0.94. Agreement as assessed by kappa was considerably lower than agreement as assessed by AC1 in the majority of cases.
Table 5

Agreement between raters in total subjects (diagnosis by inquiry).

| Category | Variable | % Agreement | Kappa (K) | CI of K | AC1 | CI of AC1 |
| --- | --- | --- | --- | --- | --- | --- |
| Headache | Hot flush in head | 89.15 | 0.74 | (0.63, 0.85) | 0.86 | (0.80, 0.93) |
| Headache | An unpleasant sensation with an urge to vomit | 69.87 | 0.27 | (0.12, 0.42) | 0.62 | (0.53, 0.72) |
| Tongue and mouth | Dry mouth | 80.12 | 0.68 | (0.58, 0.78) | 0.71 | (0.62, 0.80) |
| Tongue and mouth | Thirst in the mouth | 79.51 | 0.63 | (0.52, 0.75) | 0.72 | (0.63, 0.80) |
| Temperature | Aversion to heat | 81.32 | 0.62 | (0.50, 0.73) | 0.75 | (0.67, 0.84) |
| Temperature | Vexing heat in the extremities | 90.36 | 0.46 | (0.24, 0.67) | 0.94 | (0.90, 0.98) |
| Temperature | Heat in the palmar and plantar | 93.97 | 0.56 | (0.31, 0.81) | 0.89 | (0.84, 0.95) |
| Temperature | Reversal cold of the extremities | 90.36 | 0.64 | (0.47, 0.80) | 0.89 | (0.84, 0.94) |
| Temperature | Afternoon tidal fever | 91.56 | 0.52 | (0.30, 0.74) | 0.91 | (0.86, 0.96) |
| Chest | Heat vexation in the chest | 87.95 | 0.76 | (0.66, 0.85) | 0.84 | (0.77, 0.91) |
| Sleep | Vexation and insomnia | 81.92 | 0.63 | (0.52, 0.75) | 0.76 | (0.68, 0.84) |
| Sweating | Night sweating | 89.69 | 0.70 | (0.57, 0.83) | 0.88 | (0.82, 0.93) |
| Urine | Turbid urine | 84.82 | 0.70 | (0.59, 0.81) | 0.80 | (0.72, 0.88) |
| Vigor | Like to lie down | 83.13 | 0.72 | (0.62, 0.81) | 0.76 | (0.68, 0.84) |
| Vigor | Feel powerless and lazy | 77.1 | 0.64 | (0.54, 0.74) | 0.66 | (0.57, 0.76) |

CI: confidence interval.
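The gap between kappa and AC1 seen across Tables 2 to 5 is typical of signs that one rating category dominates. The sketch below illustrates this with two hypothetical 2 × 2 tables for 168 patients that have the same percentage agreement but different prevalence. The counts are invented for illustration and are not study data, the sign is simplified to present/absent, and the function name `kappa_and_ac1_2x2` is an assumption.

```python
def kappa_and_ac1_2x2(both_yes, a_only, b_only, both_no):
    """Return (observed agreement, Cohen's kappa, Gwet's AC1) for a 2 x 2 table."""
    n = both_yes + a_only + b_only + both_no
    p_o = (both_yes + both_no) / n

    # Proportion of "present" ratings given by each rater.
    a_yes = (both_yes + a_only) / n
    b_yes = (both_yes + b_only) / n

    # Kappa: chance agreement from the product of the raters' marginals.
    p_e_kappa = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    kappa = (p_o - p_e_kappa) / (1 - p_e_kappa)

    # AC1 with two categories: chance agreement is 2 * pi * (1 - pi),
    # where pi is the mean proportion of "present" ratings.
    pi_yes = (a_yes + b_yes) / 2
    p_e_ac1 = 2 * pi_yes * (1 - pi_yes)
    ac1 = (p_o - p_e_ac1) / (1 - p_e_ac1)
    return p_o, kappa, ac1


# Hypothetical rare sign: almost every patient is rated "absent" by both experts.
rare = kappa_and_ac1_2x2(both_yes=2, a_only=8, b_only=8, both_no=150)
# Hypothetical balanced sign with the same number of disagreements.
balanced = kappa_and_ac1_2x2(both_yes=76, a_only=8, b_only=8, both_no=76)

for name, (p_o, kappa, ac1) in {"rare": rare, "balanced": balanced}.items():
    print(f"{name}: agreement = {100 * p_o:.1f}%, kappa = {kappa:.2f}, AC1 = {ac1:.2f}")
```

With identical raw agreement (about 90.5%), kappa falls to roughly 0.15 for the rare sign while AC1 stays near 0.89. For the balanced sign, both coefficients are about 0.81.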

4. Discussion

Recently, several studies have investigated the importance of education in the PI process [16, 17]. Additionally, several studies have focused on the reliability of clinicians' PI decisions [4, 18–20]. However, PI is reached by comprehensively analyzing the signs and symptoms obtained from the four examinations [1]. It is therefore necessary to check the reliability among clinicians for each sign or symptom that is used in PI. Very few studies have reported on the importance of diagnostic variables in the four examinations [21–23]. This study used AC1 and kappa statistics to assess the interobserver reliability of the signs and symptoms of PI in stroke patients, with the ultimate aim of improving the objectivity and reproducibility of PI decisions among clinicians. For convenience, all signs and symptoms are referred to as indicators.
Palpation means touching and pressing the body surface with the fingers, here in order to perform pulse diagnosis [1]. Regarding interobserver agreement for pulse diagnosis among all subjects, one item (fine pulse) had a poor kappa value, whereas the other 8 items had moderate-to-good values. Although the kappa value for fine pulse was poor compared with the other items, its percentage agreement and AC1 were not. We found that many clinicians checked “3 = not significant” because signs that appear infrequently are difficult to detect. Therefore, in contrast to the kappa value, the percentage agreement and AC1 were high (93.29% and 0.93, respectively). Pulse diagnosis has many limitations because clinical skill in the four examinations depends on the clinician's experience and knowledge; moreover, environmental factors have a considerable influence on the clinician's willingness. Nevertheless, the results of this study showed good agreement for pulse diagnosis.
Visual inspection means observing the patient's mental state, facial expression, complexion, and physical condition as well as the condition of the tongue [1]. Regarding interobserver agreement for inspection, two items (dry fur and teeth-marked tongue) had poor kappa values, whereas the other items had moderate-to-good or excellent values. Tongue diagnosis is the inspection of the size, shape, color, and moisture of the tongue proper and its coating [1]. Several studies have emphasized the interobserver reliability among clinicians regarding tongue diagnosis [24, 25]. Inspection, including tongue diagnosis, has unavoidable limitations because the clinical skills of observation and diagnosis depend on the clinician's experience and knowledge, and environmental factors can influence whether the clinician can obtain diagnostic results from the patient's body. Therefore, to improve the consistency of inspection, it is necessary to standardize the process and inspection skills.
Listening and smelling together constitute one of the four examinations. Listening focuses on the patient's voice, breathing sounds, cough, vomiting, and so forth. Smelling refers to detecting odors from the patient's body or mouth [1]. Regarding interobserver agreement for listening and smelling among all subjects, all 3 items had moderate-to-good kappa values. Numerous studies have scored the listening and smelling diagnosis low compared with the other examinations. Therefore, additional studies of the listening and smelling diagnosis are warranted.
Inquiry, which is also one of the four examinations, is used to gain diagnostic information by asking the patient about the complaint and the history of the illness [1]. One inquiry item (an unpleasant sensation with an urge to vomit) had a poor kappa value. Although there were no large differences among the four examinations, pulse diagnosis had a low AC1 value.
However, the results are better than those reported in previous studies [7, 8]. This is thought to be because the clinicians had been trained repeatedly in the SOPs for this diagnosis. In this study, simple percentage agreement, kappa, and AC1 statistics were used to evaluate the interobserver reliability of TKM clinicians for PI indicators in stroke patients. When investigating observer agreement, clinicians have long used kappa and other chance-adjusted measures, together with a commonly used scale for interpreting kappa [26]. However, the appropriateness of kappa as a measure of agreement has recently been debated [14, 15], and the AC1 statistic has been suggested as an alternative way to adjust for chance agreement [12, 27].
In TKM and TCM, the primary problems are the limited reproducibility and objectivity of diagnosis. To solve these problems, the interobserver reliability of PI, and thus of its indicators, should be increased. To address these issues in the larger stroke study, the researchers regularly conducted SOP training, through which shortcomings were identified. Diagnostic indicators should therefore be standardized to improve agreement among clinicians. As a result of these efforts, standardization of TCM and TKM diagnosis will likely be achieved in the near future.
This study has a few limitations. First, only two raters were included. Second, the study focused on certain signs and symptoms relevant to stroke. Therefore, the generalizability of the findings to the general field of TCM/TKM is limited.
  20 in total

1.  Understanding the reliability of diagnostic variables in a Chinese Medicine examination.

Authors:  Kylie A O'Brien; Estelle Abbas; Jiansheng Zhang; Zhi-Xin Guo; Ruizhi Luo; Alan Bensoussan; Paul A Komesaroff
Journal:  J Altern Complement Med       Date:  2009-07       Impact factor: 2.579

2.  High agreement but low kappa: II. Resolving the paradoxes.

Authors:  D V Cicchetti; A R Feinstein
Journal:  J Clin Epidemiol       Date:  1990       Impact factor: 6.437

3.  Reliability of Chinese medicine diagnostic variables in the examination of patients with osteoarthritis of the knee.

Authors:  Bin Hua; Estelle Abbas; Alan Hayes; Peter Ryan; Lisa Nelson; Kylie O'Brien
Journal:  J Altern Complement Med       Date:  2012-08-16       Impact factor: 2.579

4.  Developing indicators of pattern identification in patients with stroke using traditional Korean medicine.

Authors:  Ju Ah Lee; Tae-Yong Park; Jungsup Lee; Tae-Woong Moon; Jiae Choi; Byoung-Kab Kang; Mi Mi Ko; Myeong Soo Lee
Journal:  BMC Res Notes       Date:  2012-03-13

5.  Traditional Chinese medicine diagnoses in a sample of women with fibromyalgia.

Authors:  Scott D Mist; Cheryl L Wright; Kim Dupree Jones; James W Carson
Journal:  Acupunct Med       Date:  2011-10-25       Impact factor: 2.267

6.  [Rapid diagnosis of TCM syndrome based on spectrometry].

Authors:  Ling Lin; Jing Zhang; Jing Zhao; Gang Li; Bao-ju Zhang; Yin Tong
Journal:  Guang Pu Xue Yu Guang Pu Fen Xi       Date:  2011-03       Impact factor: 0.589

7.  [Tongue temperature of healthy persons and patients with yin deficiency by using thermal video].

Authors:  S Q Zhang
Journal:  Zhong Xi Yi Jie He Za Zhi       Date:  1990-12

8.  Traditional Chinese medicine diagnoses in persons with ketamine abuse.

Authors:  Waikwong Tang; Ming Lam; Wingnang Leung; Waizhu Sun; Tszting Chan; Gabor S Ungvari
Journal:  J Tradit Chin Med       Date:  2013-04       Impact factor: 0.848

9.  Reliability and validity of the Korean Standard Pattern Identification for Stroke (K-SPI-Stroke) questionnaire.

Authors:  Byoung-Kab Kang; Tae-Yong Park; Ju Ah Lee; Tae-Woong Moon; Mi Mi Ko; Jiae Choi; Myeong Soo Lee
Journal:  BMC Complement Altern Med       Date:  2012-04-26       Impact factor: 3.659

10.  A Study of Tongue and Pulse Diagnosis in Traditional Korean Medicine for Stroke Patients Based on Quantification Theory Type II.

Authors:  Mi Mi Ko; Tae-Yong Park; Ju Ah Lee; Byoung-Kab Kang; Jungsup Lee; Myeong Soo Lee
Journal:  Evid Based Complement Alternat Med       Date:  2013-04-11       Impact factor: 2.629
