BACKGROUND: Evaluation is necessary to ensure the effectiveness and efficiency of all systems, including expert systems. The aim of this study was to determine the diagnostic value of an expert system for the diagnosis of complex skin diseases. METHODS: A case-control study was conducted in 2015 to determine the diagnostic value of an expert system. The study population included patients referred to Razi Specialized Hospital, affiliated to Tehran University of Medical Sciences. The control group was selected from patients without the selected skin diseases. The data collection tool was a checklist of clinical signs of diseases including pemphigus vulgaris, lichen planus, basal cell carcinoma, melanoma, and scabies. The sample size formula yielded 400 patients with skin diseases selected by experts and 200 patients without the selected skin diseases. Patients were selected by stratified random sampling, and their signs and symptoms were logged into the system. The physician's diagnosis was taken as the gold standard and was compared with the diagnosis of the expert system using SPSS version 16 and STATA. Kappa statistics, sensitivity, specificity, accuracy, and confidence intervals were calculated for each disease. An accuracy of 90% was considered appropriate. RESULTS: Comparing the expert system's results with the physician's diagnosis at the evaluation stage showed an accuracy of 97.1%, sensitivity of 97.5%, and specificity of 96.5%. The Kappa test indicated a high agreement of 93.6%. CONCLUSION: The expert system can diagnose complex skin diseases. Development of such systems is recommended to identify all skin diseases.
1. INTRODUCTION
In recent decades, many expert systems have been used in various branches of medicine; indeed, medicine was one of the first fields in which expert systems were applied. Artificial intelligence can be used to build decision support systems that help professionals in hospitals and clinics make clinical decisions. These systems receive clinical information, analyze it, and provide conclusions as output, thereby making it possible to analyze and diagnose diseases and to improve the quality of clinical decisions (1). MYCIN, NURSExpert, CENTAUR, DIAGNOSER, MEDI, GUIDON, MEDICS and DiagFH are some of the earliest and most successful medical expert systems (2). However, one of the major challenges facing such smart systems is the evaluation of their performance. Evaluation is a method of analyzing data to confirm the effectiveness and efficiency of an expert system; it should be applied after the system is implemented to ensure that the designed system is adequate (3). Different methods are used to evaluate the accuracy of an expert system. Sometimes, accuracy is measured against a gold standard, which compares the performance of the designed expert system with that of specialists in the specified area. One drawback of this method is that specialists often disagree on the diagnosis or treatment of a disease, so the system’s accuracy may be estimated to be lower than its actual value. The selection of specialists who participate in the assessment is also a problem: if the system is evaluated by the same specialists who designed the knowledge base, and its output is compared with their own views as the gold standard, the reported accuracy is usually higher. Given that the majority of expert systems have no mechanism to control the accuracy of their output, this can reduce users’ trust in the designed expert system.
In particular, these systems are designed to be applied in current clinical practice to improve the quality of clinical decisions (4). As a result, evaluating the accuracy of expert systems is very important. Accurate diagnosis of skin diseases is complex, especially when more than one disease presents with similar symptoms (5). An expert system was designed to help diagnose skin diseases that experts consider complicated, including pemphigus vulgaris, lichen planus, basal cell carcinoma, melanoma, and scabies (6). The aim of the current study was to determine the diagnostic value of the designed expert system for complex skin diseases by measuring sensitivity, specificity, accuracy, and kappa statistics.
2. MATERIALS AND METHODS
This case-control study was conducted in 2015 to determine the diagnostic power of the expert system. The study population included patients referred to Razi Specialized Hospital, affiliated to Tehran University of Medical Sciences. The control group was selected from patients without the selected skin diseases. Data were collected with a checklist of clinical signs of the diseases (pemphigus vulgaris, lichen planus, basal cell carcinoma, melanoma, and scabies), determined from previous studies (6). The sample size formula for estimating sensitivity and specificity yielded 400 patients with skin diseases selected by experts and 200 patients without the selected skin diseases. Stratified random sampling was carried out as follows. First, 80 patients were randomly selected for each disease. Then, 200 medical records of patients admitted for reasons other than the above diseases were randomly selected and logged into the system. A total of 600 patients were enrolled in the study. Their medical records were reviewed, after which signs of disease and demographic characteristics were logged into the designed system. The specialists’ diagnosis was compared with the expert system’s diagnosis using SPSS version 16 and STATA. The specialist’s diagnosis based on clinical symptoms was considered the gold standard. Kappa statistics, sensitivity, specificity, accuracy, and confidence intervals were calculated for each disease.
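For reference, the indicators named above have the following standard definitions. The symbols are introduced here for illustration and do not appear in the original: TP and FN are diseased patients the system classifies correctly and incorrectly, TN and FP are the corresponding counts for patients without the disease, and N is the total number of records.

```latex
\begin{align*}
\text{Sensitivity} &= \frac{TP}{TP + FN}, &
\text{Specificity} &= \frac{TN}{TN + FP}, \\
\text{Accuracy} &= \frac{TP + TN}{N}, &
\kappa &= \frac{p_o - p_e}{1 - p_e},
\end{align*}
\[
p_o = \frac{TP + TN}{N}, \qquad
p_e = \frac{(TP + FP)(TP + FN) + (TN + FN)(TN + FP)}{N^2}, \qquad
N = TP + FN + TN + FP .
\]
```

Here \(p_o\) is the observed agreement between the system and the gold standard, and \(p_e\) is the agreement expected by chance given the marginal totals, so kappa measures agreement beyond chance.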
3. RESULTS
Results showed that the designed system correctly identified 79 of 80 patients (98.8%) with pemphigus vulgaris (Table 1). Comparing the expert system’s diagnosis with the diagnosis of pemphigus vulgaris recorded in the patients’ medical records showed an accuracy of 99.1%, a sensitivity of 98.7% (95% CI: 96.3%-101%), and a specificity of 100% (95% CI: 100%-100%). The Kappa test showed 98.1% agreement.
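The upper confidence bound above exceeds 100%, which is what a normal-approximation (Wald) interval produces when a proportion is close to 1. The sketch below is an assumption about how such bounds can arise, not a description of the authors' SPSS or STATA procedure, which the text does not specify. It computes a Wald interval for 79 of 80 correct, and for comparison a Wilson score interval, which by construction stays within 0 and 1. The function names are hypothetical.

```python
import math


def wald_ci(successes, total, z=1.96):
    """Normal-approximation (Wald) interval for a proportion.

    Can extend below 0 or above 1 when the proportion is near a boundary.
    """
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p - half_width, p + half_width


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion; always lies within [0, 1]."""
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    )
    return centre - half_width, centre + half_width


# 79 of 80 pemphigus vulgaris patients were identified correctly.
wald_low, wald_high = wald_ci(79, 80)
wilson_low, wilson_high = wilson_ci(79, 80)

print(f"Wald:   {wald_low:.1%} to {wald_high:.1%}")
print(f"Wilson: {wilson_low:.1%} to {wilson_high:.1%}")
```

Under this assumption the Wald bounds come out close to the reported 96.3% to 101%, while the Wilson interval keeps the upper bound below 100%.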
Table 1
Comparison of diagnosis result of Pemphigus Vulgaris
The expert system correctly identified 77 of 80 patients (96.2%) with basal cell carcinoma whose data were logged into the system (Table 2). Comparing the expert system’s diagnosis with the diagnosis of basal cell carcinoma in the medical records showed 95.8% accuracy, 96.2% sensitivity (95% CI: 92%-100%), and 95% specificity (95% CI: 88.2%-101%). The Kappa test showed 90.7% agreement.
Table 2
Comparison of diagnosis result of Basal cell Carcinoma
Of 80 patients with lichen planus, 79 (98.8%) were correctly identified (Table 3). Comparing the expert system’s diagnosis with the diagnosis of lichen planus in the medical records showed an accuracy of 97.5%, a sensitivity of 98.7% (95% CI: 96.3%-101%), and a specificity of 95% (95% CI: 88.2%-101%). The Kappa test showed 94.3% agreement.
Table 3
Comparison of diagnosis result of Lichen Planus
Of 80 patients with melanoma, 78 (97.5%) were diagnosed correctly (Table 4). Comparing the expert system’s diagnosis with the diagnosis of melanoma in the medical records showed 97.5% accuracy, 97.5% sensitivity (95% CI: 94%-100%), and 97.5% specificity (95% CI: 92.6%-102%). The Kappa test showed 98.1% agreement.
Table 4
Comparison of diagnosis result of Malignant Melanoma
Of 80 patients with scabies, 77 (96.2%) were identified correctly (Table 5). Comparing the expert system’s diagnosis with the diagnosis of scabies in the medical records showed 95.8% accuracy, 96.2% sensitivity (95% CI: 92%-100%), and 95% specificity (95% CI: 88.2%-101%). The Kappa test showed 90.7% agreement (Table 6).
Table 5
Comparison of diagnosis result of Scabies
Table 6
Comparison of diagnosis result of All Data
The designed system correctly diagnosed 390 of the 400 selected patients (97.5%). Comparing the expert system’s diagnosis with the diagnosis in the data logged into the system showed 97.1% accuracy, 97.5% sensitivity (95% CI: 96%-99%), and 96.5% specificity (95% CI: 94%-99%). The Kappa test showed 93.6% agreement.
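The overall figures can be checked from a two-by-two table. The following is a minimal sketch, assuming that the 96.5% specificity corresponds to 193 of the 200 control records being classified correctly. The function name and the dictionary keys are hypothetical, and Cohen's kappa is computed from the table margins in the usual way.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy and Cohen's kappa from 2x2 counts."""
    total = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / total

    # Chance agreement from the margins:
    # (system positive x truly positive) + (system negative x truly negative).
    p_expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / total ** 2
    kappa = (accuracy - p_expected) / (1 - p_expected)

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# 390 of 400 diseased patients correct; 193 of 200 controls correct (assumed
# from the reported 96.5% specificity).
metrics = diagnostic_metrics(tp=390, fn=10, tn=193, fp=7)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With these counts the sketch gives values in line with the reported sensitivity, specificity, accuracy, and kappa, which suggests the published indicators are internally consistent.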
4. DISCUSSION
The current study was conducted to assess an expert system designed to diagnose skin diseases that experts consider complex. Comparing the expert system’s diagnosis with the specialists’ diagnosis in the medical records showed that it correctly diagnosed 390 of the 400 patients with the selected diseases and 193 of the 200 other cases. The accuracy was 97.1% (95% CI: 0.958-0.984), the sensitivity 97.5% (95% CI: 0.96-0.99), and the specificity 96.5% (95% CI: 0.94-0.99). The Kappa test showed 93.6% agreement. These values suggest that the expert system correctly identifies 97.5% of patients with the selected diseases and 96.5% of patients free of them, and classifies 97.1% of all cases correctly. The kappa statistic of 93.6% indicates strong chance-corrected agreement between the expert system’s diagnosis and the clinical diagnosis (on the Landis and Koch scale, values above 0.80 are considered almost perfect agreement).
Exarchos et al. (2007) reported an accuracy of 90.4% for ischemic and 94.4% for arrhythmic heartbeats in the evaluation of an expert system designed for the diagnosis of both conditions (7). Wolf and colleagues (2008) reported 89% accuracy in the evaluation of an expert system for headache diagnosis (8). Although these studies evaluated their systems, they measured only accuracy, whereas the current study also determined sensitivity, specificity, and Kappa statistics. Hota’s study (2013), entitled “The diagnosis of breast cancer using smart techniques”, assessed an expert system using accuracy, specificity, and sensitivity (9). Keles and Yavuz (2011) reported an accuracy of 96%, a specificity of 97%, and a sensitivity of 76% in a study to design an expert system for the diagnosis of breast cancer (10).
The above-mentioned studies are consistent with the current study in how they evaluated their systems, but they report different accuracy values, and none of them calculated Kappa statistics.
However, some studies have reported the design of systems that were not evaluated. Doniz and colleagues (2007) and Belmonte and colleagues (1994) designed expert systems for skin allergy and rheumatology, respectively, but did not evaluate them (11, 12). Other studies evaluated the system after design, but with a method different from that of the current study. Adeli and Neshat (2010) developed an expert system to diagnose heart disease, assessed how well its output matched specialists’ views, and reported that in 94% of cases the system’s result was similar to the specialists’ view (13). Sensitivity and specificity were not calculated in that study.
Akter and Uddin (2009) (14) and Karabatak and Cevdet (2009) entered data from patients’ medical records into an expert system to evaluate the proposed system and reported 95.6% accuracy (15). Nucleous and colleagues (2009) reported a sensitivity of 100% and a specificity of 50% in their system assessment (16); that specificity is lower than the 96.5% found in the current study.
Fisher et al. (2007) reported an accuracy of 100% in the evaluation of an expert system for strabismus diagnosis (17). The system designed by Koutsoj and Hatzilygeroudis (2004) showed 79% accuracy (18). The accuracy of the present system (97.1%) was higher than the 95.6% and 79% reported above, though not the 100% reported by Fisher et al. Akter and Uddin (2009) (14), Karabatak and Cevdet (2009) (15), and Koutsoj and Hatzilygeroudis (2004) (18) calculated only the system’s accuracy and did not evaluate sensitivity, specificity, or Kappa agreement.
Therefore, the results of these studies are not directly comparable with those of the present study.
Expert systems can play a significant role in medical decisions only when attention is paid to the stages of their design and assessment, and confirming the performance of these systems is difficult. It is therefore important to evaluate each system and determine its accuracy, sensitivity, and specificity, particularly since many expert systems have no mechanism to control the accuracy of their recommendations.
To make better use of these systems, adding the ability to link pathological and laboratory results is recommended to provide a more accurate diagnosis. Implementing and applying expert systems in teaching hospitals, and providing the infrastructure needed for better access to diagnostic expert systems through new technologies such as personal digital assistants (PDAs) and mobile phones, are also recommended.
5. CONCLUSION
The designed system can diagnose complex skin diseases. Development of systems to identify all skin diseases is recommended.