Fatemeh Rangraz Jeddi1, Masoud Arabfard2, Zahra Arab Kermany3. 1. Faculty of Paramedical Sciences, Kashan University of Medical Science, Kashan, Iran. 2. Tehran University, Kish International Campus, Islamic Republic of Iran. 3. Health Information Technology Department, Kashan University of Medical Science, Kashan, Iran.
Abstract
INTRODUCTION: An intelligent diagnostic assistant can be used for the complicated diagnosis of skin diseases, which are among the most common causes of disability. The aim of this study was to design and implement a computerized intelligent diagnostic assistant for complicated skin diseases using the C5 algorithm. METHOD: An applied-developmental study was conducted in 2015. The knowledge base was developed from interviews with dermatologists using questionnaires and checklists. Knowledge representation was obtained from the training data held in a Microsoft Office Excel database. Clementine software and the C5 algorithm were applied to draw the decision tree. Test accuracy was analyzed based on rules extracted using inference chains. The rules extracted from the decision tree were entered into the CLIPS programming environment, and the intelligent diagnostic assistant was then designed. RESULTS: The rules were defined using the forward chaining inference technique and were entered into the CLIPS programming environment as RULE. The accuracy and error rates of the decision tree in the training phase were 99.56% and 0.44%, respectively. In the test phase, the accuracy was 98% and the error was 2%. CONCLUSION: The intelligent diagnostic assistant can be used as a reliable system with high accuracy, sensitivity, specificity, and agreement.
1. INTRODUCTION
The risk of skin diseases has been reported to be up to 80% (1). According to the World Health Organization’s report in 2013, the disease burden in 15 categories of skin diseases was 3-7 million (2). Utility measures, which express disease burden on a scale from zero to one, have been reported as 0.640 to 1.000 for skin diseases such as pemphigus, compared with 0.650 for HIV and 0.740 for kidney diseases (1). Skin diseases create additional problems because of special features such as physical deformity, which make early detection and treatment necessary, yet the diagnosis of skin diseases is a complicated task. Broad knowledge of skin diseases is therefore indispensable for their diagnosis and treatment (3-5). Systems that capture this knowledge can help dermatologists in diagnosis and can also be used in rural and remote areas where access to dermatologists is not always possible. In particular, intelligent diagnostic assistants are able to simulate and model problem-solving capabilities in a specific field. This is possible because these systems obtain knowledge from an expert and make decisions based on artificial intelligence (6). The knowledge base contains all specialized knowledge related to the problem, and the inference engine acts in a way that is similar to human reasoning (7). Inference techniques can be backward (proving or refuting a pre-specified target) or forward (reaching a goal that is not known in advance). Methods of representing expert knowledge in such systems include object-attribute-value (O-A-V) triples, rules, semantic networks, and frames (7, 8).
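The difference between forward and backward inference described above can be illustrated with a minimal sketch. It is not the system built in this study: the rule base, the symptom names, and the function names `forward_chain` and `backward_chain` are hypothetical and chosen only for illustration.

```python
# Hypothetical rule base: each rule is (set of premises, conclusion).
# Symptom and disease names are illustrative only, not taken from the study.
RULES = [
    ({"itching", "burrows"}, "scabies"),
    ({"blisters", "oral_erosions"}, "pemphigus_vulgaris"),
    ({"scabies"}, "refer_for_treatment"),
]


def forward_chain(facts, rules):
    """Data-driven: fire rules whose premises hold until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules, path=frozenset()):
    """Goal-driven: try to prove one pre-specified goal from the facts."""
    if goal in facts:
        return True
    if goal in path:  # the goal is already being proved on this path (cycle)
        return False
    path = path | {goal}
    return any(
        all(backward_chain(p, facts, rules, path) for p in premises)
        for premises, conclusion in rules
        if conclusion == goal
    )


observed = {"itching", "burrows"}
derived = forward_chain(observed, RULES)
print(sorted(derived - observed))
print(backward_chain("pemphigus_vulgaris", observed, RULES))
```

Forward chaining starts from the observed symptoms and derives whatever follows, which suits a diagnostic setting where the disease is not known beforehand. Backward chaining starts from one candidate diagnosis and checks whether the facts support it.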
2. AIM
The main objective of this study was to design and implement an intelligent diagnostic assistant for complicated skin diseases using the C5 algorithm, based on expert knowledge, a rule-based system, and the forward chaining inference technique.
3. METHOD
This applied-developmental study was conducted in 2015 at Razi Dermatology Hospital, affiliated with Tehran University of Medical Sciences (TUMS). The research was carried out in two phases: designing the knowledge base from the dermatologists’ expert knowledge, and implementing the system, as follows:
3.1. Designing Knowledge Base
Knowledge was acquired using the Delphi technique, through several sessions and interviews with the experts. These sessions used checklists of signs and symptoms developed from the International Classification of Diseases, 10th Revision (ICD-10) and a dermatology semiology textbook. One coordinator organized the meetings, collected the dermatologists’ opinions and suggestions, and consulted with other professionals until a consensus was reached. A total of 106 symptoms were determined for disease diagnosis.
3.2. Extracting and Representing Knowledge
Drawing the Decision Tree from the Training Data
First, five disease classes were considered. For each class, 60 medical records of patients with the selected skin disease were collected (300 records in total), each containing attributes of the disease and of the patient. A further 150 medical records of patients without the selected skin diseases were collected. The training set therefore comprised 450 medical records in six classes (five diseases and one no-disease class), each with its true class label. These records were entered into a Microsoft Office Excel database. Knowledge representation was obtained from the training data in the database, and the decision tree was drawn using Clementine software and the C5 algorithm. The accuracy and error of the drawn decision tree were then calculated.
Testing and Final Analysis of the Accuracy of the Decision Tree
To test the validity of the decision tree, 20 medical records of patients for each of the selected diseases (100 records in total) and 50 medical records of patients without the selected skin diseases were entered into the database, and the accuracy and error rates of the drawn decision tree were calculated on them. An accuracy higher than 90% was taken as the acceptance threshold for the decision tree.
Extracting Rules from the Decision Tree
If/Then/Else rules were extracted from the drawn decision tree using the forward chaining inference technique.
The Implementation of a Diagnostic Intelligent Assistant
The rules extracted from the decision tree were entered into the CLIPS programming environment as RULE (Diagram 1).
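The source does not describe how the C5 implementation in Clementine chooses its splits. As a rough illustration only, the sketch below computes the gain ratio, the split criterion of C4.5 (the published predecessor of C5.0), on a toy table of binary symptoms. The symptom names, the four records, and the function names are hypothetical and are not taken from the study data.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, labels, attribute):
    """Information gain of splitting on `attribute`, normalized by split information."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Toy records: 1 = symptom present, 0 = symptom absent (hypothetical data).
rows = [
    {"burrows": 1, "itching": 1},
    {"burrows": 1, "itching": 0},
    {"burrows": 0, "itching": 1},
    {"burrows": 0, "itching": 0},
]
labels = ["scabies", "scabies", "no_disease", "no_disease"]

best = max(rows[0], key=lambda a: gain_ratio(rows, labels, a))
print(best)
```

In this toy table, `burrows` separates the two classes perfectly while `itching` carries no information, so a tree builder using this criterion would place `burrows` at the root.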
4. RESULTS
According to the dermatologists’ views, five diseases required recognition by an intelligent diagnostic assistant: pemphigus vulgaris (PV), basal cell carcinoma (BCC), lichen planus (LP), malignant melanoma (MM), and scabies.
4.1. Accuracy and Error Rates Obtained in the Training and Test Phases
To prepare the data for drawing the decision tree, the common symptoms used to diagnose the selected diseases were extracted (Table 1). A summarized list of specific and common symptoms of the selected diseases was then developed (Table 2). The accuracy and error rates of the decision tree (Figure 1) in the training phase were 99.56% and 0.44%, respectively (Table 3). In the test phase, the accuracy of the decision tree was 98% and the error was 2% (Table 4). Rules were defined using the forward chaining inference technique and were entered into the CLIPS programming environment as RULE.
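The reported percentages can be checked against the sample sizes given in the Method section. The sketch below is a worked calculation, not output from the study. The counts of correctly classified records (448 and 147) are assumptions, chosen because they are the whole numbers that reproduce the reported rates, and the function name is hypothetical.

```python
def accuracy_and_error(correct, total):
    """Return (accuracy %, error %) rounded to two decimals."""
    accuracy = 100.0 * correct / total
    return round(accuracy, 2), round(100.0 - accuracy, 2)


# Sample sizes stated in the Method section.
train_total = 5 * 60 + 150  # five diseases x 60 records + 150 without disease
test_total = 5 * 20 + 50    # five diseases x 20 records + 50 without disease

# Assumed counts of correct classifications (not stated in the text).
train_correct = 448
test_correct = 147

print(train_total, accuracy_and_error(train_correct, train_total))
print(test_total, accuracy_and_error(test_correct, test_total))
```

Under these assumptions, 99.56% on 450 training records corresponds to 2 misclassified records, and 98% on 150 test records corresponds to 3.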
Table 1
A detailed list of specific and common symptoms of the selected diseases
Table 2
Brief listing of specific and common symptoms of the selected diseases
Diagram 1
Process Design, Implementation and Evaluation of Intelligent Diagnostic Assistant
Table 3
Analysis of Decision Tree in Training Phase. Accuracy: 99.56%; error: 0.44%
Table 4
Analysis of Decision Tree in Testing Phase. Accuracy: 98%; error: 2%
5. DISCUSSION
The purpose of this research was to design and implement an intelligent diagnostic assistant for the diagnosis of complicated skin diseases.
One strength of this study was the preparation of a preliminary checklist for skin diseases using the International Classification of Diseases, 10th Revision (ICD-10). Another was the selection of diseases according to the dermatologists’ preferences.
Analysis of the data showed that, in the training phase, the drawn decision tree diagnosed diseases with an accuracy of 99.56% and an error of 0.44%. In the test phase, the accuracy was 98% and the error was 2%.
Expert systems can be used to assist in the diagnosis of diseases (9). One study suggested more than 60 “If-Then” rules using decision trees for 9 selected diseases (3). Another study extracted rules from the decision tree when designing an expert system used to diagnose neurological diseases (10). Other studies extracted rules with a two-step decision tree procedure of training and testing (11-16), or used the decision tree to extract rules that were then used to design neural networks and estimate the risk of preeclampsia, with an accuracy of 83.6% in the training phase and 93.8% in the testing phase (17). The methods used in these studies are in line with the methodology of the current study, although the accuracy of the decision tree in the current study was higher, at 99.56% in the training phase and 98% in the testing phase.
Seto and colleagues designed a rule-based system and used 105 rules for knowledge representation, but did not train and test a decision tree (18). In another study, a semantic network was used in the knowledge representation phase, which revealed causal relationships between variables and symptoms (19). These approaches differ from that of the current study. 
Although the semantic network is one way of representing knowledge in an intelligent system, rules extracted from a decision tree are much easier to analyze because they are simple and easy for specialists to understand, and they are therefore preferable to a semantic network. Given that expert systems are to be used in disease diagnosis, attention to the training and testing phases and to evaluating the accuracy of the decision tree is essential. It is recommended that the training and testing phases and the evaluation of decision tree accuracy be treated as an inseparable part of the knowledge representation phase when implementing these systems.
To implement the intelligent diagnostic assistant, a rule-based method was used: the If/Then/Else rules extracted from the decision tree were entered into the CLIPS programming environment as RULE using the forward chaining inference technique.
Doniz et al. used a rule-based technique to enter If/Then/Else rules in the knowledge representation phase of their proposed intelligent diagnostic assistant, with forward inference (20). Some intelligent diagnostic assistants used If/Then/Else rules with the CLIPS programming environment and the forward chaining inference technique (21, 23) or with the C programming language (24, 25). The language used in both groups of studies is consistent with our study. In one study, the decision tree was drawn after extraction of the If/Then/Else rules, and the selected rules were fired using forward chaining inference to reach a final diagnosis (26). The CLIPS programming environment was also used in other studies (3, 10-12, 27), which is consistent with the current study.
Fuzzy logic, decision trees, and the forward chaining inference technique were used to determine the severity of preeclampsia (28). 
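The step of turning decision tree paths into rules for CLIPS can be sketched as follows. This is an illustration under stated assumptions, not the authors' rule base: the two-level tree, the symptom names, the fact layout `(symptom <name> <yes|no>)`, and the helper names `tree_to_rules` and `to_clips` are all hypothetical. Only the general `defrule ... => ...` form is standard CLIPS syntax.

```python
# Hypothetical binary tree: an internal node is
# (symptom, subtree_if_present, subtree_if_absent); a leaf is a diagnosis string.
TREE = (
    "burrows",
    "scabies",
    ("flaccid_blisters", "pemphigus_vulgaris", "no_selected_disease"),
)


def tree_to_rules(node, conditions=()):
    """Walk every root-to-leaf path and return (conditions, diagnosis) pairs."""
    if isinstance(node, str):
        return [(list(conditions), node)]
    symptom, present, absent = node
    return (
        tree_to_rules(present, conditions + ((symptom, "yes"),))
        + tree_to_rules(absent, conditions + ((symptom, "no"),))
    )


def to_clips(index, conditions, diagnosis):
    """Render one extracted rule as CLIPS defrule text (assumed fact layout)."""
    patterns = " ".join(f"(symptom {name} {value})" for name, value in conditions)
    return f"(defrule rule-{index} {patterns} => (assert (diagnosis {diagnosis})))"


rules = tree_to_rules(TREE)
for i, (conditions, diagnosis) in enumerate(rules, start=1):
    print(to_clips(i, conditions, diagnosis))
```

Each root-to-leaf path becomes one rule, so a tree with three leaves yields three rules. A forward chaining engine then fires whichever rule matches the symptom facts asserted for a patient.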
The MATLAB programming environment was also applied in an intelligent diagnostic assistant for pneumonia diagnosis (29). Because MATLAB is an interpreted language and its execution speed is much slower than that of compiled languages, it does not work well for complex structures. In addition, its cost is high, and programs written in MATLAB have been reported to cause substantial problems in operation (39). Therefore, to avoid these defects, the C programming language was used in the current study.
Researchers have commonly used the backward chaining inference technique (10) and the C# language (30). Although some studies used a decision tree and the C language, they differ from the current study in the small number of tested records and in the use of backward chaining in the inference engine (10, 30). Fuzzy logic or neural networks were used in designing some systems (15, 19), which is inconsistent with the current study. Given that all conditions in the current study were binary, it was not possible to use fuzzy logic.
Some studies used a neural network in the knowledge representation phase (17), which is not consistent with the current study. The major problem of neural networks is that the user cannot understand how the system reasons and reaches its conclusions. Moreover, learning depends on how the training data are collected, on their amount, and on the number of training iterations. If the training data are incomplete or inadequate, or the number of training iterations is too high or too low, the system cannot make good decisions. The CLIPS program has usually been used to design intelligent diagnostic assistants because of its high flexibility, expandability, and low cost (20).
6. CONCLUSION
Although artificial intelligence has great potential for improving medical decision-making, successful implementation of these systems requires special attention to details, including the use of a knowledge representation method that is appropriate to the system’s target. Attention to these details prepares the ground for the use of these systems, for example by making specialists and physicians aware of the advantages of the intelligent diagnostic assistant in their professional area. Implementing the intelligent diagnostic assistant with a user-friendly interface, to increase physicians’ willingness to use it, is also essential for making effective use of these systems.