Muhammad Afzal (1), Maqbool Hussain (2), Wajahat Ali Khan (3), Taqdir Ali (4), Sungyoung Lee (5), Eui-Nam Huh (6), Hafiz Farooq Ahmad (7), Arif Jamshed (8), Hassan Iqbal (9), Muhammad Irfan (10), Manzar Abbas Hydari (11).
1. Department of Computer Science and Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, South Korea; Department of Software, Sejong University, South Korea. Electronic address: muhammad.afzal@oslab.khu.ac.kr.
2. Department of Computer Science and Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, South Korea; Department of Software, Sejong University, South Korea. Electronic address: maqbool.hussain@oslab.khu.ac.kr.
3. Department of Computer Science and Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, South Korea. Electronic address: wajahat.alikhan@oslab.khu.ac.kr.
4. Department of Computer Science and Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, South Korea. Electronic address: taqdir.ali@oslab.khu.ac.kr.
5. Department of Computer Science and Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, South Korea. Electronic address: sylee@oslab.khu.ac.kr.
6. Department of Computer Science and Engineering, Kyung Hee University, Seocheon-dong, Giheung-gu, South Korea. Electronic address: johnhuh@khu.ac.kr.
7. College of Computer Sciences and Information Technology (CCSIT), King Faisal University, Alahsa, Saudi Arabia. Electronic address: hfahmad@kfu.edu.sa.
8. Shaukat Khanum Memorial Cancer Hospital and Research Center, Lahore, Pakistan. Electronic address: arifj@skm.org.pk.
9. Department of Otolaryngology and Head and Neck Surgery, The Ohio State University, USA. Electronic address: drhassaniqbal@gmail.com.
10. Shaukat Khanum Memorial Cancer Hospital and Research Center, Lahore, Pakistan. Electronic address: mirfan@skm.org.pk.
11. Shaukat Khanum Memorial Cancer Hospital and Research Center, Lahore, Pakistan. Electronic address: Manzar@skm.org.pk.
Abstract
BACKGROUND: A wealth of clinical data exists in clinical documents in the form of electronic health records (EHRs). These data can be used to develop knowledge-based recommendation systems that assist clinicians in clinical decision making and education. A major hurdle in developing such systems is the lack of automated mechanisms for knowledge acquisition that enable and educate clinicians in informed decision making.
MATERIALS AND METHODS: An automated knowledge acquisition methodology with a comprehensible knowledge model for cancer treatment (CKM-CT) is proposed. With the CKM-CT, clinical data are acquired automatically from documents. Data quality is ensured by correcting errors and transforming various formats into a standard data format. Data preprocessing involves dimensionality reduction and missing value imputation. The predictive algorithm is selected on the basis of its ranking score under the weighted sum model. The knowledge builder prepares knowledge for knowledge-based services: clinical decision support and education support.
RESULTS: Data are acquired from 13,788 head and neck cancer (HNC) documents for 3447 patients, including 1526 patients with cancer of the oral cavity. In the data quality task, 160 staging values are corrected. In the preprocessing task, 20 attributes and 106 records are eliminated from the dataset. The Classification and Regression Trees (CRT) algorithm is selected and provides 69.0% classification accuracy in predicting HNC treatment plans; the resulting model consists of 11 decision paths that yield 11 decision rules.
CONCLUSION: The proposed methodology, CKM-CT, helps to find hidden knowledge in clinical documents. In CKM-CT, prediction models are developed to assist and educate clinicians in informed decision making. The methodology can be generalized to data from other domains, such as breast cancer, with the similar objective of assisting clinicians in decision making and education.
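The abstract states that the predictive algorithm is selected by the ranking score of a weighted sum model but does not list the criteria, weights, or candidate algorithms. The following is a minimal sketch of how such a ranking could be computed, not the authors' implementation. The function name `weighted_sum_rank`, the criteria `accuracy` and `comprehensibility`, the candidate names, and every numeric value are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def weighted_sum_rank(
    scores: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Rank candidates by the weighted sum of their criterion scores.

    Each criterion score is assumed to be normalized to [0, 1], with
    higher meaning better. Weights are normalized to sum to 1.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    ranked = []
    for name, criteria in scores.items():
        # Weighted sum model: score = sum_j (w_j * a_ij) / sum_j w_j
        score = sum(weights[c] * criteria[c] for c in weights) / total_weight
        ranked.append((name, score))

    # Highest weighted sum first.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical candidates and criterion scores (illustrative values only,
# not results from the study).
candidates = {
    "algorithm_a": {"accuracy": 0.70, "comprehensibility": 0.90},
    "algorithm_b": {"accuracy": 0.75, "comprehensibility": 0.40},
    "algorithm_c": {"accuracy": 0.60, "comprehensibility": 0.80},
}

# Hypothetical weights: both criteria treated as equally important.
weights = {"accuracy": 0.5, "comprehensibility": 0.5}

ranking = weighted_sum_rank(candidates, weights)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With equal weights, a candidate that is slightly less accurate but far easier to interpret can outrank the most accurate one, which is consistent with the abstract's emphasis on a comprehensible knowledge model. Changing the weights changes the ranking, so the choice of weights is itself a design decision.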