Identifying β-thalassemia carriers using a data mining approach: The case of the Gaza Strip, Palestine.

Alaa S AlAgha, Hossam Faris, Bassam H Hammo, Ala' M Al-Zoubi.

Abstract

Thalassemia is one of the most common genetic blood disorders and has received extensive attention in medical research worldwide. In this context, one of the greatest challenges for healthcare professionals is to correctly differentiate normal individuals from asymptomatic thalassemia carriers. Thalassemia diagnosis is usually based on characteristic, measurable changes in blood cell counts and related indices. These changes can be derived easily from a complete blood count (CBC) test performed on a fully automated blood analyzer or counter. However, the reliability of the CBC test alone is questionable, because the same candidate characteristics can also be seen in other disorders, which leads to misdiagnosis of thalassemia. Other costly and time-consuming tests must therefore be performed, and the resulting delay in correct diagnosis may have serious consequences. To help overcome these diagnostic challenges, this work presents a novel dataset collected from the Palestine Avenir Foundation for persons tested for thalassemia. We aim to compile a gold standard dataset for thalassemia and make it available to researchers in this field. Moreover, we use this dataset to predict the specific type of thalassemia known as beta thalassemia (β-thalassemia) using a hybrid data mining model. The proposed model consists of two main steps. First, to overcome the highly imbalanced class distribution in the dataset, the SMOTE balancing technique is applied. Second, four classification models, namely k-nearest neighbors (k-NN), naïve Bayes (NB), decision tree (DT) and the multilayer perceptron (MLP) neural network, are used to differentiate between normal persons and β-thalassemia carriers. Different evaluation metrics are used to assess the performance of the proposed model.
The experimental results show that SMOTE oversampling can effectively improve the identification ratio of β-thalassemia carriers under a highly imbalanced class distribution. The results also reveal that the NB classifier achieved the best performance in differentiating between normal persons and β-thalassemia carriers at a SMOTE oversampling ratio of 400%. This combination achieves a specificity of 99.47% and a sensitivity of 98.81%.
Copyright © 2018 Elsevier B.V. All rights reserved.
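
The two ideas the abstract relies on, SMOTE oversampling at a given percentage and the sensitivity/specificity metrics, can be illustrated with a minimal sketch. This is not the authors' code or data. The function names, the toy feature vectors and the toy labels below are hypothetical, and the interpolation follows the standard SMOTE definition (each synthetic point lies on the segment between a minority sample and one of its k nearest minority neighbours). A ratio of 400% is taken to mean four synthetic samples per original minority sample.

```python
import math
import random


def smote(minority, ratio_percent=400, k=5, seed=0):
    """Return synthetic minority samples built by SMOTE-style interpolation.

    ratio_percent=400 creates 4 synthetic points per original minority point.
    """
    rng = random.Random(seed)
    per_sample = ratio_percent // 100
    synthetic = []
    for i, x in enumerate(minority):
        # k nearest minority-class neighbours of x, excluding x itself
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        for _ in range(per_sample):
            nb = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from x to nb
            synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: three minority-class samples with two made-up features.
minority = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2]]
synthetic = smote(minority, ratio_percent=400, k=2)
print(len(synthetic), "synthetic samples generated")

# Hypothetical labels and predictions (1 = carrier, 0 = normal).
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.4f} specificity={spec:.4f}")
```

In practice the synthetic samples would be appended to the training split only, and a classifier such as naïve Bayes would then be fitted on the balanced data and scored on untouched test data; the sketch leaves that step out.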

Entities:  

Keywords:  Classification; Data mining; Imbalance; Medical dataset; Oversampling; SMOTE; Thalassemia

Year:  2018        PMID: 29730048     DOI: 10.1016/j.artmed.2018.04.009

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


  3 in total

1.  Application of Bayesian Decision Tree in Hematology Research: Differential Diagnosis of β-Thalassemia Trait from Iron Deficiency Anemia.

Authors:  Mina Jahangiri; Fakher Rahim; Najmaldin Saki; Amal Saki Malehi
Journal:  Comput Math Methods Med       Date:  2021-11-09       Impact factor: 2.238

Review 2.  Clinical Applications of Artificial Intelligence-An Updated Overview.

Authors:  Ștefan Busnatu; Adelina-Gabriela Niculescu; Alexandra Bolocan; George E D Petrescu; Dan Nicolae Păduraru; Iulian Năstasă; Mircea Lupușoru; Marius Geantă; Octavian Andronic; Alexandru Mihai Grumezescu; Henrique Martins
Journal:  J Clin Med       Date:  2022-04-18       Impact factor: 4.964

Review 3.  Diagnosis support systems for rare diseases: a scoping review.

Authors:  Carole Faviez; Xiaoyi Chen; Nicolas Garcelon; Antoine Neuraz; Bertrand Knebelmann; Rémi Salomon; Stanislas Lyonnet; Sophie Saunier; Anita Burgun
Journal:  Orphanet J Rare Dis       Date:  2020-04-16       Impact factor: 4.123

