Sadaf Qazi (1, 2), Muhammad Usman (1, 2), Azhar Mahmood (1, 2)
1. Faculty of Computing and Engineering Science, Shaheed Zulfiqar Ali Bhutto Institute of Science and Technology, Islamabad, Pakistan.
2. Predictive Analytics Lab (National Center for Cloud Computing and Big Data), Islamabad, Pakistan.
Muhammad Usman: dr.usman@szabist-isb.edu.pk
Abstract
BACKGROUND: Pakistan has a nationwide expanded program on immunization (EPI), yet vaccination coverage in the country remains low. An analytical model was recently proposed to improve coverage by identifying defaulters, that is, children who are most likely to miss any of the vaccines included in the immunization schedule. However, that model has two unresolved limitations. Firstly, it classified children into only two classes, defaulters and non-defaulters, so every child was considered at high risk of defaulting even if only one dose was missed. Secondly, it did not categorise areas into high and low coverage for prioritised vaccination. The aim of this study was to propose a prediction framework for the accurate identification of defaulters.
METHODS: We used a sample dataset extracted from the Pakistan Demographic and Health Survey (PDHS, 2017-2018). It contained 7153 records with 19 demographic and socioeconomic attributes, which were used for defaulter prediction and for mining association rules that relate the demographics of the child to vaccination status.
RESULTS: Using a multilayer perceptron (MLP) classifier, the proposed model achieved 98% accuracy and an area under the curve (AUC) of 0.994 in identifying children who are likely to default from the immunization series at different risk stages.
CONCLUSION: The proposed framework is a step towards a data-driven approach and provides a set of machine learning techniques for applying predictive analytics. It can therefore reinforce immunization programs by expediting targeted action to reduce drop-outs.
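The two metrics reported in the results, accuracy and AUC, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the labels and scores below are invented toy values, the helper names `accuracy` and `binary_auc` are hypothetical, and the example covers only the binary case (defaulter versus non-defaulter), whereas the study distinguishes several risk stages and the abstract does not say how its AUC was aggregated across them.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive receives a
    higher score than a randomly chosen negative (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumption): 1 = defaulter, 0 = non-defaulter.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical classifier scores, e.g. predicted probability of defaulting.
scores = [0.9, 0.8, 0.4, 0.3, 0.5, 0.1, 0.2, 0.7]
# Hard predictions from an assumed 0.5 decision threshold.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))
print("AUC:", binary_auc(y_true, scores))
```

Accuracy depends on the chosen threshold, while AUC summarises how well the scores rank defaulters above non-defaulters across all thresholds, which is why the two numbers are usually reported together.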