Rakesh Raja, Indrajit Mukherjee, Bikash Kanti Sarkar.
Abstract
Preterm birth (PTB) is one of the most serious issues in gynaecology and obstetrics, especially in rural India. In recent years, various clinical prediction models for PTB have been developed with the aim of improving predictive accuracy. However, to the best of the authors' knowledge, most of them struggle to select the most informative features from a medical dataset in linear time. This paper designs a machine learning model, named the risk prediction conceptual model (RPCM), for the prediction of PTB, and proposes a feature selection approach based on the notion of entropy. The approach is used to find the best maternal features (those responsible for PTB) in an obstetrical dataset, with the aim of maximizing classifier accuracy. The paper first reviews PTB cases, a problem neglected in many developing countries, including India. Next, obstetrical data are collected from the Community Health Centre of a rural area (Kamdara, Jharkhand). The proposed approach is then applied to the collected data to identify the best maternal features (text-based symptoms) of pregnant women and to classify all birth cases as term birth or PTB. The machine learning part of the model is implemented with three classifiers, namely decision tree (DT), logistic regression (LR), and support vector machine (SVM). Classifier performance is measured in terms of accuracy, specificity, and sensitivity. The SVM classifier achieves an accuracy of 90.9%, which is higher than that of the other classifiers used in this study.
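
The abstract describes entropy-based feature selection without giving the exact procedure. As a minimal sketch of the general idea, the code below ranks discrete features by information gain (the reduction in class-label entropy after splitting on a feature). This is the textbook criterion, not necessarily the authors' method; the function names and the toy records are hypothetical and are not data from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on one discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(table, labels):
    """Return (feature, gain) pairs sorted from most to least informative."""
    gains = {name: information_gain(column, labels) for name, column in table.items()}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Toy, made-up records (assumption: NOT taken from the study's dataset).
labels = ["PTB", "PTB", "term", "term", "term", "term"]
table = {
    "PE": [1, 1, 0, 0, 0, 0],  # separates the toy labels perfectly
    "EL": [0, 1, 0, 1, 0, 1],  # same label mix in both groups, so zero gain
}
ranking = rank_features(table, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.3f}")
```

Each feature requires one pass over the records, so ranking all features costs time proportional to the number of table cells. Continuous features such as age or hemoglobin would need to be discretized before this criterion applies.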
Year: 2021 PMID: 34234931 PMCID: PMC8219409 DOI: 10.1155/2021/6665573
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Definitions used in the present study.
| Terminology | Description |
|---|---|
| Antenatal care | Antenatal care (ANC) refers to the fundamental clinical and nursing care recommended for women during pregnancy |
| Neonate | A neonate or a newborn infant is a child under 28 days of age |
| Neonatal death | A death during the first 28 days of life (0–27 days) is termed a neonatal death |
| Live birth | A birth at which the child is born alive is termed a live birth |
| Term birth | A birth at the end of a normal duration of pregnancy, between 37 and 40 weeks of gestation, is termed a term birth |
| Maternal death | A maternal death is the death of a woman while pregnant or within 42 days of termination of pregnancy |
| Stillbirth | Stillbirth is the delivery, after the 20th week of pregnancy, of a baby who has died |
| Abortion | Termination of a pregnancy either medically or induced |
| Miscarriage | Natural loss of pregnancy during first trimester |
| Gestational age | Gestational age (GA) refers to the time from the first day of a woman's last menstrual period to birth |
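
The definitions of gestational age and term birth can be turned into a small labelling rule. The sketch below computes completed weeks between the last menstrual period (LMP) and the delivery date, and labels a birth as PTB when it occurs before 37 completed weeks. The two-class labelling (everything from 37 weeks onward is "term"), the function names, and the example dates are assumptions for illustration.

```python
from datetime import date

# Assumption: births before 37 completed weeks are labelled preterm,
# matching the lower bound of the term-birth definition above.
PRETERM_THRESHOLD_WEEKS = 37


def gestational_age_weeks(lmp: date, delivery: date) -> int:
    """Completed weeks from the first day of the LMP to the delivery date."""
    if delivery < lmp:
        raise ValueError("delivery date precedes the last menstrual period")
    return (delivery - lmp).days // 7


def birth_class(lmp: date, delivery: date) -> str:
    """Two-class label: 'PTB' below the threshold, otherwise 'term'."""
    weeks = gestational_age_weeks(lmp, delivery)
    return "PTB" if weeks < PRETERM_THRESHOLD_WEEKS else "term"


# Hypothetical dates, not taken from the study.
example_lmp = date(2020, 1, 1)
early_delivery = date(2020, 9, 1)   # 244 days after the LMP
late_delivery = date(2020, 10, 7)   # 280 days after the LMP

print(gestational_age_weeks(example_lmp, early_delivery), birth_class(example_lmp, early_delivery))
print(gestational_age_weeks(example_lmp, late_delivery), birth_class(example_lmp, late_delivery))
```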
Figure 1. Conceptual model for feature selection approach.
Figure 2. Framework of the proposed model.
Summary of the obstetrical (term-preterm) dataset.
| Problem name | Number of features | Number of classes | Number of instances |
|---|---|---|---|
| Birth case | 36 | 2 | 1300 |
Maternal features associated with PTB.
| S. no. | Feature ID | Feature name |
|---|---|---|
| 1 | PID | Patient identification |
| 2 | WA | Woman age |
| 3 | LMP | Last menstrual period |
| 4 | EDD | Estimated delivery date |
| 5 | G | Gravida |
| 6 | P | Parity |
| 7 | A | Abortion |
| 8 | L | Living |
| 9 | EL | Educational level |
| 10 | H | Height |
| 11 | W | Weight |
| 12 | BMI | Body mass index |
| 13 | BP | Blood pressure |
| 14 | HB | Hemoglobin |
| 15 | ANC | Antenatal care visit |
| 16 | ADD | Actual delivery date |
| 17 | OH | Obstetric history |
| 18 | PCS | Previous caesarean section |
| 19 | GA | Gestational age |
| 20 | BW | Birth weight |
| 21 | GDM | Gestational diabetes mellitus |
| 22 | FHR | Fetal heart rate |
| 23 | MG | Multiple gestation |
| 24 | ND | Normal delivery |
| 25 | MH | Previous medical history |
| 26 | LBW | Low birth weight |
| 27 | ASPX | Asphyxia |
| 28 | HT | Hypertension |
| 29 | PE | Preeclampsia |
| 30 | LV | Live birth |
| 31 | SB | Still birth |
| 32 | OB | Obesity |
| 33 | AN | Anemia |
| 34 | TH | Thyroid |
| 35 | NS | Neonatal status |
| 36 | PTB | Preterm birth |
Summary of discretized (term-preterm) dataset.
| Outcome | Value |
|---|---|
| Number of features | 36 |
| Number of classes | 2 |
| Total instances | 1300 |
| Term birth | 991 |
| Preterm birth | 309 |
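
The class counts above show that the dataset is imbalanced: preterm births make up roughly 24% of the 1300 instances. Results are also reported on a balanced dataset, but this excerpt does not say how balancing was done. The sketch below computes the class shares and illustrates one common option, random oversampling of the minority class. The choice of technique and the function names are assumptions, and the rows are placeholders rather than real records.

```python
import random


def class_shares(counts):
    """Fraction of instances in each class."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def random_oversample(rows, label_of, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes
    have as many rows as the largest class (assumed balancing strategy)."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    target = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(target - len(group)))
    rng.shuffle(balanced)
    return balanced


counts = {"term": 991, "PTB": 309}  # counts from the table above
shares = class_shares(counts)
print({label: round(share, 3) for label, share in shares.items()})

# Placeholder rows (label, index) standing in for patient records.
rows = [("term", i) for i in range(991)] + [("PTB", i) for i in range(309)]
balanced = random_oversample(rows, label_of=lambda row: row[0])
print(len(balanced))
```

With these counts, a classifier that always predicts "term" would already reach about 76% accuracy, which is why sensitivity and specificity are reported alongside accuracy. Oversampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.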
List of excellent features in discretized (term-preterm) dataset.
| Feature code | Feature name | Feature type |
|---|---|---|
| WA | Woman age | Numeric |
| PT | Parity | Numeric |
| GD | Gravida | Numeric |
| BMI | Body mass index | Ordinal |
| ANC | Antenatal care visit | Numeric |
| GA | Gestational age | Numeric |
| FHR | Fetal heart rate | Numeric |
| BP | Blood pressure | Ordinal |
| HB | Hemoglobin | Numeric |
| GDM | Gestational diabetes mellitus | Binary |
| PE | Preeclampsia | Binary |
| HT | Hypertension | Binary |
| OH | Obstetric history | Binary |
| EL | Education level | Ordinal |
| CS | Previous caesarean section | Binary |
| MH | Previous medical history | Binary |
| PTB | Preterm birth (target variable) | Binary |
Figure 3. A conceptual PTB prediction model.
Confusion matrix.
| | Predicted positive | Predicted negative |
|---|---|---|
| Actual positive | True Positive (TP) | False Negative (FN) |
| Actual negative | False Positive (FP) | True Negative (TN) |
Performance metrics for machine learning classifiers.
| Metrics | Formula |
|---|---|
| CCR | ((TP+TN)/(TP+FP+FN+TN))% |
| TPR | TP/(TP+FN) |
| TNR | TN/(TN+FP) |
| FPR | FP/(TN+FP) |
| FNR | FN/(TP+FN) |
| Precision | TP/(TP+FP) |
| Recall | TP/(TP+FN) |
| F-measure | 2 × (Precision × Recall)/(Precision + Recall) |
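
The formulas in this table can be computed directly from the four confusion-matrix counts. The sketch below does so, returning CCR as a fraction rather than a percentage. The F-measure is taken to be the standard harmonic mean of precision and recall, which is an assumption here; the function names and the example counts are hypothetical.

```python
def ratio(numerator, denominator):
    """Safe division: returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def classification_metrics(tp, fn, fp, tn):
    """Metrics derived from confusion-matrix counts (all as fractions)."""
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # identical to TPR (sensitivity)
    return {
        "CCR": ratio(tp + tn, tp + fp + fn + tn),  # accuracy
        "TPR": recall,                             # sensitivity
        "TNR": ratio(tn, tn + fp),                 # specificity
        "FPR": ratio(fp, tn + fp),
        "FNR": ratio(fn, tp + fn),
        "Precision": precision,
        "Recall": recall,
        # Assumed definition: harmonic mean of precision and recall.
        "F-measure": ratio(2 * precision * recall, precision + recall),
    }


# Hypothetical counts, not results from the study.
metrics = classification_metrics(tp=40, fn=10, fp=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Two identities are useful as sanity checks: TPR + FNR = 1 and TNR + FPR = 1.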
Performance metrics of the classifiers—original dataset.
| Classifiers | Accuracy | Sensitivity | Specificity |
|---|---|---|---|
| DT | 0.777 | 0.702 | 0.930 |
| LR | 0.841 | 0.863 | 0.971 |
| SVM | | 0.801 | 0.702 |
Performance metrics of the classifiers—balanced dataset.
| Classifiers | Accuracy | Sensitivity | Specificity |
|---|---|---|---|
| DT | 0.796 | 0.713 | 0.972 |
| LR | 0.872 | 0.832 | 0.954 |
| SVM | | 0.891 | 0.783 |