Abstract
OBJECTIVES: The Indian Liver Patient Dataset (ILPD) is used extensively to create algorithms that predict liver disease. Given the existing research describing demographic inequities in liver disease diagnosis and management, these algorithms require scrutiny for potential biases. We address this overlooked issue by investigating ILPD models for sex bias.
Keywords: Artificial intelligence; BMJ Health Informatics; Health Equity; Machine Learning; Public health informatics
Year: 2022 PMID: 35470133 PMCID: PMC9039354 DOI: 10.1136/bmjhci-2021-100457
Source DB: PubMed Journal: BMJ Health Care Inform ISSN: 2632-1009
Summary counts of classes in the Indian Liver Patient Dataset, including counts after the dataset is balanced
| Sex | Target (disease=1) | Dataset 1 | Dataset 1 total per sex | Dataset 2 (oversampled minority class) | Dataset 2 total per sex | Dataset 3 | Dataset 3 total per sex |
| Female | 0 | 50 | 142 | 145 | 237 | 408 | 595 |
| Female | 1 | 92 | 142 | 92 | 237 | 187 | 595 |
| Male | 0 | 117 | 441 | 271 | 595 | 271 | 595 |
| Male | 1 | 324 | 441 | 324 | 595 | 324 | 595 |
| Total | | 583 | | 832 | | 1190 | |
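The counts above are consistent with random oversampling: in Dataset 2 the non-disease class (target 0) grows from 167 to 416 records to match the disease class, and in Dataset 3 the female group grows to match the 595 male records. The following is a minimal sketch of that balancing step, assuming simple random oversampling with replacement. The exact resampling procedure is not given in this record, and `oversample_to_match` is a hypothetical helper, not the authors' code.

```python
import random
from collections import Counter


def oversample_to_match(records, key, seed=0):
    """Randomly duplicate records (with replacement) so that every group
    defined by `key` becomes as large as the largest group.
    Hypothetical helper for illustration only."""
    rng = random.Random(seed)
    groups = {}
    for rec in records:
        groups.setdefault(rec[key], []).append(rec)
    target_size = max(len(members) for members in groups.values())
    balanced = []
    for members in groups.values():
        balanced.extend(members)
        # Draw the missing records at random from the same group.
        balanced.extend(rng.choices(members, k=target_size - len(members)))
    return balanced


# Dataset 1 counts taken from the table: (sex, target) -> number of records.
dataset1_counts = {("F", 0): 50, ("F", 1): 92, ("M", 0): 117, ("M", 1): 324}
dataset1 = [
    {"sex": sex, "target": target}
    for (sex, target), n in dataset1_counts.items()
    for _ in range(n)
]

# Assumed step 1: balance the target classes (cf. Dataset 2).
dataset2 = oversample_to_match(dataset1, key="target")
# Assumed step 2: balance the sexes (cf. Dataset 3).
dataset3 = oversample_to_match(dataset2, key="sex", seed=1)

target_counts_2 = Counter(rec["target"] for rec in dataset2)
sex_counts_3 = Counter(rec["sex"] for rec in dataset3)
print(len(dataset2), dict(target_counts_2))
print(len(dataset3), dict(sex_counts_3))
```

Because the duplicated records are drawn at random, the class totals in the second dataset match the table (416 per class, 832 overall), but the per-sex split and the size of the third dataset depend on the draw and will not necessarily reproduce the published counts.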
Experiment 3.1.1—unbalanced training data without feature selection, sex performance disparities
| Mean difference averaged over n=100 | Random forest classifier | | Logistic regression classifier | | Support vector machine | | Gaussian Naïve Bayes | |
| Metric | Sex performance disparities (%) | t-test | Sex performance disparities (%) | t-test | Sex performance disparities (%) | t-test | Sex performance disparities (%) | t-test |
| Accuracy | 2.96 | 0.00 | −2.85 | 0.01 | −2.98 | 0.02 | −2.72 | 0.02 |
| FScore | 15.63 | 0.00 | 15.86 | 0.00 | 4.14 | 0.00 | 16.19 | 0.00 |
| ROC_AUC* | 6.80 | 0.00 | 2.93 | 0.00 | −2.41 | 0.08 | 5.53 | 0.00 |
| Precision | 5.25 | 0.00 | −4.87 | 0.00 | 3.41 | 0.00 | −3.13 | 0.05 |
| Recall | 21.02 | 0.00 | 24.07 | 0.00 | 2.58 | 0.04 | 19.31 | 0.00 |
| False negative rate | −21.02 | 0.00 | −24.07 | 0.00 | −2.58 | 0.08 | −19.31 | 0.00 |
| True negative rate | −7.42 | 0.00 | −18.20 | 0.00 | −7.40 | 0.00 | −8.24 | 0.00 |
| False positive rate | 7.42 | 0.00 | 18.20 | 0.00 | 7.40 | 0.00 | 8.24 | 0.00 |
| True positive rate | 21.02 | 0.00 | 24.07 | 0.00 | 2.58 | 0.04 | 19.31 | 0.00 |
*ROC AUC score is a measure of the separation between classes in a binary classifier, derived from the area under the ROC curve.
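Several rows in these tables are linked by definition: recall is the true positive rate, the false negative rate is 1 − TPR, and the false positive rate is 1 − TNR. This is why the false negative rate and true negative rate disparities generally mirror the true positive rate and false positive rate disparities with the opposite sign. A minimal sketch of these quantities follows, assuming a disparity is one group's metric minus the other's in percentage points. The direction of subtraction is not stated in this record, `rates`, `disparity`, and `roc_auc` are hypothetical helpers, and the labels are toy values, not ILPD data.

```python
def rates(y_true, y_pred):
    """Confusion-matrix rates for binary labels (1 = disease)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return {
        "TPR": tp / (tp + fn),  # recall
        "FNR": fn / (tp + fn),  # 1 - TPR
        "TNR": tn / (tn + fp),
        "FPR": fp / (tn + fp),  # 1 - TNR
    }


def disparity(rates_a, rates_b):
    """Difference in percentage points, group A minus group B (assumed direction)."""
    return {name: 100 * (rates_a[name] - rates_b[name]) for name in rates_a}


def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels and predictions for two groups (illustrative only).
truth = [1, 1, 1, 1, 0, 0, 0, 0]
group_a = rates(truth, [1, 1, 1, 0, 1, 0, 0, 0])
group_b = rates(truth, [1, 1, 0, 0, 0, 0, 0, 0])
diff = disparity(group_a, group_b)
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(diff, auc)
```

In this toy case the TPR and FNR differences have equal magnitude and opposite sign, as do the TNR and FPR differences, matching the pattern in most rows of the tables.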
Experiment 3.1.2—balanced training data without feature selection, sex performance disparities
| Mean difference averaged over n=100 | Random forest classifier | | Logistic regression classifier | | Support vector machine | | Gaussian Naïve Bayes | |
| Metric | Sex performance disparities (%) | t-test | Sex performance disparities (%) | t-test | Sex performance disparities (%) | t-test | Sex performance disparities (%) | t-test |
| Accuracy | −6.17 | 0.00 | −6.36 | 0.00 | −11.47 | 0.00 | −7.43 | 0.00 |
| FScore | 7.69 | 0.00 | 20.17 | 0.00 | −3.40 | 0.00 | 16.65 | 0.00 |
| ROC_AUC | 0.60 | 0.13 | 4.79 | 0.00 | −9.06 | 0.00 | 5.45 | 0.00 |
| Precision | −0.94 | 0.88 | −4.75 | 0.00 | −2.32 | 0.14 | 0.24 | 0.37 |
| Recall | 12.88 | 0.00 | 29.22 | 0.00 | −4.64 | 0.00 | 19.82 | 0.00 |
| False negative rate | −12.88 | 0.00 | −29.22 | 0.00 | 4.64 | 0.00 | −19.82 | 0.00 |
| True negative rate | −11.69 | 0.00 | −19.65 | 0.00 | −13.49 | 0.00 | −8.93 | 0.00 |
| False positive rate | 11.69 | 0.00 | 19.65 | 0.00 | 13.49 | 0.00 | 8.93 | 0.00 |
| True positive rate | 12.88 | 0.00 | 29.22 | 0.00 | −4.64 | 0.00 | 19.82 | 0.00 |
Experiment 3.1.3—unbalanced training data with feature selection, sex performance disparities
| | Random forest classifier | | Logistic regression classifier | | Support vector machine | | Gaussian Naïve Bayes | |
| Metric | Sex performance disparities (%) | t-test | Sex performance disparities (%) | t-test | Sex performance disparities (%) | t-test | Sex performance disparities (%) | t-test |
| Accuracy | 3.42 | 0.00 | −2.90 | 0.01 | −2.75 | 0.01 | −3.31 | 0.00 |
| FScore | 15.36 | 0.00 | 15.79 | 0.00 | 16.50 | 0.00 | 15.29 | 0.00 |
| ROC_AUC | 6.61 | 0.00 | 3.60 | 0.00 | 4.90 | 0.00 | 4.99 | 0.00 |
| Precision | 9.85 | 0.00 | 0.24 | 0.44 | −0.87 | 0.90 | −3.41 | 0.03 |
| Recall | 18.21 | 0.00 | 21.24 | 0.00 | 20.30 | 0.00 | 18.54 | 0.00 |
| False negative rate | −18.21 | 0.00 | −21.24 | 0.00 | −20.30 | 0.00 | −18.54 | 0.00 |
| True negative rate | −4.99 | 0.00 | −14.04 | 0.00 | −10.50 | 0.00 | −8.57 | 0.00 |
| False positive rate | 4.99 | 0.00 | 14.04 | 0.00 | 10.50 | 0.00 | 8.57 | 0.00 |
| True positive rate | 18.21 | 0.00 | 21.24 | 0.00 | 20.30 | 0.00 | 18.54 | 0.00 |
Experiment 3.1.4—balanced training data with feature selection, sex performance disparities
| | Random forest classifier | | Logistic regression classifier | | Support vector machine | | Gaussian Naïve Bayes | |
| Metric | Sex performance disparities (%) | t-test | Sex performance disparities (%) | t-test | Sex performance disparities (%) | t-test | Sex performance disparities (%) | t-test |
| Accuracy | −5.62 | 0.00 | −6.80 | 0.00 | −6.19 | 0.00 | −4.64 | 0.00 |
| FScore | 7.86 | 0.00 | 14.39 | 0.00 | 16.46 | 0.00 | 21.63 | 0.00 |
| ROC_AUC | −0.05 | 0.46 | 3.57 | 0.00 | 5.95 | 0.00 | 8.17 | 0.00 |
| Precision | 4.60 | 0.00 | 9.28 | 0.00 | 12.82 | 0.00 | 9.35 | 0.00 |
| Recall | 9.70 | 0.00 | 15.51 | 0.00 | 15.38 | 0.00 | 22.78 | 0.00 |
| False negative rate | −9.70 | 0.00 | −15.51 | 0.00 | −15.38 | 0.00 | −22.78 | 0.00 |
| True negative rate | −9.79 | 0.00 | −8.37 | 0.00 | −3.47 | 0.00 | −6.44 | 0.00 |
| False positive rate | 9.79 | 0.00 | 8.37 | 0.00 | 3.47 | 0.00 | 6.44 | 0.00 |
| True positive rate | 9.70 | 0.00 | 15.51 | 0.00 | 15.38 | 0.00 | 22.78 | 0.00 |