Kiran Saqib, Amber Fozia Khan, Zahid Ahmad Butt.
Abstract
BACKGROUND: Machine learning (ML) offers vigorous statistical and probabilistic techniques that can successfully predict certain clinical conditions using large volumes of data. A review of ML and big data research analytics in maternal depression is pertinent and timely, given the rapid technological developments in recent years.
Keywords: big data; machine learning; mobile phone; postpartum depression
Year: 2021 PMID: 34822337 PMCID: PMC8663566 DOI: 10.2196/29838
Source DB: PubMed Journal: JMIR Ment Health ISSN: 2368-7959
Figure 1. PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) procedural flowchart. ML: machine learning; PPD: postpartum depression.
Summary of the main study characteristics (N=14).
| # | Study | Aims or objectives | Sample size; input data used | Diagnosis criteria for PPDa |
| 1 | Jiménez-Serrano et al | Develop classification models for detecting the risk of PPD during the first week after childbirth | 1880; hospital data | EPDSb >9; 8th or 32nd week postbirth |
| 2 | Betts et al | Develop a prediction model to identify women at risk of postpartum psychiatric admission | 75,054; linked administrative health data | ICDc-10 |
| 3 | Tortajada et al | Obtain a classification model based on a feedforward multilayer perceptron to improve PPD prediction during the 32 weeks after childbirth with high sensitivity and specificity | 1397; hospital data | EPDS >9; 8th or 32nd week postbirth |
| 4 | Wang et al | Develop a PPD prediction model using EHRsd | 179,980; EHRs | ICD-10-CM codes O99.3 and O99.34, and their ICD-9-CM equivalents, for a diagnosis of PPD within 12 months after childbirth |
| 5 | Zhang et al | Compare the effects of 4 different MLe models using data during pregnancy to predict PPD | 508; hospital data | EPDS >9.5; within 42 days postdelivery |
| 6 | Zhang et al | Propose an ML framework for PPD risk prediction | 17,633 and 71,106; 2 data sets from EHRs | PPD within 1 year of childbirth |
| 7 | Hochman et al | Apply an ML approach to create a prediction tool for PPD to be implemented in health care systems | 214,359; EHRs | PPD within the first year postpartum (ICD-9 codes 300 and 309, or ICD-10 codes F40-F48) or acute psychotic or manic episodes (ICD-9 codes 296.0, 296.1, 296.4, 296.6, 296.81, 298.3, 298.4, 298.8) |
| 8 | De Choudhury et al | Detect and predict PPD | 165; Facebook survey using PHQf-9 | PHQ-9 |
| 9 | Natarajan et al | Propose an ML-based approach for PPD prediction and diagnosis from survey information | 207; Facebook and Twitter survey data | Postpartum Depression Predictors Inventory |
| 10 | Fatima et al | Use linguistic features to propose a PPD solution that can be generalized and deployed across web-based social platforms | 21; text posts from Reddit | PPD based on linguistic features |
| 11 | Trifan et al | Use social media for potential diagnosis of mothers at risk of PPD and thus the implementation of early interventions | 512; Reddit text posts | Not described |
| 12 | Shatte et al | Identify fathers at risk of PPD | 365; Reddit text posts | ICD-10 depression; symptoms 0-6 months postbirth |
| 13 | Moreira et al | Propose an algorithm for emotion-aware smart systems, capable of predicting the risk of PPD during pregnancy through biomedical and sociodemographic data analysis | Performance evaluation used data generated by wearable devices and sensors | Not described |
| 14 | Shin et al | Develop predictive models for PPD using ML approaches | 28,755; pregnancy risk assessment and monitoring system data | PHQ-2 |
aPPD: postpartum depression.
bEPDS: Edinburgh Postnatal Depression Scale.
cICD: International Classification of Diseases.
dEHR: electronic health record.
eML: machine learning.
fPHQ: Patient Health Questionnaire.
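Several of the studies above define PPD cases by thresholding a screening score, most commonly the Edinburgh Postnatal Depression Scale (EPDS >9). A minimal sketch of that labeling step, assuming a simple list of scores (function name and cutoff default are illustrative, not taken from any of the reviewed studies):

```python
def epds_label(scores, cutoff=9):
    """Binary PPD labels from EPDS screening scores.

    Returns 1 (at risk of PPD) when a score strictly exceeds the cutoff,
    mirroring the EPDS > 9 criterion used by several studies in the table.
    """
    return [1 if s > cutoff else 0 for s in scores]

print(epds_label([5, 10, 9, 14]))  # [0, 1, 0, 1]
```

Note that study 5 (Zhang et al) uses a slightly different cutoff (EPDS >9.5), which the `cutoff` parameter would accommodate.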
Summary of the ML algorithms, validation methods, and performance results (N=14).
| # | Study | Validation method | MLa algorithms used | Best-performing algorithm |
| 1 | Jiménez-Serrano et al | Hold-out validation | Naive Bayes, LRb, SVMc, ANNd | Naive Bayes model; G function value of 0.73 |
| 2 | Betts et al | 5-fold cross-validation in R | Gradient boosting, elastic net methods | Boosted trees algorithm (AUCe 0.80, 95% CI 0.76-0.83) |
| 3 | Tortajada et al | Hold-out validation | ANN | Multilayer perceptron; G of 0.82 and accuracy of 0.81 (95% CI 0.76-0.86), with sensitivity of 0.84 and specificity of 0.81 |
| 4 | Wang et al | 10-fold cross-validation | SVM, RFf, Naive Bayes, L2-regularized LR, XGBoostg, DTh | SVM (AUC 0.79) |
| 5 | Zhang et al | sklearn.cross_validation package in Python | SVM, RF | SVM with RF feature selection (sensitivity 0.69; AUC 0.78) |
| 6 | Zhang et al | 5-fold cross-validation | RF, DT, XGBoost, regularized LR, multilayer perceptron | LR with L2 regularization (AUC 0.937, 95% CI 0.912-0.962) |
| 7 | Hochman et al | Hold-out validation | XGBoost | AUC of 0.712 (95% CI 0.690-0.733), with sensitivity of 0.349 and specificity of 0.905 |
| 8 | De Choudhury et al | Not described | Series of statistical regression models | Postnatal model |
| 9 | Natarajan et al | Information not provided | Functional gradient boosting, DT, SVM, NBi | Functional gradient boosting (AUC-ROC 0.952) |
| 10 | Fatima et al | 10-fold cross-validation | LR, SVM, multilayer perceptron | Multilayer perceptron; 91.7% accuracy for depressive content identification and up to 86.9% accuracy for PPD content prediction |
| 11 | Trifan et al | Hold-out validation | SVM, stochastic gradient descent, passive aggressive classifiers | SVM |
| 12 | Shatte et al | 10-fold cross-validation | SVM classifiers using behavior, emotion, linguistic style, and discussion topics as features | 0.67 precision, 0.68 recall, and 0.67 F-measure for the model including all features |
| 13 | Moreira et al | 10-fold cross-validation | DT, SVM, nearest neighbor, ensemble classifiers | Ensemble classifiers |
| 14 | Shin et al | 10-fold cross-validation | RF, stochastic gradient boosting, SVM, regression trees, NB, k-nearest neighbor, LR, ANN | RF (AUC 0.884) |
aML: machine learning.
bLR: logistic regression.
cSVM: support vector machine.
dANN: artificial neural network.
eAUC: area under the curve.
fRF: random forest.
gXGBoost: Extreme Gradient Boosting.
hDT: decision tree.
iNB: Naive Bayes.
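Most studies in the table compare several classifiers under k-fold cross-validation and report AUC. A hedged sketch of that workflow with scikit-learn (which study 5 references), on synthetic imbalanced data; the dataset, features, and hyperparameters are illustrative stand-ins, not the studies' actual pipelines:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Synthetic class-imbalanced data standing in for a PPD cohort,
# where positive (depressed) cases are the minority class.
X, y = make_classification(n_samples=500, n_features=20,
                           weights=[0.85], random_state=0)

# A subset of the algorithm families that recur across the table.
models = {
    "LR (L2)": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "SVM": SVC(random_state=0),
    "Naive Bayes": GaussianNB(),
}

# 5-fold cross-validated AUC, as reported by studies 2 and 6.
for name, model in models.items():
    aucs = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean AUC {aucs.mean():.3f}")
```

AUC is a natural headline metric here because it is insensitive to the classification threshold, which matters when, as in these cohorts, PPD-positive cases are rare.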