Vincent Bremer, Philip I Chow, Burkhardt Funk, Frances P Thorndike, Lee M Ritterband.
Abstract
BACKGROUND: User dropout is a widespread concern in the delivery and evaluation of digital (ie, web and mobile apps) health interventions. Researchers have yet to fully realize the potential of the large amount of data generated by these technology-based programs. Of particular interest is the ability to predict who will drop out of an intervention. This may be possible through the analysis of user journey data (self-reported as well as system-generated data) produced by the path (or journey) an individual takes to navigate through a digital health intervention.
Keywords: digital health; dropout; machine learning
Year: 2020 PMID: 33112241 PMCID: PMC7657718 DOI: 10.2196/17738
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Figure 1. Process of analysis. AUC: area under the curve; MAE: mean absolute error; ROC: receiver operating characteristic; RMSE: root mean square error.
Figure 2. Example of data transformation in the context of digital health interventions.
Figure 3. Example of creating aggregated time window–based features for w=3.
Figure 4. Procedure of statistical analysis.
Figure 5. Setup of analysis for dropout prediction.
Aggregation of theory-determined features.
| Feature aggregation method | Handcrafted features | Existing clinically important features |
| Sum: The sum of all observations of a specific feature for an individual | Days since the last contact (any interaction); If sleeping duration is decreasing from core to core; If sleep window duration is 5 or 8 hours | If the participant had an alcoholic drink that day; If the participant took a nap; If the system recorded a triggered event that day; If the participant logged in that day; If the system sent an email that day |
| Last: The last observation of a specific feature for an individual | Difference between preferred arising time in core 2 and core 3; If preferred arising time is greater than 8 AM in core 2; Average time in days to complete a core among all cores that have been available; Time needed in days to complete a core (6 features for core 0-5) | If the participant finished homework in core 2; Number of days where no diaries have been completed in the period of analysis; Precipitating factor includes |
| Mean: Mean of the observations of a specific feature for an individual | Difference between awake and arise time; Difference between preferred arise time and actual arise time (AM/PM); Difference between preferred arise time and actual arise time (minutes); Difference between preferred bedtime and actual bedtime | Naptime in minutes |
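The three aggregation methods in the table above (sum, last, mean) can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout, the field names (`participant_id`, `day`, `took_nap`, `nap_minutes`, `days_to_complete_core`), and the sample values are all hypothetical and chosen only to show how daily observations might be collapsed into one feature row per individual.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical daily observations, one dict per participant per day.
# Field names and values are illustrative assumptions, not study data.
records = [
    {"participant_id": 1, "day": 1, "took_nap": 1, "nap_minutes": 30.0, "days_to_complete_core": 4},
    {"participant_id": 1, "day": 2, "took_nap": 0, "nap_minutes": 0.0, "days_to_complete_core": 4},
    {"participant_id": 1, "day": 3, "took_nap": 1, "nap_minutes": 60.0, "days_to_complete_core": 6},
    {"participant_id": 2, "day": 1, "took_nap": 0, "nap_minutes": 0.0, "days_to_complete_core": 9},
    {"participant_id": 2, "day": 2, "took_nap": 0, "nap_minutes": 0.0, "days_to_complete_core": 9},
]


def aggregate_features(rows):
    """Collapse daily observations into one feature dict per participant."""
    by_person = defaultdict(list)
    # Sort by participant and day so that "last" means the most recent observation.
    for row in sorted(rows, key=lambda r: (r["participant_id"], r["day"])):
        by_person[row["participant_id"]].append(row)

    features = {}
    for pid, obs in by_person.items():
        features[pid] = {
            # Sum: total of all observations of a feature for an individual.
            "took_nap_sum": sum(o["took_nap"] for o in obs),
            # Last: the last observation of a feature for an individual.
            "days_to_complete_core_last": obs[-1]["days_to_complete_core"],
            # Mean: average of the observations of a feature for an individual.
            "nap_minutes_mean": mean(o["nap_minutes"] for o in obs),
        }
    return features


features = aggregate_features(records)
print(features)
```

Which aggregation suits a feature depends on its meaning: event indicators are natural to count with a sum, slowly changing quantities are summarized by their last value, and noisy daily measurements by their mean.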
Figure 6. Heat map of average area under the curve values across core analyses for each model, imputation procedure, and threshold for percentage of missing values. AUC: area under the curve; KNN: k-nearest neighbor; LASSO: least absolute shrinkage and selection operator; SVM: support vector machine.
Figure 7. Receiver operating characteristic for each core analysis based on boosted decision trees (15% missing value deletion, k-nearest neighbor imputation). AUC: area under the curve; FPR: false-positive rate; TPR: true-positive rate.
Figure 8. Five most important features for each core analysis according to boosted decision trees (15% deletion of missing values, and k-nearest neighbor imputation). The x-axis represents the values for each feature, and the y-axis represents the SHAP values. SHAP: SHapley Additive exPlanation; SOL: sleep onset latency; WASO: wake after sleep onset.
Figure 13. Five most important features for each core analysis according to boosted decision trees (15% deletion of missing values, KNN imputation, and Core 5 analysis). SHAP: SHapley Additive exPlanation.
Summary of the unique top 5 most important features across analyses.
| Predictors | | Analysis at each point in time | | | | | |
| Feature | Description | Core 0 | Core 1 | Core 2 | Core 3 | Core 4 | Core 5 |
| Core 0 completion date - intervention start date^a | Time to complete core 0 in days | +^b | + | N/A^c | N/A | N/A | N/A |
| Arise time - awake time^a | Difference between time of awakening and getting out of bed in minutes (time to get up) | + | N/A | N/A | N/A | N/A | N/A |
| Usual arise time | Retrospective report specified from baseline data | + | N/A | N/A | N/A | N/A | N/A |
| Wake after sleep onset | Minutes awake in the middle of the night from sleep diaries | + | + | N/A | N/A | N/A | N/A |
| Sleep onset latency | Minutes to fall asleep from sleep diaries | + | N/A | N/A | N/A | N/A | N/A |
| Baseline arise time (pre retro sleep arising time) | Time the user specified that they got out of bed from baseline data | N/A | + | + | N/A | N/A | N/A |
| Pre retro sleep waking early | User indicates having problems waking up too early in the morning | N/A | + | N/A | N/A | N/A | N/A |
| Pre teach trust info source c | How much the user trusts health information | N/A | + | N/A | N/A | N/A | N/A |
| Average time to complete core^a | Average time to complete a core among all cores that have been available up to the point of the analysis | N/A | N/A | + | + | + | + |
| Pre stpi 24 dep^d,e | How low the user feels at baseline | N/A | N/A | + | N/A | N/A | N/A |
| Pre se gen 3^f | How well the user feels things have been going | N/A | N/A | + | N/A | N/A | N/A |
| Bedtime | If a participant went to bed in the AM or PM (before or after 12 AM) | N/A | N/A | + | N/A | N/A | N/A |
| Email sent^a | If the system sent an email that day | N/A | N/A | N/A | + | N/A | N/A |
| Pre stpi 26 cur^g | How stimulated the user feels at baseline | N/A | N/A | N/A | + | + | N/A |
| Trigger event logged^a | If the system logged a trigger event that day | N/A | N/A | N/A | + | N/A | N/A |
| Pre teach stress 6 | User feels he or she can solve most problems if necessary effort is put in | N/A | N/A | N/A | + | N/A | N/A |
| Pre stpi 18 cur^h | How eager the user feels at baseline | N/A | N/A | N/A | N/A | + | N/A |
| Core 4 completion date - core 4 start date^a | Time to complete core 4 in days | N/A | N/A | N/A | N/A | + | + |
| Pre stpi 29 anx^i | How much self-confidence the user feels at baseline | N/A | N/A | N/A | N/A | + | N/A |
| Days since the last information^a | Days since the last contact (any interaction) | N/A | N/A | N/A | N/A | N/A | + |
| Pre CESD^j 14^k | How lonely the user feels at baseline | N/A | N/A | N/A | N/A | N/A | + |
| Pre retro sleep length of sleep prob | Number of months the user reports having had sleep difficulties at baseline | N/A | N/A | N/A | N/A | N/A | + |
^a Handcrafted/theory-driven features.
^b + indicates appearance of the feature in the corresponding core analysis.
^c N/A: not applicable.
^d STPI: State-Trait Personality Inventory.
^e Pre stpi 24 dep: baseline STPI measure item #24 depression subscale.
^f Pre se gen 3: baseline Perceived Stress Scale item #5.
^g Pre stpi 26 cur: baseline STPI measure item #26 curiosity subscale.
^h Pre stpi 18 cur: baseline STPI measure item #18 curiosity subscale.
^i Pre stpi 29 anx: baseline STPI measure item #29 anxiety subscale.
^j CESD: Center for Epidemiologic Studies Depression Scale.
^k Pre CESD 14: baseline CESD measure item #14.