Application of Machine Learning Algorithms for Asthma Management with mHealth: A Clinical Review.

Kevin C H Tsang¹, Hilary Pinnock¹, Andrew M Wilson², Syed Ahmar Shah¹.

Abstract

Background: Asthma is a variable long-term condition. Currently, there is no cure for asthma and the focus is, therefore, on long-term management. Mobile health (mHealth) is promising for chronic disease management but to be able to realize its potential, it needs to go beyond simply monitoring. mHealth therefore needs to leverage machine learning to provide tailored feedback with personalized algorithms. There is a need to understand the extent of machine learning that has been leveraged in the context of mHealth for asthma management. This review aims to fill this gap.
Methods: We searched PubMed for peer-reviewed studies that applied machine learning to data derived from mHealth for asthma management in the last five years. We selected studies that included some human data other than routinely collected in primary care and used at least one machine learning algorithm.
Results: Out of 90 studies, we identified 22 relevant studies that were then further reviewed. Broadly, existing research efforts can be categorized into three types: 1) technology development, 2) attack prediction, 3) patient clustering. Using data from a variety of devices (smartphones, smartwatches, peak flow meters, electronic noses, smart inhalers, and pulse oximeters), most applications used supervised learning algorithms (logistic regression, decision trees, and related algorithms) while a few used unsupervised learning algorithms. The vast majority used traditional machine learning techniques, but a few studies investigated the use of deep learning algorithms.
Discussion: In the past five years, many studies have successfully applied machine learning to asthma mHealth data. However, most have been developed on small datasets with internal validation at best. Small sample sizes and lack of external validation limit the generalizability of these studies. Future research should collect data that are more representative of the wider asthma population and focus on validating the derived algorithms and technologies in a real-world setting.

Keywords: artificial intelligence; chronic disease; remote monitoring; asthma; self-management; smart devices

Year: 2022 PMID: 35791395 PMCID: PMC9250768 DOI: 10.2147/JAA.S285742

Source DB: PubMed Journal: J Asthma Allergy ISSN: 1178-6965

Introduction

Asthma is a variable long-term condition, affecting 339 million people worldwide,1 often with diurnal, seasonal and life-time differences in symptoms and disease burden. Although, for many, asthma symptoms are controlled most of the time, some have on-going poor control and all are at risk of attacks which, at best, are inconvenient and at worst can result in hospitalization or even death.2 Currently, there is no cure for asthma, therefore the focus of management is on improving symptom control and reducing the risk of attacks. Asthma is an umbrella term encompassing a range of phenotypes so personalization of management strategies is essential. Monitoring is one of the pillars of management, allowing patients to correctly assess their health and take appropriate action. Mobile health or mHealth is commonly defined as the practice of using mobile technologies in medical care. This can range from using text reminders for medical appointments to healthcare telephone helplines to using home monitoring systems and wearable devices.3 mHealth encompasses many streams of data, most of which are produced faster than a single human can comprehend; machine learning is ideal for processing this amount of data to produce actionable information and personalized feedback. Machine learning involves using computers and algorithms to process large amounts of data (many observations and many variables) and identify patterns without explicit human programming.4 It has provided insights into a very wide range of applications, including genomics,5–7 images,8–10 sound recordings,11,12 vital signs,13 and electronic health records data collected in primary,14,15 secondary,16 and tertiary care.17 Machine learning is an umbrella term, consisting of tools and techniques that use data to learn how to perform a given task, but the algorithms generally fall into two classes, supervised and unsupervised learning. 
Supervised learning finds a mathematical function to link the data with known labels and is suitable for tasks that have a well-defined goal. Unsupervised learning, on the other hand, describes patterns and structures in the data without following the lead of labels or categories defined by a human. More details about machine learning algorithms are provided in the . Currently, most mHealth interventions that have been implemented in healthcare have focused on reminders and communications.3 Areas of asthma management that machine learning and mHealth can support include monitoring,18 personalizing care,19 providing education,20 understanding patterns in the population to better target care,21 and predicting asthma attacks using a multitude of data sources.22 Broadly, existing research efforts can be categorized into three types: 1) technology development, 2) attack prediction, 3) patient clustering. This clinical review provides a critical overview of the current research that has leveraged machine learning in the context of mHealth for remote asthma management, its shortcomings and challenges, the extent of its readiness for deployment, and recommendations for future research.

Methods

We carried out a clinical review and searched PubMed for applications of machine learning to mHealth for asthma management, based on the following inclusion criteria: 1) full text available; 2) available in English; 3) published in last 5 years; 4) including at least one machine learning algorithm; 5) including data collected from humans; 6) including data other than electronic health records; 7) peer reviewed. We excluded systematic reviews, commentaries, and preprints. The terms used to search title and abstract are listed in Table 1. Terms in the same column were joined by the OR operator and the search terms in different columns were joined by the AND operator. Publications in the past five years equated to publications between 1st January 2017 and 30th July 2021.

Table 1

Search Strategy

Asthma	Machine Learning	mHealth	Validation
Asthma*	Predict*, Machine Learning, Artificial intelligence, Bayesian, Machine, Regression	mHealth, Telehealth, Telemonitor, Monitor, Smart, Digital-health, eHealth, Mobile, Smartphone, Track	AUC, Area Under the Curve, ROC, Receiver Operating Characteristic, Accuracy, Validation, Sensitivity, Specificity

Note: Asterisk (*) denotes a wildcard operator, for example, “predict*” represents “predict”, “predicts”, “predicting”, “prediction”, etc.
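
The way the columns of Table 1 combine into a single Boolean query can be sketched as follows. This is only an illustration of the OR-within-column, AND-between-columns rule described in the Methods; the helper names (`quote`, `build_query`) are hypothetical, and the sketch does not add the title/abstract field restriction or reproduce the exact query string submitted to PubMed.

```python
# Terms are copied from Table 1; the dictionary layout and helper names are illustrative.
SEARCH_COLUMNS = {
    "Asthma": ["Asthma*"],
    "Machine Learning": [
        "Predict*", "Machine Learning", "Artificial intelligence",
        "Bayesian", "Machine", "Regression",
    ],
    "mHealth": [
        "mHealth", "Telehealth", "Telemonitor", "Monitor", "Smart",
        "Digital-health", "eHealth", "Mobile", "Smartphone", "Track",
    ],
    "Validation": [
        "AUC", "Area Under the Curve", "ROC",
        "Receiver Operating Characteristic", "Accuracy", "Validation",
        "Sensitivity", "Specificity",
    ],
}


def quote(term):
    """Wrap multi-word phrases in double quotes so they are kept together."""
    return f'"{term}"' if " " in term else term


def build_query(columns):
    """Join terms within a column with OR, then join the columns with AND."""
    groups = []
    for terms in columns.values():
        groups.append("(" + " OR ".join(quote(t) for t in terms) + ")")
    return " AND ".join(groups)


QUERY = build_query(SEARCH_COLUMNS)
```

Because every column must contribute at least one matching term, adding a term to a column can only widen the result set, while adding a column can only narrow it.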

Results

Search Results

With our search terms, we found 90 papers available via PubMed published in the last 5 years. After reviewing the abstracts of all the papers with the inclusion and exclusion criteria, 22 papers were identified and further reviewed in this study (see Figure 1).

Figure 1

Article selection.

We classified the studies in three areas: technology development, attack prediction, and patient clustering. Technology development refers to contexts where machine learning is central to developing a new monitoring tool,23–33 such as in cough and wheeze analysis. Attack prediction refers to studies that use machine learning to predict an asthma event (typically an attack) usually using mHealth data.34–42 Patient clustering refers to studies which subtype the asthma population using unsupervised learning algorithms.43,44 See Table 2 for a summary of the papers.

Table 2

Summary of Studies

Study	Category	Participants [Data Source]	Devices	Collected Data	Machine Learning Algorithms	Input Features (X)	Output (Supervised) (Y)	Output (Unsupervised)	Performance	Application to Asthma Management
Chen A, 202023	Technology development	11 healthy adults	2 wireless wearable sensors: the abdominal respiration (Sensor1), the chest respiration (Sensor2)	Respiratory behaviors	Random Forest	100 data points sliding window, 1200 data slices per individual	4 postures: Standing, laying on the back, laying on the left, laying on the right	-	Accuracy = 99.53% (individual classifier)	Monitor sleeping posture and respiratory behavior
Vatanparvar K, 202024	Technology development	131 individuals (age not specified): asthma = 69, COPD = 9, asthma and COPD = 13	Smartphone (Samsung Galaxy Note 8)	1 minute of voluntary cough	Gaussian Mixture Model, neural networks	5380 sound samples of coughs	Coughing individual	-	Sensitivity = 90.30%, specificity = 96.39%, accuracy = 93.34% (NeTrain with cough embeddings)	Passive monitoring of coughs
Prinable J, 202025	Technology development	9 healthy adults	Pulse oximeter, portable sleep diagnostic (Alice PDx)	Raw PPG trace, SPO2, pulse rate, and relative tidal volume (RTV)	Deep learning (LSTM)	45 recordings, 4 features each: PPG, band-passed PPG, SPO2, pulse rate	-	Inspiration time, expiration time, respiratory rate, inter-breath intervals (IBI), and the inspiration-expiration ratio (I:E)	Relative bias <4% (apart from I:E ratio)	Passive monitoring of breathing
Adhi Pramono R.X, 201926	Technology development	Unknown individuals from multiple repositories47	Unknown devices	Cough sounds	Logistic regression	43 recordings. Frequency bands of interest: B-HF and B-01. The spectral features: HFMaxratio, MinMaxratio, and LQMAXratio	2 classes: cough, non-cough	-	Sensitivity = 90.31%, specificity = 98.14%, F1-score = 88.70%, positive predictive value = 88.47%, Matthews Correlation Coefficient (MCC) = 87.46%	Passive monitoring of coughs
Chen H, 201927	Technology development	126 individuals (all ages, infants to elderly) (including asthma and COPD) [ICBHI Scientific Challenge]48 and unknown individuals [R.A.L.E lung sounds]49	Digital stethoscopes	Respiratory sounds	SVM, Extreme Learning Machine (ELM), KNN	240 recordings, 2 features extracted from Enhanced Generalized S-Transform (EGST): mean and standard deviation of EGST coefficients	2 classes: wheezing, normal respiratory	-	Sensitivity = 100%, specificity = 99.27% (ELM, SVM, KNN)	Active monitoring of wheeze
Li K, 201928	Technology development	30 adults (age 19-48) [HARuS]50 and 14 children (age 5-15) with asthma [BREATHE]51	[HARuS] waist-worn smartphone (Samsung Galaxy S II) and [BREATHE] wrist-worn smartwatch (Motorola Moto 360 Sport)	Triaxial accelerometry and gyroscopic data	XGBoost (Gradient Boosted Trees), SVM, random forest	2000 inspiratory and expiratory segments of sounds, 6 features per window of signal: arithmetic mean, SD, median absolute deviation, minimum, maximum, and entropy	6 physical activities: [HARuS] 6 activities: standing, sitting, lying, walking, walking downstairs, walking upstairs; [BREATHE] 6 activities: standing, sitting, lying, walking, walking on stairs, running	-	[HARuS] Accuracy rate = 91.06% (GGS), [BREATHE] accuracy rate = 79.4% (GGS)	Activity recognition of smart watches
Azam M.A, 201829	Technology development	50 individuals (age not specified) with COPD, asthma, bronchitis, and pneumonia	Smartphone (Samsung Galaxy S3)	Airflow 25 cm in front of mouth	Bag-of-Features, SVM	255 breathing cycles, 5 features extracted from instantaneous envelop (IE) and instantaneous frequency (IF)	2 classes: normal, Adventitious Signal (AS)	-	F1-score = 75%, accuracy rate = 75.21 (complete cycle)	Active monitoring of breathing sounds
Adhi Pramono R.X, 201930	Technology development	Unknown individuals from multiple repositories47	Unknown devices	Cough sounds	Logistic regression	43 unique recordings, 4 features each: Linear Predictive Coding (LPC) coefficient, tonality index, spectral flatness, and spectral centroid	2 classes: cough, non-cough	-	Sensitivity = 86.78%, specificity = 99.42%, F1-score = 88.74%	Passive monitoring of coughs
Infante C, 201731	Technology development	87 individuals (age not specified): COPD = 7, asthma = 15, allergic rhinitis = 11, asthma and allergic rhinitis = 17, COPD & allergic rhinitis = 4, healthy = 33	Custom-built electronic stethoscope and Android application	Voluntary coughs recorded from the trachea (30 seconds), standard auscultation lung sound data, peak flow meter reading and clinical questionnaire	Logistic regression with L1 penalization (LASSO)	4 features: Zero Crossing Irregularity, Rate of Decay, Kurtosis, Variance	2 set of labels, diagnosis; cough type (wet or dry).	-	Sensitivity = 100%, specificity = 87%, AUC = 94% (Wet vs dry) Sensitivity = 35.7%, specificity = 100%, AUC = 67.8% (classifying unhealthy patients with cough type)	Active monitoring of coughs
Taylor T.E, 201832	Technology development	20 healthy adults	Inhaler Compliance Assessment (INCA) audio recording device, pneumotachograph spirometer	Audio recording of placebo Ellipta inhalation, inhalation rate	Linear regression, power law regression	15 inhalations per person, acoustic envelope of the inhaler inhalation	Flow rate: peak inspiratory flow rate (PIFR), volume or inspiratory capacity (IC), and the inhalation ramp time (Tr)	-	Accuracy = 90.89% (power law model)	Measure correct inhaler technique
Purnomo A.T, 202133	Technology development	Unknown individuals	FMCW radar	5 to 15 seconds of chest displacement breathing waveforms	XGBoost (Gradient Boosted Trees)	4000 breathing waveforms. [set 1] breathing wave form; [set 2] 8 features: mean, median, maximum, variance, standard deviation, absolute deviation, kurtosis, and skewness; [set 3] MFCC feature extraction	5 classes: normal breathing, deep and quick breathing, deep breathing, quick breathing, holding the breath	-	Precision > 80%, sensitivity > 70%, F1-score > 75% (for all classes, MFCC feature extraction)	Active monitoring of breathing
Zhang O, 202034	Attack prediction	2010 individuals (age >16) with severe and persistent asthma [SAKURA]52	Paper diary	Daily questionnaire: PEF, morning symptoms, evening symptoms, reliever inhaler usage, asthma sleep wakening	Recursive feature elimination, PCA, random under-sampling, random over-sampling, SMOTE, logistic regression, naïve Bayes, decision tree, perceptron	728,535 daily records, 432 features, 9 basic features	2 classes: exacerbation event, no exacerbation	-	Sensitivity = 90%, specificity = 83%, AUC = 85% (logistic regression)	Attack prediction from daily diary and PEF
Tsang K.C.H, 202035	Attack prediction	554 adults with asthma [AMHS]53	Smartphone (BYOT)	Daily and weekly questionnaire: symptoms, healthcare usage, medication usage, triggers encountered, PEF	Decision trees, logistic regression, naïve Bayes, and SVM	2659 periods, 25 features per 14-day period before unstable event, 6 basic features	2 classes: stable, unstable period	-	Sensitivity = 86.6%, specificity = 72.5%, AUC = 87.1% (naïve Bayes)	Attack prediction from daily diary
Tinschert P, 202036	Attack prediction	79 adults with asthma	Smartphone (Samsung Galaxy A3) application based on MobileCoach	ACT, Pittsburgh Sleep Quality Index, Nocturnal cough frequencies (manually labelling audio recordings from the smartphone’s built-in microphone)	Mixed-effects regressions, decision trees based on recursive partitioning analysis	2291 nights, 7 combinations of Pittsburgh Sleep Quality Index and Nocturnal cough frequencies	ACT score	Prediction of exacerbation risk in the next 7 days	56% < balanced accuracy < 70%	Attack prediction from sleep quality
Tenero L, 202037	Attack prediction	38 children (age 6-16): persistent asthma = 28, control = 10	Electronic nose (Cyranose 320)	VOCs in exhaled breath, spirometry	PCA, penalized logistic model	1 recording per person, 32 e-Nose nanosensors	4 classes: control (CON), controlled asthma (AC), partially controlled asthma (APC) or uncontrolled asthma (ANC)	6 most important sensors, 5 principal components	Sensitivity = 79%, specificity = 84%, cross-validated AUC = 80%	Asthma control prediction from exhaled breath
Finkelstein J, 201738	Attack prediction	Adults with asthma	Peak flow meter connected to laptop	PEF, daily questionnaire: symptoms, medication usage, trigger exposure, sleep	Naïve Bayes, adaptive Bayesian network, SVM	7001 records, 147 features, 21 basic variables x 7 days	2 classes: high-alert, no-alert PEF zone on day 8	-	Sensitivity = 100.0%, specificity = 100.0%, accuracy = 100.0% (adaptive Bayesian network)	Attack prediction from daily diary and PEF
Castner J, 202039	Attack prediction	43 adults (working aged women) with poorly controlled asthma	Fitness tracker (Fitbit Charge), activity monitor (Actigraph GT3X+), spirometer (Vyntus), spirometer (MicroDiary), home monitor (Hobo Data Logger)	ACT, Mini Asthma Quality of Life Questionnaire (AQLQ), trait emotionality PANAS-X questionnaire, Consensus Sleep Diary, asthma control diary (ACD), physiologic and environmental sensors, medical record review, and spirometry.	Generalized linear mixed models	900 daily scores, [set 1] 8 features; [set 2] 10 features	[set 1] self-reported asthma-specific wakening; [set 2] FEV1	-	[set 1] AUC = 77% (sleep wakening) [set 2] AUC = 83% (FEV1)	Measure sleep disruption using fitness tracker
Khasha R, 201940	Attack prediction	96 individuals (age >5) with asthma	Weather reports, Air quality, questionnaires, medical records	140 variables about patient demographics, lung function, symptoms, environmental factors, medical history	Ensemble learning, multinomial logistic regression, SVM, random forest, extreme gradient boosting, KNN, decision tree, Gaussian naïve Bayesian, rule-based classifier created from clinical knowledge	2870 daily records, 35 selected variables	3 classes: well-controlled, not well-controlled, very poorly-controlled levels	-	Sensitivity = 88.3%, precision = 89.4%, specificity = 94.9%, neg pred value = 94.3%, accuracy = 92.7% (Ensemble Learning 2)	Asthma control prediction using health records and weather reports
Van Vliet D, 201741	Attack prediction	96 children (age 6-18) with asthma	NIOX analyzer (NIOX MINO), 5 liter inert bag with a resistant free valve (Tedlar bag), spirometer (ZAN 100®)	ACQ, GINA respiratory symptom score, online FeNO assessment, collection of exhaled breath (VOCs), dynamic spirometry	Random forest, PCA	574 chromatograms, 7 VOCs	Exacerbations	Most important VOCs, separation of children with and without exacerbation, possible difference in co-factors between samples of children with an exacerbation 14 days after sampling and those without an exacerbation after 2 months	Sensitivity = 88%, specificity = 75%, AUC = 90% (attack after 14 days)	Attack prediction from exhaled breath and home monitoring
Huffaker M.F, 201842	Attack prediction	16 children (age 5-18) with persistent asthma	BCG accelerometer-based passive bed sensor (Murata Technologies SCA11H)	Heart rate (HR), respiratory rate (RR), HR variability HRV, calculated RR variability (RRV), relative stroke volume (SV), HR percentile based on age, RR percentile based on age, movement, relative Q (= SV × HR), VO2	Random forest	891 nights, 16 features, 8 basic features	Report of asthma symptoms	-	Sensitivity = 47.2%, specificity = 96.3%, accuracy = 87.4%	Attack prediction from sleep and bed monitoring
Tibble H, 202043	Patient clustering	211 children (age 6-15) with asthma attack54	Electronic inhaler monitoring devices	Medication dose taken	PCA, K-means, decision trees	35,161 person-days of data, 5 features: the percentage of doses taken, the percentage of days on which zero doses were taken, the percentage of days on which both doses were taken, the number of treatment intermissions per 100 study days, and the duration of treatment intermissions per 100 study days	-	3 clusters: poor adherence, moderate adherence, good adherence	-	Characterize asthma patients by adherence
Tignor N, 201744	Patient clustering	334 adults with asthma [AMHS]53	Smartphone (BYOT)	Daily questionnaire: symptom diary, medication, triggers encountered	Probability based imputation with consensus clustering (PIC) method (utilize k-means)	1 recording per person. [Cluster formation] daily symptoms; [Characterize formation] 10 features: 4 clinical features, 3 demographic features, 3 trigger features	-	3 clusters: high day symptom rate, medium day symptom rate, low day symptom rate	-	Subtyping asthma patients for personalized alerts based on triggers

Abbreviations: AUC, Area under the ROC curve; BYOT, Bring your own technology; COPD, Chronic Obstructive Pulmonary Disease; FeNO, Fractional exhaled nitric oxide; GINA, Global Initiative for Asthma; LSTM, Long short-term memory; PCA, Principal Component Analysis; PPG, Photoplethysmogram; SVM, Support Vector Machine; VOCs, Volatile organic compounds.

Most applications of machine learning for asthma management in mHealth involve collecting self-reported data to form the ground truth of a patient’s asthma condition, and some objective data either using smartphones or mobile monitoring devices, or both. Frequently, a validated measure of asthma control is collected (eg, Asthma Control Questionnaire (ACQ)45 or Asthma Control Test (ACT)46) in mHealth studies. Using around five questions about the symptoms experienced by patients, the questionnaires determine whether patients’ asthma is controlled or uncontrolled. Many methods and devices for monitoring different aspects of a person have been studied individually and in combination. Machine learning can be applied to breath monitoring,37,41 sleep monitoring,23,34–36,38,39,42 cough and wheeze,24,26,27,29–31,36 lung function monitoring,23,25,33–35,38,40 adherence monitoring,32,35,38,43 and environment monitoring.39,40,44 However, studies had different outcome measures; hence, it is difficult to conduct a direct comparison between studies.

Technology Development

Developing monitoring tools was a goal for 11 of the included studies. These include identifying sleeping postures from wearable respiration sensor data,23 activity detection using smartwatches,28 home breathing monitoring,25,33 and active24,27,29,31 and passive cough and wheeze detection.26,30 Many of the identified studies on technology development applied digital signal processing (DSP) to process the raw signals collected via sensors, a necessary step before the application of machine learning. Two27,28 out of 11 studies included data from children and five23,25,27,28,32 out of 11 studies included data from adults; however, none of the 11 studies developing monitoring tools had specifically investigated data from a senior population. Some of the studies on adults were conducted purely with healthy adults who could mimic a wide range of breathing patterns.

Sleep Posture

Among patients with asthma, posture (such as standing vs supine) can influence respiratory behavior.55 However, there is conflicting evidence as to whether sleeping posture has a significant effect on respiratory behavior.55–57 Identifying the posture in which a respiratory measurement was taken can be useful when studying posture-related instabilities. Using two wearable sensors located at the abdomen and chest, four postures (standing and three sleeping) were identified with high accuracy. However, the ability to correctly identify postures from sensor data was dependent on knowing to which individual the data belonged. With this information, classifier accuracy rose from 21.9% to 99.5%; adapting this method for asthma management will therefore require more research or the inclusion of a calibration stage.23

Activity Detection

Smartwatches are increasingly used by the public, healthy individuals, and elite athletes to measure their health. This has promoted technology development, so that the sensors are more reliable, affordable, and comparable between brands.58 Motion data (triaxial accelerometry and gyroscopic data) commonly collected in smartwatches was used in activity detection, which could improve the capabilities of passive monitoring, potentially replacing the need to ask questions about activity. Using DSP to process the raw signals and supervised learning (gradient boosted tree classification) on two datasets, various activities like standing, sitting, and walking were identified from the signals of the wrist-worn device with promising accuracy.28 A comparison between the performance of algorithms trained on two datasets, one in adults and one in children, found that activity detection performed better in adults, but this was confounded by the adults performing tightly prescribed movements and the children recording more natural movements.28
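
As an illustration of the kind of windowed feature extraction summarized in Table 2 for the activity recognition study (six features per window: arithmetic mean, SD, median absolute deviation, minimum, maximum, and entropy), a minimal sketch is given below. It is not the published pipeline: the window size, step, histogram-based entropy with 10 bins, and the use of the population SD are all assumptions made for the example, and the function names are hypothetical.

```python
import math
import statistics


def sliding_windows(signal, size, step):
    """Split a 1-D signal into fixed-length, possibly overlapping windows."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


def shannon_entropy(values, bins=10):
    """Shannon entropy (bits) of a histogram of the values; bin count is an assumption."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant window carries no information
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


def window_features(window):
    """Six summary features for one window of a single sensor axis."""
    med = statistics.median(window)
    return {
        "mean": statistics.fmean(window),
        "sd": statistics.pstdev(window),  # population SD, an assumption
        "mad": statistics.median(abs(v - med) for v in window),
        "min": min(window),
        "max": max(window),
        "entropy": shannon_entropy(window),
    }


# Toy accelerometer axis; real data would have one such signal per axis.
signal = [0, 1, 2, 3, 4, 5, 6, 7]
windows = sliding_windows(signal, size=4, step=2)
features = [window_features(w) for w in windows]
```

Each feature dictionary would then form one row of the input matrix for a supervised classifier such as gradient boosted trees.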

Breathing Monitoring

Breathing monitoring and detecting difficulties in breathing could help potentially identify asthma attacks early. Tools that have been proposed for home monitoring include portable sleep diagnostic devices to monitor breathing,25 and radar to measure chest movement.33 Using deep learning and features from a pulse oximeter, there were accurate predictions of the respiratory waveforms.25 Likewise, applying supervised learning (XGBoost) on features extracted from chest movement recorded by the radar gave promising accuracy of identifying different breathing patterns.33

Cough Monitoring

Like sleep monitoring, wheeze and cough are widely captured as a measure of asthma control and included in validated asthma questionnaires. However, there are also studies combining mHealth and machine learning to develop new tools for monitoring wheeze and cough, both actively24,27,29,31 and passively.26,30 Recording and analyzing voluntary coughs and respiratory sounds from people with different respiratory diseases could provide a tool to assist diagnosis. Although separating wet (cough with phlegm) and dry coughs was successful, there were varying levels of performance when making a diagnosis using recordings alone.24,29,31 Using voluntary cough recordings, one study accurately predicted individuals who were either healthy, had asthma, had chronic obstructive pulmonary disease (COPD), or had comorbid asthma and COPD with an accuracy of 93.3%.24 In contrast, another study using cough type to distinguish healthy people from those with respiratory disease had a much lower performance, with AUC of 67.8%.31 The development of new DSP methods (an essential step to be able to extract relevant information from raw sound signals) has shown promise in wheeze and cough detection from digital stethoscope recordings.26,27,30

Inhaler Technique Monitoring

Measuring adherence to medication is widely studied in asthma research. In addition to measuring when patients took medication, measuring how the inhalers were used and checking for correct technique is another application of mHealth and machine learning. Regression models of DSP processed adult audio recordings from the INhaler Compliance Assessment (INCA) device were found to accurately estimate the inhaler inhalation flow profile with 91% accuracy.32 This objective measure of inhaler technique could help patients improve how they take their medication.

Attack Prediction

Machine learning was applied to several different mHealth data sources to predict asthma attacks and change in symptoms. The data included volatile organic compounds,37,41 sleep quality,36,39,42 peak flow,34,35,38,40 preventer medication adherence,35,38 and environmental triggers.39,40 Two34,40 of the nine studies included data collected from children or teenagers, and adults, but the population was considered as a whole in both cases. Three studies37,41,42 focused on children with asthma, four studies35,36,38,39 focused on adults with asthma, and none of the studies focused on seniors. The performance of the algorithms was unlikely to have been affected by the age group of the study population.

Breath Analysis

Volatile organic compounds (VOCs), stemming from indoor pollutants, that are present in the breath of patients could be used to understand the development of asthma attacks, but evidence is inconsistent.59 Gas chromatography–mass spectrometry (GC-MS) is the gold standard in VOC analysis, but an electronic nose (e-Nose) could be a portable alternative. The e-Nose can detect and recognize individual chemical compounds in mixtures of chemical vapors. The VOCs in exhaled breath of children were analyzed using both supervised and unsupervised learning.37,41 Supervised learning methods (penalized logistic models and random forest) were used to identify the most important VOCs for attack prediction. Classifiers were trained to identify which VOCs would predict an upcoming asthma attack or worsening control. The studies reported good performance, with sensitivity and specificity between 70% and 90%, and an AUC upwards of 80%. Furthermore, unsupervised learning (principal component analysis (PCA)) was used to pre-process the data to form combinations of VOCs for attack prediction and for visualizing high-dimensional data in a two-dimensional graph.37,41

Sleep Monitoring

Aligned with the clinical recognition of exaggerated diurnal variation causing sleep disturbance as a sign of poorly controlled asthma,60,61 disturbance to sleep was widely used as a potential predictor of worsening asthma. Many studies captured night symptoms and sleep quality using questionnaires,34,35,38 but some collected objective sleep data using devices.36,39,42 Out of 25 features used to predict asthma attacks with daily (symptom diary-like) questionnaires about asthma, night symptoms-related features were two of the four most predictive features.35 Also, night-time waking was selected as one of three basic variables used for prediction.34 When the objective data were combined with machine learning algorithms (random forest, generalized linear mixed models, regression), smartphone recordings were used to analyze nocturnal coughs,36 fitness tracker activity data were related to sleep wakening,39 and bed sensors were used to predict asthma control.42 The usefulness of sensors for predicting self-reported asthma control is unclear: nocturnal cough and sleep quality alone achieved a balanced accuracy of no more than 70% in predicting attacks,36 whereas fitness tracker data predicted sleep wakening with an AUC of 77%,39 and bed sensor data predicted reports of asthma symptoms with an accuracy of 87.4%.42
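
The studies above report different metrics (balanced accuracy, AUC, accuracy), which makes them hard to compare. The sketch below shows, with hypothetical confusion-matrix counts that are not taken from any reviewed study, how plain accuracy can look high on imbalanced data while balanced accuracy (the mean of sensitivity and specificity) stays modest. The function names and counts are illustrative assumptions.

```python
def sensitivity(tp, fn):
    """Proportion of true events (e.g. symptomatic nights) that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of non-events that were correctly identified."""
    return tn / (tn + fp)


def accuracy(tp, tn, fp, fn):
    """Proportion of all predictions that were correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def balanced_accuracy(tp, tn, fp, fn):
    """Mean of sensitivity and specificity; unaffected by class imbalance."""
    return (sensitivity(tp, fn) + specificity(tn, fp)) / 2


# Hypothetical counts: 1000 nights, of which 100 have reported symptoms.
counts = {"tp": 47, "fn": 53, "tn": 867, "fp": 33}

acc = accuracy(**counts)
bal_acc = balanced_accuracy(**counts)
```

With these illustrative counts, the classifier misses more than half of the symptomatic nights, yet its accuracy exceeds 90% because symptom-free nights dominate the dataset.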

Lung Function Monitoring

Falling peak expiratory flow (PEF) is a major indicator of asthma attacks. Peak flow meters are sometimes used by patients at home to take objective measurements that inform whether action needs to be taken. Spirometers also measure lung function, but in more detail than peak flow meters.62 Action plans use a threshold of 80% of a person’s best PEF to determine that action needs to be taken, and urgent action is required if a person’s PEF falls below 60%.60 A drop in PEF and/or a change in symptom score are widely used in asthma action plans to determine self-management in response to deterioration.63 Smart peak flow meters enable patients to measure and track their PEF, and are often linked with a mobile app to function. Measuring PEF to monitor lung function is commonplace in asthma studies. This could be either reporting the results from a traditional peak flow meter,34,35,40 or using a smart peak flow meter that sends the data through a computer or smartphone.38 PEF measurements are used both as predictors of asthma attacks and to define severity and inform management. Using daily diaries and PEF measurements to predict worsening condition with supervised learning (adaptive Bayesian network) achieved a performance of 100.0% accuracy, sensitivity, and specificity.38
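
The action plan thresholds described above can be written as a simple rule. The sketch below is illustrative only and is not clinical guidance: the function name and zone labels are hypothetical, and treating a reading of exactly 80% or 60% as belonging to the higher zone is an assumption, since the text does not specify how boundary values are handled.

```python
def pef_zone(pef, personal_best):
    """Classify a PEF reading (L/min) relative to the person's best PEF.

    Thresholds follow the 80% and 60% cut-offs mentioned in the text;
    the handling of exact boundary values is an assumption.
    """
    if personal_best <= 0:
        raise ValueError("personal_best must be positive")
    ratio = pef / personal_best
    if ratio < 0.60:
        return "urgent action"
    if ratio < 0.80:
        return "action needed"
    return "no action"


# Example readings for a person whose best PEF is 500 L/min.
readings = [480, 350, 250]
zones = [pef_zone(r, 500) for r in readings]
```

A label of this kind, computed for a future day, is one way a supervised prediction target could be defined from PEF data.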

Adherence Monitoring

Adherence to regular preventative medication is sometimes captured by questionnaire and used as a predictor for asthma attacks.35,38 Although clinically important, the two studies did not identify the adherence to controller medication as an important predictive feature in their methods. In contrast, and consistent with clinical recommendations, features based on the use of short-acting reliever medication were two of the four most predictive features.35

Environment Monitoring

Some common asthma triggers in the environment, such as pollen, meteorological change, and air pollution (eg, particulate matter, carbon monoxide (CO), nitrogen dioxide (NO2)), could be monitored to reduce risk of exposure to known triggers. Also, recording asthma triggers encountered, such as viral infections, passive smoke, and pets, could give a better understanding of a person’s asthma and their symptoms.64–66 Connecting data from pollution monitoring stations and meteorology stations with patient health records provides a wealth of information for analysis. Furthermore, combining physicians’ knowledge using a rule-based classifier (analogous to a decision tree created based on knowledge) with conventional supervised learning techniques (multinomial logistic regression, SVM, random forest, extreme gradient boosting, KNN, decision tree, Gaussian naïve Bayesian) created an accurate (sensitivity of 88.3% and precision of 89.4%) ensemble learning algorithm for predicting levels of asthma control.40 Based on the joined dataset, the most important features for prediction were lung function and symptoms: PEF in the morning and before bedtime, ACT score, and shortness of breath in the last 24 hours. Although environmental features were not ranked highly, daily NO2 concentration and daily temperatures were useful.40 Further, a home environment measuring device has also been shown to be useful in predicting self-reported asthma-specific wakening.39

Patient Clustering

Two studies43,44 used unsupervised learning to form data-driven clusters using data collected via mHealth. One study investigated clusters in children with asthma,43 while the other focused on data collected by adults with asthma.44 In addition to capturing adherence to regular controller medication via questionnaires, there have also been in-depth studies of medication adherence. Smart inhalers are devices that objectively measure how inhaler medication is taken, as an alternative to self-report. Monitoring can be applied to the long-acting controller inhaler or the short-acting reliever inhaler, or both. By analyzing electronic inhaler monitoring data of controller medication with unsupervised learning algorithms (PCA and k-means), asthma patients were characterized by multi-dimensional inhaler adherence measures, which formed three groups: poor (on average 16% of their prescribed doses), moderate (on average 60% of doses), and good (on average 91% of doses) adherence.43 Furthermore, comparison with clusters formed by another data-driven method (decision trees) yielded similar results.43 Like many daily questionnaires, recording encounters with asthma triggers can be difficult and lead to missing data. To tackle this, probability-based imputation with consensus clustering was developed as a method of imputing the missing data and clustering patients, which can be used to subtype asthma patients for personalized alerts based on their triggers.44 Using the imputation method, three patient clusters were formed using the daily asthma symptom data. The characteristics of each cluster were investigated on four clinical, three demographic, and three trigger features. Cluster 1, with the highest average day symptom level, had patients who frequently reported pollen and heat as their triggers. On the other hand, cluster 3, with the lowest average day symptoms, was characterized more by patients citing air quality as their trigger.44 Prospectively, weather forecasts could be useful in predicting the risk of a future asthma attack for patients who are sensitive to environmental triggers such as sudden temperature changes or high pollen levels.
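To illustrate how an unsupervised algorithm can group patients by adherence, the following Python sketch runs a plain k-means on a single adherence measure. It is a deliberate simplification: the study cited above used PCA and k-means on multi-dimensional adherence measures, whereas this example uses one dimension, omits PCA, and relies on invented adherence percentages and a hypothetical helper named kmeans_1d.

```python
def kmeans_1d(values, k=3, max_iter=100):
    """Plain k-means on one-dimensional data with a deterministic start."""
    if len(values) < k:
        raise ValueError("need at least k values")
    ordered = sorted(values)
    n = len(ordered)
    # Spread the initial centroids evenly over the sorted data.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Invented data: percentage of prescribed doses taken by twelve patients.
adherence_percent = [10, 15, 20, 18, 55, 60, 65, 62, 88, 92, 95, 90]
centroids, clusters = kmeans_1d(adherence_percent, k=3)
```

With this invented data the algorithm settles on three groups that could be read as poor, moderate, and good adherence. In practice the number of clusters and the adherence measures used would need to be justified from the data.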

Discussion

This review has described a range of machine learning applications being used to support asthma management, in the areas of developing novel technology,23–33 predicting acute attacks at an individual level,34–42 and informing understanding of asthma phenotypes by clustering patients within populations.43,44 There were examples of successful application of machine learning to achieve a novel task (such as attack prediction from sleep quality, control prediction from exhaled breath, and characterizing asthma patients by medication adherence)36,37,42,43 or to improve existing methodology by using fewer resources for similar or better performance (such as smartphone-based passive monitoring of coughs).24,26,27,30,31,40,41 Most of the machine learning algorithms applied were easily interpretable,26–32,34–39 a desirable characteristic that helps make the decision process understandable in a clinical context. However, a few studies applied more complex but less interpretable machine learning algorithms.24,25,40

Developing Novel Technology: Proof-of-Concept with Clinical Potential

Using machine learning, new home monitoring tools were under development, including for activity detection, breath monitoring, cough monitoring, and inhaler technique monitoring.23–33 Most studies were in the proof-of-concept stage and although they were developed on selected small populations, many had achieved promising performance.23–25 An initial challenge, before considering the clinical potential of novel technology, is to process the incoming data so that background noise is removed and clear signals emerge.29 This was the focus of several of the papers that described development of new methods to filter the signal data.26,27,29 Before using the novel technology to monitor asthma at home, validation studies should be conducted in a real-world environment.

Prediction of Attacks: Supporting Individual Self-Management

Asthma is a variable condition,67 and central to supported self-management is the ability to recognize early evidence of deterioration and to take appropriate timely action to prevent a serious attack.68,69 A key aim of many of the machine learning papers was to use a wide variety of data sources to identify an individual’s risk of uncontrolled asthma and to improve prediction of asthma attacks.34–42 All the predictors explored (asthma symptoms, PEF, VOCs, fractional exhaled nitric oxide (FeNO), heart rate, respiratory rate, sleep quality, medication adherence, and environment) showed promise, though it was widely discussed that combining multiple varied data sources could help improve asthma attack prediction.28,34,35,38,40 Importantly, the prediction algorithms were developed retrospectively and require external validation in different datasets before they can be used in clinical practice. Besides the need for external validation, future studies should also consider evaluating the algorithms by comparison to existing effective “action plans” in clinical practice.

Clustering Patients: Informing Phenotypes and Targeting Care

Contemporary understanding of asthma as an umbrella term describing a heterogeneous group of conditions70 has increased interest in identifying phenotypes of asthma amenable to specific treatments or carrying specific risks of poor symptom control and/or acute attacks. Using unsupervised learning algorithms, progress has been made on forming patient clusters representing natural patterns spotted in the data.43 Understanding phenotypes not only has value in terms of individual risk and targeting care to “treatable traits” but can inform health service delivery as appropriate care can be targeted on high-risk populations.71 However, many of the studies used relatively small datasets – and often of populations selected for frequent symptoms or willingness to monitor – with limited generalizability to the whole asthma population.23–25,31,36,37,39,41–43 Future research should consider larger sample sizes that can better represent the general asthma population.

Machine Learning Applied to Asthma Management: Challenges

Tailored Data Collection

The performance of machine learning algorithms largely depends on the input data; hence, the sample size and data pre-processing methods must be considered in conjunction with the performance metrics. Most data used to train the machine learning algorithms in this review had small sample sizes, and sometimes used narrow inclusion criteria to collect the data.23–25,31,36,37,39,41–43 For example, a common exclusion criterion for asthma studies is “other respiratory disease”,23,37,41,43,44 which makes for a homogeneous dataset (which may be easier to analyze) but it reduces the likelihood of the results being generalizable. It also overlooks the possibility that the conditions excluded may be part of the phenotype. Even within asthma, different individuals have different medication regimes, which complicates the analysis,43 but selection according to a specific regime (say prescribed combination controller medication) will only give information on a selected population. Importantly, in longitudinal studies where participant retention is a factor, different individuals may provide different amounts of data for analysis, which will skew analysis towards patients who are more engaged with the study, more adherent to data collection, possibly influenced by the characteristics of their asthma.42,44

Secondary Analysis of Existing Datasets

To tackle the problem of small sample sizes, some studies have conducted secondary analysis on data that were collected for a different purpose.27,34 Eight studies (36%) were based on data that were publicly available or available on request.26–28,30,34,35,43,44 This makes for efficient use of data, but the aims (and thus eligibility) of the original dataset may not match the aims of the new analysis thereby making the interpretation of the results more challenging.

Missing Data

How each analysis handled missing data is important for understanding the differences between studies.35,40,42,44 If the amount of missing data is small, removing the cases with missing data is an option. Alternatively, imputing the missing values avoids losing data, but is a major challenge when there is a low response rate or the data are not missing at random44,72,73 (eg, people with frequent attacks may monitor more regularly than those who rarely have symptoms). Other methods to handle missing data include interpolation into regular spacing or creating summary windows,35 which can then be analyzed using regular methods. However, each method of handling missing data carries its own assumptions (for example, assuming that people with missing inhaler data have the same inhaler usage rate as people who reported using or not using their inhaler).
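As a rough illustration of the summary-window approach mentioned above, the Python sketch below collapses a daily diary with missing entries into fixed-length windows. The function name, the seven-day window, and the diary values are assumptions made for this example and do not describe the method of any cited study.

```python
def summarize_windows(daily_scores, window=7):
    """Summarize a daily diary (None = missing entry) into fixed windows.

    Each summary records how many days were observed and the mean score
    over those observed days only.
    """
    summaries = []
    for start in range(0, len(daily_scores), window):
        chunk = daily_scores[start:start + window]
        observed = [s for s in chunk if s is not None]
        mean_score = sum(observed) / len(observed) if observed else None
        summaries.append(
            {
                "start_day": start,
                "n_observed": len(observed),
                "mean_score": mean_score,
            }
        )
    return summaries


# Invented two-week symptom diary; the second week was not completed.
diary = [2, None, 3, 1, None, None, 2] + [None] * 7
weekly = summarize_windows(diary, window=7)
```

Averaging only the observed days implicitly assumes that the missing days resemble the observed ones, which is exactly the kind of assumption that should be stated. Keeping the count of observed days alongside the mean lets a later analysis down-weight or exclude sparsely observed windows.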

Low Event Rate in the Dataset

For many people with less severe asthma, attacks are infrequent, leading to a large “class imbalance”. In some populations, the imbalance can be upwards of 90%.26,34–36,38,40 Data sampling techniques, such as the Synthetic Minority Oversampling TEchnique (SMOTE),74 have been applied to balance the classes by essentially multiplying the minority class, which allows machine learning techniques to function properly. For example, oversampling techniques can be used to artificially enlarge the number of asthma attacks such that the data now have 50% attacks and 50% controlled asthma.
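The idea behind this kind of oversampling can be sketched as follows. This Python example is a simplified, SMOTE-style illustration and not a reference implementation: it creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The function name, the two features, and all data values are invented for illustration.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point, excluding itself.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Random position along the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Invented features: (PEF as % of personal best, night symptom score).
controlled = [(95.0, 0.0), (90.0, 0.0), (92.0, 1.0), (88.0, 0.0), (97.0, 0.0),
              (85.0, 1.0), (93.0, 0.0), (91.0, 1.0), (89.0, 0.0)]
attacks = [(55.0, 3.0), (60.0, 2.0), (58.0, 3.0)]

# Add enough synthetic attack samples to reach a 50/50 split.
synthetic_attacks = smote_like_oversample(attacks, len(controlled) - len(attacks))
balanced_attacks = attacks + synthetic_attacks
```

Resampling of this kind is normally applied to the training data only, so that performance is still measured on data with the original event rate.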

Inconsistent Output Definitions During Modelling

Different studies of asthma attack predictions had different definitions of an asthma attack and outcome measures. This included using patient symptoms,36,37,39–42 self-reported asthma attack treatment,34,35 and spirometry measurements.38,39 Although sometimes similar, the different definitions cannot be used in direct comparison.73 Furthermore, some outcomes were easier to model based on the input data, thus leading to over-optimistic performance results. For example, Finkelstein and Jeong used 21 daily measures, including symptoms and PEF, to predict asthma attacks.38 However, the asthma attacks were defined as the PEF zone on day 8, which is directly related to one of the input features, namely PEF on day 7. Consequently, it is not sufficient to assess any study based solely on the performance metrics without the broader context.

External Validation

For external validation, the “new” dataset must be similar to the training dataset in at least the key parameters to meaningfully compare the machine learning algorithms. Ideally, and especially for health data, the methods should be robust and comparable even if there are slight differences in the data. It is highly challenging to externally validate machine learning models, partly due to major differences in inclusion criteria and outcome definitions, and most often due to lack of access to comparable data.26,30,41 Slight differences in the wording of questions or in device choice can create datasets that are similar yet not directly comparable, and hence not suitable for external validation (for example, acute attacks might be measured as “needing an oral steroid course” or “unscheduled care” and might be assessed over a year or a few months). In the context of mHealth, this requires similar devices to be used, but rapidly advancing technology may make this a challenge. However, this may change in the future as devices become validated and widely used (like how validated questionnaires and guidelines have allowed studies to be comparable). None of the machine learning algorithms in the 22 studies had been externally validated; they had only been internally validated.

Data Quality

Conducting data collection in controlled environments enables cleaner data to be collected and analyzed.27,29 However, real-world settings will most likely lead to reduced data quality. Consequently, it is important that a given model’s performance is evaluated for use by actual patients in their day-to-day lives.32,33

Future Direction

Machine learning algorithms are dependent on their input data. Since most existing studies are based on relatively small sample sizes and often selected populations, the next natural step is to validate the results in larger – and more representative – populations.25,39,43 Future research should consider adding other data sources to existing models, collecting multi-dimensional data using several devices and data sources simultaneously to provide a more complete picture about a person and their environment, whilst also assessing the utility of individual devices.25,28,34,35,38,40 Studies like MyAirCoach22 and Biomedical REAl-Time Health Evaluation (BREATHE)51 that combine several sources of data longitudinally are important for future development of mHealth technologies for asthma. The data used to train the machine learning models included data collected from children, teenagers, and adults, patients with asthma, COPD, and other respiratory diseases, some exclusively and others in combination. Although any variation of the performance in the algorithms trained on data from either age group was unlikely to be directly related to the age, it remains to be seen if the model developed for one population can perform comparably with a new or more general population. Expanding the functionality of technologies developed, improving performance, and validating results against other devices is another area for future research.23,24,27,31,33,37,41 For example, wheeze detection could be extended to other breath sounds,27 expanding its application to other respiratory diseases. Cough detection could be applied to more difficult data, such as a mix of multiple individuals and background noise,24 much like the “cocktail party problem” in machine learning. Developments in image recognition and video analysis using machine learning are promising8–10 and could be applied to enhance inhaler technique monitoring. 
The data generated by mHealth devices for home monitoring are increasingly reliable and validated against existing gold-standard equipment.58,75,76 However, the validity of the information created by machine learning analysis has not yet reached the standards required by health services. Many more large-scale studies, akin to clinical trials, will be required to test the outputs of real-time analysis using mHealth and machine learning algorithms deployed in the real world.23,28–30,34,42 Although training machine learning models often requires a large amount of computing power, the resulting models may be easy to use and can be deployed and run on a mobile phone. An ideal asthma management system combining machine learning and mHealth would intelligently utilize both active and passive monitoring and be validated with clinical trials. Passive monitoring requires minimal input from the patient, such as wearing a smartwatch or switching on a sleep monitoring device, capturing data without interfering with the patient’s daily life. In contrast, active monitoring requires more input from the patient but could provide more detailed information about a person’s condition, such as measuring peak flow or answering questions about asthma control. Using machine learning to infer when active monitoring is required based on passive monitoring data would minimize the need for intrusive data collection, while not reducing the attention given to patients.36,40 Most importantly, systems must be evaluated clinically to ensure clinical (and cost) effectiveness and safety.

Strengths and Limitations

A reproducible search strategy was implemented using PubMed, a free search engine, to search for the latest developments in applications of machine learning algorithms, with the focus placed only on the past five years. The interdisciplinary team who interpreted the papers consisted of practicing clinicians (covering both primary and secondary care) and applied machine learning experts. However, this is not a systematic review, and it was challenging to directly compare studies and algorithms due to diverse contexts.

Conclusion

Recent developments in applying machine learning to asthma management have tested a wide range of functionalities using mHealth devices. The algorithms have demonstrated promising results, but they have only been assessed with internal validation at best. Further, the algorithms were mostly developed on small datasets and a select population. Consequently, the likely performance of these algorithms in the general population in a real-world environment is unknown. Future research should include external validation with large sample size and a focus on combining multiple, diverse sources of data.

62 in total

1. Validation of the Fitbit One, Garmin Vivofit and Jawbone UP activity tracker in estimation of energy expenditure during treadmill walking and running.

Authors: Kym Price; Stephen R Bird; Noel Lythgo; Isaac S Raj; Jason Y L Wong; Chris Lynch
Journal: J Med Eng Technol Date: 2016-12-05

2. Machine-learning enabled wireless wearable sensors to study individuality of respiratory behaviors.

Authors: Ang Chen; Jianwei Zhang; Liangkai Zhao; Rachel Diane Rhoades; Dong-Yun Kim; Ning Wu; Jianming Liang; Junseok Chae
Journal: Biosens Bioelectron Date: 2020-11-06 Impact factor: 10.618

3. Recommendation for optimal management of severe refractory asthma.

Authors: Jaymin B Morjaria; Riccardo Polosa
Journal: J Asthma Allergy Date: 2010-07-26

Review 4. Systematic meta-review of supported self-management for asthma: a healthcare perspective.

Authors: Hilary Pinnock; Hannah L Parke; Maria Panagioti; Luke Daines; Gemma Pearce; Eleni Epiphaniou; Peter Bower; Aziz Sheikh; Chris J Griffiths; Stephanie J C Taylor
Journal: BMC Med Date: 2017-03-17 Impact factor: 8.775

5. MyAirCoach: the use of home-monitoring and mHealth systems to predict deterioration in asthma control and the occurrence of asthma exacerbations; study protocol of an observational study.

Authors: Persijn J Honkoop; Andrew Simpson; Matteo Bonini; Jiska B Snoeck-Stroband; Sally Meah; Kian Fan Chung; Omar S Usmani; Stephen Fowler; Jacob K Sont
Journal: BMJ Open Date: 2017-01-24 Impact factor: 2.692

6. Automated Segmentation of Optical Coherence Tomography Angiography Images: Benchmark Data and Clinically Relevant Metrics.

Authors: Ylenia Giarratano; Eleonora Bianchi; Calum Gray; Andrew Morris; Tom MacGillivray; Baljean Dhillon; Miguel O Bernabeu
Journal: Transl Vis Sci Technol Date: 2020-12-03 Impact factor: 3.283

7. Human postprandial responses to food and potential for precision nutrition.

Authors: Sarah E Berry; Ana M Valdes; Nicola Segata; Paul W Franks; Tim D Spector; David A Drew; Francesco Asnicar; Mohsen Mazidi; Jonathan Wolf; Joan Capdevila; George Hadjigeorgiou; Richard Davies; Haya Al Khatib; Christopher Bonnett; Sajaysurya Ganesh; Elco Bakker; Deborah Hart; Massimo Mangino; Jordi Merino; Inbar Linenberg; Patrick Wyatt; Jose M Ordovas; Christopher D Gardner; Linda M Delahanty; Andrew T Chan
Journal: Nat Med Date: 2020-06-11 Impact factor: 53.440

8. Biomedical REAl-Time Health Evaluation (BREATHE): toward an mHealth informatics platform.

Authors: Alex A T Bui; Anahita Hosseini; Rose Rocchio; Nate Jacobs; Mindy K Ross; Sande Okelo; Fred Lurmann; Sandrah Eckel; Eldin Dzubur; Genevieve Dunton; Frank Gilliland; Majid Sarrafzadeh; Rima Habre
Journal: JAMIA Open Date: 2020-05-07

9. A randomised controlled feasibility trial of E-health application supported care vs usual care after exacerbation of COPD: the RESCUE trial.

Authors: Mal North; Simon Bourne; Ben Green; Anoop J Chauhan; Tom Brown; Jonathan Winter; Tom Jones; Dan Neville; Alison Blythin; Alastair Watson; Matthew Johnson; David Culliford; Jack Elkes; Victoria Cornelius; Tom M A Wilkinson
Journal: NPJ Digit Med Date: 2020-10-30

10. A data-driven typology of asthma medication adherence using cluster analysis.

Authors: Holly Tibble; Amy Chan; Edwin A Mitchell; Elsie Horne; Dimitrios Doudesis; Rob Horne; Mehrdad A Mizani; Aziz Sheikh; Athanasios Tsanas
Journal: Sci Rep Date: 2020-09-14 Impact factor: 4.379

1 in total

1. Predicting asthma attacks using connected mobile devices and machine learning: the AAMOS-00 observational study protocol.

Authors: Kevin Cheuk Him Tsang; Hilary Pinnock; Andrew M Wilson; Dario Salvi; Syed Ahmar Shah
Journal: BMJ Open Date: 2022-10-03 Impact factor: 3.006
