Role of artificial intelligence in defibrillators: a narrative review.

Grace Brown¹, Samuel Conway², Mahmood Ahmad³, Divine Adegbie⁴, Nishil Patel⁵, Vidushi Myneni², Mohammad Alradhawi³, Niraj Kumar⁶,⁷, Daniel R Obaid⁸, Dominic Pimenta⁹, Jonathan J H Bray¹⁰.

Abstract

Automated external defibrillators (AEDs) and implantable cardioverter defibrillators (ICDs) are used to treat life-threatening arrhythmias. AEDs and ICDs use shock advice algorithms to classify ECG tracings as shockable or non-shockable rhythms in clinical practice. Machine learning algorithms have recently been assessed for shock decision classification with increasing accuracy. Outside of rhythm classification alone, they have been evaluated in diagnosis of causes of cardiac arrest, prediction of success of defibrillation and rhythm classification without the need to interrupt cardiopulmonary resuscitation. This review explores the many applications of machine learning in AEDs and ICDs. While these technologies are exciting areas of research, there remain limitations to their widespread use including high processing power, cost and the 'black-box' phenomenon. © Author(s) (or their employer(s)) 2022. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.

Keywords: Defibrillators, Implantable; Heart Arrest; Tachycardia, Ventricular; Ventricular Fibrillation

Year: 2022 PMID: 35790317 PMCID: PMC9258481 DOI: 10.1136/openhrt-2022-001976

Source DB: PubMed Journal: Open Heart ISSN: 2053-3624

Introduction

Artificial intelligence (AI) is a broad term that encompasses the many uses of machine-based data processing to achieve outcomes that would typically require human cognitive function.1 In recent years, AI has expanded its role within medicine. In particular, machine learning, a type of AI in which a learning algorithm trains a model on one data set and the model is then applied to new data sets, has been widely used in a variety of medical fields. The availability of large data sets, combined with advances in machine learning technology, has led to an increasing number of medical applications in the last few years.1 In this review, we examine the use of machine learning for rhythm classification in automated external defibrillators (AEDs), particularly without interruption of cardiopulmonary resuscitation (CPR), and for predicting successful shocks and electrical storm in implantable cardioverter defibrillators (ICDs). The European Society of Cardiology (ESC) and European Resuscitation Council encourage the use of AEDs by emergency services and non-medical members of the public to reduce time to defibrillation.2 The ESC also recommends the use of ICDs in patients with documented ventricular fibrillation (VF) or haemodynamically unstable ventricular tachycardia (VT) without reversible causes or 48 hours after myocardial infarction (MI) on chronic optimal medical therapy.3 While this is an exciting new area, there are some limitations to the widespread use of these technologies, which we evaluate in the Discussion section.

Machine learning

First, we explain some of the key AI concepts that are discussed in this paper and are currently used in ECG detection. Table 1 summarises these concepts. There are multiple machine learning techniques, which can be broadly categorised as supervised and unsupervised learning (figure 1).1

Table 1

Definition of common AI terms1 5

Term	Definition
Artificial neural network (ANN)	A deep learning algorithm based on biological neural networks with connected layers of nodes used for high levels of data processing.
Convolutional neural networks (CNN)	A type of artificial neural network which extracts high-level features directly from one-, two- and three-dimensional data for classification.
Support vector machines (SVM)	Supervised machine learning models used for classification and regression analysis. Data are categorised using an optimal line or hyperplane which maximises distance of the hyperplane from its closest points or support vectors.
Random forest	Supervised machine learning model using a large number of decision trees called estimators, which are combined to give accurate predictions of outcomes.
K nearest neighbours (k-NN)	Supervised machine learning model used for classification and regression based on the proximity of a new datapoint to its k nearest labelled datapoints.

Figure 1

Top-down approach to AI. Machine learning is a type of AI, which can be broadly split into supervised and unsupervised machine learning. We will mainly focus on the use of supervised machine learning techniques in defibrillators. Adapted from Refs 1 and 5. AI, artificial intelligence; ANN, artificial neural network; CNN, convolutional neural network.

Unsupervised machine learning recognises patterns in unlabelled data sets. This can be useful in identifying subgroups from complex data and where labelled data sets are not available.1 The clusters or patterns found may not be related to the outcome of interest, and complex data can require large amounts of preprocessing before they yield useful outcomes. However, unsupervised methods have still been used in clinical applications, for example, to find patterns in electronic health record data where noise, heterogeneity and incompleteness limit the use of supervised methods.4

Supervised machine learning, on the other hand, involves training models to correctly classify input data with labelled outputs. This requires large numbers of labelled data sets for training. Once trained, these models can be used to predict outcomes on new data sets in a process known as testing. Supervised models can be used for classification into distinct groups, for example, types of arrhythmia, or for regression on data with continuous outcomes. Common types of supervised machine learning algorithm include deep learning, support vector machines (SVMs), random forest and K-nearest neighbour (k-NN).5

Deep learning is a type of machine learning that mimics neural networks in the brain to perform high levels of data processing.5 Artificial neural networks (ANNs) contain layers of nodes which manipulate and transform input data; the layers between the input and output layers are termed ‘hidden layers’. Weighted connections between these hidden layers adjust the signal based on importance.
During training, these weights are typically initialised to random values close, but not equal, to zero. Using these initial weights, an initial output classification is produced by a process called forward propagation. This prediction is then compared with the true outcome and an error signal is fed back to the model so that the weights can be adjusted, in a process called back propagation. In this way, the model is optimised.5

Deep learning has been used since the 1950s for multiple types of data inputs. However, its use was initially limited by ‘overfitting’, in which a model fits specific data points in the training set so closely that it no longer generalises to new, unseen data sets.6 Various techniques have since been developed to avoid overfitting. In ANNs, drop-out regularisation is commonly used to counteract overfitting and prevent excessive coadaptation of neurons. This involves randomly removing neurons and their weighted connections, either temporarily or permanently, during training.7

Convolutional neural networks (CNNs) are a type of ANN that extracts high-level features directly from raw data.5 They have been used extensively in medical imaging but can be used to analyse multiple types of one-dimensional, two-dimensional and three-dimensional data sets. As in ANNs, the inputs, for example, two-dimensional pixels or three-dimensional voxels, are passed through multiple layers of neurons before reaching the output. Each layer has a convolutional filter, or kernel, which extracts high-level features such as locality and subsimilarity. This removes the need for manual feature selection and the human bias it can introduce.8 For example, Cohen-Shelley et al used a CNN model to screen for moderate to severe aortic stenosis (AS).9 The CNN model has 62 convolutional layers and one classification output layer (moderate to severe AS, or mild to no AS), figure 2. Each ECG was represented as a 12×5000 matrix, which was the input for the CNN.
In the CNN, the weights and biases are iteratively adjusted to reduce the difference between the model output and the labelled outcome in the data set.9
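The training loop described above (small non-zero random weights, forward propagation, an error signal, then back propagation) can be illustrated with a minimal sketch. It is not taken from any of the cited studies: the network size, the sigmoid activation, the squared-error loss, the learning rate and the toy OR-style data set are all assumptions chosen to keep the example short.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(n_in, n_hidden, seed=0):
    """Create a one-hidden-layer network with small, non-zero random weights."""
    rng = random.Random(seed)

    def small():
        # Magnitude between 0.01 and 0.1 with a random sign: close to zero, never zero.
        return rng.choice([-1, 1]) * rng.uniform(0.01, 0.1)

    return {
        "w_hidden": [[small() for _ in range(n_in)] for _ in range(n_hidden)],
        "b_hidden": [small() for _ in range(n_hidden)],
        "w_out": [small() for _ in range(n_hidden)],
        "b_out": small(),
    }

def forward(net, x):
    """Forward propagation: input -> hidden layer -> single output in (0, 1)."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(net["w_hidden"], net["b_hidden"])]
    out = sigmoid(sum(w * h for w, h in zip(net["w_out"], hidden)) + net["b_out"])
    return hidden, out

def backward(net, x, y, lr=0.5):
    """Back propagation for one example with loss 0.5 * (out - y) ** 2."""
    hidden, out = forward(net, x)
    delta_out = (out - y) * out * (1.0 - out)              # error signal at the output
    delta_hidden = [delta_out * w * h * (1.0 - h)          # error fed back to hidden nodes
                    for w, h in zip(net["w_out"], hidden)]
    for j, h in enumerate(hidden):
        net["w_out"][j] -= lr * delta_out * h
    net["b_out"] -= lr * delta_out
    for j, d in enumerate(delta_hidden):
        for i, xi in enumerate(x):
            net["w_hidden"][j][i] -= lr * d * xi
        net["b_hidden"][j] -= lr * d

def mean_squared_error(net, data):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in data) / len(data)

# Toy labelled data (logical OR); purely illustrative, not ECG data.
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 1.0)]

net = init_network(n_in=2, n_hidden=2)
initial_loss = mean_squared_error(net, data)
for _ in range(2000):
    for x, y in data:
        backward(net, x, y)
final_loss = mean_squared_error(net, data)
```

Comparing `initial_loss` with `final_loss` shows the effect of the repeated weight adjustments. Real ECG models use far larger networks and dedicated libraries; the sketch only makes the two propagation steps concrete.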

Figure 2

Schematic diagram of a convolutional neural network. Adapted from Ref. 7. AS, aortic stenosis.

SVMs, random forest and k-NN are also supervised machine learning models.5 SVMs are used for binary classification: they determine the optimum hyperplane separating the data into two classes by maximising the distance between the hyperplane and the points closest to it, known as support vectors. Random forest uses a large number of decision trees called estimators. Each estimator is trained on a random subset of samples and features from the training set, which increases the generalisability of the outcomes. The final output is the mode (classification) or, typically, the mean (regression) of the estimators' outputs. k-NN does not learn patterns from the training data to apply to new data sets but instead directly compares new data with training data. New data are compared with the k most similar points in the training set and assigned the most common value (classification) or the mean/median (regression). K is a positive, non-zero integer that must be selected based on the specific data set, number of features and individual problem.5
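As a minimal sketch of the k-NN idea described above, the function below labels a new point by majority vote among its k nearest labelled neighbours. The two numeric features and their values are hypothetical and do not correspond to any ECG parameter used in the cited studies; Euclidean distance is assumed as the similarity measure.

```python
import math
from collections import Counter

def knn_classify(training_set, query, k=3):
    """Label `query` by majority vote among its k nearest labelled points."""
    if k < 1 or k > len(training_set):
        raise ValueError("k must be between 1 and the size of the training set")
    # Sort labelled points by Euclidean distance to the query and keep the k closest.
    nearest = sorted(training_set, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-feature points; the feature values are invented for illustration.
training_set = [
    ((0.9, 0.8), "shockable"),
    ((0.8, 0.9), "shockable"),
    ((0.7, 0.7), "shockable"),
    ((0.2, 0.1), "non-shockable"),
    ((0.1, 0.3), "non-shockable"),
    ((0.3, 0.2), "non-shockable"),
]

prediction = knn_classify(training_set, (0.75, 0.85), k=3)
```

No model is fitted here: every prediction re-reads the training set, which is why the choice of k and the size of the stored data set drive both accuracy and cost.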

Medical applications of AI

The clinical applications of AI have been rapidly expanding. One of the most common settings for machine learning is medical imaging and diagnostics. In cardiac imaging, several AI techniques have been used for identification of structures of the heart, lesion detection, segmentation of heart tissue and histological tissue classification.10

ECG interpretation and classification of cardiac arrhythmias is another obvious application of AI. Manual ECG interpretation is subjective, error-prone and varies widely with the knowledge and experience of the clinician. Computer-generated ECG interpretation has been widely available since it was developed in the 1960s; however, its manual feature recognition algorithms have faced criticism for missing the complexities and nuances of ECGs.11 Deep learning models, in particular CNNs, have been used for ECG interpretation with human-like accuracy, with one model even outperforming cardiologists.12 Whether complex deep learning algorithms such as CNNs will be used routinely for automated ECG interpretation remains to be seen.

In the current age of wearable technology and smartwatches with single-lead ECG capabilities, automatic ECG interpretation is becoming particularly important. The Kardia Band (KB) records a single-lead ECG on Apple Watches and is paired with an app that uses CNNs to detect atrial fibrillation (AF). Bumgarner et al found that KB interpreted AF with 93% sensitivity and 84% specificity, compared with physician interpretations of KB recordings with 99% sensitivity and 83% specificity. Of the 113 ECG and KB recordings available, 57 were uninterpretable by the KB algorithm but were reviewed by clinicians with 100% sensitivity and 80% specificity.
Therefore, this technology still requires clinician input and oversight for the best results and is not yet able to function autonomously.13

Beyond standard ECG interpretation, studies have assessed AI as a screening tool for asymptomatic moderate to severe AS, asymptomatic left ventricular dysfunction and early pulmonary hypertension, which could help in early diagnosis and intervention.9 14 15 Attia et al used paired 12-lead ECG and echocardiogram data from nearly 45 000 patients at the Mayo Clinic to train a CNN for the identification of asymptomatic left ventricular dysfunction using the 12-lead ECG data alone. Their model had a sensitivity and specificity of 86.3% and 85.7%, respectively, and they found that those with a positive AI screen had a four times greater risk of developing ventricular dysfunction in the future than those without.14 ECGs are low cost, non-invasive and widely available, making them an ideal candidate for a screening tool.

Another use of this ECG recognition technology is in defibrillators. AEDs were developed for use by untrained bystanders on those who have a sudden cardiac arrest in a public place.2 ICDs are implanted in those with a high risk of sudden cardiac death.3 The key to an appropriate and potentially life-saving shock from an ICD or AED is the recognition of a shockable rhythm such as VF or VT. These rhythms can result in a patient’s death unless a shock is delivered quickly. This is where AI may have a major role to play in reducing time to shock and increasing efficiency of recognition of shockable rhythms.
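Sensitivity, specificity and accuracy are quoted throughout this review, so a short sketch of how they follow from a confusion matrix may help. The counts below are invented for illustration (a 4:1 ratio of non-shockable to shockable cases) and are not taken from any cited study; balanced accuracy is included because it is less affected by such imbalance.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return standard diagnostic metrics from confusion-matrix counts.

    tp: shockable rhythms correctly flagged   fn: shockable rhythms missed
    tn: non-shockable correctly rejected      fp: non-shockable wrongly flagged
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "balanced_accuracy": balanced_accuracy,
    }

# Hypothetical counts: 50 shockable and 200 non-shockable recordings.
metrics = diagnostic_metrics(tp=40, fn=10, tn=190, fp=10)
```

With these made-up counts, accuracy (0.92) looks better than sensitivity (0.80) because the majority class dominates, which is one reason to read sensitivity and specificity alongside accuracy.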

Methods

A search was carried out on Medline and Embase on 3 April 2021 using the terms ‘AED’, ‘ICD’ and ‘defibrillator’ together with ‘AI’ and ‘deep learning’. This resulted in 221 abstracts, which were screened for relevance to our topic of ‘Applications of machine learning in AEDs and ICDs’.

ECG interpretation for AEDs

Both traditional machine learning and deep learning techniques have been used to classify shockable and non-shockable rhythms. Table 2 shows examples of the techniques that have been evaluated for use in shock advice algorithms (SAAs), as well as additional applications of this ECG interpretation technology, for example, diagnosis of prearrest MI.

Table 2

Tabular summarisation of search on ECG Interpretation for AEDs

Study	Study design	Algorithm used	Sensitivity	Specificity	Accuracy	Main findings	Potential limitations
Thannhauser et al18	Prospective registry of ICD recipients	SVM	–	–	–	Automated detection of prior MI from VF waveform	Small sample size; used induced, short-duration VF, which is more organised than in-field VF; less generalisable
Krasteva et al19	Retrospective Holter recordings of ventricular arrhythmias and AED recordings of OHCA	ANN	99.6%	98.7%	99.3% to 99.5%	Accurate, automated detection of a shockable rhythm	Cases distributed unevenly with majority used for validation
Picon et al43	Retrospective public database analysis	CNN	100%	99.0%	99.3%	CNNs can accurately detect shockable rhythms from short ECG segments	CNN models require large amounts of data and processing power to train
Coult et al44	Retrospective cohort study	SVM	–	–	–	Prediction of OHCA outcomes	Generalisability; data collection limited to a maximum of four shocks
Elola et al45	Retrospective database analysis	CNN	92.0%	93.0%	92.1%	A recurrent CNN is the superior model for circulation characterisation with a BAC of 90% for 3 s segments	Low specificity
Nguyen et al46	Retrospective public database analysis	CNN	97.0%	99.0%	99.3%	Novel SAA to increase the probability of an appropriate AED defibrillation following cardiac arrest	–
Figuera et al47	Retrospective public database analysis	SVM	97.0%	99.0%	–	Automated detection of shockable rhythms. Interpretability is more challenging using OHCA data compared with Holter recordings	Long response time of over 7 min
He et al48	Retrospective cohort study	CNN	91.0%	91.0%	85.6%	Improved automated prediction of defibrillation outcomes	No phenotypic data or data on long-term survival; low sensitivity and specificity
Tripathy et al49	Retrospective database analysis	Variational mode decomposition and random forest classifier	96.5%	98.0%	97.2%	Variational mode decomposition and random forest classifier can be used for classification of VF/VT and non-shockable rhythms	Limited by the size and ECGs in databases used
Sanromán-Junquera et al50	Retrospective database analysis	SVM	–	–	–	Proposed SVM system uses information from the ICD to support the identification of anatomical region of the left ventricular tachycardiac entry site	Single-centre study; additional covariates required for increasing accuracy
Li et al17	Retrospective ventricular tachyarrhythmia database analysis	SVM	96.2%	96.0%	96.0%	Validation of a ML-based VF/VT classification system, argued to be superior to conventional classification	Selection of high-quality data
Alonso-Atienza et al16	Retrospective database analysis	SVM	75.0%	92.0%	96.0%	Use of SVM algorithms combining ECG features significantly improves the efficiency for the detection of life-threatening arrhythmias	Generalisability

ANN, artificial neural network; NSR, normal sinus rhythm; OHCA, out of hospital cardiac arrest; SAA, shock advice algorithm; SVM, support-vector machines; VF, ventricular fibrillation; VT, ventricular tachycardia.

SVMs have been used in rhythm classification of ECG readings (see table 2). Rhythm analysis in AEDs needs both high sensitivity and specificity and low processing power, so that the machines are cheap and easily available. Optimising the parameters used by the algorithm can therefore increase efficiency. Alonso-Atienza et al initially used an SVM with 13 ECG parameters that had previously been used to characterise VF and shockable rhythms.16 They then examined the utility of each ECG parameter individually using three different feature selection filters. They found threshold sample count, sample entropy (a measure of self-similarity within an ECG signal segment) and VF filter (a measure of the residue after a narrowband elimination filter is applied) to be the most effective in diagnosis of VF.
Therefore, a system using just these three features could decrease processing power while maintaining accuracy.16 Li et al similarly optimised their SVM algorithm, using only two parameters selected by a genetic algorithm, which mimics natural selection by eliminating weaker combinations to find the optimum combination.17 They achieved higher sensitivities and specificities with two parameters than Alonso-Atienza et al. However, the two studies used different window sizes, parameters and databases, making them difficult to compare directly. Difficulties also arise because many of these databases use ECG traces from Holter monitors, which differ from out of hospital cardiac arrest (OHCA) traces, which often have more noise. In future, a single public OHCA ECG database with training and test data sets would allow algorithms to be compared and would provide training sets more similar to actual OHCA traces.

SVMs have also been used for diagnosis of the cause of arrest based on ECG parameters. Thannhauser et al used an SVM to identify previous MI from VF waveforms.18 The diagnosis of previous MI based on VF morphology had previously been performed in animal studies, but this was the first human study to demonstrate ‘proof-of-concept’. This could be used to inform decision-making postcardiac arrest. Elucidating the cause of cardiac arrest is important postresuscitation for prevention of further episodes. However, this work is in the early stages and would need to be used with the whole clinical picture for decision-making purposes.

Building on previous SVM models, Krasteva et al assessed a CNN for characterisation of rhythms.19 They used large samples of ECG traces, over 3000 and 6000 for training and validation, respectively. However, there were more than four times as many non-shockable rhythm samples available as shockable rhythm samples.
Their model used ECG traces as short as 2 s, with maximal performance at 5 s, meaning their system would cause a minimal break in CPR before a shock decision is reached. Previous studies have found the average preshock pause in AEDs to be 18 s; this new technology could therefore greatly reduce breaks in CPR.20 Krasteva et al found that their model outperformed five CNN models in the literature on public and OHCA databases, as well as a current AED shock advisory programme using a decision tree classifier, particularly on shorter 2 s ECG traces. This represents a significant step forward compared with previous models. While they used a large sample size for training and validation of their deep learning model, their data set was imbalanced, with four times as many non-shockable as shockable rhythm samples. This is a commonly encountered issue with current databases and can lead to bias within the algorithms.

We can see from multiple studies in table 2 that CNN models have high sensitivity to detect shockable rhythms and high specificity to rule out non-shockable rhythms. Nonetheless, use of these more advanced machine learning algorithms is currently limited in practice by the difficulty of embedding them into AEDs, which have limited processing power. AEDs must be cost-effective to allow widespread use. While the above studies demonstrate high sensitivity and specificity, the algorithms have only been tested on computer-based systems and not in AED simulations. Bench studies, such as those used by Jekova et al to assess the accuracy of a commercial AED arrhythmia analysis algorithm in the presence of electromagnetic interference, help to evaluate algorithms in simulated real-life scenarios.21
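Sample entropy, one of the three features highlighted above, can be sketched in a few lines. The version below follows one common formulation (count template matches of length m and m + 1 under a Chebyshev distance, then take the log of their ratio); it is an illustration only and not the implementation used by Alonso-Atienza et al. The defaults `m=2` and an absolute tolerance `r=0.2` are assumptions, and in practice the tolerance is often scaled by the signal's standard deviation.

```python
import math
import random

def sample_entropy(signal, m=2, r=0.2):
    """Simplified sample entropy: low for regular signals, higher for irregular ones."""
    n = len(signal)

    def count_matches(length):
        # Use the same n - m starting points for both template lengths.
        templates = [signal[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):  # i < j, so self-matches are excluded
                distance = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if distance <= r:
                    count += 1
        return count

    matches_m = count_matches(m)
    matches_m_plus_1 = count_matches(m + 1)
    if matches_m == 0 or matches_m_plus_1 == 0:
        return math.inf  # undefined ratio; treated here as maximally irregular
    return math.log(matches_m / matches_m_plus_1)

# Synthetic signals, not ECG data: a strictly alternating sequence and Gaussian noise.
regular = [0.0, 1.0] * 20
rng = random.Random(0)
irregular = [rng.gauss(0.0, 1.0) for _ in range(100)]

regular_entropy = sample_entropy(regular)
irregular_entropy = sample_entropy(irregular)
```

The double loop is quadratic in the segment length, which illustrates why the number and cost of features matters on devices with limited processing power.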

Rhythm classification during CPR

One of the major limitations of AED ECG recognition is that CPR must be interrupted for reliable diagnosis, as current algorithms are unable to classify shockable and non-shockable rhythms during CPR due to artefacts. CPR is often suspended for 15 s or more for rhythm classification to occur.22 Even small breaks in CPR can impact outcomes; an increase in preshock pause of just 5 s decreases survival by 18%.20 The ability to continue chest compressions while analysing the rhythm would help to minimise interruptions. Table 3 summarises the use of machine learning technologies to analyse rhythms during CPR.

Table 3

Tabular summarisation of search on the use of machine learning algorithms in rhythm classification during CPR

Study	Number of ECG segments used	Study design	Algorithm used	Sensitivity	Specificity	Accuracy	Limitations
Jekova et al26	1545	End-to-end analysis of ECG during CPR in OHCA using CNN	CNN	89.0%	91.7%	–	Data did not contain statistically significant numbers of shockable VT
Hajeb-Mohammadalipour et al51	23816	Development of an automated condition-based filter to remove CPR artefacts for accurate rhythm analysis during CPR	Condition-based filtering algorithm followed by ANN	94.5%	88.3%	89.2%	Assumed a constant rate of chest compressions within the 14 s period; difficulty removing artefacts from asystole ECGs and lack of sufficient asystole ECGs in training set
Hajeb-Mohammadalipour et al52	3872	Analysis of ECG rhythms superimposed with CPR artefacts using a CNN	CNN	95.2%	86.0%	88.1%	Artificially introduced artefacts from AEDs in asystole, not real-life traces; not tested during asystole
Didon et al25	2916	To present new combination of algorithms for rhythm analysis during CPR in AED	Analyse While Compressing (AWC)	92.1%	>99%	–	Small sample of VT rhythms; still requires 'hands-off' reconfirmation of classification in 34.4% of cases
Isasi et al23	272	Rhythm classification during CPR using a recursive least squares filter followed by CNN	Recursive least squares filter followed by CNN	95.8%	96.1%	96.0%	Recursive least squares filter requires thoracic impedance to remove ECG artefacts
Hu et al53	1578	Two-step analysis of ECG during chest compressions whereby if shockable rhythm not identified, chest compression-free analysis occurs	A two-step analysis through CPR algorithm	93.6%	99.5%	–	Small sample size of coarse VT; the OHCA cardiac arrests were not treated with a defibrillator until they arrived at hospital; short ECG segments
Isasi et al54	2203	Use of machine learning algorithms following CPR artefact filtering for reliable shock decisions	Least mean squares filter followed by ANN, SVM, Kernel Logistic Regression or Random Forest classifier	94.5%	95.5%	96.0%	Computer-based study, not ‘bench’ simulation study
Fumagalli et al55	2701	Analysis of ECG during chest compressions with 3 s pause to re-confirm rhythm	Analysis During Compressions with Fast Reconfirmation (ADC-FR) algorithm	95.0%	99.0%	–	Requires thoracic impedance for removal of ECG artefact
Yu et al24	1017	An adaptive filter which can eliminate CPR artefacts from corrupted ECGs without any reference channels can be used for non-shockable rhythm detection during CPR	ANN	95.0%	80.0%	–	Tested with artificial mixtures of clean human ECGs and CPR artefacts collected from pigs; only 24 CPR artefacts produced and superimposed onto the ECG segments

ANN, artificial neural network; CNN, convolutional neural network; OHCA, out of hospital cardiac arrest; VT, ventricular tachycardia.

Adaptive filters have been used to remove CPR artefacts. These filters, such as least mean squares or recursive least squares filters, use signals recorded by defibrillators, including compression depth and thoracic impedance, to model the artefact and remove it prior to rhythm classification. Isasi et al used a recursive least squares filter to remove CPR artefacts and a CNN for rhythm classification. They found sensitivities and specificities of 95.8% and 96.1%, respectively.
The use of such adaptive filters is limited in practice, as they rely on additional reference channels that are not readily available in all standard AEDs.23 Similarly, Yu et al used noise-assisted multivariate empirical mode decomposition and least mean squares filtering.24 Even after adaptive filtering, ECG segments recorded during CPR can still contain more noise than standard ECGs, so machine learning algorithms designed for these signals can improve accuracy. Yu et al constructed a neural network to assess the rhythms and identify VF. They found sensitivities >95% and specificities >80%. However, the CPR artefacts were taken from ECGs of pigs in asystole receiving chest compressions, not from real-life OHCA ECGs.24

Didon et al developed a new protocol termed ‘Analyse While Compressing’ (AWC). AWC is a two-step process in which the rhythm is initially analysed during chest compressions and, if a shock is advised, the rhythm is confirmed in the absence of chest compressions prior to shock delivery. Reconfirmation of rhythm was still required in 34.4% of non-shockable rhythm cases in which the rhythm could not be accurately classified; CPR interruptions therefore still took place.25

To avoid the need for adaptive filters or external feedback devices, end-to-end analysis of the rhythm has been evaluated. Jekova et al aimed to optimise an end-to-end CNN model for shock advisory decision during CPR using real-life AED recordings in OHCA.26 Their CNN was able to extract features from raw ECGs during CPR with sensitivities and specificities of 89.0% and 91.7%, respectively. They tested their model on 5591 real-life cardiac arrest rhythms during CPR.
Nevertheless, their sensitivities and specificities remain below the American Heart Association (AHA) recommendations for SAAs, by 1% for VF and 3.9% for asystole.27 Their database also lacked sufficient shockable VT rhythms (less than 0.2% of the total number of rhythms), so they were unable to report statistically significant sensitivities for VT.26 There is scope for further optimisation of the model, possibly with further training data sets or additional layers and channels in the CNN, to make the model clinically useful.
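The least mean squares approach to artefact removal discussed in this section can be sketched as follows. This is a toy example under stated assumptions, not the filter used in any cited study: the 'ECG' and the compression artefact are synthetic sinusoids, the reference channel is assumed to be perfectly correlated with the artefact, and the sampling rate, tap count and step size `mu` are arbitrary illustrative choices.

```python
import math

def lms_filter(corrupted, reference, n_taps=4, mu=0.01):
    """Subtract an adaptive estimate of the artefact, built from a reference channel."""
    weights = [0.0] * n_taps
    cleaned = []
    for n in range(len(corrupted)):
        # Most recent n_taps reference samples, zero-padded at the start of the record.
        window = [reference[n - k] if n - k >= 0 else 0.0 for k in range(n_taps)]
        artefact_estimate = sum(w * x for w, x in zip(weights, window))
        error = corrupted[n] - artefact_estimate   # the error is also the cleaned output
        # LMS update: nudge each weight along the error-times-input direction.
        weights = [w + 2.0 * mu * error * x for w, x in zip(weights, window)]
        cleaned.append(error)
    return cleaned

def mean_squared_difference(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Synthetic signals (assumptions): 250 Hz sampling, 2 Hz compressions, 6 Hz underlying rhythm.
fs = 250
n_samples = 2000
t = [n / fs for n in range(n_samples)]
reference = [math.sin(2 * math.pi * 2.0 * ti) for ti in t]   # stand-in for a depth or impedance channel
clean = [0.5 * math.sin(2 * math.pi * 6.0 * ti) for ti in t]
corrupted = [c + 2.0 * r for c, r in zip(clean, reference)]

cleaned = lms_filter(corrupted, reference)

# Compare against the known clean signal once the filter has had time to adapt.
half = n_samples // 2
error_before = mean_squared_difference(corrupted[half:], clean[half:])
error_after = mean_squared_difference(cleaned[half:], clean[half:])
```

The dependence on `reference` is the practical limitation noted above: without a channel that tracks the compressions, this style of filter has nothing from which to model the artefact.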

Implantable cardioverter defibrillators

ICDs rely on recognition of life-threatening VT and VF rhythms before delivering a shock. The SAA must differentiate between shockable rhythms and non-shockable rhythms, including normal sinus rhythm, supraventricular tachycardias, sinus bradycardia, AF and idioventricular rhythms. The SAA must have high sensitivity for shockable rhythms and high specificity for non-shockable rhythms, where the delivery of a shock will confer no benefit and can even result in deterioration of the rhythm. Given the catastrophic consequences of missing potentially fatal rhythms, ICDs are programmed with a high sensitivity threshold in order to avoid missed shocks. However, this can lead to high numbers of inappropriate shocks. These shocks cause device complications such as reduced battery life and the need for earlier reimplantation. Moreover, for the patient, the shocks are associated with pain, worse quality of life and an increased risk of dangerous arrhythmias.28 Table 4 summarises our search on the use of AI in ICDs. Outside of the SAA, machine learning can be used to predict appropriate candidates for ICD insertion and identify adverse events secondary to ICD, including risk of electrical storm. The use of machine learning to predict the success of defibrillation will be discussed below.
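The trade-off described above, in which favouring sensitivity increases inappropriate shocks, can be made concrete with a small sketch. The scores, labels and thresholds are hypothetical: each score stands for a classifier's confidence that a rhythm is shockable, and a shock is assumed to be delivered when the score reaches the threshold. No real device works from these numbers.

```python
def shock_decisions(scores, is_shockable, threshold):
    """Count missed shockable rhythms and inappropriate shocks at a given threshold."""
    pairs = list(zip(scores, is_shockable))
    missed = sum(1 for score, shockable in pairs if shockable and score < threshold)
    inappropriate = sum(1 for score, shockable in pairs if not shockable and score >= threshold)
    return {"missed_shockable": missed, "inappropriate_shocks": inappropriate}

# Hypothetical classifier scores for four shockable and four non-shockable rhythms.
scores = [0.95, 0.80, 0.60, 0.40, 0.70, 0.30, 0.20, 0.10]
is_shockable = [True, True, True, True, False, False, False, False]

moderate = shock_decisions(scores, is_shockable, threshold=0.5)
sensitive = shock_decisions(scores, is_shockable, threshold=0.25)
```

Lowering the threshold from 0.5 to 0.25 removes the missed shockable rhythm in this toy data but doubles the inappropriate shocks, mirroring the clinical trade-off.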

Table 4

Tabular summarisation of search on the use of artificial intelligence in ICDs

Study	Number of participants	Study design	Algorithm used	Most accurate predictive factors	Potential limitations	Benefits
Wu et al56	382	Prospective registry analysis	Random Forest	HF hospitalisation; CMR derived LA and LV volumes; larger total scar and grey zone extents; lower LA emptying fractions; serum IL-6	Observational study; long enrolment for cohort; ICD programming parameters not prescriptive	Identification of predictive factors for appropriate ICD interventions in a cohort of patients suitable for primary prevention ICD insertion.
Van Hille et al57	62	Retrospective database analysis	Drools and ontology reasoning modules	With finer level of granularity DROOLS would be preferred	Small sample sizes; does not use specific instructions	Drools and ontology reasoning approaches are efficacious methods for the triage of AF alerts from ICD devices.
Shakibfar et al29	16 022	Retrospective database analysis	Logistic regression (model 1); random forest (model 2)	Total number of sustained episodes; shocks delivered; cycle length parameters	–	Prediction of electrical storm using machine learning models based on ICD remote monitoring summaries during episodes. Random forest superior to logistic regression (p<0.01).
Shakibfar et al30	19 935	Retrospective cohort study	Random forest and logistic regression	Percentage of ventricular pacing during the day; activity of ICD during day; average ventricular HR during day; number of previously untreated tachycardias	Difficult to differentiate nsVT and VT; US only (generalisability)	Use of large-scale random forest showed that daily summaries of ICD measurements in the absence of clinical information can predict short term risk of electrical storm.
Ross et al58	71 948	Retrospective registry analysis	Random forest and logistic regression	Family history of sudden death; NYHA 4; previous ICD; thoracic cardiac surgery and biventricular pacemaker insertion	Dual chamber ICDs only; no information on leads; single rather than multiple imputation	Random forest can improve identification of mortality and adverse events by dual-chamber ICDs.

AF, atrial fibrillation; CMR, cardiac MRI; HF, heart failure; HR, heart rate; IL-6, interleukin-6; LA, left atrium; LV, left ventricle; nsVT, non-sustained ventricular tachycardia; NYHA 4, New York Heart Association Classification 4; VT, ventricular tachycardia.

Electrical storm is defined as three or more sustained episodes of VT, VF or appropriate ICD shocks in a 24-hour period. It can be life threatening despite an ICD; therefore, identifying those at high risk is important. Models for the prediction of electrical storm have been assessed; they found percentage of ventricular pacing, cycle length parameters and number of previously untreated tachycardias to be risk factors.29 30
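
The definition of electrical storm given above (three or more qualifying events in a 24-hour period) can be sketched as a sliding-window check over event timestamps. This is a minimal illustration: the function name, the input format and the choice to count events exactly 24 hours apart as falling in the same period are assumptions, not the method used in the cited studies.

```python
from datetime import datetime, timedelta


def has_electrical_storm(event_times, window=timedelta(hours=24), min_events=3):
    """Return True if at least `min_events` events fall within any `window`.

    `event_times` holds the timestamps of sustained VT/VF episodes or
    appropriate ICD shocks (hypothetical input format).
    """
    times = sorted(event_times)
    start = 0
    for end in range(len(times)):
        # Shrink the window from the left until it spans at most `window`.
        while times[end] - times[start] > window:
            start += 1
        if end - start + 1 >= min_events:
            return True
    return False


base = datetime(2022, 1, 1)
clustered = [base + timedelta(hours=h) for h in (0, 5, 20)]
spread_out = [base + timedelta(hours=h) for h in (0, 13, 26)]
print(has_electrical_storm(clustered))   # three events within 20 hours
print(has_electrical_storm(spread_out))  # never three events within 24 hours
```
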

Predicting success of defibrillator shocks

There are multiple potential benefits to the prediction of successful defibrillation. Currently, in OHCA, shocks are delivered, depending on rhythm assessment, following 2 min of CPR.31 This resuscitation protocol does not consider the likelihood of shock delivery being successful at any point during the arrest. AI algorithms can be used to predict the likelihood of shock success, in the hope that shocks could be delivered at the optimum time. A summary of these papers is in table 5.

Table 5

Tabular summarisation of search on the prediction of ICD interventions

Study	Data set	Number of participants	Algorithm used	Classification accuracy	AUC	Ventricular arrhythmias
Okada et al59	CMR imaging	122	Substrate spatial complexity analysis	81.0%	0.72	40
Kotu et al32	CMR imaging	54	MATLAB, SVM and k-NN	94.4% to 92.6%	0.96	–
Ebrahimzadeh et al60	ECG	70 (35 normal, 35 sudden cardiac death)	k-NN, MLP	84.0% to 99.7%	–	–
Au-Yeung et al37	ECG	788	RF, SVM	–	0.81 to 0.88	3 in 10 patients
Marzec et al61	CIED	235	RF, k-NN, STATA IC	55.3% to 76.6%	0.5	49
Shandilya et al36	ECG+PetCO₂	153	MDI model	78.8%	0.832	–
Howe et al33	ECG	41	SVM	81.9%	0.75	115
Shandilya et al34	ECG	57 cardiac arrests (90 signals)	SVM	Up to 83.3%	0.85 to 0.93	57

AUC, area under the curve; CIED, cardiac implantable electronic devices; CMR, cardiac MRI; k-NN, k nearest neighbours algorithm; MDI, multidomain integrative; MLP, multilayer perceptron; RF, random forest; STATA-IC, statistical software package; SVM, support vector machines.

SVMs have been used to predict successful defibrillation in VF arrest.32–34 Multiple VF waveform characteristics were used in these studies; the best predictors of termination of VF included amplitude spectrum area (a frequency domain characteristic) and slope and root mean square amplitude (time domain characteristics). Howe et al found an accuracy of 81.9% using their model with the aforementioned VF waveform characteristics.33 However, this was based on a small retrospective study of only 41 patients with 115 defibrillation ECGs, and larger sample sizes would be required to validate this system. The accuracy of predicting defibrillation success was improved with waveform capnography. Capnography is being used more frequently in cardiac arrest scenarios as it can also provide an early indication of return of spontaneous circulation. However, its use would be limited in community AEDs, where capnography is not commonplace and would be difficult to implement without training, and in ICDs, where it is not available.

Shandilya et al constructed a similar SVM algorithm assessing VF waveform characteristics, with an accuracy of 83.3%.34 The patients in their study had received low-energy (120 J) shocks. While this is considered equivalent to higher energy shocks, it may affect how the results can be compared with similar studies. VF is the initial rhythm in only 20%–30% of cardiac arrests.35 We have not yet seen studies assessing prediction of shock success in other rhythms such as VT.

In a more recent paper, Shandilya et al performed a retrospective analysis of 153 patients with OHCA who received at least one shock for VF. Using a multidomain integrative (MDI) model, a type of AI model, to classify ECG rhythms and predict defibrillation success, they found 78.8% accuracy with ECG rhythms alone. As above, the addition of end-tidal CO2 increased accuracy, to 83.3%; unfortunately, this information was only available for 48 patients.36 They did not control for preshock pauses and ‘no-flow’ time before defibrillation, which have previously been shown to impact success.35 This was a relatively small study, and larger sample sizes will be required to obtain more meaningful data. Current sensitivities and specificities are unlikely to be sufficient to justify changing the current protocols.

AI could also be used to aid decision-making for implantation of an ICD. Patients with previous MI are separated, based on clinical guidelines, into high arrhythmia risk groups, who could benefit from an ICD, and low arrhythmia risk groups. It would be beneficial to risk stratify patients individually so that ICDs are provided to those who might benefit. Markers such as left ventricular ejection fraction and myocardial scar size have been used in AI systems to evaluate arrhythmia risk. Kotu et al used cardiac MRI features including size, location and texture of scarred myocardium to characterise labelled high and low risk groups.32 Using an SVM classifier, they obtained an average accuracy of 92.6% with a combination of scar size and heterogeneity. This technology could be used clinically to aid decision-making; nevertheless, the final decision would still need to be clinician led and made on a case-by-case basis.

As well as predicting the success of ICD shocks, predicting the need for a shock before its delivery would be useful clinically to warn patients and avoid the side effects of ‘surprise shocks’. Au-Yeung et al used data from the Sudden Cardiac Death Heart Failure Trial, collecting rhythms preceding ventricular tachyarrhythmia and regular rhythms from patients with congestive heart failure.37 They analysed heart rate variability data 5 min and 10 s before tachyarrhythmia in an attempt to identify a ‘signature’ of VF/VT onset, using both random forest and SVM to assess the data. They found a specificity of 75% for 5 min prediction and 80% for 10 s prediction. With these results, however, there would likely be many false positives. The study was also limited in that it only assessed patients with heart failure. It is possible that using additional features or more sensitive AI programmes could yield higher sensitivities that could be used in clinical practice.
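
The three VF waveform characteristics named earlier in this section (amplitude spectrum area, slope and root mean square amplitude) can be computed from a single ECG segment as in the sketch below. It assumes one common formulation of amplitude spectrum area, the sum of spectral amplitude multiplied by frequency over a band; the 4 to 48 Hz band, the use of mean absolute slope and the function name are assumptions, and the cited studies may define or normalise these features differently.

```python
import numpy as np


def vf_waveform_features(ecg, fs, f_lo=4.0, f_hi=48.0):
    """Compute illustrative VF waveform features from one ECG segment.

    ecg: 1-D array of samples; fs: sampling rate in Hz.
    f_lo and f_hi bound the band used for amplitude spectrum area (assumed).
    """
    ecg = np.asarray(ecg, dtype=float)
    ecg = ecg - ecg.mean()  # remove any constant offset

    # Time domain: root mean square amplitude and mean absolute slope.
    rms = np.sqrt(np.mean(ecg ** 2))
    mean_abs_slope = np.mean(np.abs(np.diff(ecg))) * fs  # signal units per second

    # Frequency domain: single-sided amplitude spectrum, then AMSA.
    amplitude = 2.0 * np.abs(np.fft.rfft(ecg)) / len(ecg)
    freqs = np.fft.rfftfreq(len(ecg), d=1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    amsa = np.sum(amplitude[band] * freqs[band])

    return {"rms": rms, "mean_abs_slope": mean_abs_slope, "amsa": amsa}


# Synthetic stand-in for a VF segment: a 6 Hz sine sampled at 250 Hz for 4 s.
fs = 250
t = np.arange(4 * fs) / fs
print(vf_waveform_features(np.sin(2 * np.pi * 6 * t), fs))
```

In a prediction model of the kind described above, a feature vector like this would be computed for each pre-shock segment and passed to a classifier such as an SVM.
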

Discussion

We have outlined above the enormous potential of AI in cardiology, specifically in AEDs and ICDs. Machine learning offers exciting prospects to reduce peri-shock pauses, both through increased efficiency of SAAs and through the ability of SAAs to classify rhythms without interrupting CPR. In ICDs, machine learning has a number of applications that could improve the quality of life of patients, including prediction of shock and electrical storm.

Despite this potential, there are some limitations to be aware of. Commercially available AEDs already exhibit high specificity. Compared with ICDs, AEDs favour specificity over sensitivity to reduce inappropriate shocks. International standards advise AED sensitivity of >90% and specificity of >95% for detecting coarse VF.27 Nishiyama et al found, on assessment of four commercially available AEDs, that VF was diagnosed and treated correctly in almost all cases.38 Given that the technology already performs so well, it could be argued that newer AI algorithms increase the cost and complexity of machines with minimal gain. However, none of the AEDs investigated could obtain both a >75% sensitivity for VT and >95% specificity for SVT.38 In the future, improving VT and SVT discrimination could be a key area for AI.

Overfitting represents a challenge to AI algorithms, whereby the model has learnt in such a way that the rules are only applicable to the training sample and are no longer generalisable.6 7 As well as drop-out regularisation, the large data sets now available help to mitigate overfitting in the training of algorithms. In medical imaging, data augmentation has been used to artificially enlarge the available data sets by creating variants of the original images.39 Whether this could also be used with ECG traces is unclear.
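
Purely to illustrate what augmentation of a one-dimensional trace might involve, the sketch below creates variants of a signal by random amplitude scaling, a circular time shift and added noise. Every transform and parameter value here is an assumption chosen for illustration; whether such variants preserve the diagnostic content of an ECG is exactly the open question raised above.

```python
import numpy as np


def augment_trace(ecg, rng, noise_sd=0.02, scale_range=(0.9, 1.1), max_shift=25):
    """Return one randomly perturbed copy of a 1-D trace (hypothetical scheme)."""
    ecg = np.asarray(ecg, dtype=float)
    scaled = ecg * rng.uniform(*scale_range)          # random gain
    shift = int(rng.integers(-max_shift, max_shift + 1))
    shifted = np.roll(scaled, shift)                  # circular shift, wraps around
    return shifted + rng.normal(0.0, noise_sd, size=ecg.shape)  # additive noise


rng = np.random.default_rng(0)
trace = np.sin(np.linspace(0, 2 * np.pi, 500))  # synthetic stand-in for an ECG segment
variants = [augment_trace(trace, rng) for _ in range(5)]
print(len(variants), variants[0].shape)
```

Each variant keeps the label of the trace it was derived from, which is how augmentation enlarges a labelled training set.
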
Multiple studies included in this review also discussed the lack of a single large database that would allow algorithms to be compared. The benefits of a single large database would therefore be twofold: mitigating overfitting and allowing direct comparison of algorithms.

A common issue within the field of AI is the ‘black-box problem’: some AI models, in particular neural networks, lack interpretability in their decision-making process.40 Many of the studies we reported above have detailed in their methods which parameters were used in their algorithms. Nonetheless, it can be difficult to fully explain the outcomes reached based on these parameters. As neural networks become more complex, with increasing numbers of layers, they become more difficult to interpret. Explainable AI has been a key area of research, particularly given the potential medicolegal issues of incorrect shock decisions.

In the USA, bystander AED use occurs in only 2% of OHCAs.41 Another application of AI that we have not yet discussed is drone delivery of AEDs, in order to increase the availability of AEDs and reduce the time to initial defibrillation. A recent simulation study in rural Canada found that drone-delivered AEDs decreased time to defibrillation by between 1.8 min and 8.0 min, which would have a great impact on mortality.42 AI could be used to calculate optimum geographical locations and possible patrols to allow the greatest access to AEDs. Some limitations of drone delivery remain, including flight path restrictions and an inability to fly in rainy and windy conditions, which would need to be overcome before widespread use.

One of the most exciting future advances in machine learning use in AEDs is rhythm recognition during CPR. This technology has developed from adaptive filters that remove CPR artefacts to end-to-end SAAs.
AHA-recommended sensitivities and specificities have not yet been reached, but with further optimisation of algorithms this could become a reality soon. One key step will be the development of a large database of real-life AED traces during CPR. Jekova et al were able to use a large database, but the proportions of rhythms, for example VF, did not meet criteria, and more studies will be required for further optimisation.26 Current models have not been able to eliminate ‘hands-off’ time completely, as they often still require reconfirmation of the rhythm in the absence of chest compressions.25 Further optimisation of these algorithms remains an exciting area of research.

Conclusion

Machine learning remains a promising new technology for SAAs in AEDs and ICDs. These technologies have the potential to increase survival in OHCA by removing the need to stop CPR during resuscitation and by optimising the timing of shock delivery. They could also help to diagnose the cause of arrest (for example, previous MI) and improve patient quality of life by reducing inappropriate ICD shocks, all of which could have life-changing outcomes for patients. Even small improvements in the sensitivities and specificities of these widely used defibrillators could save hundreds of lives. In the future, a single large database of real-life training and testing ECGs would be useful for building and assessing algorithms and would allow different technologies to be compared. We hope to see this technology being integrated into clinical practice in the near future.

59 in total

1. Ventricular Fibrillation Waveform Analysis During Chest Compressions to Predict Survival From Cardiac Arrest.

Authors: Jason Coult; Jennifer Blackwood; Lawrence Sherman; Thomas D Rea; Peter J Kudenchuk; Heemun Kwok
Journal: Circ Arrhythm Electrophysiol Date: 2019-01

2. Comparing Drools and ontology reasoning approaches for telecardiology decision support.

Authors: Pascal Van Hille; Julie Jacques; Julien Taillard; Arnaud Rosier; David Delerue; Anita Burgun; Olivier Dameron
Journal: Stud Health Technol Inform Date: 2012

3. An automated system for ECG monitoring.

Authors: M E Nygårds; J Hulting
Journal: Comput Biomed Res Date: 1979-04

4. Cardiac rhythm analysis during ongoing cardiopulmonary resuscitation using the Analysis During Compressions with Fast Reconfirmation technology.

Authors: Francesca Fumagalli; Annemarie E Silver; Qing Tan; Naveed Zaidi; Giuseppe Ristagno
Journal: Heart Rhythm Date: 2017-09-14 Impact factor: 6.343

5. Analyze Whilst Compressing algorithm for detection of ventricular fibrillation during CPR: A comparative performance evaluation for automated external defibrillators.

Authors: Jean-Philippe Didon; Sarah Ménétré; Irena Jekova; Todor Stoyanov; Vessela Krasteva
Journal: Resuscitation Date: 2021-01-30 Impact factor: 5.262

6. The performance of a new shock advisory algorithm to reduce interruptions during CPR.

Authors: Yingying Hu; Hanqi Tang; Chenguang Liu; Daoyuan Jing; Huadong Zhu; Yazhi Zhang; Xuezhong Yu; Guoxiu Zhang; Jun Xu
Journal: Resuscitation Date: 2019-08-01 Impact factor: 5.262

7. Ventricular fibrillation and tachycardia classification using a machine learning approach.

Authors: Qiao Li; Cadathur Rajagopalan; Gari D Clifford
Journal: IEEE Trans Biomed Eng Date: 2013-07-26 Impact factor: 4.538

8. Cardiac magnetic resonance image-based classification of the risk of arrhythmias in post-myocardial infarction patients.

Authors: Lasya Priya Kotu; Kjersti Engan; Reza Borhani; Aggelos K Katsaggelos; Stein Ørn; Leik Woie; Trygve Eftestøl
Journal: Artif Intell Med Date: 2015-07-04 Impact factor: 5.326

9. Combining Amplitude Spectrum Area with Previous Shock Information Using Neural Networks Improves Prediction Performance of Defibrillation Outcome for Subsequent Shocks in Out-Of-Hospital Cardiac Arrest Patients.

Authors: Mi He; Yubao Lu; Lei Zhang; Hehua Zhang; Yushun Gong; Yongqin Li
Journal: PLoS One Date: 2016-02-10 Impact factor: 3.240

10. Deep Feature Learning for Sudden Cardiac Arrest Detection in Automated External Defibrillators.

Authors: Minh Tuan Nguyen; Binh Van Nguyen; Kiseon Kim
Journal: Sci Rep Date: 2018-11-21 Impact factor: 4.379