Appositeness of Optimized and Reliable Machine Learning for Healthcare: A Survey.

Subhasmita Swain¹, Bharat Bhushan¹, Gaurav Dhiman²,³,⁴, Wattana Viriyasitavat⁵.

Abstract

Machine Learning (ML) is a branch of Artificial Intelligence (AI) within computer science in which programmable machines imitate human learning behavior with the help of statistical methods and data. The healthcare industry is one of the largest and busiest sectors in the world, and it relies on an extensive amount of manual moderation at every stage. Most clinical documents concerning patient care are hand-written by experts; only selected reports are machine-generated. This process raises the chance of misdiagnosis, thereby putting patients' lives at risk. Recent efforts to automate manual operations have made extensive use of ML. This paper surveys the applicability of ML approaches in automating medical systems and discusses most of the optimized statistical ML frameworks that encourage better service delivery in clinical settings. The widespread adoption of Deep Learning (DL) and ML techniques as the underlying systems for a variety of wellness applications is marked by open challenges and by numerous security concerns. This work identifies a variety of vulnerabilities occurring in medical procurement and acknowledges the concerns over predictive performance from a privacy point of view. Finally, it provides possible risk-limiting measures and directions for active challenges in the future.

Entities: Chemical

Year: 2022 PMID: 35342282 PMCID: PMC8939887 DOI: 10.1007/s11831-022-09733-8

Source DB: PubMed Journal: Arch Comput Methods Eng ISSN: 1134-3060 Impact factor: 8.171

Introduction

In this era of technology and advancement, ML/DL systems have transformed industries such as governance, manufacturing, and transportation. Over the past few years, the use of intelligent systems has increased manifold in various domains, including everyday life. One such realm is healthcare [1, 2], which had earlier been impervious to large-scale technological disruption. The healthcare industry across the globe has evolved extensively with the advent of machine intelligence. Nasr et al. [3] explore current state-of-the-art smart healthcare systems, highlighting significant topics such as wearable and smartphone devices for fitness monitoring, ML for illness prediction, and assistive frameworks, including social robots designed for assisted living environments. Bharadwaj et al. [4] discuss applications of ML algorithms integrated with the Healthcare Internet of Things (H-IoT) in terms of their advantages, scope, and potential future directions. ML/DL techniques have achieved exceptional results in diverse tasks such as brain tumor segmentation [5], saliva sample classification of COPD patients [6], chronic neurological disorder assistance [7], anomaly recognition in the artificial pancreas [8], clinical image reconstruction [9], and cancerous cell classification, to name a few. It is expected that in the coming years intelligent software systems will take over much of the human labor that radiologists and physicists put into examining medical documents. ML will transform conventional medical practice and research. Healthcare has emerged as an active application area in which ML/DL models achieve human-level performance in various pathological tasks [10]. Some investigations have reported that intelligent models outperformed clinical experts in certain respects. Esteva et al. [11] classify skin lesions with a single CNN evaluated against 21 board-certified dermatologists on biopsy-proven clinical diagnoses, including the deadliest form of skin cancer. The findings show that AI can classify skin cancer with a degree of accuracy equivalent to that of dermatologists. Rajpurkar et al. [12] build the CheXNet algorithm, which can diagnose pneumonia from chest X-rays at a level exceeding that of experienced radiologists, outperforming them on the F1 measure. The drive to improve the performance of ML models relative to humans has produced a marked increase in the development of computer-aided diagnostic systems. The potential of AI systems for healthcare applications has grown with the development of advanced technologies such as the Internet of Things (IoT), Big Data, and cloud computing. Combined with these technologies, AI can produce highly accurate monitoring and prediction systems that facilitate human-centric emergency medical assistance. Shishvan et al. [13] presented a variety of emerging ML algorithms in the context of comprehensive healthcare services. The work introduces the applicability of intelligent algorithms in multiple steps such as data extraction, feature selection, model fitting, training, and execution, together with a set of performance metrics for evaluation [14]. Kumar et al. [15] develop a classification structure to categorize the recurrence of specific health conditions based on the clinical history of patients, using pre-trained word2vec, GloVe, domain-trained, and universal sentence encoder embeddings as well as fastText to classify sixteen morbidity conditions within medical histories. In this digital era, healthcare services have extended to wearable devices, IoT, and cloud applications as we attain a deeper understanding of embedded and automated systems in the clinical context [4].
Developing targeted therapies for personalized treatment, accurately localizing disease hubs, and identifying morbidities will become feasible if intelligent systems are developed with careful attention to the liabilities associated with them [16]. Dhief et al. [17] presented an extensive review of IoT frameworks and state-of-the-art techniques used in healthcare and voice pathology surveillance systems, whereas Alhussein et al. [18] investigated a voice abnormality detection system using DL on mobile healthcare frameworks. Researchers and physicians are reviewing numerous approaches to apply DL methods in Intensive Care Units (ICUs) and to other critical concerns [19-21]. Similarly, Ganainy et al. [22] proposed a real-time bedside consultation system for the clinical context that forecasts the current status of Mean Arterial Pressure (MAP) values using new ML structures. Many intelligent applications that use customer records have at some point delivered disappointing results because of their fixation on metrics [23-25]. Privacy concerns that arise during data transmission or analysis for a predictive system often leave that system in a compromised state [26-28]. This paper attempts to acknowledge the diverse techniques of ML and their application in the healthcare ecosystem. A brief summary of the subsequent sections is provided next [29-32]. The paper gives a concise statistical background of ML algorithms and discusses multiple ML models, their application in clinical settings, certain hindrances, and possible solutions to those shortcomings. It outlines various challenges related to medical analysis using ML and DL techniques. It analyses and lists the heterogeneous sources that contribute to healthcare data and their associated flaws. It describes the applications of ML in healthcare for medical prognosis, computer-aided detection, diagnosis, and treatment, and outlines the associated drawbacks. It lists different types of vulnerabilities in the ML pipeline and their sources, and highlights various techniques to avoid information breaches and preserve the privacy of data for clinical users. The remainder of the paper is organized as follows. Section 2 presents the various ML algorithms, their applications, and their mathematical background. Section 3 presents the different applications of ML in healthcare systems and describes the present scenario, in which intelligent systems are used to automate regular tasks. Section 4 examines the probable vulnerabilities encountered while preparing ML models in the healthcare pipeline. Section 5 presents a study of the privacy challenges raised by the involvement of AI systems and various approaches to preserving privacy. Finally, Sect. 6 presents future prospects and areas that require further research, followed by the conclusion in Sect. 7.

Background of ML Algorithms

Many developing countries have invested time and money in advanced technologies that in one way or another prove cost-effective in the long run. Development is often associated with the advent of automated machinery and mechanical systems as the world becomes increasingly data-centric. Managing and using data effectively at an industrial scale is an irksome task when humans do the work; this is where ML/DL-based intelligent systems gain their importance. ML algorithms are developed specifically to support models that solve problems in different domains (e.g., healthcare, fintech, industry) [33]. Okay et al. [34] demonstrate that applying Interpretable Machine Learning (IML) models to sophisticated and difficult-to-interpret ML approaches provides thorough interpretability while preserving accuracy, a balance that is challenging to strike when crucial medical choices are at stake. Ileberi et al. [35] implement an ML-based framework that uses the Synthetic Minority Over-sampling Technique (SMOTE) for credit card fraud detection and report that it outperforms other prevailing methodologies. Ahsan et al. [36] propose a prognostics framework based on statistics-driven ML modeling for forecasting qualification test results of electronic components, allowing a decrease in qualification test cost and time. Hari et al. [37] offer a supervised ML method that models the behavior of Gallium Nitride (GaN) power electronic devices to reliably forecast the current and switching-voltage waveforms of these devices. Seng et al. [38] concentrate on how computer vision (CV) and ML practices may be applied to existing winemaking processes and vineyard management to obtain industry-relevant outcomes. Rehman et al. [39] provide an ML technique for the localization of brain tumor cells using texton-map images from FLAIR Magnetic Resonance Imaging (MRI) scans. Singh et al. [40] offer an ensemble-based classification technique that combines AI, fog computing, and smart health to create a reliable platform for the early identification of COVID-19 infection. Likewise, Vyas et al. [41] offer an ML model powered by a multimodal method for assessing a patient's willingness to recommend the hospital, which plays an important part in designing actions based on patient choice. Some ML algorithms and their purposes are discussed in the forthcoming sections. A summary of the different ML algorithms discussed in this chapter is depicted in Fig. 1.

Fig. 1

Illustration of various ML algorithms and their categories

Regression Models

Regression analysis is a statistical modeling method that aims to define a relationship (linear or polynomial) between a dependent variable and one or more independent variables [42]. This predictive modeling technique can be used for forecasting, time-series modeling, predictive analysis, etc. Existing regression methodologies include Linear, Polynomial, Logistic, Multivariate, Ridge, and Bayesian Linear Regression. Some of these are discussed next.

Linear Regression

Linear regression models have shaped the statistical view of supervised learning for quantitative response prediction, describing the relation that links the independent variables (input vector) to the dependent variable (output). The relationship is represented by a linear function. In the ML arena, linear regression models stand out for their simplicity while retaining considerable interest and ease of interpretability. Velez et al. [43] presented a straightforward definition of interpretability in ML as the ability to explain or present a model in terms understandable to a human. Linear regression seeks a linear function $f$ that describes the relationship between an input vector $x$ of dimension $d$ and a real-valued output $y$ (i.e., $y = f(x)$) as

$$f(x) = \beta_0 + \beta^\top x,$$

where $\beta_0 \in \mathbb{R}$ is the intercept of the function and $\beta \in \mathbb{R}^d$ is the coefficient vector corresponding to the individual input variables. To calculate the regression coefficients $\beta_0$ and $\beta$, a training set $(A, y)$ is required, where $A \in \mathbb{R}^{k \times d}$ denotes $k$ training inputs $x_1, \dots, x_k$ and $y \in \mathbb{R}^k$ denotes $k$ training outputs, each $x_i$ being paired with the real-valued output $y_i$. The prime objective is to minimize the empirical risk, which quantifies through a loss function the mismatch between the predictor $f(x_i)$ and the response $y_i$ for each $i = 1, \dots, k$. Loss functions measure how far the model's predictions deviate from the actual outputs. The least-squares estimate is one of the most widely used loss functions for regression models and has minimal variance among all unbiased linear estimates. Fitting a regression model by minimizing the Residual Sum of Squares (RSS) between the predicted outputs and the labels is expressed as [44]

$$\min_{\beta_0, \beta} \; \sum_{i=1}^{k} \left( y_i - \beta_0 - \beta^\top x_i \right)^2 .$$

Certain downsides include high variance: a model may reflect the training set well yet overfit to noisy or otherwise unrepresentative training data, reducing prediction accuracy on new data. Alternative approaches help here. Linear Dimension Reduction (LDR) generates a low-dimensional linear mapping of the original high-dimensional or noisy data that maintains some characteristic of interest, denoises or compresses the data, and extracts important feature spaces, among other benefits. Forward or backward elimination further helps to avoid overfitting and improve robustness. The processing and manipulation of data are often associated with noise, which degrades the model's performance [45]. The link between regularization and robustness to noise is represented as

$$\min_{\beta} \; \max_{\Delta \in \mathcal{U}} \; g\big( y - (A + \Delta)\beta \big),$$

with the intercept absorbed into $\beta$ for brevity. In this regard, the noise $\Delta$ is assumed to vary within an uncertainty set $\mathcal{U}$, and the learner inherits robust behavior, where $g$ is a convex function that measures the residuals [46]. Regression models can lose interpretability when the number of features is large relative to the amount of data; to overcome this shortcoming and multicollinearity, various feature selection strategies are applied.
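As a minimal sketch of the least-squares fit described above, the following NumPy snippet estimates an intercept and coefficient vector by minimizing the residual sum of squares. The function names (`fit_linear_regression`, `rss`), the symbols `beta0` and `beta`, and the synthetic data are illustrative assumptions, not taken from the surveyed work.

```python
import numpy as np

def fit_linear_regression(A, y):
    """Return (beta0, beta) minimizing the residual sum of squares."""
    k = A.shape[0]
    # Prepend a column of ones so the intercept is estimated jointly.
    A1 = np.hstack([np.ones((k, 1)), A])
    coef, *_ = np.linalg.lstsq(A1, y, rcond=None)
    return coef[0], coef[1:]

def rss(A, y, beta0, beta):
    """Residual sum of squares between labels and predictions."""
    residuals = y - (beta0 + A @ beta)
    return float(residuals @ residuals)

# Synthetic data (assumption, for illustration): y = 2 + 3*x1 - 1*x2 + noise.
rng = np.random.default_rng(0)
A = rng.normal(size=(200, 2))
y = 2.0 + A @ np.array([3.0, -1.0]) + 0.01 * rng.normal(size=200)

beta0, beta = fit_linear_regression(A, y)
print("intercept:", beta0, "coefficients:", beta)
```

Because the fitted coefficients minimize the RSS on the training set, their RSS can never exceed that of any other coefficient choice, including the values used to generate the data.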

Shrinkage Models

To produce a more predictable model, the values of the regression coefficients are reduced with the help of regularization methods, also known as shrinkage methods, at the cost of introducing some bias into the model. The principal intention behind shrinkage methods is to penalize the regression coefficients in the loss function, pulling them towards a central point such as the mean. One common shrinkage method is ridge regression, which penalizes the norm-2 of the regression coefficients,

$$\min_{\beta_0, \beta} \; \sum_{i=1}^{k} \left( y_i - \beta_0 - \beta^\top x_i \right)^2 + \lambda \lVert \beta \rVert_2^2,$$

where $\lambda \ge 0$ controls the shrinkage magnitude. Lasso regression penalizes the norm-1 and minimizes the quantity

$$\min_{\beta_0, \beta} \; \sum_{i=1}^{k} \left( y_i - \beta_0 - \beta^\top x_i \right)^2 + \lambda \lVert \beta \rVert_1 .$$

Least Absolute Shrinkage and Selection Operator (Lasso) regression is an extension of linear regression supplemented by shrinkage. The lasso approach favors models with fewer parameters and is well suited to models with high degrees of multicollinearity or to automating certain elements of model selection. Lasso models are more interpretable than ridge regression because a large $\lambda$ forces some of the estimated coefficients to be exactly zero. The estimation accuracy of subset selection is sensitive to noise present in the input dataset; to reduce the effect of noise and to avoid numerical issues, the Tikhonov regularization term $\frac{1}{2\Gamma}\lVert \beta \rVert_2^2$ with weight $\Gamma > 0$ is introduced along with the cutting plane approach [47].
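The contrast between the two penalties can be sketched numerically. The snippet below is an illustration under stated assumptions: the intercept is omitted, ridge is solved in closed form, and lasso is solved with a simple proximal-gradient loop (ISTA), a technique chosen here for brevity and not one named in the text. All function names and the synthetic data are hypothetical.

```python
import numpy as np

def ridge(A, y, lam):
    """Closed-form minimizer of ||y - A b||^2 + lam * ||b||_2^2."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ y)

def soft_threshold(v, t):
    """Shrink every entry of v toward zero by t (proximal map of t*||.||_1)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(A, y, lam, n_iter=2000):
    """Minimize ||y - A b||^2 + lam * ||b||_1 by proximal gradient (ISTA)."""
    # Step 1/L, where L = 2 * sigma_max(A)^2 is the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    b = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ b - y)
        b = soft_threshold(b - step * grad, step * lam)
    return b

# Synthetic data (assumption): only features 0 and 2 carry signal.
rng = np.random.default_rng(1)
A = rng.normal(size=(100, 5))
true_beta = np.array([4.0, 0.0, -3.0, 0.0, 0.0])
y = A @ true_beta + 0.1 * rng.normal(size=100)

b_ols = ridge(A, y, lam=0.0)      # lam = 0 recovers ordinary least squares
b_ridge = ridge(A, y, lam=50.0)   # all coefficients shrink, none vanish
b_lasso = lasso_ista(A, y, lam=40.0)  # irrelevant coefficients become exactly 0
print(b_ols, b_ridge, b_lasso, sep="\n")
```

Ridge reduces the overall size of the coefficient vector, whereas the soft-thresholding step in the lasso loop is what sets small coefficients to exactly zero and thereby performs feature selection.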

Regression Models Beyond Linearity

Linear models extend naturally to non-linear terms, which can capture more complex relationships between the regressors and the response. Non-linear regression models include step functions, exponential functions, local regression, smoothing and regression splines, and polynomial regression. Alternatively, Generalized Additive Models (GAMs) [48] maintain the additivity of the original predictors $x_1, \dots, x_d$, and the relation between every feature and the response $y$ is expressed using nonlinear functions $f_j$ such as

$$y = \beta_0 + \sum_{j=1}^{d} f_j(x_j).$$

GAMs increase the flexibility and accuracy of prediction relative to linear models while preserving a certain level of predictor interpretability; non-parametric models such as boosting and random forests offer further flexibility. Interaction predictors can be expressed in the form $x_i \times x_j$. GAMs tend to be less effective when observations greatly outnumber predictors. Piecewise affine forms are suitable models when the function to be fitted is separable, discontinuous, or an approximation of a complex nonlinear expression [49, 50].

Classification

Classification refers to assigning an unlabelled data item (entity α) to a category based on a training dataset $(A, y)$ in which every $x_i$ has a predefined class label $y_i$. Classification covers binary and multiclass approaches, including logistic regression, Linear Discriminant Analysis (LDA), Support Vector Machines (SVMs), and decision tree mechanisms [51].

Logistic Regression

In many critical domains, there is no functional relationship $y = f(x)$ between $x$ and $y$. In this situation, the relation between $x$ and $y$ has to be described more generally by a probability distribution $P(x, y)$, assuming that the training data consists of independent samples from $P$. Here the label is assumed to be binary, i.e., $y \in \{-1, 1\}$, and the best class-membership decision is to choose the label $y$ that maximizes the posterior distribution $P(y \mid x)$. Logistic regression models the probability of belonging to one of the two categories of the dataset by [52]

$$P(y = 1 \mid x) = \frac{1}{1 + e^{-(\beta_0 + \beta^\top x)}}, \qquad P(y = -1 \mid x) = 1 - P(y = 1 \mid x).$$

The decision boundary between the binary classes is a hyperplane described as $\beta_0 + \beta^\top x = 0$. The parameters $\beta_0$ and $\beta$ are obtained by maximum-likelihood estimation, which is equivalent to minimizing the logistic loss

$$\min_{\beta_0, \beta} \; \sum_{i=1}^{k} \log\left( 1 + e^{-y_i (\beta_0 + \beta^\top x_i)} \right).$$

To reach a globally optimal solution, first-order methods such as gradient descent and second-order methods such as Newton's method come into play. Gradient descent locates a local minimum of a differentiable function by taking repeated steps in the direction opposite to the function's gradient at the current point, i.e., the direction of steepest descent. In Newton's method, each iteration fits a parabola to the graph of a differentiable function at a trial value $p$ and then moves to the minimum or maximum of that parabola (in higher dimensions, this may also be a saddle point). Logistic regression models can be tuned further by variable selection to avoid overfitting, using forward selection to add variables or backward elimination to remove variables based on the statistical significance of the coefficients.
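The first-order approach mentioned above can be sketched as plain gradient descent on the logistic loss. This is a minimal illustration that assumes labels coded as -1/+1, a fixed learning rate, and synthetic two-class data; the function names and parameter values are hypothetical choices, not part of the surveyed work.

```python
import numpy as np

def logistic_loss(A, y, beta0, beta):
    """Mean of log(1 + exp(-y * (beta0 + beta^T x))) for labels y in {-1, +1}."""
    margins = y * (beta0 + A @ beta)
    return float(np.mean(np.logaddexp(0.0, -margins)))

def fit_logistic_gd(A, y, lr=0.5, n_iter=500):
    """Gradient descent on the mean logistic loss (assumed fixed step size)."""
    k, d = A.shape
    beta0, beta = 0.0, np.zeros(d)
    for _ in range(n_iter):
        margins = y * (beta0 + A @ beta)
        # w_i = sigmoid(-margin_i): how strongly sample i pulls on the gradient.
        w = 1.0 / (1.0 + np.exp(margins))
        grad_beta0 = -np.mean(w * y)
        grad_beta = -np.mean(A * (w * y)[:, None], axis=0)
        beta0 -= lr * grad_beta0
        beta -= lr * grad_beta
    return beta0, beta

def predict_proba(A, beta0, beta):
    """Estimated P(y = +1 | x)."""
    return 1.0 / (1.0 + np.exp(-(beta0 + A @ beta)))

# Synthetic data (assumption): two overlapping Gaussian classes.
rng = np.random.default_rng(2)
A = np.vstack([rng.normal(loc=1.5, size=(100, 2)),
               rng.normal(loc=-1.5, size=(100, 2))])
y = np.concatenate([np.ones(100), -np.ones(100)])

beta0, beta = fit_logistic_gd(A, y)
pred = np.where(predict_proba(A, beta0, beta) >= 0.5, 1.0, -1.0)
accuracy = float(np.mean(pred == y))
print("training accuracy:", accuracy)
```

Thresholding the probability at 0.5 corresponds to classifying by the side of the hyperplane on which a point falls.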

Decision Trees

Classification is often performed with a non-parametric model, the Decision Tree (DT), which reaches a decision on any hypothetical or real-world instance using splitting rules expressed as a tree data structure. The model's prediction within each segment of the training data rests on a statistical indicator (such as the mean, median, or mode). DTs work well for large datasets with few dimensions and can handle both numerical and categorical values. Entropy is calculated for each candidate split, weighted by the proportion of samples in each branch, and combined to give the average for each node. It is represented as

$$H(s) = -\sum_{i} p_i \log_2 p_i,$$

where $H$ represents the entropy of the given data set $s$ and $p_i$ is the relative frequency of class $i$ in the data. Similarly, the Gini impurity, given as

$$G(s) = 1 - \sum_{i} p_i^2,$$

evaluates the impurity of each candidate node, so the root with the least impurity can be picked easily. The Information Gain (IG), which quantifies the quality of a split, is represented as the entropy before the split minus the entropy after it, which simplifies to $IG(s, a) = H(s) - H(s \mid a)$. This can be estimated as

$$IG(s, a) = H(s) - \sum_{v \in \mathrm{values}(a)} \frac{|s_v|}{|s|} \, H(s_v),$$

where $H(s \mid a)$ is the entropy of the data given the variable $a$ and $s_v$ is the subset of $s$ for which $a$ takes the value $v$. To avoid overfitting, pruning along with other techniques such as Smit and Konin is taken into consideration. Pruning a tree is an essential measure to ensure unbiased decisions, represented as

$$R_\alpha(T) = R(T) + \alpha \, |T|,$$

where $R(T)$ is the total misclassification rate of the terminal nodes, $|T|$ is the number of terminal nodes, and $\alpha$ is the cost-complexity parameter. Various recursive procedures split the training dataset into segments. Since recursive procedures are greedy in nature, they sometimes fail to reach the global optimum, which motivates alternatives such as heuristic approaches based on mathematical programming paradigms (i.e., linear optimization) and dynamic programming.
Consider an example of a simple classification tree that determines the health status and need for exercise of elderly people based on their activities. Figure 2 represents the decision process. Okaty et al. [53] propose a stratum-based DT model for precise localization of anatomical landmarks in clinical image analysis. Liang et al. [54] provide an effective and privacy-preserving DT classification strategy (PPDT) for health monitoring systems. They turn a DT classifier into a boolean trajectory and then encode it with symmetric key encryption. Zhu et al. [55] present a Multi-ringed (MR) Forest framework based on DTs for the reduction of false positives in pulmonary nodule detection. Algorithms that generate decision trees from the supplied data include Classification and Regression Tree (CART), Iterative Dichotomiser 3 (ID3), C4.5, etc.

Fig. 2

Decision tree to predict the need for exercising for elderly people based on their activities

Step 1: Start.
Step 2: Randomly shuffle and select n training samples from the dataset with replacement.
Step 3: Calculate the entropy of the target.
Step 4: Split the dataset on the different attributes. Calculate the entropy for each branch and add the branch entropies proportionally to get the total entropy for the split. Subtract the resulting entropy from the entropy before the split. The result is the Information Gain, or decrease in entropy.
Step 5: Choose the attribute with the largest information gain as the decision node, divide the dataset by its branches, and repeat the same process on every branch.
Step 5.1: A branch with an entropy of 0 is a leaf node.
Step 5.2: A branch with an entropy of more than 0 needs further splitting.
Step 6: End.
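Steps 3 to 5 of the procedure above can be sketched in a few lines of standard-library Python. The toy records below are hypothetical (they are not the data behind Fig. 2), and the attribute names `walks_daily` and `sleeps_well` are invented purely for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy -sum_i p_i * log2(p_i) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity 1 - sum_i p_i^2 of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy before the split minus the weighted entropy of its branches."""
    branches = {}
    for row, label in zip(rows, labels):
        branches.setdefault(row[attr], []).append(label)
    after = sum(len(b) / len(labels) * entropy(b) for b in branches.values())
    return entropy(labels) - after

def best_attribute(rows, labels, attrs):
    """Step 5: pick the attribute with the largest information gain."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))

# Hypothetical activity records; the label says whether exercise is needed.
rows = [
    {"walks_daily": "yes", "sleeps_well": "yes"},
    {"walks_daily": "yes", "sleeps_well": "no"},
    {"walks_daily": "no", "sleeps_well": "yes"},
    {"walks_daily": "no", "sleeps_well": "no"},
]
labels = ["no", "no", "yes", "yes"]

root = best_attribute(rows, labels, ["walks_daily", "sleeps_well"])
print("split on:", root)
```

In this toy set, splitting on `walks_daily` leaves both branches pure (entropy 0, so both are leaf nodes by Step 5.1), while splitting on `sleeps_well` leaves the class mix unchanged and yields no gain.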

SVM

Among supervised machine learning algorithms in the statistical learning category, SVMs have received considerable attention from an optimization perspective. SVMs aim to identify a hyperplane with maximum margin that separates two classes. Given a training set $(A, y)$ with $k$ training inputs, where $x_i \in \mathbb{R}^d$ and $y_i \in \{-1, 1\}$ is the binary response variable, SVM identifies the separating hyperplane as $\beta^\top x + \beta_0 = 0$. Here, $\beta$ represents the vector of coefficients for the input variables and $\beta_0$ is the intercept of the separating hyperplane [56].

Hard margin SVM

Hard margin SVM is the simplest version of SVM. It assumes that a hyperplane exists that perfectly separates the data into two classes without misclassification. This optimization problem is categorized as a linearly constrained convex quadratic problem. Training this model identifies a hyperplane that separates the data while maximizing the distance to the closest data point. The distance of a data point $x_i$ to the hyperplane is given by

$$\frac{|\beta^\top x_i + \beta_0|}{\lVert \beta \rVert_2},$$

where $\lVert \cdot \rVert_2$ expresses the norm-2. Therefore, the data points with labels $y_i = 1$ are on one side of the hyperplane such that $\beta^\top x_i + \beta_0 \ge 1$, while the data points with labels $y_i = -1$ are on the other side, $\beta^\top x_i + \beta_0 \le -1$. To find the hyperplane, the following optimization problem has to be solved:

$$\min_{\beta, \beta_0} \; \frac{1}{2} \lVert \beta \rVert_2^2 \quad \text{s.t.} \quad y_i (\beta^\top x_i + \beta_0) \ge 1, \;\; i = 1, \dots, k, \quad \beta \in \mathbb{R}^d, \quad \beta_0 \in \mathbb{R},$$

which is recognized as a convex quadratic problem. Forcing the data to be separable by a linear hyperplane is often too restrictive, which limits the practicality of this version of SVM; this is where soft-margin SVMs outperform hard-margin SVMs.

Soft margin SVM

The convex quadratic problem becomes infeasible when the data is not linearly separable. An alternative is to minimize the average error. To limit the number of data points lying on the wrong side of the hyperplane, a slack variable $\xi_i$ is introduced in the constraints, and the sum of the slack variables is penalized in the objective function as a proxy for misclassification. The soft-margin optimization problem is stated as

$$\min_{\beta, \beta_0, \xi} \; \frac{1}{2} \lVert \beta \rVert_2^2 + C \sum_{i=1}^{k} \xi_i \quad \text{s.t.} \quad y_i (\beta^\top x_i + \beta_0) \ge 1 - \xi_i, \;\; i = 1, \dots, k,$$

where $\xi_i \ge 0$, $\beta \in \mathbb{R}^d$, $\beta_0 \in \mathbb{R}$, and $C > 0$ weights the penalty on the slack variables. Another alternative is to introduce the error term in the objective function using the squared hinge loss instead of the hinge loss. Replacing norm-2 with norm-1 turns this into a linear optimization problem, generally at the cost of a higher misclassification rate.
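A soft-margin linear SVM can be sketched by writing each slack variable as a hinge loss and minimizing the resulting unconstrained objective. The snippet below is an illustration under assumptions: it uses subgradient descent with a decaying step size (a simple solver chosen for brevity, not the method of any cited work), a penalty weight named `C`, and synthetic data.

```python
import numpy as np

def hinge_objective(A, y, beta0, beta, C):
    """0.5 * ||beta||^2 + C * sum_i max(0, 1 - y_i * (beta^T x_i + beta0))."""
    slack = np.maximum(0.0, 1.0 - y * (A @ beta + beta0))
    return float(0.5 * beta @ beta + C * slack.sum())

def fit_soft_margin_svm(A, y, C=1.0, lr=0.001, n_iter=2000):
    """Subgradient descent on the hinge-loss form of the soft-margin problem."""
    k, d = A.shape
    beta0, beta = 0.0, np.zeros(d)
    for t in range(1, n_iter + 1):
        # Points inside the margin or misclassified have positive slack.
        violated = y * (A @ beta + beta0) < 1.0
        grad_beta = beta - C * (y[violated][:, None] * A[violated]).sum(axis=0)
        grad_beta0 = -C * y[violated].sum()
        step = lr / np.sqrt(t)  # decaying step size (assumed schedule)
        beta = beta - step * grad_beta
        beta0 = beta0 - step * grad_beta0
    return beta0, beta

# Synthetic data (assumption): two overlapping Gaussian classes, labels -1/+1.
rng = np.random.default_rng(5)
A = np.vstack([rng.normal(loc=1.5, size=(100, 2)),
               rng.normal(loc=-1.5, size=(100, 2))])
y = np.concatenate([np.ones(100), -np.ones(100)])

beta0, beta = fit_soft_margin_svm(A, y)
accuracy = float(np.mean(np.sign(A @ beta + beta0) == y))
print("training accuracy:", accuracy)
```

A larger `C` penalizes margin violations more heavily and approaches hard-margin behavior, while a smaller `C` tolerates more violations in exchange for a wider margin.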

Sparse SVM

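Various approaches have been proposed to deal with sparsity (feature selection in the classification model) in SVMs, among which the 1-norm and the elastic net (a combination of the 1-norm and 2-norm) are common. The elastic net tunes the bias towards one of the two norms using a hyperparameter [57]. The number of selected features can be modeled in the soft-margin optimization problem using binary variables $z_j \in \{0, 1\}$, where $z_j = 1$ indicates that feature $j$ is selected and $z_j = 0$ otherwise. A constraint limiting the number of features to a desired maximum $r$ can then be imposed, resulting in a mixed-integer quadratic problem:

$$\min_{\beta, \beta_0, \xi, z} \; \frac{1}{2} \lVert \beta \rVert_2^2 + C \sum_{i=1}^{k} \xi_i$$

$$\text{s.t.} \quad y_i (\beta^\top x_i + \beta_0) \ge 1 - \xi_i, \quad -M z_j \le \beta_j \le M z_j, \quad \sum_{j=1}^{d} z_j \le r, \quad \xi_i \ge 0, \quad z_j \in \{0, 1\}, \quad \beta \in \mathbb{R}^d, \; \beta_0 \in \mathbb{R},$$

where $M$ is a sufficiently large constant.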

SVR

Support Vector Regression (SVR) is a supervised machine learning technique designed to handle regression problems. Regression analysis comes in handy when observing the relationship between one or more predictor variables and a dependent variable, since it can balance model complexity against prediction error [58]. SVR extends the classic SVM, which was introduced for binary classification; its core idea is to find a linear function $f(x) = \beta^\top x + \beta_0$ that approximates, within a tolerance $\varepsilon$, a training set $(A, y)$ where $y \in \mathbb{R}^k$ [59]. SVR has shown strong performance on regression problems with high-dimensional data. SVR uses an approach similar to SVM, performing regression with hyperplanes defined by a few support vectors, and can handle non-linear regression competently [60]. However, such a linear function might not always exist, so slack variables $\xi_i, \xi_i^*$ expressing deviations from the expected tolerance are introduced and minimized, as in soft-margin SVMs. The optimization problem is stated as

$$\min_{\beta, \beta_0, \xi, \xi^*} \; \frac{1}{2} \lVert \beta \rVert_2^2 + P \sum_{i=1}^{k} (\xi_i + \xi_i^*) \quad \text{s.t.} \quad y_i - f(x_i) \le \varepsilon + \xi_i, \quad f(x_i) - y_i \le \varepsilon + \xi_i^*, \quad \xi_i, \xi_i^* \ge 0.$$

Tuning the hyperparameter $P$ adjusts the weight on deviations from the tolerance. This deviation is the $\varepsilon$-insensitive loss function given by

$$L_\varepsilon\big(y_i, f(x_i)\big) = \max\big\{0, \; |y_i - f(x_i)| - \varepsilon\big\}.$$
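The tolerance idea behind SVR can be illustrated with a small sketch of the ε-insensitive loss, max(0, |y - f(x)| - ε), and the objective it induces for a linear model. The function names, the weight `P` on deviations, and the numbers below are illustrative assumptions.

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps):
    """Zero inside the tolerance tube of half-width eps, linear outside it."""
    return np.maximum(0.0, np.abs(y_true - y_pred) - eps)

def svr_objective(A, y, beta0, beta, eps, P):
    """0.5 * ||beta||^2 + P * sum of eps-insensitive losses for f(x) = beta^T x + beta0."""
    y_pred = A @ beta + beta0
    return float(0.5 * beta @ beta + P * eps_insensitive_loss(y, y_pred, eps).sum())

y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.05, 2.5, 1.0])
losses = eps_insensitive_loss(y_true, y_pred, eps=0.1)
print(losses)  # first error (0.05) falls inside the tube, so its loss is 0

# A linear model that fits the data exactly pays only the regularization term.
A = np.array([[1.0], [2.0], [3.0]])
objective = svr_objective(A, y_true, beta0=0.0, beta=np.array([1.0]), eps=0.1, P=10.0)
```

Errors smaller than ε cost nothing, which is why only the points on or outside the tube (the support vectors) influence the fitted function.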

Clustering

Clustering is a widely used class of unsupervised learning that focuses mainly on grouping a set of objects into smaller clusters of similar items. This common statistical data analysis technique finds application in pattern recognition, bioinformatics, data compression, image analysis, and information retrieval. Healthcare sectors collect massive amounts of data from various healthcare service providers, and this data may include patient information, medical tests, and treatment specifics. Because of the intricacy of the data obtained, analyzing it for decision-making on a patient's health state is tough. Numerous strategies, such as clustering, are currently used by healthcare practitioners to determine a patient's health state. Clustering divides huge datasets into smaller groups based on related properties [61] and is usually used to find commonalities between data points; generating clusters or groups of items in a dataset has been the most common use of unsupervised (unlabeled) learning. Given an input $A \in \mathbb{R}^{k \times d}$, which includes $k$ unlabelled observations $x_1, \dots, x_k$ with $x_i \in \mathbb{R}^d$, clustering aims to find $K$ subsets of $A$, i.e., individual clusters, which are homogeneous as well as well separated. The number of clusters $K$ acts as a tuning parameter that needs to be fixed before examining the clusters. The degree of separation and homogeneity can be modeled with different criteria, which gives rise to several types of clustering algorithms such as K-means Clustering, Capacitated Clustering, Hierarchical Clustering, etc.

K-Means

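K-means clustering, or minimum sum-of-squares clustering, is a vector quantization method that aims to partition the $k$ data observations into $K$ disjoint clusters, with each observation assigned to the cluster with the nearest mean (centroid). The number of clusters is chosen by close examination of the elbow curve, by similarity indicators such as the Calinski-Harabasz index or silhouette values, or via statistical programming approaches [62]. With binary variables $u_{ij}$, equal to 1 if observation $i$ belongs to cluster $j$ and 0 otherwise, and the centroid $\mu_j \in \mathbb{R}^d$ of each cluster, the problem of minimizing the within-cluster variance is stated as the nonlinear problem [63]

$$\min_{u, \mu} \; \sum_{i=1}^{k} \sum_{j=1}^{K} u_{ij} \lVert x_i - \mu_j \rVert_2^2 \quad \text{s.t.} \quad \sum_{j=1}^{K} u_{ij} = 1 \;\; \forall i, \quad u_{ij} \in \{0, 1\}.$$

Introducing the variable $s_i$, which denotes the (squared) distance of observation $i$ from its cluster centroid, together with a sufficiently large constant $M$, gives the following formulation with a linear objective:

$$\min_{u, \mu, s} \; \sum_{i=1}^{k} s_i \quad \text{s.t.} \quad \lVert x_i - \mu_j \rVert_2^2 \le s_i + M (1 - u_{ij}) \;\; \forall i, j, \quad \sum_{j=1}^{K} u_{ij} = 1 \;\; \forall i, \quad u_{ij} \in \{0, 1\}, \quad s_i \ge 0.$$

Apart from the above-mentioned methods, several alternatives such as heuristic approaches based on the gradient method, the bundle approach, and column generation are in practice. Figure 3 shows clusters obtained with K-means, each with its centroid, all distinctly classified.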

Fig. 3

Clusters with K-means, classified

Input: coordinates dataset D, count of clusters K.
Step 1: Initialize K centroids randomly.
Step 2: Attach each coordinate in dataset D to the closest centroid. This distributes all coordinates into K clusters based on their similarity.
Step 3: Re-compute the coordinates of the centroids.
Step 4: Repeat Steps 2 and 3 until the positions become constant or fixed.
Output: Data points with cluster membership.
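The four steps above can be sketched with NumPy as follows. This is a minimal illustration: random initialization picks K distinct data points (one common choice, assumed here), an empty cluster simply keeps its previous centroid, and the synthetic two-blob dataset is hypothetical.

```python
import numpy as np

def k_means(D, K, n_iter=100, seed=0):
    """Return (labels, centroids) for dataset D of shape (n_points, n_dims)."""
    rng = np.random.default_rng(seed)
    # Step 1: initialise K centroids by picking K distinct data points at random.
    centroids = D[rng.choice(len(D), size=K, replace=False)].copy()
    for _ in range(n_iter):
        # Step 2: assign every point to its closest centroid.
        dists = np.linalg.norm(D[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 3: recompute each centroid as the mean of its assigned points.
        new_centroids = np.array([
            D[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(K)
        ])
        # Step 4: stop once the centroid positions no longer change.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Synthetic data (assumption): two compact blobs around (0, 0) and (10, 10).
rng = np.random.default_rng(3)
D = np.vstack([rng.normal(0.0, 0.5, size=(100, 2)),
               rng.normal(10.0, 0.5, size=(100, 2))])

labels, centroids = k_means(D, K=2)
print(centroids)
```

The result depends on the random initialization, and the procedure only guarantees a local optimum, which is why it is commonly restarted from several seeds in practice.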

Capacitated Clustering

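The Capacitated Centred Clustering Problem (CCCP) aims to find a set of clusters with limited capacity, where similarity is measured relative to each cluster's centre. Given a set $J$ of candidate clusters, the CCCP can be mathematically represented as

$$\min_{u, v} \; \sum_{i=1}^{k} \sum_{j \in J} \delta_{ij} u_{ij} \quad \text{s.t.} \quad \sum_{j \in J} u_{ij} = 1 \;\; \forall i, \quad \sum_{j \in J} v_j \le p, \quad u_{ij} \le v_j \;\; \forall i, j, \quad \sum_{i=1}^{k} q_i u_{ij} \le Q_j \;\; \forall j, \quad u_{ij}, v_j \in \{0, 1\},$$

where $p$ is the upper bound on the number of clusters, $\delta_{ij}$ represents the measure of dissimilarity between cluster $j$ and observation $i$, $Q_j$ is the capacity of cluster $j$, and $q_i$ is the weight of observation $i$. Variable $u_{ij}$ denotes the assignment of observation $i$ to cluster $j$, and variable $v_j$ equals 1 when cluster $j$ is used. If $\delta_{ij}$ is a distance and the clusters are homogeneous, then the formulation also models the well-known facility location problem [64].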

Linear Dimension Reduction

Linear dimensionality reduction methods have long been developed in statistics and applied fields and have become an indispensable tool for analysing high-dimensional and noisy data. These methods improve a model's interpretability by producing a low-dimensional linear mapping of the original high-dimensional data that preserves features of interest [65].

Principal Components

Principal component analysis (PCA) aims to minimize the sum of squared residual errors between the original high-dimensional data and the projected data points. PCA is usually assessed in terms of explained variance, which refers to the amount of information retained from the original feature set. For mean-centered observations $x_1, \dots, x_k$ and a single component, PCA was formulated originally as

$$\min_{w} \; \sum_{i=1}^{k} \lVert x_i - w w^\top x_i \rVert_2^2 \quad \text{s.t.} \quad \lVert w \rVert_2 = 1,$$

where $w \in \mathbb{R}^d$ is a unit vector. The problem above is sensitive to the presence of outliers, which has motivated more robust variants. The original formulation is equivalent to the "maximizing variance" derivation given as

$$\max_{w} \; w^\top S w \quad \text{s.t.} \quad \lVert w \rVert_2 = 1,$$

where $S = \frac{1}{k} \sum_{i=1}^{k} x_i x_i^\top$ is the sample covariance matrix. PCA finds application in various data analytics problems that benefit from dimensionality reduction. For linear regression models, there is Principal Component Regression (PCR), a two-stage procedure that inherits the properties of PCA, with the advantage of using fewer predictors and a shorter prediction time on the same dataset. Despite these benefits, the main known drawback of PCA is reduced interpretability.
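A compact way to compute principal components is through the singular value decomposition of the mean-centered data matrix. The sketch below assumes this SVD route (one standard option among several) and uses hypothetical names and synthetic data that vary mostly along the direction (1, 1).

```python
import numpy as np

def pca(A, r):
    """Project A (k x d) onto its top-r principal components.

    Returns the projected data, the d x r matrix of unit-norm directions,
    and the fraction of total variance explained by each kept component.
    """
    A_centered = A - A.mean(axis=0)
    # Rows of Vt are orthonormal principal directions, sorted by singular value.
    _, s, Vt = np.linalg.svd(A_centered, full_matrices=False)
    W = Vt[:r].T
    explained = (s ** 2) / np.sum(s ** 2)
    return A_centered @ W, W, explained[:r]

# Synthetic data (assumption): strong variation along (1, 1), small noise elsewhere.
rng = np.random.default_rng(4)
t = rng.normal(scale=3.0, size=300)
A = np.outer(t, [1.0, 1.0]) / np.sqrt(2.0) + 0.1 * rng.normal(size=(300, 2))

Z, W, explained = pca(A, r=1)
print("first direction:", W[:, 0], "explained variance ratio:", explained[0])
```

The first direction returned is the unit vector along which the projected data has maximum variance, which is also the direction that minimizes the squared residual between each point and its projection.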

Problems in Healthcare Sector

A shift toward a data-driven approach to health is taking place, driven by the increased volume, velocity, and diversity of data obtained from the public and private sectors across healthcare and the natural sciences. Over the last five years, there has been remarkable advancement in informatics technologies and computational intelligence for use in health and biomedical sciences. However, the full potential of data to address the breadth and extent of human health problems has yet to be realized. The properties of health data impose intrinsic limitations on the effective implementation of typical data mining and ML technologies. Aside from their volume ('Big Data'), health data are difficult to manage because of their complexity, heterogeneity, dynamic nature, and unpredictability. Finally, practical obstacles in applying new and current standards across different health providers and research organizations have hindered data management and the interpretability of results. Oliveira et al. [66] address the interpretability of results obtained from the study of clinical data and go on to explain cluster labels by deciphering the relevant events. Similarly, Mengoudi et al. [67] use self-supervised representation learning to train DNNs to detect diverse cognitive processes in healthy people. As a result, the model learns to encode high-level semantic information, which is then used to distinguish between healthy controls and dementia patients. Intelligent methods are now being used to address such challenges in the healthcare sector.

Applications of ML in Healthcare

Healthcare generates a large quantity of heterogeneous information and data daily, which makes the data difficult to analyse and process by conventional methods. DL and ML methods help simplify these arduous procedures and automate the extraction of actionable insights. The data can also be grouped by source into distinct categories such as medical data, social media data, environmental data, and genomics. Table 1 collects the contributions of various researchers in different domains of ML applicability over time. ML/DL techniques can serve to automate and improve performance in major healthcare application areas such as prognosis, diagnosis, treatment, and clinical workflow. The heterogeneous data sources feeding healthcare systems are depicted in Fig. 4.

Table 1

Summary of contributions made by researchers over time

Application of ML in healthcare	References	Year	Contribution
Electronic health records (EHRs)	Stojanovic et al. [68]	2017	Modeled healthcare quality via compact representations of EHRs
	Brisimi et al. [69]	2018	Predicted chronic-disease hospitalizations from EHRs
	Shickel et al. [70]	2018	Analyzed advances in DL techniques for EHRs
	Fuente et al. [71]	2019	Developed a solution for searching behavioral patterns in EHRs using the Random Forest algorithm
	Harerimana et al. [72]	2019	Presented deep learning strategies for EHRs analytics
	Bernardini et al. [73]	2020	Developed solutions for discovering type-2 diabetes in EHRs using sparse balanced SVMs
	Tsang et al. [74]	2020	Modeled sparse data for feature selection in predicting the admission of dementia patients using EHRs
	Lee et al. [75]	2021	Proposed classification of opioid usage for total joint replacement patients
	Kumar et al. [15]	2021	Developed Ensemble ML approaches for morbidity identification from clinical data
Medical image analysis	Zebari et al. [76]	2020	Improved automated segmentation of pectoral muscle and breast cancer boundary in mammogram images
	Zech et al. [77]	2018	Developed Automated annotation of clinical radiology reports using natural language-based models
	Jing et al. [78]	2018	Developed Automatic generation of radiology imaging reports
	Li et al. [79]	2021	Developed a solution using histopathological images to classify and diagnose lung cancer subtypes
	Mandal et al. [64]	2018	Surveyed the transformation of medical imaging across the healthcare spectrum
	Umamaheswari et al. [80]	2018	Developed digital imaging to Classify and segment acute lymphoblastic leukemia cells
	Wang et al. [81]	2019	Used sparse multi-regularization learning and multi-level dual network features to classify breast cancer images
	Abhinaav et al. [82]	2019	Developed ML mechanism using extracted Papanicolaou Smear images to detect abnormality and severity of cells
	Bora et al. [83]	2020	Proposed a radiograph generating reconstruction mechanism for facilitating AI in medical imaging
Treatment	Weng et al. [84]	2017	Provided analysis on ML prediction of cardiovascular risk using routine medical data
	Fatima et al. [85]	2017	Surveyed ML algorithms for disease diagnosis
	Zhao et al. [86]	2019	Applied ML approach for drug repositioning of Schizophrenia and anxiety disorders
	Jamshidi et al. [87]	2020	Proposed DL approaches for diagnosis and treatment of the novel coronavirus
	Li et al. [88]	2019	Assessed ML for predicting severity in liver fibrosis for chronic HBV
	Noaro et al. [89]	2021	Developed ML-based model for improving the calculation of Insulin Bolus of type-1 diabetes therapy
	Yang et al. [90]	2017	Proposed a combined ML algorithm for effective medical diagnosis and treatment using an inference engine
	Chaitra et al. [91]	2020	Proposed an ML model for diagnostic prediction of autism spectrum disorder
Computer aided-detection (CAD)	Saygılı et al. [92]	2021	Developed ML methods and soft computing strategies for computer-aided Covid-19 detection from CT-Scan and X-ray images
	Abdelsalam et al. [93]	2018	Presented the computer-aided detection of leukemia using microscopic blood-based ML
	He et al. [94]	2018	Developed DL techniques to detect hookworm in wireless capsule endoscopy images
	Yu et al. [95]	2021	Implemented ML-aided imaging analytics for histopathological image diagnosis
Disease prediction and diagnosis	Suresh et al. [96]	2017	Presented clinical event prediction and analysis using DL mechanisms
	Rau et al. [97]	2018	Presented a study using ML to predict mortality in patients with isolated moderate and severe traumatic brain injury
	Kim et al. [98]	2017	Proposed ML-based diagnosis of major depressive disorder by combining heart rate data
	Pellegrini et al. [99]	2018	Developed ML assisted diagnosis of dementia and cognitive impairment
	Akbulut et al. [100]	2018	Presented an ML system for foetal health condition prediction based on maternal clinical history
	Karhade et al. [101]	2018	Developed ML algorithms for predicting 5-year survival of spinal chordoma patients
	Abdar et al. [102]	2019	Proposed a new ML technique for the diagnosis of coronary artery disease
	Burdick et al. [103]	2020	Used ML to develop a prediction system for respiratory decompensation in coronavirus patients
	Hashem et al. [104]	2020	Developed ML models for diagnosis of HCV-related chronic liver disease and hepatocellular carcinoma
	Magesh et al. [105]	2020	Developed an explainable ML model using LIME on imagery for early detection of Parkinson’s disease
	Shen et al. [106]	2021	Presented risk predicting ML models in the diagnosis of Escherichia coli sepsis in patients
	Montolío et al. [107]	2020	Applied ML to disability prediction and diagnosis of multiple sclerosis using optical coherence tomography
Clinical time-series data	Yu-Wei et al. [108]	2019	Used recurrent neural networks for prediction of unplanned ICU readmission
	Xie et al. [110]	2020	Benchmarked classical time-series ML models against newer algorithms for blood glucose prediction in type-1 diabetes
	Pezoulas et al. [111]	2021	Used time-series gene expression data for the detection of a diagnostic biomarker in Kawasaki disease
	Nancy et al. [112]	2017	Applied a bio-statistical mining approach to the classification of multivariate clinical time-series data observed at varying intervals
	Froc et al. [113]	2021	Characterized urinary tract endometriosis over a collected one-year national series data of 232 patients
Clinical speech and audio processing	Wallace et al. [114]	2018	Examined the admissibility of speech recognition in medical documentation
	Zamani et al. [115]	2020	Presented an automated Pterygium detection using ML/DL approaches
Prognosis	Ke et al. [117]	2019	Presented an automated Image annotation based on multi-label data augmentation and deep CNNs
	Davi et al. [118]	2019	Utilized ML and human genome data for severe dengue prognosis
	Liu et al. [119]	2019	Proposed a weakly supervised DL technique for brain disease prognosis using MRI data and incomplete clinical scores
	Fang et al. [120]	2020	Discussed the ML approach for feature selection in stroke prognosis
	Wang et al. [121]	2019	Presented transfer learning least squares SVM mechanism in bladder cancer prognosis
	Cai et al. [122]	2020	Presented ML models and CT quantification approaches for assessment of disease prognosis and severity of coronavirus patients
	Zack et al. [123]	2019	Developed ML techniques for forecasting patient prognosis after percutaneous coronary intervention
	He et al. [124]	2021	Developed an ML prediction model for acute kidney injury following donation-after-cardiac-death liver transplantation

Fig. 4

Illustration of heterogeneous sources contributing to healthcare data

Electronic Health Records (EHRs)

Electronic Health Records (EHRs) hold a large amount of data, recorded daily by hospitals and other healthcare services, covering patients' medication history and other details of their recovery. Extracting clinical features from EHRs manually is extremely tedious; ML-based methods make it easier to extract the data needed to support diagnosis. Various approaches have been presented for diagnosing conditions such as diabetes, lung infections due to Covid-19, and the progression of tumorous cells from unstructured EHRs. Unstructured EHR records are mainly examined for two tasks, namely length-of-stay prediction and mortality prediction. Studies have observed that diagnostic prediction degrades when ML models are trained on historical records and tested on new (unseen) data. Stojanovic et al. [68] presented a study in which they coupled EHRs with advanced ML tools to predict major parameters of healthcare quality. The study focuses on reduced-dimensional vector representations of patients' clinical procedures and conditions. Brisimi et al. [69] developed ML methods to predict the probability of hospitalization due to two major chronic diseases, diabetes and heart disease. The predictions rely on the clinical history of patients recorded in EHRs. Recent years have seen an enormous increase in the volume of digital medical information collected in EHRs. Shickel et al. [70] surveyed current research on applying DL to clinical tasks on EHR data and identified several shortcomings in current EHR-based research. Likewise, Fuente et al. [71] studied behavioral patterns in patients' EHRs using the Random Forest algorithm. Their study mainly focuses on finding correlations between different diseases or the factors associated with them.
Analytics plays an important role in data-driven systems for medical care. Harerimana et al. [72] offered an intuitive review of optimized DL approaches for managing and utilizing data from EHRs. The exponential rise in data availability may ease the data requirements of ML processes; however, predictive performance is traded off against computation time, which can become critical in medical emergencies. Diabetes is one of the most common conditions found amongst the Indian population. Early discovery of type 2 diabetes (T2D) helps in treating patients more pragmatically and prevents the condition from becoming severe. Bernardini et al. [73] introduced an ML algorithm known as the Sparse Balanced Support Vector Machine (SB-SVM), trained extensively on the abundant data recorded in EHRs, to detect T2D early and efficiently. The SB-SVM is reported to outperform existing competitors by providing the best trade-off between computation time and predictive performance. Similarly, with the help of EHRs, researchers have developed several mechanisms such as admission prediction for dementia patients [74], classification of opioid usage for joint replacement patients [75], and morbidity identification [15], to name a few.

ML in Medical Image Analysis

ML systems are now well established in medical image analysis. These computational techniques allow the efficient extraction of important information from images produced by various imaging modalities (e.g., MRI, Computed Tomography (CT), Positron Emission Tomography (PET), and ultrasound imaging). Recent advances in computational hardware are allowing physicists to revise old AI algorithms and experiment with new mathematical ideas [76]. The machine-generated images enable diagnosis of the root of an illness and localization of abnormalities in various parts of the body. The significant tasks in clinical image analysis comprise detection, segmentation, localization [77], classification, enhancement, reconstruction, etc. [78]. As a result, a completely automated intelligent system for medical image analysis is expected to provide services such as segmentation, localization, detection, and classification. Li et al. [79] presented an experimental study in computer-aided lung cancer diagnosis based on histopathological imaging. Their best classifier, the Relief-SVM (relevant features, Support Vector Machine) model, achieved the highest accuracy, thereby verifying the potential of auxiliary diagnostic models using medical images. ML and AI have influenced treatment procedures in numerous ways. A detailed review of how AI is remodeling the medical imaging spectrum is presented by Mandal et al. [64]. Likewise, Umamaheswari et al. [80] propose an algorithm for the classification and segmentation of acute lymphoblastic leukaemia cells. The system takes analyzed clinical images as its sole input. For image analysis, segmentation is performed first, followed by morphological operators and Otsu's thresholding.
The use of nucleus characteristics in conjunction with a supervised KNN classifier improves classification rates, yielding an estimated average accuracy of 95.92 percent. Based on histological images, Wang et al. [81] presented a study in which they improved the detection accuracy of malignant cells in breast cancer. They adopted a dual-network multi-relation regularized learning method to boost performance. Cervical cancer cells are classified using the Papanicolaou smear (Pap) test. Abhinaav et al. [82] devised an algorithm that uses the image dataset produced from the Pap test to separate normal cells from affected cells. Histopathological images are prone to uncertainty, which can degrade the performance of an ML model trained on them. However, the introduction of fuzzy modeling techniques has significantly reduced bias and uncertainty in the image data [83].
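Otsu's thresholding, mentioned above as one step of a segmentation pipeline, picks the gray level that best separates an image's histogram into two classes. The following is a generic sketch of that method on a flat list of integer pixel intensities; the function name and the 256-level default are assumptions for illustration, and it is not a reproduction of the pipeline used in [80].

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    pixels must be integers in [0, levels). Pixels with value <= t form
    the lower class; pixels with value > t form the upper class.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class so far
    sum0 = 0  # sum of intensities in the lower class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break     # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mu0 - mu1) ** 2  # proportional to between-class variance
        if between > best_var:
            best_t, best_var = t, between
    return best_t

# Toy "image": a dark background and a bright foreground region.
toy_pixels = [10] * 50 + [200] * 50
toy_t = otsu_threshold(toy_pixels)
toy_mask = [p > toy_t for p in toy_pixels]  # True marks foreground pixels
```

The resulting binary mask is the kind of input to which morphological operators would then be applied.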

Applications of ML in Treatment

Recent innovations and research in ML applications for healthcare have paved the way for better treatment options. The care process follows three steps: prognosis, diagnosis, and treatment. In the diagnosis phase, medical images are studied by expert clinicians and radiologists to interpret the possible risks and cures. A large amount of medical data is produced daily by healthcare facilities of all sizes; the information collected is subjected to rigorous review, and the findings are recorded in reports. However, preparing such reports requires expertise, and where healthcare services are still developing and experience is limited, it may result in misdiagnosis or an erroneous conclusion. Moreover, preparing textual medical documents at an organizational level can be tedious and tiring for clinical experts and radiologists, so researchers have attempted to address specific problems using ML techniques. Zech et al. [77] proposed a Natural Language Processing (NLP) based method for the annotation of radiology reports. A similar study by Jing et al. [78] proposed a multi-task ML framework for the automatic description and tagging of clinical radiology images. Similarly, researchers and physicians have combined methods such as Convolutional Neural Networks (CNNs), RNNs, and LSTMs to build state-of-the-art automated architectures for predictive systems that localize affected areas of the body [84, 85]. Zhao et al. [86] presented a study in which they developed ML algorithms for possible drug repositioning for depression and schizophrenia. Among the algorithms tested, SVMs performed best. The Covid-19 outbreak has claimed thousands of lives and created numerous difficulties.
Since then, researchers and medical experts have worked intensively to find ways of saving lives, and technology has been an integral part of that effort. Jamshidi et al. [87] curated a collection of diverse DL approaches for the diagnosis and treatment of Covid-19 patients. Li et al. [88] used ML approaches to assess the severity of liver fibrosis in chronic HBV. As noted earlier, diabetes remains one of the most researched conditions. Noaro et al. [89] presented a study examining the ability of ML models to improve insulin bolus calculation in type 1 diabetes. Further studies illustrating ML for treatment include [90, 91].

ML in Computer-Aided Detection

ML has been used extensively as a major component of CAD (Computer-Aided Detection/Diagnosis) schemes, which classify lesion candidates into certain classes. CAD is an interdisciplinary technology blending elements of AI and ML with radiology and pathology image processing; a well-known example is IBM's Watson. The automatic interpretation of medical images has proved highly valuable in assisting radiologists and doctors in clinical treatment when time is critical. The workflow draws on statistical techniques such as the Fisher score discriminator, t-test, and chi-square test, together with several traditional processes including predictive algorithms, computer vision, and image processing methods. Saygılı et al. [92] examined several classification models to support early computer-aided diagnosis and treatment of Covid-19 using image processing and ML. Their proposed approach achieved a reported accuracy of 99.02% on the X-ray image dataset. Correspondingly, Abdelsalam et al. [93] explored the use of CNNs in computer-aided leukaemia detection from microscopic blood images. Most of the discussion revolves around detecting human ailments and finding possible cures using ML. Considering hookworm, one of the most common infectious diseases with fatal outcomes, especially in children, He et al. [94] proposed a broad study on hookworm detection. Their study presents an ML detection framework for Wireless Capsule Endoscopy (WCE) images that simultaneously tracks the tubular patterns of hookworms and models visual representations. Turning to imaging analytics for pathological image diagnosis, Yu et al. [95] presented an extensive review of the subject.
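The Fisher score mentioned above ranks a feature by how well it separates the classes: the scatter of the class means around the overall mean, divided by the scatter within each class. A minimal sketch for a single feature follows; the function name and the toy values are hypothetical and serve only to show the calculation.

```python
def fisher_score(values, labels):
    """Fisher score of one feature: between-class over within-class scatter."""
    overall_mean = sum(values) / len(values)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    between = within = 0.0
    for vals in groups.values():
        n_c = len(vals)
        mu_c = sum(vals) / n_c
        var_c = sum((v - mu_c) ** 2 for v in vals) / n_c
        between += n_c * (mu_c - overall_mean) ** 2
        within += n_c * var_c
    # A feature with zero within-class scatter separates the classes perfectly.
    return between / within if within > 0 else float("inf")

# Hypothetical measurements for two classes (0 = normal, 1 = lesion).
toy_labels = [0, 0, 0, 1, 1, 1]
toy_informative = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]  # class means far apart
toy_noisy = [1.0, 5.0, 3.0, 1.2, 4.8, 3.1]        # class means nearly equal
toy_scores = (fisher_score(toy_informative, toy_labels),
              fisher_score(toy_noisy, toy_labels))
```

Features with higher scores would be retained ahead of lower-scoring ones when selecting inputs for a classifier.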

Disease Prediction and Diagnosis

Early disease prediction and diagnosis can be crucial in saving a person's life. Predictive ML methods enable early prognosis and diagnosis from medical data, which reduces the time needed to begin treatment. Surveys report that certain ML algorithms have successfully predicted cardiovascular risk from clinical data [96], and studies conclude that ML improves the effectiveness of prognosis and diagnosis. ML-based methods have proved capable in the prognosis and prediction of cancer and in the detection of various diseases such as virulent infections, dengue, hepatitis, heart problems, malaria, and diabetes [97]. Major Depressive Disorder (MDD) is a mood disorder studied in biological psychiatry. It has become very prevalent amongst young people. Its diagnosis requires the root cause to be identified. Kim et al. [98] studied MDD and applied ML to classify peripheral biomarkers using Heart Rate Variability (HRV) and serum proteomic analysis data. ML has been observed to assist in many cognitive diagnosis procedures, as evidenced by the systematic review of Pellegrini et al. [99]. Akbulut et al. [100] proposed several ML techniques for monitoring and predicting foetal health status based on maternal clinical history. ML has been developed extensively over the years, and the decision support provided by ML models has considerably reduced the workload on clinical professionals. Karhade et al. [101] developed a Bayes Point machine for the prediction of 5-year survival in spinopelvic chordoma. The ML model was developed specifically for this rare pathology, yet accuracy was not compromised. Likewise, Abdar et al. [102] proposed ML techniques for the diagnosis of coronary artery disease. Burdick et al. [103] employed ML for the prediction and diagnosis of respiratory decompensation in Covid-19 patients.
Hashem et al. [104] developed predictive models for the diagnosis of chronic liver diseases along with hepatocellular carcinoma. Magesh et al. [105] presented a study on early detection of Parkinson's disease; likewise, diagnostic models for Escherichia coli infection [106], multiple sclerosis [107], and other conditions have been developed.

ML for Clinical Time-Series Data

Time-series data is a collection of numerical or statistical features monitored over a certain period. Clinical time-series data combines medical imaging observations that periodically track changes in the key data points of concern. Applications of clinical time-series ML modeling include prediction of health status in Intensive Care Units (ICUs) using CNNs and Long Short-Term Memory networks (LSTMs) [108], mortality prediction for patients with Traumatic Brain Injury (TBI) [109], and assessment of blood pressure and Intracranial Pressure (ICP), which are prime indicators of Cerebrovascular Autoregulation (CA) in TBI patients [109]. Recent studies state that integrating time-series data with multivariate models greatly increases predictive power for forecasting tasks such as prognosis, diagnosis, and recommendation. Xie et al. [110] benchmarked ML time-series models on the prediction of blood glucose levels in type 1 diabetic patients. Pezoulas et al. [111] gathered time-series microarray gene expression data to build a predictive system. The resulting model detects candidate biomarkers of Kawasaki disease. Every ML application needs a large amount of data to perform well; however, data management is considered one of the most tedious yet crucial jobs. Nancy et al. [112] applied a bio-statistical mining approach for the efficient classification and management of time-series data observed at irregular time intervals. Similarly, Froc et al. [113] listed the clinical attributes of urinary tract endometriosis in a series of 232 patients collected over one year.

Clinical Speech and Audio Processing

In clinical environments, staff must generate huge amounts of documentation, including clinical reports, imaging reports, and discharge applications, which takes a lot of time and is highly strenuous for clinicians. Wallace et al. [114] noted that, given the workload already placed on experts, documentation is an added burden that takes 50% of their time, which in turn cuts down the time available for interacting with patients. This situation is stressful for clinicians and leaves patients who need attention feeling disconnected; clinical speech and audio processing can therefore offer relief. Its applications include interaction-less services via speech communication, automated transcript generation, clinical note synthesis, and emergency correspondence when staff are unavailable. These methods are time- and cost-effective and increase productivity. Applications of clinical audio and speech processing have succeeded in helping to manage healthcare infrastructure internally, where automation is a new modality for patients as well as clinicians [115]. Clinical speech processing faces two major challenges, disfluency and utterance segmentation, both of which stall processing.

ML in Prognosis

Prognosis refers to the process of forecasting the likely outcome of a disease based on medical trials. The process includes identifying potential risks, ascertaining the early stages of disease development, and estimating the likelihood of survival. Collins et al. [116] stated that ML models supporting prognosis are fed multimodal patient data for improved performance. Recent research on the potential applications of ML in medical prognosis [117] emphasizes enabling personalized medicine, an immature field that still requires extensive development. Achieving the translational impact of personalized medicine will require robust validation strategies and the use of ML. Davi et al. [118] proposed an ML classification method developed using human genome markers for severe dengue prognosis. Another study, by Liu et al. [119], demonstrates brain disease prognosis using incomplete clinical scores and MRI data. Various other predictive systems have been developed using ML/DL approaches. ML can also be used to select features in stroke prognosis [120]. Wang et al. [121] presented a transfer-learning approach for bladder cancer prognosis. Cai et al. [122] investigated ML models for the assessment and quantification of severity and prognosis in Covid-19 patients. Finally, Zack et al. [123] leveraged ML techniques to forecast prognosis after Percutaneous Coronary Intervention, and He et al. [124] studied the prediction of acute kidney injury following donation-after-cardiac-death liver transplantation.

Sources of Vulnerabilities in ML Pipeline

The applications of ML in healthcare are still at a nascent stage of development; this section discusses the challenges arising from security breaches and privacy violations. Cyber defense strategies have not yet matured in the healthcare domain, which threatens both the confidentiality of, and confidence in, the ML models developed. In addition, the major challenges faced during ML pipeline development, along with the potential sources of vulnerability behind them, are pointed out next.

Vulnerabilities in Data Collection

Vulnerabilities can creep into even carefully collected medical data, given the large amount of information gathered every day in various formats such as medical images, radiology reports, health surveys, patient and disease registries, and clinical trial data. Handling this mass of information requires considerable human effort and time, during which data can easily be degraded or lost; automation involving ML/DL is brought into practice to reduce such failures. Even when medical data is consolidated with vigilance, various sources of weakness can affect the proper functioning of the underlying ML/DL systems, some of which are discussed below.

Unqualified Personnel

The highly interpersonal, data-driven healthcare system requires a lot of technical and non-technical assistance. Technical personnel with the strong computational and statistical skills needed to develop effective ML/DL-based systems, which improve the efficacy of medical processes and time management, are in limited supply. Given this shortage, hospitals come to depend entirely on physicians or researchers who lack the computational expertise required to develop such systems [125].

Environmental and Instrumental Noise

Digital data collection and handling are sometimes accompanied by environmental and instrumental disturbances. Even a slight disturbance in certain diagnostic procedures, such as multishot MRI, where extensive supervision is required, can introduce undesirable noise into the acquired data and thereby increase the risk of misdiagnosis.

Vulnerabilities Due to Data Annotation

ML/DL applications require extensive model training to achieve good predictive performance. For medical applications, most models are trained extensively on clinically produced images, each of which must be annotated. This tedious labeling task should ideally be performed by clinical experts who can prepare domain-enriched datasets, or by automated algorithms [126]. Professionals tend to avoid secondary tasks such as data labeling because they consume much of their valuable time, so trainee staff (who have little domain expertise) are assigned the task. This leads to problems such as coarse labels, misclassification, and class imbalance. Several vulnerabilities due to data annotation are noted below.

Ambiguous Ground Truth

Finlayson et al. [127] presented a study highlighting the ambiguity of the ground truth in medical datasets. Even well-defined diagnostic tasks are disputed among medical experts; moreover, mishandling and malicious attacks by some users make diagnosis, and hence treatment, difficult even under expert supervision.

Improper Annotation

Proper annotation of data samples is critical for certain life-saving healthcare applications. ML/DL mechanisms are deployed for automated image labeling tasks, which may lead to coarse-grained labels and mislabelling [128]. These problems may undermine the predictive capabilities of healthcare systems, as discussed next.

Efficiency Challenges

Efficacy is the prime measure for monitoring the performance of an ML/DL-based system. The particular challenges that affect data quality, and hence performance, are limited and imbalanced datasets, class imbalance and bias, and sparsity. Newly identified diseases have little available history, and this limitation reduces a model's performance in predicting their outcomes. Class imbalance is a common problem in supervised ML/DL models and arises from a mismatched or non-uniform distribution of data across the respective classes. Data sparsity refers to missing values in the input data that arise from skipped or unreported samples. All these problems significantly affect the functioning of ML/DL techniques.
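One common way to soften class imbalance during training is to weight each class inversely to its frequency, so that errors on the rare class count for more. The sketch below computes such weights with the widely used heuristic n_samples / (n_classes * class_count), and also measures sparsity as the share of missing entries. The function names and the use of `None` to mark an unreported value are assumptions made for this example.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

def missing_fraction(rows):
    """Share of entries marked None (unreported) across all rows."""
    total = sum(len(r) for r in rows)
    missing = sum(1 for r in rows for v in r if v is None)
    return missing / total if total else 0.0

# Hypothetical cohort: 90 healthy patients and 10 with a rare condition.
toy_labels = ["healthy"] * 90 + ["disease"] * 10
toy_weights = balanced_class_weights(toy_labels)
# The rare class receives the larger weight in a weighted loss.

# Hypothetical records with two unreported measurements out of six.
toy_sparsity = missing_fraction([[1.0, None], [2.0, 3.0], [None, 4.0]])
```

Resampling the minority class or collecting more data are alternatives; weighting is shown here only because it is the simplest to express.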

Vulnerabilities in Model Training

Vulnerabilities concerning ML/DL model training comprise improper training, model poisoning, privacy infringement, and incomplete data rendering. Improper training means feeding the model inappropriate parameters (such as the number of epochs or the test/training ratio), as a result of which the model may draw incorrect inferences. ML/DL models are exposed to cyber-attacks such as adversarial attacks, Trojan attacks, and backdoor attacks, which breach the integrity of the underlying system [129]. These impediments limit the effective use of ML/DL models and thereby constrain the development of security-critical and life-critical applications.

Vulnerabilities in Deployment Phase

Deploying ML/DL systems in a healthcare ecosystem requires extensive human effort; consequently, to preserve the robustness of the system, accountability has to be considered in the deployment phase. Vulnerabilities that occur in the deployment phase of ML/DL systems include distribution shifts and incomplete data. Distribution shifts arise because models are expected to be deployed on data from a domain different from the one they were trained on; deployed models are also vulnerable to adversarial attacks [130]. Since ML models are trained on historical medical data, their predictive performance can degrade on future data. Certain circumstances result in incomplete data collection, which can influence the outcome. Incomplete records can either be dropped or have their missing values replaced with the column mean; however, these practices often lead to false positives and false negatives, which can have severe consequences in medical care systems. To ensure accurate prediction and diagnosis, compact and complete data is vital.
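The mean-replacement practice mentioned above can be sketched as follows. This is a minimal illustration with hypothetical readings; the function name is an assumption. Keeping a flag for each imputed value is one possible safeguard, since it lets a downstream model or reviewer see which entries were never actually measured.

```python
from statistics import mean


def impute_mean_with_flag(values):
    """Replace None with the mean of the observed values.

    Returns (filled, was_missing), where was_missing[i] is True for
    every position that was imputed rather than measured.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    filled = [fill if v is None else v for v in values]
    was_missing = [v is None for v in values]
    return filled, was_missing


# Hypothetical glucose readings with two unreported values.
glucose = [90.0, None, 110.0, None, 250.0]
filled, was_missing = impute_mean_with_flag(glucose)
print(filled)
print(was_missing)
```

In this toy column a single extreme reading pulls the mean upward, so both imputed patients receive an elevated value they may not have had. That is the kind of distortion that can surface later as false positives or false negatives.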

Vulnerabilities in Testing Phase

Vulnerabilities in the testing phase typically arise from training anomalies caused by incomplete data, altered data fed for inference, and unlabelled medical image inputs, to name a few. These problems can produce false positives or false negatives, limiting the accurate prediction of the condition or disease. Loopholes in the prediction pipeline are critical for a patient's treatment. Ultimately, ML-based healthcare is not just about reducing effort or about predictive analysis and treatment; it demands circumspect deployment of statistical and analytical methods in the underlying systems [131].

ML for Healthcare: Challenges

Scientists and researchers are using ML/DL techniques to produce smart solutions that help streamline the administrative as well as the diagnostic procedures in a medical management ecosystem. Addressing the associated challenges is a prerequisite for the prudent advancement of ML/DL-based systems for viable healthcare applications. Some of the challenges that impede the performance and applicability of automated systems are discussed in this section. Table 2 summarizes the probable challenges faced when implementing ML in a healthcare ecosystem.

Table 2

Challenges involved with Machine Learning in Healthcare

ML in healthcare challenges	Description
Safety challenges	A model's prediction precision without expert intervention is questioned; identifying rare, underlying health problems is challenging; enabling ML techniques to identify subtly hidden cases is the key to ensuring safety
Privacy challenges	Preserving privacy can be challenging; patients expect their confidential information to be safeguarded; anonymization can help prevent unauthorized access and privacy breaches
Ethical challenges	Data accumulation requires authorization; patients' dignity must be preserved while collecting data; if ethical concerns are not addressed, ML applications are unfavourably affected
Availability of quality data	The information available is heterogeneous; data collected during practice have issues (bias, redundancy) that adversely affect the algorithms; high-quality practical data requires resources and well-maintained services
Causality is challenging	Reasoning while taking decisions in crucial health problems is essential; queries that require expert reasoning cannot be answered from the medical data alone; forming causal explanations from data is challenging
Updating hospital infrastructure is inflexible	Independent sections of healthcare avoid frequent information exchange; for frictionless communication, antiquated systems need upgrading; the difficulties in upgrading hospital infrastructure raise concerns for modern-day healthcare practices using ML/DL


Safety Challenges

Safety is not a measure of how well an ML/DL model performs in a provisioned environment. Safety accounts for how reliably an ML/DL model can determine a patient's condition without any expert intervention. The majority of patients under a doctor's supervision have common health conditions, and it is the doctor's responsibility to check for any rare, subtle, or hidden underlying health problems. Arachchige et al. [26] introduced the PriModChain framework, which enforces safety in the functioning of various mechanical applications in the healthcare domain. Enabling ML/DL applications to recognize such rare and subtle underlying events is beneficial in ensuring the safety of present automated systems.

Privacy Challenges

Privacy is the right of every user (i.e., patient). Preserving privacy in a data-driven healthcare ecosystem is a challenging task, because trust is intertwined with issues such as integrity, confidentiality, authenticity, accountability, data management, and identity, to name a few [132]. Patients expect their medical service providers to safeguard their confidential information from being mishandled or breached through unauthorized access; mitigating privacy breaches is therefore critical for protecting a patient from potential harm. One way of preserving the confidentiality of data is anonymization, which guards against privacy harms such as the re-identification of individuals [133]. In addition, strict attention should be paid to every stage of data collection and transmission.
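Re-identification risk can be made concrete with a simple check on quasi-identifiers, i.e., attributes that are not names but can single a person out in combination. The sketch below uses the k-anonymity idea as an illustration; it is not a technique named in the text, and the records, field names, and function name are all hypothetical.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return k, the size of the smallest group of records that share
    the same values for all the given quasi-identifiers.

    k == 1 means at least one record is unique on those attributes.
    """
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values())


# Hypothetical, already de-named patient records.
records = [
    {"age_band": "30-39", "zip3": "123", "diagnosis": "asthma"},
    {"age_band": "30-39", "zip3": "123", "diagnosis": "diabetes"},
    {"age_band": "40-49", "zip3": "123", "diagnosis": "copd"},
]

k_fine = smallest_group_size(records, ["age_band", "zip3"])
k_coarse = smallest_group_size(records, ["zip3"])
print(k_fine, k_coarse)
```

Here the third record is unique on age band and postcode prefix, so removing names alone would not prevent it from being singled out. Releasing only the coarser attribute raises k at the cost of detail.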

Ethical Challenges

Ethical usage of data in an ML-driven healthcare system is of utmost importance. Great caution should be taken while accumulating data for building ML models, keeping the sociological aspects of the targeted population in mind. A patient's concern for preserving their dignity should be respected during data collection. If ethical concerns are not addressed, the use of intelligent systems will have an unfavorable impact. To extend fair and ethical considerations to uncertain and complex scenarios, a clear understanding of ML systems in this regard is expected [134].

Availability of Quality Data

Another shortcoming in a healthcare ecosystem is the limited availability of diverse, good-quality data. An extensive amount of heterogeneous patient information is generated across medical institutions every day, yet only an inadequate amount of useful data is made available to researchers and the scientific community. Producing high-quality practical data requires resources and well-maintained, well-managed services. An ample supply of quality data would enable professionals to develop systems for illness prediction and treatment. Data collected during practice can have issues such as bias and redundancy, which are reflected as adverse outcomes in the algorithms. Intelligent systems cannot distinguish racial bias from fair judgment, because they learn from the human decisions recorded in the data; for example, a person with no health coverage may be denied medical services, and research has shown that an AI system trained on such records can produce racially biased predictions [135]. The training data also contributes to modeling challenges [136-138].

Causality is Challenging

Causality can be challenging from a medical perspective. Reasoning about "What if?" questions while taking decisions in crucial healthcare problems is essential [139]. Consider a circumstance in which we need to analyse how the outcome would have been influenced if the doctor had prescribed treatment 1 rather than treatment 2. Queries of this kind cannot be answered by analysing the medical data alone; they require causal reasoning. In healthcare applications, learning and inferring from observational data is the norm, but forming causal explanations from such data is challenging and requires building causal models. ML/DL models lack fundamental reasoning under the hood and produce output based on correlations and patterns, without considering the causal links in between. In practical applications, this lack of causal analysis may raise concerns about the predictions of AI systems. Acknowledging the causal effect of certain variables on the target outcome is paramount for fair predictive behaviour.
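The gap between correlation and a "what if" answer can be illustrated with a small worked example. The counts below are hypothetical and chosen so that sicker patients mostly receive treatment 1. The sketch assumes that disease severity is the only confounder, which would rarely hold in practice, and the function names are illustrative.

```python
# (treatment, severity) -> (patients, recovered); hypothetical counts.
counts = {
    ("treatment_1", "mild"): (10, 9),
    ("treatment_1", "severe"): (90, 45),
    ("treatment_2", "mild"): (90, 72),
    ("treatment_2", "severe"): (10, 3),
}


def naive_rate(treatment):
    """Recovery rate read directly from the observational data."""
    patients = sum(n for (t, _), (n, _r) in counts.items() if t == treatment)
    recovered = sum(r for (t, _), (_n, r) in counts.items() if t == treatment)
    return recovered / patients


def adjusted_rate(treatment):
    """Recovery rate standardised to the severity mix of the whole cohort,
    i.e. an estimate of the outcome if everyone received this treatment
    (valid only under the single-confounder assumption)."""
    total = sum(n for n, _r in counts.values())
    rate = 0.0
    for severity in ("mild", "severe"):
        stratum = sum(n for (_t, s), (n, _r) in counts.items() if s == severity)
        n, r = counts[(treatment, severity)]
        rate += (stratum / total) * (r / n)
    return rate


for t in ("treatment_1", "treatment_2"):
    print(t, round(naive_rate(t), 2), round(adjusted_rate(t), 2))
```

The naive comparison favours treatment 2 simply because it was given mostly to mild cases, while the severity-adjusted comparison favours treatment 1. A purely pattern-based model trained on these records would learn the first, misleading association.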

Updating Hospital Infrastructure is Inflexible

Healthcare organizations favor independent operations and mostly avoid sharing information. A frictionless information exchange requires fixing and updating antiquated software, which can be time-consuming and is often not cost-effective. Finlayson et al. [127] reported that, as late as the 2010s, most hospitals were still operating on the ninth version of the International Classification of Diseases (ICD) system, even though the updated version, ICD-10, had been released in the early 1990s. The difficulties in upgrading hospital infrastructure and internal management systems can raise concerns about the applicability of recent ML/DL practices.

Future Research Directions

In this section, various issues that require active research attention related to the security, privacy, and robustness of ML in the Healthcare ecosystem are discussed.

Machine Learning on the Edge

The use of ML in healthcare applications has grown rapidly in recent years. Research in ML has transformed traditional methods in favor of smart and energy-efficient utilization of wearable devices, IoT sensors, etc. With the development of smart cities and transportable medical devices such as portable ventilators, oxygen concentrators, and MRI machines, there is a constant demand for refined ML models trained on Edge devices. This faces a few limitations, including a lack of available hardware support and of high computational processing capability. ML on Edge devices is at a nascent stage and requires attention from the research community. Growth in this domain will lead to faster care in critical situations and continuous monitoring of a patient's health from a remote location, thereby improving healthcare facilities for a better lifestyle and timely medical assistance.

Handling Dataset Annotation

The output of AI systems is highly dependent on the labeled datasets used for training and inference. This requires medical experts and physiologists to annotate medical data (such as images, clinical reports, and signals) manually, spending much of their valuable time on this tedious work. A variety of practical medical data annotated with accurate labels helps evaluate ML/DL models and exposes weaknesses that might otherwise go unnoticed. However, manually labeling data into the respective classes is demanding, tedious, and draining. Automatic approaches such as active learning should be adopted and developed to address this impediment.
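Active learning reduces annotation effort by asking experts to label only the samples the current model is least sure about. The sketch below shows one simple selection rule, uncertainty sampling for a binary classifier. The probabilities are hypothetical model outputs and the function name is an assumption.

```python
def select_most_uncertain(probabilities, batch_size):
    """Return the indices of the samples whose predicted positive-class
    probability is closest to 0.5, most uncertain first."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:batch_size]


# Hypothetical model outputs for five unlabeled scans.
predicted = [0.97, 0.52, 0.10, 0.45, 0.80]
to_annotate = select_most_uncertain(predicted, batch_size=2)
print(to_annotate)
```

In a full loop, the selected samples would be labeled by an expert, added to the training set, and the model retrained before the next selection round.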

Distributed Data Management and ML

In healthcare systems, data generation is distributed, i.e., data is produced by various departments within a hospital and across hospitals in different geographical locations. This puts pressure on efficient data sharing and management for clinical analysis, particularly when using ML models. ML/DL models are generally developed on the assumption that all the analytical information is easily accessible and centrally available. The shortcomings caused by improper management of information exchange need the attention of developers and researchers, who could collaboratively tackle the administration of distributed data and ML.
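One way to work with data that cannot be centralised is for each site to share only aggregate summaries, which a coordinator then combines. The sketch below pools a mean across two hypothetical hospitals without moving patient-level rows. The names and readings are illustrative, and sharing aggregates is not by itself a privacy guarantee, particularly when a site has very few records.

```python
def local_summary(values):
    """Computed inside each hospital: only the count and the sum leave the site."""
    return {"n": len(values), "total": sum(values)}


def pooled_mean(summaries):
    """Computed by the coordinator from the per-site summaries alone."""
    n = sum(s["n"] for s in summaries)
    if n == 0:
        raise ValueError("no records across sites")
    return sum(s["total"] for s in summaries) / n


# Hypothetical systolic blood pressure readings held at two sites.
hospital_a = [120.0, 130.0, 125.0]
hospital_b = [140.0]

summaries = [local_summary(hospital_a), local_summary(hospital_b)]
pooled = pooled_mean(summaries)
print(pooled)
```

The result equals the mean that would be obtained from the concatenated data, so for this statistic nothing is lost by keeping the rows local.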

Fair and Accountable ML

Qayyum et al. [140], in analyzing the robustness and security of ML/DL techniques, reasoned that the results of the models are biased and lack accountability. Ensuring the fairness and precision of predictions is of cardinal importance for life-critical applications in healthcare systems. Compromising the accuracy and accountability of these models could result in adverse outcomes and put patients' health at risk. The fairness of predictions made by ML/DL models is affected by the many cases for which little data is available. Tuning models with fair judgment and interpretability in mind will make them more robust and help avoid repeating misjudgements present in past clinical records. Further study is needed in this area to develop dynamic methods that ensure safety and reduce imperfections.
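A simple first check for unfair behaviour is to compare an error rate across patient groups. The sketch below computes the false-negative rate (missed positive cases) per group; the labels, predictions, and group names are hypothetical, and this is only one of many possible fairness measures.

```python
from collections import defaultdict


def false_negative_rate_by_group(y_true, y_pred, groups):
    """Return {group: missed positives / actual positives} for every
    group that has at least one positive case."""
    positives = defaultdict(int)
    missed = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            positives[group] += 1
            if pred == 0:
                missed[group] += 1
    return {g: missed[g] / positives[g] for g in positives}


# Hypothetical labels (1 = disease present), predictions, and groups.
y_true = [1, 1, 1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]
groups = ["A", "A", "B", "B", "A", "A", "B", "B"]

rates = false_negative_rate_by_group(y_true, y_pred, groups)
print(rates)
```

In this toy data the model misses twice as many true cases in group B as in group A. An overall accuracy figure would hide that gap.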

Model-Driven ML

The use of ML and AI for predictive analysis in healthcare applications comes with benefits as well as liabilities. Latif et al. [141] discussed the caveats associated with these tools; failing to acknowledge their limitations can prove critical in clinical settings. The apparent strengths of these models can create the impression that data, once available in abundance, can drive hypothesis generation without validation and interpretation by medical experts, which invites unavoidable problems. To avoid these pitfalls, it is important to combine data-driven methods with hypothesis- and model-based approaches to bring controlled precision to these studies. Building robust, secure, and accountable ML deliverables that are technically precise requires further research.

Conclusion

ML is driven by statistically informed algorithms that span categories such as regression, classification, and clustering. These algorithms assist in building intelligent solutions for automating clinical tasks and detecting suspected disease. The traditional practice of healthcare service delivery has changed vastly with the advent of ML- and DL-based approaches. However, to ensure secure, bias-free, and sound utilization of these models, the associated challenges should be addressed. This paper provides a brief introduction to several ML algorithms, discusses their capabilities and limitations, and identifies reliable practices for avoiding shortcomings in model building. It also provides a synopsis of the challenges arising in the ML deployment pipeline for healthcare infrastructure by classifying the different sources of vulnerability in it. Finally, this work discusses possible solutions for providing users as well as clinical experts in a healthcare ecosystem with secure, robust, and privacy-protected ML solutions for privacy-sensitive applications. The paper closes by outlining the potential of ML techniques in the healthcare sector and the privacy considerations linked with them.
