Sangil Lee, Nicholas M Mohr, W Nicholas Street, Prakash Nadkarni.
Abstract
Health informatics is a vital technology that holds great promise in the healthcare setting. We describe two prominent health informatics tools relevant to emergency care, as well as the historical background and the current state of informatics. We also identify recent research findings and practice changes. Recent advances in machine learning and natural language processing (NLP) are a prominent development in health informatics overall and are relevant to emergency medicine (EM). A basic comprehension of machine-learning algorithms is key to understanding the recent use of artificial intelligence in healthcare. NLP is increasingly used in clinical practice for documentation, and it has started to be used in research to identify clinically important diseases and conditions. Health informatics has the potential to benefit both healthcare providers and patients. We cover two powerful tools from health informatics for EM clinicians and researchers by describing previous successes and challenges, and we conclude with their implications for emergency care.
Year: 2019 PMID: 30881539 PMCID: PMC6404711 DOI: 10.5811/westjem.2019.1.41244
Source DB: PubMed Journal: West J Emerg Med ISSN: 1936-900X
Types and examples of machine-learning algorithms.
| Study/year | Type of prediction model | Feature of model | Example |
|---|---|---|---|
| Nelder JA et al. 1972 | Generalized linear model (GLM) | A technique used to obtain maximum likelihood estimates of the parameters with observations distributed according to some exponential family and systematic effects that can be made linear by a suitable transformation. A generalization of the analysis of variance is given for these models using log-likelihoods. | The study aimed to forecast daily emergency department (ED) visits using calendar variables and ambient temperature and to compare the models in terms of forecasting accuracy. |
| Lee E et al. 2012 | Discriminant analysis | A generalization of Fisher’s linear discriminant, a method used in statistics, pattern recognition, and machine learning, to find a linear combination of features that characterizes or separates two or more classes of objects or events. | This study was done to develop a clinical tool capable of identifying discriminatory characteristics that can predict patients who will return within 72 hours to the pediatric emergency department. The investigator used a classification model to predict return visits based on factors extracted from patient demographic information, chief complaint, diagnosis, treatment, and a hospital real-time ED statistics census. |
| Lee S et al. 2017 | Logistic regression | A type of supervised learning that groups the variable to be predicted into classes (presence or absence of disease, for example) by estimating the probabilities with a logistic function. It is intelligible, meaning it is interpretable by humans. | To derive a prediction rule to stratify ED anaphylaxis patients at risk of a biphasic reaction, the authors conducted an observational study of a cohort of patients presenting to an academic ED with signs and symptoms of anaphylaxis. Logistic regression analyses were conducted to identify predictors of biphasic reactions, and odds ratios (ORs) are reported. |
| Peck JS et al. 2012 | Naïve Bayes | A learning algorithm for binary (0 or 1) or categorical (1, 2, 3, 4, for example) problems. The probability calculations for each hypothesis are simplified to make them tractable, and the hypothesis with the highest posterior probability (example: post-test probability) is chosen, starting from the prior probabilities (example: pre-test probability). It is based on the strong assumption that the predictor variables do not interact and are conditionally independent of each other. | The objectives were to evaluate three models that use information gathered during triage to predict the number of ED patients who will subsequently be admitted to a hospital inpatient unit (IU) and to introduce a new methodology for implementing these predictions in the hospital setting. Three simple methods for predicting hospital admissions at ED triage were compared with each other: expert opinion, naïve Bayes conditional probability, and a generalized linear regression model with a logit link function (logit-linear). Predictors considered included patient age, primary complaint, provider, designation (ED or fast track), arrival mode, and urgency level (emergency severity index assigned at triage). |
| Hao S et al. 2014 | Decision tree | A decision tree is a flow chart–like structure in which each internal node denotes a test on an attribute, each branch represents the outcome of a test, and each leaf node holds a class label. | A decision tree–based model with discriminant electronic medical record (EMR) features was developed and validated to estimate a patient’s 30-day ED revisit risk. A retrospective cohort was assembled with the associated patients’ demographic information and one-year clinical histories before the discharge date as the inputs. |
| Levin S et al. 2017.12 | Random forest | A type of ensemble method designed for decision tree classifiers that combines the predictions made by multiple decision trees, where each tree is generated based on the values of an independent set of random vectors. | E-triage applied a random forest model to triage data to predict the need for critical care, an emergency procedure, and inpatient hospitalization in parallel, and translated risk to triage level designations. |
| Son YJ et al. 2010 | Support vector machine (SVM) | A type of supervised learning model with associated learning algorithms that analyze data for classification and regression analysis. Given a set of training examples, each marked as belonging to one or the other of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a binary linear classifier. An SVM model represents the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall. | The study aimed to identify predictors of medication adherence in heart failure patients. The investigators applied SVM for data classification. Given a set of training data, each marked as belonging to one of two categories, an SVM training algorithm develops a model by finding a hyperplane that classifies the given data as accurately as possible by maximizing the distance between the two data clusters. Data about medication adherence were collected from patients at a university hospital through a self-reported questionnaire. |
| Wu Y et al. 1993 | Neural network | Information processing that derives meaning from complicated or imprecise data. Each node, functioning as an artificial neuron, is connected to other nodes, and each connection has a weight to facilitate the learning process based on input and output, similar to the brain’s neural network. | A study on developing a decision-making aid for radiologists in the analysis of mammographic data used an artificial neural network. The algorithm was trained on features extracted by experienced radiologists. The performance of the neural network was found to be higher than the average performance of resident and staff physicians alone, leading to the conclusion that such networks may provide a potentially useful tool for distinguishing between benign and malignant lesions in mammograms. |
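The naïve Bayes row in the table above can be illustrated with a minimal Python sketch. The feature names (`arrival_mode`, `urgency`), the priors, and the conditional probabilities are hypothetical values chosen only for illustration; they are not taken from Peck et al. or any other study. Under the conditional-independence assumption, the posterior for each class is proportional to the prior multiplied by the per-feature likelihoods.

```python
from math import prod


def naive_bayes_posterior(priors, likelihoods, observed):
    """Return normalized posterior probabilities for each class.

    priors:      {class: P(class)}
    likelihoods: {class: {feature: {value: P(value | class)}}}
    observed:    {feature: value}
    """
    # Unnormalized score: prior times the product of per-feature likelihoods
    # (this product is where conditional independence is assumed).
    scores = {
        c: priors[c] * prod(likelihoods[c][f][v] for f, v in observed.items())
        for c in priors
    }
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


# Hypothetical pre-test (prior) probabilities and likelihoods, illustration only.
priors = {"admit": 0.2, "discharge": 0.8}
likelihoods = {
    "admit": {
        "arrival_mode": {"ambulance": 0.6, "walk_in": 0.4},
        "urgency": {"high": 0.7, "low": 0.3},
    },
    "discharge": {
        "arrival_mode": {"ambulance": 0.2, "walk_in": 0.8},
        "urgency": {"high": 0.25, "low": 0.75},
    },
}

observed = {"arrival_mode": "ambulance", "urgency": "high"}
posterior = naive_bayes_posterior(priors, likelihoods, observed)

# Choose the class with the highest posterior (post-test) probability.
predicted = max(posterior, key=posterior.get)
```

With these made-up numbers the prior favors discharge, but the two observed features shift the posterior toward admission, which is the pre-test to post-test update the table describes.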
Figure 1. Diagram of an artificial neural network.
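As a rough illustration of the weighted connections between artificial neurons, the following Python sketch computes the output of a small network with one hidden layer. The sigmoid activation, the layer sizes, and all weight values are assumptions made for this example; the source does not specify them.

```python
import math


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs passed through a sigmoid activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_layer, output_neuron):
    """Propagate inputs through one hidden layer to a single output neuron.

    hidden_layer:  list of (weights, bias) pairs, one per hidden neuron
    output_neuron: a single (weights, bias) pair
    """
    hidden = [neuron_output(inputs, w, b) for w, b in hidden_layer]
    out_weights, out_bias = output_neuron
    return neuron_output(hidden, out_weights, out_bias)


# Hypothetical weights: two inputs, two hidden neurons, one output neuron.
hidden_layer = [([0.5, -0.3], 0.1), ([-0.2, 0.8], 0.0)]
output_neuron = ([1.0, -1.0], 0.05)
prediction = forward([1.0, 2.0], hidden_layer, output_neuron)
```

Learning consists of adjusting the weights and biases so that outputs move closer to known targets; that training step is omitted here.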
Figure 2. Diagram of a decision tree.
The original figure was created by Ramezankhani et al. (ref. 14) and is available at https://bmjopen.bmj.com/content/6/12/e013336.long. The shading and formatting of the lines between tree nodes and the text font were modified.
FPG, Fasting Plasma Glucose; PCPG, Post Challenge Plasma Glucose.
Figure 3. K-fold cross-validation.*
*The dataset is divided into k equally sized subsets. The model is trained on k - 1 of the subsets (the training set). After the training process, the model is tested on the remaining subset (the test set). The procedure is repeated so that each subset serves once as the test set, and the number of subsets determines k. In 10-fold cross-validation, for example, one obtains 10 results, one from each fold.
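The procedure described in the note to Figure 3 can be sketched in plain Python. The helper names (`k_fold_indices`, `cross_validate`, `fit_majority`, `accuracy`) and the toy data are hypothetical and serve only to show the mechanics. The sketch splits the data in order without shuffling, which is a simplifying assumption.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    indices = list(range(n_samples))
    # Spread any remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(xs, ys, k, fit, score):
    """Train on k - 1 folds, test on the held-out fold, and collect k scores."""
    results = []
    for train, test in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        results.append(score(model, [xs[i] for i in test], [ys[i] for i in test]))
    return results


def fit_majority(xs, ys):
    """Toy model: always predict the most common training label."""
    return max(set(ys), key=ys.count)


def accuracy(model, xs, ys):
    """Fraction of test labels equal to the model's constant prediction."""
    return sum(1 for y in ys if y == model) / len(ys)


# Hypothetical toy data: 20 samples with binary labels.
xs = list(range(20))
ys = [1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1]
results = cross_validate(xs, ys, k=10, fit=fit_majority, score=accuracy)
mean_score = sum(results) / len(results)
```

Ten-fold cross-validation produces ten scores, one per held-out fold, which are commonly averaged into a single performance estimate.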