Alisa J Hamilton, Alexandra T Strauss, Diego A Martinez, Jeremiah S Hinson, Scott Levin, Gary Lin, Eili Y Klein.
Abstract
Artificial intelligence (AI) refers to the performance of tasks by machines ordinarily associated with human intelligence. Machine learning (ML) is a subtype of AI; it refers to the ability of computers to draw conclusions (ie, learn) from data without being directly programmed. ML builds from traditional statistical methods and has drawn significant interest in healthcare epidemiology due to its potential for improving disease prediction and patient care. This review provides an overview of ML in healthcare epidemiology and practical examples of ML tools used to support healthcare decision making at 4 stages of hospital-based care: triage, diagnosis, treatment, and discharge. Examples include model-building efforts to assist emergency department triage, predicting time before septic shock onset, detecting community-acquired pneumonia, and classifying COVID-19 disposition risk level. Increasing availability and quality of electronic health record (EHR) data as well as computing power provides opportunities for ML to increase patient safety, improve the efficiency of clinical management, and reduce healthcare costs.
Year: 2021 PMID: 36168500 PMCID: PMC9495400 DOI: 10.1017/ash.2021.192
Source DB: PubMed Journal: Antimicrob Steward Healthc Epidemiol ISSN: 2732-494X
Relevant Machine Learning Terms
| Term | Definition |
|---|---|
| Artificial intelligence (AI) | The performance by machines of tasks ordinarily associated with human intelligence |
| Machine learning (ML) | A type of artificial intelligence in which computers draw conclusions from data without being directly programmed |
| Supervised learning | Models in which the outcome is known for each observation |
| Unsupervised learning | Models in which the outcome is not known for each observation |
| Semisupervised learning | Models in which the outcome is known for some observations but not others |
| Label | The patient outcome (dependent variable) |
| Feature | An attribute/characteristic of the patient (independent variable) |
| Sensitivity | Ability of a model to correctly identify true cases |
| Specificity | Ability of a model to correctly identify negative cases |
| Accuracy | Measure of correctly labeled data instances over the total number of instances |
| Precision | Fraction of relevant instances among the retrieved instances (ie, positive predictive value) |
| Recall | Fraction of relevant instances that were retrieved correctly (ie, sensitivity) |
| Training data set | Data used to develop a model |
| Validation data set | Data used to test a model’s performance while training |
| Test data set | Data withheld from model development and used to test the accuracy, precision, or recall of the final model on data it has not seen |
| Out-of-sample data | In a study cohort, the data not used as training data |
| Bias-variance tradeoff | In supervised learning, the balance between underfitting (high bias) and overfitting (high variance), either of which can result in loss of performance |
| Bias | Difference between the average prediction of a model and the correct value |
| Variance | Variability of a model prediction for a given data point |
| Overfitting | When the model follows noise, resulting in low bias and high variance |
| Noise | Nonpredictive features in the data set |
| Underfitting | When the model fails to capture the underlying patterns in the data, resulting in low variance and high bias |
| Decision tree | A model that separates data into smaller and smaller partitions until each observation is classified according to the outcome of interest |
| Stopping criteria | Criteria used to stop further partitioning of data in a decision tree. Can prevent overfitting |
| Ensemble model | An ML technique combining multiple individual models |
| Random forest | A type of ensemble model that combines decision trees to produce a probabilistic prediction for the outcome |
| Receiver operator characteristic (ROC) curve | A graph of a model’s sensitivity (true-positive rate) against 1 − specificity (false-positive rate) across classification thresholds |
| Area under the curve (AUC) | A technique to compare model results (with other models or other measurement tools) by calculating the area under an ROC curve |
| Natural language processing (NLP) | A type of AI in which the algorithm learns how to ‘understand’ language, including contextual nuances |
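
The performance terms in the table (sensitivity, specificity, accuracy, precision, and AUC) can be illustrated with a small worked example. The following is a minimal sketch, not a method taken from the review: the labels, risk scores, the 0.5 threshold, and all function names are hypothetical and chosen only for illustration. The AUC is computed here through its rank interpretation, the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, which equals the area under the ROC curve.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # true positives: cases correctly flagged
    fp: int  # false positives: non-cases incorrectly flagged
    tn: int  # true negatives: non-cases correctly not flagged
    fn: int  # false negatives: cases that were missed


def confusion_counts(labels, predictions):
    """Tally outcomes for binary labels and predictions coded as 0 or 1."""
    pairs = list(zip(labels, predictions))
    return ConfusionCounts(
        tp=sum(1 for y, p in pairs if y == 1 and p == 1),
        fp=sum(1 for y, p in pairs if y == 0 and p == 1),
        tn=sum(1 for y, p in pairs if y == 0 and p == 0),
        fn=sum(1 for y, p in pairs if y == 1 and p == 0),
    )


def sensitivity(c):
    """Fraction of true cases the model identified (recall)."""
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Fraction of negative cases the model identified."""
    return c.tn / (c.tn + c.fp)


def accuracy(c):
    """Correctly labeled instances over all instances."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def precision(c):
    """Fraction of flagged instances that are true cases (positive predictive value)."""
    return c.tp / (c.tp + c.fp)


def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = patient had the outcome, 0 = patient did not.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical model risk scores for the same eight patients.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
# Assumed decision threshold of 0.5 to turn scores into yes/no predictions.
predictions = [1 if s >= 0.5 else 0 for s in scores]

counts = confusion_counts(labels, predictions)
print("sensitivity:", sensitivity(counts))
print("specificity:", specificity(counts))
print("accuracy:", accuracy(counts))
print("precision:", precision(counts))
print("AUC:", roc_auc(labels, scores))
```

Sensitivity, specificity, accuracy, and precision all depend on the chosen threshold, whereas the AUC summarizes ranking performance across every threshold, which is why it is often used to compare models.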