Ki-Jo Kim, Ilias Tagkopoulos.
Abstract
Over the past decade, there has been a paradigm shift in how clinical data are collected, processed and utilized. Machine learning and artificial intelligence, fueled by breakthroughs in high-performance computing, data availability and algorithmic innovations, are paving the way to effective analyses of large, multi-dimensional collections of patient histories, laboratory results, treatments, and outcomes. In the new era of machine learning and predictive analytics, the impact on clinical decision-making in all clinical areas, including rheumatology, will be unprecedented. Here we provide a critical review of the machine-learning methods currently used in the analysis of clinical data, the advantages and limitations of these methods, and how they can be leveraged within the field of rheumatology.
Keywords: Machine learning; Prediction; Rheumatology
Year: 2018 PMID: 30616329 PMCID: PMC6610179 DOI: 10.3904/kjim.2018.349
Source DB: PubMed Journal: Korean J Intern Med ISSN: 1226-3303 Impact factor: 2.884
Figure 1. An overview of fields related to learning from data. AI, artificial intelligence.
Figure 2. Overview of categorical types and different machine-learning algorithms. AI, artificial intelligence.
Figure 3. Workflow to develop a supervised machine-learning-based predictive model.
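Figure 3 itself is not reproduced in this text. As a rough illustration of what a supervised workflow commonly involves (split the data, fit preprocessing on the training portion only, train a model, evaluate on held-out data), here is a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than clinical, all function names are hypothetical, and k-nearest neighbors is used only because it is simple to write out, not because the authors' workflow prescribes it.

```python
import math
import random


def make_synthetic_cohort(n, seed=0):
    """Generate (features, label) pairs; purely synthetic, not clinical data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Two hypothetical lab values whose means shift with the label.
        x1 = rng.gauss(2.0 if label else 0.0, 1.0)
        x2 = rng.gauss(1.5 if label else 0.0, 1.0)
        rows.append(([x1, x2], label))
    return rows


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(train):
    """Per-feature mean and standard deviation, computed on training rows only."""
    means, stds = [], []
    for j in range(len(train[0][0])):
        col = [x[j] for x, _ in train]
        m = sum(col) / len(col)
        s = math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        means.append(m)
        stds.append(s)
    return means, stds


def scale(x, scaler):
    means, stds = scaler
    return [(v - m) / s for v, m, s in zip(x, means, stds)]


def knn_predict(train_scaled, x, k=5):
    """Majority vote among the k training rows closest to x (k should be odd)."""
    nearest = sorted(train_scaled, key=lambda row: math.dist(row[0], x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0


def held_out_accuracy(train, test, k=5):
    scaler = fit_scaler(train)
    train_scaled = [(scale(x, scaler), y) for x, y in train]
    correct = sum(
        knn_predict(train_scaled, scale(x, scaler), k) == y for x, y in test
    )
    return correct / len(test)


cohort = make_synthetic_cohort(400)
train_rows, test_rows = train_test_split(cohort)
test_accuracy = held_out_accuracy(train_rows, test_rows)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

The scaler is fitted on the training rows alone so that no information from the held-out set leaks into the model. A real clinical study would typically add cross-validation, hyperparameter tuning, and external validation on an independent cohort.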
Representative clinical studies using machine learning methods in internal medicine
| Area | Title | Machine learning category | Machine learning methods | Input data | Reference |
|---|---|---|---|---|---|
| Cardiology | Identifying important risk factors for survival in patient with systolic heart failure using random survival forests | Supervised | Random survival forest | 39 Clinical variables; 2,231 adult patients with systolic heart failure | [ |
| Cardiology | Use of hundreds of electrocardiographic biomarkers for prediction of mortality in postmenopausal women: the Women's Health Initiative | Supervised | Random survival forest | 477 Electrocardiographic findings; 33,144 postmenopausal women | [ |
| Cardiology | Phenomapping for novel classification of heart failure with preserved ejection fraction | Unsupervised | Agglomerative hierarchical clustering | 67 Clinical and echocardiographic parameters; 420 patients with heart failure with preserved ejection fraction | [ |
| Cardiology | Machine learning for prediction of all-cause mortality in patients with suspected coronary artery disease: a 5-year multicentre prospective registry analysis | Supervised | Logit-boost model | 44 Coronary computed tomographic angiography variables and 25 clinical variables; 10,030 patients with suspected coronary artery disease | [ |
| Pulmonology | Unsupervised learning technique identifies bronchiectasis phenotypes with distinct clinical characteristics | Unsupervised | Hierarchical clustering | 78 Selected features from clinical, radiographic, and functional parameters; 148 patients with bronchiectasis | [ |
| Gastroenterology | Predicting hospitalization and outpatient corticosteroid use in inflammatory bowel disease patients using machine learning | Supervised | Random forest | Over 30 clinical and laboratory features; 20,368 patients with inflammatory bowel disease | [ |
| Nephrology | The development of a machine learning inpatient acute kidney injury prediction model | Supervised | Gradient boosting machine | 36 Clinical and laboratory features; 121,158 admissions | [ |
| Nephrology | Using machine learning algorithms to predict risk for development of calciphylaxis in patients with chronic kidney disease | Supervised | LASSO logistic regression; random forest | 9,288 Clinical and laboratory features; 401 patients with chronic kidney disease | [ |
| Endocrinology | A predictive metabolic signature for the transition from gestational diabetes mellitus to type 2 diabetes | Supervised | Decision tree (J48); naïve Bayes classifier | 110 Blood metabolites; 1,035 women with gestational diabetes | [ |
| Endocrinology | Predictive models to assess risk of type 2 diabetes, hypertension and comorbidity: machine-learning algorithms and validation using national health data from Kuwait. A cohort study | Supervised | Logistic regression; k-nearest neighbors; support vector machines; multifactor dimensionality reduction | 13,647,408 Variables in medical records; 300,489 hospital visitors | [ |
| Oncology | Systematic analysis of breast cancer morphology uncovers stromal features associated with survival | Supervised | LASSO logistic regression | 6,642 Image features from H&E-stained histological images; two independent sets of patients with breast cancer: NKI (248 patients) and VGH (328 patients) | [ |
| Oncology | Development of a prognostic model for breast cancer survival in an open challenge environment | Supervised, unsupervised | Attractor metagenes analysis; generalized boosted regression; k-nearest neighbors | Clinical, survival information and 12 molecular features; 1,981 patients with breast cancer | [ |
| Oncology | Predicting non-small cell lung cancer prognosis by fully automated microscopic pathology image features | Supervised | Naïve Bayes classifiers; support vector machines; random forest | 9,879 Image features; 2,186 H&E-stained whole-slide histopathology images obtained from 515 lung adenocarcinoma patients and 502 lung squamous cell carcinoma patients | [ |
| Hematology | Prediction of allogeneic hematopoietic stem-cell transplantation mortality 100 days after transplantation using a machine learning algorithm: a European Group for Blood and Marrow Transplantation Acute Leukemia Working Party retrospective data mining study | Supervised | Alternating decision tree | 18 Clinical features; 28,236 adult hematopoietic stem cell transplantation recipients with acute leukemia | [ |
| Dermatology | Dermatologist-level classification of skin cancer with deep neural networks | Supervised | Deep convolutional neural network | 129,450 Clinical images of skin lesions labeled with 2,032 different skin diseases | [ |
LASSO, least absolute shrinkage and selection operator; NKI, Netherlands Cancer Institute; VGH, Vancouver General Hospital; H&E, hematoxylin and eosin.