Anis Sharafoddini, Joel A Dubin, Joon Lee.
Abstract
BACKGROUND: Physicians and health policy makers are required to make predictions during their decision making in various medical problems. Many advances have been made in predictive modeling toward outcome prediction, but these innovations target an average patient and are insufficiently adjustable for individual patients. One developing idea in this field is individualized predictive analytics based on patient similarity. The goal of this approach is to identify patients who are similar to an index patient and derive insights from the records of similar patients to provide personalized predictions.
Keywords: data-driven prediction; electronic health records; health data; medical records; patient similarity; personalized medicine; predictive modeling; review
Year: 2017 PMID: 28258046 PMCID: PMC5357318 DOI: 10.2196/medinform.6730
Source DB: PubMed Journal: JMIR Med Inform
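The patient-similarity idea described in the abstract (find the patients most similar to an index patient, then predict from their records) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the cosine similarity measure, the majority vote over the `k` most similar patients, and the function names are not taken from any particular reviewed study.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)


def predict_from_similar_patients(index_patient, cohort, outcomes, k=3):
    """Predict an outcome for the index patient by majority vote among
    the k most similar patients in the cohort (hypothetical helper)."""
    scored = [
        (cosine_similarity(index_patient, features), outcome)
        for features, outcome in zip(cohort, outcomes)
    ]
    # Most similar patients first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(outcome for _, outcome in scored[:k])
    return votes.most_common(1)[0][0]


# Toy data: two made-up predictors per patient, binary outcome.
toy_cohort = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9], [1.0, 0.0]]
toy_outcomes = [1, 1, 0, 0, 1]
prediction = predict_from_similar_patients([0.95, 0.1], toy_cohort, toy_outcomes)
```

In practice the similarity measure, the number of similar patients, and the model fitted on them all vary between studies; this sketch fixes one simple choice for each.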
Figure 1. Flow diagram of article selection procedure.
Summary of the reviewed articles in terms of data type, data origin, number of predictors, and number of instances (N=22).
| Authors | Data type | Data origin^a | Predictors, n^b | Instances, n^c |
| Jurisica et al | Cross-sectional | NR | 55 | 788 |
| Bobrowski | Cross-sectional | The Gastroenterological Clinic of the Institute of Food and Feeding in Warsaw | 40 | 511 |
| Park et al | Cross-sectional | UCI repository | 35 | 350 |
| | Cross-sectional | UCI repository | 13 | 270 |
| | Cross-sectional | UCI repository | 31 | 560 |
| | Cross-sectional | UCI repository | 8 | 760 |
| | Cross-sectional | UCI repository | 7 | 340 |
| Saeed et al | Longitudinal | MIMIC-II | 50 | 377 |
| Chattopadhyay et al | Cross-sectional | Hospital-history of suicidal attempts and committed suicides collected from hospital records | 15 | 50 |
| Sun et al | Longitudinal | MIMIC-II | 50 | 74 |
| Sun et al | Longitudinal | MIMIC-II | 10 | 1500 |
| David et al | Cross-sectional | Laboratory results generated by two Beckman-Coulter Gen-S analyzers at an acute care facility in Brooklyn | NR | 4900 |
| Houeland | Cross-sectional | A dataset focused on palliative care for cancer patients | 55 | 1486 |
| Wang et al | Cross-sectional | UCI repository | 31 | 560 |
| | Cross-sectional | UCI repository | 8 | 760 |
| | Cross-sectional | A real-world EHR data warehouse of a health network consisting of data from 135K patients over a year | NR | 135K |
| Wang et al | Cross-sectional | A real-world EHR data warehouse of a health network consisting of data from 135K patients over a year | 2388 | 3946 |
| Campillo-Gimenez et al | Cross-sectional | French Renal Epidemiology and Information Network (REIN) registry | 19 | 1137 |
| Gottlieb et al | Cross-sectional and longitudinal | Hospital dataset-Stanford Medical Center, USA | 16 | 9974 |
| | Cross-sectional and longitudinal | Hospital dataset-Rabin Medical Center, Israel | 16 | 5513 |
| Lowsky et al | Cross-sectional | A dataset by the United States Renal Data System (USRDS) consisting of all kidney transplant procedures from 1969 to 1999 | 13 | 51,088 |
| Hielscher et al | Cross-sectional | The Study of Health in Pomerania (SHIP) | 65/57 | 578 |
| Zhang et al | Longitudinal | A 3-year longitudinal EHR data of 110,157 patients | NR | 1219 |
| Henriques et al | Longitudinal | myHeart home telemonitoring study | NR | 41 |
| Lee et al | Cross-sectional and longitudinal | MIMIC-II | 76 | 17,152 |
| Ng et al | Cross-sectional and longitudinal | A longitudinal medical claims database consisting of data from over 300,000 patients during four years | 8500 | 15,038 |
| Panahiazar et al | Cross-sectional | The Mayo Clinic | 33 | 1386 |
| Wang | Cross-sectional | UCI repository | 31 | 560 |
| | Cross-sectional | UCI repository | 8 | 760 |
| | Cross-sectional | A real-world EHR data warehouse | NR | 135K |
| Wang et al | Cross-sectional | A real-world EHR data warehouse | 127 | 3946 |
^a NR: not reported.
^b Predictors: the total number of predictors.
^c Instances: the total number of data points used in each study, including training and test data.
Summary of reviewed articles in terms of outcome, evaluation metrics, and comparing methods (N=22).
| Authors | Outcome^a | Evaluation metrics^b | Compared against^c |
| Jurisica et al | Suggesting hormonal therapy (day of human chorionic gonadotrophin administration and the number of ampoules of human menopausal gonadotrophin) after in vitro fertilization and predicting pregnancy outcome (pregnancy, abortion, ectopic pregnancy, and ovarian hyperstimulation syndrome) | Accuracy | NR |
| Bobrowski | Four types of liver disease (cirrhosis hepatis biliaris primaria, cirrhosis hepatis decompensata, hepatitis chronica activa, and hepatitis chronica steatosis) | Accuracy | Classic |
| Park et al | (1) Six types of dermatology diseases (psoriasis, seborrheic dermatitis, lichen planus, pityriasis rosea, chronic dermatitis, pityriasis rubra pilaris); (2) diagnosis of heart disease (angiographic disease status); (3) diagnosis of a breast tumor as malignant or benign; (4) diagnosis of diabetes; (5) diagnosis of liver disorder | Accuracy; sensitivity; specificity | LR; C5.0; CART; neural network; conventional CBR ( |
| Saeed et al | Hemodynamic stability or instability of an episode | Sensitivity; positive predictive value | NR |
| Chattopadhyay et al | Suicidal risk levels (level 1: suicidal plans or thoughts; level 2: single suicidal attempt; level 3: multiple suicidal attempts) | NR | NR |
| Sun et al | Occurrence of acute hypotensive episode within the forecast window of an hour | Accuracy | Human expert’s idea based on the Euclidean [ |
| Sun et al | Occurrence of acute hypotensive episode within the forecast window of an hour | Accuracy | Human expert’s idea based on the Euclidean [ |
| David et al | Seven disease diagnoses (microcytic anemia, normocytic anemia, mild SIRS, thrombocytopenia, leukocytopenia, moderate/severe SIRS, normal) | Accuracy | Human expert’s idea |
| Houeland | Pain levels | Error rate (1 - accuracy) | Random retrieval; |
| Wang et al | (1) Diagnosis of a breast tumor as malignant or benign; (2) diagnosis of diabetes; (3) diagnosis of dementia without complications (HCC352) or diabetes with no or unspecified complications (HCC019) | Accuracy; sensitivity; precision; F-measure | PCA; LDA [ |
| Wang et al | Diagnosis of CHF 6 months later | Accuracy; sensitivity; precision; F-measure | LLE; LE; PCA; Euclidean distance |
| Campillo-Gimenez et al | Registration on the renal transplant waiting list: yes/no | ROC curve | |
| Gottlieb et al | Patient discharge diagnosis | ROC curve; F-measure | NR |
| Lowsky et al | Graft survival probability | IPEC | Cox model; RSF [ |
| Hielscher et al | Three levels of liver fat concentration measured by magnetic resonance tomography: (1) fat concentration <10%; (2) fat concentration of 10%-25%; (3) fat concentration ≥25% | Accuracy; sensitivity; specificity | Multiple variants of the |
| Zhang et al | Four effective drugs for hypercholesterolemia treatment: atorvastatin, lovastatin, pravastatin, and simvastatin | ROC curve | Patient similarity; patient similarity with drug structure similarity; patient similarity with drug target similarity |
| Henriques et al | Early detection of heart failure: decompensation or normal condition | Sensitivity; specificity; F-measure; G-measure | Coefficients’ distance; linear correlation of signals; Euclidean distance |
| Lee et al | 30-day in-hospital mortality | Area under ROC curve; area under precision-recall curve | Population-based and personalized versions of: majority vote; LR; DT |
| Ng et al | The risk of diabetes disease onset | ROC curve | Global LR; |
| Panahiazar et al | Medication plans for heart-failure patients (angiotensin-converting enzyme, angiotensin receptor blockers, β-adrenoceptor antagonists, statins, and calcium channel blocker) | Sensitivity; specificity; F-measure; accuracy | K-means; hierarchical clustering |
| Wang | (1) Diagnosis of a breast tumor as malignant or benign; (2) diagnosis of diabetes; (3) occurrence of CHF within 6 months | Precision; F-measure; sensitivity; accuracy | |
| Wang et al | Occurrence of CHF within 6 months | Precision; F-measure; sensitivity; accuracy | PCA; Laplacian regularized metric learning [ |
^a CHF: congestive heart failure; ICD: International Classification of Diseases.
^b IPEC: integrated prediction error curve; NR: not reported; ROC: receiver operating characteristic; SIRS: systemic inflammatory response syndrome.
^c CART: classification and regression tree; CBR: case-based reasoning; DT: decision tree; k-NN: k-nearest neighbor; kd-tree: k-dimensional tree; LDA: linear discriminant analysis; LE: Laplacian embedding; LLE: locally linear embedding; LR: logistic regression; LSDA: locality sensitive discriminant analysis; LSML: locally supervised metric learning; LSR: local spline regression; NR: not reported; PCA: principal component analysis; RSF: random survival forest.
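Several of the evaluation metrics listed in the summary above (accuracy, sensitivity, specificity, precision, and F-measure) are conventionally derived from the four cells of a binary confusion matrix. The following is a minimal sketch using those standard definitions; the function names are illustrative assumptions, and the reviewed studies may have computed or averaged their metrics differently, particularly for multiclass outcomes.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary outcome."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred, positive=1):
    """Return standard binary classification metrics as a dict."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    sensitivity = ratio(tp, tp + fn)   # also called recall
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)     # also called positive predictive value
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f_measure = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


# Toy labels, made up for illustration only.
example = classification_metrics([1, 1, 1, 0, 0, 0, 0, 1],
                                 [1, 1, 0, 0, 0, 1, 0, 1])
```

Note that the error rate reported by one study is simply 1 - accuracy under these definitions.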
Figure 2. Distribution of publications by year.
Figure 3. Focused application areas of studies. Some studies featured more than a single application area and were counted more than once.