Wuyang Dai (1), Theodora S Brisimi (1), William G Adams (2), Theofanie Mela (3), Venkatesh Saligrama (1), Ioannis Ch Paschalidis (4).
(1) Department of Electrical & Computer Engineering, and Division of Systems Engineering, Boston University, 8 Saint Mary's Street, Boston, MA 02215, United States.
(2) Department of Pediatrics, Boston University School of Medicine and Boston Medical Center, 88 East Concord Street, Boston, MA 02118, United States.
(3) Electrophysiology Lab/Arrhythmia Service, Massachusetts General Hospital, 55 Fruit Street, Boston, MA 02114, United States.
(4) Department of Electrical & Computer Engineering, and Division of Systems Engineering, Boston University, 8 Saint Mary's Street, Boston, MA 02215, United States. Electronic address: yannisp@bu.edu.
Abstract
BACKGROUND: In 2008, the United States spent $2.2 trillion on healthcare, or 15.5% of its GDP, and 31% of that expenditure was attributed to hospital care. Even modest reductions in hospital care costs therefore matter. A 2009 study showed that nearly $30.8 billion in hospital care costs during 2006 was potentially preventable, with heart diseases responsible for about 31% of that amount.

METHODS: Our goal is to accurately and efficiently predict heart-related hospitalizations from the available patient-specific medical history. To the best of our knowledge, the approaches we introduce are novel for this problem. We formulate the prediction of hospitalization as a supervised classification problem. We use de-identified Electronic Health Record (EHR) data from a large urban hospital in Boston to identify patients with heart diseases. Patients are labeled and randomly partitioned into a training set and a test set. We apply five machine learning algorithms: Support Vector Machines (SVM), AdaBoost using trees as the weak learner, logistic regression, a naïve Bayes event classifier, and a variation of a Likelihood Ratio Test adapted to the specific problem. Each model is trained on the training set and then evaluated on the test set.

RESULTS: All five models show consistent results, which may, to some extent, indicate the limit of the achievable prediction accuracy. With a false alarm rate under 30%, the detection rate can be as high as 82%. If used in practice, these accuracy rates would translate into considerable potential savings.
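The evaluation described above (a random train/test partition, then a detection rate reported at a bounded false alarm rate) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `test_fraction` value, and the toy labels and scores are all hypothetical, and the scores stand in for the output of any of the five classifiers. The sketch assumes both classes are present in the evaluated set.

```python
import random


def split_patients(records, test_fraction=0.4, seed=0):
    """Randomly partition records into (train, test) lists.

    test_fraction is an illustrative value, not taken from the study.
    """
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def rates(labels, scores, threshold):
    """Return (detection_rate, false_alarm_rate) when predicting
    'hospitalized' (label 1) for every score >= threshold."""
    pairs = list(zip(labels, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    return tp / pos, fp / neg


def best_detection_under(labels, scores, max_false_alarm=0.30):
    """Scan every observed score as a threshold and return the tuple
    (threshold, detection_rate, false_alarm_rate) with the highest
    detection rate whose false alarm rate stays within the cap."""
    best = None
    for t in sorted(set(scores)):
        det, fa = rates(labels, scores, t)
        if fa <= max_false_alarm and (best is None or det > best[1]):
            best = (t, det, fa)
    return best


# Toy test-set labels (1 = hospitalized) and classifier scores (hypothetical).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1]
operating_point = best_detection_under(labels, scores, max_false_alarm=0.30)
```

Sweeping the threshold in this way traces out the trade-off between detection rate and false alarm rate, which is how a statement such as "82% detection at under 30% false alarms" corresponds to a single operating point for a given classifier.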