Limor Appelbaum 1, José P Cambronero 2, Jennifer P Stevens 3, Steven Horng 4, Karla Pollick 5, George Silva 6, Sebastien Haneuse 7, Gail Piatkowski 8, Nordine Benhaga 9, Stacey Duey 10, Mary A Stevenson 11, Harvey Mamon 12, Irving D Kaplan 13, Martin C Rinard 14.
1. Beth Israel Deaconess Medical Center, Department of Radiation Oncology, 330 Brookline Ave, Boston, MA, 02215, USA. Electronic address: lappelb1@bidmc.harvard.edu.
2. Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory, 32 Vassar St, Cambridge, MA, 02139, USA. Electronic address: jcamsan@mit.edu.
3. Beth Israel Deaconess Medical Center, Center for Healthcare Delivery Science, 330 Brookline Ave, Boston, MA, 02215, USA. Electronic address: jpsteven@bidmc.harvard.edu.
4. Beth Israel Deaconess Medical Center, Division of Emergency Medicine Informatics, 330 Brookline Ave, Boston, MA, 02215, USA. Electronic address: shorng@bidmc.harvard.edu.
5. Beth Israel Deaconess Medical Center, Center for Healthcare Delivery Science, 330 Brookline Ave, Boston, MA, 02215, USA. Electronic address: kpollick@bidmc.harvard.edu.
6. Beth Israel Deaconess Medical Center, Center for Healthcare Delivery Science, 330 Brookline Ave, Boston, MA, 02215, USA. Electronic address: gssilva@bidmc.harvard.edu.
7. Harvard University, T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA, 02115, USA. Electronic address: shaneuse@hsph.harvard.edu.
8. Beth Israel Deaconess Medical Center, Center for Healthcare Delivery Science, 330 Brookline Ave, Boston, MA, 02215, USA. Electronic address: gpiatkow@bidmc.harvard.edu.
9. Beth Israel Deaconess Medical Center, Department of Radiation Oncology, 330 Brookline Ave, Boston, MA, 02215, USA. Electronic address: nbenhaga@bidmc.harvard.edu.
10. Brigham and Women's Hospital, Partners Research IS and Computing, Information Systems Department, 75 Francis Street, Boston, MA, 02115, USA. Electronic address: sduey@partners.org.
11. Beth Israel Deaconess Medical Center, Department of Radiation Oncology, 330 Brookline Ave, Boston, MA, 02215, USA. Electronic address: mstevens@bidmc.harvard.edu.
12. Dana Farber Cancer Institute/Radiation Oncology, Brigham and Women's Hospital, Harvard Medical School, 75 Francis Street, Boston, MA, 02115, USA. Electronic address: hmamon@bwh.harvard.edu.
13. Beth Israel Deaconess Medical Center, Department of Radiation Oncology, 330 Brookline Ave, Boston, MA, 02215, USA. Electronic address: ikaplan@bidmc.harvard.edu.
14. Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory, 32 Vassar St, Cambridge, MA, 02139, USA. Electronic address: rinard@csail.mit.edu.
Abstract
AIM: Pancreatic ductal adenocarcinoma (PDAC) is often diagnosed at a late, incurable stage. We sought to determine whether individuals at high risk of developing PDAC could be identified early using routinely collected data.

METHODS: Electronic health record (EHR) databases from two independent hospitals in Boston, Massachusetts, providing inpatient, outpatient, and emergency care, from 1979 through 2017, were used with case-control matching. PDAC cases were selected using International Classification of Diseases 9/10 codes and validated with tumour registries. A data-driven feature selection approach was used to develop neural networks and L2-regularised logistic regression (LR) models on training data (594 cases, 100,787 controls) and compared with a published model based on hand-selected diagnoses ('baseline'). Model performance was validated on an external database (408 cases, 160,185 controls). Three prediction lead times (180, 270 and 365 days) were considered.

RESULTS: The LR model had the best performance, with an area under the curve (AUC) of 0.71 (confidence interval [CI]: 0.67-0.76) for the training set, and AUC 0.68 (CI: 0.65-0.71) for the validation set, 365 days before diagnosis. Data-driven feature selection improved results over 'baseline' (AUC = 0.55; CI: 0.52-0.58). The LR model flags 2692 (CI: 2592-2791) of 156,485 as high risk, 365 days in advance, identifying 25 (CI: 16-36) cancer patients. Risk stratification showed that the high-risk group presented a cancer rate 3 to 5 times the prevalence in our data set.

CONCLUSION: A simple EHR model, based on diagnoses, can identify high-risk individuals for PDAC up to one year in advance. This inexpensive, systematic approach may serve as the first sieve for selection of individuals for PDAC screening programs.
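The screening arithmetic implied by the reported counts can be sketched as follows. This is an illustrative calculation, not the authors' analysis: it uses only the point estimates quoted in the abstract, and it assumes that the overall case fraction of the external validation database (408 cases, 160,185 controls) is an acceptable stand-in for background prevalence. The variable names are hypothetical.

```python
# Point estimates quoted in the abstract (LR model, 365-day lead time).
flagged = 2692        # individuals flagged as high risk
cohort = 156_485      # individuals scored at this lead time
true_cases = 25       # flagged individuals later diagnosed with PDAC

# External validation database as a whole. ASSUMPTION: its case fraction is
# used here only as a rough proxy for the background prevalence.
val_cases = 408
val_controls = 160_185

flag_rate = flagged / cohort                        # share of the cohort flagged
ppv = true_cases / flagged                          # cancer rate among flagged
number_needed_to_screen = flagged / true_cases      # flagged per case found
background_rate = val_cases / (val_cases + val_controls)
enrichment = ppv / background_rate                  # fold increase over background

print(f"flagged share of cohort: {flag_rate:.2%}")
print(f"cancer rate among flagged: {ppv:.2%}")
print(f"flagged individuals per case found: {number_needed_to_screen:.1f}")
print(f"enrichment over background: {enrichment:.2f}x")
```

Under these assumptions the enrichment falls inside the 3 to 5 times range reported for the high-risk group, although the denominators used here may differ from those in the authors' risk stratification.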