Jean Coquet (1), Selen Bozkurt (2), Kathleen M Kan (3), Michelle K Ferrari (3), Douglas W Blayney (4), James D Brooks (5), Tina Hernandez-Boussard (6).
(1) Department of Medicine, Stanford University, Stanford, CA, USA.
(2) Department of Medicine, Stanford University, Stanford, CA, USA; Department of Biomedical Data Science, Stanford University, Stanford, USA.
(3) Department of Urology, Stanford University School of Medicine, Stanford, USA.
(4) Department of Medicine, Stanford University, Stanford, CA, USA; Stanford Cancer Institute, Stanford University School of Medicine, Stanford, USA.
(5) Department of Urology, Stanford University School of Medicine, Stanford, USA; Stanford Cancer Institute, Stanford University School of Medicine, Stanford, USA.
(6) Department of Medicine, Stanford University, Stanford, CA, USA; Department of Biomedical Data Science, Stanford University, Stanford, USA; Department of Surgery, Stanford University School of Medicine, Stanford, USA. Electronic address: boussard@stanford.edu.
Abstract
OBJECTIVE: Clinical care guidelines recommend that newly diagnosed prostate cancer patients at high risk for metastatic spread receive a bone scan prior to treatment and that low-risk patients not receive one. The objective was to develop an automated pipeline that interrogates heterogeneous data to evaluate the use of bone scans using two different Natural Language Processing (NLP) approaches.
MATERIALS AND METHODS: Our cohort was divided into risk groups based on Electronic Health Records (EHR). Information on bone scan utilization was identified in both structured data and free text from clinical notes. Our pipeline annotated sentences with a combination of a rule-based method using the ConText algorithm (a generalization of NegEx) and a Convolutional Neural Network (CNN) method using word2vec to produce word embeddings.
RESULTS: A total of 5500 patients and 369,764 notes were included in the study. A total of 39% of patients were high-risk, and 73% of these received a bone scan; of the 18% of patients who were low-risk, 10% received one. The CNN model outperformed the rule-based model (F-measure = 0.918 and 0.897, respectively). We demonstrate that a combination of both models could maximize either precision or recall, depending on the study question.
CONCLUSION: Using structured data, we accurately classified patients' cancer risk group, identified bone scan documentation with two NLP methods, and evaluated guideline adherence. Our pipeline can be used to provide concrete feedback to clinicians and guide treatment decisions.
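The abstract does not say how the two models were combined. One common way to trade precision against recall with two binary classifiers is to require agreement (logical AND, which tends to favor precision) or to accept either positive (logical OR, which tends to favor recall). The sketch below illustrates that idea only; the function names, the combination rule, and the toy labels are all hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def combine(rule_based: List[bool], cnn: List[bool], mode: str) -> List[bool]:
    """Combine two per-sentence predictions (hypothetical combination rule)."""
    if mode == "and":   # both models must agree: fewer false positives
        return [a and b for a, b in zip(rule_based, cnn)]
    if mode == "or":    # either model suffices: fewer false negatives
        return [a or b for a, b in zip(rule_based, cnn)]
    raise ValueError(f"unknown mode: {mode!r}")


def precision_recall_f1(gold: List[bool], pred: List[bool]) -> Tuple[float, float, float]:
    """Compute precision, recall, and F-measure for binary labels."""
    tp = sum(g and p for g, p in zip(gold, pred))
    fp = sum((not g) and p for g, p in zip(gold, pred))
    fn = sum(g and (not p) for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy data, invented for illustration: True means "bone scan documented".
gold = [True, True, True, True, False, False, False, False]
rule = [True, True, False, False, True, False, False, False]
cnn = [True, True, True, False, False, True, False, False]

and_scores = precision_recall_f1(gold, combine(rule, cnn, "and"))
or_scores = precision_recall_f1(gold, combine(rule, cnn, "or"))
print("AND (precision, recall, F1):", and_scores)
print("OR  (precision, recall, F1):", or_scores)
```

On this toy data the AND combination gives the higher precision and the OR combination the higher recall, which is the kind of trade-off the abstract describes choosing between based on the study question.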