BACKGROUND: Previous studies identifying patients with inflammatory bowel disease using administrative codes have yielded inconsistent results. Our objective was to develop a robust electronic medical record-based model for classification of inflammatory bowel disease leveraging the combination of codified data and information from clinical text notes using natural language processing.

METHODS: Using the electronic medical records of 2 large academic centers, we created data marts for Crohn's disease (CD) and ulcerative colitis (UC) comprising patients with ≥1 International Classification of Diseases, 9th edition, code for each disease. We used codified (i.e., International Classification of Diseases, 9th edition codes, electronic prescriptions) and narrative data from clinical notes to develop our classification model. Model development and validation were performed in a training set of 600 randomly selected patients for each disease, with medical record review as the gold standard. Logistic regression with the adaptive LASSO penalty was used to select informative variables.

RESULTS: We confirmed 399 CD cases (67%) in the CD training set and 378 UC cases (63%) in the UC training set. For both, a combined model including narrative and codified data had better accuracy (area under the curve for CD 0.95; UC 0.94) than models using only disease International Classification of Diseases, 9th edition codes (area under the curve 0.89 for CD; 0.86 for UC). Addition of natural language processing narrative terms to our final model resulted in classification of 6% to 12% more subjects with the same accuracy.

CONCLUSIONS: Inclusion of narrative concepts identified using natural language processing improves the accuracy of electronic medical records case definition for CD and UC while simultaneously identifying more subjects compared with models using codified data alone.
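The results above compare models by area under the curve (AUC). As a minimal illustrative sketch, and not the authors' code, the AUC can be computed as the probability that a chart-confirmed case receives a higher model score than a non-case, with ties counted as one half. The function name `auc` and the toy labels and scores below are hypothetical and are not drawn from the study data.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for a confirmed case, 0 for a non-case (gold standard).
    scores: model-predicted scores, higher meaning more likely a case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # case ranked above non-case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six reviewed patients, three confirmed cases.
labels = [1, 1, 1, 0, 0, 0]
codified_only = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]  # assumed scores, codes only
combined = [0.9, 0.8, 0.6, 0.5, 0.3, 0.2]       # assumed scores, codes + NLP

auc_codified = auc(labels, codified_only)
auc_combined = auc(labels, combined)
```

In this toy example the combined scores separate cases from non-cases more cleanly than the codified-only scores, which is the kind of difference the reported AUC values summarize. The pairwise loop is quadratic in the number of patients; a rank-based formulation would be preferable for larger cohorts.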