

Maximizing clinical cohort size using free text queries.

Adi V Gundlapalli, Doug Redd, Bryan Smith Gibson, Marjorie Carter, Chris Korhonen, Jonathan Nebeker, Matthew H Samore, Qing Zeng-Treitler

Abstract

BACKGROUND: Cohort identification is important in both population health management and research. In this project, we sought to assess the use of text queries for cohort identification. Specifically, we sought to determine the incremental value of unstructured data queries when added to structured queries for patient cohort identification.
METHODS: Three cohort identification tasks were evaluated: identification of individuals taking gingko biloba and warfarin simultaneously (Gingko/Warfarin), individuals who were overweight, and individuals with uncontrolled diabetes (UCD). We assessed the increase in cohort size when unstructured data queries were added to structured data queries. The positive predictive value of unstructured data queries was assessed by manual chart review of a random sample of 500 patients.
RESULTS: For Gingko/Warfarin, adding the text query increased the cohort size from 9 (identified by querying pharmacy data only) to 28,924. For the weight-related tasks, text search increased the cohort by 5-29% compared to the cohort identified by querying the vitals table. For the UCD task, text query increased the cohort size by 2-43% compared to the cohort identified by querying laboratory results or ICD codes. The positive predictive values for text searches were 52% for Gingko/Warfarin, 19-94% for the weight cohort, and 44% for UCD.
DISCUSSION: This project demonstrates the value and limitations of free text queries in patient cohort identification from large data sets. The clinical domain and the prevalence of the inclusion and exclusion criteria in the patient population influence the utility and yield of this approach.
Published by Elsevier Ltd.
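
The two quantities reported in the abstract, the increase in cohort size when text-query hits are added to structured-query hits and the positive predictive value (PPV) from chart review, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the patient identifiers, and the toy data are all hypothetical, and the sketch assumes each query returns a set of patient IDs and that chart review labels each sampled patient as confirmed or not.

```python
def incremental_gain(structured_ids, text_ids):
    """Return (combined cohort size, percent increase over the structured-only cohort).

    The percent increase is None when the structured-only cohort is empty.
    """
    structured = set(structured_ids)
    combined = structured | set(text_ids)  # union: patients found by either query
    added = len(combined) - len(structured)
    pct = 100.0 * added / len(structured) if structured else None
    return len(combined), pct


def positive_predictive_value(reviewed):
    """PPV = confirmed hits / reviewed hits.

    `reviewed` maps a patient ID to True if chart review confirmed the criterion.
    """
    if not reviewed:
        raise ValueError("no reviewed patients")
    return sum(reviewed.values()) / len(reviewed)


# Hypothetical toy data, not taken from the study.
structured_hits = {"p1", "p2", "p3", "p4"}   # e.g. from a structured table
text_hits = {"p3", "p4", "p5"}               # e.g. from a free text query

size, pct = incremental_gain(structured_hits, text_hits)

# Hypothetical chart-review outcome for a sample of text-query hits.
chart_review = {"p3": True, "p4": True, "p5": False, "p6": True}
ppv = positive_predictive_value(chart_review)

print(size, pct, ppv)
```

Taking the union rather than summing the two counts matters here, because patients found by both queries would otherwise be counted twice and inflate the apparent gain.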


Keywords: Clinical notes; Cohort identification; Diabetes; Gingko; Overweight; Structured data; Text query; Unstructured data; Warfarin

Substances: Plant Extracts; Warfarin

Year: 2015 PMID: 25747340 DOI: 10.1016/j.compbiomed.2015.01.008

Source DB: PubMed Journal: Comput Biol Med ISSN: 0010-4825 Impact factor: 4.589

Cited by (2 in total)

1. Regular Expression-Based Learning for METs Value Extraction.

Authors: Douglas Redd; Jinqiu Kuang; April Mohanty; Bruce E Bray; Qing Zeng-Treitler
Journal: AMIA Jt Summits Transl Sci Proc Date: 2016-07-20

2. FasTag: Automatic text classification of unstructured medical narratives.

Authors: Guhan Ram Venkataraman; Arturo Lopez Pineda; Oliver J Bear Don't Walk IV; Ashley M Zehnder; Sandeep Ayyar; Rodney L Page; Carlos D Bustamante; Manuel A Rivas
Journal: PLoS One Date: 2020-06-22 Impact factor: 3.240
