Yizhao Ni1, Jordan Wright2, John Perentesis2, Todd Lingren3, Louise Deleger3, Megan Kaiser3, Isaac Kohane4, Imre Solti3,5.
Abstract
BACKGROUND: Manual eligibility screening (ES) for a clinical trial typically requires a labor-intensive review of patient records that utilizes many resources. Leveraging state-of-the-art natural language processing (NLP) and information extraction (IE) technologies, we sought to improve the efficiency of physician decision-making in clinical trial enrollment. In order to markedly reduce the pool of potential candidates for staff screening, we developed an automated ES algorithm to identify patients who meet core eligibility characteristics of an oncology clinical trial.
Year: 2015 PMID: 25881112 PMCID: PMC4407835 DOI: 10.1186/s12911-015-0149-3
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Figure 1. An example eligibility description (NCT01154816) derived from ClinicalTrials.gov.
Figure 2. Frequencies of the collected EHR fields (a) and descriptive statistics of the unstructured clinical notes (b). *A data entry is a piece of information (e.g. a diagnosis) documented during a patient’s visit. If a patient has the same diagnosis/ICD-9 code during multiple visits, the diagnosis/ICD-9 code is counted only once for that patient. **Tokens include words, numbers, symbols and punctuation in clinical narratives.
Figure 3. The architecture of the automated ES algorithm.
The performance of the demographics-based filter (baseline) and the EHR-based ES algorithms
| Algorithm | WL | CI | P | Sp | PV* |
|---|---|---|---|---|---|
| Demographics-based Filter | 163 | 149-179 | 1.9 | 24.3 | 8.30E-21 |
| DX/ICD-9 | 50 | 35-64 | 6.20 | 78.1 | 5.27E-4 |
| NOTE | 28 | 16-41 | 10.7 | 87.9 | 7.75E-2 |
| DX/ICD-9+NOTE | 24 | 14-35 | 12.6 | 89.9 | N/A |

| Algorithm | WL | CI | P | Sp | PV* | WL | CI | P | Sp | PV* |
|---|---|---|---|---|---|---|---|---|---|---|
| Demographics-based Filter | 42 | 40-43 | 3.2 | 25.5 | 1.7E-143 | 42 | 40-43 | 1.9 | 24.3 | 1.5E-39 |
| DX/ICD-9 | 8 | 6-10 | 16.8 | 87.8 | 2.36E-7 | 22 | 19-25 | 3.6 | 60.7 | 3.85E-7 |
| NOTE | 4 | 3-5 | 33.1 | 95.0 | 2.54E-2 | 20 | 17-23 | 3.9 | 64.9 | 2.54E-2 |
| DX/ICD-9+NOTE | 3 | 3-4 | 35.7 | 95.5 | N/A | 19 | 17-22 | 4.0 | 65.5 | N/A |
DX/ICD-9 indicates the ES algorithm using only structured diagnoses and ICD-9 codes; NOTE, the ES algorithm using only clinical notes; DX/ICD-9+NOTE, the ES algorithm using both structured data and clinical notes.
WL indicates workload; CI, confidence interval; P, precision; Sp, specificity; and PV, p-value.
*P-values were calculated by comparing the workload of DX/ICD-9+NOTE with that of each of the other algorithms.
N/A indicates that the performances between the two algorithms are identical and no p-value is returned.
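To make the workload figures easier to compare, the sketch below computes how much lower each algorithm's workload is relative to the demographics-based baseline. It uses only the WL values from the first set of rows in the table above; the helper name `relative_reduction` is hypothetical, and the calculation is a simple illustration rather than a statistic reported by the authors.

```python
# WL values copied from the first set of rows in the table above.
workload = {
    "Demographics-based Filter": 163,
    "DX/ICD-9": 50,
    "NOTE": 28,
    "DX/ICD-9+NOTE": 24,
}


def relative_reduction(baseline: float, value: float) -> float:
    """Return the fraction by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline


# The demographics-based filter is the baseline named in the table title.
baseline = workload["Demographics-based Filter"]
reductions = {
    name: relative_reduction(baseline, wl) for name, wl in workload.items()
}

for name, reduction in reductions.items():
    print(f"{name}: {reduction:.1%} lower workload than baseline")
```

Under this reading, the combined DX/ICD-9+NOTE algorithm lowers the workload by (163 - 24) / 163, roughly 85%, relative to the baseline.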
The precision of the ES algorithm against the historical enrollments and the list of eligible patients found by the oncologist
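| Trial | Historical enrollments | Patients in algorithm output | Output patients matching historical enrollments | Additional eligible patients found by the oncologist |
|---|---|---|---|---|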
| NCT00072384 | 1 | 2 | 1 | 0 |
| NCT00134030 | 9 | 18 | 9 | 2* |
| NCT00274937 | 1 | 2 | 1 | 0 |
| NCT00335556 | 2 | 4 | 0 | 0 |
| NCT00343694 | 3 | 6 | 1 | 1* |
| NCT00379340 | 1 | 2 | 0 | 0 |
| NCT00382109 | 2 | 4 | 0 | 0 |
| NCT00553202 | 6 | 12 | 4 | 1* |
| NCT00557193 | 1 | 2 | 1 | 0 |
| NCT01190930 | 12 | 24 | 12 | 1* |
| TOTAL | 38 | 76 | 29 | 5 |
| Precision | N/A | N/A | 0.38 | 0.45 |
*Indicates that more patients in the algorithm output were eligible for this trial than the number of historical enrollment decisions.
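The two precision values in the table can be reproduced from its totals. The sketch below assumes, as an inference from the arithmetic and not a statement in the table itself, that the four numeric columns are: historical enrollments, patients returned by the algorithm, returned patients who match a historical enrollment, and additional eligible patients identified by the oncologist. The field names are hypothetical.

```python
# Rows copied from the table above. Column meanings are an assumption
# inferred from the totals: (trial, enrolled, output, matched, extra).
rows = [
    ("NCT00072384", 1, 2, 1, 0),
    ("NCT00134030", 9, 18, 9, 2),
    ("NCT00274937", 1, 2, 1, 0),
    ("NCT00335556", 2, 4, 0, 0),
    ("NCT00343694", 3, 6, 1, 1),
    ("NCT00379340", 1, 2, 0, 0),
    ("NCT00382109", 2, 4, 0, 0),
    ("NCT00553202", 6, 12, 4, 1),
    ("NCT00557193", 1, 2, 1, 0),
    ("NCT01190930", 12, 24, 12, 1),
]

# Column-wise totals over the ten trials.
enrolled = sum(r[1] for r in rows)
output = sum(r[2] for r in rows)
matched = sum(r[3] for r in rows)
extra = sum(r[4] for r in rows)

# Precision against historical enrollments only.
precision_historical = matched / output
# Precision when the oncologist's additional eligible patients also count.
precision_oncologist = (matched + extra) / output

print(enrolled, output, matched, extra)
print(round(precision_historical, 2), round(precision_oncologist, 2))
```

With these assumptions the totals come out to 38, 76, 29 and 5, and the two ratios round to the 0.38 and 0.45 shown in the Precision row.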
The false positive errors made by the ES algorithm with the causes described by the oncologist
| Cause of false positive | Number of errors |
|---|---|
| Previously enrolled in the trial/therapy at a different institution | 3 |
| New diagnosis treated with standard of care therapy due to high likelihood of survival | 4 |
| Correct diagnosis but in a different stage of the disease (e.g. high risk versus low risk) | 5 |
| Correct diagnosis but incorrect relapse status (e.g. relapsed versus non-relapsed, remission 1 versus remission 2) | 5 |
| Wrong diagnosis, confusion between sub-categories of diseases (e.g. ALL versus AML, T cell versus Pre-B cell and different types of renal tumors) | 13 |
| Wrong diagnosis, other reasons | 12 |
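As a quick way to see which causes dominate, the sketch below totals the error counts in the table above and prints each cause's share. The shortened labels are paraphrases introduced here for illustration, and the grouping of the two "wrong diagnosis" rows is an assumption made for the summary, not a category defined in the table.

```python
# Counts copied from the table above; labels are shortened paraphrases.
false_positives = {
    "Previously enrolled elsewhere": 3,
    "Standard of care therapy chosen": 4,
    "Different disease stage": 5,
    "Incorrect relapse status": 5,
    "Wrong diagnosis (sub-category confusion)": 13,
    "Wrong diagnosis (other reasons)": 12,
}

total = sum(false_positives.values())

# Share of all false positives attributed to each cause.
shares = {cause: count / total for cause, count in false_positives.items()}

# Combine the two wrong-diagnosis rows (a grouping assumed for this summary).
wrong_diagnosis = sum(
    count
    for cause, count in false_positives.items()
    if cause.startswith("Wrong diagnosis")
)

for cause, share in shares.items():
    print(f"{cause}: {share:.1%}")
print(f"Total false positives: {total}")
print(f"Wrong diagnosis overall: {wrong_diagnosis / total:.1%}")
```

The counts sum to 42, and the two wrong-diagnosis rows together account for 25 of them.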