Michael Graiser, Susan G Moore, Rochelle Victor, Ashley Hilliard, Leroy Hill, Michael S Keehan, Christopher R Flowers.
Abstract
BACKGROUND: Large linked databases (LLDBs) represent a novel resource for cancer outcomes research. However, accurately identifying a patient population of interest within an LLDB can be challenging. Our research group developed a fully integrated platform that combines independent legacy databases into a single cancer-focused LLDB system. We compared the sensitivity and specificity of several SQL-based query strategies for identifying a histologic lymphoma subtype in this LLDB to determine the most accurate legacy data source for identifying a specific cancer patient population.
Keywords: Large linked database; cancer epidemiology; cancer outcomes research; cancer registry
Year: 2007 PMID: 19455241 PMCID: PMC2675837
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
GeneSys SI source databases and start dates for source data.
| DATA WAREHOUSE | |
| Hospital administrative (HealthQuest) | 1995 |
| Clinic administrative (IDX) | 1994 |
| Medical Records | 1987 |
| Clinical Labs | 2001 |
| Hospital Pharmacy | 1998 |
| Clinic Pharmacy | 2002 |
| CANCER REGISTRY | |
| Emory Hospital | 1977 |
| Crawford Long Hospital | 1981 |
| CLINICAL TRIALS | 1981 |
| ELECTRONIC MEDICAL RECORD (Cerner PowerChart) | 1991 |
| RADIATION ONCOLOGY | |
| The Emory Clinic | 1994 |
| Crawford Long Hospital | 2001 |
| GENOMICS (data structures) | 2004 |
| FORMS (e.g. informed consent) | 2003 |
Figure 1. System architecture for the GeneSys SI oncology database application
Queries to identify follicular lymphoma cases within a linked legacy database.
| Query | Data source | Search criteria | Cases identified, n (% of 1538 reviewed) |
| QCR | Imported list from Cancer Registry | Follicular lymphoma by histology codes, 1985–2002 | 425 (28%) |
| Q1 | Cancer Registry, ICD-O morphology codes plus behavior code 3 (malignant, primary) | Morphology codes 9690, 9691, 9695, 9698 plus behavior code 3 | 242 (16%) |
| Q2 | Text search–pathology reports | ‘follicular’ NEAR ‘lymphoma’ | 406 (26%) |
| Q3 | Text search–pathology reports | ‘follicular lymphoma’ | 126 (8%) |
| Q4 | Text search–all medical record reports | ‘follicular’ NEAR ‘lymphoma’ | 531 (35%) |
| Q5 | Text search–all medical record reports | ‘follicular lymphoma’ | 193 (13%) |
| Q6 | ICD-9 codes–Emory Clinic | 202.0, 202.00, 202.01, 202.02, 202.03, 202.04, 202.05, 202.06, 202.07, 202.08 | 901 (59%) |
| Q7 | ICD-9 codes–Emory Hospitals | (same as Q6 above) | 288 (19%) |
| Q8 | Q2 + Q6 | (see criteria for Q2 and Q6 above) | 1137 (74%) |
| Q9 | Q4 + Q6 | (see criteria for Q4 and Q6 above) | 1233 (80%) |
| Q10 | Q1 + Q2 | (see criteria for Q1 and Q2 above) | 498 (32%) |
| Q11 | Text search–pathology reports | (UMLS terms–see | 36 (2%) |
| Q12 | Text search–all medical record reports | (UMLS terms–see | 121 (8%) |
| Total cases reviewed combining all queries | 1538 |
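The difference between the exact-phrase queries (Q3, Q5) and the proximity queries (Q2, Q4) can be illustrated with a minimal Python sketch. This is not the SQL used in the study: the function names, the 10-word window, the report snippets, and the patient IDs are all hypothetical. The reported counts (for example, 1137 for Q8 against 406 + 901 for Q2 and Q6) suggest that patients found by both component queries are counted once, so the combined queries are modeled here as a set union, which is also an assumption.

```python
import re

def has_phrase(text, phrase="follicular lymphoma"):
    """Exact-phrase match, in the spirit of Q3 and Q5."""
    return phrase in text.lower()

def has_near(text, a="follicular", b="lymphoma", window=10):
    """Proximity match, in the spirit of Q2 and Q4.

    The 10-word window is an assumption; the source does not state
    what distance its NEAR operator used.
    """
    words = re.findall(r"[a-z]+", text.lower())
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)

# Hypothetical report snippets keyed by patient ID, not study data.
reports = {
    101: "Follicular lymphoma involving an inguinal lymph node.",
    102: "Lymphoma, follicular, grade 2.",
    103: "Follicular hyperplasia; no evidence of lymphoma.",
}

phrase_hits = {pid for pid, txt in reports.items() if has_phrase(txt)}
near_hits = {pid for pid, txt in reports.items() if has_near(txt)}

# Combined query modeled as a union (assumed): a patient found by
# either component query is counted once.
icd9_hits = {102, 104}  # hypothetical patients with an ICD-9 202.0x code
combined_hits = near_hits | icd9_hits

print(sorted(phrase_hits))
print(sorted(near_hits))
print(sorted(combined_hits))
```

In this toy data the proximity search picks up the inverted wording in report 102 that the exact phrase misses, but it also returns report 103, which rules lymphoma out. That is the same trade-off the table shows between the broader and narrower text queries.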
Description of text queries based on UMLS synonyms for follicular lymphoma.
| “nodular lymphoma” | 75 (5%) |
| “Brill–Symmers” | 0 |
| “Brill–Symmers” | 0 |
| “reticulosarcoma–follicular” | 0 |
| “reticulosarcoma–nodular” | 0 |
| “follicular lymphosarcoma” | 0 |
| “giant follicular lymphoma” | 0 |
| “mal.lym,centr-blas/cyt,foll” | 0 |
| “malig.lymphoma, nodular” | 0 |
| “malig. lymphoma, nodular” | 0 |
| “lymphoma, nodular” | 65 (4%) |
| “nodular lymphosarcoma” | 0 |
| “follicle center lymphoma” | 13 (<1%) |
| “follicular non-Hodgkin” | 35 (2.3%) |
| “foll low grade B-cell lymphoma” | 0 |
| “germinoblastoma, follicular” | 0 |
| “lymphoma, follicle center” | 4 (<1%) |
| “malignant lymphoma, lymphocytic, nodular” | 0 |
| “lyoma,centrbl-centrcyt,foll” | 0 |
| “reticulosarcoma, follicular” | 0 |
| “reticulosarcoma, nodular” | 0 |
Sensitivity and specificity for linked database queries.
| Query | True positives | False positives | True negatives | False negatives | Sensitivity, pathology-verified (%) | Sensitivity, chart-verified (%) | Specificity, pathology-verified (%) | Specificity, chart-verified (%) |
| Q1 | 151 + | 23 + | 772 + | 149 + | 50.3 | 47.0 | 97.1 | 95.8 |
| Q2 | 269 + | 102 + | 693 + | 31 + | 89.7 | 69.4 | 87.2 | 89.5 |
| Q3 | 96 + | 21 + | 774 + | 204 + | 32.0 | 24.6 | 97.4 | 97.9 |
| Q4 | 279 + | 131 + | 664 + | 21 + | 93.0 | 90.0 | 83.5 | 85.9 |
| Q5 | 123 + | 28 + | 767 + | 177 + | 41.0 | 38.3 | 96.5 | 97.0 |
| Q6 | 143 + | 490 + | 305 + | 157 + | 47.7 | 42.9 | 38.4 | 35.6 |
| Q7 | 106 + | 101 + | 694 + | 194 + | 35.3 | 33.0 | 87.3 | 86.6 |
| Q8 | 280 + | 569 + | 226 + | 20 + | 93.3 | 77.8 | 28.4 | 27.5 |
| Q9 | 286 + | 591 + | 204 + | 14 + | 95.3 | 93.5 | 25.7 | 24.8 |
| Q10 | 285 + | 123 + | 672 + | 15 + | 95.0 | 81.2 | 84.5 | 85.7 |
| Q11 | 27 + | 8 + | 787 + | 273 + | 9.0 | 6.7 | 99.0 | 99.3 |
| Q12 | 54 + | 34 + | 761 + | 246 + | 18.0 | 20.0 | 95.7 | 96.6 |
Note: Each total has a pathology-verified component listed first followed by a chart-verified component in italics.
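The pathology-verified sensitivity and specificity values can be recomputed from the first four counts in each row, reading them as true positives, false positives, true negatives, and false negatives. That column order is an inference from the arithmetic (each row sums to 300 verified cases and 795 verified non-cases), not something the extracted table states. A minimal Python sketch, with hypothetical function names:

```python
def sensitivity(tp, fn):
    """Percentage of true cases that the query returned."""
    return 100.0 * tp / (tp + fn)

def specificity(tn, fp):
    """Percentage of non-cases that the query correctly left out."""
    return 100.0 * tn / (tn + fp)

# Pathology-verified counts copied from the table: (TP, FP, TN, FN).
counts = {
    "Q1": (151, 23, 772, 149),
    "Q2": (269, 102, 693, 31),
    "Q9": (286, 591, 204, 14),
}

results = {
    name: (round(sensitivity(tp, fn), 1), round(specificity(tn, fp), 1))
    for name, (tp, fp, tn, fn) in counts.items()
}

for name, (sens, spec) in results.items():
    print(f"{name}: sensitivity {sens}%, specificity {spec}%")
# Q1: sensitivity 50.3%, specificity 97.1%
# Q2: sensitivity 89.7%, specificity 87.2%
# Q9: sensitivity 95.3%, specificity 25.7%
```

These reproduce the first sensitivity and specificity value reported for each of the three queries.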
Figure 2. Receiver-operator plot of query strategies to identify a pathology-confirmed histologic diagnosis of follicular lymphoma
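A receiver-operator plot conventionally places each query at the point (1 − specificity, sensitivity). The sketch below derives those coordinates from the pathology-verified counts in the sensitivity and specificity table for two of the queries; whether the published figure used exactly these coordinates is an assumption, and the function name is hypothetical.

```python
def roc_point(tp, fp, tn, fn):
    """Return (false positive rate, true positive rate) for one query."""
    fpr = fp / (fp + tn)   # 1 - specificity
    tpr = tp / (tp + fn)   # sensitivity
    return fpr, tpr

# Pathology-verified counts from the table: (TP, FP, TN, FN).
queries = {
    "Q4": (279, 131, 664, 21),
    "Q6": (143, 490, 305, 157),
}

points = {name: roc_point(*c) for name, c in queries.items()}

for name, (fpr, tpr) in points.items():
    print(f"{name}: x = {fpr:.3f}, y = {tpr:.3f}")
```

Points nearer the upper-left corner (low x, high y) combine high sensitivity with high specificity; here Q4 lies much closer to that corner than Q6.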