Mark Singh, Akansh Murthy, Shridhar Singh.
Abstract
BACKGROUND: The amount of incoming data into physicians' offices is increasing, thereby making it difficult to process information efficiently and accurately to maximize positive patient outcomes. Current manual processes of screening for individual terms within long free-text documents are tedious and error-prone. This paper explores the use of statistical methods and computer systems to assist clinical data management.
Keywords: Bayesian classifier; clinical reports; natural language processing; prioritization; radiology
Year: 2015 PMID: 25863643 PMCID: PMC4409648 DOI: 10.2196/medinform.3793
Source DB: PubMed Journal: JMIR Med Inform
Figure 1. Deidentified high-priority (top) and low-priority (bottom) patient reports in text format.
Distribution of the types of reports used in the test dataset.
| Type of report | Reports, n/N (%) |
| Mammograms | 35/354 (9.9) |
| CAT scans | 36/354 (10.2) |
| Plain radiology films | 71/354 (20.1) |
| Ultrasounds | 70/354 (19.8) |
| MRIs | 142/354 (40.1) |
Figure 2. System architecture.
Examples of common phrases used in the data cleaning process.
| Common phrases |
| “No significant abnormality is identified” |
| “No mammographic change or evidence of malignancy” |
| “No acute cardiopulmonary process” |
| “No acute pulmonary process” |
| “Within normal limits” |
| “Normal abdominal ultrasound” |
| “No acute intracranial process” |
| “Appropriate for age” |
| “Routine annual screening mammogram” |
| “No acute pathology” |
| “Correlation recommended” |
| “Biopsy should be performed” |
| “Surgical consultation is suggested” |
| “Appear significantly changed” |
Example of white space removal from common phrases.
| Common phrase before white space removal | Common phrase after white space removal |
| “within normal limits” | “withinnormallimits” |
| “Normal abdominal ultrasound” | “Normalabdominalultrasound” |
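The white space removal shown in the table can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `collapse_common_phrases`, the short `COMMON_PHRASES` list, and the choice to match case-insensitively are all assumptions made for the example.

```python
import re

# Hypothetical subset of the common phrases listed in the tables above.
COMMON_PHRASES = [
    "within normal limits",
    "Normal abdominal ultrasound",
    "No acute pathology",
]


def collapse_common_phrases(text, phrases=COMMON_PHRASES):
    """Replace each common phrase found in `text` with a single token
    formed by removing the white space from the phrase."""
    # Longest phrases first, so a shorter phrase cannot split a longer one.
    for phrase in sorted(phrases, key=len, reverse=True):
        words = phrase.split()
        token = "".join(words)
        # Tolerate any run of white space between the words of the phrase.
        pattern = r"\s+".join(re.escape(word) for word in words)
        # Assumption: matching ignores letter case; the token keeps the
        # casing of the phrase as written in the list.
        text = re.sub(pattern, lambda _m, t=token: t, text, flags=re.IGNORECASE)
    return text


print(collapse_common_phrases("The liver is within normal limits."))
```

Treating each phrase as one token means that a later word-level step sees "withinnormallimits" as a single feature rather than three unrelated words.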
Figure 3. Distribution of reports in each probability range. The x-axis represents probability and the y-axis represents number of reports from the test set.
Precision, recall, F-measure, and accuracy values for the classifier with Pp = 10%.
| Measure | 50% Pth | 80% Pth |
| 10% Pp of report being high-priority | TP 89, FP 22, TN 240, FN 3 | TP 88, FP 21, TN 241, FN 4 |
| Precision, % | 80.18 | 80.73 |
| Recall, % | 96.74 | 95.65 |
| F-measure, % | 87.66 | 87.56 |
| Accuracy, % | 92.94 | 92.94 |
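The four measures in the table follow from the confusion counts by the standard definitions. The sketch below recomputes them for the 80% Pth column; the function name `classification_metrics` is hypothetical, and rounding to two decimals is an assumption chosen to match the table's precision.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return precision, recall, F-measure, and accuracy as percentages."""
    precision = tp / (tp + fp)          # share of flagged reports that are truly high priority
    recall = tp / (tp + fn)             # share of high-priority reports that were flagged
    f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    values = {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }
    # Assumption: report percentages rounded to two decimal places.
    return {name: round(100 * value, 2) for name, value in values.items()}


# Counts from the 80% Pth column of the table above.
metrics = classification_metrics(tp=88, fp=21, tn=241, fn=4)
print(metrics)
```

With these counts the function returns 80.73, 95.65, 87.56, and 92.94, matching the 80% Pth column.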
Precision, recall, F-measure, and accuracy values for the classifier with Pp = 50%.
| Measure | 50% Pth | 80% Pth |
| 50% Pp of report being high-priority | TP 91, FP 25, TN 237, FN 1 | TP 91, FP 22, TN 240, FN 1 |
| Precision, % | 78.45 | 80.53 |
| Recall, % | 98.91 | 98.91 |
| F-measure, % | 87.50 | 88.78 |
| Accuracy, % | 92.66 | 93.50 |
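The two tables vary Pp and Pth, so it may help to see how such parameters typically interact in a Bayesian classifier. The sketch below assumes that Pp is the prior probability of a report being high-priority and that Pth is a threshold applied to the posterior probability; this reading is inferred from the table captions and is not confirmed by the text here. The function names and likelihood values are made up for illustration.

```python
def posterior_high_priority(likelihood_high, likelihood_low, prior_high):
    """Bayes' rule for two classes: P(high | report).

    likelihood_high: P(report | high priority), hypothetical input
    likelihood_low:  P(report | low priority), hypothetical input
    prior_high:      assumed meaning of Pp
    """
    numerator = likelihood_high * prior_high
    evidence = numerator + likelihood_low * (1.0 - prior_high)
    return numerator / evidence


def is_high_priority(likelihood_high, likelihood_low, prior_high, threshold):
    """Flag the report when the posterior reaches the threshold (assumed meaning of Pth)."""
    return posterior_high_priority(likelihood_high, likelihood_low, prior_high) >= threshold


# Illustrative likelihoods only; they are not taken from the study.
low_prior = posterior_high_priority(0.02, 0.005, prior_high=0.10)
even_prior = posterior_high_priority(0.02, 0.005, prior_high=0.50)
print(round(low_prior, 3), round(even_prior, 3))
```

Under this reading, the same report can fall below a 50% threshold with a 10% prior and above it with a 50% prior, which is one way the counts in the two tables could differ.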