Keejun Han, Hyoeun Shim, Mun Y Yi.
Abstract
Clinical decision support (CDS) search retrieves key medical literature that can assist medical experts by offering information relevant to the case at hand. In this paper, we present a novel CDS search framework designed for passage retrieval from biomedical textbooks to support clinical decision-making using laboratory test results. The framework exploits two characteristics of the textual reports derived from the test results: syntax variation and negation information. The proposed framework consists of three components: a domain ontology, an index repository, and a query-processing engine. We first created a domain ontology to resolve syntax variation by applying the ontology to detect medical concepts from the test results with language translation. We then preprocessed and indexed biomedical textbooks recommended by clinicians for passage retrieval. Finally, we built the query-processing engine tailored for CDS, including translation, concept detection, query expansion, pseudo-relevance feedback at the local and global levels, and ranking with differential weighting of negation information. To evaluate the effectiveness of the proposed framework, we followed the standard information retrieval evaluation procedure. An evaluation dataset was created, including 28,581 textual reports for 30 laboratory test results and 56,228 passages from widely used biomedical textbooks recommended by clinicians. Overall, our proposed passage retrieval framework, GPRF-NEG, outperforms the baseline by 36.2, 100.5, and 69.7 percent for MRR, R-precision, and Precision at 5, respectively.
Our study results indicate that the proposed CDS search framework, specifically designed for passage retrieval of biomedical literature, represents a practically viable choice for clinicians: it supports their decision-making by providing relevant passages extracted from the sources they prefer to consult, with improved performance.
Year: 2018 PMID: 30675333 PMCID: PMC6323463 DOI: 10.1155/2018/3943417
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Figure 1. Overall architecture of the framework: (a) domain ontology creation, (b) index processing, and (c) query processing.
Figure 2. Ontology classes.
An example of the domain ontology.
| Representative term | Class name | Korean synonyms | English synonyms |
|---|---|---|---|
| Platelet | Test | 혈소판 | PFA, blood disk, PFT, platelet aggregation, platelet function, thrombocyte |
| Acute leukaemia | Disease | 백혈병, 급성 백혈병 | Leukaemia acute, leukaemia, acute leukaemia |
| Inflammation | Category | 염증, 염증 관련 | Inflammatory, phlogistic |
| Serum | Specimen | 세럼, 혈청, 면역 혈청 | Blood serum, sera, serum (blood), serum sickness-like reaction |
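As a rough illustration of how an ontology like this can resolve syntax variation, the sketch below maps a few Korean and English synonyms from the table to their representative term. The dictionary layout and the `detect_concepts` helper are assumptions made for illustration, not the authors' implementation, and plain substring matching stands in for whatever matching the framework actually performs.

```python
# Hypothetical, simplified ontology holding a few synonyms from the table above.
ONTOLOGY = {
    "Platelet": {"class": "Test",
                 "synonyms": ["혈소판", "thrombocyte", "platelet function"]},
    "Acute leukaemia": {"class": "Disease",
                        "synonyms": ["급성 백혈병", "백혈병", "leukaemia"]},
    "Inflammation": {"class": "Category",
                     "synonyms": ["염증", "inflammatory", "phlogistic"]},
    "Serum": {"class": "Specimen",
              "synonyms": ["혈청", "세럼", "blood serum", "sera"]},
}


def detect_concepts(text):
    """Return sorted (representative term, class) pairs found in the text.

    A concept is reported when its representative term or any synonym
    occurs as a case-insensitive substring of the text.
    """
    lowered = text.lower()
    found = set()
    for term, entry in ONTOLOGY.items():
        surface_forms = [term] + entry["synonyms"]
        if any(form.lower() in lowered for form in surface_forms):
            found.add((term, entry["class"]))
    return sorted(found)


print(detect_concepts("혈소판 수치 정상"))
print(detect_concepts("Thrombocyte count in blood serum"))
```

Substring matching is crude (a short synonym such as "sera" can match inside unrelated words), so a real system would likely tokenize first.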
Figure 3. Biomedical literature indexing process.
Descriptive statistics of the biomedical literature used for passage retrieval.
| | HCDMLM2017 | PGDT2003 | RCPBD2014 |
|---|---|---|---|
| Number of pages | 5,070 | 652 | 3,269 |
| Number of chapters | 77 | 9 | 29 |
| Number of images | 1,362 | — | 1,445 |
| Number of sentences | 109,381 | 10,231 | 55,369 |
| Number of paragraphs | 31,456 | 5,051 | 19,721 |
Opposite relation between the negation status and case status of example sentences reported in the laboratory test results (bold words in each sentence denote the concepts examined for negation status).
| Sentences | Case status | Negation status |
|---|---|---|
| | Normal | Negated |
| No | Normal | Negated |
| | Normal | Negated |
| | Abnormal | Affirmed |
| | Abnormal | Affirmed |
| | Abnormal | Affirmed |
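The opposite relation in this table can be sketched with a very small trigger-based negation check. The trigger list and both helper functions below are hypothetical and only illustrate the idea; the source does not describe the negation detector the framework actually uses.

```python
# Hypothetical trigger phrases; a real detector would use a curated lexicon.
NEGATION_TRIGGERS = ("no ", "not ", "without ", "negative for ")


def negation_status(sentence, concept):
    """Return 'Negated' if a trigger precedes the concept, else 'Affirmed'."""
    lowered = sentence.lower()
    position = lowered.find(concept.lower())
    if position < 0:
        raise ValueError("concept not found in sentence")
    preceding = lowered[:position]
    if any(trigger in preceding for trigger in NEGATION_TRIGGERS):
        return "Negated"
    return "Affirmed"


def case_status(negation):
    """Map negation status to case status using the opposite relation."""
    return "Normal" if negation == "Negated" else "Abnormal"


for sentence, concept in [("No anemia.", "anemia"),
                          ("Protein in urine is detected.", "protein")]:
    status = negation_status(sentence, concept)
    print(sentence, status, case_status(status))
```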
A list of queries.
| Case text (translated) | Type | The number of unique variants |
|---|---|---|
| Blood test result is normal. | Normal | 3,741 |
| No anemia. | Normal | 3,150 |
| Kidney function test result is normal | Normal | 2,623 |
| AST is normal. | Normal | 2,123 |
| Total cholesterol is normal. | Normal | 2,007 |
| Triglyceride is normal. | Normal | 1,982 |
| Syphilis test result is normal. | Normal | 1,856 |
| AFP (alpha-fetoprotein) is normal. | Normal | 1,009 |
| The concentration of uric acid is normal. | Normal | 951 |
| R-GTP is normal. | Normal | 932 |
| Fasting blood sugar level is normal. | Normal | 922 |
| HIV test result is negative. | Normal | 719 |
| Thyroid function test is normal. | Normal | 321 |
| CEA (cancer antigen) is normal | Normal | 320 |
| The rheumatoid factor (RF) is normal | Normal | 288 |
| Vitamin D (25OH-vitamin D) is deficit. | Abnormal | 826 |
| White blood cell in urine is positive. | Abnormal | 598 |
| Hemoglobin and red blood cells were detected in urine. | Abnormal | 467 |
| It corresponds to fasting blood sugar disorder. The Korean Diabetes Association has designated fasting blood sugar level 100–125 mg/dL as a fasting blood sugar disorder. | Abnormal | 438 |
| A ketone was detected in the urine. | Abnormal | 434 |
| Protein in urine is detected. | Abnormal | 430 |
| White blood cell is detected in urine. | Abnormal | 410 |
| Eosinophil has been increased. | Abnormal | 395 |
| High triglyceride. | Abnormal | 325 |
| High-density cholesterol (HDL cholesterol) has been reduced. | Abnormal | 293 |
| The uric acid concentration has increased. | Abnormal | 244 |
| Rheumatoid factor (RF) is positive. | Abnormal | 207 |
| Crystals have been found in the urine. | Abnormal | 203 |
| The high hemoglobin (HbA1c) suggests that blood glucose levels remained high for the past three to four months. | Abnormal | 191 |
| Bilirubin is high. | Abnormal | 176 |
Retrieval features used by compared retrieval models.
| Method | Concept detection: NLP | Concept detection: Domain ontology | Query expansion: Synonym | Query expansion: Local | Query expansion: Global | Ranking function: TF-IDF | Ranking function: BM25 | Ranking function: LM | Ranking function: Neg |
|---|---|---|---|---|---|---|---|---|---|
| Baseline | x | — | x | — | — | x | x | x | — |
| UMLSE | x | x | x | — | — | x | x | x | — |
| LPRF | x | x | x | x | — | x | x | x | — |
| GPRF | x | x | x | x | x | x | x | x | — |
| GPRF-NEG | x | x | x | x | x | x | x | x | x |
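For readers unfamiliar with pseudo-relevance feedback, the sketch below shows the generic idea: terms that occur often in the top-ranked passages of an initial search are appended to the query. It is a textbook-style simplification under stated assumptions. The `expand_query` helper, its parameters `m` and `n_terms`, and the whitespace tokenization are hypothetical, and the source does not specify how the framework's local and global feedback differ.

```python
from collections import Counter


def expand_query(query_terms, ranked_passages, m=3, n_terms=2):
    """Append the n_terms most frequent new terms from the top-m passages.

    query_terms is assumed to be a list of lowercase tokens, and
    ranked_passages a list of passage strings ordered by an initial search.
    """
    counts = Counter()
    for passage in ranked_passages[:m]:
        for token in passage.lower().split():
            if token not in query_terms:
                counts[token] += 1
    expansion = [term for term, _ in counts.most_common(n_terms)]
    return list(query_terms) + expansion


passages = ["uric acid gout", "gout uric crystals", "unrelated text"]
print(expand_query(["uric"], passages, m=2, n_terms=1))
```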
Figure 4. Points of precision for each method. (a) Precision from 1 to 10; (b) Precision from 10 to 50.
Evaluation results of the methods.
| Method | MRR | Gain over previous row | R-precision | Gain over previous row | P@5 | Gain over previous row |
|---|---|---|---|---|---|---|
| Baseline | 0.6811 | — | 0.3128 | — | 0.4777 | — |
| UMLSE | 0.7286‡ | +6.9% | 0.3351‡ | +7.13% | 0.4933‡ | +3.2% |
| LPRF | 0.7694‡ | +5.6% | 0.4479‡ | +33.6% | 0.5867‡ | +18.9% |
| GPRF | 0.8889‡ | +15.5% | 0.4549† | +1.5% | 0.7600‡ | +29.5% |
| GPRF-NEG | 0.9278‡ | +4.3% | 0.6273‡ | +37.9% | 0.8667‡ | +14.0% |
†A significant improvement (p < 0.01) over the baseline. ‡A significant improvement over baseline and methods marked with †.
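The three measures reported in this table follow standard information retrieval definitions, which can be sketched as below. The passage identifiers and relevance judgments in the usage lines are made up for illustration and are not taken from the evaluation dataset.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k retrieved items that are relevant."""
    return sum(1 for item in ranked[:k] if item in relevant) / k


def r_precision(ranked, relevant):
    """Precision at R, where R is the number of relevant items for the query."""
    r = len(relevant)
    return precision_at_k(ranked, relevant, r) if r else 0.0


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranked, relevant) pairs, one per query."""
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


# Hypothetical ranking and relevance judgments for a single query.
ranked = ["p3", "p1", "p7", "p2", "p9"]
relevant = {"p1", "p2"}
print(precision_at_k(ranked, relevant, 5))
print(r_precision(ranked, relevant))
print(mean_reciprocal_rank([(ranked, relevant), (["p1"], {"p1"})]))
```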
Figure 5. nDCG results from 1 to 50. (a) nDCG from 1 to 10; (b) nDCG from 10 to 50.
nDCG results of the methods.
| Method | nDCG@1 | Gain over previous row | nDCG@5 | Gain over previous row | nDCG@10 | Gain over previous row |
|---|---|---|---|---|---|---|
| Baseline | 0.6211 | — | 0.3738 | — | 0.2878 | — |
| UMLSE | 0.7001‡ | +12.72% | 0.4038‡ | +8.0% | 0.3118‡ | +8.3% |
| LPRF | 0.7444‡ | +6.3% | 0.4059† | +0.5% | 0.3235‡ | +3.7% |
| GPRF | 0.8444‡ | +13.4% | 0.4681‡ | +15.3% | 0.3563‡ | +10.14% |
| GPRF-NEG | 0.8778‡ | +3.9% | 0.5184‡ | +10.7% | 0.3916‡ | +9.9% |
†A significant improvement (p < 0.01) over the baseline. ‡A significant improvement over baseline and methods marked with †.
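As a reminder of what nDCG@k measures, here is a minimal sketch that assumes linear gains with a log2 rank discount; the source does not state which nDCG variant was used. For brevity the ideal ranking is derived from the retrieved list alone, whereas a full evaluation would build it from all judged passages.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain over the first k graded relevance values."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))


def ndcg_at_k(gains, k):
    """DCG@k divided by the DCG@k of the same gains sorted in descending order."""
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance grades of the top three retrieved passages.
print(ndcg_at_k([1, 0, 1], 3))
```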
Evaluation results by different ranking schemes on abnormal cases.
| Ranking scheme | nDCG@1 | Gain over previous row | nDCG@5 | Gain over previous row | nDCG@10 | Gain over previous row |
|---|---|---|---|---|---|---|
| TF-IDF | 0.8108 | — | 0.4418 | — | 0.3384 | — |
| BM25 | 0.8156 | +0.59% | 0.4410 | −0.18% | 0.3476† | +2.72% |
| LM | 0.8312† | +1.91% | 0.4569† | +3.61% | 0.3399 | −2.22% |
| Proposed without NEG | 0.8378† | +0.79% | 0.4583† | +0.31% | 0.3417† | +0.53% |
| Proposed with NEG | 0.8483‡ | +1.25% | 0.4691‡ | +2.36% | 0.3533† | +3.39% |
†A significant improvement (p < 0.01) over the baseline. ‡A significant improvement over baseline and methods marked with †.
Figure 6. Ten points of precision (a) and five points of nDCG (b) for the two best-performing methods (GPRF and GPRF-NEG) on normal cases (denoted as NL) and abnormal cases (denoted as ABN).
Figure 7. Effect of parameter values for GPRF-NEG in terms of P@1, P@3, and P@5. The best performance is achieved when (a) m = 35, (b) λ = 0.65, and (c) γ = 2.0.
Figure 8. Expanded laboratory diagnostic process including the proposed CDS search framework. (a) Clinical specimens are delivered to a clinical laboratory. (b) Clinical laboratory test results are sent. (c) Interactions between the diagnostic system and the CDS search framework occur. (d) The final decision is made based on the retrieved passages. (e) The patient report is prepared based on the clinical decision for patient care.
Figure 9. Interface snapshot for (a) the new version of the diagnostic system and (b) ontology management.
Figure 10. Interface snapshot for (a) a list of snippets capturing highlighted sentences for relevant passages with thumbnails, (b) a list of snippets capturing highlighted sentences from abstracts in PMC articles, (c) detailed page view, and (d) previous/next page view.