Irena Spasić, Bo Zhao, Christopher B Jones, Kate Button.
Abstract
BACKGROUND: In the realm of knee pathology, magnetic resonance imaging (MRI) has the advantage of visualising all structures within the knee joint, which makes it a valuable tool for increasing diagnostic accuracy and planning surgical treatments. Therefore, clinical narratives found in MRI reports convey valuable diagnostic information. A range of studies have proven the feasibility of natural language processing for information extraction from clinical narratives. However, no study focused specifically on MRI reports in relation to knee pathology, possibly due to the complexity of knee anatomy and a wide range of conditions that may be associated with different anatomical entities. In this paper we describe KneeTex, an information extraction system that operates in this domain.
Year: 2015 PMID: 26347806 PMCID: PMC4561435 DOI: 10.1186/s13326-015-0033-1
Source DB: PubMed Journal: J Biomed Semantics
Fig. 1 Information extraction template represented by UML diagram. Each slot has the following properties: extracted text, concept identifier, preferred concept name, start position of extracted text and its length
An example of a filled template. Original text source: “There is a small undisplaced vertical radial tear of the posterior horn of the lateral meniscus.”
An example of a filled template. Original text source: “A peripheral tear involving the body of the lateral meniscus extending into the posterior third is seen.”
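The slot properties listed in the Fig. 1 caption can be pictured as a small data structure. The following is a minimal sketch, not the KneeTex implementation: the class and field names (`SlotFiller`, `Template`, `locate`) are hypothetical, the slot names are taken from the evaluation table, and the assignment of phrases to slots for the first example sentence is an assumption made for illustration. Concept identifiers and preferred names are left empty because the filled templates themselves are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SlotFiller:
    """One filled slot: extracted text plus its position and ontology link."""
    text: str                             # extracted text
    start: int                            # start position of the extracted text
    length: int                           # length of the extracted text
    concept_id: Optional[str] = None      # concept identifier (unknown here)
    preferred_name: Optional[str] = None  # preferred concept name (unknown here)


@dataclass
class Template:
    """Hypothetical template; single- vs multi-valued slots are an assumption."""
    finding: Optional[SlotFiller] = None
    finding_qualifiers: List[SlotFiller] = field(default_factory=list)
    negation: Optional[SlotFiller] = None
    certainty: Optional[SlotFiller] = None
    anatomy: Optional[SlotFiller] = None
    anatomy_qualifiers: List[SlotFiller] = field(default_factory=list)


def locate(sentence: str, phrase: str) -> SlotFiller:
    """Build a slot filler from the first occurrence of phrase in sentence."""
    start = sentence.find(phrase)
    if start < 0:
        raise ValueError(f"{phrase!r} not found in sentence")
    return SlotFiller(text=phrase, start=start, length=len(phrase))


sentence = ("There is a small undisplaced vertical radial tear of the "
            "posterior horn of the lateral meniscus.")

# Illustrative slot assignment (an assumption, not the published template).
template = Template(
    finding=locate(sentence, "tear"),
    finding_qualifiers=[locate(sentence, q)
                        for q in ("small", "undisplaced", "vertical", "radial")],
    anatomy=locate(sentence, "lateral meniscus"),
    anatomy_qualifiers=[locate(sentence, "posterior horn")],
)
```

Storing the start position and length alongside the text means each extracted phrase can be recovered from the source sentence by slicing, for example `sentence[f.start:f.start + f.length]`.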
Knee MRI report. A sample from the training dataset
Fig. 2 A manually annotated MRI report. A screenshot of the BRAT interface
Fig. 3 The strategies for rapid ontology expansion. Newly identified terminology is integrated into the existing ontology
Fig. 4 Distribution of UMLS concept mentions. MetaMap was used to automatically identify concept mentions in the training set
Fig. 5 Distribution of termhood. FlexiTerm was used to automatically identify multi-word terms in the training set
Fig. 6 MEDCIN terminology related to MRI of knee. UMLS terminology services were used to access relevant terminology
Fig. 7 RadLex terminology related to descriptions of radiology findings. BioPortal was used to access relevant terminology
Fig. 8 The system architecture diagram. TRAK ontology and all intermediate results are saved in a relational database to enable their integrative querying
An excerpt from the TRAK ontology. Exporting ontology vocabulary into PathNER’s dictionary format
Fig. 9 Subclassification of ligaments in the TRAK ontology. The hierarchy is based on the is-a relationship
Mapping between template slots and semantic types
| Slot | Semantic type | TRAK identifier | Example |
|---|---|---|---|
| Finding | Accident | TRAK:0000362 | Direct |
| | Clinical manifestation | TRAK:0000092 | There is some |
| | Modality-related characteristic | TRAK:0001447 | The ACL returns |
| | Morphologic descriptor | TRAK:0001456 | There is slight |
| | Normality descriptor | TRAK:0001467 | The articular cartilage is |
| | Pathological condition | TRAK:0000204 | There is a small |
| | Physical examination | TRAK:0000656 | Positive |
| | Physiological condition descriptor | TRAK:0001482 | No evidence of articular cartilage |
| | Surgery | TRAK:0000236 | Presumably this had been excised during the ACL |
| Finding qualifier | Clinical finding | TRAK:0000091 | |
| | Composition descriptor | TRAK:0001322 | Incidental note is made of a |
| | Distribution pattern | TRAK:0001441 | There is |
| | Orientation descriptor | TRAK:0001529 | This could represent a |
| | Quantity descriptor | TRAK:0001468 | There are also |
| | Size descriptor | TRAK:0001485 | There is a |
| | Sport | TRAK:0000323 | HISTORY |
| | Stage of healing descriptor | TRAK:0001502 | There is a |
| | Status descriptor | TRAK:0001478 | Focal area of |
| | Temporal descriptor | TRAK:0001488 | There is |
| Certainty | Certainty descriptor | TRAK:0001422 | This raises the |
| | Visibility descriptor | TRAK:0001495 | Normal |
| Anatomy | Anatomical entity | TRAK:0001337 | The |
| Anatomy qualifier | Anatomical location descriptor | TRAK:0001561 | There is some oedema |
| | General anatomical term | TRAK:0001581 | There is a lot of oedema in the ACL |
| | Meniscus zone | TRAK:0001345 | Complex tear of |
Text in a bold typeset represents instances of a given type
Fig. 10 Template filling algorithm. One template is filled for a text segment that contains a single finding. In line 17, a finding is self-contained if it does not require anatomical localisation because the location is implied by the finding itself. For example, Osgood-Schlatter disease is defined as a traction apophysitis of the anterior tibial tubercle. To determine whether an extracted finding is self-contained, it is compared against a predefined list
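The self-containment check described in the Fig. 10 caption amounts to a lookup in a predefined list. A minimal sketch follows, under stated assumptions: the function names and the normalisation step are hypothetical, and the list holds only the one example the caption names, since the actual list used by KneeTex is not given here.

```python
# Hypothetical predefined list; only the example from the caption is known.
SELF_CONTAINED_FINDINGS = {"osgood-schlatter disease"}


def normalise(finding: str) -> str:
    """Lower-case and collapse whitespace so lookups ignore surface variation."""
    return " ".join(finding.lower().split())


def is_self_contained(finding: str) -> bool:
    """True if the finding already implies its anatomical location."""
    return normalise(finding) in SELF_CONTAINED_FINDINGS


def needs_anatomy(finding: str) -> bool:
    """True if the template still requires an anatomy slot for this finding."""
    return not is_self_contained(finding)
```

With this sketch, `needs_anatomy("tear")` is true, so the template filler would go on to look for an anatomical entity, whereas `needs_anatomy("Osgood-Schlatter disease")` is false.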
Fig. 11 Template filling algorithm. Two templates are filled for a text segment that contains two findings
Fig. 12 Marginal distribution of annotations across the slots. “Not available” indicates a missing annotation
Fig. 13 Distribution of annotations in the gold standard. Extracted text and corresponding ontology concepts
Evaluation results. Performance of the system on the test set
| Slot | TP | FP | FN | P | R | F |
|---|---|---|---|---|---|---|
| Finding | 1251 | 5 | 3 | 99.60 % | 99.76 % | 99.68 % |
| Finding qualifier | 636 | 19 | 15 | 97.10 % | 97.70 % | 97.40 % |
| Negation | 91 | 1 | 4 | 98.91 % | 95.79 % | 97.33 % |
| Certainty | 232 | 8 | 2 | 96.67 % | 99.15 % | 97.89 % |
| Anatomy | 1313 | 30 | 38 | 97.77 % | 97.19 % | 97.48 % |
| Anatomy qualifier | 439 | 18 | 34 | 96.06 % | 92.81 % | 94.41 % |
| Overall | 3962 | 81 | 96 | 98.00 % | 97.63 % | 97.81 % |
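The P, R and F columns can be recomputed from the TP, FP and FN counts. The sketch below assumes the standard definitions (precision = TP / (TP + FP), recall = TP / (TP + FN), F = harmonic mean of the two), which are not spelled out here, and assumes that the Overall row pools the per-slot counts, which is consistent with its counts being the column sums. The function name `prf` is hypothetical.

```python
def prf(tp: int, fp: int, fn: int) -> tuple:
    """Return (precision, recall, F-measure) as percentages."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return 100 * precision, 100 * recall, 100 * f_measure


# (TP, FP, FN) per slot, copied from the evaluation table.
counts = {
    "Finding": (1251, 5, 3),
    "Finding qualifier": (636, 19, 15),
    "Negation": (91, 1, 4),
    "Certainty": (232, 8, 2),
    "Anatomy": (1313, 30, 38),
    "Anatomy qualifier": (439, 18, 34),
}

# Pool the counts across slots (micro-averaging) for the overall row.
overall = tuple(sum(c[i] for c in counts.values()) for i in range(3))

for slot, (tp, fp, fn) in {**counts, "Overall": overall}.items():
    p, r, f = prf(tp, fp, fn)
    print(f"{slot:18s} P={p:6.2f} %  R={r:6.2f} %  F={f:6.2f} %")
```

The recomputed values should agree with the tabulated percentages up to rounding in the last digit.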
Fig. 14 Stage-wise experiments. A total of ten concepts were incrementally removed from the ontology