Antje Wulff1, Marcel Mast1, Marcus Hassler2, Sara Montag1, Michael Marschollek1, Thomas Jack3.
Abstract
BACKGROUND: Merging disparate and heterogeneous datasets from clinical routine into a standardized and semantically enriched format, so that the data can be used for multiple purposes, also means incorporating unstructured data such as medical free texts. Although the extraction of structured data from texts, known as natural language processing (NLP), has been researched extensively, at least for the English language, it is not enough to obtain structured output in an arbitrary format. NLP techniques need to be combined with clinical information standards such as openEHR so that data that are still unstructured can be reused and exchanged sensibly.
Year: 2020 PMID: 33058101 PMCID: PMC7725544 DOI: 10.1055/s-0040-1716403
Source DB: PubMed Journal: Methods Inf Med ISSN: 0026-1270 Impact factor: 2.176
Overview of the openEHR archetypes used for representing medical history data
| Concept name | Archetype ID | Internationally available? |
|---|---|---|
| Adverse reaction risk | EVALUATION.adverse_reaction_risk.v1 [1] | Yes (published) |
| Age | OBSERVATION.age.v0 [2] | Yes (draft) |
| Blood pressure | OBSERVATION.blood_pressure.v2 [3] | Yes (published) |
| Body temperature | OBSERVATION.body_temperature.v2 [4] | Yes (published) |
| Capillary refill | CLUSTER.capillary_refill_time.v0 [5] | Yes (draft) |
| Dosage | CLUSTER.dosage.v1 [6] | Yes (published) |
| Examination of abdomen | CLUSTER.exam_abdomen.v0 [7] | Yes (draft) |
| Examination of a pupil | CLUSTER.exam-pupil.v0 [8] | Yes (draft) |
| Examination of skin | CLUSTER.exam_skin.v0 [9] | Yes (draft) |
| Family history | EVALUATION.family_history.v2 [10] | Yes (published) |
| Food and nutrition summary | EVALUATION.nutrition_summary.v0 [11] | Yes (draft) |
| Gender | EVALUATION.gender.v1 [12] | Yes (published) |
| Laboratory test result | OBSERVATION.laboratory_test_result.v1 [13] | Yes (published) |
| Medication management | ACTION.medication.v1 [14] | Yes (draft) |
| Pediatric Glasgow Coma Scale (pGCS) | OBSERVATION.glasgow_coma_scale_pediatric.v0 [15] | Yes (draft) |
| Physical examination findings | OBSERVATION.exam.v1 [16] | Yes (published) |
| Problem/Diagnosis | EVALUATION.problem_diagnosis.v1 [17] | Yes (published) |
| Pulse/Heart beat | OBSERVATION.pulse.v2 [18] | Yes (published) |
| Pulse oximetry | OBSERVATION.pulse_oximetry.v1 [19] | Yes (published) |
| Report | COMPOSITION.report.v1 [20] | Yes (published) |
| Respiration | OBSERVATION.respiration.v2 [21] | Yes (published) |
| Story/History | OBSERVATION.story.v1 [22] | Yes (published) |
| Symptom/Sign | CLUSTER.symptom_sign.v1 [23] | Yes (published) |
| Patient admission | ADMIN_ENTRY.admission.v0 | No |

[1] https://ckm.openehr.org/ckm/#showArchetype_1013.1.1713
[2] https://ckm.openehr.org/ckm/#showArchetype_1013.1.3361
[3] https://ckm.openehr.org/ckm/#showArchetype_1013.1.3574
[4] https://ckm.openehr.org/ckm/#showArchetype_1013.1.2796
[5] https://ckm.openehr.org/ckm/#showArchetype_1013.1.3319
[6] https://ckm.openehr.org/ckm/#showArchetype_1013.1.2751
[7] https://ckm.openehr.org/ckm/#showArchetype_1013.1.219
[8] https://ckm.openehr.org/ckm/#showArchetype_1013.1.3882
[9] https://ckm.openehr.org/ckm/#showArchetype_1013.1.3933
[10] https://ckm.openehr.org/ckm/#showArchetype_1013.1.2469
[11] https://ckm.openehr.org/ckm/#showArchetype_1013.1.2755
[12] https://ckm.openehr.org/ckm/#showArchetype_1013.1.3715
[13] https://ckm.openehr.org/ckm/#showArchetype_1013.1.2191
[14] https://ckm.openehr.org/ckm/#showArchetype_1013.1.123
[15] https://ckm.openehr.org/ckm/#showArchetype_1013.1.4188
[16] https://ckm.openehr.org/ckm/#showArchetype_1013.1.271
[17] https://ckm.openehr.org/ckm/#showArchetype_1013.1.169
[18] https://ckm.openehr.org/ckm/#showArchetype_1013.1.4295
[19] https://ckm.openehr.org/ckm/#showArchetype_1013.1.3084
[20] https://ckm.openehr.org/ckm/#showArchetype_1013.1.677
[21] https://ckm.openehr.org/ckm/#showArchetype_1013.1.4218
[22] https://ckm.openehr.org/ckm/#showArchetype_1013.1.68
[23] https://ckm.openehr.org/ckm/#showArchetype_1013.1.195
Fig. 1 OpenEHR template for representing a pediatric medical history.
Fig. 2 Schematic representation of the developed workflow, including (1) the input module, (2) the marker concepts and regular expressions realized in the NLP pipeline module, (3) the process of mapping to (4) an archetype nested in the openEHR medical history template, which is stored in (5) the openEHR-based data repository. NLP, natural language processing.
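Step (2) of the workflow, marker concepts realized as regular expressions, can be illustrated with a minimal Python sketch. The patterns, the function name `extract_events`, and the dictionary keys are assumptions made for illustration; they are not the expressions used in the authors' pipeline. The example inputs and target values (P10Y, 39.7 Cel, 130 bpm) are taken from the AQL results table.

```python
import re

# Hypothetical marker patterns for three concepts; not the paper's actual expressions.
AGE_RE = re.compile(r"(\d+)\s*Jahre\s+alt", re.IGNORECASE)
TEMP_RE = re.compile(r"(\d+(?:[.,]\d+)?)\s*°\s*C")
HEART_RATE_RE = re.compile(r"Herzfrequenz\s+bei\s+(\d+)", re.IGNORECASE)


def extract_events(text):
    """Return one dict per marker concept found in a German free-text snippet."""
    events = []

    age = AGE_RE.search(text)
    if age:
        # Age is expressed as an ISO 8601 duration, e.g. "P10Y" for 10 years.
        events.append({"concept": "age", "snippet": age.group(0),
                       "value": f"P{int(age.group(1))}Y"})

    temp = TEMP_RE.search(text)
    if temp:
        # Accept a decimal comma as well as a decimal point.
        magnitude = float(temp.group(1).replace(",", "."))
        events.append({"concept": "body_temperature", "snippet": temp.group(0),
                       "value": magnitude, "unit": "Cel"})

    rate = HEART_RATE_RE.search(text)
    if rate:
        # Unit written as displayed in the results table.
        events.append({"concept": "pulse", "snippet": rate.group(0),
                       "value": int(rate.group(1)), "unit": "bpm"})

    return events


if __name__ == "__main__":
    sample = "10 Jahre alt, 39.7°C Körpertemperatur, Herzfrequenz bei 130"
    for event in extract_events(sample):
        print(event)
```

Each pattern captures only the first occurrence in the text; a real pipeline would also need to handle negation, repeated mentions, and spelling variants.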
Fig. 3 Snippet from the Java code for mapping an extracted information snippet to unique archetype paths (mapping and integration module), including (1) iterating over all defined rules until a suitable rule fires, which then (2) enables the instantiation of a new age observation by filling the associated archetype paths with the extracted information delivered in the eventObject.
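The rule-firing logic described in this caption can be sketched as follows. The sketch is written in Python rather than the authors' Java, and every name in it (`Event`, `Rule`, `map_event`, the path keys `chronological_age` and `comment`) is hypothetical; the real archetype paths with their at-codes are not reproduced here. It assumes that event ID 2106 identifies the age marker concept, as the AQL results table suggests.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Event:
    """Stand-in for the eventObject delivered by the NLP pipeline (hypothetical)."""
    event_id: int
    snippet: str
    value: str


@dataclass
class Rule:
    """A mapping rule: a predicate plus a builder for the target archetype."""
    name: str
    matches: Callable[[Event], bool]
    build: Callable[[Event], dict]


def build_age_observation(event: Event) -> dict:
    # Placeholder path names; real paths would carry at-codes.
    return {
        "archetype": "OBSERVATION.age.v0",
        "paths": {
            "chronological_age": event.value,
            "comment": event.snippet,
        },
    }


# Assumption: event ID 2106 marks an age event.
RULES = [Rule("age", lambda e: e.event_id == 2106, build_age_observation)]


def map_event(event: Event, rules=RULES) -> Optional[dict]:
    """(1) Run through all rules; (2) the first rule that fires builds the entry."""
    for rule in rules:
        if rule.matches(event):
            return rule.build(event)
    return None  # no rule fired, nothing is instantiated


if __name__ == "__main__":
    print(map_event(Event(2106, "10 Jahre alt", "P10Y")))
```

Keeping each rule as a predicate plus a builder means a new marker concept can be supported by appending one entry to the rule list without touching the loop.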
Results of the AQL query to retrieve extracted information snippets
| Event ID | Snippet, extracted from pipeline | Archetype | Archetype path and archetype term code (at-code) | Value |
|---|---|---|---|---|
| 2107 | Patientin [patient, female] | Gender | Administrative gender | Patientin [patient, female] |
| 2106 | 10 Jahre alt [10 years old] | Age | Chronological age | P10Y |
| | | | Comment | 10 Jahre alt [10 y old] |
| 2104 | Klinikum Musterstadt [Hospital Musterstadt] | Patient admission | Type of admission | Klinikum Musterstadt [Hospital Musterstadt] |
| 3103 | Patient blass [pale patient] | Physical examination findings | Clinical description | Patient blass [pale patient] |
| 2101 | Erbrechen [vomiting] | Problem/Diagnosis | Problem/Diagnosis name | Erbrechen [vomiting] |
| 2101 | Kopfschmerzen [headache] | Problem/Diagnosis | Problem/Diagnosis name | Kopfschmerzen [headache] |
| 3206 | 39.7°C Körpertemperatur [39.7°C body temperature] | Body temperature | Temperature | 39.7 Cel |
| 3411 | Herzfrequenz bei 130 [heart rate at 130] | Pulse/Heart beat | Pulse rate | 130 bpm |
| 3503 | Pupillen eng [pupils are narrow] | Physical examination findings | Clinical description | Pupillen eng [pupils are narrow] |
| 3701 | Abdomen weich [soft abdomen] | Physical examination findings | Clinical description | Abdomen weich [soft abdomen] |
| 2710 | Vorherig bestand Lungenentzündung [previously existing pneumonia] | Story/History | Story | Vorherig |
| | | | Symptom/Sign name | Lungenentzündung [pneumonia] |
| 3303 | Sauerstoffsättigung bei 82% [oxygen saturation at 82%] | Pulse oximetry | SpO2 | 82.0 |
| 3404 | Rekapillarisierungszeit < 2 Sekunden [capillary refill time <2 seconds] | Capillary refill | Capillary refill time | Less than 2 s |
| 2502 | Allergie gegen Latex [allergy to latex] | Adverse reaction risk | Category | Allergie [allergy] |
| | | | Substance | Latex |
| 2705 | Familiär bekannter Immundefekt [family history: immune deficiency] | Family history | Symptom/Sign name | Immundefekt [immune deficiency] |
| 2707 | Familiär D84 [familial D84] | Family history | Symptom/Sign name | D84 |
| 2202 | 50 mg Vomex | Medication management | Medication item | Vomex |
| | | | Dose amount | 50.0 |
| | | | Dose unit | mg |
| 2101 | Kein Erbrechen [no more vomiting] | Problem/Diagnosis | Problem/Diagnosis name | Kein Erbrechen [no more vomiting] |
Abbreviation: AQL, Archetype Query Language.
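For orientation, a query of this kind can be sketched by assembling an AQL string in Python. The helper `build_aql` is hypothetical. The query is an assumption about what such a retrieval might look like, not the query the authors ran. It assumes that the entries are contained in a COMPOSITION.report.v1 root and that archetype IDs carry the standard `openEHR-EHR-` prefix.

```python
def build_aql(archetype_id, composition_id="COMPOSITION.report.v1"):
    """Build an AQL query selecting all entries of one archetype inside a report.

    `archetype_id` is written as in the archetype overview table,
    e.g. "OBSERVATION.body_temperature.v2".
    """
    # The reference model class is the part before the first dot.
    rm_class = archetype_id.split(".", 1)[0]
    return (
        "SELECT entry "
        f"FROM EHR e CONTAINS COMPOSITION c[openEHR-EHR-{composition_id}] "
        f"CONTAINS {rm_class} entry[openEHR-EHR-{archetype_id}]"
    )


if __name__ == "__main__":
    print(build_aql("OBSERVATION.body_temperature.v2"))
```

Selecting a single element, such as the temperature magnitude, would additionally require the archetype path with its at-codes in the SELECT clause.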
Overview of the types of marker concepts identified within the manual annotation (ground truth) and the distribution of true positives, false negatives, and false positives
| Type of marker concept | Number of events extracted (ground truth) | True positives | False negatives | False positives |
|---|---|---|---|---|
| Vital signs | 190 | 168 | 22 | 7 |
| Diagnosis | 107 | 103 | 4 | 4 |
| General condition and behavior | 90 | 87 | 3 | 2 |
| Skin characteristics | 50 | 50 | 0 | 0 |
| Abdomen characteristics | 25 | 25 | 0 | 0 |
| Medication | 22 | 22 | 0 | 0 |
| Special situations (e.g., transfer, emergency) | 19 | 18 | 1 | 1 |
| Ophthalmology | 13 | 13 | 0 | 0 |
| Neurology | 8 | 8 | 0 | 0 |
| Allergies | 5 | 5 | 0 | 2 |
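Precision, recall, and F1 score follow directly from the counts in this table. The Python sketch below computes them per type and micro-averaged over the rows shown; the function names are illustrative, and the resulting figures are derived only from the rows listed here, so they may differ from the values reported in the paper.

```python
# (true positives, false negatives, false positives) per type, copied from the table.
ROWS = {
    "Vital signs": (168, 22, 7),
    "Diagnosis": (103, 4, 4),
    "General condition and behavior": (87, 3, 2),
    "Skin characteristics": (50, 0, 0),
    "Abdomen characteristics": (25, 0, 0),
    "Medication": (22, 0, 0),
    "Special situations": (18, 1, 1),
    "Ophthalmology": (13, 0, 0),
    "Neurology": (8, 0, 0),
    "Allergies": (5, 0, 2),
}


def precision_recall_f1(tp, fn, fp):
    """Return (precision, recall, F1); each is 0.0 when its denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


def totals(rows):
    """Sum TP, FN, and FP over all rows (basis for micro-averaging)."""
    tp = sum(r[0] for r in rows.values())
    fn = sum(r[1] for r in rows.values())
    fp = sum(r[2] for r in rows.values())
    return tp, fn, fp


if __name__ == "__main__":
    for name, counts in ROWS.items():
        p, r, f1 = precision_recall_f1(*counts)
        print(f"{name}: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
    p, r, f1 = precision_recall_f1(*totals(ROWS))
    print(f"Micro-average: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

For every row, true positives plus false negatives equal the ground-truth count in the second column, which is a quick consistency check on the table.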