Na Hong, Andrew Wen, Feichen Shen, Sunghwan Sohn, Chen Wang, Hongfang Liu, Guoqian Jiang.
Abstract
OBJECTIVE: To design, develop, and evaluate a scalable clinical data normalization pipeline for standardizing unstructured electronic health record (EHR) data leveraging the HL7 Fast Healthcare Interoperability Resources (FHIR) specification.
Keywords: Fast Healthcare Interoperability Resources; data standards; electronic health records; natural language processing
Year: 2019 PMID: 32025655 PMCID: PMC6993992 DOI: 10.1093/jamiaopen/ooz056
Source DB: PubMed Journal: JAMIA Open ISSN: 2574-2531
Figure 1. NLP2FHIR pipeline for EHR data modeling. EHR: electronic health record; FHIR: Fast Healthcare Interoperability Resources.
Figure 2. A screenshot of the graphical user interface of the implemented NLP2FHIR pipeline. Input type can be COMPOSITION_RESOURCE, BUNDLE_RESOURCE, XMI, or TEXT.
Examples of mapping rules between EHR sources, NLP output types, and FHIR elements
| Source | NLP output types | FHIR elements | Mapping types | Examples |
|---|---|---|---|---|
| Medication list | MedXN: Drug | MedicationStatement.medicationCodeableConcept | 1:1 | Oxamniquine → [SNOMED: 747006] |
| | MedXN: Drug: attributes: type="frequency"; MedTime: MedTimex3: type="SET" | MedicationStatement.dosage.timing.frequency; MedicationStatement.dosage.timing.frequencyMax; MedicationStatement.dosage.timing.period; MedicationStatement.dosage.timing.periodMax; MedicationStatement.dosage.timing.periodUnit; MedicationStatement.dosage.asNeeded.asNeededBoolean; MedicationStatement.dosage.timing.dayOfWeek; MedicationStatement.dosage.timing.when | 1: | Once daily → 1[frequency], 1[period], d[periodUnit]; 4–6 times → 4[frequency], 6[frequencyMax]; As needed for heel pain → true; Regular: once daily, every six hours; Irregular: as needed for pain, every Monday, Tuesday Wednesday |
| | MedXN: Drug: attributes: type="duration"; MedTime: MedTimex3: type="DURATION" | Dosage.duration; Dosage.durationMax; Dosage.durationUnit | 1: | 3 days → 3[duration], d[durationUnit] |
| | MedXN: Drug: attributes: type="route" | Dosage.route | 1:1 | By mouth [oral route] |
| | MedXN: Drug: attributes: type="strength" | Medication.ingredient.amount.numerator.quantity.value; Medication.ingredient.amount.numerator.quantity.unit; Medication.ingredient.amount.denominator.quantity.value; Medication.ingredient.amount.denominator.quantity.unit | 1: | Regular: 500 mg/5 mL → 500[numerator.quantity.value], mg[numerator.quantity.unit], 5[denominator.quantity.value], mL[denominator.quantity.unit]; Irregular: 200 mg → default assignment: 1[denominator.quantity.value] |
| | MedXN: Drug: attributes: type="form" | Medication.form | 1:1 | tab [Tablet] |
| | MedXN: Drug: attributes: type="dosage" | Dosage.doseQuantity.value; Dosage.doseQuantity.unit; Dosage.doseQuantity.Range.low.value; Dosage.doseQuantity.Range.low.unit; Dosage.doseQuantity.Range.high.value; Dosage.doseQuantity.Range.high.unit | 1: | 10 mL → 10[value], mL[unit]; 2–3 tabs → 2[range.low.value], tab[range.low.unit], 3[range.high.value], tab[range.high.unit] |
| Problem list | cTAKES: Disease_disorder | Condition.code | 1:1 | The Lingering sore throat → [SNOMED: 140004] / Chronic pharyngitis |
| | cTAKES: Anatomical_Site relations: type="LocationOf" | Condition.bodySite | 1:1 | Back of the head → [SNOMED: 774007] / Head and neck |
| | cTAKES: modifier: type="Severity" | Condition.severity | 1:1 | Very bad → [SNOMED: 24484000] / Severe |
| Family history | Relation; relations: type:"SideOfFamily"; relations: type:"Blood"; relations: type:"Adopted" | FamilyMemberHistory.relationship | | Grandpa → MGRFTH / maternal grandfather |
| Laboratory test | test_code | Observation.code | 1:1 | Albumin in Semen → [LOINC: 10558-5] / Albumin [Moles/volume] in Semen |
| Section | Source Section | Composition.section.code | 1:1 | Family history → [LOINC: 10157-6] / History of family member diseases narrative |
| Temporal information | MedTime: MedTimex3: type="DATE" | MedicationStatement.effectiveDateTime | 1:1 | April 16th |
| | MedTime: MedTimex3: type="TIME" | MedicationStatement.effectiveDateTime; Dosage.timeOfDay | 1: | April 8, 2008 at 04:38 |
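The frequency rules in the table above can be illustrated with a small sketch. The function name `parse_frequency`, the regular expressions, and the flat dictionary it returns are assumptions made for illustration; the actual pipeline derives these values from MedXN and MedTime annotations. Only the three example mappings are taken from the table.

```python
import re

# Hypothetical word-to-count lookup; only "once" appears in the table,
# "twice" is an assumed extension.
COUNT_WORDS = {"once": 1, "twice": 2}


def parse_frequency(phrase):
    """Map a frequency phrase to FHIR-style timing fields (illustrative only)."""
    text = phrase.strip().lower()

    # "As needed for heel pain" -> asNeededBoolean = true
    if text.startswith("as needed"):
        return {"asNeededBoolean": True}

    # "4-6 times" -> frequency = 4, frequencyMax = 6 (hyphen or en dash)
    span = re.match(r"(\d+)\s*[-–]\s*(\d+)\s+times", text)
    if span:
        return {"frequency": int(span.group(1)),
                "frequencyMax": int(span.group(2))}

    # "Once daily" -> frequency = 1, period = 1, periodUnit = d
    daily = re.match(r"(once|twice)\s+daily", text)
    if daily:
        return {"frequency": COUNT_WORDS[daily.group(1)],
                "period": 1,
                "periodUnit": "d"}

    # Anything else is left unmapped in this sketch.
    return {}
```

Phrases that match none of the three patterns return an empty dictionary, which keeps unmapped input visible instead of guessing a value.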
Proposed FHIR NLP extensions for clinical NLP
| Proposed FHIR NLP extension | FHIR resource | Definition reference sources | Description |
|---|---|---|---|
| offset | Any | [Ref: cTAKES/LineAndTokenPosition]; [Ref: OHDSI NLP/offset] | Token line and offset of the extracted term in the input note |
| raw_text | Any | [Ref: OHDSI NLP/lexical_variant] | Raw text extracted by the NLP tool |
| context | Any | [Ref: cTAKES/LookupWindowAnnotation]; [Ref: cTAKES/ContextAnnotation]; [Ref: OHDSI NLP/snippet] | Contextual information of an entity |
| nlp_system | Any | [Ref: OHDSI NLP/nlp_system] | Name and version of the NLP system that extracted the term. Useful for data provenance |
| nlp_date/nlp_datetime | Any | [Ref: OHDSI NLP/nlp_date, nlp_datetime] | The date or datetime of the note processing. Useful for data provenance |
| term_temporal | Any | [Ref: cTAKES/HistoryOfModifier]; [Ref: OHDSI NLP/term_temporal] | The time modifier associated with the extracted term |
| confidence_score | Any | NLP expert input | The probability that the extracted term is correct |
| conditional_modifier | Any | [Ref: cTAKES/ConditionalModifier] | Used to indicate that a procedure or assertion occurs under certain conditions |
| negated_modifier | Condition; Procedure; Medication | [Ref: cTAKES/PolarityModifier] | Used to indicate that a procedure or assertion did not occur or does not exist |
| certainty_modifier | Condition | [Ref: cTAKES/UncertaintyModifier] | An introduction of a measure of doubt into a statement |
| LabDeltaFlag_modifier | Observation | [Ref: cTAKES/ ssLabDeltaFlagModifier] | An indicator to warn that the laboratory test result has changed significantly from the previous identical laboratory test result |
Abbreviations: FHIR: Fast Healthcare Interoperability Resources; NLP: natural language processing.
For expansions of abbreviations used in definition reference sources, please refer to text.
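To make the proposed extensions concrete, the sketch below attaches three of them to a Condition using the generic FHIR extension shape (a `url` plus one `value[x]` element). The base URL, the helper name `nlp_extension`, and the confidence value are placeholders chosen for illustration; the table does not specify canonical URLs or value types for these extensions.

```python
import json

# Placeholder base URL: the canonical extension URLs are not given in the table.
BASE = "http://example.org/fhir/StructureDefinition/"


def nlp_extension(name, value):
    """Wrap a value as a FHIR extension element (hypothetical helper)."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        key = "valueBoolean"
    elif isinstance(value, int):
        key = "valueInteger"
    elif isinstance(value, float):
        key = "valueDecimal"
    else:
        key = "valueString"
    return {"url": BASE + name, key: value}


condition = {
    "resourceType": "Condition",
    "code": {"coding": [{"system": "http://snomed.info/sct",
                         "code": "140004",
                         "display": "Chronic pharyngitis"}]},
    "extension": [
        nlp_extension("raw_text", "The Lingering sore throat"),
        nlp_extension("negated_modifier", False),
        nlp_extension("confidence_score", 0.93),  # made-up illustrative value
    ],
}

print(json.dumps(condition, indent=2))
```

Keeping each NLP attribute in its own extension lets a consumer that does not understand the extensions ignore them while still reading the coded Condition.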
Structured data integrated from EHRs for the NLP2FHIR pipeline
| NLP2FHIR pipeline | Elements populated from structured data | Data type | Definitions |
|---|---|---|---|
| Condition | Condition.clinicalStatus | CodeableConcept | active, recurrence, relapse, inactive, remission, resolved (HL7 ValueSet: ConditionClinicalStatusCodes) |
| | Condition.category | CodeableConcept | problem-list-item, encounter-diagnosis (HL7 ValueSet: ConditionCategoryCodes) |
| | Condition.subject | Reference | Who has the condition |
| | Condition.encounter | Reference | The encounter during which this condition was created or diagnosed |
| | Condition.recordedDate | dateTime | Date record was first recorded |
| Procedure | Procedure.status | code | A code specifying the state of the procedure. Generally, this will be the in-progress or completed state |
| | Procedure.subject | Reference | The person, animal or group on which the procedure was performed |
| | Procedure.category | CodeableConcept | Classification of the procedure |
| | Procedure.encounter | Reference | The Encounter during which this Procedure was created or performed or to which the creation of the record is tightly associated |
| MedicationStatement | MedicationStatement.status | code | active, completed, entered-in-error, intended, stopped, on-hold, unknown, not-taken |
| | MedicationStatement.subject | Reference | Who is/was taking the medication |
| | MedicationStatement.category | CodeableConcept | Type of medication usage (SNOMED CT) |
| | MedicationStatement.dateAsserted | dateTime | When the statement was asserted |
| FamilyMemberHistory (FMH) | FamilyMemberHistory.status | code | partial, completed, entered-in-error, health-unknown (HL7 ValueSet: FamilyHistoryStatus) |
| | FamilyMemberHistory.dataAbsentReason | CodeableConcept | subject-unknown, withheld, unable-to-obtain, deferred (HL7 ValueSet: FamilyHistoryAbsentReason) |
| | FamilyMemberHistory.patient | Reference | Patient history is about |
| | FamilyMemberHistory.date | dateTime | When history was recorded or last updated |
Abbreviations: EHR: electronic health record; FHIR: Fast Healthcare Interoperability Resources; HL7: Health Level Seven International; NLP: natural language processing.
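As a rough illustration of how the structured elements listed above might be merged into an NLP-derived resource, the following sketch fills the five Condition elements from a structured record. The function name, the keys of the `row` dictionary, and the identifiers in the usage lines are hypothetical; only the element names and the allowed status and category codes come from the table.

```python
# Allowed codes as listed in the table for Condition.
CLINICAL_STATUS = {"active", "recurrence", "relapse",
                   "inactive", "remission", "resolved"}
CATEGORY = {"problem-list-item", "encounter-diagnosis"}


def add_structured_fields(condition, row):
    """Return a copy of an NLP-derived Condition with structured EHR fields added.

    `row` is a hypothetical structured record with the keys used below.
    """
    if row["clinical_status"] not in CLINICAL_STATUS:
        raise ValueError("unknown clinical status: " + row["clinical_status"])
    if row["category"] not in CATEGORY:
        raise ValueError("unknown category: " + row["category"])

    merged = dict(condition)  # shallow copy; the input is left unchanged
    merged["clinicalStatus"] = {"coding": [{"code": row["clinical_status"]}]}
    merged["category"] = [{"coding": [{"code": row["category"]}]}]
    merged["subject"] = {"reference": "Patient/" + row["patient_id"]}
    merged["encounter"] = {"reference": "Encounter/" + row["encounter_id"]}
    merged["recordedDate"] = row["recorded_date"]
    return merged


nlp_condition = {"resourceType": "Condition",
                 "code": {"coding": [{"code": "140004"}]}}
row = {"clinical_status": "active", "category": "problem-list-item",
       "patient_id": "p1", "encounter_id": "e1",
       "recorded_date": "2010-04-16"}
merged = add_structured_fields(nlp_condition, row)
```

Rejecting codes outside the listed value sets keeps a mistyped status from silently entering the output.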
Normalization results for each NLP2FHIR pipeline
| NLP2FHIR pipeline | No. of rules | Element examples | Data type | Normalization examples |
|---|---|---|---|---|
| MedicationStatement | 25 | MedicationStatement.medicationCodeableConcept | CodeableConcept | Oxamniquine → 747006[coding.code] |
| | | MedicationStatement.dosage.timing.frequency | integer | Once daily → 1[frequency] |
| | | MedicationStatement.dosage.asNeeded.asNeededBoolean | boolean | As needed for heel pain → true |
| | | MedicationStatement.dosage.timing.dayOfWeek | code | Every Monday → mon[ |
| Procedure | 10 | Procedure.code | CodeableConcept | Kidney echography → 306005/echography of kidney |
| | | Procedure.reasonCode | CodeableConcept | 134006/decreased hair growth |
| | | Procedure.performed[x].performedDateTime | dateTime | April 16th, 2010 |
| Condition | 13 | Condition.code | CodeableConcept | The Lingering sore throat → 140004/Chronic pharyngitis |
| | | Condition.bodySite | CodeableConcept | 774007/Head and neck |
| | | Condition.abatementString | string | Resolved |
| FamilyMemberHistory (FMH) | 14 | FamilyMemberHistory.condition.code | CodeableConcept | 3511005/Infectious thyroiditis |
| | | FamilyMemberHistory.relationship | CodeableConcept | MGRFTH/maternal grandfather |
Abbreviations: FHIR: Fast Healthcare Interoperability Resources; MGRFTH: a role code for maternal grandfather; NLP: natural language processing.
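The day-of-week normalization in the table (Every Monday → mon) can be sketched as a simple lookup into the FHIR days-of-week codes. The function name `days_of_week` and the tokenization approach are assumptions for illustration, not the pipeline's actual rule.

```python
import re

# FHIR days-of-week codes keyed by the English day name.
DAY_CODES = {
    "monday": "mon", "tuesday": "tue", "wednesday": "wed",
    "thursday": "thu", "friday": "fri", "saturday": "sat", "sunday": "sun",
}


def days_of_week(phrase):
    """Return FHIR day codes for every day name found in the phrase."""
    tokens = re.findall(r"[a-z]+", phrase.lower())
    return [DAY_CODES[t] for t in tokens if t in DAY_CODES]
```

For example, `days_of_week("Every Monday")` yields a one-element list containing `mon`, and a phrase naming several days yields one code per day in the order they appear.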
Figure 3. Example of the FHIR bundle resource with a standard section of "Problem List - Reported (LOINC: 11450-4)" and its referenced FHIR resources. FHIR: Fast Healthcare Interoperability Resources; LOINC: Logical Observation Identifiers Names and Codes.
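The structure described in the Figure 3 caption, a bundle whose Composition section is coded with LOINC 11450-4 and points to the extracted resources, might look roughly like the sketch below. The function name, the resource ids, and the bundle type `collection` are assumptions; elements a complete Composition would need (status, type, date, author, title) are omitted for brevity.

```python
LOINC = "http://loinc.org"
SNOMED = "http://snomed.info/sct"


def build_problem_list_bundle(problems):
    """Build a simplified bundle from (snomed_code, display) pairs.

    Illustrative only: ids are generated locally and required
    Composition elements are left out.
    """
    conditions = []
    for i, (code, display) in enumerate(problems):
        conditions.append({
            "resourceType": "Condition",
            "id": f"cond-{i}",
            "code": {"coding": [{"system": SNOMED,
                                 "code": code,
                                 "display": display}]},
        })

    composition = {
        "resourceType": "Composition",
        "id": "comp-0",
        "section": [{
            "title": "Problem List - Reported",
            "code": {"coding": [{"system": LOINC, "code": "11450-4"}]},
            # Each section entry references one extracted Condition.
            "entry": [{"reference": f"Condition/{c['id']}"}
                      for c in conditions],
        }],
    }

    return {
        "resourceType": "Bundle",
        "type": "collection",  # assumed; the caption does not state the type
        "entry": [{"resource": r} for r in [composition] + conditions],
    }


bundle = build_problem_list_bundle([("140004", "Chronic pharyngitis")])
```

The Composition carries the section structure of the note, while the clinical content lives in separate resources that the section references.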
Evaluation results on the performance of the NLP2FHIR pipeline
| FHIR resource | FHIR element | Precision | Recall | F-measure | Baseline |
|---|---|---|---|---|---|
| MedicationStatement and Medication | MedicationStatement.medicationCodeableConcept | 0.996 | 0.982 | 0.988 | MedXN: 0.581–0.954; MedTime: 0.880 |
| | Dosage.timing.repeat.frequency | 0.795 | 0.873 | 0.832 | |
| | Dosage.timing.repeat.period | 0.959 | 0.914 | 0.936 | |
| | Dosage.timing.repeat.duration | 0.600 | 1 | 0.750 | |
| | Dosage.route | 0.957 | 0.816 | 0.878 | |
| | Medication.ingredient.amount.numerator.quantity.value | 0.930 | 0.815 | 0.869 | |
| | Medication.ingredient.amount.numerator.quantity.unit | 0.926 | 0.899 | 0.911 | |
| | Medication.form | 0.871 | 0.704 | 0.779 | |
| | Dosage.timing.repeat.when | 1 | 0.571 | 0.727 | |
| | Dosage.asNeededBoolean | 0.913 | 0.583 | 0.712 | |
| Condition | Condition.code | 0.865 | 0.696 | 0.771 | cTAKES: 0.768–0.954 |
| | Condition.bodySite | 0.871 | 0.611 | 0.718 | |
| | Condition.severity | 0.909 | 0.556 | 0.690 | |
| | Condition.extension.negated_modifier | 0.992 | 0.998 | 0.995 | |
| Procedure | Procedure.code | 0.889 | 0.643 | 0.746 | cTAKES: 0.768–0.954 |
| | Procedure.bodySite | 0.895 | 0.798 | 0.844 | |
| FamilyMemberHistory | FamilyMemberHistory.condition.code | 0.940 | 0.716 | 0.813 | cTAKES: 0.768–0.954 |
| | FamilyMemberHistory.extension.negated_modifier | 0.937 | 0.967 | 0.952 | |
| | FamilyMemberHistory.relationship | 0.756 | 0.739 | 0.747 | |
Abbreviations: FHIR: Fast Healthcare Interoperability Resources; NLP: natural language processing.
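The third numeric value in each row of the evaluation table is consistent with the harmonic mean of precision and recall. Assuming that value is the balanced F-measure, the short sketch below reproduces a few rows; the function name `f_measure` is illustrative.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported value) taken from the evaluation table.
rows = [
    (0.600, 1.0, 0.750),    # Dosage.timing.repeat.duration
    (0.865, 0.696, 0.771),  # Condition.code
    (1.0, 0.571, 0.727),    # Dosage.timing.repeat.when
]
for p, r, reported in rows:
    print(f"P={p:.3f} R={r:.3f} F={f_measure(p, r):.3f} (reported {reported:.3f})")
```

Because the harmonic mean is pulled toward the smaller of the two inputs, elements with perfect precision but low recall, such as Dosage.timing.repeat.when, still score well below 1.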