Peijin Han, Sunyang Fu, Julie Kolis, Richard Hughes, Brian R Hallstrom, Martha Carvour, Hilal Maradit-Kremers, Sunghwan Sohn, V G Vinod Vydiswaran.
Abstract
BACKGROUND: Natural language processing (NLP) methods are powerful tools for extracting and analyzing critical information from free-text data. MedTaggerIE, an open-source NLP pipeline for information extraction based on text patterns, has been widely used in the annotation of clinical notes. A rule-based system, MedTagger-total hip arthroplasty (THA), developed based on MedTaggerIE, was previously shown to correctly identify the surgical approach, fixation, and bearing surface from the THA operative notes at Mayo Clinic.
Keywords: information extraction; model transferability; natural language processing; total hip arthroplasty
Year: 2022 PMID: 36044253 PMCID: PMC9475406 DOI: 10.2196/38155
Source DB: PubMed Journal: JMIR Med Inform
Figure 1. Overview of the NLP deployment and evaluation process. IRB: institutional review board; NLP: natural language processing.
Figure 2. The workflow of the note-processing pipeline at the Michigan site. The rectangles represent the data and the rounded rectangles represent the process. PAO: periacetabular osteotomy; THA: total hip arthroplasty.
Out-of-box performance of MedTagger-total hip arthroplasty (THA) for surgical approach: comparison of the gold standard (registry data) and notes classified by MedTagger-THA in the training and test data.[a]
| Gold standard | MedTagger-THA, n (%) | | | | |
| | Anterior | Anterolateral | Posterior | Ambiguous | Missing inference |
| Training data | | | | | |
| Anterior | 261 (12.7) | 0 (0) | 2 (0.1) | 1 (0) | 0 (0) |
| Anterolateral | 0 (0) | 1 (0) | 2 (0.1) | 0 (0) | 1 (0) |
| Posterior | 4 (0.2) | 2 (0.1) | 1737 (84.2) | 1 (0) | 50 (2.4) |
| Test data | | | | | |
| Anterior | 68 (13.4) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Anterolateral | 0 (0) | 1 (0.2) | 0 (0) | 0 (0) | 0 (0) |
| Posterior | 0 (0) | 1 (0.2) | 421 (83) | 0 (0) | 15 (3) |
| Transtrochanteric | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 1 (0.2) |
[a] Accuracy: 96.6% (95% CI 94.6%-97.9%); precision: 99.8% (95% CI 98.7%-100%); recall: 96.6% (95% CI 94.6%-97.9%); F1-score: 98.2% (95% CI 96.5%-99.1%).
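The point estimates in the footnote can be recomputed from the second block of rows in the table above (68, 1, 421, 1, 15, 1; 507 notes), taken here to be the test data. The sketch below is a minimal illustration, not the authors' evaluation code. It assumes that precision is the share of correct labels among notes for which MedTagger-THA produced any inference (that is, excluding the "Missing inference" column) and that recall is the share of correct labels among all notes. The source does not state these definitions, and all names in the code are hypothetical.

```python
# Counts copied from the second block of rows in the table above.
# Rows are gold-standard classes; columns are MedTagger-THA outputs.
COLUMNS = ["Anterior", "Anterolateral", "Posterior", "Ambiguous", "Missing inference"]
TEST_COUNTS = {
    "Anterior":          [68, 0, 0, 0, 0],
    "Anterolateral":     [0, 1, 0, 0, 0],
    "Posterior":         [0, 1, 421, 0, 15],
    "Transtrochanteric": [0, 0, 0, 0, 1],   # gold class with no matching output column
}


def summarize(counts, columns):
    """Return accuracy, precision, recall, and F1 under the stated assumptions."""
    total = sum(sum(row) for row in counts.values())
    # A note is correct when the output column equals its gold-standard class.
    correct = sum(
        row[columns.index(gold)] for gold, row in counts.items() if gold in columns
    )
    missing = sum(row[columns.index("Missing inference")] for row in counts.values())
    inferred = total - missing            # notes that received any inference

    precision = correct / inferred        # ASSUMPTION: denominator excludes missing inferences
    recall = correct / total              # ASSUMPTION: denominator is all notes
    return {
        "accuracy": correct / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


metrics = summarize(TEST_COUNTS, COLUMNS)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumptions the sketch gives 96.6%, 99.8%, 96.6%, and 98.2%, which agrees with the footnote; the confidence intervals are not reproduced here.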
Out-of-box performance of MedTagger-total hip arthroplasty (THA) for fixation: comparison of the gold standard (registry data) and notes classified by MedTagger-THA in the training and test data.[a]
| Gold standard | MedTagger-THA, n (%) | | | |
| | Cemented | Hybrid | Uncemented | Ambiguous |
| Training data | | | | |
| Cemented | 0 (0) | 1 (0.1) | 0 (0) | 0 (0) |
| Hybrid | 1 (0.1) | 76 (7.2) | 3 (0.3) | 17 (1.6) |
| Uncemented | 0 (0) | 29 (2.8) | 925 (87.8) | 1 (0.1) |
| Test data | | | | |
| Cemented | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Hybrid | 0 (0) | 23 (9) | 2 (0.8) | 5 (2) |
| Uncemented | 0 (0) | 4 (1.6) | 222 (86.7) | 0 (0) |
[a] Accuracy: 95.7% (95% CI 92.4%-97.6%); precision: 95.7% (95% CI 92.4%-97.6%); recall: 95.7% (95% CI 92.4%-97.6%); F1-score: 95.7% (95% CI 92.4%-97.6%).
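All four values in the footnote are identical. One possible reading, which is an assumption and not something the source states, is that precision and recall were micro-averaged with "Ambiguous" treated as a predicted class. Every note then contributes exactly one prediction, so micro-averaged precision, recall, and F1 all reduce to accuracy. The sketch below checks this on the second block of rows in the table above (256 notes); the names are hypothetical.

```python
# Counts copied from the second block of rows in the fixation table above.
# Rows are gold-standard classes; columns are MedTagger-THA outputs.
PREDICTED = ["Cemented", "Hybrid", "Uncemented", "Ambiguous"]
FIXATION_TEST = {
    "Cemented":   [0, 0, 0, 0],
    "Hybrid":     [0, 23, 2, 5],
    "Uncemented": [0, 4, 222, 0],
}


def micro_scores(counts, predicted_labels):
    """Micro-averaged precision, recall, and F1 over all predicted labels."""
    tp = fp = fn = 0
    for col, label in enumerate(predicted_labels):
        gold_row = counts.get(label, [0] * len(predicted_labels))
        hits = gold_row[col]                                   # predicted == gold
        predicted_total = sum(row[col] for row in counts.values())
        tp += hits
        fp += predicted_total - hits                           # predicted as label, gold differs
        fn += sum(gold_row) - hits                             # gold is label, predicted differs
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = micro_scores(FIXATION_TEST, PREDICTED)
print(f"precision={precision:.1%} recall={recall:.1%} f1={f1:.1%}")
```

With these counts, 245 of 256 notes are classified correctly, and all three micro-averaged scores equal 245/256, or 95.7%.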
Classification errors and ambiguous cases for approach and fixation in the Michigan data set.
| Keyword | Classification error | Ambiguous cases | Missing |
| Approach | The mention of the correct The notes in the “Complications” section contained mentions for a different Multiple different | Notes related to diagnosis sections but not the procedures contained different mentions of Notes related to “indications” contained hypothetical conditions; for example, “We offered her the option of anterior or posterior | Direct mentions of No mentions indicating the Misspelling of the mentions led to unrecognition (eg, “shortrotators”). |
| Fixation | “Uncemented” was misclassified as “Hybrid” The note mentioned “non cement stem” but the certainty of the inference was positive for the “Hybrid” was misclassified as “Uncemented”; for example, “femur” was not included in the stem keyword list, and no “Cemented” was misclassified as “Hybrid” as “femur” was not included in the stem keyword list, “Hybrid” was misclassified as “Cemented” as “Cemented” was a direct mention and had priority over others; for example: “Total Hip Arthroplasty, cemented, Right Hip” was misclassified as “Cemented” In the notes, only the femoral canal is cemented. | For a single surgery note, some sections misclassified “Hybrid” as “Cemented” as “Cemented” was a direct mention of | Missingness in |
[a] THA: total hip arthroplasty.
[b] Concept name.
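The “shortrotators” example shows how a text pattern that expects whitespace between words can miss a run-together misspelling. The sketch below is illustrative only: both regular expressions and the sample sentence are hypothetical and are not taken from the MedTagger-THA rule set.

```python
import re

# Hypothetical keyword patterns; NOT the actual MedTagger-THA rules.
# STRICT requires at least one whitespace character between the two words.
STRICT = re.compile(r"\bshort\s+rotators?\b", re.IGNORECASE)
# TOLERANT also accepts the words written with no space between them.
TOLERANT = re.compile(r"\bshort\s*rotators?\b", re.IGNORECASE)


def has_mention(text, pattern):
    """Return True if the pattern matches anywhere in the note text."""
    return pattern.search(text) is not None


# Hypothetical note text containing the misspelling reported in the table.
note = "The shortrotators were tagged and released."

print(has_mention(note, STRICT))    # False: no whitespace between the words
print(has_mention(note, TOLERANT))  # True: zero whitespace is allowed
```

Loosening a pattern in this way trades some precision for recall, so any such change would need to be checked against held-out notes.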
Approach after refinement: comparison of the gold standard and notes classified by refined MedTagger-total hip arthroplasty (THA) in the Michigan test data set (N=507).[a]
| Gold standard | MedTagger-THA-Michigan, n (%) | | | | |
| | Anterior | Anterolateral | Posterior | Ambiguous | Missing inference |
| Anterior | 68 (13.4) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Anterolateral | 0 (0) | 1 (0.2) | 0 (0) | 0 (0) | 0 (0) |
| Posterior | 0 (0) | 0 (0) | 434 (85.6) | 0 (0) | 3 (0.6) |
| Transtrochanteric | 0 (0) | 0 (0) | 1 (0.2) | 0 (0) | 0 (0) |
[a] Accuracy: 99% (95% CI 97.6%-99.6%); precision: 99.6% (95% CI 98.4%-100%); recall: 99% (95% CI 97.6%-99.6%); F1-score: 99.3% (95% CI 98%-99.8%).
Fixation after refinement: comparison of the gold standard and notes classified by refined MedTagger-total hip arthroplasty (THA) in the Michigan test data set (N=256).[a]
| Gold standard | MedTagger-THA-Michigan, n (%) | | | |
| | Cemented | Hybrid | Uncemented | Ambiguous |
| Cemented | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Hybrid | 1 (0.4) | 26 (10.2) | 1 (0.4) | 2 (0.8) |
| Uncemented | 0 (0) | 1 (0.4) | 225 (87.9) | 0 (0) |
[a] Accuracy: 98% (95% CI 95.3%-99.3%); precision: 98% (95% CI 95.3%-99.3%); recall: 98% (95% CI 95.3%-99.3%); F1-score: 98% (95% CI 95.3%-99.3%).
Approach: comparison of the gold standard and notes classified by MedTagger-total hip arthroplasty (THA) in the University of Iowa data set (N=100).[a]
| Gold standard | MedTagger-THA-Iowa, n (%) | | | Total, n (%) |
| | Anterior | Anterolateral | Posterior | |
| | | | | |
| Anterior | 12 (24) | 1 (2) | 0 (0) | 13 (26) |
| Anterolateral | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Posterior | 0 (0) | 0 (0) | 37 (74) | 37 (74) |
| | | | | |
| Anterior | 14 (28) | 0 (0) | 0 (0) | 14 (28) |
| Anterolateral | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Posterior | 0 (0) | 0 (0) | 36 (72) | 36 (72) |
[a] Accuracy: 100% (95% CI 91.3%-100%); precision: 100% (95% CI 91.3%-100%); recall: 100% (95% CI 91.3%-100%); F1-score: 100% (95% CI 91.3%-100%).
Fixation: comparison of the gold standard and notes classified by MedTagger-total hip arthroplasty (THA) in the University of Iowa data set (N=100).[a]
| Gold standard | MedTagger-THA-Iowa, n (%) | | | Total, n (%) |
| | Cemented | Hybrid | Uncemented | |
| | | | | |
| Cemented | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Hybrid | 0 (0) | 1 (2) | 0 (0) | 1 (2) |
| Uncemented | 0 (0) | 0 (0) | 49 (98) | 49 (98) |
| | | | | |
| Cemented | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| Hybrid | 1 (2) | 0 (0) | 0 (0) | 1 (2) |
| Uncemented | 0 (0) | 0 (0) | 49 (98) | 49 (98) |
[a] Accuracy: 98% (95% CI 88.3%-100%); precision: 98% (95% CI 88.3%-100%); recall: 98% (95% CI 88.3%-100%); F1-score: 98% (95% CI 88.3%-100%).
Bearing surface: comparison of the gold standard and notes classified by MedTagger-total hip arthroplasty (THA) in the University of Iowa data set (N=100).[a]
| Gold standard | MedTagger-THA-Iowa, n (%) | | | | Total, n (%) |
| | MoP[b] | CoP[c] | MoM[d] | CoC[e] | |
| | | | | | |
| MoP | 25 (50) | 1 (2) | 1 (2) | 0 (0) | 27 (54) |
| CoP | 0 (0) | 17 (34) | 0 (0) | 0 (0) | 17 (34) |
| MoM | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
| CoC | 0 (0) | 6 (12) | 0 (0) | 0 (0) | 6 (12) |
| | | | | | |
| MoP | 20 (40) | 2 (4) | 0 (0) | 0 (0) | 22 (44) |
| CoP | 0 (0) | 26 (52) | 0 (0) | 0 (0) | 26 (52) |
| MoM | 0 (0) | 0 (0) | 0 (0) | 1 (2) | 1 (2) |
| CoC | 0 (0) | 1 (2) | 0 (0) | 0 (0) | 1 (2) |
[a] Accuracy: 92% (95% CI 80.5%-97.3%); precision: 92% (95% CI 80.5%-97.3%); recall: 92% (95% CI 80.5%-97.3%); F1-score: 92% (95% CI 80.5%-97.3%).
[b] MoP: metal-on-polyethylene.
[c] CoP: ceramic-on-polyethylene.
[d] MoM: metal-on-metal.
[e] CoC: ceramic-on-ceramic.
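Overall accuracy can hide errors in rare classes. The sketch below computes per-class recall from the second block of rows in the bearing surface table above, which is the block whose diagonal reproduces the 92% accuracy in the footnote. It is a minimal illustration with hypothetical names, not the authors' evaluation code.

```python
# Counts copied from the second block of rows in the bearing surface table.
# Rows are gold-standard classes; columns are MedTagger-THA-Iowa outputs.
LABELS = ["MoP", "CoP", "MoM", "CoC"]
BEARING_COUNTS = {
    "MoP": [20, 2, 0, 0],
    "CoP": [0, 26, 0, 0],
    "MoM": [0, 0, 0, 1],
    "CoC": [0, 1, 0, 0],
}


def per_class_recall(counts, labels):
    """Share of each gold-standard class that received the matching label."""
    recall = {}
    for gold, row in counts.items():
        n = sum(row)
        # Classes with no gold-standard notes have undefined recall.
        recall[gold] = row[labels.index(gold)] / n if n else None
    return recall


recalls = per_class_recall(BEARING_COUNTS, LABELS)
total = sum(sum(row) for row in BEARING_COUNTS.values())
correct = sum(row[LABELS.index(gold)] for gold, row in BEARING_COUNTS.items())

for label in LABELS:
    print(f"{label}: recall={recalls[label]:.3f}")
print(f"overall accuracy={correct / total:.2f}")
```

In this block the single MoM note and the single CoC note are both misclassified, so their recall is 0, while MoP (20 of 22) and CoP (26 of 26) account for all 46 correct classifications out of 50.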