Qingyu Chen, Alex Rankine, Yifan Peng, Elaheh Aghaarabi, Zhiyong Lu.
Abstract
BACKGROUND: Semantic textual similarity (STS) measures the degree of relatedness between sentence pairs. The Open Health Natural Language Processing (OHNLP) Consortium released an expertly annotated STS data set and called for the National Natural Language Processing Clinical Challenges. This work describes our entry, an ensemble model that leverages a range of deep learning (DL) models. Our team from the National Library of Medicine obtained a Pearson correlation of 0.8967 on the official test set of the 2019 National Natural Language Processing Clinical Challenges/Open Health Natural Language Processing shared task, ranking second.
Keywords: biomedical and clinical text mining; deep learning; semantic textual similarity; sentence embeddings; transformers; word embeddings
Year: 2021 PMID: 34967748 PMCID: PMC8759018 DOI: 10.2196/27386
Source DB: PubMed Journal: JMIR Med Inform
Figure 1. Model architecture overview. (A), (B), and (C) show the architectures of the Convolutional Neural Network (CNN), BioSentVec, and Bidirectional Encoder Representations from Transformers models, respectively. Details are provided in the Methods section. BERT: Bidirectional Encoder Representations from Transformers; CONV: convolutional layer; FC: fully-connected layer.
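All three architectures score a sentence pair by encoding each sentence into a vector and regressing a similarity score from the combined representation. Below is a minimal numpy sketch of that shared pattern, not the authors' implementation: combining two embeddings via concatenation, absolute difference, and elementwise product is a common STS choice and is an assumption here, as are the random weights; the fully-connected layer widths follow the BERT column of the hyperparameter table, and the 700-dimensional input matches BioSentVec embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

def pair_features(u, v):
    """Combine two sentence embeddings into one pair representation.

    Concatenation plus absolute difference and elementwise product is a
    common STS head input; the paper's exact combination is an assumption.
    """
    return np.concatenate([u, v, np.abs(u - v), u * v])

def fc_head(x, sizes=(128, 32)):
    """Randomly initialized fully-connected head (widths 128, 32 as in the
    BERT column of the hyperparameter table) mapping features to a score."""
    for n in sizes:
        w = rng.normal(0, 0.1, size=(x.shape[0], n))
        x = np.maximum(w.T @ x, 0.0)               # ReLU hidden layer
    w_out = rng.normal(0, 0.1, size=x.shape[0])
    return 5.0 / (1.0 + np.exp(-(w_out @ x)))      # squash to the 0-5 scale

u = rng.normal(size=700)   # e.g. a 700-dimensional BioSentVec embedding
v = rng.normal(size=700)
feats = pair_features(u, v)
score = fc_head(feats)
```

The final sigmoid scaled by 5 keeps predictions inside the annotation range of the data set (0 to 5).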
Hyperparameters of the sentence similarity models. Common hyperparameters are shared among all models; model-specific hyperparameters apply only to the indicated model.
| Hyperparameter | CNNa | BioSentVec | BERTb variation |
| --- | --- | --- | --- |
| FCc layers | 128 | 512, 256, 128, 32 | 128, 32 |
| Dropout | 0.5 | 0.5 | 0.5 |
| Optimizer | Adam | SGDd | AdamWarmup |
| Learning rate | 1e-3 | 5e-3 | 2e-5 |
| Batch size | 64 | 16 | 32 |
| Maximum length | 170 | N/Ae | 128 |
| Convf | 1800 | N/A | N/A |
| Pooling | Maximum | N/A | Maximum |
aCNN: Convolutional Neural Network.
bBERT: Bidirectional Encoder Representations from Transformers.
cFC: fully-connected.
dSGD: stochastic gradient descent.
eN/A: not applicable.
fConv: convolutional layers.
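For reference, the table above can be transcribed into a configuration mapping; the sketch below shows one way such settings might be organized in code (the key names are my own, and "fc_layers" lists the fully-connected layer widths in order):

```python
# Hyperparameters transcribed from the table above; keys are illustrative.
HYPERPARAMS = {
    "cnn": {
        "fc_layers": [128], "dropout": 0.5, "optimizer": "Adam",
        "learning_rate": 1e-3, "batch_size": 64, "max_length": 170,
        "conv_filters": 1800, "pooling": "max",
    },
    "biosentvec": {
        "fc_layers": [512, 256, 128, 32], "dropout": 0.5,
        "optimizer": "SGD", "learning_rate": 5e-3, "batch_size": 16,
    },
    "bert": {
        "fc_layers": [128, 32], "dropout": 0.5,
        "optimizer": "AdamWarmup", "learning_rate": 2e-5,
        "batch_size": 32, "max_length": 128, "pooling": "max",
    },
}
```

Entries absent for a model (for example, maximum length for BioSentVec) are simply omitted rather than stored as N/A.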
Effectiveness and efficiency results for the official test set. The models are ranked by mean effectiveness in descending order. The P value of the Wilcoxon rank-sum test at a 95% CI is shown for each model compared with the model with the highest effectiveness or efficiency. Results for the ensemble model are also provided; however, this study focuses on single models in terms of, for example, their robustness to sentence pairs of different similarity levels and their inference time for production purposes.
| Model | Effectiveness (Pearson correlation), mean (SD) | P value | Maximum | Efficiency (seconds), mean (SD) | P value | Lowest efficiency (seconds) |
| --- | --- | --- | --- | --- | --- | --- |
| BioSentVec | 0.8497 (0.0099) | N/Aa | 0.8654 | 1.48 (0.23) | N/A | 1.96 |
| BioBERT | 0.8481 (0.0122) | .74 | 0.8698 | 85.05 (4.93) | <.001 | 95.66 |
| ClinicalBERT | 0.8442 (0.0161) | .39 | 0.8677 | 85.20 (4.74) | <.001 | 95.21 |
| BlueBERT | 0.8320 (0.0232) | .02 | 0.8613 | 84.81 (1.63) | <.001 | 88.22 |
| CNNb | 0.8224 (0.0043) | <.001 | 0.8307 | 4.35 (0.27) | <.001 | 4.97 |
| Random forest | 0.6848 (0.0022) | N/A | N/A | 0.03 (0.00) | .99 | 0.03 |
| Ensemble model | 0.8782 | N/A | 0.8940 | N/A | N/A | N/A |
aN/A: not applicable.
bCNN: Convolutional Neural Network.
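The ensemble's gain over any single model comes from combining the models' predicted scores before evaluation. The sketch below shows a simple unweighted mean of per-model score vectors together with the Pearson correlation used as the official metric; the toy scores and the unweighted averaging are assumptions for illustration, not the paper's exact combination scheme.

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation between predicted and gold similarity scores."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xc, yc = x - x.mean(), y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

def ensemble(predictions):
    """Unweighted mean of several models' score vectors (an assumption;
    other combination schemes, e.g. weighted averaging, are possible)."""
    return np.mean(np.stack(predictions), axis=0)

gold = [0.0, 1.0, 2.5, 4.0, 5.0]        # toy gold-standard scores
model_a = [0.4, 1.2, 2.9, 3.6, 4.8]     # hypothetical model outputs
model_b = [0.1, 0.8, 2.2, 4.3, 4.6]
combined = ensemble([model_a, model_b])
```

Averaging tends to cancel uncorrelated errors of the individual models, which is consistent with the ensemble outperforming every single model in the table above.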
Additional effectiveness results of individual models. The models are ranked by the Pearson correlation coefficient in descending order.
| Model | Pearson correlation, mean (SD) | Spearman correlation, mean (SD) | R²a, mean (SD) | MSEb, mean (SD) |
| --- | --- | --- | --- | --- |
| BioSentVec | 0.8497 (0.0099) | 0.7708 (0.0073) | 0.6705 (0.0325) | 0.8709 (0.0434) |
| BioBERT | 0.8481 (0.0122) | 0.7951 (0.0100) | 0.6636 (0.0275) | 0.8803 (0.0362) |
| ClinicalBERT | 0.8442 (0.0161) | 0.8066 (0.0149) | 0.6357 (0.0391) | 0.9155 (0.0502) |
| BlueBERT | 0.8320 (0.0232) | 0.7701 (0.0244) | 0.6520 (0.0544) | 0.8935 (0.0670) |
| CNNc | 0.8224 (0.0043) | 0.7674 (0.0087) | 0.6136 (0.0436) | 0.9428 (0.0519) |
| Random forest | 0.6848 (0.0022) | 0.6572 (0.0027) | 0.4154 (0.0025) | 1.1614 (0.0025) |
aR²: coefficient of determination.
bMSE: mean square error.
cCNN: Convolutional Neural Network.
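The four metrics in the table can all be computed directly from a gold and a predicted score vector. Below is a self-contained sketch of each (the toy scores are hypothetical; the Spearman implementation skips tie handling, which real scorers resolve by averaging tied ranks):

```python
import numpy as np

def mse(gold, pred):
    """Mean squared error between gold and predicted scores."""
    g, p = np.asarray(gold, float), np.asarray(pred, float)
    return float(np.mean((g - p) ** 2))

def r_squared(gold, pred):
    """Coefficient of determination: 1 - SSE over total variance of gold."""
    g, p = np.asarray(gold, float), np.asarray(pred, float)
    return float(1.0 - np.sum((g - p) ** 2) / np.sum((g - g.mean()) ** 2))

def spearman(gold, pred):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Ties are not handled here; production scorers average tied ranks.
    """
    def ranks(a):
        return np.argsort(np.argsort(a)).astype(float)
    g, p = ranks(np.asarray(gold)), ranks(np.asarray(pred))
    gc, pc = g - g.mean(), p - p.mean()
    return float((gc @ pc) / np.sqrt((gc @ gc) * (pc @ pc)))

gold = [0.0, 1.0, 2.5, 4.0, 5.0]   # hypothetical gold-standard scores
pred = [0.5, 1.4, 2.2, 3.6, 4.8]   # hypothetical model predictions
```

Note that Spearman depends only on ranking, so a model can order pairs perfectly yet still have nonzero MSE, which is why the table reports both.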
Figure 2. Mean squared error (MSE) of the models for each similarity range. Each category shows the number of sentence pairs and the associated MSE of the models. The overall MSE (median, SD) of each model is also provided in the legend. CNN: Convolutional Neural Network.
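Figure 2's per-range breakdown can be reproduced by bucketing sentence pairs on their gold score before computing MSE. A sketch, with hypothetical scores and the unit-width bin edges assumed from the 0-to-5 annotation scale:

```python
import numpy as np

def mse_by_range(gold, pred, edges=(0, 1, 2, 3, 4, 5)):
    """Group pairs by gold similarity range and report (count, MSE) per bin.

    Bins are half-open [lo, hi) except the last, which includes the top
    score; empty bins are omitted.
    """
    gold, pred = np.asarray(gold, float), np.asarray(pred, float)
    out = {}
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (gold >= lo) & ((gold < hi) if hi < edges[-1] else (gold <= hi))
        if mask.any():
            out[f"{lo}-{hi}"] = (int(mask.sum()),
                                 float(np.mean((gold[mask] - pred[mask]) ** 2)))
    return out

gold = [0.0, 0.5, 1.5, 2.5, 4.5, 5.0]   # hypothetical gold scores
pred = [1.2, 0.4, 1.8, 2.0, 4.0, 4.9]   # hypothetical predictions
bins = mse_by_range(gold, pred)
```

This kind of breakdown reveals whether a model's error concentrates at particular similarity levels, as the figure shows for the random forest on low-similarity pairs.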
Qualitative examples with a relatively large mean squared error for the random forest model for sentence pair scores from 0.0 to 1.0.
| Case | Sentence pairs | Gold standard | Random forest | BioBERT | BioSentVec |
| --- | --- | --- | --- | --- | --- |
| 1 | The patient tolerated the procedure well and was transferred to the recovery room in stable condition. / The patient was transferred to the patient appointment coordinator for an appointment to be scheduled within the timeframe advised. | 0.0 | 2.5 | 0.5 | 1.2 |
| 2 | Patient to call to schedule additional treatment sessions as needed otherwise patient dismissed from therapy. / Patient tolerated session without adverse reactions to therapy. | 0.0 | 3.4 | 1.4 | 1.4 |
| 3 | Patient was agreeable to speaking with social work. / Patient was able to teach back concepts discussed. | 0.0 | 2.0 | 1.7 | 1.7 |
| 4 | Left upper extremity: Inspection, palpation examined and normal. / Abdomen: Liver and spleen, bowel sounds examined and normal. | 0.5 | 2.4 | 2.1 | 1.1 |
| 5 | glucosamine capsule 1 capsule by mouth one time daily. / Claritin tablet 1 tablet by mouth one time daily. | 1.0 | 2.6 | 1.7 | 1.5 |
Qualitative examples with a relatively large mean squared error for Bidirectional Encoder Representations from Transformers models for sentence pair scores from 4.0 to 5.0.
| Case | Sentence pairs | Gold standard | ClinicalBERT | BioBERT | BioSentVec |
| --- | --- | --- | --- | --- | --- |
| 1 | Heart: S1/S2 regular rate and rhythm, without murmurs, gallops, or rubs / Heart: S1, S2, regular rate and rhythm, no abnormal heart sounds or murmur | 5.0 | 2.5 | 3.4 | 3.9 |
| 2 | He denies chest pain or shortness of breath / He denies shortness of breath or chest pain | 5.0 | 2.3 | 3.3 | 3.9 |
| 3 | This patient benefits from skilled occupational and/or physical therapy to improve participation in daily occupations / Medical necessity: the patient would benefit from skilled physical therapy interventions to be able to return to work and engage in self-care activities | 4.0 | 2.4 | 2.2 | 2.5 |
| 4 | All questions were answered to the parent’s satisfaction / All questions were answered and consent was given to proceed | 4.0 | 2.8 | 2.6 | 3.7 |
| 5 | The patient understands and is happy with the plan / The patient verbalized understanding and wishes to proceed | 5.0 | 3.0 | 2.9 | 3.6 |