| Literature DB >> 33226350 |
Xi Yang, Xing He, Hansi Zhang, Yinghan Ma, Jiang Bian, Yonghui Wu.
Abstract
BACKGROUND: Semantic textual similarity (STS) is one of the fundamental tasks in natural language processing (NLP). Many shared tasks and corpora for STS have been organized and curated in the general English domain; however, such resources are limited in the biomedical domain. In 2019, the National NLP Clinical Challenges (n2c2) challenge developed a comprehensive clinical STS dataset and organized a community effort to solicit state-of-the-art solutions for clinical STS.
Keywords: clinical semantic textual similarity; deep learning; natural language processing; transformers
Year: 2020 PMID: 33226350 PMCID: PMC7721552 DOI: 10.2196/19735
Source DB: PubMed Journal: JMIR Med Inform
Descriptive statistics of the datasets.
| Dataset | Sentence pairs, n | Annotation distribution, n (%) | | | | |
| | | [0.0, 1.0] | (1.0, 2.0] | (2.0, 3.0] | (3.0, 4.0] | (4.0, 5.0] |
| STS-Clinica training | 1642 | 312 (19.0) | 154 (9.4) | 394 (24.0) | 509 (31.0) | 273 (16.6) |
| STS-Clinic test | 412 | 238 (57.8) | 46 (11.2) | 32 (7.8) | 62 (15.0) | 34 (8.3) |
| STS-General training | 7249 | 1492 (20.6) | 1122 (15.5) | 1413 (19.5) | 1260 (17.4) | 1962 (27.1) |
aSTS: semantic textual similarity.
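The five distribution columns bin the gold similarity scores into unit intervals. A minimal sketch of how such a breakdown can be computed with pandas, using toy scores rather than the actual corpus files:

```python
import pandas as pd

scores = pd.Series([0.5, 1.5, 2.5, 3.5, 4.5, 3.0, 0.0])  # toy gold scores
labels = ["[0.0, 1.0]", "(1.0, 2.0]", "(2.0, 3.0]", "(3.0, 4.0]", "(4.0, 5.0]"]
bins = pd.cut(scores, bins=[0, 1, 2, 3, 4, 5], labels=labels, include_lowest=True)
dist = bins.value_counts().sort_index()          # count per interval
pct = (100 * dist / len(scores)).round(1)        # share per interval
print(pd.DataFrame({"n": dist, "%": pct}))       # n (%) per bin, as in the table
```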
Figure 1. An overview of our single-model and ensemble solutions for clinical STS. STS: semantic textual similarity.
Figure 2. The two-stage procedure for clinical STS model development. STS: semantic textual similarity.
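A minimal sketch of this two-stage procedure, assuming the Hugging Face transformers library: a transformer with a single-output head (trained with mean squared error, that is, regression) is fine-tuned first on general-domain STS pairs and then on STS-Clinic. The dataset contents and output directories are placeholders; epochs, batch size, and learning rate echo the hyperparameter table below.

```python
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-large-uncased", num_labels=1)  # one output unit -> MSE regression

class PairDataset(torch.utils.data.Dataset):
    """Tokenized sentence pairs with gold similarity scores in [0, 5]."""
    def __init__(self, pairs, scores):
        self.enc = tokenizer([a for a, _ in pairs], [b for _, b in pairs],
                             truncation=True, padding=True)
        self.scores = scores
    def __len__(self):
        return len(self.scores)
    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.scores[i], dtype=torch.float)
        return item

def fine_tune(dataset, epochs, batch_size, out_dir):
    args = TrainingArguments(output_dir=out_dir, num_train_epochs=epochs,
                             per_device_train_batch_size=batch_size,
                             learning_rate=1e-5)  # step size per update
    Trainer(model=model, args=args, train_dataset=dataset).train()

# Stage 1: general-domain STS pairs; stage 2: clinical pairs. The same model
# object is trained twice, so stage 2 starts from the stage 1 weights.
general = PairDataset([("a sentence.", "another sentence.")], [2.5])
clinic = PairDataset(
    [("advised to contact us with questions or concerns.",
      "please do not hesitate to contact me with any further questions.")], [5.0])
fine_tune(general, epochs=3, batch_size=8, out_dir="stage1")
fine_tune(clinic, epochs=3, batch_size=8, out_dir="stage2")
```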
Hyperparameters for transformer models.
| Model | Number of epochs | Batch size | Learning ratea |
| BERT-baseb | 4 | 8 | 1e-5 |
| BERT-mimic | 3 | 8 | 1e-5 |
| BERT-large | 3 | 8 | 1e-5 |
| XLNet-base | 3 | 4 | 1e-5 |
| XLNet-mimic | 3 | 4 | 1e-5 |
| XLNet-large | 4 | 4 | 1e-5 |
| RoBERTa-basec | 3 | 4 | 1e-5 |
| RoBERTa-mimic | 3 | 4 | 1e-5 |
| RoBERTa-large | 3 | 4 | 1e-5 |
| BERT-large + XLNet-large | 4 | 8 | 1e-5 |
| BERT-large + RoBERTa-large | 3 | 4 | 1e-5 |
| RoBERTa-large + XLNet-large | 4 | 4 | 1e-5 |
| BERT-large + XLNet-large + RoBERTa-large | 3 | 2 | 1e-5 |
aThe learning rate is a tuning parameter in an optimization algorithm that determines the step size at each iteration while moving toward a minimum of a loss function [39].
bBERT: Bidirectional Encoder Representations from Transformers.
cRoBERTa: Robustly optimized BERT approach.
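The last four rows are ensembles of the large models; since they carry their own training hyperparameters, the component encoders were presumably fine-tuned jointly. The sketch below is a simpler, purely illustrative fusion that averages the scores of independently fine-tuned checkpoints (directory names are hypothetical):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

def predict(model_dir, s1, s2):
    """Score one sentence pair with a fine-tuned regression-head model."""
    tok = AutoTokenizer.from_pretrained(model_dir)
    model = AutoModelForSequenceClassification.from_pretrained(model_dir)
    enc = tok(s1, s2, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return model(**enc).logits.item()  # single regression output

def ensemble_score(model_dirs, s1, s2):
    """Average the component models' predicted similarity scores."""
    return sum(predict(d, s1, s2) for d in model_dirs) / len(model_dirs)

# "stage2-bert" and "stage2-roberta" are hypothetical checkpoint directories.
score = ensemble_score(
    ["stage2-bert", "stage2-roberta"],
    "patient discharged ambulatory without further questions or concerns noted.",
    "please contact location at phone number with any questions or concerns regarding this patient.")
print(round(score, 2))
```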
Pearson correlations on the test set.
| Model | Pearson correlation | P value |
| BERT-basea | 0.8615 | <.001 |
| BERT-mimic | 0.8521 | <.001 |
| BERT-largeb | 0.8549 | <.001 |
| XLNet-base | 0.8470 | <.001 |
| XLNet-mimic | 0.8286 | <.001 |
| XLNet-largeb,c | 0.8864 | <.001 |
| RoBERTa-based | 0.8778 | <.001 |
| RoBERTa-mimic | 0.8705 | <.001 |
| RoBERTa-large | | <.001 |
| BERT-large + XLNet-largeb | 0.8764 | <.001 |
| BERT-large + RoBERTa-large | 0.8914 | <.001 |
| RoBERTa-large + XLNet-large | 0.8854 | <.001 |
| BERT-large + XLNet-large + RoBERTa-large | 0.8452 | <.001 |
aBERT: Bidirectional Encoder Representations from Transformers.
bThe challenge submissions.
cThe best challenge submission (ranked 3rd).
dRoBERTa: Robustly optimized BERT approach.
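The two statistics in each row can be computed with scipy.stats.pearsonr, which returns the correlation coefficient together with its P value; the scores below are toy values, not the paper's predictions:

```python
from scipy.stats import pearsonr

gold = [0.0, 1.5, 2.5, 3.5, 5.0]  # gold similarity scores (toy values)
pred = [0.4, 1.2, 2.8, 3.1, 4.7]  # model predictions (toy values)
r, p = pearsonr(gold, pred)       # coefficient and its P value
print(f"Pearson r = {r:.4f}, P = {p:.3g}")
```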
Two examples from STS-Clinic used for the transformer attention visualization in Figure 3.
| Category | Sentence pair | Gold standard | Prediction |
| Training | S1a: advised to contact us with questions or concerns. S2: please do not hesitate to contact me with any further questions. | 5 | N/Ab |
| Test | S1: patient discharged ambulatory without further questions or concerns noted. S2: please contact location at phone number with any questions or concerns regarding this patient. | 0 | 2.5 |
aS: sentence.
bN/A: not applicable.
Figure 3. Transformer model attention visualization on two examples from STS-Clinic. STS: semantic textual similarity.
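A minimal sketch of how such an attention map can be produced with Hugging Face transformers: request attention weights at inference time and render one layer and head as a token-by-token heatmap. The model name and the layer/head choice here are illustrative, not necessarily the authors' configuration.

```python
import torch
import matplotlib.pyplot as plt
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

enc = tokenizer(
    "patient discharged ambulatory without further questions or concerns noted.",
    "please contact location at phone number with any questions or concerns regarding this patient.",
    return_tensors="pt")
with torch.no_grad():
    attentions = model(**enc).attentions  # one (batch, heads, seq, seq) tensor per layer

attn = attentions[-1][0, 0].numpy()  # last layer, first head (arbitrary choice)
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
plt.imshow(attn, cmap="viridis")
plt.xticks(range(len(tokens)), tokens, rotation=90, fontsize=6)
plt.yticks(range(len(tokens)), tokens, fontsize=6)
plt.tight_layout()
plt.savefig("attention.png", dpi=200)
```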