Yoshimasa Kawazoe, Daisaku Shibata, Emiko Shinohara, Eiji Aramaki, Kazuhiko Ohe.
Abstract
Generalized language models pre-trained on large corpora have achieved strong performance on natural language tasks. While many pre-trained transformers for English have been published, few models are available for Japanese text, especially in clinical medicine. In this work, we describe the development of a clinical-specific BERT model trained on a large amount of Japanese clinical text and evaluate it on the NTCIR-13 MedWeb task, which consists of fake Twitter messages about medical concerns annotated with eight labels. Approximately 120 million clinical texts stored at the University of Tokyo Hospital were used as our dataset. BERT-base was pre-trained using the entire dataset and a vocabulary of 25,000 tokens. Pre-training was almost saturated at about 4 epochs, and the accuracies of Masked-LM and Next Sentence Prediction were 0.773 and 0.975, respectively. The developed BERT did not show significantly higher performance on the MedWeb task than the other BERT models that were pre-trained on Japanese Wikipedia text. The advantage of pre-training on clinical text may become apparent in more complex tasks on actual clinical text, and such an evaluation set needs to be developed.
Year: 2021 PMID: 34752490 PMCID: PMC8577751 DOI: 10.1371/journal.pone.0259763
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The schematic view of morphological analysis and wordpiece tokenization.
Fig 2. The schematic view of the Masked-LM and Next Sentence Prediction tasks.
A. Masked LM predicts the original tokens for the masked, replaced, or kept tokens. B. Next Sentence Prediction predicts whether the second sentence in the pair is the subsequent sentence in the original documents. The roles of the special symbols are as follows: [CLS] is added at the front of every input text, and its output vector is used for the Next Sentence Prediction task; [MASK] is the masked token in the Masked-LM task; [SEP] is a break between sentences; [UNK] is an unknown token that does not appear in the vocabulary.
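The "masked, replaced, or kept" behaviour in panel A can be illustrated with a minimal sketch. The 15% selection rate and the 80/10/10 split are the defaults of the original BERT recipe and are assumed here; this record does not state the rates used for UTH-BERT. The function name `mask_tokens` and the toy vocabulary are hypothetical.

```python
import random

MASK, CLS, SEP = "[MASK]", "[CLS]", "[SEP]"

def mask_tokens(tokens, vocab, select_prob=0.15, rng=None):
    """Return (inputs, targets) for a Masked-LM example.

    targets[i] holds the original token at every selected position
    and None elsewhere. Rates are assumed, not taken from the paper.
    """
    rng = rng or random.Random()
    inputs = list(tokens)
    targets = [None] * len(tokens)
    for i, tok in enumerate(tokens):
        # Special symbols are never selected for prediction.
        if tok in (CLS, SEP) or rng.random() >= select_prob:
            continue
        targets[i] = tok                   # the model must recover this token
        r = rng.random()
        if r < 0.8:
            inputs[i] = MASK               # masked
        elif r < 0.9:
            inputs[i] = rng.choice(vocab)  # replaced by a random token
        # otherwise: kept unchanged, but still predicted
    return inputs, targets

tokens = [CLS, "風邪", "で", "鼻", "##づまり", SEP]
toy_vocab = ["風邪", "で", "鼻", "##づまり", "頭", "##痛"]
inputs, targets = mask_tokens(tokens, toy_vocab, rng=random.Random(0))
```

Keeping some selected tokens unchanged means the model cannot assume that every non-[MASK] input token is correct, which is the usual motivation given for this scheme.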
Three examples of pseudo-tweets with the eight classes of symptoms.
| No. | Lang | Pseudo-tweets | Flu | Diarrhea | Hay fever | Cough | Headache | Fever | Runny nose | Cold |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ja | 風邪で鼻づまりがやばい。 | N | N | N | N | N | N | P | P |
| | en | I have a cold, which makes my nose stuffy like crazy. | | | | | | | | |
| | zh | 感冒引起的鼻塞很烦人。 | | | | | | | | |
| 2 | ja | 花粉症のせいでずっと微熱でぼーっとしてる。眠い。 | N | N | P | N | N | P | P | N |
| | en | I’m so feverish and out of it because of my allergies. I’m so sleepy. | | | | | | | | |
| | zh | 由于花粉症一直发低烧, 晕晕沉沉的。很困。 | | | | | | | | |
| 3 | ja | 鼻風邪かなと思ってたけど、頭痛もしてきたから今日は休むことにしよう。 | N | N | N | N | P | N | P | P |
| | en | It was just a cold and a runny nose, but now my head is starting to hurt, so I’m gonna take a day off today. | | | | | | | | |
| | zh | 想着或许是鼻伤风, 可头也开始疼了, 所以今天就休息吧。 | | | | | | | | |
The English (en) and Chinese (zh) sentences were translated from Japanese (ja).
Fig 3. The schematic view of the network for evaluation.
The specifications of each BERT.
| | UTH-BERT | KU-BERT | TU-BERT | mBERT |
|---|---|---|---|---|
| Publisher | The University of Tokyo Hospital | The University of Kyoto | The University of Tohoku | |
| Language | Japanese | Japanese | Japanese | Multilingual |
| Pre-training corpus | Clinical text (120 million) | JP Wikipedia (18 million) | JP Wikipedia (18 million) | 104 languages of Wikipedias |
| Tokenizer: morphological analyzer | MeCab | Juman++ | MeCab | - |
| Tokenizer: external dictionary | Mecab-ipadic-neologd, J-MeDic | - | Mecab-ipadic | - |
| Vocabulary size | 25,000 | 32,000 | 32,000 | 119,448 |
| Total number of [UNK] tokens in the MedWeb dataset | 253 (0.68%) | 394 (1.11%) | 369 (0.94%) | 1 (0.00%) |
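To make the [UNK] row concrete, the sketch below shows how a WordPiece tokenizer can emit [UNK] and how an unknown-token rate could be counted. It assumes the greedy longest-match-first scheme of the reference BERT tokenizer and words already segmented by a morphological analyzer; the toy vocabulary and the helper names `wordpiece` and `unk_rate` are hypothetical, and the way the percentages in the table were actually computed is not stated in this record.

```python
UNK = "[UNK]"

def wordpiece(word, vocab):
    """Greedy longest-match-first split of one word into subword pieces."""
    pieces, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        while start < end:
            cand = word[start:end]
            if start > 0:
                cand = "##" + cand   # continuation pieces carry a ## prefix
            if cand in vocab:
                piece = cand
                break
            end -= 1                 # try a shorter candidate
        if piece is None:
            return [UNK]             # no piece fits: the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

def unk_rate(words, vocab):
    """Fraction of emitted tokens that are [UNK]."""
    tokens = [p for w in words for p in wordpiece(w, vocab)]
    return tokens.count(UNK) / len(tokens) if tokens else 0.0

toy_vocab = {"風邪", "鼻", "##づまり", "頭", "##痛"}
print(wordpiece("鼻づまり", toy_vocab))                 # ['鼻', '##づまり']
print(unk_rate(["風邪", "花粉症", "頭痛"], toy_vocab))  # 0.25
```

Under this scheme a larger vocabulary, or one built from text closer to the target domain, tends to leave fewer words without any matching piece.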
Accuracies of Masked-LM and Next Sentence Prediction in pre-training for the evaluation dataset.
| UTH-BERT: number of training steps (epochs) | 2.5 × 10⁶ (1) | 5.0 × 10⁶ (2) | 7.5 × 10⁶ (3) | 10 × 10⁶ (4) |
|---|---|---|---|---|
| Masked LM (accuracy) | 0.743 | 0.758 | 0.768 | 0.773 |
| Next Sentence Prediction (accuracy) | 0.966 | 0.970 | 0.973 | 0.975 |
The exact-match accuracy of each model with five-fold cross validation.
| Model name | Exact match accuracy (95% CI) |
|---|---|
| UTH-BERT | 0.855 (0.848–0.862) |
| KU-BERT | 0.845 (0.833–0.857) |
| TU-BERT | 0.862 (0.857–0.866) |
| mBERT | 0.806 (0.794–0.817) |
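As a rough illustration of the metric, exact-match accuracy for a multi-label task is usually taken as the fraction of messages for which all labels are predicted correctly. The sketch below assumes that definition; the function name and the predictions are hypothetical, while the gold labels follow the three pseudo-tweet examples (label order: Flu, Diarrhea, Hay fever, Cough, Headache, Fever, Runny nose, Cold).

```python
def exact_match_accuracy(y_true, y_pred):
    """Fraction of samples whose full label vector is predicted correctly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    hits = sum(tuple(t) == tuple(p) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)

gold = [
    [0, 0, 0, 0, 0, 0, 1, 1],  # tweet 1: runny nose, cold
    [0, 0, 1, 0, 0, 1, 1, 0],  # tweet 2: hay fever, fever, runny nose
    [0, 0, 0, 0, 1, 0, 1, 1],  # tweet 3: headache, runny nose, cold
]
pred = [
    [0, 0, 0, 0, 0, 0, 1, 1],  # all eight labels correct
    [0, 0, 1, 0, 0, 0, 1, 0],  # fever missed, so the whole tweet counts as wrong
    [0, 0, 0, 0, 1, 0, 1, 1],  # all eight labels correct
]
print(exact_match_accuracy(gold, pred))  # 2 of 3 tweets match exactly
```

Because a single wrong label fails the whole message, this metric is stricter than any of the per-label scores.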
The label-wise performances of each model with five-fold cross validation.
| Model | Measure | Flu | Diarrhea | Hay fever | Cough | Headache | Fever | Runny nose | Cold | Mean F-measure (95% CI) |
|---|---|---|---|---|---|---|---|---|---|---|
| UTH | Recall/Precision | 0.676/0.858 | 0.914/0.919 | 0.904/0.835 | 0.928/0.963 | 0.947/0.974 | 0.797/0.905 | 0.920/0.927 | 0.885/0.913 | 0.888 (0.846–0.931) |
| UTH | F-measure | | 0.916 | 0.865 | | | 0.845 | 0.923 | 0.898 | |
| KU | Recall/Precision | 0.594/0.842 | 0.877/0.956 | 0.896/0.896 | 0.892/0.963 | 0.947/0.958 | 0.760/0.927 | 0.921/0.927 | 0.890/0.936 | 0.882 (0.828–0.935) |
| KU | F-measure | 0.694 | 0.915 | | 0.926 | 0.952 | 0.835 | 0.924 | | |
| TU | Recall/Precision | 0.735/0.692 | 0.927/0.947 | 0.898/0.874 | 0.916/0.957 | 0.936/0.982 | 0.825/0.903 | 0.912/0.938 | 0.884/0.904 | 0.888 (0.837–0.939) |
| TU | F-measure | 0.710 | | 0.885 | 0.936 | 0.958 | | | 0.893 | |
| mBERT | Recall/Precision | 0.598/0.850 | 0.867/0.906 | 0.870/0.822 | 0.918/0.892 | 0.928/0.927 | 0.745/0.890 | 0.869/0.893 | 0.902/0.887 | 0.855 (0.807–0.902) |
| mBERT | F-measure | 0.696 | 0.885 | 0.841 | 0.905 | 0.926 | 0.810 | 0.879 | 0.894 | |
| Mean | F-measure | 0.714 | 0.913 | 0.872 | 0.928 | 0.949 | 0.838 | 0.913 | 0.899 | |
The performances shown are Recall, Precision and F-measure.
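For reference, the F-measure is the harmonic mean of recall and precision. The short sketch below (the function name is hypothetical) recomputes it from two Recall/Precision pairs reported above. Recomputed values may differ from reported ones in the third decimal, presumably because the reported figures are averaged over the five folds.

```python
def f_measure(recall, precision):
    """Harmonic mean of recall and precision (F1)."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)

# UTH, Diarrhea: recall 0.914, precision 0.919
print(round(f_measure(0.914, 0.919), 3))  # 0.916
# mBERT, Cough: recall 0.918, precision 0.892
print(round(f_measure(0.918, 0.892), 3))  # 0.905
```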
Interpretations obtained from the results of the error analysis.
| No. | Error | Cause of the error | Num. of errors | Example sentence | Incorrect prediction |
|---|---|---|---|---|---|
| 1 | False positive (FP) | Co-occurring symptoms | 10 | (ja) インフルかと思って病院行ったけど、検査したら違ったよ。 | Fever pos. |
| | | | | (en) I thought I had the flu so I went to the doctor, but I got tested and I was wrong. | |
| 2 | False positive (FP) | Symptoms mentioned in general topics | 8 | (ja) 風邪といえば鼻づまりですよね。 | Cold pos. Runny nose pos. |
| | | | | (en) To me, a cold means a stuffy nose. | |
| 3 | False positive (FP) | Suspected influenza | 5 | (ja) インフルエンザかもしれないから部活休もうかな。 | Flu pos. |
| | | | | (en) I might have the flu so I’m thinking I’ll skip the club meeting. | |
| 4 | False positive (FP) | Fully recovered symptoms | 5 | (ja) やっと咳と痰が治まった。 | Cough pos. |
| | | | | (en) My cough and phlegm are finally cured. | |
| 5 | False positive (FP) | Metaphorical expressions | 3 | (ja) 熱をあげているのは嫁と娘だ。 | Fever pos. |
| | | | | (en) What makes me excited are my wife and daughter. | |
| 6 | False positive (FP) | Denied symptoms | 2 | (ja) 鼻水が止まらないので熱でもあるのかと思ったけど、全然そんなことなかったわ。 | Fever pos. |
| | | | | (en) My nose won’t stop running, which got me wondering if I have a fever, but as it turns out I definitely do not. | |
| 7 | False positive (FP) | Symptoms for asking unspecified people | 2 | (ja) 誰か熱ある人いない? | Fever pos. |
| | | | | (en) Anyone have a fever? | |
| 8 | False positive (FP) | Past symptoms | 1 | (ja) ネパールにいったら食べ物があわなくてお腹壊して下痢になった・・・ | Diarrhea pos. |
| | | | | (en) When I went to Nepal, the food didn’t agree with me, and I got an upset stomach and diarrhea... | |
| 9 | False negative (FN) | Symptoms that are directly expressed | 8 | (ja) 痰が止まったとおもったらこんどは頭痛。 | Headache neg. |
| | | | | (en) Just when I thought the phlegm was over, now I have a headache | |
| 10 | False negative (FN) | Symptoms that are indirectly expressed | 5 | (ja) 中国にいた時は花粉症ならなかったのに再発したー! | Runny nose neg. |
| | | | | (en) Even though I didn’t have allergies when I was in China, they’re back! | |
| 11 | False negative (FN) | Symptoms that can be inferred to be positive by being a tweet from a person | 4 | (ja) 今日花粉少ないとか言ってるやつ花粉症じゃないから。 | Runny nose neg. |
| | | | | (en) The people who are saying there’s not a lot of pollen today don’t have allergies. | |
| 12 | False negative (FN) | Symptoms that are in the recovery process | 1 | (ja) インフルが回復してきてだいぶ元気になった!けどあと2日は外出禁止なんだよな。 | Flu neg. |
| | | | | (en) I’ve recovered from the flu and feel great! But I’m still not allowed to go out for two days. | |
| 13 | False negative (FN) | Symptoms occurring in the tweeter’s neighborhood | 1 | (ja) うちのクラス、集団で下痢事件 | Diarrhea neg. |
| | | | | (en) There’s a diarrhea outbreak in my class | |