Zehao Yu, Xi Yang, Gianna L Sweeting, Yinghan Ma, Skylar E Stolte, Ruogu Fang, Yonghui Wu.
Abstract
BACKGROUND: Diabetic retinopathy (DR) is a leading cause of blindness in American adults. If detected, DR can be treated to prevent further damage causing blindness. There is increasing interest in developing artificial intelligence (AI) technologies to help detect DR using electronic health records. The lesion-related information documented in fundus image reports is a valuable resource that could help diagnose DR in clinical decision support systems. However, most studies of AI-based DR diagnosis are based on medical images; few studies have explored the lesion-related information captured in free-text image reports.
Keywords: Deep learning; Diabetic retinopathy; Named entity recognition; Natural language processing; Relation extraction
Year: 2022 PMID: 36167551 PMCID: PMC9513862 DOI: 10.1186/s12911-022-01996-2
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 3.298
Fig. 1 An example of brat annotation for diabetic retinopathy (DR)
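To make the annotation format behind Fig. 1 concrete, the following is a minimal sketch of reading brat standoff annotations (`.ann` files). The concept labels `Lesion`, `Severity`, and `Laterality` come from the concept tables in this record; the sample sentence, the relation name `Lesion_Severity`, and the attribute name `Negated` are hypothetical and are not taken from the paper's annotation scheme. The sketch also assumes contiguous spans only (brat can encode discontinuous spans with `;`, which this parser does not handle).

```python
from dataclasses import dataclass


@dataclass
class Entity:
    id: str
    label: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive
    text: str


def parse_brat(ann_text):
    """Parse brat standoff lines into entities, relations, and attributes."""
    entities, relations, attributes = {}, [], []
    for line in ann_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        ann_id = fields[0]
        if ann_id.startswith("T"):
            # Text-bound annotation: "T1<TAB>Label start end<TAB>covered text"
            label, start, end = fields[1].split(" ")
            entities[ann_id] = Entity(ann_id, label, int(start), int(end), fields[2])
        elif ann_id.startswith("R"):
            # Binary relation: "R1<TAB>Type Arg1:T2 Arg2:T1"
            rel_type, arg1, arg2 = fields[1].split(" ")
            relations.append((rel_type, arg1.split(":")[1], arg2.split(":")[1]))
        elif ann_id.startswith("A"):
            # Binary attribute: "A1<TAB>Name T4"
            parts = fields[1].split(" ")
            attributes.append((parts[0], parts[1]))
    return entities, relations, attributes


# Hypothetical report sentence and annotations, for illustration only.
SAMPLE_TEXT = "Mild hemorrhage in the right eye. No exudates."
SAMPLE_ANN = "\n".join([
    "T1\tSeverity 0 4\tMild",
    "T2\tLesion 5 15\themorrhage",
    "T3\tLaterality 23 32\tright eye",
    "T4\tLesion 37 45\texudates",
    "R1\tLesion_Severity Arg1:T2 Arg2:T1",
    "A1\tNegated T4",
])

entities, relations, attributes = parse_brat(SAMPLE_ANN)
for ent in entities.values():
    print(ent.id, ent.label, repr(SAMPLE_TEXT[ent.start:ent.end]))
```

Keeping character offsets alongside the covered text makes it easy to check that each annotation still lines up with the report it was drawn from.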
Distribution of concepts in the training and test sets
| | Training set | Test set | Total | Example concept |
|---|---|---|---|---|
| Total notes | 391 | 145 | 536 | |
| Lesion | 2,383 | 896 | 3,279 | ‘hemorrhage’ |
| Laterality | 1,280 | 485 | 1,765 | ‘right eye’ |
| Severity | 579 | 249 | 828 | ‘mild’ |
| Eye part | 45 | 17 | 62 | ‘foveal’ |
| Total concepts | 4,287 | 1,647 | 5,934 | |
Distribution of negation attributes in the training and test sets
| | Training set | Test set | Total |
|---|---|---|---|
| Total notes | 391 | 145 | 536 |
| Non-negated_lesion | 2,057 | 747 | 2,804 |
| Negated_lesion | 416 | 149 | 901 |
Performance comparison for concept extraction
| NLP model | Strict precision | Strict recall | Strict F1 score | Lenient precision | Lenient recall | Lenient F1 score |
|---|---|---|---|---|---|---|
| LSTM_general | 0.9492 | 0.9186 | 0.9337 | 0.9630 | 0.9320 | 0.9472 |
| LSTM_mimic | 0.9464 | 0.8682 | 0.9056 | 0.9609 | 0.8810 | 0.9192 |
| BERT_general | 0.8885 | 0.9575 | 0.9217 | 0.9067 | 0.9739 | 0.9391 |
| BERT_mimic | 0.9486 | 0.952 | **0.9503** | 0.9642 | 0.9648 | **0.9645** |
| RoBERTa_general | 0.9248 | 0.9636 | 0.9438 | 0.9353 | 0.9739 | 0.9542 |
| RoBERTa_mimic | 0.9391 | 0.9551 | 0.947 | 0.9498 | 0.9654 | 0.9575 |
*Best F1 scores are highlighted in bold
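The record does not define the strict and lenient criteria. The sketch below assumes the convention common in clinical concept extraction evaluations: a strict match requires the same label and identical span boundaries, while a lenient match requires the same label and any character overlap. The function names and the toy spans are hypothetical and are not the authors' evaluation script.

```python
def overlaps(a, b):
    """True if half-open character spans a and b share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def is_match(pred, gold, lenient):
    """Compare two (start, end, label) tuples under the assumed criteria."""
    if pred[2] != gold[2]:
        return False
    if lenient:
        return overlaps(pred, gold)
    return (pred[0], pred[1]) == (gold[0], gold[1])


def score(gold, pred, lenient=False):
    """Return (precision, recall, F1) for lists of (start, end, label) spans."""
    # Precision counts predicted spans that match some gold span;
    # recall counts gold spans that are matched by some predicted span.
    pred_hits = sum(any(is_match(p, g, lenient) for g in gold) for p in pred)
    gold_hits = sum(any(is_match(p, g, lenient) for p in pred) for g in gold)
    precision = pred_hits / len(pred) if pred else 0.0
    recall = gold_hits / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Toy example: one exact match, one boundary error, one miss, one spurious span.
gold = [(0, 4, "Severity"), (5, 15, "Lesion"), (23, 32, "Laterality")]
pred = [(5, 15, "Lesion"), (19, 32, "Laterality"), (37, 45, "Lesion")]

strict_scores = score(gold, pred, lenient=False)
lenient_scores = score(gold, pred, lenient=True)
print("strict :", strict_scores)
print("lenient:", lenient_scores)
```

Under these assumptions a boundary error such as `(19, 32)` against `(23, 32)` counts against the strict score but not the lenient one, which is why lenient scores in the table are never lower than the strict ones.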
Detailed performance for each concept category for BERT_mimic
| Concept category | Strict precision | Strict recall | Strict F1 score | Lenient precision | Lenient recall | Lenient F1 score |
|---|---|---|---|---|---|---|
| Lesion | 0.9555 | 0.9576 | 0.9565 | 0.9776 | 0.9743 | 0.976 |
| Severity | 0.9627 | 0.9317 | 0.9469 | 0.9668 | 0.9357 | 0.951 |
| Eye part | 0.8 | 0.7059 | 0.75 | 0.8 | 0.7059 | 0.75 |
| Laterality | 0.9339 | 0.9608 | 0.9472 | 0.9439 | 0.9711 | 0.9573 |
| Overall | 0.9486 | 0.952 | 0.9503 | 0.9642 | 0.9648 | 0.9645 |
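Because the F1 score is the harmonic mean of precision and recall, each F1 value in the per-category table can be recomputed from the other two columns. The short check below is a sketch that does this for the BERT_mimic rows; the helper names and the tolerance are assumptions, with the tolerance chosen to absorb the four-decimal rounding of the reported values.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the BERT_mimic per-category table.
ROWS = {
    "Lesion (strict)": (0.9555, 0.9576, 0.9565),
    "Lesion (lenient)": (0.9776, 0.9743, 0.976),
    "Severity (strict)": (0.9627, 0.9317, 0.9469),
    "Severity (lenient)": (0.9668, 0.9357, 0.951),
    "Eye part (strict)": (0.8, 0.7059, 0.75),
    "Eye part (lenient)": (0.8, 0.7059, 0.75),
    "Laterality (strict)": (0.9339, 0.9608, 0.9472),
    "Laterality (lenient)": (0.9439, 0.9711, 0.9573),
    "Overall (strict)": (0.9486, 0.952, 0.9503),
    "Overall (lenient)": (0.9642, 0.9648, 0.9645),
}


def max_discrepancy(rows):
    """Largest absolute gap between recomputed and reported F1."""
    return max(abs(f1_score(p, r) - reported) for p, r, reported in rows.values())


for name, (p, r, reported) in ROWS.items():
    print(f"{name}: recomputed {f1_score(p, r):.4f}, reported {reported}")

# Inputs are rounded to four decimals, so allow a small tolerance.
TOLERANCE = 2e-4
print("all rows consistent:", max_discrepancy(ROWS) < TOLERANCE)
```

The same check can be pointed at any row of the other performance tables where precision, recall, and F1 are all present.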
Performance comparison for relation extraction models
| Settings | NLP model | Strict precision | Strict recall | Strict F1 score | Lenient precision | Lenient recall | Lenient F1 score |
|---|---|---|---|---|---|---|---|
| Use gold-standard concepts | BERT_general | 0.9199 | 0.9437 | 0.9199 | 0.9437 | | |
| | RoBERTa_general | 0.9024 | 0.9291 | 0.9024 | 0.9291 | | |
| | BERT_MIMIC | 0.9254 | 0.9254 | 0.9254 | 0.9254 | | |
| | RoBERTa_MIMIC | 0.9147 | 0.9467 | 0.9304 | 0.9147 | 0.9467 | 0.9304 |
| End-to-end | BERT_general_e2e | 0.8767 | 0.9056 | | | | |
| | RoBERTa_general_e2e | 0.8274 | 0.8565 | 0.8861 | | | |
| | BERT_MIMIC_e2e | 0.8282 | 0.8584 | 0.843 | 0.8584 | 0.8858 | 0.8719 |
| | RoBERTa_MIMIC_e2e | 0.8362 | 0.8782 | 0.8567 | 0.8688 | 0.9072 | 0.8876 |
*Best precision, recall, and F1 are highlighted in bold. The strict and lenient scores are identical in the ‘gold-standard’ setting because the gold-standard annotations for concepts and attributes were used