| Literature DB >> 35761935 |
Hossam Faris1,2,3, Mohammad Faris3, Maria Habib3, Alaa Alomari3,4.
Abstract
Automatic symptom identification plays a crucial role in assisting doctors during the diagnosis process in telemedicine. In general, physicians spend considerable time on clinical documentation and symptom identification, which is unfeasible given their full schedules. With text-based consultation services in telemedicine, identifying symptoms from a user's consultation is a sophisticated and time-consuming process. Moreover, at Altibbi, the Arabic telemedicine platform that forms the context of this work, users consult doctors and describe their conditions in different Arabic dialects, which makes the problem more complex and challenging. Therefore, in this work, an advanced deep learning approach is developed to identify symptoms from multi-dialect consultations. The approach is formulated as multi-label multi-class classification using features extracted with AraBERT and fine-tuned on a bidirectional long short-term memory (BiLSTM) network. The fine-tuning of the BiLSTM relies on features engineered from different variants of bidirectional encoder representations from transformers (BERT). Evaluating the models based on precision, recall, and a customized hit rate showed successful identification of symptoms from Arabic texts with promising accuracy. Hence, this paves the way toward deploying an automated symptom identification model in production at Altibbi, which can help general practitioners in telemedicine provide more efficient and accurate consultations.
Keywords: Deep learning; Machine learning; Multi-classification; Multi-label; Telemedicine
Year: 2022 PMID: 35761935 PMCID: PMC9233221 DOI: 10.1016/j.heliyon.2022.e09683
Source DB: PubMed Journal: Heliyon ISSN: 2405-8440
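The abstract frames symptom identification as multi-label multi-class classification: each consultation can mention several symptoms at once, so every consultation maps to a binary vector over the symptom vocabulary. A minimal sketch of that label encoding, using hypothetical symptom names purely for illustration:

```python
# Multi-label encoding sketch: each consultation maps to a 0/1 vector
# over the symptom vocabulary (symptom names here are hypothetical).

def build_label_index(symptom_lists):
    """Collect the sorted symptom vocabulary across all consultations."""
    vocab = sorted({s for labels in symptom_lists for s in labels})
    return {symptom: i for i, symptom in enumerate(vocab)}

def binarize(labels, index):
    """Turn one consultation's symptom list into a binary vector."""
    vec = [0] * len(index)
    for s in labels:
        vec[index[s]] = 1
    return vec

# Example: three consultations with overlapping symptoms.
consults = [["fever", "cough"], ["headache"], ["cough", "fatigue"]]
idx = build_label_index(consults)
vectors = [binarize(c, idx) for c in consults]
```

The sigmoid output layer described in the abstract then predicts each position of such a vector independently.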
Figure 1. An illustration of the problem, where users in Altibbi communicate with doctors via phone calls, free question-and-answer, or chat.
Figure 2. An illustrative description of the proposed intelligent symptom identifier.
Figure 3. The frequency of the most recurring symptoms across the dataset.
Figure 4. The distribution of question lengths (a) and the number of symptoms (b).
Summary of the datasets used.
| The Dataset for Embedding | |
|---|---|
| Number of questions | 3,310,996 |
| Vocabulary size before preprocessing | 1,032,093 |
| Vocabulary size | 171,385 |
| The Dataset for the Classifier | |
| Data size after removing duplicates | 578,941 |
| Data size after removing infrequent labels | 567,399 |
| Data size after sampling 5,000 rows per label | 501,004 |
| Number of unique symptoms before preprocessing | 4,689 |
| Number of unique symptoms after preprocessing | 2,348 |
| Vocabulary size | 165,781 |
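The classifier dataset above is deduplicated, stripped of infrequent labels, and then down-sampled to at most 5,000 rows per label. A simplified, single-label sketch of the capping step (the function name and the `(text, label)` row layout are assumptions for illustration, not the paper's code):

```python
from collections import Counter

def cap_per_label(rows, cap):
    """Keep at most `cap` rows per label, in input order.

    rows: iterable of (text, label) pairs; the paper uses cap = 5000.
    """
    seen = Counter()
    kept = []
    for text, label in rows:
        if seen[label] < cap:
            seen[label] += 1
            kept.append((text, label))
    return kept

# Example with a tiny cap of 2: the third "x" row is dropped.
rows = [("a", "x"), ("b", "x"), ("c", "x"), ("d", "y")]
capped = cap_per_label(rows, 2)
```

With multi-label rows the same idea applies, but a row is usually kept if any of its labels is still under the cap.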
Figure 5. A demonstration of deploying the symptom identification model to production in Altibbi.
Parameter settings for fine-tuning AraBERT. Keys: L.R. is the learning rate, B.S. is the batch size, M.T.C is the maximum token count, and Dim. is the embedding dimension.
| Parameters | Value |
|---|---|
| Optimizer | Adam |
| L.R. | 0.001 |
| Activation | Sigmoid |
| Epochs | 10 |
| B.S. | 128 |
| Loss | BCELoss |
| M.T.C | 200 |
| Dim. | 768 |
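The sigmoid activation paired with BCELoss in the table treats each symptom as an independent binary output, which is what makes the multi-label formulation work. A small sketch of that loss, with the table's hyperparameters collected into a config dict (the dict itself is illustrative, not the paper's code):

```python
import math

# Hyperparameters from the fine-tuning table above.
CONFIG = {"optimizer": "Adam", "lr": 0.001, "epochs": 10,
          "batch_size": 128, "max_tokens": 200, "embed_dim": 768}

def sigmoid(x):
    """Squash a logit into a per-symptom probability."""
    return 1.0 / (1.0 + math.exp(-x))

def bce_loss(logits, targets):
    """Mean binary cross-entropy over independent symptom outputs,
    mirroring the sigmoid + BCELoss pairing used for multi-label training."""
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(logits)
```

For a zero logit the predicted probability is 0.5, so the per-symptom loss is ln 2 regardless of the target.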
Parameter settings for AltibbiVec.
| Parameters | Value |
|---|---|
| Window size | 40 |
| Minimum count frequency | 5 |
| Down sampling | 0.01 |
| Epochs | 30 |
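The down-sampling parameter in the table controls how aggressively very frequent words are discarded while training the embedding. A sketch of one common Word2Vec-style keep-probability formula, assuming the `sample = 0.01` setting above; the exact variant used by AltibbiVec's training toolkit may differ:

```python
import math

def keep_probability(word_count, total_words, sample=0.01):
    """Probability of keeping a word occurrence under frequency
    down-sampling: rare words are always kept, frequent words are
    kept with probability sqrt(sample / f), where f is the word's
    corpus frequency fraction (one common Word2Vec variant)."""
    f = word_count / total_words
    if f <= sample:
        return 1.0
    return math.sqrt(sample / f)
```

A word making up 40% of a corpus would be kept only about 16% of the time, while anything at or below the 1% threshold is always kept.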
The results of fine-tuning BiLSTM based on the number of units.
| No. of Units | Recall |
|---|---|
| 8 | 0.442 |
| 16 | 0.496 |
| 24 | 0.514 |
| 32 | 0.518 |
| 40 | 0.523 |
| 48 | 0.523 |
| 56 | |
| 64 | 0.525 |
| 72 | |
| 80 | 0.522 |
Figure 6. A description of the designed methodology, in which different dimensions of the embedding models are used, i.e., 768 for AraBERT and 50, 100, and 200 for AltibbiVec.
The evaluation results based on precision and recall.
| Model | Dim. | Metric | G1 | G2 | G3 | G4 | G5 |
|---|---|---|---|---|---|---|---|
| Word2Vec-SG | 50 | Precision | 0.266 | 0.310 | 0.319 | | 0.291 |
| | | Recall | 0.527 | 0.453 | 0.402 | 0.371 | 0.291 |
| Word2Vec-SG | 100 | Precision | 0.267 | 0.331 | | | |
| | | Recall | 0.541 | 0.468 | | | |
| Word2Vec-SG | 200 | Precision | | | 0.321 | 0.345 | 0.291 |
| | | Recall | | | 0.405 | 0.381 | 0.291 |
| Word2Vec-CBOW | 50 | Precision | 0.255 | 0.304 | 0.308 | 0.337 | 0.309 |
| | | Recall | 0.516 | 0.453 | 0.393 | | 0.309 |
| Word2Vec-CBOW | 100 | Precision | 0.259 | 0.300 | 0.309 | 0.344 | 0.291 |
| | | Recall | 0.518 | 0.450 | 0.392 | 0.369 | 0.286 |
| Word2Vec-CBOW | 200 | Precision | 0.257 | 0.301 | 0.304 | 0.325 | 0.269 |
| | | Recall | 0.517 | 0.448 | 0.389 | 0.374 | 0.263 |
| AraBERT-base-v1 | 768 | Precision | 0.243 | 0.304 | 0.312 | 0.330 | 0.291 |
| | | Recall | 0.481 | 0.441 | 0.395 | 0.369 | 0.291 |
| AraBERT-base-v2 | 768 | Precision | 0.232 | 0.290 | 0.306 | 0.321 | 0.291 |
| | | Recall | 0.444 | 0.418 | 0.385 | 0.364 | 0.291 |
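Precision and recall for multi-label predictions are typically micro-averaged over all (consultation, symptom) pairs. A sketch of that computation, assuming predictions and ground truth are given as sets of labels; the paper's exact averaging scheme is not stated in this record:

```python
def precision_recall(predicted, actual):
    """Micro-averaged precision and recall over multi-label outputs.

    predicted, actual: parallel sequences of label sets, one per
    consultation.
    """
    tp = fp = fn = 0
    for pred, true in zip(predicted, actual):
        tp += len(pred & true)   # symptoms predicted and correct
        fp += len(pred - true)   # symptoms predicted but wrong
        fn += len(true - pred)   # symptoms missed
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical example: two consultations.
preds = [{"fever", "cough"}, {"rash"}]
truth = [{"fever"}, {"rash", "itch"}]
p, r = precision_recall(preds, truth)
```

Here one spurious symptom and one missed symptom give precision and recall of 2/3 each.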
The performance of the BiLSTM model over groups of data based on the number of symptoms.
| Model | Dim. | Group | P1 | P2 | P3 | P4 | P5 | Overall |
|---|---|---|---|---|---|---|---|---|
| Word2Vec-SG | 50 | G1 | 0.541 | - | - | - | - | |
| | | G2 | 0.698 | 0.238 | - | - | - | |
| | | G3 | 0.781 | 0.378 | 0.072 | - | - | |
| | | G4 | 0.898 | 0.489 | 0.176 | 0.011 | - | |
| | | G5 | 0.771 | 0.514 | 0.171 | 0.000 | 0.000 | 0.532 |
| Word2Vec-SG | 100 | G1 | 0.541 | - | - | - | - | |
| | | G2 | 0.696 | 0.240 | - | - | - | |
| | | G3 | 0.791 | 0.371 | 0.066 | - | - | |
| | | G4 | 0.881 | 0.477 | 0.153 | 0.023 | - | |
| | | G5 | 0.771 | 0.571 | 0.229 | 0.029 | 0.000 | 0.532 |
| Word2Vec-SG | 200 | G1 | 0.544 | - | - | - | - | |
| | | G2 | 0.708 | 0.236 | - | - | - | |
| | | G3 | 0.779 | 0.358 | 0.077 | - | - | |
| | | G4 | | 0.443 | | 0.017 | - | |
| | | G5 | 0.771 | 0.457 | 0.200 | 0.029 | 0.000 | 0.535 |
| Word2Vec-CBOW | 50 | G1 | 0.516 | - | - | - | - | |
| | | G2 | 0.684 | 0.221 | - | - | - | |
| | | G3 | 0.779 | 0.343 | 0.057 | - | - | |
| | | G4 | 0.869 | 0.483 | | 0.017 | - | |
| | | G5 | 0.771 | 0.486 | 0.229 | | 0.000 | 0.508 |
| Word2Vec-CBOW | 100 | G1 | 0.524 | - | - | - | - | |
| | | G2 | 0.686 | 0.225 | - | - | - | |
| | | G3 | 0.771 | 0.347 | 0.056 | - | - | |
| | | G4 | 0.875 | 0.443 | 0.148 | | - | |
| | | G5 | 0.743 | 0.486 | 0.200 | 0.029 | 0.000 | 0.515 |
| Word2Vec-CBOW | 200 | G1 | 0.517 | - | - | - | - | |
| | | G2 | 0.683 | 0.222 | - | - | - | |
| | | G3 | 0.770 | 0.352 | 0.055 | - | - | |
| | | G4 | 0.892 | 0.449 | 0.153 | 0.017 | - | |
| | | G5 | 0.771 | 0.400 | 0.143 | 0.029 | 0.000 | 0.509 |
| AraBERT-base-v1 | 768 | G1 | 0.545 | - | - | - | - | |
| | | G2 | 0.719 | 0.262 | - | - | - | |
| | | G3 | | | 0.084 | - | - | |
| | | G4 | 0.795 | 0.384 | 0.086 | 0.016 | - | |
| | | G5 | | | 0.154 | 0.026 | 0.000 | 0.537 |
| AraBERT-base-v2 | 768 | G1 | | - | - | - | - | |
| | | G2 | | | - | - | - | |
| | | G3 | 0.806 | 0.391 | | - | - | |
| | | G4 | 0.881 | 0.483 | 0.148 | 0.017 | - | |
| | | G5 | 0.771 | 0.571 | | | 0.000 | |
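The customized hit rate above is reported per position (P1 through P5) and overall. One plausible reading, sketched below, is that P_k is the fraction of consultations whose predicted symptoms include at least k of the true symptoms; the authors' exact definition may differ:

```python
def hit_rates(predicted, actual, max_k):
    """Per-position hit-rate sketch: for each k in 1..max_k, the fraction
    of consultations whose predictions contain at least k true symptoms.
    (One plausible reading of the paper's customized hit rate.)"""
    n = len(actual)
    return [sum(1 for p, a in zip(predicted, actual)
                if len(set(p) & set(a)) >= k) / n
            for k in range(1, max_k + 1)]

# Hypothetical example: two consultations, top predictions vs. truth.
preds = [["fever", "cough"], ["rash", "itch"]]
truth = [{"fever"}, {"rash", "itch"}]
rates = hit_rates(preds, truth, 2)
```

Both consultations get at least one symptom right (P1 = 1.0), but only the second gets two (P2 = 0.5), matching the pattern in the table where higher positions have lower rates.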
The precision and recall based on the manual evaluation. Keys: A.S.C is the average number of symptoms identified correctly per consultation, and A.L.O.S is the fraction of consultations with at least one symptom identified correctly.
| Metric | Value |
|---|---|
| Recall | 0.706 |
| Precision | 0.233 |
| A.S.C | 1.164 |
| A.L.O.S | 0.711 |
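A.S.C and A.L.O.S summarize the manual evaluation per consultation. A sketch of how both statistics could be computed from per-consultation counts of correctly identified symptoms (the input layout is an assumption for illustration):

```python
def manual_eval_stats(correct_counts):
    """Given the number of correctly identified symptoms for each
    manually reviewed consultation, return:
      A.S.C   - mean correct symptoms per consultation;
      A.L.O.S - fraction of consultations with at least one correct symptom.
    """
    n = len(correct_counts)
    asc = sum(correct_counts) / n
    alos = sum(1 for c in correct_counts if c > 0) / n
    return asc, alos

# Hypothetical review of four consultations.
asc, alos = manual_eval_stats([2, 0, 1, 1])
```

In the paper's manual evaluation these came out to 1.164 correct symptoms per consultation and 0.711, respectively.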