Mohammed Ibrahim1, Susan Gauch1, Omar Salman1, Mohammed Alqahtani1.
Abstract
BACKGROUND: Clear language makes communication easier between any two parties. A layman may have difficulty communicating with a professional because they do not understand the specialized terms common to the domain. In healthcare, it is rare to find a layman knowledgeable in medical terminology, which can lead to poor understanding of their condition and/or treatment. To bridge this gap, several professional vocabularies and ontologies have been created to map laymen medical terms to professional medical terms and vice versa.
Keywords: Consumer health vocabulary; Ontologies; Vocabulary enrichment; Word embedding
Year: 2021 PMID: 34458573 PMCID: PMC8371999 DOI: 10.7717/peerj-cs.668
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
Sample UMLS concepts and some of their OAC CHV associated laymen terms.
| CUI | UMLS Concept | Associated laymen terms | |||
|---|---|---|---|---|---|
| C0018681 | Headache | headache | headaches | head ache | ache head |
| C0003864 | Arthritis | arthritis | arthritides | arthritide | |
| C0033860 | Psoriasis | Psoriasis | psoriasi | psoriasis | |
Figure 1. Methodology of finding new laymen terms.
Figure 2. Methodology of improved GloVe with WordNet corpus enhancement.
MedHelp.org community corpus statistics.
| No. | Community | Posts | Tokens |
|---|---|---|---|
| 1. | Addiction | 82,488 | 32,871,561 |
| 2. | Pregnancy | 308,677 | 33,989,647 |
| 3. | Hepatitis-C | 46,894 | 21,142,999 |
| 4. | Neurology | 62,061 | 9,394,044 |
| 5. | Dermatology | 67,109 | 8,615,484 |
| 6. | STDs/STIs | 59,774 | 7,275,289 |
| 7. | Gastroenterology | 43,394 | 6,322,356 |
| 8. | Women health | 66,336 | 5,871,323 |
| 9. | Heart Disease | 33,442 | 5,735,739 |
| 10. | Eye Care | 31,283 | 4,281,328 |
| Total | | 801,458 | 135,499,770 |
UMLS concepts with their seed terms from the MedlinePlus dataset.
| CUI | Medical concept | Associated Laymen terms | ||
|---|---|---|---|---|
| C0043246 | laceration | lacer | torn | tear |
| C0015672 | fatigue | weariness | tired | fatigued |
| C0021400 | influenza | flu | influenza | grippe |
Figure 3. (A) Size of the OAC CHV dataset compared to the MedlinePlus dataset. (B) Shared concepts and their laymen terms between the MedlinePlus and OAC CHV datasets.
Figure 4. The macro F-score for the GloVe algorithm with different vector and window sizes.
The micro-precision of GloVe.
| Vector size | NumCon | Micro precision | Micro recall | Micro F-score |
|---|---|---|---|---|
| 100 | 420 | | 38.91 | 8.51 |
| 200 | 444 | | 41.33 | 9 |
| 300 | 442 | | 42.02 | 9.19 |
| 400 | 457 | 5.28 | 42.97 | 9.41 |
Note:
The micro-precision is highlighted in bold.
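The table reports micro-averaged scores, while the later tables report macro-averaged ones. As a minimal sketch of how the two averages are conventionally computed, the following uses invented per-concept counts; the function names and the toy data are assumptions for illustration and are not taken from the paper. The macro F-score here is the harmonic mean of the averaged precision and recall, which is one common convention and roughly matches the precision, recall, and F-score triples in the macro tables.

```python
from typing import List, Tuple

# (true positives, false positives, false negatives) for one concept.
Counts = Tuple[int, int, int]


def safe_div(num: float, den: float) -> float:
    """Divide, returning 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return safe_div(2 * precision * recall, precision + recall)


def micro_scores(counts: List[Counts]) -> Tuple[float, float, float]:
    """Pool the counts over all concepts, then compute P, R, F once."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    p = safe_div(tp, tp + fp)
    r = safe_div(tp, tp + fn)
    return p, r, f_score(p, r)


def macro_scores(counts: List[Counts]) -> Tuple[float, float, float]:
    """Compute P and R per concept, average them, then take F of the averages."""
    if not counts:
        return 0.0, 0.0, 0.0
    precisions = [safe_div(tp, tp + fp) for tp, fp, _ in counts]
    recalls = [safe_div(tp, tp + fn) for tp, _, fn in counts]
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    return p, r, f_score(p, r)


# Invented toy counts for two concepts (not data from the paper).
toy_counts = [(2, 8, 1), (1, 9, 3)]

if __name__ == "__main__":
    print("micro:", micro_scores(toy_counts))
    print("macro:", macro_scores(toy_counts))
```

Micro averaging weights every candidate term equally, so concepts with many candidates dominate; macro averaging weights every concept equally.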
Evaluation of the basic GloVe, GloVeSyno, GloVeHypo, and GloVeHyper algorithms over the OAC CHV and MedlinePlus datasets.
| Algorithm | NumCon | Macro precision | Macro recall | Macro F-score | MRR |
|---|---|---|---|---|---|
| OAC CHV | | | | | |
| Basic GloVe | 457 | 48.46 | 48.41 | 48.44 | 0.29 |
| GloVeSyno | | | | | |
| GloVeHypo | 280 | 29.69 | 29.66 | 29.68 | 0.33 |
| GloVeHyper | 433 | 45.92 | 45.87 | 45.89 | 0.35 |
| MedlinePlus | | | | | |
| Basic GloVe | 48 | 51.06 | 47.52 | 49.23 | 0.38 |
| GloVeSyno | | | | | |
| GloVeHypo | 32 | 33.33 | 31.68 | 32.49 | 0.37 |
| GloVeHyper | 35 | 37.23 | 34.65 | 35.9 | 0.35 |
Note:
The highest results reported by the algorithms over the two ground truth datasets are highlighted in bold.
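The MRR column is the mean reciprocal rank. A minimal sketch of the standard definition follows, assuming each concept has a ranked candidate list and a set of ground-truth terms; the function names and the toy rankings are hypothetical and are not the paper's data.

```python
from typing import Iterable, List, Set, Tuple


def reciprocal_rank(candidates: List[str], relevant: Set[str]) -> float:
    """Return 1/rank of the first relevant candidate, or 0.0 if none is relevant."""
    for rank, term in enumerate(candidates, start=1):
        if term in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs: Iterable[Tuple[List[str], Set[str]]]) -> float:
    """Average the reciprocal rank over all (candidates, relevant) pairs."""
    scores = [reciprocal_rank(cands, rel) for cands, rel in runs]
    return sum(scores) / len(scores) if scores else 0.0


# Invented rankings for three concepts (illustration only).
toy_runs = [
    (["cold", "flu", "grippe"], {"flu", "grippe"}),  # first hit at rank 2
    (["tired", "sleepy"], {"tired"}),                # first hit at rank 1
    (["cut", "scrape"], {"tear"}),                   # no hit
]

if __name__ == "__main__":
    print(mean_reciprocal_rank(toy_runs))
```

A higher MRR means the first correct laymen term tends to appear nearer the top of the candidate list.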
The average results of the basic GloVe, GloVeSyno, GloVeHypo, and GloVeHyper algorithms over the OAC CHV and MedlinePlus datasets.
| Algorithm | NumCon | Macro precision | Macro recall | Macro F-score | MRR | F-score Rel-Improv. |
|---|---|---|---|---|---|---|
| Basic GloVe | 252.5 | 49.76 | 47.965 | 48.835 | 0.335 | |
| GloVeSyno | | | | | | |
| GloVeHypo | 156 | 31.51 | 30.67 | 31.085 | 0.350 | −36% |
| GloVeHyper | 234 | 41.575 | 40.26 | 40.895 | 0.350 | −16% |
Note:
Best results have been highlighted with bold.
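The averaged values are consistent with a plain mean of the per-dataset scores, and the relative-improvement column is consistent with the percentage change in average F-score against Basic GloVe. That reading is an assumption, but it reproduces the −36% and −16% entries. A short sketch of the arithmetic, with hypothetical function names:

```python
def average(a: float, b: float) -> float:
    """Mean of the OAC CHV and MedlinePlus scores."""
    return (a + b) / 2


def relative_improvement(score: float, baseline: float) -> float:
    """Percentage change of `score` relative to `baseline`."""
    return (score - baseline) / baseline * 100


# Macro F-scores from the per-dataset table (OAC CHV, MedlinePlus).
basic_f = average(48.44, 49.23)
hypo_f = average(29.68, 32.49)
hyper_f = average(45.89, 35.9)

if __name__ == "__main__":
    print(round(relative_improvement(hypo_f, basic_f)))   # prints -36
    print(round(relative_improvement(hyper_f, basic_f)))  # prints -16
```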
Sample of the GloVeSyno output (seeds stemmed).
| CUI | Seed | Candidate Synonyms | ||||||
|---|---|---|---|---|---|---|---|---|
| C0015967 | feverish | febric | | | | chili_pepp | chilli | influenza |
| C0020505 | overeat | gormand | pig_out | ingurgit | gormandis | scarf_out | overindulg | gourmand |
| C0013604 | edema | | | | swell | puffi | ascit | crestless |
| C0039070 | syncop | | deliquium | | vasovag | neurocardi | dizzi | lighthead |
| C0015726 | fear | | | | terrifi | | panic | anxieti |
| C0014544 | seizur | rictus | | raptus | prehend | shanghaier | seizer | clutch |
| C0036916 | stds | | gonorrhea | encount | chlamydia | hiv | herp | syphili |
Note:
The candidate synonyms that appear in the ground truth list of synonyms are highlighted with bold.
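Candidate lists like these are typically obtained by ranking vocabulary terms by their embedding similarity to the seed term. The sketch below assumes cosine similarity over word vectors; the record does not state the similarity measure, and the two-dimensional vectors and function names are invented for illustration rather than taken from a trained GloVe model.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def top_k_neighbours(seed: str, vectors: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k terms most similar to `seed`, excluding the seed itself."""
    seed_vec = vectors[seed]
    scored = [
        (term, cosine(seed_vec, vec))
        for term, vec in vectors.items()
        if term != seed
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [term for term, _ in scored[:k]]


# Invented 2-d vectors; real embeddings would have hundreds of dimensions.
toy_vectors = {
    "edema": [1.0, 0.0],
    "swell": [0.9, 0.1],
    "puffi": [0.8, 0.3],
    "clutch": [0.0, 1.0],
}

if __name__ == "__main__":
    print(top_k_neighbours("edema", toy_vectors, k=2))
```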
Figure 5. Micro F-score and the number of concepts for the GloVeSyno algorithm over the OAC CHV dataset.
Figure 6. Micro F-score and the number of concepts for the GloVeSyno algorithm over the MedlinePlus dataset.
Figure 7. (A & B) F-score results over the precision and recall for the GloVeSyno algorithm over the OAC CHV and MedlinePlus datasets.