Mark Stevenson, Eneko Agirre, Aitor Soroa.
Abstract
OBJECTIVE: Current techniques for knowledge-based Word Sense Disambiguation (WSD) of ambiguous biomedical terms rely on relations in the Unified Medical Language System Metathesaurus but do not take into account the domain of the target documents. The authors' goal is to improve these methods by using information about the topic of the document in which the ambiguous term appears.
Year: 2011; PMID: 21900701; PMCID: PMC3277615; DOI: 10.1136/amiajnl-2011-000415
Source DB: PubMed; Journal: J Am Med Inform Assoc; ISSN: 1067-5027; Impact factor: 4.497
Contingency table showing distribution of terms in documents
| | Medical Subject Heading code + | Medical Subject Heading code − | Totals |
| Term + | o++ | o+− | o+* |
| Term − | o−+ | o−− | o−* |
| Totals | o*+ | o*− | o** |
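The G2 columns in the results table presumably refer to the log-likelihood ratio statistic G² computed from a contingency table of this shape. This excerpt does not define G2, so that reading is an assumption. Under it, a minimal Python sketch of the computation might look as follows; the function name `g2_statistic` and its argument names are hypothetical and simply mirror the cell labels o++, o+−, o−+ and o−−.

```python
import math

def g2_statistic(o_pp, o_pm, o_mp, o_mm):
    """Log-likelihood ratio (G^2) for a 2x2 term / MeSH-code table.

    Assumption: G2 = 2 * sum(o_ij * ln(o_ij / e_ij)), where e_ij is the
    count expected if term and code were independent.
    """
    observed = [[o_pp, o_pm], [o_mp, o_mm]]
    row_totals = [o_pp + o_pm, o_mp + o_mm]   # o+*, o-*
    col_totals = [o_pp + o_mp, o_pm + o_mm]   # o*+, o*-
    total = sum(row_totals)                   # o**
    if total == 0:
        return 0.0
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o == 0:
                continue  # a zero cell contributes nothing (0 * ln 0 := 0)
            expected = row_totals[i] * col_totals[j] / total
            g2 += o * math.log(o / expected)
    return 2.0 * g2
```

A term that is spread evenly across documents with and without the code scores 0, while a term concentrated in documents carrying the code scores high, which is what would make it a candidate key term for that code.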
Example key terms identified by relevance feedback approach for Medical Subject Heading codes associated with different meanings of ‘culture’
| Cells, cultured | Socio-economic factors |
| Inhibitors | Health |
| Cell | Education |
| Virus | Income |
| Inhibition | Social |
| Assay | Countries |
| Inhibited | Economic |
| Cytotoxicity | Care |
| Staining | Need |
| Virions | Services |
| Epithelial | Children |
Samples of contexts generated for the sentence ‘The main goal of the present study was to determine whether or not oligodendrocytes in culture constitutively express the different βAPP isoforms’ (simplified for brevity).
| Local context | goal#1 present#1 study#1 oligodendrocytes#1 culture#1 different#1 isoforms#1 |
| Key terms | inhibitors#1 cell#1 virus#1 inhibition#1 assay#1 |
| Key terms (IDF) | inhibitors#1.36 cell#1.36 virus#1.36 inhibition#1.36 assay#1.36 |
| Local context and Key terms (IDF) | goal#1 present#1 study#1 oligodendrocytes#1 culture#1 different#1 isoforms#1 inhibitors#1.36 cell#1.36 virus#1.36 inhibition#1.36 assay#1.36 |
IDF, inverse document frequency.
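The `word#weight` notation in the table above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name `format_context` is hypothetical, the key-term weight (1.36 in the table) is taken as a given input because this excerpt does not say how it is derived, and a single shared weight for all key terms is an assumption based on the example rows.

```python
def format_context(local_words, key_terms=(), key_weight=1.0):
    """Build a 'word#weight' context string.

    Local context words get weight 1; key terms get `key_weight`
    (assumed to be supplied by the caller, e.g. an IDF-based value).
    """
    parts = [f"{word}#1" for word in local_words]
    # ':g' prints 1.0 as '1' and 1.36 as '1.36', matching the table.
    parts += [f"{term}#{key_weight:g}" for term in key_terms]
    return " ".join(parts)


local = ["goal", "present", "study", "oligodendrocytes",
         "culture", "different", "isoforms"]
keys = ["inhibitors", "cell", "virus", "inhibition", "assay"]

local_only = format_context(local)
keys_only = format_context([], keys)
keys_idf = format_context([], keys, key_weight=1.36)
combined = format_context(local, keys, key_weight=1.36)
```

Here `combined` corresponds to the "Local context and Key terms (IDF)" row, and the other three variables correspond to the remaining rows.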
Word Sense Disambiguation results using local and domain context
| | | | Domain context alone | | | | Domain context and local context | | | |
| Term | Count | Local context | G2 | G2 (IDF) | RF | RF (IDF) | G2 | G2 (IDF) | RF | RF (IDF) |
| All | 70.4 | 70.0 | 70.8 | 70.6 | 71.5 | 72.8 | 73.5 | 73.5 | ||
| McInnes subset | 54.5 | 57.5 | 57.9 | 57.7 | 58.6 | 58.9 | 59.2 | 59.1 | ||
| | 93 | 33.3 | 32.3 | 34.4 | 35.5 | 33.3 | 34.4 | 35.5 | 38.7 | 37.6 |
| | 100 | 46.0 | 51.0 | 52.0 | 52.0 | 53.0 | 52.0 | 52.0 | 53.0 | 54.0 |
| Cold | 95 | 30.5 | 60.0 | 64.2 | 63.2 | 64.2 | 66.3 | 67.4 | 68.4 | 68.4 |
| Condition | 92 | 41.3 | 6.5 | 15.2 | 8.7 | 5.4 | 13.0 | 20.7 | 13.0 | 9.8 |
| Culture | 100 | 80.0 | 87.0 | 91.0 | 83.0 | 86.0 | 88.0 | 92.0 | 85.0 | 86.0 |
| | 65 | 92.3 | 95.4 | 93.8 | 96.9 | 96.9 | 95.4 | 95.4 | 95.4 | 95.4 |
| Depression | 85 | 88.2 | 100.0 | 98.8 | 98.8 | 98.8 | 98.8 | 97.6 | 98.8 | 97.6 |
| Determination | 79 | 94.9 | 73.1 | 87.2 | 87.2 | 84.6 | 79.7 | 91.1 | 87.3 | 83.5 |
| Discharge | 75 | 81.3 | 82.7 | 81.3 | 84.0 | 82.7 | 85.3 | 84.0 | 89.3 | 84.0 |
| Energy | 100 | 95.0 | 86.9 | 86.9 | 93.9 | 92.9 | 87.0 | 87.0 | 94.0 | 93.0 |
| | 100 | 50.0 | 50.0 | 50.0 | 50.0 | 50.0 | 50.0 | 50.0 | 50.0 | 50.0 |
| Extraction | 87 | 28.7 | 32.2 | 39.1 | 33.3 | 40.2 | 35.6 | 41.4 | 35.6 | 42.5 |
| Failure | 29 | 93.1 | 86.2 | 82.8 | 65.5 | 79.3 | 86.2 | 86.2 | 75.9 | 82.8 |
| Fat | 73 | 95.9 | 97.3 | 97.3 | 97.3 | 97.3 | 97.3 | 97.3 | 97.3 | 97.3 |
| Fit | 18 | 11.1 | 5.6 | 5.6 | 0.0 | 0.0 | 11.1 | 11.1 | 5.6 | 5.6 |
| Fluid | 100 | 90.0 | 92.9 | 92.9 | 93.9 | 93.9 | 90.0 | 90.0 | 91.0 | 92.0 |
| Frequency | 94 | 98.9 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| Ganglion | 100 | 73.0 | 69.0 | 72.0 | 71.0 | 73.0 | 80.0 | 79.0 | 80.0 | 81.0 |
| Glucose | 100 | 90.0 | 92.9 | 92.9 | 91.9 | 91.9 | 92.0 | 94.0 | 92.0 | 92.0 |
| | 100 | 37.0 | 37.0 | 37.0 | 37.0 | 37.0 | 37.0 | 37.0 | 37.0 | 37.0 |
| | 100 | 62.0 | 73.0 | 73.0 | 74.0 | 74.0 | 73.0 | 74.0 | 74.0 | 74.0 |
| Implantation | 98 | 87.8 | 70.4 | 83.7 | 74.5 | 88.8 | 76.5 | 90.8 | 84.7 | 93.9 |
| Inhibition | 99 | 3.0 | 2.0 | 2.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
| Japanese | 79 | 81.0 | 82.3 | 78.5 | 84.8 | 82.3 | 84.8 | 79.7 | 86.1 | 82.3 |
| Lead | 29 | 93.1 | 20.7 | 20.7 | 93.1 | 93.1 | 93.1 | 93.1 | 93.1 | 93.1 |
| Man | 92 | 45.7 | 81.3 | 85.7 | 76.9 | 82.4 | 84.8 | 87.0 | 81.5 | 83.7 |
| Mole | 84 | 57.1 | 56.6 | 53.0 | 62.7 | 57.8 | 69.0 | 65.5 | 72.6 | 70.2 |
| | 97 | 71.1 | 59.8 | 56.7 | 59.8 | 58.8 | 67.0 | 67.0 | 70.1 | 71.1 |
| | 89 | 29.2 | 49.4 | 53.9 | 46.1 | 52.8 | 47.2 | 50.6 | 44.9 | 49.4 |
| Pathology | 99 | 33.3 | 16.7 | 17.7 | 16.7 | 17.7 | 20.2 | 22.2 | 22.2 | 22.2 |
| Pressure | 96 | 97.9 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
| | 98 | 52.0 | 43.9 | 43.9 | 43.9 | 43.9 | 43.9 | 43.9 | 42.9 | 43.9 |
| Reduction | 11 | 45.5 | 72.7 | 72.7 | 72.7 | 72.7 | 72.7 | 72.7 | 72.7 | 63.6 |
| | 68 | 76.5 | 80.9 | 80.9 | 79.4 | 82.4 | 82.4 | 82.4 | 79.4 | 82.4 |
| Resistance | 3 | 66.7 | 100.0 | 100.0 | 100.0 | 100.0 | 66.7 | 66.7 | 66.7 | 66.7 |
| | 65 | 81.5 | 82.8 | 81.2 | 82.8 | 82.8 | 73.8 | 73.8 | 72.3 | 75.4 |
| Secretion | 100 | 99.0 | 99.0 | 99.0 | 99.0 | 99.0 | 99.0 | 99.0 | 99.0 | 99.0 |
| | 51 | 33.3 | 62.7 | 62.7 | 62.7 | 62.7 | 64.7 | 62.7 | 64.7 | 64.7 |
| Sex | 100 | 87.0 | 85.0 | 82.0 | 85.0 | 83.0 | 86.0 | 84.0 | 86.0 | 85.0 |
| Single | 100 | 94.0 | 87.0 | 86.0 | 79.0 | 85.0 | 91.0 | 89.0 | 90.0 | 90.0 |
| Strains | 93 | 94.6 | 91.4 | 86.0 | 95.7 | 90.3 | 96.8 | 95.7 | 96.8 | 95.7 |
| Support | 10 | 90.0 | 80.0 | 80.0 | 80.0 | 80.0 | 80.0 | 80.0 | 80.0 | 80.0 |
| Surgery | 100 | 97.0 | 98.0 | 98.0 | 98.0 | 98.0 | 98.0 | 98.0 | 98.0 | 98.0 |
| Transient | 100 | 99.0 | 92.9 | 97.0 | 88.9 | 96.0 | 98.0 | 99.0 | 98.0 | 99.0 |
| Transport | 94 | 95.7 | 98.9 | 98.9 | 98.9 | 98.9 | 97.9 | 97.9 | 97.9 | 97.9 |
| Ultrasound | 100 | 83.0 | 84.0 | 84.0 | 82.0 | 82.0 | 84.0 | 84.0 | 84.0 | 82.0 |
| Variation | 100 | 90.0 | 83.0 | 67.0 | 73.0 | 70.0 | 88.0 | 81.0 | 88.0 | 83.0 |
| Weight | 53 | 60.4 | 56.6 | 60.4 | 60.4 | 60.4 | 60.4 | 56.6 | 64.2 | 64.2 |
| | 90 | 60.0 | 58.9 | 62.2 | 58.9 | 63.3 | 71.1 | 71.1 | 71.1 | 73.3 |
Statistical significance with respect to the local context baseline, computed using bootstrap resampling [40].
Terms used in the McInnes subset [16] are shown in italics.
IDF, inverse document frequency.
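The footnote above says only that significance against the local context baseline was computed by bootstrap resampling. The exact procedure is not given in this excerpt, so the following Python sketch shows one common variant, a paired bootstrap over test instances, purely as an assumption. The function name `paired_bootstrap` and its parameters are hypothetical; inputs are per-instance correctness flags (1 for a correctly disambiguated instance, 0 otherwise) for the baseline and for the system being compared.

```python
import random

def paired_bootstrap(baseline_correct, system_correct,
                     n_samples=1000, seed=0):
    """Approximate one-sided p-value for 'system beats baseline'.

    Resamples instances with replacement and returns the fraction of
    resamples in which the system is NOT strictly better than the
    baseline. This is an illustrative variant, not necessarily the
    procedure used in the source.
    """
    if len(baseline_correct) != len(system_correct):
        raise ValueError("both systems must be scored on the same instances")
    n = len(baseline_correct)
    if n == 0:
        raise ValueError("need at least one instance")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    not_better = 0
    for _ in range(n_samples):
        idx = [rng.randrange(n) for _ in range(n)]
        base_hits = sum(baseline_correct[i] for i in idx)
        sys_hits = sum(system_correct[i] for i in idx)
        if sys_hits <= base_hits:
            not_better += 1
    return not_better / n_samples
```

A small returned value (for example below 0.05) would indicate that the improvement over the baseline is unlikely to be an artefact of the particular test instances sampled.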