Patrick Ruch, Julien Gobeill, Christian Lovis, Antoine Geissbühler.
Abstract
BACKGROUND: In this paper, we describe the design and preliminary evaluation of a new type of tool to speed up the encoding of episodes of care using the SNOMED CT terminology.
Year: 2008 PMID: 19007443 PMCID: PMC2582793 DOI: 10.1186/1472-6947-8-S1-S6
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Figure 1. Output of the tool (categorization mode): these six categories are associated with the abstract shown in Figure 2. Some strings are duplicated because they refer to different concepts. The two top-ranked concepts (Burkholderia cepacia; Cystic fibrosis) are precisely those expected from the manual MeSH annotation of the article. Categories proposed at lower ranks (glucuronic acid; fibrosis) are irrelevant with respect to the manual annotation performed by NLM (National Library of Medicine) librarians.
Figure 2. Citation with MeSH terms provided by professional indexers for PMID: 11506920.
Figure 3. Output of the tool (browsing mode) with the query "Production in vitro, on different solid culture media, of two distinct exopolysaccharides by a mucoid clinical strain of Burkholderia cepacia": nineteen categories are displayed. The scores of the predicted categories drop sharply after the first one or two terms, indicating that the quality of the association falls off significantly beyond that point.
Results for the RegEx and tf.idf classifiers with different weighting schemas. For the VS (vector space) engine, the tf.idf parameters are given: the first triplet indicates the weighting applied to the "document collection", i.e. the concepts, while the second applies to the "query collection", i.e. the abstracts.
| System or parameters | Top precision | Mean average precision |
| --- | --- | --- |
| RegEx | 0.641 | 0.400 |
| tf.idf (VS) | | |
| lnc.atn | 0.696 | 0.35525 |
| anc.atn | 0.691 | 0.3545 |
| ltc.atn | 0.75 | 0.33525 |
| ltc.lnn | 0.637 | 0.2775 |
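The weighting triplets in the table (e.g. ltc.atn) follow the SMART naming convention commonly used in information retrieval. As a rough illustration only, the sketch below implements one common reading of the letters that appear here: term frequency n (natural), l (logarithmic) or a (augmented); document frequency n (none) or t (idf); normalization n (none) or c (cosine). The function names `smart_weights` and `score`, the toy document frequencies, and the exact formulas are assumptions made for this example; the engine evaluated in the paper may define its components differently.

```python
import math
from collections import Counter


def smart_weights(tokens, df, n_docs, scheme):
    """Weight a bag of tokens under a three-letter SMART scheme such as 'ltc'.

    Hypothetical helper: `df` maps a term to its document frequency in the
    concept collection and `n_docs` is the size of that collection.
    """
    tf_code, df_code, norm_code = scheme
    counts = Counter(tokens)
    if not counts:
        return {}
    max_tf = max(counts.values())
    weights = {}
    for term, tf in counts.items():
        # Term-frequency component.
        if tf_code == "n":
            w = float(tf)
        elif tf_code == "l":
            w = 1.0 + math.log(tf)
        elif tf_code == "a":
            w = 0.5 + 0.5 * tf / max_tf
        else:
            raise ValueError(f"unsupported tf code: {tf_code}")
        # Document-frequency component (terms unseen in the collection get 0).
        if df_code == "t":
            d = df.get(term, 0)
            w *= math.log(n_docs / d) if d else 0.0
        elif df_code != "n":
            raise ValueError(f"unsupported df code: {df_code}")
        weights[term] = w
    # Normalization component.
    if norm_code == "c":
        norm = math.sqrt(sum(w * w for w in weights.values()))
        if norm > 0:
            weights = {t: w / norm for t, w in weights.items()}
    elif norm_code != "n":
        raise ValueError(f"unsupported normalization code: {norm_code}")
    return weights


def score(concept_tokens, abstract_tokens, df, n_docs, scheme="ltc.atn"):
    """Dot product of concept and abstract vectors; first triplet = concepts."""
    concept_scheme, abstract_scheme = scheme.split(".")
    c = smart_weights(concept_tokens, df, n_docs, concept_scheme)
    a = smart_weights(abstract_tokens, df, n_docs, abstract_scheme)
    return sum(w * a[t] for t, w in c.items() if t in a)


# Toy data, invented for illustration only.
toy_df = {"cystic": 2, "fibrosis": 3, "burkholderia": 1, "cepacia": 1}
abstract = "mucoid clinical strain of burkholderia cepacia in cystic fibrosis".split()
print(score(["cystic", "fibrosis"], abstract, toy_df, n_docs=10))
```

With this reading, a concept whose terms do not occur in the abstract scores zero, and rarer concept terms contribute more whenever either triplet includes the idf component t.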
Results of the system when combining the vector space and the regular expression modules.
| Weighting function (concepts.abstracts) | Top precision | Mean average precision |
| --- | --- | --- |
| Hybrids: tf.idf (VS) + RegEx | | |
| ltc.lnn | 0.800 | 0.4545 |
| lnc.lnn | 0.791 | 0.453 |
| anc.ntn | 0.787 | 0.4515 |
| atn.ntn | 0.823 | 0.4485 |
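Both tables report top precision and mean average precision. The sketch below shows the standard textbook definitions of these two measures, assuming that top precision means the fraction of abstracts whose first-ranked category is relevant and that each ranked list contains no duplicate categories. The function names and the toy rankings are hypothetical, and the evaluation in the paper may differ in details such as cutoff rank.

```python
def top_precision(rankings, relevant):
    """Fraction of queries whose top-ranked category is in the relevant set."""
    hits = sum(
        1 for q, ranked in rankings.items() if ranked and ranked[0] in relevant[q]
    )
    return hits / len(rankings)


def average_precision(ranked, relevant_set):
    """Mean of precision values taken at each rank where a relevant item occurs."""
    hits = 0
    total = 0.0
    for rank, category in enumerate(ranked, start=1):
        if category in relevant_set:
            hits += 1
            total += hits / rank
    return total / len(relevant_set) if relevant_set else 0.0


def mean_average_precision(rankings, relevant):
    """Average of per-query average precision over all queries."""
    return sum(
        average_precision(ranked, relevant[q]) for q, ranked in rankings.items()
    ) / len(rankings)


# Toy rankings and gold annotations, invented for illustration only.
toy_rankings = {
    "doc1": ["A", "X", "B"],
    "doc2": ["Y", "C"],
}
toy_relevant = {
    "doc1": {"A", "B"},
    "doc2": {"C"},
}
print(top_precision(toy_rankings, toy_relevant))
print(mean_average_precision(toy_rankings, toy_relevant))
```

Top precision only looks at the first rank, whereas mean average precision rewards placing every relevant category high in the list, which is why the two columns can rank the weighting schemas differently.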