Lingyun Luo, Rong Xu, Guo-Qiang Zhang.
Abstract
The complex inner structures of concept names in the Foundational Model of Anatomy (FMA) remain an obstacle to further analysis of the ontology using lexical methods. A very common problem is ambiguity in names that contain one or more occurrences of the preposition "of." In this paper, we propose an automatic method to help disambiguate FMA terms by leveraging taxonomy and partonomy information. If a sub-phrase of a concept name also appears in the names of its parents, it is likely to occur as a sub-tree in the concept's parse tree, and hence should be parsed as such. We classified all concept names with a single occurrence of the preposition "of" by the appearances of their sub-phrases in the parent names, using three test suites. Results show that more than 90% of them can be provided with useful information to assist their correct parsing.
Year: 2013 PMID: 24303256 PMCID: PMC3845762
Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc
Figure 1:
Two possible ways to parse the term “left surface of heart” as a noun phrase (NP). On the left, the scope of “left” is the noun phrase “surface of heart.” On the right, the scope of “left” is just “surface.”
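The two readings in Figure 1 can be written out as nested tuples to make the difference concrete. The sketch below is only an illustration, not the authors' implementation: the binary bracketing, the helper names (`leaves`, `constituents`, `compatible_parses`) and the parent name "Surface of heart" are assumptions introduced here. It keeps a parse when some parent name appears as a proper constituent of that parse.

```python
# Two bracketings of "left surface of heart" (leaves are words).
# The tuple encoding and the parent name used below are illustrative assumptions.
LEFT_PARSE = ("left", ("surface", ("of", "heart")))    # "left" scopes over "surface of heart"
RIGHT_PARSE = (("left", "surface"), ("of", "heart"))   # "left" scopes over "surface" only


def leaves(tree):
    """Return the words under a tree node, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree for word in leaves(child)]


def constituents(tree):
    """Return the word sequence spanned by every multi-word subtree."""
    if isinstance(tree, str):
        return set()
    spans = {" ".join(leaves(tree))}
    for child in tree:
        spans |= constituents(child)
    return spans


def compatible_parses(parses, parent_names):
    """Names of the parses in which some parent name is a proper constituent."""
    parents = {name.lower() for name in parent_names}
    result = []
    for label, tree in parses.items():
        full_term = " ".join(leaves(tree))
        if (constituents(tree) - {full_term}) & parents:
            result.append(label)
    return result


parses = {"left": LEFT_PARSE, "right": RIGHT_PARSE}
# Hypothetical parent name; only the left-hand parse has it as a sub-tree.
print(compatible_parses(parses, ["Surface of heart"]))
```

Under that assumed parent, only the left-hand parse is retained, because "surface of heart" is a sub-tree there and not in the right-hand parse.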
Figure 2:
Four categories of child-parent relationships for a term T with respect to taxonomy (IS-A) and partonomy (Part-Of). A: Some sub-phrase in T is contained in a parent term through IS-A, and some sub-phrase in T is also contained in a parent term through Part-Of. Both the sub-phrases and the parent terms may or may not be the same. B: Some sub-phrase in T is contained in a parent term through IS-A, but no sub-phrase in T is contained in any parent term through Part-Of. C: Some sub-phrase in T is contained in a parent term through Part-Of, but no sub-phrase in T is contained in any parent term through IS-A. D: No sub-phrase in T is contained in any parent term through Part-Of, and no sub-phrase in T is contained in any parent term through IS-A.
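The four categories can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the procedure used in the paper: a "sub-phrase" is taken here to be any contiguous proper word sequence of the term that neither starts nor ends with "of", matching is case-insensitive, and the parent names in the usage lines are hypothetical.

```python
def words(name):
    """Lower-case a name and split it into words."""
    return name.lower().split()


def subphrases(term):
    """Contiguous proper word sequences that neither start nor end with 'of'.

    This definition of "sub-phrase" is an assumption made for illustration.
    """
    w = words(term)
    n = len(w)
    found = set()
    for i in range(n):
        for j in range(i + 1, n + 1):
            seq = w[i:j]
            if len(seq) < n and seq[0] != "of" and seq[-1] != "of":
                found.add(tuple(seq))
    return found


def occurs_in(phrase, name):
    """True if the word tuple `phrase` appears contiguously in `name`."""
    w = words(name)
    k = len(phrase)
    return any(tuple(w[i:i + k]) == phrase for i in range(len(w) - k + 1))


def shares_subphrase(term, parents):
    """True if any sub-phrase of `term` occurs in any of the parent names."""
    return any(occurs_in(s, p) for s in subphrases(term) for p in parents)


def categorize(term, isa_parents, partof_parents):
    """Assign category A, B, C or D as described in the Figure 2 caption."""
    in_isa = shares_subphrase(term, isa_parents)
    in_partof = shares_subphrase(term, partof_parents)
    if in_isa and in_partof:
        return "A"
    if in_isa:
        return "B"
    if in_partof:
        return "C"
    return "D"


# Hypothetical parents for illustration only.
print(categorize("left surface of heart", ["Surface of heart"], ["Heart"]))
print(categorize("left surface of heart", ["Anatomical structure"], []))
```

The sub-categories of D shown in the table (DA through DDD) are not covered by this sketch, since the record does not describe how they are derived.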
Total counts and percentages of FMA terms classified into ten categories. Data on categories A, B, C appear in rows two, three, and four. Category D is broken into four columns. The first three columns provide data for categories DA, DB and DC. The fourth column provides the number for category DD, with 12709 terms representing 60.46% of total terms, which is further broken down into DDA (245; 1.17%), DDB (10547; 50.17%), DDC (14; 0.07%) and DDD (1903; 9.05%).
| | | | |
|---|---|---|---|
| A | 321 (1.52%) | | |
| B | 3025 (14.39%) | | |
| C | 1044 (4.97%) | | |
| D | 16631 (79.11%) | | |
| DA | DB | DC | DD |
| 396 (1.88%) | 2217 (10.55%) | 1309 (6.22%) | 12709 (60.46%) |
| DDA | DDB | DDC | DDD |
| 245 (1.17%) | 10547 (50.17%) | 14 (0.07%) | 1903 (9.05%) |
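
The nesting of the counts in the table can be checked with a few lines of arithmetic. The snippet below is a sketch for the reader's convenience; the overall total is not stated in the record and is derived here as the sum of categories A to D, and `share` is a helper name introduced for illustration. Recomputed percentages may differ from the printed ones in the last digit, depending on whether values were rounded or truncated.

```python
# Counts copied from the table above.
counts = {
    "A": 321, "B": 3025, "C": 1044, "D": 16631,
    "DA": 396, "DB": 2217, "DC": 1309, "DD": 12709,
    "DDA": 245, "DDB": 10547, "DDC": 14, "DDD": 1903,
}

# Derived total (an assumption: A, B, C and D together cover all classified terms).
total = sum(counts[c] for c in ("A", "B", "C", "D"))

# Each parent category should equal the sum of its sub-categories.
d_parts = sum(counts[c] for c in ("DA", "DB", "DC", "DD"))
dd_parts = sum(counts[c] for c in ("DDA", "DDB", "DDC", "DDD"))


def share(category):
    """Percentage of all classified terms that fall in `category`."""
    return 100.0 * counts[category] / total


print(total, d_parts == counts["D"], dd_parts == counts["DD"])
print(round(share("DDB"), 2))
```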