Christopher Ochs (1), James Geller (1), Yehoshua Perl (1), Yan Chen (2), Junchuan Xu (3), Hua Min (4), James T Case (5), Zhi Wei (1).
(1) Computer Science Department, New Jersey Institute of Technology, Newark, New Jersey, USA.
(2) Computer Information Systems Department, BMCC, CUNY, New York, New York, USA.
(3) Division of Knowledge Informatics, NYU, New York, New York, USA.
(4) Department of Health Administration and Policy, George Mason University, Fairfax, Virginia, USA.
(5) NLM/NIH, Bethesda, Maryland, USA.
Abstract
OBJECTIVE: Standards terminologies may be large and complex, making their quality assurance challenging. Some terminology quality assurance (TQA) methodologies are based on abstraction networks (AbNs), compact terminology summaries. We have tested AbNs and the performance of related TQA methodologies on small terminology hierarchies. However, some standards terminologies, for example, SNOMED, are composed of very large hierarchies. Scaling AbN TQA techniques to such hierarchies poses a significant challenge. We present a scalable subject-based approach for AbN TQA.
METHODS: An innovative technique is presented for scaling TQA by creating a new kind of subject-based AbN called a subtaxonomy for large hierarchies. New hypotheses about concentrations of erroneous concepts within the AbN are introduced to guide scalable TQA.
RESULTS: We test the TQA methodology for a subject-based subtaxonomy for the Bleeding subhierarchy in SNOMED's large Clinical finding hierarchy. To test the error concentration hypotheses, three domain experts reviewed a sample of 300 concepts. A consensus-based evaluation identified 87 erroneous concepts. The subtaxonomy-based TQA methodology was shown to uncover statistically significantly more erroneous concepts when compared to a control sample.
DISCUSSION: The scalability of TQA methodologies is a challenge for large standards systems like SNOMED. We demonstrated innovative subject-based TQA techniques by identifying groups of concepts with a higher likelihood of having errors within the subtaxonomy. Scalability is achieved by reviewing a large hierarchy by subject.
CONCLUSIONS: An innovative methodology for scaling the derivation of AbNs and a TQA methodology was shown to perform successfully for the largest hierarchy of SNOMED.
Authors: Christopher Ochs; Yehoshua Perl; James Geller; Michael Halper; Huanying Gu; Yan Chen; Gai Elhanan Journal: AMIA Annu Symp Proc Date: 2013-11-16
Authors: Yue Wang; Michael Halper; Duo Wei; Huanying Gu; Yehoshua Perl; Junchuan Xu; Gai Elhanan; Yan Chen; Kent A Spackman; James T Case; George Hripcsak Journal: J Biomed Inform Date: 2011-09-01 Impact factor: 6.317
Authors: Yehoshua Perl; James Geller; Michael Halper; Christopher Ochs; Ling Zheng; Joan Kapusnik-Uner Journal: Ann N Y Acad Sci Date: 2016-10-17 Impact factor: 5.691