K Viikki, M Juhola, I Pyykkö, P Honkavaara.
Abstract
Decision tree induction, like other inductive learning methods, requires high-quality training data to generate accurate and reliable classification models. Example cases should form a representative sample of the application area, and the attributes used to describe them should be relevant and adequate for the classification task to be solved. In this paper, measures of the strength of association and an entropy-based approach were used to assess the quality of the training data. The studied classification tasks related to three otological data sets: a conscript data set, a vertigo data set, and a postoperative nausea and vomiting data set. The paper suggests that the studied approaches give some guidelines about the quality of the training data, but other approaches are also needed to guide training data building.
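As a rough illustration of the two kinds of check the abstract mentions, the sketch below scores a nominal attribute against the class label with an entropy-based measure and with a measure of association. The abstract does not name the specific measures used in the paper, so information gain and Cramér's V are assumed here as common stand-ins, and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import log2, sqrt


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in class entropy after splitting on one nominal attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def cramers_v(values, labels):
    """Cramér's V between a nominal attribute and the class (0 = none, 1 = perfect)."""
    n = len(labels)
    joint = Counter(zip(values, labels))
    row, col = Counter(values), Counter(labels)
    k = min(len(row), len(col)) - 1
    if k == 0:  # a constant attribute or a single class carries no association
        return 0.0
    chi2 = 0.0
    for r in row:
        for c in col:
            expected = row[r] * col[c] / n
            chi2 += (joint[(r, c)] - expected) ** 2 / expected
    return sqrt(chi2 / (n * k))


# Made-up toy data, not taken from the paper's data sets.
labels = ["a", "a", "b", "b"]
informative = ["x", "x", "y", "y"]  # perfectly separates the classes
noise = ["p", "q", "p", "q"]        # unrelated to the class

for name, attr in [("informative", informative), ("noise", noise)]:
    print(name, information_gain(attr, labels), cramers_v(attr, labels))
```

Under these assumptions, an attribute that scores near zero on both measures contributes little to the classification task, which is the kind of guideline about attribute relevance the abstract describes.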
Year: 2001 PMID: 11417200 DOI: 10.1023/a:1005624715089
Source DB: PubMed Journal: J Med Syst ISSN: 0148-5598 Impact factor: 4.460