OBJECTIVE: Learning classification models in medicine often relies on data labeled by a human expert. Since labeling clinical data can be time-consuming, reducing labeling costs is critical for our ability to learn such models automatically. In this paper we propose a new machine learning approach that learns improved binary classification models more efficiently by refining the binary class information in the training phase with soft labels that reflect how strongly the human expert feels about the original class labels.
MATERIALS AND METHODS: Two types of methods that can learn improved binary classification models from soft labels are proposed. The first relies on probabilistic/numeric labels, the other on ordinal categorical labels. We study and demonstrate the benefits of these methods for learning an alerting model for heparin-induced thrombocytopenia. The experiments are conducted on data from 377 patient instances labeled by three different human experts. The methods are compared using the area under the receiver operating characteristic curve (AUC) score.
RESULTS: Our AUC results show that the new approach learns classification models more efficiently than traditional learning methods. The improvement in AUC is largest when the number of examples we learn from is small.
CONCLUSIONS: A classification learning framework that lets us learn from auxiliary soft-label information provided by a human expert is a promising new direction for learning classification models from expert labels, reducing the time and cost needed to label data.
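The idea of training on probabilistic soft labels and comparing models by AUC can be illustrated with a minimal sketch. It is not the authors' method: it swaps in a plain logistic regression fitted by gradient descent on cross-entropy, where the targets may be either hard 0/1 labels or soft values in [0, 1]. The data are synthetic, and every name here (`fit_logistic`, `predict_scores`, `auc`, `w_true`, the sample sizes and learning rate) is a hypothetical choice made for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, targets, lr=0.1, n_iter=2000, l2=1e-3):
    """Gradient descent on cross-entropy loss.

    `targets` may be hard labels (0/1) or soft labels in [0, 1];
    the gradient has the same form in both cases.
    """
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)
        grad = Xb.T @ (p - targets) / len(targets) + l2 * w
        w -= lr * grad
    return w

def predict_scores(X, w):
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return sigmoid(Xb @ w)

def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))

# Synthetic stand-in for expert labeling (assumption, not the study's data):
# the "expert" reports a confidence (soft label) and a binary label derived from it.
rng = np.random.default_rng(0)
w_true = np.array([1.5, -1.0, 0.5, 0.0, 2.0])

X_train = rng.normal(size=(20, 5))      # deliberately small training set
soft_train = sigmoid(X_train @ w_true)  # expert confidence in the positive class
hard_train = (soft_train > 0.5).astype(float)

X_test = rng.normal(size=(2000, 5))
y_test = (sigmoid(X_test @ w_true) > 0.5).astype(int)

w_soft = fit_logistic(X_train, soft_train)
w_hard = fit_logistic(X_train, hard_train)

auc_soft = auc(predict_scores(X_test, w_soft), y_test)
auc_hard = auc(predict_scores(X_test, w_hard), y_test)
print(f"AUC with soft labels: {auc_soft:.3f}")
print(f"AUC with hard labels: {auc_hard:.3f}")
```

Under these assumptions the soft targets carry more information per example than the thresholded labels, which is the intuition behind the reported gains on small training sets. The sketch makes no claim about the size of the effect on real clinical data.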
Keywords: data labeling by human experts; machine learning; soft-label information