Youngjun Kim, Ellen Riloff, Stéphane M Meystre.
Abstract
Classifying relations between pairs of medical concepts in clinical texts is a crucial task for acquiring empirical evidence relevant to patient care. Because labeled data are limited and class distributions are extremely unbalanced, medical relation classification systems struggle to perform well on less common relation types, even though these types capture valuable information. Our research aims to improve relation classification using weakly supervised learning. We present two clustering-based instance selection methods that acquire a diverse and balanced set of additional training instances from unlabeled data. The first method selects one representative instance from each cluster containing only unlabeled data. The second method selects a counterpart for each training instance using clusters containing both labeled and unlabeled data. Compared to supervised learning, these new instance selection methods for weakly supervised learning achieve substantial recall gains on the minority relation classes while yielding comparable performance on the majority relation classes.
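The two selection methods described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how instances are represented, how clusters are built, how a representative or counterpart is chosen, or how selected instances receive labels. The sketch assumes that instances are already dense feature vectors with precomputed cluster assignments, that a representative is the cluster member closest to the centroid, and that a counterpart is the nearest unlabeled instance in the same cluster. All function names and the toy data are hypothetical.

```python
from collections import defaultdict
from math import dist


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def group_by_cluster(cluster_ids):
    """Map each cluster id to the indices of the instances assigned to it."""
    clusters = defaultdict(list)
    for index, cluster_id in enumerate(cluster_ids):
        clusters[cluster_id].append(index)
    return clusters


def select_representatives(vectors, is_labeled, cluster_ids):
    """First method (sketch): one instance per cluster holding only unlabeled data.

    Assumption: the representative is the member closest to the centroid.
    """
    selected = []
    for _, members in sorted(group_by_cluster(cluster_ids).items()):
        if any(is_labeled[i] for i in members):
            continue  # this cluster contains labeled data, so it is skipped here
        center = centroid([vectors[i] for i in members])
        selected.append(min(members, key=lambda i: dist(vectors[i], center)))
    return selected


def select_counterparts(vectors, is_labeled, cluster_ids):
    """Second method (sketch): one unlabeled counterpart per labeled instance.

    Assumption: the counterpart is the nearest unlabeled instance that shares
    a cluster with the labeled instance. Returns {labeled_index: unlabeled_index}.
    """
    counterparts = {}
    for members in group_by_cluster(cluster_ids).values():
        labeled = [i for i in members if is_labeled[i]]
        unlabeled = [i for i in members if not is_labeled[i]]
        if not labeled or not unlabeled:
            continue  # only mixed clusters can supply counterparts
        for i in labeled:
            counterparts[i] = min(unlabeled, key=lambda j: dist(vectors[i], vectors[j]))
    return counterparts


# Toy data (hypothetical): cluster 0 is mixed, cluster 1 is unlabeled only.
vectors = [
    [0.0, 0.0],  # index 0: labeled training instance
    [0.2, 0.1],  # index 1: unlabeled
    [1.0, 1.0],  # index 2: unlabeled
    [5.0, 5.0],  # index 3: unlabeled
    [5.5, 5.5],  # index 4: unlabeled
    [7.0, 7.0],  # index 5: unlabeled
]
is_labeled = [True, False, False, False, False, False]
cluster_ids = [0, 0, 0, 1, 1, 1]

representatives = select_representatives(vectors, is_labeled, cluster_ids)
counterparts = select_counterparts(vectors, is_labeled, cluster_ids)
```

On the toy data, the first function draws one instance from the all-unlabeled cluster and the second pairs the single labeled instance with its nearest unlabeled neighbor. Because both functions take at most one instance per cluster or per training instance, the selected set stays small and spread across the feature space, which is one plausible reading of the "diverse and balanced" goal stated in the abstract.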
Year: 2018 PMID: 29854174 PMCID: PMC5977715
Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076