Investigating the impact of weakly supervised data on text mining models of publication transparency: a case study on randomized controlled trials.

Linh Hoang, Lan Jiang, Halil Kilicoglu.

Abstract

A lack of large quantities of annotated data is a major barrier to developing effective text mining models of biomedical literature. In this study, we explored weak supervision to improve the accuracy of text classification models for assessing the methodological transparency of randomized controlled trial (RCT) publications. Specifically, we used Snorkel, a framework for programmatically building training sets, and UMLS-EDA, a data augmentation method that leverages a small number of labeled examples to generate new training instances, and assessed their effect on a BioBERT-based text classification model proposed for the task in previous work. Performance improvements due to weak supervision were limited and were surpassed by gains from hyperparameter tuning. Our analysis suggests that refining the weak supervision strategies to better handle the multi-label case could be beneficial. Our code and data are available at https://github.com/kilicogluh/CONSORT-TM/tree/master/weakSupervision. ©2022 AMIA - All rights reserved.
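The idea of programmatically building a training set, as mentioned in the abstract, can be illustrated with a minimal plain-Python sketch. It does not use Snorkel's API and is not the authors' implementation: the label names, keyword rules, and the `majority_vote` helper are all hypothetical, and a simple majority vote stands in for the learned label model that a framework like Snorkel would use to combine labeling functions.

```python
from collections import Counter
from typing import Callable, List, Optional

# A labeling function returns ABSTAIN when it has no opinion about a sentence.
ABSTAIN = None

# Hypothetical label names, chosen for illustration only.
RANDOMIZATION = "randomization"
BLINDING = "blinding"
SAMPLE_SIZE = "sample_size"

LabelingFunction = Callable[[str], Optional[str]]


def lf_randomization(sentence: str) -> Optional[str]:
    """Vote RANDOMIZATION on simple randomization keywords (assumed rule)."""
    text = sentence.lower()
    if "randomly assigned" in text or "randomized" in text:
        return RANDOMIZATION
    return ABSTAIN


def lf_allocation(sentence: str) -> Optional[str]:
    """Vote RANDOMIZATION when allocation is mentioned (assumed rule)."""
    return RANDOMIZATION if "allocation" in sentence.lower() else ABSTAIN


def lf_blinding(sentence: str) -> Optional[str]:
    """Vote BLINDING on blinding keywords (assumed rule)."""
    text = sentence.lower()
    if "blinded" in text or "double-blind" in text:
        return BLINDING
    return ABSTAIN


def lf_sample_size(sentence: str) -> Optional[str]:
    """Vote SAMPLE_SIZE when a sample size is discussed (assumed rule)."""
    return SAMPLE_SIZE if "sample size" in sentence.lower() else ABSTAIN


LABELING_FUNCTIONS: List[LabelingFunction] = [
    lf_randomization,
    lf_allocation,
    lf_blinding,
    lf_sample_size,
]


def majority_vote(sentence: str, lfs: List[LabelingFunction]) -> Optional[str]:
    """Combine labeling-function votes into one weak label.

    Returns ABSTAIN when no function fires or when the top labels tie.
    """
    votes = []
    for lf in lfs:
        label = lf(sentence)
        if label is not ABSTAIN:
            votes.append(label)
    if not votes:
        return ABSTAIN
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN  # tie between labels: no single winner
    return ranked[0][0]


if __name__ == "__main__":
    unlabeled = [
        "Participants were randomly assigned using a concealed allocation sequence.",
        "Outcome assessors were blinded to group assignment.",
        "Patients were randomized in a double-blind fashion.",
    ]
    for sentence in unlabeled:
        print(majority_vote(sentence, LABELING_FUNCTIONS), "<-", sentence)
```

In this sketch, the third sentence plausibly carries two labels at once, and the single-label vote ends in a tie and abstains. That is one simple way to picture why a multi-label task may need a more refined combination strategy, though the abstract does not specify which refinements the authors have in mind.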

Year: 2022
PMID: 35854729
PMCID: PMC9285178
Source DB: PubMed
Journal: AMIA Annu Symp Proc
ISSN: 1559-4076
