
Learning the Structure of Generative Models without Labeled Data.

Stephen H Bach, Bryan He, Alexander Ratner, Christopher Ré.

Abstract

Curating labeled training data has become the primary bottleneck in machine learning. Recent frameworks address this bottleneck with generative models to synthesize labels at scale from weak supervision sources. The generative model's dependency structure directly affects the quality of the estimated labels, but selecting a structure automatically without any labeled data is a distinct challenge. We propose a structure estimation method that maximizes the ℓ1-regularized marginal pseudolikelihood of the observed data. Our analysis shows that the amount of unlabeled data required to identify the true structure scales sublinearly in the number of possible dependencies for a broad class of models. Simulations show that our method is 100× faster than a maximum likelihood approach and selects 1/4 as many extraneous dependencies. We also show that our method provides an average of 1.5 F1 points of improvement over existing, user-developed information extraction applications on real-world data such as PubMed journal abstracts.
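
As a rough illustration of why an ℓ1 penalty can act as a structure selector, the sketch below applies soft-thresholding, a standard proximal step for ℓ1-regularized objectives, to a handful of candidate dependency weights. It is not the authors' estimator or optimizer: the abstract does not say how the objective is optimized, and the dependency names, weights, and penalty value here are hypothetical. The point is only that weights with small magnitude are driven to exactly zero, so the surviving nonzero weights can be read as the selected dependencies.

```python
def soft_threshold(weights, penalty):
    """Proximal step for an l1 penalty: shrink each weight toward zero
    and set it to exactly zero when its magnitude is at most `penalty`."""
    shrunk = []
    for w in weights:
        magnitude = abs(w) - penalty
        if magnitude <= 0.0:
            shrunk.append(0.0)
        else:
            shrunk.append(magnitude if w > 0 else -magnitude)
    return shrunk


def selected_dependencies(weights, names):
    """Return the names of candidate dependencies whose weight is nonzero."""
    return [name for name, w in zip(names, weights) if w != 0.0]


# Hypothetical weights for three candidate dependencies between
# weak supervision sources (illustrative values, not from the paper).
candidate_names = ["src1~src2", "src1~src3", "src2~src3"]
candidate_weights = [0.90, 0.03, -0.40]

shrunk_weights = soft_threshold(candidate_weights, penalty=0.10)
kept = selected_dependencies(shrunk_weights, candidate_names)
print(kept)
```

In this toy setting the weakly supported dependency is dropped while the two stronger ones are kept. A larger penalty would prune more candidates, which is the usual trade-off between missing true dependencies and selecting extraneous ones.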


Year: 2017; PMID: 30882087; PMCID: PMC6417840

Source DB: PubMed; Journal: Proc Mach Learn Res


  6 in total

1.  Training Complex Models with Multi-Task Weak Supervision.

Authors:  Alexander Ratner; Braden Hancock; Jared Dunnmon; Frederic Sala; Shreyash Pandey; Christopher Ré
Journal:  Proc Conf AAAI Artif Intell       Date:  2019 Jan-Feb

2.  Snorkel DryBell: A Case Study in Deploying Weak Supervision at Industrial Scale.

Authors:  Stephen H Bach; Daniel Rodriguez; Yintao Liu; Chong Luo; Haidong Shao; Cassandra Xia; Souvik Sen; Alex Ratner; Braden Hancock; Houman Alborzi; Rahul Kuchhal; Chris Ré; Rob Malkin
Journal:  Proc ACM SIGMOD Int Conf Manag Data       Date:  2019 Jun-Jul

3.  Snuba: Automating Weak Supervision to Label Training Data.

Authors:  Paroma Varma; Christopher Ré
Journal:  Proceedings VLDB Endowment       Date:  2018-11

4.  Weakly supervised classification of aortic valve malformations using unlabeled cardiac MRI sequences.

Authors:  Jason A Fries; Paroma Varma; Vincent S Chen; Ke Xiao; Heliodoro Tejeda; Priyanka Saha; Jared Dunnmon; Henry Chubb; Shiraz Maskatia; Madalina Fiterau; Scott Delp; Euan Ashley; Christopher Ré; James R Priest
Journal:  Nat Commun       Date:  2019-07-15       Impact factor: 14.919

5.  Hierarchical deep learning models using transfer learning for disease detection and classification based on small number of medical images.

Authors:  Guangzhou An; Masahiro Akiba; Kazuko Omodaka; Toru Nakazawa; Hideo Yokota
Journal:  Sci Rep       Date:  2021-03-01       Impact factor: 4.379

6.  Materials information extraction via automatically generated corpus.

Authors:  Rongen Yan; Xue Jiang; Weiren Wang; Depeng Dang; Yanjing Su
Journal:  Sci Data       Date:  2022-07-13       Impact factor: 8.501

