Dileep George, Wolfgang Lehrach, Ken Kansky, Miguel Lázaro-Gredilla, Christopher Laan, Bhaskara Marthi, Xinghua Lou, Zhaoshi Meng, Yi Liu, Huayan Wang, Alex Lavin, D Scott Phoenix.
Abstract
Learning from a few examples and generalizing to markedly different situations are capabilities of human visual intelligence that are yet to be matched by leading machine learning models. By drawing inspiration from systems neuroscience, we introduce a probabilistic generative model for vision in which message-passing-based inference handles recognition, segmentation, and reasoning in a unified way. The model demonstrates excellent generalization and occlusion-reasoning capabilities and outperforms deep neural networks on a challenging scene text recognition benchmark while being 300-fold more data efficient. In addition, the model fundamentally breaks the defense of modern text-based CAPTCHAs (Completely Automated Public Turing test to tell Computers and Humans Apart) by generatively segmenting characters without CAPTCHA-specific heuristics. Our model emphasizes aspects such as data efficiency and compositionality that may be important in the path toward general artificial intelligence.
Year: 2017 PMID: 29074582 DOI: 10.1126/science.aag2612
Source DB: PubMed Journal: Science ISSN: 0036-8075 Impact factor: 47.728