Visually Grounded Meaning Representations.

Carina Silberer, Vittorio Ferrari, Mirella Lapata.   

Abstract

In this paper we address the problem of grounding distributional representations of lexical meaning. We introduce a new model which uses stacked autoencoders to learn higher-level representations from textual and visual input. The visual modality is encoded via vectors of attributes obtained automatically from images. We create a new large-scale taxonomy of 600 visual attributes representing more than 500 concepts and 700K images. We use this dataset to train attribute classifiers and integrate their predictions with text-based distributional models of word meaning. We evaluate our model on its ability to simulate word similarity judgments and concept categorization. On both tasks, our model yields a better fit to behavioral data compared to baselines and related models which either rely on a single modality or do not make use of attribute-based input.
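
To make the architecture described in the abstract concrete, below is a minimal sketch of a bimodal stacked autoencoder: two unimodal autoencoders compress the textual and visual-attribute vectors, and a joint layer over their concatenated codes yields the multimodal representation. This is an illustration only, not the authors' implementation; the layer sizes, the use of PyTorch, and the plain reconstruction loss are assumptions (the paper's model additionally uses denoising and layer-wise pretraining, omitted here). The visual input size of 600 mirrors the paper's 600-attribute taxonomy; the text dimension is arbitrary.

```python
# Hypothetical sketch of a bimodal stacked autoencoder -- NOT the authors' code.
# All dimensions and training details are illustrative assumptions.
import torch
import torch.nn as nn

class BimodalStackedAutoencoder(nn.Module):
    """Two unimodal autoencoders whose hidden codes feed a joint autoencoder."""

    def __init__(self, text_dim=1000, visual_dim=600, hidden_dim=300, joint_dim=200):
        super().__init__()
        # Unimodal encoder/decoder pairs (textual vector, visual attribute vector)
        self.text_enc = nn.Sequential(nn.Linear(text_dim, hidden_dim), nn.Sigmoid())
        self.text_dec = nn.Sequential(nn.Linear(hidden_dim, text_dim), nn.Sigmoid())
        self.vis_enc = nn.Sequential(nn.Linear(visual_dim, hidden_dim), nn.Sigmoid())
        self.vis_dec = nn.Sequential(nn.Linear(hidden_dim, visual_dim), nn.Sigmoid())
        # Joint layer stacked on the concatenated unimodal codes
        self.joint_enc = nn.Sequential(nn.Linear(2 * hidden_dim, joint_dim), nn.Sigmoid())
        self.joint_dec = nn.Sequential(nn.Linear(joint_dim, 2 * hidden_dim), nn.Sigmoid())

    def forward(self, text_x, vis_x):
        h_text = self.text_enc(text_x)
        h_vis = self.vis_enc(vis_x)
        # The joint code is the grounded (multimodal) word representation
        joint = self.joint_enc(torch.cat([h_text, h_vis], dim=-1))
        h_text_rec, h_vis_rec = self.joint_dec(joint).chunk(2, dim=-1)
        # Reconstruct both modalities from the shared code
        return self.text_dec(h_text_rec), self.vis_dec(h_vis_rec), joint

model = BimodalStackedAutoencoder()
text_vec = torch.rand(8, 1000)   # batch of textual distributional vectors (assumed size)
vis_vec = torch.rand(8, 600)     # batch of predicted visual-attribute vectors
text_rec, vis_rec, code = model(text_vec, vis_vec)
loss = (nn.functional.mse_loss(text_rec, text_vec)
        + nn.functional.mse_loss(vis_rec, vis_vec))
```

Training such a model to reconstruct both modalities forces the joint code to encode information from text and vision alike; the evaluation in the abstract (word similarity, categorization) would then be run on these joint codes.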

Year:  2016        PMID: 28114000     DOI: 10.1109/TPAMI.2016.2635138

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0162-8828            Impact factor:   6.226


Related articles: 3 in total

1.  How the Brain Dynamically Constructs Sentence-Level Meanings From Word-Level Features.

Authors:  Nora Aguirre-Celis; Risto Miikkulainen
Journal:  Front Artif Intell       Date:  2022-04-21

2.  A test of indirect grounding of abstract concepts using multimodal distributional semantics.

Authors:  Akira Utsumi
Journal:  Front Psychol       Date:  2022-10-04

3.  Linguistic issues behind visual question answering.

Authors:  Raffaella Bernardi; Sandro Pezzelle
Journal:  Lang Linguist Compass       Date:  2021-06-04
