
A compact representation of visual speech data using latent variables.

Ziheng Zhou, Xiaopeng Hong, Guoying Zhao, Matti Pietikäinen.

Abstract

The problem of visual speech recognition involves decoding the video dynamics of a talking mouth in a high-dimensional visual space. In this paper, we propose a generative latent variable model that provides a compact representation of visual speech data. The model uses latent variables to separately represent the inter-speaker variations of visual appearance and those caused by uttering within images, and it incorporates the structural information of the visual data by placing priors on the latent variables along a curve embedded within a path graph.
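The abstract's key structural idea is a prior placed along a curve derived from a path graph. A minimal sketch of that ingredient (not the authors' code; all names here are illustrative assumptions): the Laplacian eigenvectors of a path graph form a discrete-cosine-like basis, and its low-frequency components trace a smooth curve along which prior means for ordered latent variables could be placed.

```python
import numpy as np

def path_graph_laplacian(n):
    """Combinatorial Laplacian L = D - A of the path graph P_n,
    where node i is connected only to its neighbors i-1 and i+1."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0  # edge (i, i+1)
    A[idx + 1, idx] = 1.0  # symmetric edge (i+1, i)
    D = np.diag(A.sum(axis=1))
    return D - A

n = 8
L = path_graph_laplacian(n)

# Eigen-decomposition of the path-graph Laplacian; the eigenvectors
# form a cosine-like basis over the n ordered nodes.
vals, vecs = np.linalg.eigh(L)

# The two lowest non-constant eigenvectors give a smooth 2-D curve
# embedding of the n nodes, preserving their path ordering -- one
# plausible way to lay out prior means for the latent variables.
curve = vecs[:, 1:3]
```

This only illustrates the path-graph embedding; the paper's full model additionally specifies the generative mapping from latent variables to images, which is not reproduced here.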

MeSH:

Year:  2014        PMID: 24231875     DOI: 10.1109/TPAMI.2013.173

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0162-8828            Impact factor:   6.226


  2 in total

1.  End-to-End Sentence-Level Multi-View Lipreading Architecture with Spatial Attention Module Integrated Multiple CNNs and Cascaded Local Self-Attention-CTC.

Authors:  Sanghun Jeon; Mun Sang Kim
Journal:  Sensors (Basel)       Date:  2022-05-09       Impact factor: 3.847

2.  Unsupervised random forest for affinity estimation.

Authors:  Yunai Yi; Diya Sun; Peixin Li; Tae-Kyun Kim; Tianmin Xu; Yuru Pei
Journal:  Comput Vis Media (Beijing)       Date:  2021-12-06
