Ziheng Zhou, Xiaopeng Hong, Guoying Zhao, Matti Pietikäinen.
Abstract
The problem of visual speech recognition involves the decoding of the video dynamics of a talking mouth in a high-dimensional visual space. In this paper, we propose a generative latent variable model to provide a compact representation of visual speech data. The model uses latent variables to separately represent the interspeaker variations of visual appearances and those caused by uttering within images, and incorporates the structural information of the visual data through placing priors of the latent variables along a curve embedded within a path graph.
Year: 2014 PMID: 24231875 DOI: 10.1109/TPAMI.2013.173
Source DB: PubMed Journal: IEEE Trans Pattern Anal Mach Intell ISSN: 0162-8828 Impact factor: 6.226