Title: Phoneme-to-viseme mappings: the good, the bad, and the ugly

May 08, 2018

Helen L Bear, Richard Harvey

Abstract: Visemes are the visual equivalent of phonemes. Although not precisely defined, a working definition of a viseme is "a set of phonemes which have identical appearance on the lips". Therefore a phoneme falls into one viseme class, but a viseme may represent many phonemes: a many-to-one mapping. This mapping introduces ambiguity between phonemes when using viseme classifiers. Not only is this ambiguity damaging to the performance of audio-visual classifiers operating on real expressive speech, there is also considerable choice between possible mappings. In this paper we explore the issue of this choice of viseme-to-phoneme map. We show that there is a definite difference in performance between viseme-to-phoneme mappings and explore why some maps appear to work better than others. We also devise a new algorithm for constructing phoneme-to-viseme mappings from labeled speech data. These new visemes, 'Bear' visemes, are shown to perform better than previously known units.

* Speech Communication, Special Issue on AV expressive speech. 2017
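
The many-to-one mapping described in the abstract, and the ambiguity it introduces, can be illustrated with a minimal Python sketch. The phoneme groupings and viseme class names below (`V_bilabial`, `V_labiodental`, and so on) are assumptions chosen for illustration only. They are not the 'Bear' visemes or any mapping evaluated in the paper.

```python
from collections import defaultdict

# Illustrative many-to-one phoneme-to-viseme map. The classes and their
# names are hypothetical, chosen for this sketch; they are NOT the
# mapping derived in the paper.
PHONEME_TO_VISEME = {
    "p": "V_bilabial", "b": "V_bilabial", "m": "V_bilabial",
    "f": "V_labiodental", "v": "V_labiodental",
    "t": "V_alveolar", "d": "V_alveolar",
    "ae": "V_open",
}


def to_visemes(phonemes):
    """Translate a phoneme sequence into its viseme sequence."""
    return [PHONEME_TO_VISEME[p] for p in phonemes]


def invert(mapping):
    """Build the one-to-many viseme-to-phoneme view of a mapping."""
    inverse = defaultdict(set)
    for phoneme, viseme in mapping.items():
        inverse[viseme].add(phoneme)
    return dict(inverse)


# Three different words, written as phoneme sequences.
words = {
    "pat": ["p", "ae", "t"],
    "bat": ["b", "ae", "t"],
    "mat": ["m", "ae", "t"],
}

# Under this map all three words collapse to one viseme sequence,
# which is the ambiguity a viseme classifier has to cope with.
viseme_strings = {w: tuple(to_visemes(ph)) for w, ph in words.items()}
viseme_to_phonemes = invert(PHONEME_TO_VISEME)
```

With this toy map, "pat", "bat", and "mat" become indistinguishable once translated to visemes. A different grouping of phonemes would merge a different set of words, which is why the choice of mapping can affect classifier performance.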
