VDub: Modifying Face Video of Actors for Plausible Visual Alignment to a Dubbed Audio Track
Eurographics 2015
Abstract
In many countries, foreign movies and TV productions are dubbed,
i.e., the original voice of an actor is replaced with a
translation that is spoken by a dubbing actor in the country's
own language. Dubbing is a complex process that requires
specific translations and accurately timed recitations such that
the new audio at least coarsely adheres to the mouth motion in
the video. However, since the sequences of phonemes and visemes
in the original and the dubbing language differ, the
video-to-audio match is never perfect, which is a major source
of visual discomfort. In this paper, we propose a system to
alter the mouth motion of an actor in a video, so that it
matches the new audio track. Our paper builds on high-quality
monocular 3D facial performance, lighting and albedo capture of
the dubbing and target actors, and uses audio analysis in
combination with a space-time retrieval method to synthesize a
new photo-realistically rendered and highly detailed 3D shape
model of the mouth region to replace the target performance. We
demonstrate plausible visual quality of our results compared to
footage that has been professionally dubbed in the traditional
way, both qualitatively and through a user study.
Videos
Supplementary video to the paper
Results obtained by traditional dubbing
Bibtex
@article{GVSSVPT15,
  author  = {Pablo Garrido and Levi Valgaerts and Hamid Sarmadi and Ingmar Steiner and Kiran Varanasi and Patrick Perez and Christian Theobalt},
  title   = {{VDub}: Modifying Face Video of Actors for Plausible Visual Alignment to a Dubbed Audio Track},
  journal = {Comput. Graph. Forum (Proc. Eurographics)},
  volume  = {34},
  number  = {2},
  pages   = {193--204},
  year    = {2015}
}