Lightweight Binocular Facial Performance Capture under Uncontrolled Lighting
SIGGRAPH Asia 2012
Recent progress in passive facial performance capture has shown
impressively detailed results on highly articulated
motion. However, most methods rely on complex multi-camera
set-ups, controlled lighting or fiducial markers. This prevents
them from being used in general environments, outdoor scenes,
during live action on a film set, or by freelance animators and
everyday users who want to capture their digital selves. In this
paper, we therefore propose a lightweight passive facial
performance capture approach that is able to reconstruct
high-quality dynamic facial geometry from only a single pair of
stereo cameras. Our method succeeds under uncontrolled and
time-varying lighting, and also in outdoor scenes. Our approach
builds upon and extends recent image-based scene flow
computation, lighting estimation and shading-based refinement
algorithms. It integrates them into a pipeline that is
specifically tailored towards facial performance reconstruction
from challenging binocular footage under uncontrolled
lighting. In an experimental evaluation, the strong capabilities
of our method become evident: we achieve detailed and
spatio-temporally coherent results for expressive facial motion
in both indoor and outdoor scenes -- even from low-quality input
images recorded with a hand-held consumer stereo camera. We
believe that our approach is the first to capture facial
performances of such high quality from a single stereo rig and
we demonstrate that it brings facial performance capture out of
the studio, into the wild, and within the reach of everybody.
Supplementary video to the paper
Additional video showing a result for 560 frames (around 22 seconds)
@inproceedings{VWBS12,
  author    = {Levi Valgaerts and Chenglei Wu and Andrés Bruhn and Hans-Peter Seidel and Christian Theobalt},
  title     = {Lightweight Binocular Facial Performance Capture under Uncontrolled Lighting},
  booktitle = {ACM Transactions on Graphics (Proceedings of SIGGRAPH Asia 2012)},
  volume    = {31},
  number    = {6},
  pages     = {187:1--187:11},
  month     = {November},
  year      = {2012},
  url       = {},
  doi       = {10.1145/2366145.2366206}
}
Data Sets
We make the data shown in the following video available on
It consists of 200 frames of a stereo sequence captured at 25
fps (around 8 seconds) and the corresponding textured
spatio-temporally coherent 3D reconstructions. As mentioned in
the paper, synchronisation of the left and right video sequences
is performed by event, which makes it sub-frame accurate at
best. The face meshes have a resolution of 500K vertices, which
is five times higher than that of the results shown in the paper and the
supplementary video. The algorithm used to obtain these results
is the one described in the paper, with the higher resolution
only improving the captured detail.
Terms of use: The provided data is intended for research
purposes only, and any use of it for non-scientific ends is not
allowed. This includes the publishing of any scientific results
obtained with our data in non-scientific literature, such as
tabloid press. We ask the researcher to respect our actors and
not to use the data for any distasteful manipulations (such as
hideous deformations, exploding heads, or manipulations that might
be culturally sensitive). We also ask the researcher not to
disseminate this data outside of his or her institute;
distribution within the affiliated institution is allowed.
Requesting the data: Please understand that we can only
make the data available to senior project managers or senior
researchers. To keep track of researchers and institutions
requesting the data and to ascertain that you abide by the above
terms of use, we make the data available once you have sent us an email
stating the following:
Your name, title and institution.
Your intended use of the data.
A statement saying that you accept the following terms:
The rights to use, copy and distribute the 3D
reconstructions and image sequences provided on this
website are under the supervision of Prof. Christian
Theobalt of the Graphics, Vision & Video group at the
Max-Planck-Institute for Informatics, Saarbrücken.
You are given permission to copy this data in electronic
form and to distribute it within your institute for
scientific purposes only. Inclusion of rendered results
obtained from this data in a scholarly publication
(printed or electronic) is permitted. In this case, the
following sentence must be added to the acknowledgements
section of your paper: "The captured performance data
were provided courtesy of the research group Graphics,
Vision & Video of the Max-Planck-Institute for
Informatics", and the following paper must be cited:
"Lightweight Binocular Facial Performance Capture
under Uncontrolled Lighting". For any usage other
than your intended scientific research, written
permission is required from Christian Theobalt. Any
commercial use is hereby excluded.
More data sets will be made available in the near future.