The reconstruction of dense 3D models of face geometry
and appearance from a single image is highly challenging
and ill-posed. To constrain the problem, many approaches
rely on strong priors, such as parametric face
models learned from limited 3D scan data. However, such
prior models restrict generalization to the true diversity
of facial geometry, skin reflectance and illumination. To alleviate
this problem, we present the first approach that jointly
learns 1) a regressor for face shape, expression, reflectance
and illumination on the basis of 2) a concurrently learned
parametric face model. Our multi-level face model combines
the advantage of 3D Morphable Models for regularization
with the out-of-space generalization of a learned
corrective space. We train end-to-end on in-the-wild images
without dense annotations by fusing a convolutional
encoder with a differentiable expert-designed renderer and
a self-supervised training loss, both defined at multiple detail
levels. Our approach compares favorably to the stateof-
the-art in terms of reconstruction quality, better generalizes
to real world faces, and runs at over 250 Hz.
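
To make the multi-level model concrete, here is a minimal sketch (not the authors' code): per-vertex geometry is a coarse 3DMM reconstruction plus offsets from a jointly learned corrective basis. All class and parameter names, dimensions, and initializations below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiLevelFaceModel(nn.Module):
    """Coarse 3DMM reconstruction plus a jointly learned corrective space."""

    def __init__(self, n_vertices=1000, n_shape=80, n_expr=64, n_corr=500):
        super().__init__()
        d = n_vertices * 3
        # Fixed scan-based 3DMM bases (random placeholders here; in practice
        # loaded from a morphable model). Buffers are kept non-trainable.
        self.register_buffer("mean_geo", torch.zeros(d))
        self.register_buffer("shape_basis", 1e-2 * torch.randn(d, n_shape))
        self.register_buffer("expr_basis", 1e-2 * torch.randn(d, n_expr))
        # Learned corrective basis, trained jointly with the regressor; it
        # adds out-of-space geometry the fixed 3DMM cannot represent.
        self.corr_basis = nn.Parameter(1e-3 * torch.randn(d, n_corr))

    def forward(self, alpha, delta, gamma):
        # alpha: (B, n_shape), delta: (B, n_expr), gamma: (B, n_corr)
        coarse = self.mean_geo + alpha @ self.shape_basis.T + delta @ self.expr_basis.T
        corrective = gamma @ self.corr_basis.T  # learned out-of-space offsets
        return (coarse + corrective).view(alpha.shape[0], -1, 3)
```

Keeping the 3DMM bases fixed regularizes the coarse reconstruction, while gradients reach the corrective basis only through the image-level loss, which is what allows detail beyond the scan-based model's span.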
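Likewise, a rough sketch of the self-supervised training loss, assuming a differentiable renderer has already produced a synthesized image and a face-region visibility mask: a dense photometric term, a sparse landmark term, and a statistical regularizer on the regressed coefficients. The function name, norms, and weights are hypothetical stand-ins, not the paper's exact formulation.

```python
def self_supervised_loss(image, rendered, mask, landmarks_pred, landmarks_gt,
                         coeffs, w_land=1e-3, w_reg=1e-4):
    """Photometric alignment on the visible face region, sparse landmark
    alignment, and a coefficient regularizer (weights are illustrative)."""
    # image, rendered: (B, 3, H, W); mask: (B, 1, H, W) visibility mask
    photo = (((rendered - image) * mask)
             .norm(dim=1)                    # per-pixel color distance
             .sum(dim=(1, 2))
             / mask.sum(dim=(1, 2, 3)).clamp(min=1)).mean()
    land = ((landmarks_pred - landmarks_gt) ** 2).sum(dim=-1).mean()
    reg = sum((c ** 2).mean() for c in coeffs)
    return photo + w_land * land + w_reg * reg
```

Because every term is differentiable, gradients flow from the image loss through the renderer back into both the convolutional encoder and the corrective basis, which is what lets the face model itself be learned end-to-end from in-the-wild photographs without dense annotations.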
@InProceedings{tewari2017self,
  title     = {Self-supervised Multi-level Face Model Learning for Monocular Reconstruction at over 250 Hz},
  author    = {Tewari, Ayush and Zollh{\"o}fer, Michael and Garrido, Pablo and Bernard, Florian and Kim, Hyeongwoo and P{\'e}rez, Patrick and Theobalt, Christian},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2018}
}