Abstract

Generating controllable and photorealistic digital human avatars is a long-standing and important problem in vision and graphics. Recent methods have made great progress in either photorealism or inference speed, but combining the two remains unsolved. To this end, we propose a novel method, called DELIFFAS, which parameterizes the appearance of the human as a surface light field attached to a controllable, deforming human mesh model. At its core, we represent the light field around the human with a deformable two-surface parameterization, which enables fast and accurate inference of the human appearance. This allows perceptual supervision on the full image, whereas previous approaches could only supervise individual pixels or small patches due to their slow runtime. Our carefully designed human representation and supervision strategy lead to state-of-the-art synthesis results and inference time.

Main Video

Overview

Given the skeletal motion and the target camera pose, we synthesize the highly detailed appearance of a subject in real time using a surface light field parameterized by our two-surface representation. We first obtain the inner surface using the motion-dependent deformable human model. The outer surface is constructed by offsetting each vertex of the inner surface along its normal. For each camera ray, we obtain the uv coordinates of its intersection points with the two surfaces from the image-space uv maps. Then, we bilinearly sample the features at the corresponding uv coordinates from the temporal normal feature map of each mesh. The two sampled features, together with the two intersection uv coordinates, are fed into the light field MLP, which in turn generates the color value. The rendered 2D image is supervised with L1 and perceptual losses.
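The per-ray steps above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, feature dimension, MLP layout, sigmoid output, offset distance, and the convention that uv lies in [0, 1] with u indexing columns are all assumptions made for illustration. Random arrays stand in for the learned temporal normal feature maps and for the rasterized image-space uv maps, and the perceptual loss is omitted because it requires a pretrained network.

```python
import numpy as np

def vertex_normals(vertices, faces):
    """Area-weighted per-vertex normals of a triangle mesh."""
    tri = vertices[faces]                                   # (F, 3, 3)
    face_n = np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0])
    normals = np.zeros_like(vertices)
    for k in range(3):                                      # accumulate onto each corner
        np.add.at(normals, faces[:, k], face_n)
    length = np.linalg.norm(normals, axis=1, keepdims=True)
    return normals / np.maximum(length, 1e-12)

def offset_surface(vertices, faces, offset):
    """Outer surface: move every inner-surface vertex along its normal."""
    return vertices + offset * vertex_normals(vertices, faces)

def bilinear_sample(feature_map, uv):
    """Sample an (H, W, C) map at (N, 2) uv coordinates in [0, 1] (assumed convention)."""
    h, w, _ = feature_map.shape
    x = uv[:, 0] * (w - 1)
    y = uv[:, 1] * (h - 1)
    x0 = np.clip(np.floor(x).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, h - 2)
    fx = (x - x0)[:, None]
    fy = (y - y0)[:, None]
    top = (1 - fx) * feature_map[y0, x0] + fx * feature_map[y0, x0 + 1]
    bottom = (1 - fx) * feature_map[y0 + 1, x0] + fx * feature_map[y0 + 1, x0 + 1]
    return (1 - fy) * top + fy * bottom

def light_field_mlp(x, layers):
    """Hypothetical MLP: ReLU hidden layers, sigmoid RGB output."""
    for weight, bias in layers[:-1]:
        x = np.maximum(x @ weight + bias, 0.0)
    weight, bias = layers[-1]
    return 1.0 / (1.0 + np.exp(-(x @ weight + bias)))

def shade_rays(uv_inner, uv_outer, feat_inner, feat_outer, layers):
    """Color per ray from the two sampled features and the two uv coordinates."""
    f_inner = bilinear_sample(feat_inner, uv_inner)
    f_outer = bilinear_sample(feat_outer, uv_outer)
    mlp_input = np.concatenate([f_inner, f_outer, uv_inner, uv_outer], axis=1)
    return light_field_mlp(mlp_input, layers)

def l1_loss(rendered, target):
    """Mean absolute error between rendered and ground-truth colors."""
    return np.mean(np.abs(rendered - target))

# Toy usage with made-up sizes: 5 rays, 8-channel 16x16 feature maps.
rng = np.random.default_rng(0)
channels, hidden = 8, 32
feat_inner = rng.normal(size=(16, 16, channels))
feat_outer = rng.normal(size=(16, 16, channels))
uv_inner = rng.uniform(size=(5, 2))     # stand-in for the inner-surface uv map lookup
uv_outer = rng.uniform(size=(5, 2))     # stand-in for the outer-surface uv map lookup
in_dim = 2 * channels + 4
layers = [
    (0.1 * rng.normal(size=(in_dim, hidden)), np.zeros(hidden)),
    (0.1 * rng.normal(size=(hidden, 3)), np.zeros(3)),
]
colors = shade_rays(uv_inner, uv_outer, feat_inner, feat_outer, layers)
```

Because each ray needs only two texture lookups and a single MLP evaluation, rather than many samples along the ray, rendering a full image is cheap enough to apply image-level losses, which is the property the overview relies on.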

Qualitative results: A. novel view synthesis

Qualitative results: B. novel pose synthesis

Comparison: A. novel view synthesis

Comparison: B. novel pose synthesis

Citation

@article{kwon2023deliffas,
title = {DELIFFAS: Deformable Light Fields for Fast Avatar Synthesis},
author = {Kwon, Youngjoong and Liu, Lingjie and Fuchs, Henry and Habermann, Marc and Theobalt, Christian},
year = {2023},
journal={Advances in Neural Information Processing Systems}
}