Abstract

We present D-Rex, a person-specific framework for photorealistic, relightable, expressive, and animatable full-body human avatars with free-viewpoint rendering. Existing methods for relightable full-body avatars rely on explicit 3D intrinsic decomposition with analytic reflectance models, which require accurate geometry registration and careful optimization to capture realistic light transport effects. This tight coupling of relighting with avatar modeling has hindered expressiveness: to our knowledge, no existing method demonstrates strong facial animation alongside relighting, limiting applicability in telepresence, gaming, and virtual production. We propose to decouple relighting entirely from avatar modeling by treating it as an image-space post-process: a learned translation from flat-lit, albedo-like renderings to a target HDR illumination. To this end, we leverage the strong generative prior of a pre-trained video diffusion relighting model, fine-tuned via LoRA on paired flat-lit and relit frames captured in a light stage. The flat-lit driving frames are produced by an independent expressive full-body avatar framework trained under white-light conditions. Because this framework requires no modification to support relighting, D-Rex is directly applicable to any white-light avatar system. We demonstrate that D-Rex enables view-consistent and temporally consistent relighting while faithfully preserving expressive motion and fine-grained facial detail, outperforming physically-based relightable avatar baselines.

Method Overview

D-Rex overview. Given a calibrated sequence of flat-lit and HDR-illuminated multi-view frame pairs, D-Rex trains two independent components. An expressive, controllable albedo avatar (EVA) is trained on the flat-lit frames to render albedo-like images for arbitrary pose, expression, and viewpoint. In parallel, a video diffusion relighting model is fine-tuned via LoRA on the flat-lit-to-relit frame pairs, learning a person-specific relighting function that translates flat-lit images to a target illumination. At inference, the relighting function is applied to the albedo avatar's renderings, yielding the relightable avatar.
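
The inference-time composition described above can be sketched as follows. This is only an illustrative sketch, not the authors' implementation: the names AvatarControls, render_relit_sequence, toy_avatar, and toy_relighter are hypothetical, images are reduced to tiny grayscale grids, and the toy stand-ins merely take the place of the trained albedo avatar and the fine-tuned video diffusion model so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

# Toy image type (assumption): rows of grayscale pixel intensities.
Image = List[List[float]]


@dataclass
class AvatarControls:
    """Hypothetical per-frame driving signal for the albedo avatar."""
    pose: Sequence[float]
    expression: Sequence[float]
    camera: Sequence[float]


# Hypothetical interfaces for the two independently trained components.
AlbedoAvatar = Callable[[AvatarControls], Image]
Relighter = Callable[[List[Image], Image], List[Image]]


def render_relit_sequence(
    avatar: AlbedoAvatar,
    relighter: Relighter,
    controls_per_frame: Sequence[AvatarControls],
    env_map: Image,
) -> List[Image]:
    """Render flat-lit frames, then relight them as an image-space post-process."""
    # Step 1: the avatar knows nothing about lighting; it renders flat-lit frames.
    flat_lit = [avatar(controls) for controls in controls_per_frame]
    # Step 2: the relighter sees the whole clip at once, as a video model would.
    return relighter(flat_lit, env_map)


def toy_avatar(controls: AvatarControls) -> Image:
    """Stand-in for the albedo avatar: returns a constant 2x2 'albedo' image."""
    return [[0.5, 0.5], [0.5, 0.5]]


def toy_relighter(frames: List[Image], env_map: Image) -> List[Image]:
    """Stand-in for the relighting model: scales pixels by mean light intensity."""
    values = [v for row in env_map for v in row]
    gain = sum(values) / len(values)
    return [[[px * gain for px in row] for row in frame] for frame in frames]


controls = AvatarControls(pose=[0.0], expression=[0.0], camera=[0.0])
relit = render_relit_sequence(toy_avatar, toy_relighter, [controls] * 3, [[2.0, 2.0]])
```

Because the avatar and the relighter only meet at the image interface, either one could in principle be swapped without retraining the other, which is the decoupling the overview emphasizes.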

Results

Citation

@misc{teufel2025drex,
  title         = {D-Rex: Diffusion Rendering for Relightable Expressive Avatars},
  author        = {Timo Teufel and Xilong Zhou and Umar Iqbal and Jan Kautz and Marc Habermann and Vladislav Golyanik and Christian Theobalt},
  year          = {2025},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}