MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis
Abstract
Conventional methods for human motion synthesis have either been deterministic or have had to struggle with the trade-off between motion diversity and motion quality. In response to these limitations, we introduce MoFusion, a new denoising-diffusion-based framework for high-quality conditional human motion synthesis that can synthesise long, temporally plausible, and semantically accurate motions based on a range of conditioning contexts (such as music and text). We also present ways to introduce well-known kinematic losses for motion plausibility within the motion-diffusion framework through our scheduled weighting strategy. The learned latent space can be used for several interactive motion-editing applications, such as in-betweening, seed conditioning, and text-based editing, thus providing crucial abilities for virtual-character animation and robotics. Through comprehensive quantitative evaluations and a perceptual user study, we demonstrate the effectiveness of MoFusion compared to the state of the art on established benchmarks in the literature. We urge the reader to watch our supplementary video.
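To make the scheduled-weighting idea concrete, the following minimal sketch shows how a time-step-dependent weight could modulate a kinematic loss inside a standard denoising-diffusion training step. This is an illustrative sketch, not the released implementation: the names model and kinematic_loss_fn, the x0-parameterisation, and the particular weight schedule are assumptions made here for clarity.

import torch

def diffusion_training_step(model, x0, cond, alphas_cumprod, kinematic_loss_fn):
    """One denoising-diffusion training step with a time-scheduled kinematic loss.

    x0:   clean motion sequence, shape (batch, frames, joints * 3)
    cond: conditioning context, e.g. a text or music embedding
    alphas_cumprod: cumulative noise-schedule products, shape (T,)
    """
    batch = x0.shape[0]
    T = alphas_cumprod.shape[0]

    # Sample a diffusion time step and Gaussian noise for each sequence.
    t = torch.randint(0, T, (batch,), device=x0.device)
    noise = torch.randn_like(x0)

    # Forward-diffuse the clean motion: x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps.
    a_bar = alphas_cumprod[t].view(batch, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise

    # Hypothetical network call: the model predicts the clean motion (x0-parameterisation).
    x0_pred = model(x_t, t, cond)

    # Standard reconstruction loss on the denoised motion.
    loss_rec = torch.mean((x0_pred - x0) ** 2)

    # Kinematic loss (e.g. skeleton consistency), weighted more strongly at small t,
    # where the prediction is expected to resemble a valid pose. The schedule
    # w_t = sqrt(a_bar_t) is an illustrative choice, not the paper's exact one.
    w_t = a_bar.view(batch)
    loss_kin = kinematic_loss_fn(x0_pred, x0)  # assumed to return a per-sample error
    loss = loss_rec + torch.mean(w_t.sqrt() * loss_kin)
    return loss

One plausible rationale for such a schedule is that kinematic constraints are only meaningful once the denoised prediction resembles a valid pose, so their influence is kept small at large (noisy) time steps and grows as t approaches zero.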
Video
Reverse Diffusion Process for Human Motion Synthesis
Text-to-Motion Generation
Results
Music-to-Dance Generation
Results
Seed Conditioned Motion Forecasting
Quality Comparison with State-of-the-art
For music-to-dance generation on unseen music, we observe better perceptual quality than both the ground-truth data and the state of the art, despite a higher FID score. Note that a lower FID does not necessarily correspond to better synthesis quality, as seen in the examples below.
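For reference, FID compares the feature statistics of generated and real motions. The sketch below shows the standard computation, assuming features come from some pretrained motion encoder (the encoder itself is not shown and is not part of this page).

import numpy as np
from scipy.linalg import sqrtm

def fid(real_feats, gen_feats):
    """Frechet Inception Distance between two sets of motion features.

    real_feats, gen_feats: arrays of shape (num_samples, feature_dim),
    e.g. embeddings from a pretrained motion encoder.
    """
    mu_r, mu_g = real_feats.mean(axis=0), gen_feats.mean(axis=0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_g = np.cov(gen_feats, rowvar=False)

    # Matrix square root of the covariance product; discard tiny imaginary parts.
    cov_sqrt = sqrtm(cov_r @ cov_g)
    if np.iscomplexobj(cov_sqrt):
        cov_sqrt = cov_sqrt.real

    return float(np.sum((mu_r - mu_g) ** 2) + np.trace(cov_r + cov_g - 2.0 * cov_sqrt))

Because FID only measures how closely the feature statistics of generated motions match those of the dataset, a model can score well on it while producing motions that viewers rate as less plausible, which is why perceptual evaluation remains important.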
Downloads
Citation
@InProceedings{dabral2022mofusion,
  title     = {MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis},
  author    = {Rishabh Dabral and Muhammad Hamza Mughal and Vladislav Golyanik and Christian Theobalt},
  booktitle = {Computer Vision and Pattern Recognition (CVPR)},
  year      = {2023}
}
Contact
For questions and clarifications, please get in touch with:
Rishabh Dabral
rdabral@mpi-inf.mpg.de
Vladislav Golyanik
golyanik@mpi-inf.mpg.de