Generative adversarial networks achieve great performance in photorealistic image synthesis in various domains, including human images. However, they usually employ latent vector inputs that encode the sampled output image globally. This does not allow convenient control of semantically relevant individual parts of the image, and it does not allow drawing samples that differ only in partial aspects, such as clothing style. We address these limitations and present a generative model for images of dressed humans offering control over pose, local body part appearance and garment style. This is the first method to solve various aspects of human image generation, such as global appearance sampling, pose transfer, part and garment transfer, and part sampling, jointly in a unified framework. As our model encodes part-based latent appearance vectors in a normalized pose-independent space and warps them to different poses, it preserves body and clothing appearance under varying posture. Experiments show that our flexible and general generative method outperforms task-specific baselines for pose-conditioned image generation, pose transfer and part sampling in terms of realism and output resolution.
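The idea of keeping part-based latent vectors independent of pose and then warping them to a target pose can be illustrated with a small sketch. This is not the authors' implementation. It assumes that each body part has one latent vector and that a part-index map for the target pose is available from some external estimator; the function name `warp_part_latents`, the array layouts and the toy part maps are all hypothetical.

```python
import numpy as np


def warp_part_latents(part_latents, part_map):
    """Scatter per-part latent vectors onto a posed part-index map.

    part_latents: (P, D) array, one latent appearance vector per body part
                  (hypothetical layout, stored independently of pose).
    part_map:     (H, W) integer array; each pixel holds a part index in
                  [0, P), or -1 for background (assumed to come from an
                  external pose/part estimator).
    Returns an (H, W, D) feature map that is zero at background pixels.
    """
    part_latents = np.asarray(part_latents, dtype=np.float32)
    part_map = np.asarray(part_map)
    h, w = part_map.shape
    features = np.zeros((h, w, part_latents.shape[1]), dtype=np.float32)
    foreground = part_map >= 0
    # Every foreground pixel receives the latent vector of its body part.
    features[foreground] = part_latents[part_map[foreground]]
    return features


# Toy example: 3 parts with 4-dimensional latents, two different poses.
rng = np.random.default_rng(0)
latents = rng.normal(size=(3, 4))

pose_a = np.array([[-1, 0, 0],
                   [1, 1, 2]])
pose_b = np.array([[0, 0, -1],
                   [2, 1, 1]])

feat_a = warp_part_latents(latents, pose_a)
feat_b = warp_part_latents(latents, pose_b)
```

In this sketch the same `latents` array yields the same per-part vectors in both poses, and only their spatial arrangement changes. Under the same assumptions, replacing a single row of `latents` with a row taken from another person would loosely correspond to part or garment transfer.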

Animation Swapp

Results on Pose-Transfer

For the pose-transfer experiment, we used the train/test pairs of the DeepFashion dataset that were also used in existing works such as PoseGAN, DPT, and CBI. Specifically, our training and testing pairs were generated with the publicly available code of PoseGAN. On this page, we provide our results for the 176 testing pairs (a subset of the full set of testing pairs) that were used in the paper for the quantitative results. Please find our results in the Downloads section.


Downloads

  • Paper, PDF 30.1 MB

  • Results (512x512)
    176 testing pairs
    ~1 MB

  • Results (512x512)
    entire test set (8554 testing pairs)
    ~140 MB


	title={HumanGAN: A Generative Model of Human Images},
	author={Sarkar, Kripasindhu and Liu, Lingjie and Golyanik, Vladislav and Theobalt, Christian},
	booktitle={2021 International Conference on 3D Vision (3DV)},


This work was partially supported by the ERC Consolidator Grant 4DReply (770784).


For questions and clarifications please get in touch with:
Kripasindhu Sarkar ksarkar@mpi-inf.mpg.de
