Hierarchical Diffusion Autoencoders and Disentangled Image Manipulation

04/24/2023
by Zeyu Lu, et al.

Diffusion models have attained impressive visual quality for image synthesis. However, how to interpret and manipulate their latent space has not been extensively explored. Prior work on diffusion autoencoders encodes semantic representations into a semantic latent code, which fails to reflect the rich information of details and the intrinsic feature hierarchy. To mitigate these limitations, we propose Hierarchical Diffusion Autoencoders (HDAE), which exploit the fine-grained-to-abstract and low-level-to-high-level feature hierarchy for the latent space of diffusion models. The hierarchical latent space of HDAE inherently encodes semantics at different levels of abstraction and provides more comprehensive semantic representations. In addition, we propose a truncated-feature-based approach for disentangled image manipulation. We demonstrate the effectiveness of the proposed approach with extensive experiments and applications on image reconstruction, style mixing, controllable interpolation, detail-preserving and disentangled image manipulation, and multi-modal semantic image synthesis.
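The abstract names a hierarchical latent space and a truncated-feature-based approach to manipulation but does not spell out either mechanism. The sketch below is only one plausible reading, not the authors' implementation. It assumes that a hierarchical code can be modeled as a list of per-level vectors, that "truncation" means keeping only the largest-magnitude entries of an edit direction, and that style mixing swaps whole levels between two codes. The names `truncate_direction`, `edit_level`, and `mix_levels` are hypothetical.

```python
from typing import List, Set

Vector = List[float]


def truncate_direction(direction: Vector, keep: int) -> Vector:
    """Zero all but the `keep` largest-magnitude entries (assumed meaning of truncation)."""
    if keep <= 0:
        return [0.0] * len(direction)
    # Indices of the `keep` entries with the largest absolute value.
    order = sorted(range(len(direction)), key=lambda i: abs(direction[i]), reverse=True)
    kept = set(order[:keep])
    return [d if i in kept else 0.0 for i, d in enumerate(direction)]


def edit_level(code: List[Vector], level: int, direction: Vector,
               scale: float, keep: int) -> List[Vector]:
    """Shift one level of a hierarchical code along a truncated edit direction."""
    sparse = truncate_direction(direction, keep)
    edited = [list(v) for v in code]  # copy so the input code is left unchanged
    edited[level] = [z + scale * d for z, d in zip(code[level], sparse)]
    return edited


def mix_levels(code_a: List[Vector], code_b: List[Vector],
               levels_from_b: Set[int]) -> List[Vector]:
    """Take the listed levels from code_b and all remaining levels from code_a."""
    return [list(code_b[i]) if i in levels_from_b else list(code_a[i])
            for i in range(len(code_a))]


# Toy two-level code: level 0 stands in for low-level detail, level 1 for abstract semantics.
code = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
direction = [0.1, -2.0, 0.5]
edited = edit_level(code, level=1, direction=direction, scale=0.5, keep=2)
mixed = mix_levels(code, [[9.0, 9.0, 9.0], [8.0, 8.0, 8.0]], {1})
```

Under these assumptions, editing touches only the few coordinates that dominate the direction, which is one way an edit could leave unrelated attributes alone. In a real system the vectors would come from a trained encoder and the direction from a learned attribute classifier, neither of which is modeled here.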


