IntrinsicDiffusion: Joint Intrinsic Layers from Latent Diffusion Models

Jan 1, 2024
Jundan Luo, Duygu Ceylan, Jae Shin Yoon, Nanxuan Zhao, Julien Philip, Anna Frühstück, Wenbin Li, Christian Richardt, Tuanfeng Y. Wang
Figure: Intrinsic image decomposition and its applications.
Abstract
Reasoning about the intrinsic properties of an image, such as albedo, illumination, and surface geometry, is a long-standing problem with many applications in image editing and compositing. Existing solutions to this ill-posed problem either heavily rely on manually designed priors or learn priors from limited datasets that lack diversity. Hence, they fall short in generalizing to in-the-wild test scenarios. In this paper, we show that a large-scale text-to-image generation model trained on a massive amount of visual data can implicitly learn intrinsic image priors. In particular, we introduce a novel conditioning mechanism built on top of a pre-trained foundational image generation model to jointly predict multiple intrinsic modalities from an input image. We demonstrate that predicting different modalities in a collaborative manner improves the overall quality. This design also enables mixing datasets with annotations of only a subset of the modalities during training, contributing to the generalizability of our approach. Our method achieves state-of-the-art performance in intrinsic image decomposition, both qualitatively and quantitatively. We also demonstrate downstream image editing applications, such as relighting and retexturing.
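To make the abstract's core idea concrete, below is a minimal, self-contained sketch (not the authors' implementation) of the two ingredients it describes: a shared denoiser that jointly predicts several intrinsic modalities conditioned on the input image, and a per-modality supervision mask that lets training mix datasets annotating only a subset of modalities. The paper builds on a pre-trained latent diffusion model such as Stable Diffusion; here that backbone is replaced by a tiny stand-in network so the sketch runs on its own, and all module and parameter names (`TinyDenoiser`, `training_step`, the simplified noise schedule) are hypothetical.

```python
# Conceptual sketch of joint intrinsic-layer denoising with cross-modality
# attention and partial supervision. Assumptions: 4-channel latents, a toy
# linear noise schedule, and stand-in modules in place of a pre-trained UNet.
import torch
import torch.nn as nn

MODALITIES = ["albedo", "shading", "normals"]  # any subset may be annotated

class TinyDenoiser(nn.Module):
    """Stand-in for a pre-trained diffusion UNet, extended so that the
    modalities are denoised collaboratively via cross-modality attention."""
    def __init__(self, ch=32):
        super().__init__()
        self.inp = nn.Conv2d(4 * 2, ch, 3, padding=1)   # noisy latent + image latent
        self.attn = nn.MultiheadAttention(ch, num_heads=4, batch_first=True)
        self.out = nn.Conv2d(ch, 4, 3, padding=1)

    def forward(self, z_noisy, z_image):
        # z_noisy: (B, M, 4, H, W) noisy latents, one per modality
        # z_image: (B, 4, H, W) latent of the conditioning input image
        B, M, C, H, W = z_noisy.shape
        cond = z_image.unsqueeze(1).expand(B, M, C, H, W)
        h = self.inp(torch.cat([z_noisy, cond], dim=2).flatten(0, 1))  # (B*M, ch, H, W)
        # Each spatial location attends across modalities, so e.g. the
        # normal prediction can inform the shading prediction.
        ch = h.shape[1]
        tokens = h.view(B, M, ch, H * W).permute(0, 3, 1, 2).reshape(B * H * W, M, ch)
        tokens, _ = self.attn(tokens, tokens, tokens)
        h = tokens.reshape(B, H * W, M, ch).permute(0, 2, 3, 1).reshape(B * M, ch, H, W)
        return self.out(h).view(B, M, C, H, W)  # predicted noise per modality

def training_step(model, z_image, z_targets, supervised_mask):
    """One denoising-loss step; `supervised_mask` (B, M) zeroes out the loss
    for modalities a given dataset does not annotate."""
    noise = torch.randn_like(z_targets)
    t = torch.rand(z_targets.shape[0], 1, 1, 1, 1)       # simplified schedule
    z_noisy = (1 - t) * z_targets + t * noise
    pred = model(z_noisy, z_image)
    per_mod = ((pred - noise) ** 2).mean(dim=(2, 3, 4))  # (B, M)
    return (per_mod * supervised_mask).sum() / supervised_mask.sum().clamp(min=1)

if __name__ == "__main__":
    model = TinyDenoiser()
    B, M, H, W = 2, len(MODALITIES), 16, 16
    z_image = torch.randn(B, 4, H, W)
    z_targets = torch.randn(B, M, 4, H, W)
    mask = torch.tensor([[1., 1., 0.], [1., 0., 1.]])    # partial annotations
    print("loss:", training_step(model, z_image, z_targets, mask).item())
```

The masking in `training_step` is what allows mixing datasets with heterogeneous annotations: a batch element supervises only the modalities its source dataset provides, while the shared attention still couples all modalities at inference time.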
Publication: SIGGRAPH 2024