IntrinsicDiffusion: Joint Intrinsic Layers from Latent Diffusion Models
Jan 1, 2024
Jundan Luo
Duygu Ceylan
Jae Shin Yoon
Nanxuan Zhao
Julien Philip
Anna Frühstück
Wenbin Li
Christian Richardt
Tuanfeng Y. Wang
Abstract
Reasoning about the intrinsic properties of an image, such as albedo, illumination, and surface geometry, is a long-standing problem with many applications in image editing and compositing. Existing solutions to this ill-posed problem either rely heavily on manually designed priors or learn priors from limited datasets that lack diversity. Hence, they fall short of generalizing to in-the-wild test scenarios. In this paper, we show that a large-scale text-to-image generation model trained on a massive amount of visual data can implicitly learn intrinsic image priors. In particular, we introduce a novel conditioning mechanism built on top of a pre-trained foundational image generation model to jointly predict multiple intrinsic modalities from an input image. We demonstrate that predicting different modalities in a collaborative manner improves the overall quality. This design also enables mixing datasets with annotations for only a subset of the modalities during training, contributing to the generalizability of our approach. Our method achieves state-of-the-art performance in intrinsic image decomposition, both qualitatively and quantitatively. We also demonstrate downstream image editing applications, such as relighting and retexturing.
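To make the idea of joint, image-conditioned prediction with partially annotated datasets concrete, here is a minimal PyTorch-style sketch. It is not the paper's implementation: the module and function names (`JointIntrinsicDenoiser`, `masked_diffusion_loss`), the modality set, and the tiny convolutional backbone are all illustrative stand-ins. It only shows the two ingredients described in the abstract: a single shared denoiser that predicts noise for all intrinsic-modality latents at once, conditioned on the input-image latent, and a loss mask that lets datasets labeled with only some modalities contribute to training.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Assumed modality set for illustration only.
MODALITIES = ["albedo", "shading", "normal"]

class JointIntrinsicDenoiser(nn.Module):
    """Toy denoiser: one shared backbone predicts noise for all intrinsic
    modality latents jointly, conditioned on the input-image latent."""
    def __init__(self, latent_ch=4, hidden=64, n_modalities=len(MODALITIES)):
        super().__init__()
        in_ch = latent_ch * (1 + n_modalities)  # image latent + all modality latents
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, hidden, 3, padding=1), nn.SiLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.SiLU(),
            nn.Conv2d(hidden, latent_ch * n_modalities, 3, padding=1),
        )

    def forward(self, image_latent, noisy_modality_latents):
        # noisy_modality_latents: (B, M, C, H, W) -> stack along channels
        b, m, c, h, w = noisy_modality_latents.shape
        x = torch.cat([image_latent,
                       noisy_modality_latents.reshape(b, m * c, h, w)], dim=1)
        # Per-modality noise prediction, shaped back to (B, M, C, H, W).
        return self.net(x).reshape(b, m, c, h, w)

def masked_diffusion_loss(pred_noise, true_noise, label_mask):
    """label_mask: (B, M) with 1 where a dataset provides that modality's
    ground truth; unlabeled modalities contribute no loss."""
    per_modality = F.mse_loss(pred_noise, true_noise,
                              reduction="none").mean(dim=(2, 3, 4))
    return (per_modality * label_mask).sum() / label_mask.sum().clamp(min=1)

# Usage with random tensors standing in for VAE-encoded latents.
model = JointIntrinsicDenoiser()
img = torch.randn(2, 4, 32, 32)
noisy = torch.randn(2, len(MODALITIES), 4, 32, 32)
noise = torch.randn_like(noisy)
mask = torch.tensor([[1., 1., 0.],   # dataset with albedo + shading labels
                     [1., 0., 1.]])  # dataset with albedo + normal labels
loss = masked_diffusion_loss(model(img, noisy), noise, mask)
```

The masking is what allows mixing datasets with different annotation coverage: every sample conditions the shared backbone on the same kind of input, while gradients flow only through the modalities for which ground truth exists.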
Type: Publication
Publication: SIGGRAPH 2024