arXiv 2302.05543

Adding Conditional Control to Text-to-Image Diffusion Models

By Lvmin Zhang, Anyi Rao, et al.

Published 2023-02-10

We present ControlNet, a neural network architecture that adds spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large diffusion model and reuses its deep, robust encoding layers, pretrained on billions of images, as a strong backbone for learning a diverse set of conditional controls. The neural architecture is connected with "zero convolutions"…
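The idea of connecting a trainable branch to a locked model through "zero convolutions" can be illustrated with a small NumPy sketch. It assumes that a zero convolution is a 1x1 convolution whose weights and bias start at zero, which is how the term is usually described for ControlNet; the abstract above is truncated before it defines it. The functions `locked_block` and `trainable_copy` are hypothetical stand-ins for a frozen pretrained block and its trainable copy, not the paper's actual layers.

```python
import numpy as np

def conv1x1(x, weight, bias):
    """Apply a 1x1 convolution to a (channels, height, width) feature map."""
    # weight has shape (out_channels, in_channels); bias has shape (out_channels,)
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]

def make_zero_conv(channels):
    """Return (weight, bias) for a 1x1 convolution initialized to all zeros."""
    return np.zeros((channels, channels)), np.zeros(channels)

def locked_block(x):
    """Hypothetical stand-in for a frozen pretrained block."""
    return np.tanh(x)

def trainable_copy(x, condition):
    """Hypothetical stand-in for the trainable copy, which also sees the condition."""
    return np.tanh(x + condition)

def controlled_block(x, condition, zero_in, zero_out):
    """Add the trainable branch to the locked block through two zero convolutions."""
    c = conv1x1(condition, *zero_in)        # condition enters via a zero conv
    branch = trainable_copy(x, c)           # trainable branch processes x and c
    return locked_block(x) + conv1x1(branch, *zero_out)  # branch exits via a zero conv

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))              # toy feature map
cond = rng.normal(size=(4, 8, 8))           # toy spatial condition (e.g. an edge map)
zero_in, zero_out = make_zero_conv(4), make_zero_conv(4)

y = controlled_block(x, cond, zero_in, zero_out)
# With zero-initialized convolutions the branch contributes nothing yet,
# so the combined block reproduces the locked block exactly.
assert np.allclose(y, locked_block(x))
```

Under this assumption, the combined model behaves exactly like the locked model at the start of training, and the condition only begins to influence the output once the zero-convolution parameters move away from zero.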
