Project 5

Fun With Diffusion Models!

Yuan-Hao Huang

yhhuang20@berkeley.edu

Overview

This project explores diffusion models. In Part A, we use an existing diffusion model to produce a range of interesting results. In Part B, we implement and train our own diffusion model built on a U-Net.

Part A: The Power of Diffusion Models!

0. Setup

The smaller and larger images are 64x64 and 256x256, respectively.

num_inference_steps = 5:

num_inference_steps = 10:

num_inference_steps = 20:

Images generated with a larger num_inference_steps are sharper and more detailed. All images match their prompts accurately. The random seed used here is 180.

1.1 Implementing the Forward Process
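
As a rough illustration of the forward process, the sketch below adds Gaussian noise to a clean image using the standard DDPM relation x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps. The linear beta schedule, the name `alphas_cumprod`, and the random stand-in image are assumptions made for this example, not values taken from the project.

```python
import numpy as np

def forward(x0, t, alphas_cumprod, rng):
    """Sample a noisy x_t from a clean image x0 at timestep t."""
    abar_t = alphas_cumprod[t]
    eps = rng.standard_normal(x0.shape)  # eps ~ N(0, I)
    x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps
    return x_t, eps

# Hypothetical linear beta schedule with 1000 steps (an assumption).
betas = np.linspace(1e-4, 0.02, 1000)
alphas_cumprod = np.cumprod(1.0 - betas)

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(3, 64, 64))  # stand-in for a test image
x_t, eps = forward(x0, 750, alphas_cumprod, rng)
```

Larger t means a smaller abar_t, so the result contains less of the image and more noise.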

Test image at different noise levels:

1.2 Classical Denoising

Gaussian blur filtering:

1.3 One-Step Denoising
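
One-step denoising can be sketched by inverting the forward relation: given a noise estimate eps_hat, the clean image estimate is (x_t - sqrt(1 - abar_t) * eps_hat) / sqrt(abar_t). In the sketch below the true noise stands in for the model's prediction, and `abar_t = 0.3` is an arbitrary illustrative value; both are assumptions for this example.

```python
import numpy as np

def estimate_x0(x_t, eps_hat, abar_t):
    """Recover a clean-image estimate from x_t and a predicted noise eps_hat."""
    return (x_t - np.sqrt(1.0 - abar_t) * eps_hat) / np.sqrt(abar_t)

rng = np.random.default_rng(0)
abar_t = 0.3                                   # illustrative cumulative alpha
x0 = rng.uniform(-1.0, 1.0, size=(3, 64, 64))  # stand-in clean image
eps = rng.standard_normal(x0.shape)
x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps

# With a perfect noise estimate the clean image is recovered exactly;
# a real model's eps_hat is only approximate, so the result is blurrier.
x0_hat = estimate_x0(x_t, eps, abar_t)
```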

Original image:

Noisy image:

Estimate of the original image using one-step denoise:

1.4 Iterative Denoising
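
A minimal sketch of one iterative denoising update, assuming the usual DDPM-style interpolation between the current clean-image estimate and the noisy image when stepping from timestep t to a less noisy t'. The function name and the example values of `abar_t` and `abar_prev` are hypothetical, and the added-variance term is omitted for brevity.

```python
import numpy as np

def denoise_step_mean(x_t, x0_hat, abar_t, abar_prev):
    """Mean of x_{t'} given x_t and the current clean-image estimate x0_hat."""
    alpha_t = abar_t / abar_prev  # effective alpha for a (possibly strided) step
    beta_t = 1.0 - alpha_t
    coef_x0 = np.sqrt(abar_prev) * beta_t / (1.0 - abar_t)
    coef_xt = np.sqrt(alpha_t) * (1.0 - abar_prev) / (1.0 - abar_t)
    return coef_x0 * x0_hat + coef_xt * x_t

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(3, 64, 64))
abar_t, abar_prev = 0.2, 0.5  # illustrative values; t is noisier than t'

# Sanity check in the noiseless case: if x_t is just the scaled clean image,
# the update lands on the scaled clean image at the less noisy level.
x_t = np.sqrt(abar_t) * x0
x_prev = denoise_step_mean(x_t, x0, abar_t, abar_prev)
```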

Intermediate noisy image during denoising:

Denoising results (64x64, 256x256):

1.5 Diffusion Model Sampling

1.6 Classifier-Free Guidance (CFG)
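
Classifier-free guidance combines a conditional and an unconditional noise estimate as eps = eps_u + gamma * (eps_c - eps_u). The sketch below shows only that combination step; the random arrays stand in for real model outputs and are assumptions for illustration.

```python
import numpy as np

def cfg_noise(eps_uncond, eps_cond, gamma):
    """Blend unconditional and conditional noise estimates with scale gamma."""
    return eps_uncond + gamma * (eps_cond - eps_uncond)

rng = np.random.default_rng(0)
eps_u = rng.standard_normal((3, 64, 64))  # stand-in unconditional estimate
eps_c = rng.standard_normal((3, 64, 64))  # stand-in conditional estimate

eps_gamma0 = cfg_noise(eps_u, eps_c, 0.0)  # purely unconditional
eps_gamma1 = cfg_noise(eps_u, eps_c, 1.0)  # purely conditional
eps_gamma7 = cfg_noise(eps_u, eps_c, 7.0)  # extrapolates past the conditional
```

With gamma = 0 the prompt is ignored, gamma = 1 gives plain conditional sampling, and negative values push the estimate away from the prompt.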

gamma = 7:

Observation: The generated images are much more colorful than those produced by the previous algorithm (sampling without CFG).

It is also fun to play with the guidance scale gamma:

gamma = -70:

gamma = -15:

gamma = -7:

gamma = 0:

gamma = 7:

gamma = 15:

gamma = 70:

1.7 Image-to-image Translation

Original image:

SDEdit:

Original image:

SDEdit:

Original image:

SDEdit:

1.7.1 Editing Hand-Drawn and Web Images

(Web) Original image:

SDEdit:

(Web) Original image:

SDEdit:

(Hand-Drawn) Original image:

SDEdit:

(Hand-Drawn) Original image:

SDEdit:

(Hand-Drawn) Original image:

SDEdit:

1.7.2 Inpainting
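
One common way to inpaint with a diffusion model is to overwrite, after every denoising step, the pixels outside the mask with a freshly noised copy of the original image. The sketch below shows that single projection step. The convention that the mask is 1 where new content is generated, and the helper name `keep_known_pixels`, are assumptions for this example.

```python
import numpy as np

def keep_known_pixels(x_t, x_orig, mask, abar_t, rng):
    """Keep x_t inside the mask; elsewhere use the original noised to level t."""
    eps = rng.standard_normal(x_orig.shape)
    noised_orig = np.sqrt(abar_t) * x_orig + np.sqrt(1.0 - abar_t) * eps
    return mask * x_t + (1.0 - mask) * noised_orig

rng = np.random.default_rng(0)
x_orig = rng.uniform(-1.0, 1.0, size=(3, 64, 64))  # stand-in original image
x_t = rng.standard_normal(x_orig.shape)            # stand-in current sample
mask = np.zeros((1, 64, 64))
mask[:, 16:48, 16:48] = 1.0                        # square region to repaint

# abar_t = 1.0 corresponds to the final, noise-free step.
x_final = keep_known_pixels(x_t, x_orig, mask, 1.0, rng)
```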

Original image:

Inpainted:

Original image:

Mask:

Inpainted:

Original image:

Mask:

Inpainted:

1.7.3 Text-Conditional Image-to-image Translation

Original image:

Prompt: a photo of a dog

Original image:

Prompt: a man wearing a hat

1.8 Visual Anagrams
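
A visual anagram can be sketched by averaging two noise estimates: one for the upright image with the first prompt, and one for the vertically flipped image with the second prompt, flipped back before averaging. `toy_predictor` below is a hypothetical stand-in for a real noise-prediction model, used only so the example runs.

```python
import numpy as np

def anagram_noise(x_t, predict_noise, prompt_upright, prompt_flipped):
    """Average the upright estimate with the flipped-back flipped estimate."""
    eps_upright = predict_noise(x_t, prompt_upright)
    flipped = np.flip(x_t, axis=-2)  # flip the image upside down
    eps_flipped = np.flip(predict_noise(flipped, prompt_flipped), axis=-2)
    return 0.5 * (eps_upright + eps_flipped)

def toy_predictor(x, prompt):
    """Hypothetical model: scales the input by a prompt-dependent factor."""
    scales = {
        "an oil painting of an old man": 1.0,
        "an oil painting of people around a campfire": 3.0,
    }
    return scales[prompt] * x

rng = np.random.default_rng(0)
x_t = rng.standard_normal((3, 64, 64))
eps = anagram_noise(
    x_t,
    toy_predictor,
    "an oil painting of an old man",
    "an oil painting of people around a campfire",
)
```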

an oil painting of an old man

an oil painting of people around a campfire

a photo of the amalfi cost

a photo of a dog

an oil painting of a snowy mountain village

a pencil

1.9 Hybrid Images
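
A hybrid image can be sketched by combining the low frequencies of one prompt's noise estimate with the high frequencies of another's. The Gaussian kernel size (33) and sigma (2.0) are assumed values for illustration, and the random arrays stand in for real model outputs.

```python
import numpy as np

def gaussian_kernel(size, sigma):
    """Normalized 1D Gaussian kernel."""
    offsets = np.arange(size) - (size - 1) / 2.0
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    return kernel / kernel.sum()

def low_pass(x, size=33, sigma=2.0):
    """Separable Gaussian blur over the last two axes (zero-padded edges)."""
    kernel = gaussian_kernel(size, sigma)
    blur_1d = lambda row: np.convolve(row, kernel, mode="same")
    out = np.apply_along_axis(blur_1d, -2, x)
    return np.apply_along_axis(blur_1d, -1, out)

def hybrid_noise(eps_low, eps_high):
    """Low frequencies from eps_low plus high frequencies from eps_high."""
    return low_pass(eps_low) + (eps_high - low_pass(eps_high))

rng = np.random.default_rng(0)
eps_a = rng.standard_normal((3, 64, 64))  # stand-in estimate for prompt 1
eps_b = rng.standard_normal((3, 64, 64))  # stand-in estimate for prompt 2
eps = hybrid_noise(eps_a, eps_b)
```

The first prompt then dominates when the image is viewed from far away, and the second when it is viewed up close.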

a lithograph of a skull + a lithograph of waterfalls

a rocket ship + a photo of a dog

a photo of a dog + a lithograph of a skull

Part B: Diffusion Models from Scratch!

Part 1: Training a Single-Step Denoising UNet
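
Training pairs for a single-step denoiser can be produced by adding Gaussian noise of a chosen strength to clean images, z = x + sigma * eps. The sketch below assumes 28x28 grayscale inputs in [0, 1]; the random array is a stand-in for a real image.

```python
import numpy as np

def add_noise(x, sigma, rng):
    """Return z = x + sigma * eps with eps ~ N(0, I)."""
    return x + sigma * rng.standard_normal(x.shape)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(1, 28, 28))  # stand-in clean image
sigmas = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
noisy = [add_noise(x, s, rng) for s in sigmas]
```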

A visualization of the noising process using sigma = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]:

Training loss curve:

Sample results on the test set after the 1st epoch [input | noisy | output]:

Sample results on the test set after the 5th epoch [input | noisy | output]:

Sample results on the test set with out-of-distribution noise levels, sigma = [0.0, 0.2, 0.4, 0.5, 0.6, 0.8, 1.0], [noisy | output]:

Part 2: Training a Diffusion Model

2.1 Adding Time Conditioning to UNet

Training loss curve:

Sampling results for the time-conditioned UNet after 5 epochs:

Sampling results for the time-conditioned UNet after 20 epochs:

2.4 Adding Class-Conditioning to UNet

Training loss curve:

Sampling results for the class-conditioned UNet after 5 epochs:

Sampling results for the class-conditioned UNet after 20 epochs: