author: @himanshustwts

Hey people! Hope you’re doing well:)

I got the itch to start learning diffusion models again. Last time, I left off once I got to its architecture.

This time I’m gonna cover it from scratch: building intuition, diving into the mathematics and ideas, and ending with implementation.

So this is Part 1 of the series. In this part, I’ve covered GANs and VAEs from the ground up.

The flow of the blog will be as follows:

  1. Idea behind Generative Deep Learning
  2. Earlier Attempts - GANs and VAEs
  3. Architecture and shortcomings of GANs and VAEs
  4. Intuition behind Diffusion Models (DDPM paper)
  5. Ideation of Diffusion Models Architecture
  6. Forward Pass and implementation

Idea behind Generative Deep Learning

The idea: we want to learn a distribution over the data so that we can generate new data from it.

I’m assuming you’re all familiar with traditional machine learning approaches, where models learn to predict labels (or outputs). But we never wanted to limit ourselves to prediction-based approaches alone, so what could be more interesting?
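To put that contrast in probability terms (standard notation, not tied to any particular paper): a predictive model learns the distribution of labels given inputs, while a generative model learns the distribution of the data itself, and that is exactly what lets us sample fresh data points:

$$
\underbrace{p(y \mid x)}_{\text{discriminative: predict labels}} \qquad \text{vs.} \qquad \underbrace{p_\theta(x) \approx p_{\text{data}}(x)}_{\text{generative: model the data}}, \qquad x_{\text{new}} \sim p_\theta(x)
$$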

This is where generative models come into the picture: they aim to generate new content (images, music, text, or even realistic 3D models!).

At its core, generative deep learning aims to train models that capture the patterns, or underlying structure, of the data.
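Before we touch neural networks, here’s the whole recipe in miniature. This is just a toy sketch I’m adding for intuition (a single Gaussian fit with NumPy, not a real generative model): pick a family of distributions, fit its parameters to the data, then sample from the fit.

```python
import numpy as np

# A minimal sketch of the generative idea:
# 1) assume the data comes from some distribution,
# 2) fit that distribution's parameters to the data ("training"),
# 3) sample from the fitted distribution ("generation").

rng = np.random.default_rng(seed=42)

# Stand-in for a training set, drawn from a distribution unknown to us.
data = rng.normal(loc=5.0, scale=2.0, size=10_000)

# "Training": estimate mean and std, i.e. learn a Gaussian over the data.
mu, sigma = data.mean(), data.std()

# "Generation": draw brand-new samples from the learned distribution.
new_points = rng.normal(loc=mu, scale=sigma, size=5)
print(new_points)  # looks like the training data, but none of it is memorized
```

GANs, VAEs, and diffusion models all follow this same recipe; they just replace the single Gaussian with far more expressive distributions parameterized by neural networks.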

Let’s say we want to train a model to create a scene that looks like it’s from ‘Classroom of the Elite’. If we show the model enough scenes from this masterpiece of an anime, it can start picking up the patterns, textures, and styles it needs to generate new scenes depending upon the context!