~ by https://x.com/himanshustwts

30th July, 2024

Hi all,

I hope you're as excited as I am to embark on this journey of unraveling CNNs from the ground up. Today, we're diving deep into the heart of convolutional neural networks, armed with nothing but our mathematical intuition. All you need is a flavour of Linear Algebra (how matrix operations are performed), a bit of Calculus, and a little focus to read and understand this article. Trust me, I won't let you get bored before the end. Without further ado, let's start!

In this article, we will cover the following modules (think of this as our table of contents):

  1. Convolution Operation
  2. Convolutional Layer
  3. Reshape Layer
  4. Binary Cross Entropy Loss
  5. Sigmoid Activation
  6. Solving MNIST

Feeling overwhelmed looking at the contents above? Don't worry, just go through the article and leave the rest to me.

1. Convolution Operation

In a CNN, we have two fundamental components: the input and the kernel.

The input to a CNN is typically an image or a multidimensional array representing data. The kernel is a small matrix of weights that performs the convolution operation on the input data.

Pretty sure you have a question here: what exactly is convolution? It is a sliding window operation that combines two pieces of information, the input and the kernel: the kernel slides over the input, and at each position we multiply the overlapping values elementwise and sum them up to produce one output value. Let's build some real intuition for this.
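Before the intuition, here is a minimal sketch of that sliding window in NumPy. The function name, the example image, and the kernel values are all illustrative choices of mine, not anything standard. One caveat worth knowing: what deep learning frameworks call "convolution" is technically cross-correlation (the kernel is not flipped), and that's what this sketch computes too.

```python
import numpy as np

def convolve2d_valid(inp, kernel):
    """Slide the kernel over the input; at each position, take the
    elementwise product of the overlapping patch and the kernel,
    then sum it into a single output value ("valid" mode: no padding,
    so the output is smaller than the input)."""
    ih, iw = inp.shape
    kh, kw = kernel.shape
    oh, ow = ih - kh + 1, iw - kw + 1  # output shrinks by kernel size - 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(inp[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 3x3 "image" and a 2x2 kernel (both made up for illustration)
image = np.array([[1., 2., 3.],
                  [4., 5., 6.],
                  [7., 8., 9.]])
kernel = np.array([[1., 0.],
                   [0., -1.]])

print(convolve2d_valid(image, kernel))
# Each output entry is top-left minus bottom-right of a 2x2 patch,
# e.g. position (0, 0): 1*1 + 2*0 + 4*0 + 5*(-1) = -4
```

Notice the output is 2x2, not 3x3: a 2x2 kernel can only sit at 2x2 = 4 distinct positions inside a 3x3 input. That shrinking behaviour will matter later when we build the convolutional layer.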

  1. Imagine you're looking at a large painting through a small magnifying glass.