What is Convolution?
Convolution is a process in which we slide a small filter (also called a kernel) over an image and, at each position, sum the element-wise products of the filter and the pixels beneath it. The result highlights features of the image, such as edges.
For instance, suppose I have a grayscale image whose left half is white and whose right half is black.
Original Image
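The 6x6 matrix representation of the above image looks like this (the left half is encoded as 0 and the right half as 1):
0 0 0 1 1 1
0 0 0 1 1 1
0 0 0 1 1 1
0 0 0 1 1 1
0 0 0 1 1 1
0 0 0 1 1 1
Matrix Representation of Image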
Now, I will detect the boundary between the white and black regions by applying a vertical edge detection filter to this image.
 1  0 -1
 2  0 -2
 1  0 -1
Vertical-edge Filter
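If we apply this 3x3 filter to the 6x6 image with a stride of one and no padding, we get a 4x4 matrix as output, since (6 - 3)/1 + 1 = 4. Call its 16 entries ‘a’ through ‘p’, row by row. Let’s see how we calculate each of them.
To calculate ‘a’, I apply the filter to the top-left 3x3 section of the image, multiplying each pixel by the filter value at the same position and adding up the nine products: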
a = 0*1 + 0*0 + 0*(-1) + 0*2 + 0*0 + 0*(-2) + 0*1 + 0*0 + 0*(-1) = 0
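Similarly, we calculate the remaining values by sliding the filter one pixel at a time. Because every row of the image is identical, the four rows of the output are identical too: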
b = 0*1 + 0*0 + 1*(-1) + 0*2 + 0*0 + 1*(-2) + 0*1 + 0*0 + 1*(-1) = -4
c = 0*1 + 1*0 + 1*(-1) + 0*2 + 1*0 + 1*(-2) + 0*1 + 1*0 + 1*(-1) = -4
d = 1*1 + 1*0 + 1*(-1) + 1*2 + 1*0 + 1*(-2) + 1*1 + 1*0 + 1*(-1) = 0
e = 0*1 + 0*0 + 0*(-1) + 0*2 + 0*0 + 0*(-2) + 0*1 + 0*0 + 0*(-1) = 0
f = 0*1 + 0*0 + 1*(-1) + 0*2 + 0*0 + 1*(-2) + 0*1 + 0*0 + 1*(-1) = -4
g = 0*1 + 1*0 + 1*(-1) + 0*2 + 1*0 + 1*(-2) + 0*1 + 1*0 + 1*(-1) = -4
h = 1*1 + 1*0 + 1*(-1) + 1*2 + 1*0 + 1*(-2) + 1*1 + 1*0 + 1*(-1) = 0
i = 0*1 + 0*0 + 0*(-1) + 0*2 + 0*0 + 0*(-2) + 0*1 + 0*0 + 0*(-1) = 0
j = 0*1 + 0*0 + 1*(-1) + 0*2 + 0*0 + 1*(-2) + 0*1 + 0*0 + 1*(-1) = -4
k = 0*1 + 1*0 + 1*(-1) + 0*2 + 1*0 + 1*(-2) + 0*1 + 1*0 + 1*(-1) = -4
l = 1*1 + 1*0 + 1*(-1) + 1*2 + 1*0 + 1*(-2) + 1*1 + 1*0 + 1*(-1) = 0
m = 0*1 + 0*0 + 0*(-1) + 0*2 + 0*0 + 0*(-2) + 0*1 + 0*0 + 0*(-1) = 0
n = 0*1 + 0*0 + 1*(-1) + 0*2 + 0*0 + 1*(-2) + 0*1 + 0*0 + 1*(-1) = -4
o = 0*1 + 1*0 + 1*(-1) + 0*2 + 1*0 + 1*(-2) + 0*1 + 1*0 + 1*(-1) = -4
p = 1*1 + 1*0 + 1*(-1) + 1*2 + 1*0 + 1*(-2) + 1*1 + 1*0 + 1*(-1) = 0
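Final Output:
0 -4 -4 0
0 -4 -4 0
0 -4 -4 0
0 -4 -4 0
In the output, the two middle columns hold a nonzero value (-4) because the filter window straddles the boundary between the two halves there, while the outer columns are 0 because the window covers a uniform region. This is how the vertical edge (separation) is highlighted.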
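The arithmetic above can be reproduced in a few lines of code. The following is a minimal sketch in plain Python, assuming a square image and a square filter; the function name `apply_filter` is made up for illustration, and the image and filter values are read off the calculations above. As in the worked example, the filter is applied without flipping it (strictly speaking a cross-correlation, which is what deep learning libraries usually call convolution).

```python
def apply_filter(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and return the output matrix."""
    n, f = len(image), len(kernel)
    out_size = (n - f) // stride + 1  # (6 - 3) // 1 + 1 = 4 in the example
    output = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            # Multiply the window element-wise with the filter and sum the products.
            total = 0
            for i in range(f):
                for j in range(f):
                    total += image[r * stride + i][c * stride + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

# Values taken from the worked example: every image row is 0 0 0 1 1 1.
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
kernel = [[1, 0, -1],
          [2, 0, -2],
          [1, 0, -1]]

result = apply_filter(image, kernel)
for row in result:
    print(row)  # each of the four rows prints [0, -4, -4, 0]
```

The first printed row corresponds to a, b, c, d, the second to e, f, g, h, and so on.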
Building a Convolutional Neural Network
In this section, I will show you how to build a simple convolutional neural network step-by-step.
We’ll take the MNIST handwritten-digit images and build a convolutional network to classify them.
Step 1: Importing Necessary Libraries
# Notebook-only command; run `pip install tensorflow` in a shell otherwise
!pip install tensorflow
import datetime
import os

import tensorflow as tf
from tensorflow.keras import datasets, layers, models
Step 2: Downloading & Splitting the dataset
(train_images, train_labels), (test_images, test_labels) = datasets.mnist.load_data()
# Reshaping 3D tensors to 4D tensors to satisfy CNN requirements
train_images = train_images.reshape((60000,28,28,1))
test_images = test_images.reshape((10000,28,28,1))
Step 3: Building the Model
models = models.Sequential()
models.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape = (28,28,1)))
models.add(layers.MaxPooling2D((2, 2)))
models.add(layers.Conv2D(64, (3, 3), activation='relu'))
models.add(layers.MaxPooling2D((2, 2)))
models.add(layers.Conv2D(64, (3, 3), activation='relu'))
models.add(layers.Flatten())
models.add(layers.Dense(64, activation='relu'))
models.add(layers.Dense(10, activation='softmax'))
Step 4: Verifying the model structure and number of parameters
# The network built in Step 3 is stored in the variable `models`
models.summary()
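The shapes and parameter counts in the summary can be checked by hand. The sketch below is one way to do that in plain Python, assuming the Keras defaults for the layers in Step 3 (no padding and stride 1 for Conv2D, pooling stride equal to the pool size). The helper names `conv_output_size`, `conv_params`, and `dense_params` are made up for illustration and are not part of Keras.

```python
def conv_output_size(n, f, stride=1):
    """Output width/height of a convolution without padding."""
    return (n - f) // stride + 1

def conv_params(f, in_channels, filters):
    """f * f * in_channels weights per filter, plus one bias per filter."""
    return f * f * in_channels * filters + filters

def dense_params(inputs, units):
    """One weight per input-unit pair, plus one bias per unit."""
    return inputs * units + units

size, channels, total = 28, 1, 0  # MNIST input: 28x28 pixels, 1 channel

# Three Conv2D layers with 3x3 filters; the first two are followed by 2x2 max pooling.
for filters, pooled in [(32, True), (64, True), (64, False)]:
    total += conv_params(3, channels, filters)
    size = conv_output_size(size, 3)
    channels = filters
    if pooled:
        size //= 2  # 2x2 pooling halves each spatial dimension, rounding down
    print(f"after conv{' + pool' if pooled else ''}: {size}x{size}x{channels}")

flat = size * size * channels      # Flatten
total += dense_params(flat, 64)    # Dense(64)
total += dense_params(64, 10)      # Dense(10)
print("flattened features:", flat)
print("total parameters:", total)
```

If the numbers printed here differ from the summary, the model definition probably differs from the one in Step 3.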
Step 5: Compiling the Model
models.compile(optimizer='adam',
               loss='sparse_categorical_crossentropy',
               metrics=['accuracy'])
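The "sparse" variant of the loss is used because the MNIST labels are plain integers (0 to 9) rather than one-hot vectors. As a rough illustration of what the loss measures for a single example, here is a plain-Python sketch; the function `sparse_ce` is a made-up helper, not the Keras implementation, and the probability values are invented for the example.

```python
import math

def sparse_ce(label, probabilities):
    """Loss for one example: negative log of the probability given to the true class."""
    return -math.log(probabilities[label])

# Invented softmax output: a confident, correct prediction for digit 3.
confident = [0.01] * 10
confident[3] = 0.91
confident_loss = sparse_ce(3, confident)
print(confident_loss)  # small loss

# A uniform guess over the 10 digits gives log(10).
uniform = [0.1] * 10
uniform_loss = sparse_ce(3, uniform)
print(uniform_loss)
```

The more probability the softmax layer assigns to the correct digit, the closer the loss gets to zero.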
Step 6: Fitting the model on our dataset
models.fit(train_images, train_labels, epochs=5)
Step 7: Evaluating the Model
test_loss, test_acc = models.evaluate(test_images, test_labels)
print(test_acc)
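The accuracy metric is the fraction of test images whose predicted digit matches the true label. The sketch below shows that calculation in plain Python on toy values; the function name `accuracy` and the data are made up for illustration and are not produced by the model above.

```python
def accuracy(labels, predicted_probabilities):
    """Fraction of examples whose highest-probability class equals the true label."""
    correct = 0
    for label, probs in zip(labels, predicted_probabilities):
        predicted_class = max(range(len(probs)), key=lambda k: probs[k])  # argmax
        if predicted_class == label:
            correct += 1
    return correct / len(labels)

# Toy values for illustration only (3 classes, 4 examples); not MNIST output.
toy_labels = [0, 2, 1, 1]
toy_probs = [
    [0.8, 0.1, 0.1],  # predicts 0, correct
    [0.2, 0.2, 0.6],  # predicts 2, correct
    [0.5, 0.3, 0.2],  # predicts 0, wrong
    [0.1, 0.7, 0.2],  # predicts 1, correct
]
toy_accuracy = accuracy(toy_labels, toy_probs)
print(toy_accuracy)  # 3 of 4 correct
```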
Result: Our simple CNN achieves a test accuracy of approximately 99%.