Table of contents
1. Introduction
2. Brief About the Traffic Sign Recognition
3. Image Classification
4. Implementation
4.1. Dataset
4.2. Importing Libraries
4.3. Loading the Dataset
4.4. Plotting the histogram for the number of images
4.5. Splitting the dataset
4.6. One hot encoding
4.7. Building a CNN Model
4.8. Training and Validating the model
4.9. Plotting the loss and the accuracy
4.10. Testing the Model
5. Frequently Asked Questions
6. Key Takeaways
Last Updated: Mar 27, 2024

Traffic Sign Recognition using CNN

Author Mayank Goyal

Introduction

Deep learning is a rapidly developing field, and a large share of modern problem statements are tackled with it. If we have to pick a deep learning technique for a computer vision problem, we will usually go with a convolutional neural network (CNN).

In this article, we will build our first image classification project using a CNN, understand its power, and see why it has become so popular.

Brief About the Traffic Sign Recognition

Companies like Google, Tesla, Uber, Ford, Audi, Toyota, Mercedes-Benz, and many more are working on automating vehicles and making driverless cars more accurate. We all might have heard about self-driving cars, where the vehicle runs on the road without any human guidance, acting as its own driver. It is natural to worry about safety: a machine making the wrong decision can cause a significant accident.

Researchers are therefore developing algorithms that push road safety and recognition accuracy as close to perfect as possible. One such task is Traffic Sign Recognition, which we will talk about in this article.

When we go on the road, we see various traffic signs like traffic signals, turn left or right, speed limits, zebra crossing, u-turn, no passing of heavy vehicles, no entry, children crossing, etc., that we need to follow for safety purposes. Likewise, autonomous or self-driving cars must interpret these signboards and make decisions to achieve maximum accuracy. The task of recognizing which class a traffic signboard belongs to is called traffic sign recognition.

In this deep learning article, we will build a model that classifies the traffic sign present in an image into one of many categories using a CNN and the Keras library.

Image Classification

Image classification is the task in which a system takes an input image and assigns it an appropriate label.

Organizations use image classification today to streamline and speed up their workflows. Have you ever wondered how your phone can identify your face and your family's faces, or how cars can follow traffic rules automatically? This became possible when image processing came into the picture. As technology advances, new algorithms and neural networks become more powerful, capable of handling large collections of images and videos, processing them, and producing meaningful predictions.

A CNN, a branch of deep learning, processes image and video data by convolving learned filters over the input to extract features and then feeding those features into fully connected layers to classify the image.

A convolutional neural network is the preferred choice of most data scientists for image or video data. It is also easy to take a pre-trained model via transfer learning and extend it with our own layers.

Now let’s dive into the implementation part:

Implementation

Dataset

We will download the dataset from the link given here. The dataset contains three folders. The first one is Meta, which includes one sample image for each of the 43 classes, and the other two are the Train and Test folders. The Train folder has a subfolder for every one of the 43 classes, and each subfolder contains a varying number of images. In total, the dataset has more than 50,000 images of various traffic signs spread over 43 classes, and the number of images per class varies considerably.
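Before loading anything, it can help to confirm the folder layout on disk. The sketch below is only a sanity check; it assumes the archive was extracted to the same Downloads path used in the rest of this article, so adjust it to your own location.

import os

dataset_root = r"C:\Users\goyal\Downloads"   # adjust to wherever you extracted the dataset
print(os.listdir(dataset_root))              # the Meta, Train and Test folders should appear here

train_dir = os.path.join(dataset_root, "Train")
print("Class folders in Train:", len(os.listdir(train_dir)))  # expected: 43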

Importing Libraries

Let’s start by importing all the required libraries. We will use the Keras API to build the model layer by layer, so we install TensorFlow (which bundles Keras) and scikit-learn before importing the deep learning layers.

pip install tensorflow
pip install keras
pip install scikit-learn

Importing all the libraries

import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from PIL import Image

We use the os module to iterate over all the images along with their respective class folders and labels.

Loading the Dataset

Now, we will load all the images into a single list, where each entry holds the pixel values of one image as an array. We will make another list containing the label or class of the corresponding image. To feed the image data to the CNN model, we first need to convert both lists into NumPy arrays.

The training dataset contains class folders named 0 to 42. With the help of the os module, we iterate through each class folder, append the image to the data list (see the code below) and the corresponding label to the labels list. The accompanying CSV files map each class id to the actual sign name.

image_path = r"C:\Users\goyal\Downloads\Train"
data = []
labels = []
classes = 43
for i in range(classes):
    path = os.path.join(image_path, str(i))  # class folders are named 0-42
    images = os.listdir(path)
    for img in images:
        try:
            image = Image.open(os.path.join(path, img))
            image = image.resize((30, 30))
            image = np.array(image)
            data.append(image)
            labels.append(i)
        except Exception as e:
            print("Error loading image", img, e)
data = np.array(data)
labels = np.array(labels)

image_path stores the path of the training dataset on our local device, data holds the input images, and labels stores the corresponding class ids.
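As a quick sanity check, we can print the shapes of the arrays we just built. The exact count depends on your copy of the dataset; the train/test split printed later in this article implies roughly 39,000 training images in total.

print("data shape:", data.shape)      # e.g. (39209, 30, 30, 3): images resized to 30x30 with 3 colour channels
print("labels shape:", labels.shape)  # e.g. (39209,): one class id per image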

Plotting the histogram for the number of images

import seaborn as sns
fig = sns.histplot(labels, bins=43, edgecolor="black", linewidth=2)
fig.set(title="Traffic signs frequency graph",
        xlabel="ClassId",
        ylabel="Frequency")

ClassId is the unique id given for each distinctive traffic sign.

As we can see from the above plot, the dataset does not contain equal amounts of images for each class, and hence, the model may be biased in detecting some traffic signs more accurately than others.
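If you prefer exact numbers to a plot, a minimal sketch like the one below counts the images per class directly from the labels array built earlier.

import numpy as np

class_ids, counts = np.unique(labels, return_counts=True)
for class_id, count in zip(class_ids, counts):
    print(f"class {class_id}: {count} images")
print("smallest class:", counts.min(), "largest class:", counts.max())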

Splitting the dataset

from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)
print("training shape: ",x_train.shape, y_train.shape)
print("testing shape: ",x_test.shape, y_test.shape)

Output

training shape:  (31367, 30, 30, 3) (31367,)

testing shape:  (7842, 30, 30, 3) (7842,)

One hot encoding

We will one-hot encode the label arrays into categorical vectors, since the model's output layer produces one probability per class in the same format.

from tensorflow.keras.utils import to_categorical
y_train = to_categorical(y_train, 43)
y_test = to_categorical(y_test, 43)
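To see what to_categorical does, a label such as 2 becomes a 43-element vector with a 1 at index 2 and 0 everywhere else:

from tensorflow.keras.utils import to_categorical

print(to_categorical([2], 43))  # [[0. 0. 1. 0. ... 0.]] - a one-hot row vector of length 43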

Building a CNN Model

We will now build a CNN to classify the images into their correct labels. For image data, a CNN is usually the best choice.

The architecture of our CNN model

  • Conv2D layer (filters=32, kernel_size=(5,5), activation="relu")
  • Conv2D layer (filters=64, kernel_size=(3,3), activation="relu")
  • MaxPool2D layer (pool_size=(2,2))
  • Dropout layer (rate=0.25)
  • Conv2D layer (filters=64, kernel_size=(3,3), activation="relu")
  • MaxPool2D layer (pool_size=(2,2))
  • Dropout layer (rate=0.25)
  • Flatten layer
  • Dense fully connected layer (256 nodes, activation="relu")
  • Dropout layer (rate=0.5)
  • Dense output layer (43 nodes, activation="softmax")
from tensorflow.keras.layers import Conv2D, Dense, Flatten, MaxPool2D, Dropout
from tensorflow.keras.models import Sequential
model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu', input_shape=x_train.shape[1:]))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(rate=0.25))
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(rate=0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(rate=0.5))
model.add(Dense(43, activation='softmax'))
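To verify that the model matches the architecture listed above, we can print a summary of the layers; this is an optional inspection step.

model.summary()  # prints each layer's output shape and its number of trainable parameters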

Purpose of Different layers

  • MaxPool2D – reduces the spatial size of the feature maps, keeping the strongest activations.
  • Dense – a fully connected feed-forward layer.
  • Flatten – flattens the multi-dimensional feature maps into a single vector so they can be fed to the dense layers.
  • Dropout – a regularization technique that randomly drops units during training to reduce overfitting.

We use the softmax activation function in the last layer because this is a multiclass classification problem; it turns the 43 raw outputs into a probability distribution over the classes.
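As a small illustration (with made-up scores, not part of the model code), softmax exponentiates the raw outputs and normalizes them so that they sum to 1:

import numpy as np

logits = np.array([2.0, 1.0, 0.1])             # hypothetical raw scores for three classes
probs = np.exp(logits) / np.exp(logits).sum()  # softmax
print(probs)        # approximately [0.659, 0.242, 0.099]
print(probs.sum())  # 1.0 (up to floating-point rounding)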

Training and Validating the model

While compiling the model, we need to specify the loss function, the metrics, and the optimizer to use.

model.compile(
    loss='categorical_crossentropy', 
    optimizer='adam', 
    metrics=['accuracy']
)
  • Loss function – measures how far the model's predictions are from the true labels. We use categorical cross-entropy because this is a multiclass classification problem.
  • Optimizer – updates the model's weights to minimize the loss function; here we use Adam.

We will now fit the model on the training data and validate it on the test split. We need to define the number of epochs to train for and the batch size while fitting the model.

epochs = 15
history = model.fit(x_train, y_train, epochs=epochs, batch_size=64, validation_data=(x_test, y_test))

Plotting the loss and the accuracy

plt.figure(0)
plt.plot(history.history['accuracy'], label='training accuracy')
plt.plot(history.history['val_accuracy'], label='val accuracy')
plt.title('Accuracy')
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.legend()
plt.show()
plt.figure(1)
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.title('Loss')
plt.xlabel('epochs')
plt.ylabel('loss')
plt.legend()
plt.show()

The model's performance is pretty good. The plots show the training and validation accuracy increasing over the epochs while the loss keeps decreasing.

Testing the Model

The dataset contains a Test folder with the test images and a Test.csv file. The CSV file lists each image's path and its respective label. We load the CSV with pandas, open each image, resize it to 30x30 pixels, and convert it to a NumPy array. After processing the test images, we check the model's predictions against the actual labels to get the accuracy.

from sklearn.metrics import accuracy_score
test = pd.read_csv(r'C:\Users\goyal\Downloads\Test\Test.csv')
labels = np.array(test['ClassId'])
test_img_path = r"C:\Users\goyal\Downloads"
test_imgs = test['Path'].values
test_data = []
for img in test_imgs:
    im = Image.open(test_img_path + '/' + img)
    im = im.resize((30, 30))
    im = np.array(im)
    test_data.append(im)
test_data = np.array(test_data)
y_pred = model.predict(test_data)        # one probability vector of length 43 per image
predictions = np.argmax(y_pred, axis=1)  # class with the highest predicted probability
print("accuracy: ", accuracy_score(labels, predictions))

Output

accuracy:  0.9676959619952494

Frequently Asked Questions

1. What is categorical cross-entropy loss?

Ans. Categorical cross-entropy is a loss function used in multiclass classification tasks where a data sample can only belong to one out of many categories. Formally, it is designed to quantify the difference between two probability distributions.
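As a small worked example with made-up numbers, for a one-hot true label the categorical cross-entropy reduces to the negative log of the probability assigned to the correct class:

import numpy as np

y_true = np.array([0, 0, 1])             # one-hot label: the sample belongs to class 2
y_prob = np.array([0.1, 0.2, 0.7])       # hypothetical predicted probabilities
loss = -np.sum(y_true * np.log(y_prob))  # equals -log(0.7), about 0.357
print(loss)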

2. What does the argmax function do?

Ans. Argmax is an operation that finds the argument that gives the maximum value of a function. In machine learning, we use argmax to find the class with the largest predicted probability. In our project above, argmax converts each 43-element prediction vector back into a single class id.
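A one-line example with made-up probabilities:

import numpy as np

print(np.argmax([0.1, 0.7, 0.2]))  # prints 1, the index of the largest value, i.e. the predicted class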

3. How does dropout work in a convolutional neural network?

Ans. During training, dropout randomly sets a fraction of a layer's activations to zero at each update, which is equivalent to temporarily dropping those neurons from the network. This prevents neurons from co-adapting too strongly and therefore reduces overfitting; at inference time, all neurons are kept.
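A minimal sketch with tf.keras makes this concrete; with training=True, roughly half of the inputs are zeroed and the surviving values are scaled up so the expected sum stays unchanged:

import numpy as np
import tensorflow as tf

layer = tf.keras.layers.Dropout(rate=0.5)
x = np.ones((1, 8), dtype="float32")
print(layer(x, training=True).numpy())  # roughly half the entries become 0.0, the rest are scaled to 2.0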

Key Takeaways

Let us briefly summarize the article.

First, we saw why CNNs are well suited to image processing tasks and how traffic sign recognition systems are useful today. Then, we implemented a CNN model for the task, learning about image preprocessing, model building, training, and evaluation along the way.

I hope you all like this article.

Happy Learning Ninjas!
