Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1. Introduction
2. AlexNet and ZFNet
2.1. Convolutional Neural Network
2.2. Deconvolutional Neural Network
2.3. Kernel
2.4. Stride
2.5. Convolutional layer
2.6. Pooling
2.7. Activation layer
3. ZFNet Architecture
3.1. Unpooling
3.2. Rectification
3.3. Filtering
4. Visualization of Each Layer
5. How ZFNet improved AlexNet
6. Frequently Asked Questions
6.1. What is the importance of ZFNet?
6.2. What is the best CNN architecture?
6.3. How many layers are there in AlexNet?
6.4. What is ZFNet used for?
7. Conclusion
Last Updated: Mar 27, 2024
Easy

ZFNet


Introduction 

Deep learning is a field of machine learning used to solve complex problems that were previously out of reach. At the heart of any deep learning algorithm is a neural network. Neural networks are loosely inspired by the human brain: they consist of layers of interconnected nodes (neurons) that pass information from one layer to the next. For a long time, the inner workings of these networks were treated as a black box that no one could see into. There was no clear idea of how a network could be improved, and the only method was trial and error. This changed when Matthew D. Zeiler and Rob Fergus introduced ZFNet, which is named after their surnames. ZFNet offered a new way to understand how each layer of a neural network is performing. Let us see how ZFNet uncovers the secrets of neural networks.


AlexNet and ZFNet

It is important to know about AlexNet to appreciate ZFNet fully. AlexNet was the winner of ImageNet ILSVRC 2012. ImageNet ILSVRC is an annual computer vision competition run on a subset of a publicly available computer vision dataset called ImageNet. This subset contains 1,281,167 training images, 50,000 validation images, and 100,000 test images. The images are classified into 1000 classes, and the size of the dataset is around 150 GB. AlexNet is a convolutional neural network that stacks convolutional layers on top of each other. It performed exceptionally well in ILSVRC 2012 and won by a considerable margin.

ImageNet dataset

AlexNet was considered a massive jump in the accuracy of neural networks. The 2013 ImageNet ILSVRC was won by ZFNet, which fine-tuned AlexNet to achieve its accuracy. ZFNet visualized how each layer of AlexNet performs and which parameters can be tuned to achieve greater accuracy. To do this, it uses a multi-layered deconvolutional neural network to reveal the input stimuli that excite the feature maps at any layer in the model.

Some technical jargon

Let us look at some important technical terms before going on to the architecture of ZFNet.

Convolutional Neural Network

A convolutional neural network (CNN) is a deep learning model that performs particularly well on image data. It takes an image as input, extracts features from it, and learns from those features to make predictions.

Deconvolutional Neural Network

A deconvolutional neural network (deconvnet) performs the same kinds of operations as a convolutional neural network, but in reverse. It aims to map a feature map back to the input image pixels that produced it.

Kernel

A kernel is a filter used to extract features from an image. It is a matrix smaller than the image that slides over the image and extracts the corresponding features.

Stride

It is the step size taken by the kernel in each direction as it moves along the image.
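To make the kernel and stride concrete, here is a minimal NumPy sketch of a kernel sliding over a single-channel image. The function name conv2d, the 6x6 toy image, and the edge filter are illustrative assumptions; real CNN layers also handle multiple channels, padding, and learned weights.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Slide `kernel` over `image` (no padding) and sum the element-wise products."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The stride controls how far the window moves between positions.
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

image = np.arange(36, dtype=float).reshape(6, 6)   # toy 6x6 "image"
kernel = np.array([[1.0, 0.0, -1.0]] * 3)          # simple 3x3 vertical-edge filter

out_s1 = conv2d(image, kernel, stride=1)  # 4x4 output
out_s2 = conv2d(image, kernel, stride=2)  # 2x2 output: larger stride, smaller map
```

A larger stride visits fewer positions, so the output feature map shrinks and some spatial detail is skipped.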

Convolutional layer

It is the basic building block of any convolutional neural network. It contains the kernels (also called filters) that are used to extract features from the image.

Pooling

Pooling reduces the spatial dimensions of a feature map. It takes the maximum or the average of each small region and passes the result to the next layer as input. It is usually placed after a convolutional layer.
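As a rough illustration, max pooling over non-overlapping 2x2 regions can be sketched as follows. The helper name max_pool2d and the 4x4 input are assumptions made for this example only.

```python
import numpy as np

def max_pool2d(x, size=2, stride=2):
    """Keep the maximum of each size x size region of a 2D feature map."""
    out_h = (x.shape[0] - size) // stride + 1
    out_w = (x.shape[1] - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            region = x[i * stride:i * stride + size, j * stride:j * stride + size]
            out[i, j] = region.max()
    return out

x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 5],
              [0, 1, 7, 2],
              [3, 2, 1, 6]], dtype=float)

pooled = max_pool2d(x)  # 2x2 result: [[4, 5], [3, 7]]
```

The 4x4 map becomes 2x2, and only the strongest response in each region survives.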

Activation layer

The activation layer applies a non-linear function to the output of each neuron. This decides whether, and how strongly, a neuron is activated, so that only useful information is passed on.
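ReLU, a common activation function, is a simple example of this idea. The sketch below uses a small hand-picked vector as a stand-in for neuron outputs.

```python
import numpy as np

def relu(x):
    """Pass positive values through unchanged and set everything else to zero."""
    return np.maximum(x, 0.0)

z = np.array([-2.0, -0.5, 0.0, 1.5, 3.0])  # toy pre-activation values
a = relu(z)                                # [0.0, 0.0, 0.0, 1.5, 3.0]
```

Neurons with a negative input are switched off, while the others keep their value.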


ZFNet Architecture

ZFNet uses a deconvolutional neural network for visualization. A deconvolutional neural network is attached to each layer of the CNN that we want to visualize.

The image is fed into the CNN as the input, and features are extracted by the successive layers, producing feature maps. To examine a particular activation, all the other activations in that layer are set to zero, and the feature maps are passed as input to the attached deconvolutional neural network. The operations performed by the CNN layers are then reversed: we unpool, rectify, and filter to reconstruct the activity in the previous layer that gave rise to the chosen activation. This is repeated until the input pixel space is reached.

This is how a deconvolutional neural network is attached to a CNN. It reconstructs the features of the previous layer that activated the current layer.

It will perform the following operations:

Unpooling

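The max pooling operation cannot be reversed exactly. However, we can obtain an approximate inverse by recording the location of the maximum value within each pooling region during the forward pass. These recorded locations, called switches, are used to place each value back at the position it came from.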
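A minimal sketch of this idea is shown below, assuming non-overlapping 2x2 pooling on a single 2D feature map. The function names max_pool_with_switches and unpool are made up for illustration and are not taken from the original implementation.

```python
import numpy as np

def max_pool_with_switches(x, size=2):
    """Max pool and remember where each maximum came from."""
    h, w = x.shape[0] // size, x.shape[1] // size
    pooled = np.zeros((h, w))
    switches = np.zeros((h, w, 2), dtype=int)  # (row, col) of each maximum
    for i in range(h):
        for j in range(w):
            region = x[i * size:(i + 1) * size, j * size:(j + 1) * size]
            r, c = np.unravel_index(np.argmax(region), region.shape)
            pooled[i, j] = region[r, c]
            switches[i, j] = (i * size + r, j * size + c)
    return pooled, switches

def unpool(pooled, switches, out_shape):
    """Put each pooled value back at its recorded location; the rest stays zero."""
    out = np.zeros(out_shape)
    for i in range(pooled.shape[0]):
        for j in range(pooled.shape[1]):
            r, c = switches[i, j]
            out[r, c] = pooled[i, j]
    return out

x = np.array([[1, 3, 2, 0],
              [4, 2, 1, 5],
              [0, 1, 7, 2],
              [3, 2, 1, 6]], dtype=float)

pooled, switches = max_pool_with_switches(x)
restored = unpool(pooled, switches, x.shape)
```

The restored map is mostly zeros: only the four maxima return to their original positions, which is why unpooling is an approximation rather than a true inverse.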


Rectification

A CNN can use different activation functions according to need. A very common one is ReLU, which ensures that the feature maps are never negative. To keep this property in the reconstruction, we pass the unpooled maps through a ReLU to obtain the rectified feature maps.

Filtering

A CNN uses filters to extract features. To reverse this step, the deconvolutional neural network applies filters to the rectified maps. It uses transposed versions of the same filters used in the CNN, which in practice means flipping each filter vertically and horizontally.
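The flipping step can be sketched in NumPy as follows. This is a simplified single-channel illustration under assumed names (correlate_valid, deconv_filter) and a toy 2x2 filter; it is not the original implementation.

```python
import numpy as np

def correlate_valid(x, k):
    """Slide filter k over x without padding, as a CNN layer does."""
    kh, kw = k.shape
    out = np.zeros((x.shape[0] - kh + 1, x.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def deconv_filter(feature_map, k):
    """Project a feature map back one layer using the flipped filter."""
    kh, kw = k.shape
    flipped = k[::-1, ::-1]  # flip vertically and horizontally
    # Zero-pad so the output regains the size of the layer below.
    padded = np.pad(feature_map, ((kh - 1, kh - 1), (kw - 1, kw - 1)), mode="constant")
    return correlate_valid(padded, flipped)

kernel = np.array([[1.0, 2.0],
                   [3.0, 4.0]])
flipped = kernel[::-1, ::-1]          # [[4, 3], [2, 1]]

feature_map = np.zeros((3, 3))
feature_map[1, 1] = 1.0               # a single active unit
reconstruction = deconv_filter(feature_map, kernel)  # 4x4 map
```

With a single active unit, the reconstruction is simply a copy of the original filter placed at the location that unit looked at, which is the pattern in the layer below that would excite it.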


To summarize, unpooling produces the unpooled maps, which are passed through the ReLU activation function to get the rectified feature maps. These are then passed through the deconvolutional filtering layer, which reconstructs the feature maps of the layer below.

ZFNet

Visualization of Each Layer

ZFNet shows the top 9 activations of a feature map. Projecting a particular feature map back to the previous layers reveals which parts of the input activate that feature map.
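Selecting the strongest activations is straightforward once the activations are available. The sketch below uses random numbers as a stand-in for real activations; the array shape (50 images, one 13x13 feature map each) is a hypothetical choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: the response of ONE feature map to 50 different images.
activations = rng.random((50, 13, 13))

# Strongest response of the feature map for each image.
strongest = activations.reshape(len(activations), -1).max(axis=1)

# Indices of the 9 images that excite this feature map the most, strongest first.
top9 = np.argsort(strongest)[::-1][:9]
```

Each of these 9 activations could then be projected back to pixel space to see what the feature map responds to.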

As we go from one layer to another, we can understand the nature of each layer and can know which layer is not performing up to the mark. 

Here, we can clearly understand the different layers of the network. 

Layer 1: The first layer picks up the simplest features, such as lines in the image.

Layer 2: The second layer can figure out the edges and curves in the image.

Layer 3: We start to get some patterns from this layer, such as text.

Layer 4: This layer starts to show class-specific variations, such as a dog's face.

Layer 5: We get entire objects, such as dogs, from this layer.

How ZFNet improved AlexNet

The visualizations revealed two problems in two different layers of AlexNet. First, the filters of layer 1 are a mix of very high and very low frequency information, with the mid frequencies barely covered. This poses a problem throughout the network, as that information is not carried to the later layers.

The other problem was that layer 2 showed aliasing artifacts, caused by the large stride of 4 used in the layer 1 convolutions.

After understanding the problems, changes were made in AlexNet.

1. The filter size of layer 1 was reduced from 11x11 to 7x7.

2. The stride of layer 1 was changed from 4 to 2.
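The effect of these two changes on the size of the first feature map can be estimated with the standard output-size formula for a convolution. The 224-pixel input and zero padding used below are assumptions for illustration, so the exact numbers may differ from a given implementation.

```python
def conv_output_size(input_size, kernel_size, stride, padding=0):
    """Spatial size of a convolution output along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1

# Assumed 224x224 input and no padding (illustrative values only).
alexnet_layer1 = conv_output_size(224, kernel_size=11, stride=4)  # 54
zfnet_layer1 = conv_output_size(224, kernel_size=7, stride=2)     # 109
```

Under these assumptions, the smaller filter and stride roughly double the resolution of the first feature map in each dimension, so the first layer samples the image more densely.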


We can see ZFNet outperforms AlexNet convincingly.


Frequently Asked Questions

What is the importance of ZFNet?

It was an important step in deep learning because it made it possible to visualize what each layer of a neural network has learned.

What is the best CNN architecture?

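There is no single best CNN architecture; the right choice depends on the task, the data, and the available compute. LeNet-5 is one of the earliest and most influential CNN architectures.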

How many layers are there in AlexNet?

AlexNet has 8 layers: 5 convolutional layers and 3 dense (fully connected) layers.

What is ZFNet used for?

It is used for high-accuracy image classification, and its visualization technique is used to inspect what each layer of a CNN has learned.

Conclusion

This article focused on ZFNet. It is a very interesting take on neural networks that helps us visualize the different layers of a network. Instead of relying on trial and error, it can be used to determine visually which parameters should be tuned to get better accuracy. By visualizing the different layers of AlexNet and fine-tuning it accordingly, ZFNet improved on AlexNet's accuracy. ZFNet is a very good example of how interesting the field of machine learning is. To know more about new and interesting approaches in machine learning, check out our industry-level courses at Coding Ninjas.
