Introduction
In Convolutional Neural Networks (CNNs), stride refers to the step size by which the filter/kernel moves across the input image during the convolution operation. Adjusting the stride can affect the output size and computational efficiency of the network, influencing feature extraction and spatial dimensions of the data.
Read this blog till the end for a clear, easy introduction to stride in Convolutional Neural Networks.
What is a Convolutional Neural Network (CNN)?
Before we discuss Stride, let's quickly revise the basics.
Source: https://course.ece.cmu.edu/
A Convolutional Neural Network (CNN) is like a super-smart robot that's really good at understanding pictures, such as the photos we take on our phones.
CNNs help computers make sense of those pictures. They are used for various things such as recognizing cats and dogs, detecting diseases in X-rays, and, yes, even helping to drive self-driving cars!
How are CNNs built?
To understand Stride, we should know a little about how CNNs are built.
Source: https://course.ece.cmu.edu/
Think of a CNN as a collection of small, overlapping magnifying glasses called filters.
These filters scan over different parts of a picture to find interesting things, like edges, shapes, or colours. These filters slide or convolve over the entire image bit by bit.
What is Stride in CNN?
In simple terms, stride tells our filters how big a step they should take each time they slide over the picture.
It's similar to how we decide to take big leaps or small steps when playing jump games.
In the world of CNNs, Stride determines how many pixels our filters move at each step as they travel across the image, from left to right and from top to bottom. With a stride of 1 no pixel position is skipped; with a larger stride, some positions are skipped.
For example, consider the red square as a filter. The computer is going to use this filter to scan the image.
If stride = 1, the filter moves one pixel at a time (see the image below):
If stride = 2, the filter moves two pixels at a time (see the image below):
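The two cases can also be sketched in a few lines of Python. This is only an illustration: the helper name window_starts and the 7-pixel row with a 3-pixel-wide filter are assumptions, not values from the images.

```python
def window_starts(input_width, filter_width, stride):
    """Return the left-edge column of every filter position along one row."""
    last_start = input_width - filter_width  # furthest position that still fits
    return list(range(0, last_start + 1, stride))


# A 7-pixel-wide row scanned by a 3-pixel-wide filter (illustrative sizes).
print(window_starts(7, 3, stride=1))  # [0, 1, 2, 3, 4]
print(window_starts(7, 3, stride=2))  # [0, 2, 4]
```

With stride 1 the filter stops at five positions, with stride 2 at only three, which is why a larger stride produces a smaller output.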
Why Do We Need Stride?
Stride has two main effects in a Convolutional Neural Network. The first is to reduce the size of the output feature map. With a larger stride, the filter is applied at fewer positions of the input feature map, so the output feature map is smaller, which reduces the computational cost.
The second is to control the overlap between receptive fields. The receptive field is the area of the input feature map that is used to calculate the output of a neuron.
For example, with a 3x3 filter, neighbouring receptive fields share two columns of pixels at stride 1 but only one column at stride 2, so a stride of 2 halves the overlap. Less overlap can help prevent the CNN from computing highly redundant features.
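To make the overlap concrete, here is a minimal sketch that counts how many pixel columns two horizontally adjacent filter positions have in common. The helper name shared_columns and the 3-pixel-wide filter are assumptions chosen for illustration.

```python
def shared_columns(filter_width, stride):
    """Columns shared by two horizontally adjacent filter positions."""
    return max(filter_width - stride, 0)


for stride in (1, 2, 3):
    print("stride", stride, "-> shared columns:", shared_columns(3, stride))
# stride 1 -> shared columns: 2
# stride 2 -> shared columns: 1
# stride 3 -> shared columns: 0
```

Once the stride reaches the filter width, neighbouring receptive fields no longer overlap at all.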
How does Stride work?
Assume a convolutional neural network is analysing the content of an image. If the filter size is 4x4 pixels, the sixteen pixels under the filter are combined into a single value in the output layer. The filter then moves on by the stride, so as the stride increases, the filter is applied at fewer positions and the resulting output gets smaller.
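The following is a minimal sketch of this process in plain Python, assuming no padding, a single channel, and an 8x8 input. The function name conv2d_valid and the averaging filter are illustrative choices; real CNN libraries use far faster implementations.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` without padding and return the output map."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out_h = (h - kh) // stride + 1
    out_w = (w - kw) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride  # the stride sets the step size
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    total += image[top + di][left + dj] * kernel[di][dj]
            row.append(total)  # kh * kw input pixels become one output value
        output.append(row)
    return output


# 8x8 image with pixel values 0..63 and a 4x4 averaging filter (assumed values).
image = [[float(r * 8 + c) for c in range(8)] for r in range(8)]
kernel = [[1.0 / 16.0] * 4 for _ in range(4)]

for stride in (1, 2, 4):
    result = conv2d_valid(image, kernel, stride)
    print("stride", stride, "->", len(result), "x", len(result[0]))
# stride 1 -> 5 x 5
# stride 2 -> 3 x 3
# stride 4 -> 2 x 2
```

Each output value summarises sixteen input pixels, and the output shrinks from 5x5 to 2x2 as the stride grows from 1 to 4.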
Stride is a parameter that works in conjunction with padding. Padding adds extra pixels, usually zeros, around the border of the image to limit the reduction in size of the output layer.
In effect, it increases the size of the input to offset the size lost to the filter and the stride. Padding and Stride are fundamental building blocks of a CNN.
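The interaction between the two is captured by the standard output-size formula, floor((n + 2p - k) / s) + 1, where n is the input size, k the filter size, p the padding on each side, and s the stride. A small sketch follows; the function name conv_output_size and the 32-pixel input are assumptions for illustration.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Output width (or height) of a convolution along one dimension."""
    return (input_size + 2 * padding - filter_size) // stride + 1


print(conv_output_size(32, 3, stride=1, padding=0))  # 30: no padding, output shrinks
print(conv_output_size(32, 3, stride=1, padding=1))  # 32: padding restores the size
print(conv_output_size(32, 3, stride=2, padding=1))  # 16: stride 2 halves it
```

Padding of 1 exactly offsets the shrinkage caused by a 3x3 filter, while the stride is what drives deliberate downsampling.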
Now that we have covered both padding and stride, let's compare the two.
Difference between Stride And Padding
The table below summarizes the key differences between the two:

| Characteristic | Stride | Padding |
| --- | --- | --- |
| Definition | The number of pixels that the filter moves over the input image at each step. | The addition of zeros around the edges of the input image. |
| Effect on output image size | A larger stride results in a smaller output image. | More padding results in a larger output image, because it increases the size of the input that the filter slides over. |
| Default value | 1 | 0 |
| Computational complexity | Lower for larger strides | Higher for more padding |
| Ability to preserve detail | Lower for larger strides | Higher for more padding, since border pixels are covered by more filter positions |
| Ability to prevent information loss | Lower for larger strides | Lower for valid padding (no padding), higher for same padding |
How Does Stride Affect CNNs?
Stride is super important because changing the Stride can help CNNs to do different things, such as:
1. Fine Details vs. Big Picture
If we use a small Stride, the filters take tiny steps. This helps CNNs pay attention to all the tiny details in a picture. It's like looking at a painting up close and noticing every brushstroke.
2. Speed vs. Precision
If we use a big Stride, the filters take bigger steps and cover more area quickly. This makes the CNN work faster, but it might miss some of the fine details.
It's like looking at the same painting from far away – we see the big picture but not every tiny detail.
3. Maintaining a Balance
CNNs often use different Strides at different stages. They start with a small Stride to capture fine details and then switch to a bigger Stride to speed things up while still getting the big picture.
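As a rough illustration of this pattern, the sketch below traces the feature-map size through a hypothetical four-layer stack that uses stride 1 first and stride 2 later. The layer settings and the 64-pixel input are assumptions, not taken from any particular architecture.

```python
def conv_output_size(input_size, filter_size, stride, padding):
    """Output width (or height) of a convolution along one dimension."""
    return (input_size + 2 * padding - filter_size) // stride + 1


def trace_sizes(input_size, layers):
    """Return the feature-map size before the first layer and after each layer."""
    sizes = [input_size]
    for filter_size, stride, padding in layers:
        sizes.append(conv_output_size(sizes[-1], filter_size, stride, padding))
    return sizes


# Hypothetical stack: (filter_size, stride, padding) per layer.
layers = [(3, 1, 1), (3, 1, 1), (3, 2, 1), (3, 2, 1)]
print(trace_sizes(64, layers))  # [64, 64, 64, 32, 16]
```

The early stride-1 layers keep the full 64x64 resolution, and the later stride-2 layers each halve it.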
Use Cases For Stride in CNN
Below are the key use cases for Stride:
Less Computationally Expensive
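A larger stride reduces the size of the output feature map, because the filter is applied at fewer positions of the input feature map. Fewer positions mean fewer multiplications and additions, both in this layer and in every layer that follows.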
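A back-of-the-envelope sketch of the saving: the function below counts the multiply-accumulate operations of a single convolution layer on a square input. The name conv_macs and the 224x224 input with a 3x3 filter are illustrative assumptions.

```python
def conv_macs(input_size, filter_size, stride, padding=0,
              in_channels=1, out_channels=1):
    """Multiply-accumulate count for one convolution layer on a square input."""
    out = (input_size + 2 * padding - filter_size) // stride + 1
    per_output = filter_size * filter_size * in_channels
    return out * out * per_output * out_channels


macs_s1 = conv_macs(224, 3, stride=1, padding=1)
macs_s2 = conv_macs(224, 3, stride=2, padding=1)
print(macs_s1)            # 451584
print(macs_s2)            # 112896
print(macs_s1 / macs_s2)  # 4.0
```

Under these assumptions, doubling the stride cuts the work of the layer to about a quarter, because the output shrinks in both height and width.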
Features at different levels of abstraction
Stride influences how quickly the receptive field, the area of the input that contributes to one output value, grows from layer to layer. With larger strides, neurons in deeper layers see a wider region of the original image. This helps control the level of abstraction at which the CNN learns features.
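The growth of the receptive field can be estimated with a common recurrence: each layer adds (filter size - 1) times the product of all earlier strides. The sketch below applies it to two hypothetical three-layer stacks; the function name and the layer settings are assumptions for illustration.

```python
def receptive_field(layers):
    """Receptive field, in input pixels along one dimension, of one output value.

    `layers` is a list of (filter_size, stride) pairs, first layer first.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for filter_size, stride in layers:
        rf += (filter_size - 1) * jump
        jump *= stride
    return rf


print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7
print(receptive_field([(3, 2), (3, 2), (3, 2)]))  # 15
```

Three 3x3 layers with stride 1 see a 7x7 patch of the input, while the same layers with stride 2 see a 15x15 patch.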
Achieve Translation Invariance
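Stride can contribute to a degree of translation invariance: because a strided filter summarises a neighbourhood of pixels into a coarser feature map, small changes in the position of an object may have less effect on later layers. This can make a CNN more robust to changes in the position of objects in the image, although the effect is only approximate.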
Frequently Asked Questions
What is the best size for stride in CNN?
The optimal stride size in a Convolutional Neural Network (CNN) varies based on architecture, input size, and task. Commonly, smaller strides (1 or 2) offer detailed feature extraction, while larger strides (3+) aid in downsampling. It's a trade-off between preserving spatial information and computational efficiency, with no universal size.
When to use a larger or a smaller stride?
Use a larger stride to reduce the size of the output image, and use a smaller stride to preserve more detail in the output image.
Can we change the Stride during different stages of a CNN?
Yes, we can. CNNs often use different Strides at different layers. For example, they can start with a smaller Stride to capture details and then increase it in later layers to reduce the dimensions and speed up the process.
What is the benefit of stride in CNN?
In Convolutional Neural Networks (CNNs), stride refers to the step size by which the filter/kernel moves across the input image. It helps in reducing the spatial dimensions of the output volume, reducing computational complexity, and extracting features at different resolutions, aiding in capturing both fine and coarse-grained patterns effectively.
Conclusion
This article has covered Stride in CNNs in detail. We discussed how CNNs are built, then looked at the differences between Stride and Padding. Finally, we discussed the effects of Stride on CNNs and answered some frequently asked questions.