Code360 powered by Coding Ninjas X Naukri.com. Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1.
Introduction
2.
What are Image Pyramids?
3.
Applications of Image Pyramids
4.
Gaussian Pyramid
5.
Laplace pyramid
6.
Frequently Asked Questions
6.1.
What is the Mat class's purpose in OpenCV?
6.2.
How do you calculate the degree of similarity between two images?
6.3.
To blur an image, can you use a linear filter?
7.
Conclusion
Last Updated: Mar 27, 2024
Easy

Image Pyramids

Introduction

The representation of an image at several resolutions is referred to as an image pyramid. The concept is that undetectable characteristics at one resolution can be easily spotted at a different resolution. For example, if the area of interest is large, a low-resolution image or coarse view will suffice. Small objects, on the other hand, benefit from the high-resolution examination. Evaluating the image at many resolutions can be advantageous if a picture contains both large and small objects of significance; therefore, image pyramids are based on this principle in order to highlight hidden and undetected features in an image.

Also See, Image Sampling 

What are Image Pyramids?

A multi-scale depiction of an image is known as an "image pyramid."

The use of an image pyramid allows us to locate items/objects in photographs at various scales. When used in conjunction with a sliding window, we may locate items in images in various positions.

The original image, in its original size, is at the bottom of the pyramid (in terms of width and height). The image is shrunk (subsampled) and optionally smoothed at each subsequent layer (usually via Gaussian blurring).

The image is gradually subsampled until a stopping condition is reached, which is usually a minimal size at which no more subsampling is required. This can be seen in the image below.

Source: Link

What is the use of blurring an image? Because doing so avoids the aliasing and ringing issues that can occur when downsampling an image directly. The pyramid is now called based on the type of blurring used. For example, if we use a mean filter, the pyramid is called a mean pyramid; if we use a Gaussian filter, the pyramid is called a Gaussian pyramid; and if we don't use any filtering, the pyramid is called a subsampling pyramid, and so on. We can use any interpolation algorithm for subsampling, such as the nearest neighbour, bilinear, bicubic, etc. 

Get the tech career you deserve, faster!
Connect with our expert counsellors to understand how to hack your way to success
User rating 4.7/5
1:1 doubt support
95% placement record
Akash Pal
Senior Software Engineer
326% Hike After Job Bootcamp
Himanshu Gusain
Programmer Analyst
32 LPA After Job Bootcamp
After Job
Bootcamp

Applications of Image Pyramids

The following list describes the uses of image pyramids and their applications:

  1. Image compression
  2. Multi-Scale texture mapping
  3. Multi-focus composites
  4. Multi-scale detection
  5. Multi-scale registration
  6. Image blending
  7. Noise removal
  8. Hybrid images
     

Let us now discuss the following two types of image pyramids: Gaussian and Laplacian Pyramids.

Gaussian Pyramid

Before we subsample the image in Gaussian Pyramid, we apply a 5 X 5 Gaussian filter. Reduce and Expand - are the two types of procedures available. The picture resolution will be reduced (half of the width and height) using the Reduce operation, while the image resolution will be increased (two times the width and height) in the Expand operation. It entails downsampling and applying repeated Gaussian blurring to a picture until certain stopping requirements are met. One of the stopping criteria, for example, could be the minimum image size. In the Gaussian Pyramids, the residuals of the subsamples are not retained. 

Mathematically, the Reduce Operation in the development of Gaussian Pyramid is given by:

                                            Source: Link

                                    
                                         Source: Link

Here, represents the level in the pyramid and i,j are used for iterating column-wise and row-wise on pixels of the image.

The mathematical representation of Expand Operation is given by:

                                           Source: Link

                                             Source: Link

As illustrated below, OpenCV has a built-in function for blurring and downsampling.

cv2.pyrDown(src[, dstsize[, borderType]])
// The size of the output image is calculated by default as Size((src.cols+1)/2, (src.rows+1)/2), i.e. one-fourth at each stage.

 

The source picture is src, and the rest are optional inputs like the output size (dstsize) and border type. 

This function downsamples the input image by eliminating even rows and columns after convolving it with a 5x5 Gaussian kernel. An example of how to use the aforesaid function may be found below.

import cv2
img = cv2.imread('D:/downloads/child.jpg')
img_level_1 = cv2.pyrDown(img)
img_level_2 = cv2.pyrDown(img_level_1)

                                                                                     Source: Link

Laplace pyramid

Let's see the Laplace pyramid now. Because the Laplacian is a high pass filter, the output at each pyramid level will be an image containing edges. The Laplacian can be estimated using the difference of Gaussian, as we explained in the edge detection blog.The Laplacian pyramid provides a coarse representation of the image as well as a set of detail images at different scales. So, we'll make use of this fact and subtract the Gaussian pyramid levels to get the Laplacian pyramid. In the Gaussian Pyramid, the Laplacian of a level is produced by subtracting that level from the extended version of its upper level. This is depicted in the diagram below.

Source: Link

As illustrated in the figure above, OpenCV also has a function to travel down the image pyramid or extend a specific level.

cv2.pyrUp(src[, dstsize[, borderType]])
// By default, Size(src.cols*2, (src.rows*2) is used to calculate the size of the output image.

 

The input image is upsampled by injecting even zero rows and columns, and the result is convolved with a 5x5 Gaussian kernel multiplied by 4. Let's look at an example of the Laplacian pyramid.

In practice, we start with the Gaussian pyramid and build the Laplacian pyramid from there. For a better understanding, look at the formula below.

Source: Link

Source: Link

To begin, load the image.

Then create a three-level Gaussian pyramid.

The topmost level of the Laplacian pyramid is identical to that of the Gaussian pyramid. The remaining levels are created by subtracting that Gaussian level from its upper expanded level from top to bottom. And continuing this way, we can obtain the Laplacian pyramid from the extraction of the Gaussian pyramid.

                                             Source: Link

Also see, Sampling and Quantization

Frequently Asked Questions

What is the Mat class's purpose in OpenCV?

The Mat(Matrix) class in the OpenCV namespace is mostly used to store and retrieve picture information. This class represents an n-dimensional array and is used to store data.

How do you calculate the degree of similarity between two images?

The degree of similarity between two images is measured using a distance metric established on the appropriate multi-dimensional feature space. The Minkowski distance, the Manhattan distance, the Euclidean distance, and the Hausdorff distance are standard distance metrics.

To blur an image, can you use a linear filter?

We cannot use a linear filter in blurring an image because to blur an image we need to compare and smooth neighbouring pixels. This cannot be done using a linear filter.

Conclusion

We learned about Image Pyramids in this blog and specifically talked about Gaussian and Laplacian Pyramids. The Laplacian pyramid is mainly used to compress images and the Gaussian pyramid is primarily used for multi-scale edge estimation. We hope you found this article enjoyable to read. For information on related topics you can visit our website and learn the concepts.

Happy Learning!

Previous article
Edge Detection
Next article
Scale-Invariant Feature Transform (SIFT)
Live masterclass