Table of contents
1. Introduction
2. Sample Model
3. Horizontal and Vertical Augmentation
3.1. Output
4. Horizontal and vertical flip augmentation
4.1. Output
5. Brightness Augmentation
5.1. Output
6. Frequently Asked Questions
7. Key Takeaways
Last Updated: Mar 27, 2024

Data Augmentation


Introduction

Data augmentation is a technique for artificially expanding a dataset by creating modified copies of the samples it already contains. Training on a larger, more varied dataset usually improves the model and helps reduce overfitting, although near-duplicate augmented samples add little new information, and a model can still overfit to them. Data augmentation is most commonly applied to images, where we can change the colors, apply filters, rotate the image, and so on.

[Image: a picture duplicated by de-colorizing, de-texturizing, and flipping/rotating]

In the above image, the original picture is duplicated by de-colorizing, de-texturizing, and flipping or rotating it. Dataset augmentation, the process of applying simple and complex transformations like flipping or style transfer to your data, can help meet the increasingly large data requirements of Deep Learning models. This post will walk through why dataset augmentation is important, how it works, and how Deep Learning fits into the equation.
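As a rough illustration of two of the transformations mentioned above, the following NumPy sketch de-colorizes and rotates a tiny synthetic image. The array values are made up for the example, and the grayscale conversion uses the common ITU-R BT.601 luma weights, which is an assumption about how one might de-colorize rather than something the original image specifies.

```python
import numpy as np

# a tiny 2x3 RGB "image" (height 2, width 3, 3 channels); values are synthetic
image = np.arange(2 * 3 * 3, dtype="float32").reshape(2, 3, 3)

# de-colorize: weighted sum of the R, G, B channels (ITU-R BT.601 luma weights)
gray = image @ np.array([0.299, 0.587, 0.114], dtype="float32")

# rotate by 90 degrees counter-clockwise in the image plane
rotated = np.rot90(image, k=1, axes=(0, 1))

print(gray.shape, rotated.shape)
```

Each transformed array is a new training sample derived from the same original picture.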

We can augment the following kinds of data:

  1. Text
  2. Audio
  3. Images
  4. Any other data
     


Sample Model

Let’s take a sample image for applying data augmentation. 

 

 

Save this image as ‘bird.jpg’. Now let’s perform simple operations for augmenting the above image.

Let’s construct an image data generator.

from keras.preprocessing.image import ImageDataGenerator

# create data generator
datagen = ImageDataGenerator()

Once constructed, an iterator can be created for an image dataset.

The iterator will return one batch of augmented images for each iteration.

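An iterator can be created from an image dataset loaded in memory via the flow() function, or from image files stored in a directory via the flow_from_directory() function; for example:

# load image dataset
X, y = ...
# create iterator from arrays held in memory
it = datagen.flow(X, y)
# or create iterator from image files on disk (one subdirectory per class)
it = datagen.flow_from_directory(directory, ...)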
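To make the iterator behavior concrete, here is a minimal NumPy-only sketch of a generator that yields one randomly augmented batch per iteration. It only illustrates the idea and is not how Keras implements flow(); the function name augmented_batches and its parameters are hypothetical, and a random horizontal flip stands in for the augmentation.

```python
import numpy as np


def augmented_batches(images, labels, batch_size=2, seed=0):
    """Yield (batch_images, batch_labels) forever, flipping each image at random.

    Hypothetical helper for illustration only; not part of Keras.
    """
    rng = np.random.default_rng(seed)
    n = len(images)
    while True:
        # visit the samples in a new random order on every pass
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            batch = images[idx].copy()
            for i in range(len(batch)):
                # flip left-right with probability 0.5
                if rng.random() < 0.5:
                    batch[i] = batch[i][:, ::-1].copy()
            yield batch, labels[idx]


# four tiny 2x3 RGB "images" and their labels (synthetic data)
X = np.arange(4 * 2 * 3 * 3, dtype="float32").reshape(4, 2, 3, 3)
y = np.array([0, 1, 0, 1])

it = augmented_batches(X, y, batch_size=2)
batch_X, batch_y = next(it)
print(batch_X.shape, batch_y.shape)
```

Because the generator never ends, the caller decides how many batches to draw, just as with the loop over the Keras iterator in the examples that follow.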

Horizontal and Vertical Augmentation

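Let’s try horizontal and vertical shift augmentation. A shift moves all pixels of the image in one direction while keeping the image dimensions the same. The width_shift_range and height_shift_range arguments of ImageDataGenerator control the horizontal and vertical shift, respectively.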
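Before turning to Keras, the effect of a shift can be sketched directly on a pixel array. The helper shift_image below is hypothetical and is not what ImageDataGenerator runs internally; it assumes that vacated pixels are filled by repeating the nearest edge pixel, similar in spirit to a "nearest" fill mode.

```python
import numpy as np


def shift_image(image, dx=0, dy=0):
    """Shift an (H, W) or (H, W, C) array by dx columns and dy rows.

    Positive dx moves content right, positive dy moves it down.
    Vacated pixels repeat the nearest edge pixel.
    Hypothetical helper for illustration only; not part of Keras.
    """
    h, w = image.shape[:2]
    # for every output row/column, find the source index and clamp it to the image
    rows = np.clip(np.arange(h) - dy, 0, h - 1)
    cols = np.clip(np.arange(w) - dx, 0, w - 1)
    return image[rows][:, cols]


rng = np.random.default_rng(0)
image = np.arange(12).reshape(3, 4)

# draw a random horizontal shift between -2 and 2 pixels
dx = int(rng.integers(-2, 3))
shifted = shift_image(image, dx=dx)
print(dx)
print(shifted)
```

Passing dy instead of dx gives the vertical counterpart, which corresponds to what height_shift_range controls in Keras.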

# example of horizontal shift image augmentation
from numpy import expand_dims
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot
# load the image
img = load_img('bird.jpg')
# convert to numpy array
data = img_to_array(img)
# expand dimension to one sample
samples = expand_dims(data, 0)
# create image data augmentation generator
# (a list gives the candidate shifts in pixels; one is picked at random per image)
datagen = ImageDataGenerator(width_shift_range=[-200,200])
# prepare iterator
it = datagen.flow(samples, batch_size=1)
# generate samples and plot
for i in range(9):
    # define subplot
    pyplot.subplot(330 + 1 + i)
    # generate batch of images
    batch = next(it)
    # convert to unsigned integers for viewing
    image = batch[0].astype('uint8')
    # plot raw pixel data
    pyplot.imshow(image)
# show the figure
pyplot.show()

Running the above code produces a plot of nine augmented images, each generated with a random horizontal shift.

Output:

 

Horizontal and vertical flip augmentation

An image flip reverses the order of the pixel columns (horizontal flip) or of the pixel rows (vertical flip). The horizontal_flip and vertical_flip arguments of ImageDataGenerator enable them.
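The definition can be checked on a tiny array with plain NumPy slicing. This is only a sketch of what a flip does to the pixels; the array values are made up for the example.

```python
import numpy as np

# a tiny 2x3 single-channel "image"
image = np.array([[1, 2, 3],
                  [4, 5, 6]])

# horizontal flip: mirror left-right by reversing the column order
h_flip = image[:, ::-1]

# vertical flip: mirror top-bottom by reversing the row order
v_flip = image[::-1, :]

print(h_flip)
print(v_flip)
```

The same results can be obtained with np.fliplr and np.flipud.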

# example of horizontal flip image augmentation
from numpy import expand_dims
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot
# load the image
img = load_img('bird.jpg')
# convert to numpy array
data = img_to_array(img)
# expand dimension to one sample
samples = expand_dims(data, 0)
# create image data augmentation generator
datagen = ImageDataGenerator(horizontal_flip=True)
# prepare iterator
it = datagen.flow(samples, batch_size=1)
# generate samples and plot
for i in range(9):
    # define subplot
    pyplot.subplot(330 + 1 + i)
    # generate batch of images
    batch = next(it)
    # convert to unsigned integers for viewing
    image = batch[0].astype('uint8')
    # plot raw pixel data
    pyplot.imshow(image)
# show the figure
pyplot.show()

In the above code, we are plotting the augmented images with a random horizontal flip.

Output:

Brightness Augmentation

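The brightness of an image can be augmented by randomly darkening or brightening it. The brightness_range argument of ImageDataGenerator specifies the range from which the brightness factor is picked: values below 1.0 darken the image, values above 1.0 brighten it, and 1.0 leaves it unchanged.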
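A simplified way to picture this is to scale every pixel by a random factor. The helper random_brightness below is hypothetical and only approximates the idea with plain multiplication and clipping; it is not the routine Keras uses internally, and its default range simply mirrors the [0.2, 1.0] used in the example that follows.

```python
import numpy as np


def random_brightness(image, low=0.2, high=1.0, rng=None):
    """Scale pixel values by a factor drawn uniformly from [low, high].

    Hypothetical helper for illustration only; not part of Keras.
    """
    rng = np.random.default_rng() if rng is None else rng
    factor = rng.uniform(low, high)
    # scale in floating point, then clamp to the valid 8-bit range
    scaled = np.clip(image.astype("float32") * factor, 0, 255)
    return scaled.astype("uint8"), factor


# a uniform 2x2 RGB "image" with every pixel value set to 200
image = np.full((2, 2, 3), 200, dtype="uint8")
darker, factor = random_brightness(image, rng=np.random.default_rng(0))
print(factor, darker[0, 0])
```

With the upper bound at 1.0 the result is never brighter than the original; raising the upper bound above 1.0 would allow brightening as well.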

# example of brightness image augmentation
from numpy import expand_dims
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot
# load the image
img = load_img('bird.jpg')
# convert to numpy array
data = img_to_array(img)
# expand dimension to one sample
samples = expand_dims(data, 0)
# create image data augmentation generator
datagen = ImageDataGenerator(brightness_range=[0.2,1.0])
# prepare iterator
it = datagen.flow(samples, batch_size=1)
# generate samples and plot
for i in range(9):
    # define subplot
    pyplot.subplot(330 + 1 + i)
    # generate batch of images
    batch = next(it)
    # convert to unsigned integers for viewing
    image = batch[0].astype('uint8')
    # plot raw pixel data
    pyplot.imshow(image)
# show the figure
pyplot.show()


In the above code, we are plotting the augmented images with a random change in brightness.

Output:

 

Frequently Asked Questions

  1. What does data augmentation mean?
    Data augmentation in data analysis is a technique used to increase the amount of data by adding slightly modified copies of already existing data or newly created synthetic data from existing data.
     
  2. What is data augmentation in CNN?
    Data augmentation is a technique to artificially create new training data from existing training data. This is done by applying domain-specific techniques to examples from the training data that create new and different training examples.
     
  3. Does data augmentation increase accuracy?
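    Often, yes, although the gain depends on the dataset and the model. Improvements in prediction accuracy of around 1 to 3% are commonly cited. When synthetic samples are produced with generative models, GANs are generally suggested for small datasets and VAEs for larger ones.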

Key Takeaways

In the above blog, we discussed

  • Data augmentation
  • Sample model
  • Horizontal and vertical augmentation
  • Horizontal and vertical flip augmentation
  • Brightness augmentation


