Introduction
Data augmentation is a technique for artificially expanding a dataset so that a model can be trained on more varied examples. Because the model sees many altered versions of each sample instead of the same sample repeatedly, augmentation usually helps reduce overfitting, although overly aggressive or unrealistic transformations can hurt performance. It is most commonly applied to images, where we can change the color, apply filters, rotate the image, and so on.
In the above image, the original picture is duplicated by de-colorizing it, de-texturizing it, and flipping or rotating it. Dataset augmentation, the process of applying simple and complex transformations such as flipping or style transfer to your data, can help meet the increasingly large data requirements of Deep Learning models. This post walks through why dataset augmentation is important, how it works, and how Deep Learning fits into the equation.
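To make the transformations mentioned above concrete, here is a minimal sketch using plain NumPy rather than Keras. The helper names (to_grayscale, horizontal_flip, rotate_90) and the random 4x6 array standing in for a photo are assumptions made for illustration only; a real pipeline would load an actual image and would likely use a library's built-in transforms.

```python
import numpy as np

def to_grayscale(image):
    """Collapse an RGB image of shape (H, W, 3) to one channel using a plain mean."""
    return image.mean(axis=-1, keepdims=True)

def horizontal_flip(image):
    """Mirror the image left to right (reverse the width axis)."""
    return image[:, ::-1, :]

def rotate_90(image):
    """Rotate the image 90 degrees counter-clockwise in the height/width plane."""
    return np.rot90(image, k=1, axes=(0, 1))

# Hypothetical stand-in for a loaded photo: a 4x6 RGB array of random pixel values.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 6, 3)).astype(np.float32)

# Each entry is a new training sample derived from the same original image.
augmented = {
    "grayscale": to_grayscale(image),
    "flipped": horizontal_flip(image),
    "rotated": rotate_90(image),
}
```

Each transformed array keeps the content of the original picture while changing how it is presented, which is what lets one labelled image act as several training samples.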
We can augment
- Text
- Audio
- Images
- Any other data
Sample Model
Let’s take a sample image for applying data augmentation.
Save this image as ‘bird.jpg’. Now let’s perform simple operations for augmenting the above image.
Let’s construct an image data generator.
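# import the generator class (assumes TensorFlow's bundled Keras)
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# create data generator
datagen = ImageDataGenerator()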
Once constructed, an iterator can be created for an image dataset.
The iterator will return one batch of augmented images for each iteration.
An iterator can be created from an image dataset loaded in memory via the flow() function; for example:
# load image dataset
X, y = ...
# create iterator
it = datagen.flow(X, y)
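An iterator can also be created for an image dataset stored on disk via the flow_from_directory() function, which takes the path to the dataset directory rather than in-memory arrays:
# path to the image dataset on disk
directory = ...
# create iterator
it = datagen.flow_from_directory(directory, ...)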
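To illustrate what "one batch of augmented images for each iteration" means, the following is a conceptual sketch of such an iterator written in plain NumPy. It is not how Keras implements flow(); the function name augmented_batches, the random left-right flip, and the tiny synthetic dataset are all assumptions chosen to keep the example small.

```python
import numpy as np

def augmented_batches(X, y, batch_size=2, seed=0):
    """Yield (images, labels) batches forever, flipping each image
    left to right with probability 0.5 (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    while True:
        order = rng.permutation(n)  # reshuffle at the start of every pass
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            batch = X[idx].copy()  # copy so the original data is never modified
            flip = rng.random(len(idx)) < 0.5
            batch[flip] = batch[flip][:, :, ::-1, :]  # reverse the width axis
            yield batch, y[idx]

# Hypothetical in-memory dataset: 5 tiny RGB images whose label equals their index.
X = np.arange(5 * 2 * 3 * 3, dtype=np.float32).reshape(5, 2, 3, 3)
y = np.arange(5)

it = augmented_batches(X, y, batch_size=2)
batch_X, batch_y = next(it)  # one batch of (possibly flipped) images and their labels
```

Because the random choice is made again on every pass, the model can see a different variant of the same image in each epoch, while the stored dataset itself stays unchanged.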