Table of contents
1. What are Autoencoders?
2. Applications of Autoencoders
   2.1. Image Denoising
   2.2. Recommendation System
   2.3. Image Generation
3. Building a simple Autoencoder from scratch
   3.1. Step 1: Importing Necessary Libraries
   3.2. Step 2: Loading the MNIST dataset in the notebook
   3.3. Step 3: Data Preparation
   3.4. Step 4: Initializing the Autoencoder Model
   3.5. Step 5: The Encoder and the Decoder Model
   3.6. Step 6: Training the model on the MNIST digits dataset
   3.7. Output
   3.8. Step 7: Generating Predictions
   3.9. Step 8: Visualizing the difference between original and reconstructed images
   3.10. Output
   3.11. Result
4. Frequently Asked Questions
5. Key Takeaways
Last Updated: Mar 27, 2024

Autoencoders - Introduction & Implementation


What are Autoencoders?

An autoencoder is a feed-forward neural network whose input and output are the same. It encodes the input (for example, an image) into a compressed representation and then decodes it to reconstruct the original. The core idea of autoencoders is that the middle (bottleneck) layer must contain enough information to represent the input.

 

There are three important properties of autoencoders:

1. Data Specific: We can only use an autoencoder on the kind of data it was trained on. For instance, to encode an MNIST digits image, we have to use an autoencoder that was previously trained on the MNIST digits dataset.

2. Lossy: Information is lost while encoding and decoding the images using autoencoders, which means that the reconstructed image will have some details missing as compared to the original image.

3. Unsupervised: Autoencoders belong to the unsupervised machine learning category because we do not require explicit labels corresponding to the data; the data itself acts as input and output.

 

Caption: Architecture of an Autoencoder

Applications of Autoencoders

We primarily use autoencoders for data compression or dimensionality reduction. Once we have a more condensed (low-dimensional) representation of multidimensional data, we can easily visualize it.

Image Denoising

Noise in an image signifies corrupted or bad pixels. To recover a clean image, we have to denoise it, and autoencoders are well suited to this task.

 

To build a denoising autoencoder, we first add noise to our original images and then train the feed-forward neural network to reconstruct the clean originals from the noisy inputs.
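The noise-adding step above can be sketched in a few lines. This is a minimal, hypothetical example assuming images are already scaled to [0, 1]; the `noise_factor` of 0.4 and the random stand-in batch are illustrative choices, not values from the article.

```python
import numpy as np

def add_gaussian_noise(images, noise_factor=0.4, seed=0):
    """Corrupt [0, 1]-scaled images with Gaussian noise, clipping back to [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = images + noise_factor * rng.standard_normal(images.shape)
    return np.clip(noisy, 0.0, 1.0)

# Stand-in batch of four 28x28 "images" (random values in [0, 1])
clean = np.random.default_rng(1).random((4, 28, 28))
noisy = add_gaussian_noise(clean)

# A denoising autoencoder would then be trained on (noisy, clean) pairs,
# e.g. autoencoder.fit(noisy_train, clean_train, ...)
```

The key point is that the noisy images are the inputs and the clean images are the targets, so the network learns to strip the noise away.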

 

Original image

 

Adding noise to the above image will yield the following image.

Noisy image

 

After passing the noisy image through our trained autoencoder, it outputs the denoised image below.

 

Denoised Output

Recommendation System

We can use autoencoders to give users personalized recommendations based on their history. Deep autoencoders are used for Spotify music recommendations, YouTube video recommendations, or Netflix movie recommendations.

 

The input data is a user's history of songs listened to or videos watched. When we feed this history to the autoencoder, the encoder captures the user's interests, and the decoder then generates scores for videos or songs similar to that history.
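The recommendation step can be sketched as follows. This is a hypothetical, numpy-only illustration: the `reconstruction` vector stands in for what a trained autoencoder's output (e.g. `autoencoder.predict(history)`) would look like, and the item counts are made up.

```python
import numpy as np

def recommend(history, reconstruction, k=3):
    """Rank items the user has NOT interacted with by the autoencoder's
    reconstructed scores and return the indices of the top k."""
    scores = np.where(history > 0, -np.inf, reconstruction)  # mask already-seen items
    return np.argsort(scores)[::-1][:k]

# Hypothetical catalog of 8 items; the user has watched items 0 and 3
history = np.array([1, 0, 0, 1, 0, 0, 0, 0], dtype=float)
# Stand-in for the autoencoder's reconstruction: predicted preference scores
reconstruction = np.array([0.9, 0.1, 0.7, 0.8, 0.2, 0.6, 0.05, 0.3])

print(recommend(history, reconstruction))  # [2 5 7]
```

Because the autoencoder assigns high reconstructed scores even to items the user never interacted with, masking out the watched items and ranking the rest yields the recommendations.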

Image Generation

With the autoencoders, we can also generate similar images. Variational Autoencoder (VAE) is a type of generative model, which we use to generate images.

 

For instance, if we input a human face to the autoencoder, we will get similar face instances with slight tweaks.
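Those "slight tweaks" come from sampling: a VAE encodes each input to a mean and a (log-)variance rather than a single code, and draws latent vectors via the reparameterization trick. A minimal sketch of that sampling step, with made-up two-dimensional latent values for illustration:

```python
import numpy as np

def reparameterize(mu, log_var, rng=None):
    """Sample z = mu + sigma * eps with eps ~ N(0, I): the reparameterization
    trick, which keeps sampling differentiable w.r.t. mu and log_var."""
    rng = rng or np.random.default_rng(0)
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Hypothetical 2-dimensional latent code for one encoded face
mu = np.array([0.5, -1.0])
log_var = np.zeros(2)  # sigma = 1 in both dimensions

z = reparameterize(mu, log_var)
# Feeding different samples z to the decoder yields slightly different faces
```

Each call draws a different `z` near `mu`, which is why decoding repeated samples of the same input produces similar-but-varied faces.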

 

Caption: Using autoencoder to generate anime faces

Source: https://iq.opengenus.org/

 

Beyond human faces, we can also use the variational autoencoder (VAE) to generate nature scenery, pictures of historical monuments, aesthetic images, and more.

Building a simple Autoencoder from scratch

To implement an autoencoder, we have to set some hyper-parameters:

 

  1. Code Size: The size of the compressed representation. A smaller code size gives a more condensed representation of the input, and vice versa.
  2. Layers: The number of layers in the encoder and decoder; we can specify any number, and more layers let the network learn more complex features.
  3. Loss Function: To measure reconstruction loss, we use binary cross-entropy if the input values range from 0 to 1, and mean squared error otherwise.
  4. Nodes: The number of nodes/neurons per layer; we can choose any number of neurons for each hidden layer (the input and output sizes are fixed by the data).
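The first and third choices above can be made concrete with a quick sketch. The numbers match the MNIST setup used later (784 flattened pixels, code size 32); the loss-selection rule is the text's own rule of thumb expressed as code.

```python
import numpy as np

input_dim = 784   # a flattened 28x28 MNIST image
code_size = 32    # size of the compressed representation

# Compression factor: how many input values each code value must summarize
compression_factor = input_dim / code_size
print(compression_factor)  # 24.5

# Loss choice rule of thumb: binary cross-entropy for [0, 1]-scaled inputs
x = np.random.default_rng(0).random((10, input_dim))  # already in [0, 1]
loss = 'binary_crossentropy' if (x.min() >= 0 and x.max() <= 1) else 'mse'
print(loss)  # binary_crossentropy
```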

 

It is pretty simple to build a one-layered autoencoder. Let’s see the stepwise demonstration.

Step 1: Importing Necessary Libraries

import numpy as np

import matplotlib.pyplot as plt

import tensorflow as tf
import keras
from keras import layers
from keras.datasets import mnist    # We will be working with MNIST Digits Images dataset

 

Step 2: Loading the MNIST dataset in the notebook

# Loading the dataset in the notebook

(x_train, _), (x_test, _) = mnist.load_data()

print(x_train.shape)
print(x_test.shape)

 

Step 3: Data Preparation

# Normalizing the dataset (setting pixel values between 0 and 1)
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

# Flattening the 28x28 images into a vector of size 784
x_train = x_train.reshape((len(x_train), np.prod(x_train.shape[1:])))
x_test = x_test.reshape((len(x_test), np.prod(x_test.shape[1:])))

print(x_train.shape)
print(x_test.shape)

 

Step 4: Initializing the Autoencoder Model

# The autoencoder will have only one input layer, only one hidden layer, and one output layer

encoded_dimensions = 32    # We are compressing 784 values into 32, a dimensionality reduction factor of 784/32 = 24.5

input_image = keras.Input(shape=(x_train.shape[1],))    # The input layer takes all 784 pixels (note the trailing comma: shape must be a tuple)

encoded = layers.Dense(encoded_dimensions, activation='relu')(input_image)    # Encoded input image with 32 pixels

decoded = layers.Dense(x_train.shape[1], activation='sigmoid')(encoded)    # Decoded encoded image with 784 pixels

autoencoder = keras.Model(inputs = input_image, outputs = decoded)

 

Step 5: The Encoder and the Decoder Model

# Encoder Model

encoder = keras.Model(inputs = input_image, outputs = encoded)

# Decoder Model

encoded_input = keras.Input(shape=(encoded_dimensions,))
decoder_layer = autoencoder.layers[-1]
decoder = keras.Model(encoded_input, decoder_layer(encoded_input))

 

Step 6: Training the model on the MNIST digits dataset

# Training our model on MNIST digits images dataset

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

autoencoder.fit(x_train, x_train, epochs=50, batch_size=256, shuffle=True, validation_data = (x_test, x_test))

Output

 

Step 7: Generating Predictions

encoded_imgs = encoder.predict(x_test)
decoded_imgs = decoder.predict(encoded_imgs)

 

Step 8: Visualizing the difference between original and reconstructed images

n = 6    # To display six digits

plt.figure(figsize=(20, 4))

for i in range(0, n):
   
   # Original Images
   ax = plt.subplot(2, n, i + 1)
   plt.imshow(x_test[i].reshape(28, 28))
   plt.gray()
   ax.get_xaxis().set_visible(False)
   ax.get_yaxis().set_visible(False)

   # Reconstructed Images
   ax = plt.subplot(2, n, i + 1 + n)
   plt.imshow(decoded_imgs[i].reshape(28, 28))
   plt.gray()
   ax.get_xaxis().set_visible(False)
   ax.get_yaxis().set_visible(False)
 
plt.show()

Output

 

Result

The images in the first row are the originals, and the images in the second row are the reconstructions. Because we built a very elementary one-layered network, some detail is lost in the output images; training a deeper model and tuning the hyper-parameters would produce more detailed reconstructions.

 

Frequently Asked Questions

Q1. What are the different types of Autoencoders?

Ans. There are seven types of Autoencoders:

  1. Sparse Autoencoder
  2. Deep Autoencoder
  3. Convolutional Autoencoder
  4. Contractive Autoencoder
  5. Variational Autoencoder
  6. Denoising Autoencoder
  7. Undercomplete Autoencoder

 

Q2. What are the essential components of an autoencoder?

Ans. Every autoencoder has three components:

  1. Encoder
  2. Code
  3. Decoder

 

Q3. Autoencoders belong to which category of Machine Learning?

Ans. Autoencoders belong to the unsupervised machine learning category; they do not need explicit labels for training because input and output are the same.

 

Q4. What are the three properties of Autoencoders?

Ans. The three properties of autoencoders are:

  1. Data Specific,
  2. Lossy (the reconstructed images lose detail compared to the originals),
  3. Unsupervised (they learn automatically from the data examples).

 

Q5. What is Denoising Autoencoder?

Ans. The idea of the denoising autoencoder is that we add random noise to the input images and then ask the autoencoder to recover the original image from the noisy one. The autoencoder has to subtract the noise and output only the meaningful features.

Key Takeaways

Congratulations on finishing the blog!! Below, I have some blog suggestions for you. Go ahead and take a look at these informative articles.

In today’s scenario, more and more industries are adopting AutoML in their products; with this rise, it has become clear that AutoML could be the next boon in technology. Check this article to learn more about AutoML applications.

Check out this link if you are a Machine Learning enthusiast or want to brush up your knowledge with ML blogs.
