Table of contents
1. Introduction
2. What is transfer learning?
3. How does Transfer Learning work?
4. Why use Transfer learning?
5. How to implement Transfer Learning?
6. Code Implementation
7. FAQs
8. Key Takeaways
Last Updated: Mar 27, 2024

Transfer Learning - Implementation

Author Arun Nawani

Introduction 

Deep learning models are extremely data-hungry and computationally expensive to train. A CNN model built from scratch for multiclass classification may require hundreds of thousands or even millions of images to achieve satisfactory results. So, does that mean one always needs a huge amount of data and computational power to train a deep learning model? Not necessarily; this is where transfer learning steps in.

What is transfer learning?

Transfer learning is the practice of reusing knowledge gained from one problem to solve a new but similar problem. For example, a model trained to recognise tigers can be reused to recognise lions after changing a few of its layers. Essentially, we exploit a pre-trained model to improve generalisation on a new task.


This is done by preserving the weights learned on task A and reusing them for task B. This gives the new model a better starting point, since we are not initialising the weights randomly.


How does Transfer Learning work?

Usually, in neural networks, the initial layers detect low-level features such as the edges of the object we are looking for, the middle layers detect shapes, and the final layers detect task-specific features. In transfer learning, we preserve the initial and middle layers and only replace the final layers for our use case.
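A rough sketch of this idea in Keras (using tf.keras.applications.MobileNetV2 purely as an illustrative base model; the full walkthrough below uses TensorFlow Hub instead): the pre-trained layers are frozen and only a new final layer is trained.

import tensorflow as tf

# Load a pre-trained base model and freeze it so the learned weights are preserved.
base = tf.keras.applications.MobileNetV2(input_shape=(224, 224, 3), include_top=False, pooling='avg')
base.trainable = False

# Replace the final layer(s) with a new, trainable head for the task at hand.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dense(5)   # e.g. 5 output classes for the new task
])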


The objective is to transfer as much knowledge as possible from the previous task to the new task at hand. 

Why use Transfer learning?

Transfer learning has become a widespread practice owing to its efficiency. A neural network trained from scratch requires a huge amount of data. For reference, the ImageNet dataset has over a million image samples. Practically, it may not always be feasible to gather that much data, and even if you do, you may still be limited by your computational resources. It may take weeks to train a satisfactory model from scratch.

How to implement Transfer Learning?

  • Obtain a pre-trained model.
  • Create a base model.
  • Freeze layers to preserve previous training results.
  • Add new trainable layers.
  • Train new layers on the dataset. 
  • Fine-tune the model.

Code Implementation

Now let's get our hands dirty. We'll implement a 5-class flower classifier using Google's MobileNet architecture.

Importing the required libraries

import numpy as np
import cv2

import PIL.Image as Image
import os

import matplotlib.pylab as plt

import tensorflow as tf
import tensorflow_hub as hub

from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential

 

Importing the MobileNet model

IMAGE_SHAPE = (224, 224)

classifier = tf.keras.Sequential([
    hub.KerasLayer("https://tfhub.dev/google/tf2-preview/mobilenet_v2/classification/4", input_shape=IMAGE_SHAPE+(3,))
])

 

Resizing the image to 224 × 224.

# goldfish.jpg is assumed to be a sample image already present in the working directory
gold_fish = Image.open("goldfish.jpg").resize(IMAGE_SHAPE)
gold_fish

Normalising the image pixels. 

gold_fish = np.array(gold_fish)/255.0
gold_fish.shape

 

# np.newaxis adds a batch dimension, since the classifier expects input of shape (batch, 224, 224, 3)
result = classifier.predict(gold_fish[np.newaxis, ...])
result.shape

 

Output:

Downloading the ImageNet labels so that the prediction can be mapped to a class name.

labels_path = tf.keras.utils.get_file(
    'ImageNetLabels.txt',
    'https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt')
image_labels = []
with open(labels_path, "r") as f:
    image_labels = f.read().splitlines()
image_labels[:5]

 

Output:
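To map the earlier goldfish prediction to a human-readable class, we can take the argmax over the classifier's logits and index into these labels (a small sketch using the result and image_labels computed above):

# The classifier outputs one logit per ImageNet class; the largest logit is the predicted class.
predicted_index = np.argmax(result[0])
print(image_labels[predicted_index])  # for a goldfish image this should print 'goldfish'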

Loading the flower dataset. Here, untar=True unzips the downloaded archive.

import pathlib

dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, cache_dir='.', untar=True)
# cache_dir='.' downloads the data into the current directory
# untar=True unzips the archive
data_dir = pathlib.Path(data_dir)  # get_file returns a string path; convert it so .glob() can be used below

 

image_count = len(list(data_dir.glob('*/*.jpg')))
print(image_count)

 

Output:

roses = list(data_dir.glob('roses/*'))
roses[:5]

 

Output:

Image.open(str(roses[1]))  # Image was imported as PIL.Image, so we call Image.open directly

We create a dictionary whose keys are the flower names and whose values are the lists of image paths.

flowers_images_dict = {
    'roses': list(data_dir.glob('roses/*')),
    'daisy': list(data_dir.glob('daisy/*')),
    'dandelion': list(data_dir.glob('dandelion/*')),
    'sunflowers': list(data_dir.glob('sunflowers/*')),
    'tulips': list(data_dir.glob('tulips/*')),
}

 

Creating a label dictionary, mapping each flower name to a numeric class index.

flowers_labels_dict = {
    'roses': 0,
    'daisy': 1,
    'dandelion': 2,
    'sunflowers': 3,
    'tulips': 4,
}

 

img = cv2.imread(str(flowers_images_dict['roses'][0]))

 

img.shape

 

Output:

Here, the image shapes are inconsistent, so we need to bring all the images to uniform dimensions.

X, y = [], []

for flower_name, images in flowers_images_dict.items():
    for image in images:
        img = cv2.imread(str(image))
        resized_img = cv2.resize(img,(224,224))
        X.append(resized_img)
        y.append(flowers_labels_dict[flower_name])

 

X = np.array(X)
y = np.array(y)

 

Creating the train-test split.

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
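By default, train_test_split holds out 25% of the samples for testing. Since the flower classes are not equally sized, passing stratify=y (an optional tweak, not used in the original code) keeps the class proportions similar in both splits:

# Optional: a stratified split keeps the per-class proportions consistent across train and test sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0, stratify=y)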

 

Normalising the images.

X_train_scaled = X_train / 255
X_test_scaled = X_test / 255

 

Now, taking the pre-trained model and retraining it using the flower images.

feature_extractor_model = "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4"

pretrained_model_without_top_layer = hub.KerasLayer(
    feature_extractor_model, input_shape=(224, 224, 3), trainable=False)

 

We will just change the output layer, freezing all the previous layers. 

num_of_flowers = 5

model = tf.keras.Sequential([
  pretrained_model_without_top_layer,
  tf.keras.layers.Dense(num_of_flowers)
])

model.summary()

 

Output:

As we can see, there are more than 2 million parameters in the model. However, the trainable ones are just a few thousand. This is where transfer learning saves time and computation cost.
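The same numbers can be read programmatically from the model (a small sanity check, not part of the original code):

# Count trainable vs. frozen parameters to confirm that only the new Dense head is trainable.
trainable_params = int(np.sum([tf.keras.backend.count_params(w) for w in model.trainable_weights]))
frozen_params = int(np.sum([tf.keras.backend.count_params(w) for w in model.non_trainable_weights]))
print(trainable_params, frozen_params)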

We now compile the model and train it for 5 epochs.

model.compile(
  optimizer="adam",
  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
  metrics=['acc'])

model.fit(X_train_scaled, y_train, epochs=5)

 

Output:

model.evaluate(X_test_scaled,y_test)

 

Output:

The final model has an accuracy of about 86%.
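To inspect individual predictions, we can run the model on the test images and map the predicted indices back to flower names (a small sketch using the objects defined above):

# The model outputs logits, so argmax along the class axis gives the predicted class index.
predictions = model.predict(X_test_scaled)
predicted_labels = np.argmax(predictions, axis=1)

# Invert the label dictionary to recover flower names from class indices.
index_to_flower = {v: k for k, v in flowers_labels_dict.items()}
print([index_to_flower[i] for i in predicted_labels[:5]])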

FAQs

  1. Mention some of the activation functions.
    Sigmoid (ranges between 0 and 1)
    ReLU (ranges between 0 and infinity)
    tanh (ranges between -1 and 1)

  2. How does transfer learning produce high-accuracy models even with minimal data?
    This is one of the salient features of transfer learning. We take an already pre-trained model and fine-tune it to suit our use case. The data requirement in this case is much smaller than for a model built from scratch.
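    For reference, a minimal fine-tuning sketch building on the flower model above: unfreeze the pre-trained feature extractor and continue training with a much smaller learning rate. (Depending on the tensorflow_hub version, trainable=True may need to be passed when the hub.KerasLayer is constructed rather than toggled afterwards.)

    # Unfreeze the pre-trained feature extractor so its weights can be updated as well.
    pretrained_model_without_top_layer.trainable = True

    # Recompile with a small learning rate so the pre-trained weights change only slightly.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=['acc'])

    model.fit(X_train_scaled, y_train, epochs=3)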

Key Takeaways

Transfer learning is a very popular practice today because it cuts down the time and resources that would otherwise go into reinventing the wheel. This blog covered transfer learning with an emphasis on its practical implementation in Python. Readers are advised to go through the blog at least a couple of times. If you want to dive deep into machine learning and deep learning, check out our industry-oriented Machine Learning course curated by Stanford University alumni and industry experts.
