Introduction
Deep learning models are extremely data-hungry and computationally expensive to train. A CNN model built from scratch for multiclass classification may require hundreds of thousands or even millions of images to achieve satisfactory results. So, does that mean one always needs a huge amount of data and computational power to train a deep learning model? This is where transfer learning steps in.
What is transfer learning?
Transfer learning is making use of the previously gained knowledge from a problem to solve a new but similar problem at hand. For example, a model trained to recognise a tiger may also be reused to recognise a lion with a few changes in layers of the model. So, essentially we exploit a pre-trained model for improving generalisation in another model.
This is done by preserving the weights learned on task A and reusing them for task B. This gives the new model a much better starting point, since we aren't training from randomly initialised weights.
Usually, in neural networks, the initial layers are used to detect the edges of the object we are looking for. The layers in the middle are used to detect the shapes, while the end layers are used for detecting some task-specific features. In transfer learning, we preserve the initial and the middle layers and just change the final layers for our use case.
The objective is to transfer as much knowledge as possible from the previous task to the new task at hand.
Why use Transfer learning?
Transfer learning is an increasingly popular practice today owing to its efficiency. A neural network trained from scratch requires a huge amount of data; for reference, the ImageNet dataset has over a million image samples. In practice, it may not always be feasible to gather that much data, and even if you do, you may still be limited by your computational power: it can take weeks to train a satisfactory model from scratch.
How to implement Transfer Learning?
1. Obtain a pre-trained model.
2. Create a base model.
3. Freeze layers to preserve previous training results.
4. Add new trainable layers.
5. Train the new layers on the dataset.
6. Fine-tune the model.
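Here is a minimal Keras sketch of these six steps. All names here are illustrative assumptions, not the exact code used later in this post:

import tensorflow as tf

# Steps 1-2: obtain a pre-trained model and use it as the base
base_model = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, pooling='avg')
# Step 3: freeze the base so its pre-trained weights are preserved
base_model.trainable = False
# Step 4: add a new trainable classification head
model = tf.keras.Sequential([base_model, tf.keras.layers.Dense(5)])
# Steps 5-6: model.compile(...) and model.fit(...) train the new head;
# optionally unfreeze a few top base layers afterwards and retrain with a
# low learning rate to fine-tune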
Code Implementation
Now let's get our hands dirty. We'll implement a five-class flower classifier using Google's MobileNet architecture.
Importing the required libraries
import numpy as np
import cv2
import PIL.Image as Image
import os
import matplotlib.pylab as plt
import tensorflow as tf
import tensorflow_hub as hub
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
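The extract jumps straight to running a prediction without showing how the pre-trained classifier and the sample image were created. A minimal sketch, assuming the MobileNet v2 ImageNet classifier from TensorFlow Hub and a locally saved goldfish image (goldfish.jpg is a hypothetical filename):

# Assumption: MobileNet v2 classifier from TF Hub; the original post's exact
# model handle is not shown in the extract
classifier = tf.keras.Sequential([
    hub.KerasLayer(
        "https://tfhub.dev/google/tf2-preview/mobilenet_v2/classification/4",
        input_shape=(224, 224, 3))
])

# goldfish.jpg is a hypothetical local file; resize to MobileNet's 224x224
# input and scale pixel values to [0, 1]
gold_fish = Image.open("goldfish.jpg").resize((224, 224))
gold_fish = np.array(gold_fish) / 255.0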
result = classifier.predict(gold_fish[np.newaxis, ...])
result.shape
Next, we load the human-readable ImageNet labels:

labels_path = tf.keras.utils.get_file(
    'ImageNetLabels.txt',
    'https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt')
with open(labels_path, "r") as f:
    image_labels = f.read().splitlines()
image_labels[:5]
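With the labels loaded, the classifier's top score can be mapped to a readable name. A sketch; for a goldfish image the expected label would be 'goldfish':

# Index of the highest score, mapped to its human-readable ImageNet label
predicted_label_index = np.argmax(result)
image_labels[predicted_label_index]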
Loading the flower dataset. Here, untar=True extracts the downloaded archive.
dataset_url = "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz"
data_dir = tf.keras.utils.get_file('flower_photos', origin=dataset_url, cache_dir='.', untar=True)
# cache_dir indicates where to download data. I specified . which means current directory
# untar true will unzip it
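The extract skips the step that builds the image and label dictionaries used below. A minimal sketch, assuming the dataset's standard five-folder layout (flowers_images_dict and flowers_labels_dict are the names the next block expects):

import pathlib

data_dir = pathlib.Path(data_dir)
# Map each flower class to the list of its image paths
flowers_images_dict = {
    'roses': list(data_dir.glob('roses/*')),
    'daisy': list(data_dir.glob('daisy/*')),
    'dandelion': list(data_dir.glob('dandelion/*')),
    'sunflowers': list(data_dir.glob('sunflowers/*')),
    'tulips': list(data_dir.glob('tulips/*')),
}
# Map each class name to an integer label
flowers_labels_dict = {'roses': 0, 'daisy': 1, 'dandelion': 2,
                       'sunflowers': 3, 'tulips': 4}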
Here, the image shapes are inconsistent. We need to resize all the images to the uniform 224×224 dimensions that MobileNet expects.
X, y = [], []
for flower_name, images in flowers_images_dict.items():
    for image in images:
        img = cv2.imread(str(image))
        resized_img = cv2.resize(img, (224, 224))
        X.append(resized_img)
        y.append(flowers_labels_dict[flower_name])
X = np.array(X)
y = np.array(y)
Creating the train-test split.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
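The extract also omits how the frozen base model is obtained. A minimal sketch, assuming the MobileNet v2 feature-vector model from TensorFlow Hub, with pixel values scaled to the [0, 1] range that TF Hub image models expect:

# Scale pixel values to [0, 1]
X_train_scaled = X_train / 255.0
X_test_scaled = X_test / 255.0

# Assumption: MobileNet v2 feature-vector model (classification head removed);
# trainable=False freezes all of its layers
pretrained_model_without_top_layer = hub.KerasLayer(
    "https://tfhub.dev/google/tf2-preview/mobilenet_v2/feature_vector/4",
    input_shape=(224, 224, 3), trainable=False)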
We will just change the output layer, freezing all the previous layers.
num_of_flowers = 5
model = tf.keras.Sequential([
    pretrained_model_without_top_layer,
    tf.keras.layers.Dense(num_of_flowers)
])
model.summary()
As we can see, there are more than 2 million parameters in the model. However, the trainable ones are just a few thousand. This is where transfer learning saves time and computation cost.
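The training step itself is not shown in the extract. A minimal sketch of how those few thousand head parameters could be trained (the epoch count is an illustrative choice; from_logits=True because the Dense output layer has no softmax):

model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"])
model.fit(X_train_scaled, y_train, epochs=5)

# Evaluate on the held-out test set
model.evaluate(X_test_scaled, y_test)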
Frequently Asked Questions
Mention some of the activation functions.
Sigmoid (output ranges between 0 and 1), ReLU (ranges between 0 and infinity) and tanh (ranges between -1 and 1).
How does transfer learning produce high-accuracy models even with minimal data?
This is one of the salient features of transfer learning: we take an already pre-trained model and fine-tune it to suit our use case, so the data requirement is very minimal compared to a model built from scratch.
Key Takeaways
Transfer learning is a very popular practice today because it cuts down the time and resources that would go into reinventing the wheel. This blog covered transfer learning with an emphasis on its practical implementation in Python. Readers are advised to go through the blog a couple of times. If you want to deep-dive into machine learning and deep learning, check out our industry-oriented Machine Learning course curated by Stanford University alumni and industry experts.