Table of contents
What is Model Optimization?
Building a Neural Network Model
Importing Libraries
Loading Data
Model Definition
Setting Hyperparameters
Defining the Optimization Loops
Optimizer and Loss Function
Final Demonstration
Frequently Asked Questions
What is PyTorch?
What is model optimization in PyTorch?
How can I prevent overfitting during model optimization?
How can I evaluate my model’s performance during optimization?
Last Updated: Mar 27, 2024

Model Optimization in PyTorch

Author Sohail Ali


Most of the time, we create a model and skip the optimization step. This often leads to poor accuracy and unreliable predictions. Optimizing a model is therefore crucial, as it improves accuracy and reduces errors. Optimization involves finding the parameters and hyperparameters that help the model generalize to unseen data.


In this blog, we will be learning about different methods of model optimization in PyTorch. So without any further wait, let’s start learning!

What is Model Optimization?

Model optimization is the process of adjusting the parameters of a neural network during the training phase to minimize the difference between the actual output and the predicted output. The optimizer updates the model's parameters using the gradients of the loss function.
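To make the idea of "updating parameters using gradients" concrete, here is a minimal sketch that fits a single weight by hand. The toy data, the variable names, and the learning rate are all assumptions made for illustration; they are not part of the model built later in this blog.

```python
import torch

# Toy data for the line y = 2x (illustrative values only).
x = torch.tensor([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

# A single trainable parameter, starting far from the true slope.
w = torch.tensor(0.0, requires_grad=True)
lr = 0.01  # hypothetical learning rate for this toy problem

for step in range(200):
    pred = w * x                      # forward pass
    loss = ((pred - y) ** 2).mean()   # mean squared error
    loss.backward()                   # autograd stores d(loss)/dw in w.grad
    with torch.no_grad():
        w -= lr * w.grad              # gradient descent update
    w.grad.zero_()                    # reset the accumulated gradient

print(f"learned w = {w.item():.4f}")
```

The optimizers in torch.optim perform the same kind of update for every parameter of a model, so we rarely need to write it by hand.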


Thus, choosing a suitable optimizer and suitable hyperparameters, such as the learning rate, is a crucial task because it decides how slowly or quickly a model moves toward the convergence point.
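The effect of such a choice on convergence speed can be sketched without PyTorch at all. The function below is a hypothetical helper that minimizes f(w) = (w - 3)^2 with plain gradient descent and counts the steps needed; the target value, tolerance, and learning rates are arbitrary choices for this illustration.

```python
def steps_to_converge(lr, start=0.0, target=3.0, tol=1e-6, max_steps=10_000):
    """Minimize f(w) = (w - target)**2 with plain gradient descent.

    Returns the number of steps needed to get within `tol` of the minimum,
    or None if that does not happen within `max_steps`.
    """
    w = start
    for step in range(1, max_steps + 1):
        grad = 2.0 * (w - target)   # derivative of (w - target)**2
        w -= lr * grad
        if abs(w - target) < tol:
            return step
    return None

results = {lr: steps_to_converge(lr) for lr in (0.001, 0.01, 0.1, 1.0)}
for lr, steps in results.items():
    print(f"lr={lr}: steps={steps}")
```

On this toy problem, smaller learning rates need more steps, while a learning rate of 1.0 overshoots the minimum on every step and never settles, so the helper returns None for it.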

Also, read - Gradient Descent Algorithm



Below are some pre-requisites to fully understand the concept of model optimization in PyTorch:

  • Familiarity with Python and Deep Learning: You should be familiar with the Python programming language and deep learning concepts like neural networks, layers, optimizers, etc.
  • PyTorch: Basic knowledge of PyTorch, including tensors, modules, and Autograd, is necessary.
  • Data Loading: A basic understanding of how to load and process data using PyTorch utilities such as DataLoader and torchvision is beneficial.
  • FashionMNIST Dataset: A basic overview of the FashionMNIST dataset and its structure will be beneficial for understanding data loading and processing.
  • Machine Learning Terminologies: You should also be aware of many machine learning terms like loss functions, activation functions, batch size, etc.


Install the libraries below before proceeding further in the blog.

  • Python: The code for the model is written in Python, so you need to have Python installed on your machine.
  • PyTorch: It is an open-source machine learning library that provides flexible frameworks for building and training deep learning models.
  • TorchVision: It is a package in PyTorch that provides standard datasets and models for various computer vision tasks.

Note: You can install the libraries using the pip command as given below:

pip install torch torchvision

Building a Neural Network Model

Now, let’s start building the neural network model that we are going to optimize.

Importing Libraries

First, we will import libraries that are necessary for building the model.

import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor


Here, we imported PyTorch along with its nn module, which is used for building and training the neural network. Next, we imported DataLoader, which loads data in batches during training. Then we imported datasets from torchvision, which provides standard datasets, and lastly, ToTensor, which converts image data into PyTorch tensors.

Loading Data

Now, let’s load the data on which we are going to build the model.

transform = ToTensor()
training_data = datasets.FashionMNIST(root="data", train=True, download=True, transform=transform)
test_data = datasets.FashionMNIST(root="data", train=False, download=True, transform=transform)
train_dataloader = DataLoader(training_data, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=False)


Here, we loaded the FashionMNIST dataset for training and testing with specified transformations. After running the above code, the dataset will get downloaded on your device as shown below:

Downloading FashionMNIST dataset
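To see what a DataLoader yields without downloading anything, the sketch below uses random tensors as a stand-in for FashionMNIST. The assumption is that each sample is a 1x28x28 image with a label from 0 to 9; the dataset size of 200 and the names fake_data and loader are made up for the example.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Random tensors standing in for FashionMNIST
# (assumption: 1x28x28 images, 10 classes, 200 samples).
images = torch.rand(200, 1, 28, 28)
labels = torch.randint(0, 10, (200,))
fake_data = TensorDataset(images, labels)

loader = DataLoader(fake_data, batch_size=64, shuffle=True)

# Each iteration returns one batch of inputs and one batch of labels.
X, y = next(iter(loader))
print("input batch shape:", tuple(X.shape))
print("label batch shape:", tuple(y.shape))

# The last batch is smaller when the dataset size is not a multiple of 64.
batch_sizes = [len(xb) for xb, _ in loader]
print("batches per epoch:", len(loader), batch_sizes)
```

Note that len(loader) counts batches, while len(loader.dataset) counts individual samples.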

Model Definition

Now, let’s build the model class and create its required functions. 

class NeuralNetwork(nn.Module):
	# Initialize the class
	def __init__(self):
		super(NeuralNetwork, self).__init__()
		self.flatten = nn.Flatten()
		#creating fully connected layers
		self.fc1 = nn.Linear(28*28, 512)
		self.fc2 = nn.Linear(512, 512)
		self.fc3 = nn.Linear(512, 10)
		self.relu = nn.ReLU()

	def forward(self, x):
		x = self.flatten(x)
		#Applying fully connected layers and Relu activation
		x = self.relu(self.fc1(x))
		x = self.relu(self.fc2(x))
		output = self.fc3(x)
		return output

model = NeuralNetwork()


Here, we created a neural network class with three fully connected (dense) layers and a ReLU activation function applied after each hidden layer. The input to the network is flattened first, before it passes through the fully connected layers, and the last layer outputs one raw score (logit) per class.
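A quick way to sanity-check an architecture like this is to pass a dummy batch through it and count its parameters. The sketch below rebuilds the same layer sizes with nn.Sequential only to stay self-contained; the names net and dummy_batch are assumptions for the example.

```python
import torch
from torch import nn

# Same layer sizes as the NeuralNetwork class, written as nn.Sequential
# purely to keep this sketch self-contained.
net = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 10),
)

dummy_batch = torch.rand(64, 1, 28, 28)   # stand-in for one FashionMNIST batch
logits = net(dummy_batch)

# Every weight and bias counted here is updated by the optimizer.
num_params = sum(p.numel() for p in net.parameters())
print("output shape:", tuple(logits.shape))
print("trainable parameters:", num_params)
```

The output has one row per image and one column per class, which is the shape that nn.CrossEntropyLoss expects.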

Now, we are done building the model, so let’s start the process of model optimization in PyTorch.

Setting Hyperparameters

Let us first set different hyperparameters for our model. 

learning_rate = 1e-3
batch_size = 64
epochs = 10



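  • learning_rate: It defines the step size used when the model parameters are updated. Smaller values give slower training, while very large values can make training unstable.
  • batch_size: It is the number of data samples processed before the model parameters are updated once.
  • epochs: It is the number of complete passes over the training dataset.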
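These hyperparameters together decide how many parameter updates the training run performs. The short calculation below assumes the FashionMNIST training split of 60,000 images and the batch size of 64 that was passed to the DataLoader earlier.

```python
import math

num_train_samples = 60_000   # size of the FashionMNIST training split
batch_size = 64              # value passed to the DataLoader earlier
epochs = 10

# One parameter update happens per batch.
updates_per_epoch = math.ceil(num_train_samples / batch_size)
total_updates = updates_per_epoch * epochs

# The final batch of each epoch holds whatever samples are left over.
last_batch_size = num_train_samples - (updates_per_epoch - 1) * batch_size

print("updates per epoch:", updates_per_epoch)
print("total updates:", total_updates)
print("size of last batch:", last_batch_size)
```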

Defining the Optimization Loops

Let us now define the optimization loop that we are going to use to train the model. One complete pass of this loop over the dataset is called an epoch.

def loop(dataloader, model, loss_fn, optimizer, is_train=True):
	#Storing the size
	size = len(dataloader.dataset)
	#train() mode if true else set evaluation mode
	model.train() if is_train else model.eval()

	total_loss, correct = 0, 0
	num_batches = len(dataloader)

	#Setting up gradient based on is_train
	with torch.set_grad_enabled(is_train):
		#Iterating over dataloader
		for batch, (X, y) in enumerate(dataloader):
			#Performing forward pass and loss calculation
			pred = model(X)
			loss = loss_fn(pred, y)
			total_loss += loss.item()

			if is_train:
				#setting gradient zero
				optimizer.zero_grad()
				#computing gradients in backward pass
				loss.backward()
				#updating the model parameters
				optimizer.step()
			correct += (pred.argmax(1) == y).sum().item()

			if is_train and batch % 100 == 0:
				current = (batch + 1) * len(X)
				print(f"loss: {loss.item():.6f}  [{current}/{size}]")

	accuracy = 100 * correct / size
	avg_loss = total_loss / num_batches
	return accuracy, avg_loss


Here, we created a function named loop which computes and returns the accuracy and average loss for one complete pass over the given dataloader. The loop handles both training and evaluation modes based on the is_train flag. During training, it performs a forward pass to compute the loss, a backward pass to compute the gradients, updates the model's parameters using the optimizer, and prints the loss at regular intervals. During evaluation, gradient tracking is switched off and the parameters are left unchanged.
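The two switches used by the loop, the model's train/eval mode and gradient tracking, can be observed on a tiny example. The single linear layer below is a hypothetical stand-in for the real model.

```python
import torch
from torch import nn

toy_model = nn.Linear(4, 3)   # hypothetical stand-in for the article's model
x = torch.rand(2, 4)

# Training pass: train mode on, gradient tracking on.
toy_model.train()
with torch.set_grad_enabled(True):
    out_train = toy_model(x)
train_flags = (toy_model.training, out_train.requires_grad)

# Evaluation pass: eval mode on, gradient tracking off.
toy_model.eval()
with torch.set_grad_enabled(False):
    out_eval = toy_model(x)
eval_flags = (toy_model.training, out_eval.requires_grad)

print("train:", train_flags)
print("eval: ", eval_flags)
```

With gradient tracking off, no computation graph is built for the output, which saves memory and time while evaluating.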

Optimizer and Loss Function

The loss function is used to compute the error between the actual values and the predicted values. The optimizer then updates the weights of the model based on the gradients of that loss.

loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)


Here, we have chosen the Cross-Entropy loss function and the Stochastic Gradient Descent (SGD) optimizer with the specified learning rate. The Cross-Entropy loss function is well suited to multi-class classification tasks, and the SGD optimizer updates the model's parameters after every mini-batch, using the gradient of the loss computed on that batch.
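As a rough check of what these two objects compute, the sketch below reproduces the cross-entropy value by hand and compares one optimizer step against the plain SGD update rule. The random logits, the target labels, and the small linear layer are assumptions made only for this illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)
logits = torch.randn(4, 10)            # raw scores for 4 samples, 10 classes
targets = torch.tensor([3, 0, 9, 1])   # illustrative class indices

# Cross-entropy is the mean negative log-probability of the correct class.
builtin = nn.CrossEntropyLoss()(logits, targets)
log_probs = torch.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(4), targets].mean()

# One plain SGD step moves each parameter by -lr * gradient.
layer = nn.Linear(10, 10)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
before = layer.weight.detach().clone()
loss = nn.CrossEntropyLoss()(layer(logits), targets)
opt.zero_grad()
loss.backward()
opt.step()
expected = before - 0.1 * layer.weight.grad

print("losses match:", torch.allclose(builtin, manual))
print("update matches:", torch.allclose(layer.weight.detach(), expected))
```

The second check holds for SGD without momentum or weight decay, which is the configuration used in this blog.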

Final Demonstration

Now, let us write the code that trains our model for a specific number of epochs and reports the training and testing accuracy and loss for each epoch.

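for i in range(epochs):
	print('Epoch', (i+1))
	train_acc, train_loss = loop(train_dataloader, model, loss_fn, optimizer)
	#is_train=False so that the test data is never used for training
	test_acc, test_loss = loop(test_dataloader, model, loss_fn, optimizer, is_train=False)
	print(f"Train accuracy: {train_acc:.1f}%, avg loss: {train_loss:.6f}")
	print(f"Test accuracy: {test_acc:.1f}%, avg loss: {test_loss:.6f}")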


Here, we iterate over the chosen number of epochs. In each epoch, the model first learns from the training data, and then it is evaluated on the test data without updating its parameters.

final output

After running the code for two epochs, we got the average loss for each epoch. In this way, we can compare different settings and choose the parameters for which our model gives the best average loss and accuracy.
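One possible way to keep the best parameters is to remember the model state from the epoch with the lowest evaluation loss. The sketch below shows this on small random data with a single linear layer; the data, the model, and the five-epoch schedule are assumptions for illustration and are not the article's setup.

```python
import copy
import torch
from torch import nn

torch.manual_seed(0)
# Synthetic stand-in data (assumption): 20 features, 3 classes.
X_train, y_train = torch.randn(256, 20), torch.randint(0, 3, (256,))
X_val, y_val = torch.randn(64, 20), torch.randint(0, 3, (64,))

clf = nn.Linear(20, 3)
loss_fn = nn.CrossEntropyLoss()
opt = torch.optim.SGD(clf.parameters(), lr=0.1)

best_loss, best_epoch, best_state = float("inf"), None, None
history = []
for epoch in range(5):
    # One full-batch training step per epoch keeps the sketch short.
    clf.train()
    opt.zero_grad()
    loss_fn(clf(X_train), y_train).backward()
    opt.step()

    # Evaluate without tracking gradients.
    clf.eval()
    with torch.no_grad():
        val_loss = loss_fn(clf(X_val), y_val).item()
    history.append(val_loss)

    # Keep a copy of the parameters whenever the evaluation loss improves.
    if val_loss < best_loss:
        best_loss, best_epoch = val_loss, epoch
        best_state = copy.deepcopy(clf.state_dict())

clf.load_state_dict(best_state)   # restore the best-scoring parameters
with torch.no_grad():
    restored_loss = loss_fn(clf(X_val), y_val).item()
print(f"best epoch: {best_epoch + 1}, loss: {best_loss:.4f}")
```

The deepcopy matters here: state_dict() returns references to the live tensors, so without a copy the saved state would keep changing as training continues.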

Frequently Asked Questions

What is PyTorch?

PyTorch is an open-source deep learning framework used for building, training, and deploying machine learning models.

What is model optimization in PyTorch?

In PyTorch, model optimization refers to the process of boosting the performance of a neural network model by adjusting its parameters during the training phase.

How can I prevent overfitting during model optimization?

You can prevent the problem of overfitting data during optimization by using techniques like regularization and data augmentation.
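As a minimal sketch of regularization in PyTorch, the snippet below adds a dropout layer to a small network and enables weight decay (an L2 penalty) in the SGD optimizer. The dropout rate of 0.2 and the weight decay of 1e-4 are illustrative values, not tuned recommendations.

```python
import torch
from torch import nn

regularized = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 512), nn.ReLU(),
    nn.Dropout(p=0.2),            # hypothetical dropout rate
    nn.Linear(512, 10),
)

# weight_decay adds an L2 penalty on the parameters; 1e-4 is illustrative.
opt_reg = torch.optim.SGD(regularized.parameters(), lr=1e-3, weight_decay=1e-4)

x = torch.rand(8, 1, 28, 28)
regularized.eval()                # dropout is disabled in evaluation mode
with torch.no_grad():
    same = torch.equal(regularized(x), regularized(x))
print("deterministic in eval mode:", same)
```

Dropout is only active after calling train(), which is one reason the optimization loop switches modes before each pass.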

How can I evaluate my model’s performance during optimization?

You can evaluate your model's performance by using evaluation metrics suited to your task, for example, accuracy, precision, recall, and F1-score for classification, or mean squared error for regression.
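For a binary classification task, these metrics can be computed from the true and predicted labels with a few lines of plain Python. The helper name binary_metrics and the sample labels below are made up for the example.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, f1) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, precision, recall, f1

y_true = [1, 0, 1, 1, 0, 0, 1, 0]   # illustrative ground-truth labels
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]   # illustrative model predictions

accuracy, precision, recall, f1 = binary_metrics(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```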


This article discusses the concept of model optimization in PyTorch. We built a model and optimized it by adjusting various hyperparameters of our model. We hope this blog has helped you grow your knowledge of model optimization in PyTorch. If you want to learn more, then check out our articles.


Happy Learning!
