What are GPUs?
GPU stands for Graphics Processing Unit. A GPU processes graphical workloads such as images, videos, and games, and it greatly reduces the processing time for such inputs. Because it can operate on many chunks of data simultaneously, it is also widely used in machine learning. A GPU can be integrated into the system or come as a separate, dedicated hardware unit.
Training a CNN
Now, let us talk about the training of CNN.
Training a CNN is a heavy process because it requires significant hardware capacity to complete its operations in a timely manner. Several operations take place during training, like the following:
- Pre-processing the input data
- Training the learning model
- Storing the trained learning model
- Final deployment of the model
Forward and backward passes are performed during the training of the model.
In the forward pass, the input is processed and the output is generated. In the backward pass, the generated output is analyzed, the error is calculated, and the neural network's weights are adjusted accordingly.
Let us look at a diagram for this.

- Here, we can see that we have many inputs (x1, x2, x3, …, xn).
- Also, the edges connecting the input nodes to the transfer function have associated weights (w1, w2, w3, …, wn).
- The transfer function processes the inputs and weights and provides the net input to the activation function, which then, based on the threshold value, generates the output.
- Afterward, if there is any error in the output, the weights are readjusted accordingly, as sketched in the code below.
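To make the forward and backward passes concrete, here is a minimal sketch of a single artificial neuron in PyTorch. The input size, target value, and learning rate are arbitrary illustrative choices, and we use a sigmoid activation instead of a hard threshold so that the error can be back-propagated:

import torch

# Inputs x1..xn and their associated weights w1..wn (n = 4 here, arbitrary).
x = torch.randn(4)
w = torch.randn(4, requires_grad=True)

# Forward pass: the transfer function computes the net input (a weighted sum),
# and the activation function (sigmoid here) produces the output.
net_input = torch.dot(x, w)
output = torch.sigmoid(net_input)

# Backward pass: compare the output with a target, compute the error,
# and adjust each weight in proportion to its contribution to the error.
target = torch.tensor(1.0)           # arbitrary expected output
error = (output - target) ** 2       # squared error
error.backward()                     # computes d(error)/d(w)

learning_rate = 0.1
with torch.no_grad():
    w -= learning_rate * w.grad      # readjust the weights
    w.grad.zero_()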
Why are GPUs preferred over CPUs in Deep Learning?
The neural-network operations discussed above are essentially matrix multiplications, and GPUs are much faster at these than CPUs.
Three main features give GPUs an advantage over CPUs in deep learning.
These are:
1. Larger Memory Bandwidth
2. Parallelization
3. Fast Memory Access
Let us study these features one by one.
Larger Memory Bandwidth
Consider the CPU as a speedboat with a small loading capacity but a fast access rate. Now, consider the GPU as a cargo ship with a substantial loading capacity but a slower access rate.
When dealing with matrix multiplication, loads of data are required. A CPU will fetch a small amount of data thousands of times, whereas a GPU will load much more in a single go. That is why GPUs are preferred for CNNs: their memory bandwidth is larger. A rough way to see this on your own machine is sketched below.
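As a rough, machine-dependent illustration (the tensor size here is arbitrary, and the number you get depends entirely on your hardware), you can estimate the CPU's effective memory bandwidth by timing a large tensor copy:

import time
import torch

# Allocate ~400 MB of float32 data (100 million values, arbitrary size).
n = 100_000_000
src = torch.randn(n)

start = time.perf_counter()
dst = src.clone()                  # reads ~400 MB and writes ~400 MB
elapsed = time.perf_counter() - start

# Effective bandwidth = bytes moved / time taken.
bytes_moved = 2 * src.numel() * src.element_size()
print(f"~{bytes_moved / elapsed / 1e9:.1f} GB/s effective bandwidth")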
Parallelization
Now, although GPUs fetch loads of data in a single go, they have a slower access rate than CPUs. So, while the GPU is still fetching, the processor could already have finished computing on the previously fetched data and would sit idle. To avoid this, the GPU uses parallelization.
This means that more than one unit fetches data at the same time, so the processor does not remain idle. In terms of the previous analogy, multiple cargo ships are loading the data simultaneously. A sketch of what this overlap looks like in PyTorch follows below.
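As a rough sketch of this idea (it assumes a CUDA device is present; the batch sizes and count are arbitrary), PyTorch lets you copy the next chunk of data to the GPU on a separate CUDA stream while the current chunk is being computed on:

import torch

device = torch.device('cuda')       # assumes a CUDA GPU is present
copy_stream = torch.cuda.Stream()   # a separate "cargo ship" for copies

# Pinned (page-locked) host memory allows asynchronous host-to-device copies.
batches = [torch.randn(2000, 2000, pin_memory=True) for _ in range(4)]

# Prefetch the first batch on the copy stream.
with torch.cuda.stream(copy_stream):
    next_batch = batches[0].to(device, non_blocking=True)

for i in range(len(batches)):
    torch.cuda.current_stream().wait_stream(copy_stream)   # copy has finished
    current = next_batch
    current.record_stream(torch.cuda.current_stream())
    if i + 1 < len(batches):
        with torch.cuda.stream(copy_stream):               # start the next copy...
            next_batch = batches[i + 1].to(device, non_blocking=True)
    res = torch.matmul(current, current)                   # ...while computing

torch.cuda.synchronize()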
Fast Memory Access
GPUs have hundreds or thousands of simple cores that are smaller and hence faster at their specialized work, whereas CPUs have a few complex cores. CPUs, though, remain faster at handling general system operations. You can inspect both counts on your own machine, as shown below.
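A small sketch for checking this (it assumes PyTorch is installed; note that each GPU "streaming multiprocessor" reported below itself contains many simple CUDA cores):

import os
import torch

# Number of logical CPU cores (few, but complex).
print("CPU logical cores:", os.cpu_count())

# Number of GPU streaming multiprocessors (each holding many simple cores).
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print("GPU:", props.name)
    print("Streaming multiprocessors:", props.multi_processor_count)
else:
    print("No CUDA GPU detected.")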
Explanation by Programming
Let us see some code in Python justifying our discussion.
We can use the following commands to understand the difference in speed between a GPU and a CPU when dealing with scalar and matrix multiplications.
CPU
First, let us time the CPU. Note that '%%timeit' is a Jupyter cell magic, so each '%%timeit' block below must go in its own notebook cell, with the magic as the first line of the cell.
import torch

%%timeit
t = torch.randn(2000, 2000)    # a large 2000x2000 random matrix
res = torch.matmul(t, t)       # matrix multiplication on the CPU
del t, res                     # free the tensors between timing runs

%%timeit
t = torch.randn(2, 2)          # a tiny 2x2 matrix (near-scalar workload)
res = torch.matmul(t, t)
del t, res

We import the 'torch' library because PyTorch is a machine-learning library for building deep neural networks.
Then, for the CPU, we take two cases: a large matrix multiplication and a tiny, near-scalar 2x2 multiplication.
We use the '%%timeit' cell magic to measure how long each multiplication takes.
We also use the 'torch.matmul' function, which returns the matrix product of two tensors, and 'torch.randn', which fills them with random values. If you are running a plain Python script rather than a notebook, the same measurement can be done with the standard 'timeit' module, as sketched after the output below.
Output

The CPU takes around 4.31 microseconds to compute the tiny (scalar-like) multiplication and around 198 ms to compute the large matrix multiplication.
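If you prefer a plain Python script over a notebook, here is a minimal sketch of the same CPU timings using the standard 'timeit' module (the sizes and repetition counts mirror the cells above and are otherwise arbitrary):

import timeit
import torch

# Time the large 2000x2000 matrix multiplication on the CPU.
def big_matmul():
    t = torch.randn(2000, 2000)
    return torch.matmul(t, t)

# Time the tiny 2x2 multiplication on the CPU.
def small_matmul():
    t = torch.randn(2, 2)
    return torch.matmul(t, t)

print("2000x2000:", timeit.timeit(big_matmul, number=10) / 10, "s per run")
print("2x2:", timeit.timeit(small_matmul, number=10000) / 10000, "s per run")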
GPU
Now, we will perform the same measurements on the GPU with the following code. As before, each '%%timeit' block belongs in its own notebook cell.
import torch

# Use the GPU if one is available; otherwise, fall back to the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

%%timeit
t = torch.randn(2, 2).to(device)           # tiny 2x2 matrix, moved to the device
res = torch.matmul(t, t)
del t, res

%%timeit
t = torch.randn(2000, 2000).to(device)     # large matrix, moved to the device
res = torch.matmul(t, t)
del t, res

If you are using Google Colaboratory, open the Runtime menu, choose 'Change runtime type', and select GPU as the hardware accelerator so the programs run on a GPU.
We use 'cuda' because CUDA is NVIDIA's platform for parallel computing on the GPU, which PyTorch uses to run tensor operations there.
Here also, we use the same '%%timeit', 'torch.randn', and 'torch.matmul' functions. The '.to(device)' call moves each tensor to the device we specified earlier.
Output

Here, we can see that the GPU takes around 59.8 microseconds for the tiny multiplication and around 26.8 ms for the large matrix multiplication. Notice that the GPU is actually slower than the CPU on the tiny case: the overhead of moving data to the device and launching a kernel dominates such small workloads.
Hence, for large matrix multiplications, GPUs are much faster than CPUs, which is why we use GPUs for CNNs.
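One caveat about timing GPU code: CUDA kernels execute asynchronously, so a timer can stop before the GPU has actually finished its work. A minimal sketch of a safer measurement, using torch.cuda.synchronize, looks like this:

import time
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
t = torch.randn(2000, 2000).to(device)

if device.type == 'cuda':
    torch.cuda.synchronize()            # finish any pending GPU work first
start = time.perf_counter()
res = torch.matmul(t, t)
if device.type == 'cuda':
    torch.cuda.synchronize()            # wait until the matmul really finishes
print(f"matmul took {time.perf_counter() - start:.6f} s")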
Frequently Asked Questions
What is a GPU?
GPU stands for Graphics Processing Unit. It is an electronic circuit specially designed for handling graphics-related work like image processing, videos, games, and design. GPUs are also fast at matrix-related operations.
What is an Artificial Neural Network?
An Artificial Neural Network or ANN is a network derived from the biological neural network. It is a collection of several units that take inputs, process them and give outputs like a biological network. It tries to replicate the human-brain mechanism.
What is a CNN?
CNN stands for Convolutional Neural Network. It is a version of ANN primarily used for extracting information from grid-based inputs like images or videos.
Why are GPUs preferred over CPUs for CNNs?
It is because GPUs provide three main advantages over CPUs for deep learning. These are large memory bandwidth, parallelization, and fast memory access.
Why are CPUs preferred over GPUs for scalar multiplication?
It is because CPUs have a faster fetch rate and complex cores for handling functions like system operations.
Conclusion
In this article, we first studied what Convolutional Neural Networks and Graphics Processing Units are. Then, we looked at how GPUs are used with CNNs. Finally, with the help of a simple example, we understood why we use GPUs for CNNs.
We hope this article helped you understand why GPUs are used when dealing with CNNs. If you would like to study these topics further, check out the resources below.
To learn more about DSA, competitive coding, and many more knowledgeable topics, please look into the guided paths on Coding Ninjas Studio. Also, you can enroll in our courses and check out the mock test and problems available. Please check out our interview experiences and interview bundle for placement preparations.
Happy Coding!