Code360 powered by Coding Ninjas X Naukri.com
Table of contents
1. Introduction
2. Types of Unsupervised Learning
3. Hierarchical Clustering
4. Flat Clustering
5. Applications of Unsupervised Learning
6. FAQs
7. Key Takeaways
Last Updated: Mar 27, 2024

Unsupervised Learning

Author Rajkeshav

Introduction

 

In Supervised Learning, every input comes with a corresponding output (label). In Unsupervised Learning, the collected data has no labels, so we do not know the expected outputs in advance. Instead, the algorithm looks for patterns and structure in the data on its own, and we do not guide it while it learns. For example, an algorithm can group news articles by topic without being told the topics, or group a store's customers by similar buying habits. By contrast, deciding whether a mail is spam usually relies on labeled examples (Supervised Learning), and learning to play chess or to drive in real-time traffic is usually framed as Reinforcement Learning, so these are not typical Unsupervised Learning problems.

 

Types of Unsupervised Learning

 

Unsupervised Learning problems are classified into two types: Clustering and Association.

 

Clustering
Clustering divides the data into groups, called clusters. Data points with the most similar features fall into the same cluster, and data points with few features in common fall into different clusters. For example, in Market Segmentation problems, we group customers based on their age, gender, etc., to learn the interests of each group and offer what that group prefers.

Clustering is further divided into Hierarchical Clustering and Flat Clustering.

 

Hierarchical Clustering

Hierarchical Clustering builds a tree of nested clusters. In the top-down (divisive) form described here, we put all the data into one cluster called the root and keep dividing it into smaller groups until a termination condition that we have chosen is met. For example, we may choose not to divide a cluster further if it contains fewer than 50 data points. (The bottom-up, agglomerative form works the other way, merging small clusters into larger ones.) As a result, a data point does not belong to only one cluster; it belongs to one cluster at every level of the hierarchy above it.

 

Source: statisticshowto.com

 

The root cluster contains all the data points (A, B, C, D, E, F, G, H, I, J, K). Data point J belongs to several nested clusters: (J, K), (H, I, J, K), (G, H, I, J, K), and so on up to the root. The same holds for the other data points.
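The top-down splitting described above can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function name `divisive_clusters`, the one-dimensional toy data, the "cut at the widest gap" split rule, and the small `min_size` threshold (standing in for the 50-point rule mentioned earlier) are all assumptions made for the example.

```python
def divisive_clusters(points, min_size=3):
    """Top-down splitting: return every cluster in the hierarchy, root first."""
    cluster = sorted(points)
    clusters = [cluster]
    # Termination condition: do not split clusters smaller than min_size.
    if len(cluster) < min_size:
        return clusters
    # Illustrative split rule (an assumption): cut at the widest gap
    # between neighbouring values.
    gaps = [cluster[i + 1] - cluster[i] for i in range(len(cluster) - 1)]
    cut = gaps.index(max(gaps)) + 1
    clusters += divisive_clusters(cluster[:cut], min_size)
    clusters += divisive_clusters(cluster[cut:], min_size)
    return clusters


data = [1, 2, 4, 20, 21, 40, 43, 44]
hierarchy = divisive_clusters(data, min_size=3)

# Every cluster in the hierarchy that contains the point 43.
containing = [c for c in hierarchy if 43 in c]
print(containing)
```

Here the point 43 appears in the root, in (40, 43, 44), and in (43, 44), which mirrors how J belongs to several nested clusters in the figure.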

 

Flat Clustering

In Flat Clustering, the clusters are not nested: each data point is either entirely inside or entirely outside a given cluster, so it belongs to exactly one cluster. Here we have to decide the number of clusters to form.

 

Source: pythonprogramming.net

 

There are two clusters, and each data point lies inside exactly one of them.
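K-means is one common flat clustering algorithm, and a minimal pure-Python sketch of it is shown here. The toy points, the fixed iteration count, and the choice to start from the first k points (made only so the result is reproducible) are assumptions for illustration; in practice a library implementation with better initialisation would normally be used.

```python
def kmeans(points, k, iterations=10):
    """Minimal k-means (Lloyd's algorithm) on 2-D points."""
    # Assumption for reproducibility: start from the first k points.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
            )
            for x, y in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels, centroids


points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
labels, centroids = kmeans(points, k=2)
print(labels)     # one cluster index per point
print(centroids)  # one (x, y) centre per cluster
```

Each point receives exactly one label, which is what makes the clustering flat: no point is shared between clusters and no cluster contains another.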

 

 

Association
Association finds relationships between items in a dataset: it identifies sets of items that are likely to occur together. For example, in a shopping mall, the items that people buy are related, so most customers who purchase butter also buy milk. An Unsupervised Learning model learns this pattern from past purchases, and when another customer comes to buy butter, the model predicts that they are also likely to buy milk.

 

Source: simplilearn.com
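The butter and milk example can be made concrete with two measures commonly used in association rule mining: support (how often an itemset appears) and confidence (how often the rule holds when its left-hand side appears). The sketch below uses made-up transactions and hypothetical helper names (`count`, `support`, `confidence`); it only scores one rule and is not a full rule-mining algorithm.

```python
# Toy shopping baskets (fabricated for illustration only).
transactions = [
    {"butter", "milk", "bread"},
    {"butter", "milk"},
    {"butter", "bread"},
    {"milk", "eggs"},
    {"butter", "milk", "eggs"},
]


def count(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)


def support(itemset, transactions):
    """Fraction of transactions that contain the itemset."""
    return count(itemset, transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Of the transactions containing the antecedent, the fraction that
    also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return count(both, transactions) / count(antecedent, transactions)


butter_support = support({"butter"}, transactions)
rule_confidence = confidence({"butter"}, {"milk"}, transactions)
print(butter_support)   # share of baskets with butter
print(rule_confidence)  # share of butter baskets that also have milk
```

A high confidence for the rule butter → milk is what lets the model suggest milk to a customer who is buying butter.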

 

Applications of Unsupervised Learning

 

  1. Market Segmentation: We group customers based on their age, gender, etc., to learn the interests of each group and offer what that group prefers.
  2. Customer Segmentation: We divide a company's customers into groups so that the customers within each group are similar to one another.
  3. Product Segmentation: A company groups its products, or offers a product in different variants, so that each one appeals to a particular set of customers.
  4. Recommendation systems: Recommendation systems suggest items to users based on their searches and interests.

FAQs

Q1) What are popular Unsupervised Learning algorithms?
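Popular Unsupervised Learning algorithms and techniques include:

  • K-means clustering.
  • Hierarchical clustering.
  • Anomaly detection methods.
  • Neural networks trained without labels, such as autoencoders.

Note that KNN (k-nearest neighbors) is sometimes confused with K-means, but it is a Supervised Learning algorithm because it needs labeled examples.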

 

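Q2) In Hierarchical Clustering, how many clusters does a data point belong to?
A data point belongs to one cluster at each level of the hierarchy, so the number of clusters it belongs to equals the number of levels it passes through, from the root down to its smallest cluster.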

 

Q3) What does K denote in K-means clustering?
K denotes the number of clusters.

 

Q4) What is Anomaly detection?
Anomaly detection is the process of identifying items in a dataset that differ significantly from the norm.
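One very simple way to illustrate this is a z-score check: flag values that lie more than a chosen number of standard deviations from the mean. The sketch below is only one of many possible approaches; the function name `find_anomalies`, the sample readings, and the threshold of 2.0 are assumptions for the example.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return the values whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values are identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > threshold]


readings = [10, 11, 9, 10, 12, 10, 11, 50]
anomalies = find_anomalies(readings)
print(anomalies)
```

No labels are involved: the reading of 50 is flagged purely because it lies far from the rest of the data.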

 

Q5) What are the cons of Unsupervised Learning?
The main cons of Unsupervised Learning are:
a) The results tend to be less accurate and harder to validate, because the input data is not labeled in advance and there is no ground truth to check the output against.
b) It can be costlier, as it may require human intervention to interpret the patterns and relate them to domain knowledge.

Key Takeaways

This brings us to the end of the discussion on the basics of Unsupervised Learning. I hope the concepts covered here were easy to follow. For more information, you may visit:

https://www.codingninjas.com/courses/machine-learning

 

Thank you. Keep learning, keep growing 😇😇
