Table of contents
1. Introduction
2. What is Entropy?
3. How is Entropy Calculated?
4. Example of Entropy Calculation
5. Uses of Entropy in Machine Learning
   5.1. Decision Trees
   5.2. Anomaly Detection
   5.3. Clustering
   5.4. Feature Selection
   5.5. Model Evaluation
6. Key Features of Entropy in Machine Learning
7. Implementing Entropy in Python
8. Frequently Asked Questions
9. Conclusion

Entropy in Machine Learning

By Rahul Singh

Introduction

Entropy measures the uncertainty or disorder in a set of data. In machine learning, it is used primarily in decision tree algorithms to decide how to split the data at each node. By quantifying how mixed or pure a dataset is, entropy guides the algorithm toward the most informative splits.

What is Entropy?

Entropy quantifies the amount of uncertainty or disorder in a set of data. A dataset with high entropy has more randomness, while a dataset with low entropy is more ordered. In simpler terms, entropy tells us how mixed the dataset is. For example, a dataset containing only one class has an entropy of 0, while a dataset split evenly between two classes has an entropy of 1, the maximum for binary classification.

In the context of decision trees, entropy helps to determine how well a particular feature separates the data into different classes. The goal is to find the feature that provides the highest information gain, meaning it reduces the most uncertainty.

How is Entropy Calculated?

Entropy is calculated using the following formula:

$$\text{Entropy} = -\sum_{i=1}^{n} p_i \log_2(p_i)$$

where

  • $p_i$ is the probability of class $i$ in the dataset.
  • The sum is taken over all $n$ classes in the dataset.

Example of Entropy Calculation

Let's take a dataset of 10 examples, where 6 examples are of Class A and 4 examples are of Class B.

Calculate the probabilities:

  • Probability of Class A: $p_A = 6/10 = 0.6$
  • Probability of Class B: $p_B = 4/10 = 0.4$

Apply the entropy formula:

$$\text{Entropy} = -(0.6 \cdot \log_2(0.6) + 0.4 \cdot \log_2(0.4))$$

Calculate each term:

  • $\log_2(0.6) \approx -0.737$
  • $\log_2(0.4) \approx -1.322$

Compute entropy:

$$\text{Entropy} = -(0.6 \cdot (-0.737) + 0.4 \cdot (-1.322)) = -(-0.4422 - 0.5288) = 0.9710$$

So, the entropy of this dataset is approximately 0.97.
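
As a quick sanity check, the same number can be reproduced in code. The snippet below assumes SciPy is installed and uses its scipy.stats.entropy function with base 2:

from scipy.stats import entropy

# Probabilities from the example above: p_A = 0.6, p_B = 0.4
probabilities = [0.6, 0.4]
print(entropy(probabilities, base=2))  # ~0.9710, matching the manual result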

Uses of Entropy in Machine Learning 

1. Decision Trees

In decision tree algorithms like ID3, C4.5, and CART, entropy is used to calculate information gain, which helps in selecting the best attribute to split the data at each node. Information gain is the reduction in entropy when a dataset is split on an attribute. The attribute with the highest information gain is chosen for the split, as it provides the most significant reduction in uncertainty.
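
As a rough sketch of this idea (not a full ID3 implementation), the snippet below computes the information gain of a binary split; the parent, left, and right label lists are made-up examples:

import math

def entropy(labels):
    # Shannon entropy (base 2) of a list of class labels
    total = len(labels)
    probs = [labels.count(c) / total for c in set(labels)]
    return -sum(p * math.log2(p) for p in probs)

def information_gain(parent, left, right):
    # Parent entropy minus the size-weighted entropy of the two children
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# Hypothetical split that sends most of Class A left and most of Class B right
parent = ['A'] * 6 + ['B'] * 4
left, right = ['A', 'A', 'A', 'A', 'A', 'B'], ['A', 'B', 'B', 'B']
print(information_gain(parent, left, right))  # ~0.256: the split reduces uncertainty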

2. Anomaly Detection

Entropy can be used to identify anomalies in data. An anomaly might be defined as a data point that increases the entropy of the dataset significantly. Models can be trained to detect such points and flag them as anomalies.
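
One simple way to make this concrete for categorical data: a rare value inflates the entropy of a feature, so the point whose removal lowers entropy the most is a natural anomaly candidate. The data below are hypothetical:

import math

def entropy(values):
    total = len(values)
    probs = [values.count(v) / total for v in set(values)]
    return -sum(p * math.log2(p) for p in probs)

data = ['ok'] * 99 + ['weird']     # one rare value among 100 observations
baseline = entropy(data)

for candidate in set(data):
    remaining = list(data)
    remaining.remove(candidate)    # drop one occurrence of this value
    print(candidate, round(baseline - entropy(remaining), 4))
# 'weird' shows a large positive entropy drop; 'ok' shows almost none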

3. Clustering

Entropy can be used to assess the quality of clusters by measuring the uncertainty within each cluster. Lower entropy within a cluster means its members mostly share the same class, which indicates that the clustering has separated the data cleanly.
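
A minimal sketch of this check, assuming ground-truth labels are available for the clustered points and SciPy is installed; the cluster assignments below are hypothetical:

from collections import Counter
from scipy.stats import entropy

# Hypothetical clustering result: cluster id -> true labels of its members
clusters = {
    0: ['A', 'A', 'A', 'A'],   # pure cluster
    1: ['B', 'B', 'A', 'B'],   # mostly pure
    2: ['A', 'B', 'A', 'B'],   # completely mixed
}

for cluster_id, labels in clusters.items():
    counts = list(Counter(labels).values())
    # scipy normalizes the counts into probabilities before computing entropy
    print(cluster_id, entropy(counts, base=2))  # 0.0, ~0.811, 1.0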

4. Feature Selection

Entropy is used to calculate mutual information between features and the target variable, which helps in feature selection. Features with high mutual information are more relevant as they reduce uncertainty about the target variable.
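
For example, scikit-learn ships an estimator of this quantity, sklearn.feature_selection.mutual_info_classif (assuming scikit-learn is installed). The synthetic data below make the first feature informative and the second pure noise:

import numpy as np
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=200)                 # binary target
informative = y + rng.normal(0, 0.3, size=200)   # strongly tied to y
noise = rng.normal(0, 1.0, size=200)             # unrelated to y
X = np.column_stack([informative, noise])

scores = mutual_info_classif(X, y, random_state=0)
print(scores)  # the first feature should score much higher than the second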

5. Model Evaluation

In classification tasks, especially in neural networks, cross-entropy loss is a common objective function used to evaluate the performance of a model. It measures the difference between the predicted probability distribution and the actual distribution (typically the one-hot encoded labels), with lower cross-entropy indicating a better model fit.
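
A minimal sketch of the computation for a single sample, assuming a one-hot true label, in which case the loss reduces to the negative log-probability assigned to the true class (frameworks typically use the natural logarithm here):

import math

def cross_entropy(predicted_probs, true_class_index):
    # Negative log-probability the model assigned to the true class
    return -math.log(predicted_probs[true_class_index])

# Hypothetical 3-class predictions where the true class is index 0
confident = [0.9, 0.05, 0.05]
uncertain = [0.4, 0.3, 0.3]
print(cross_entropy(confident, 0))   # ~0.105 -> better fit, lower loss
print(cross_entropy(uncertain, 0))   # ~0.916 -> worse fit, higher loss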

Key Features of Entropy in Machine Learning

  1. Measure of Uncertainty: Entropy helps quantify the uncertainty in the dataset. Higher entropy means more unpredictability in the dataset.
     
  2. Guides Decision Trees: In algorithms like ID3 and C4.5, entropy is used to determine the best feature to split the dataset.
     
  3. Information Gain Calculation: Entropy is used to calculate information gain, which helps in selecting the feature that best separates the dataset.
     
  4. Easy to Compute: Entropy can be calculated with simple probability formulas, making it practical for machine learning applications.

Implementing Entropy in Python

Here's a Python example to calculate entropy for a dataset:


import math

def calculate_entropy(class_labels):
    total_items = len(class_labels)
    class_counts = {}

    # Count occurrences of each class
    for label in class_labels:
        if label in class_counts:
            class_counts[label] += 1
        else:
            class_counts[label] = 1

    entropy = 0.0

    # Calculate entropy: -sum(p * log2(p)) over all classes
    for count in class_counts.values():
        probability = count / total_items
        entropy -= probability * math.log2(probability)

    return entropy

# Example usage: 6 'A' labels and 4 'B' labels, as in the worked example above
class_labels = ['A', 'A', 'B', 'A', 'B', 'B', 'A', 'A', 'A', 'B']
entropy = calculate_entropy(class_labels)
print("Entropy of the dataset:", entropy)

 

Output

Entropy of the dataset: 0.9709505944546686

Frequently Asked Questions

What does entropy measure in machine learning?

Entropy measures the level of disorder or impurity in a dataset. In machine learning, it helps determine the best way to split data to reduce uncertainty.

How is entropy used in decision trees?

Entropy is used to evaluate how well a feature splits the data into different classes. A split that leaves the resulting subsets with lower entropy is preferred, as it indicates a clearer separation and a higher information gain.

Can entropy be negative?

No, entropy cannot be negative. Since each class probability lies between 0 and 1, its logarithm is at most zero, so every term in the sum is non-negative and the total entropy is always zero or positive.

Conclusion

Entropy is a crucial concept in machine learning, particularly for algorithms like decision trees. It helps measure the level of uncertainty in a dataset and guides the creation of effective splits to improve classification accuracy. Understanding entropy and its application can significantly enhance your ability to build and optimize machine learning models.

You can also check out our other blogs on Code360.
