Table of contents

1. Introduction
2. Decision Tree on a 1D Regression Task
3. Working of Decision Tree Regression
4. Create a Random 1D Dataset
5. Fit Regression Model
   5.1. Predict
   5.2. Plot the Results
6. Decision Tree Regression with Multi-Output Targets
7. Frequently Asked Questions
   7.1. What is Decision Tree Regression used for?
   7.2. How does Decision Tree Regression handle overfitting?
   7.3. Can Decision Tree Regression handle multi-output targets?
8. Conclusion
Last Updated: Mar 3, 2025

Decision Tree Regression

Author Gaurav Gandhi

Introduction

Decision Tree Regression is a supervised machine learning algorithm used for predicting continuous values. It works by splitting the dataset into smaller subsets based on decision rules, forming a tree-like structure. Each leaf node represents a predicted value. This technique is useful for capturing non-linear relationships in data. 


In this article, you will learn about Decision Tree Regression, its working mechanism, advantages, and implementation.

Decision Tree on a 1D Regression Task

A Decision Tree can be used for regression tasks involving one-dimensional data. It divides the dataset into different segments, assigning a constant value to each segment. This makes it an effective choice for modeling complex data relationships. Let’s go step by step through the process of implementing Decision Tree Regression.

Working of Decision Tree Regression

  1. Data Splitting: The dataset is recursively divided into subsets based on conditions applied to feature values.
     
  2. Leaf Nodes: The final segments of the tree contain predicted values.
     
  3. Prediction: For any new input, the algorithm follows the path in the tree to a leaf node and returns its value.
     
  4. Overfitting Prevention: Depth constraints and pruning techniques help maintain model generalization.
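The prediction step above can be sketched in plain Python. The tree below is a hypothetical hand-built example, not scikit-learn's internal representation; it assumes each internal node stores a feature index, a threshold, and left/right children, while each leaf stores a constant value:

```python
# Minimal sketch of regression-tree prediction: follow feature
# thresholds down the tree until a leaf, then return its constant.

def predict_one(node, x):
    """Walk the tree until a leaf (a plain number) is reached."""
    while isinstance(node, dict):
        branch = "left" if x[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]
    return node

# Hypothetical depth-2 tree for a single feature (index 0):
tree = {
    "feature": 0, "threshold": 2.5,
    "left":  {"feature": 0, "threshold": 1.0, "left": 0.8, "right": 0.2},
    "right": -0.5,
}

print(predict_one(tree, [0.5]))   # segment x <= 1.0 -> 0.8
print(predict_one(tree, [3.0]))   # segment x > 2.5 -> -0.5
```

Each leaf corresponds to one segment of the feature space, which is why a regression tree's prediction curve is a step function.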

Create a Random 1D Dataset

Before implementing Decision Tree Regression, we first need to generate sample data. The Python script below uses NumPy and Matplotlib to create a noisy sine dataset:

import numpy as np
import matplotlib.pyplot as plt

# Generate random data
np.random.seed(42)
X = np.sort(5 * np.random.rand(80, 1), axis=0)
y = np.sin(X).ravel() + np.random.normal(0, 0.1, X.shape[0])

# Plot the dataset
plt.scatter(X, y, color="blue", label="Data Points")
plt.xlabel("Feature")
plt.ylabel("Target")
plt.title("Generated 1D Dataset")
plt.legend()
plt.show()

 

Output

Fit Regression Model

We will now apply Decision Tree Regression to fit our dataset:

from sklearn.tree import DecisionTreeRegressor

# Create and train the model
dt_regressor = DecisionTreeRegressor(max_depth=3)
dt_regressor.fit(X, y)

 

Here, we define a DecisionTreeRegressor with max_depth=3. Limiting the depth reduces overfitting and keeps the tree small enough to interpret.

Predict

Once the model is trained, we can make predictions:

X_test = np.arange(0, 5, 0.1).reshape(-1, 1)
y_pred = dt_regressor.predict(X_test)

 

This predicts the output for 50 evenly spaced test values (0 to 5 in steps of 0.1) within the training feature range.
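Because a depth-3 tree has at most 2³ = 8 leaves, the predictions are piecewise constant with at most 8 distinct levels. A quick self-contained check (repeating the data generation and fit from the earlier snippets):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Same dataset as above: noisy sine over [0, 5)
np.random.seed(42)
X = np.sort(5 * np.random.rand(80, 1), axis=0)
y = np.sin(X).ravel() + np.random.normal(0, 0.1, X.shape[0])

dt = DecisionTreeRegressor(max_depth=3).fit(X, y)
X_test = np.arange(0, 5, 0.1).reshape(-1, 1)
y_pred = dt.predict(X_test)

# A depth-3 tree has at most 2**3 = 8 leaves, so the prediction
# curve is a step function with at most 8 distinct values.
print(dt.get_n_leaves())        # at most 8
print(len(np.unique(y_pred)))   # at most the number of leaves
```

This step-function behavior is exactly what the red prediction curve in the next plot shows.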

Plot the Results

Let’s visualize the regression model’s predictions:

plt.scatter(X, y, color="blue", label="Training Data")
plt.plot(X_test, y_pred, color="red", linewidth=2, label="Decision Tree Prediction")
plt.xlabel("Feature")
plt.ylabel("Target")
plt.title("Decision Tree Regression")
plt.legend()
plt.show()

 

Output

Decision Tree Regression with Multi-Output Targets

Decision Tree Regression can handle multiple target variables simultaneously. Below is an example where we predict multiple outputs:

# Generate multi-output data
Y = np.vstack((y, np.cos(X).ravel())).T

# Train a Decision Tree model for multi-output regression
dt_multi_output = DecisionTreeRegressor(max_depth=3)
dt_multi_output.fit(X, Y)

# Predict on test data
y_multi_pred = dt_multi_output.predict(X_test)

# Plot predictions
plt.scatter(X, Y[:, 0], color="blue", label="Target 1")
plt.scatter(X, Y[:, 1], color="green", label="Target 2")
plt.plot(X_test, y_multi_pred[:, 0], color="red", label="Prediction 1")
plt.plot(X_test, y_multi_pred[:, 1], color="orange", label="Prediction 2")
plt.xlabel("Feature")
plt.ylabel("Targets")
plt.title("Multi-Output Decision Tree Regression")
plt.legend()
plt.show()

 

Output

Frequently Asked Questions

What is Decision Tree Regression used for? 

Decision Tree Regression is used for predicting continuous values by learning decision rules from training data.

How does Decision Tree Regression handle overfitting? 

Overfitting can be controlled by pruning (e.g. cost-complexity pruning via ccp_alpha), limiting the maximum depth (max_depth), and requiring a minimum number of samples per leaf (min_samples_leaf).
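A quick sketch of the difference these constraints make, using the same generated dataset as earlier in the article: an unconstrained tree grows one leaf per (nearly unique) training point and memorizes the noise, while depth and leaf-size limits keep the tree small.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Same noisy sine dataset as in the earlier snippets
np.random.seed(42)
X = np.sort(5 * np.random.rand(80, 1), axis=0)
y = np.sin(X).ravel() + np.random.normal(0, 0.1, X.shape[0])

# Unconstrained tree: grows until every leaf is pure, fitting the noise
full = DecisionTreeRegressor().fit(X, y)

# Constrained tree: depth and leaf-size limits act as regularization
pruned = DecisionTreeRegressor(max_depth=3, min_samples_leaf=5).fit(X, y)

print(full.get_n_leaves())    # close to the number of training points
print(pruned.get_n_leaves())  # far fewer leaves
```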

Can Decision Tree Regression handle multi-output targets? 

Yes. scikit-learn's DecisionTreeRegressor natively supports multi-output regression: fit it on a 2D target array, and predict returns one value per target for each input.

Conclusion

In this article, we discussed Decision Tree Regression, a machine learning algorithm that predicts continuous values by recursively splitting data into smaller subsets. It builds a tree-like model in which decisions are made based on feature conditions. Decision Tree Regression handles non-linear relationships well and provides an interpretable way to make predictions, making it a practical tool for real-world regression problems.
