Table of contents
1. Introduction
2. Decision Surface
3. Implementation
4. Visualization of Decision Surface
5. Frequently Asked Questions
6. Key Takeaways

Last Updated: Mar 27, 2024

Decision Surface And Plotting

Author Mayank Goyal

Introduction

We all want things to be more transparent; we want to know what is going on inside. The same goes for a classification model. Usually we rely only on performance metrics to judge how well a model performs. However, visualizing the classification results has its own charm and gives a clearer picture of how the model separates the classes.

A popular diagnostic for visualizing the decisions made by a classification model is the decision surface (or decision boundary). A decision surface is a plot that shows how a fitted machine learning model divides the input feature space by class label.

It is a powerful tool for understanding how a given model arrives at its predictions and how it decides to divide the input feature space between the class labels.

Decision Surface

Classification in machine learning means training a model on labeled data so that it can assign class labels to inputs.

Each input feature defines an axis of the feature space. With two features the space is a plane, and each data point is a dot at its input coordinates; with three input variables the feature space becomes a three-dimensional volume.

The goal of the classification model is to separate the feature space so that we can decide the class label for any point in it with minimum error.

This separation is defined by the decision surface or boundary, and plotting it works as a demonstrative tool for visualizing the model on a classification predictive modeling task.

The data points lying on one side of the decision surface belong to one class label, while those lying on the other side belong to the other. The decision boundary is created, and can shift, as the model learns.

Although the word ‘surface’ suggests a 2-D feature space, we can still use these methods for more than two features by creating a decision surface for each pair of input features.
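For example, here is a minimal sketch (not from the original article) of how such pairwise plots could be produced. It assumes a pandas DataFrame X of features and a label series y, and the helper name plot_pairwise_surfaces is hypothetical; a fresh classifier is fitted on each pair of features so that predictions can be made over a 2-D grid.

from itertools import combinations
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression

def plot_pairwise_surfaces(X, y, step=0.1):
    # One decision surface per pair of input features.
    for f1, f2 in combinations(X.columns, 2):
        pair = X[[f1, f2]].values
        clf = LogisticRegression().fit(pair, y)   # model trained on just these two features
        x_min, x_max = pair[:, 0].min() - 1, pair[:, 0].max() + 1
        y_min, y_max = pair[:, 1].min() - 1, pair[:, 1].max() + 1
        xx, yy = np.meshgrid(np.arange(x_min, x_max, step),
                             np.arange(y_min, y_max, step))
        zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
        plt.contourf(xx, yy, zz, alpha=0.3)       # decision regions for this feature pair
        plt.scatter(pair[:, 0], pair[:, 1], c=y, edgecolor='k')
        plt.xlabel(f1)
        plt.ylabel(f2)
        plt.show()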

Now, let's look at the implementation part to get a clearer picture. We will be using a logistic regression classifier for our implementation.

Implementation

We will be using the Breast Cancer Wisconsin (Diagnostic) dataset for our work.

Importing all the necessary libraries

import pandas as pd

import numpy as np
pd.set_option("display.max_rows", None)     # show all rows when displaying DataFrames
pd.set_option("display.max_columns", None)  # show all columns
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import IncrementalPCA
from sklearn.model_selection import GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

 

Reading the dataset and displaying the first 10 rows.

df = pd.read_csv('data.csv')
df.head(10)

 

 

Performing Label Encoding on Target Feature

df['diagnosis'] = df['diagnosis'].map({'B': 0, 'M': 1})

 

Dropping the Unwanted Columns

df.drop(['id', 'Unnamed: 32'], axis = 1, inplace = True)

 

Differentiating Dependent and Independent Features

x = df.iloc[:, 1:]
y = df['diagnosis']

 

x

 

 

Splitting the Input Dataset

x_train, x_test, y_train, y_test = train_test_split(x, y, train_size = 0.6,test_size = 0.4, random_state = 1234)

 

We split the dataset into a 60-40 ratio.
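As a quick sanity check (an addition, not part of the original walkthrough), we can confirm the split sizes:

print(x_train.shape, x_test.shape)   # roughly 60% and 40% of the rows, each with 30 feature columns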

 

Feature Scaling

columns = x_train.columns
scalerx = StandardScaler()
x_train_scaled = scalerx.fit_transform(x_train)
x_train_scaled = pd.DataFrame(x_train_scaled, columns = columns)

x_test_scaled = scalerx.transform(x_test)
x_test_scaled = pd.DataFrame(x_test_scaled, columns = columns)

 

x_train_scaled

 

 

PCA

We reduce the dimensionality of the features to two, since we can only plot a decision surface over a 2-D grid.

pca = IncrementalPCA(n_components = 2)
x_train_pca = pca.fit_transform(x_train_scaled)

x_test_pca = pca.transform(x_test_scaled)

Incremental Principal Component Analysis projects the data onto two components that explain as much variance as possible. We fit the PCA on the training data and apply the same transformation to the test data.
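As an optional check (not part of the original walkthrough), the fitted PCA object reports how much of the total variance the two retained components capture:

print(pca.explained_variance_ratio_)         # variance share of each component
print(pca.explained_variance_ratio_.sum())   # combined share explained by the 2-D projection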

 

Plotting the Scatter-Plot for both the Training and Testing Datasets

plt.figure(figsize = (20, 6))
plt.subplot(121)
plt.scatter(x_train_pca[:,0], x_train_pca[:,1], c = y_train)
plt.xlabel('Training 1st Principal Component')
plt.ylabel('Training 2nd Principal Component')
plt.title('Training Set Scatter Plot with labels indicated by colors, (0)-Violet,(1)-Yellow')
plt.subplot(122)
plt.scatter(x_test_pca[:,0], x_test_pca[:,1], c = y_test)
plt.xlabel('Test 1st Principal Component')
plt.ylabel('Test 2nd Principal Component')
plt.title('Test Set Scatter Plot with labels indicated by colors, (0)-Violet (1)-Yellow')
plt.show()

 

We can see the distinction between the two classes and can already imagine a possible decision surface, perhaps a diagonal line between the two.

 

Performing Cross-Validation

params = {'C': [0.01, 0.1, 1, 10, 100]}

clf = LogisticRegression()

folds = 5
model_cv = GridSearchCV(estimator = clf, 
                        param_grid = params, 
                        scoring = 'accuracy',
                        cv = folds,
                        return_train_score=True,
                        verbose = 3)

model_cv.fit(x_train_pca, y_train)

We perform a 5-fold grid-search cross-validation of the logistic regression classifier on the training set, tuning the regularization parameter C.
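If you want to compare the candidate values of C yourself (an optional step, not shown in the original article), the fitted GridSearchCV object exposes the fold-wise results:

cv_results = pd.DataFrame(model_cv.cv_results_)
print(cv_results[['param_C', 'mean_train_score', 'mean_test_score']])   # accuracy per candidate C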

 

Best Hyperparameter from Grid-Search CV performed Above

print(model_cv.best_params_)

 

Output

{'C': 10}

 

Re-Training the model with best parameters

model = LogisticRegression(C = 10).fit(x_train_pca, y_train)

 

We re-train our model with C = 10, the best value obtained from hyperparameter tuning.
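As a side note (an alternative not used above), GridSearchCV with the default refit=True has already re-fitted the best model on the whole training set, so the tuned classifier can also be taken directly from it:

model = model_cv.best_estimator_   # equivalent to re-training with the best C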

 

Predictions

y_train_pred = model.predict(x_train_pca)

y_test_pred = model.predict(x_test_pca)

 

Performance Analysis of the model in terms of different performance metrics

print('Training Accuracy of the Model: ', metrics.accuracy_score(y_train, y_train_pred))
print('Test Accuracy of the Model: ', metrics.accuracy_score(y_test, y_test_pred))
print()

print('Training Precision of the Model: ', metrics.precision_score(y_train, y_train_pred))
print('Test Precision of the Model: ', metrics.precision_score(y_test, y_test_pred))
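Beyond accuracy and precision, a confusion matrix is often worth a quick look (this snippet is an addition, not part of the original article):

print(metrics.confusion_matrix(y_test, y_test_pred))   # rows = actual class, columns = predicted class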

 

 

Visualization of Decision Surface

We can visualize the decision boundary by fitting the model on the training data and then using the same model to make predictions for a grid of values spanning the input domain.

Once we have the grid of predictions, we can plot the values and their class label.

A good approach for visualizing decision boundaries is a filled contour plot, which interpolates colors between the grid points. We can use Matplotlib's contourf() function to plot the decision surface.

We have to follow specific steps.

Firstly, we need to define grid points covering the whole feature space.

To do this, we find the minimum and maximum values of each feature and extend the range by one unit on each side to ensure that the whole feature space is covered.

x_min, x_max = x_train_pca[:, 0].min() - 1, x_train_pca[:, 0].max() + 1
y_min, y_max = x_train_pca[:, 1].min() - 1, x_train_pca[:, 1].max() + 1

 

NumPy's arange() function creates a uniformly spaced sample at a chosen resolution along each dimension. We will use the meshgrid() function to turn the two 1-D ranges into a grid of the two input vectors.

xx_train, yy_train = np.meshgrid(np.arange(x_min, x_max, 0.1),
                                np.arange(y_min, y_max, 0.1))

Z_train = model.predict(np.c_[xx_train.ravel(), yy_train.ravel()])

Z_train = Z_train.reshape(xx_train.shape)

 

 

We do the same for the test dataset.

x_min, x_max = x_test_pca[:, 0].min() - 1, x_test_pca[:, 0].max() + 1
y_min, y_max = x_test_pca[:, 1].min() - 1, x_test_pca[:, 1].max() + 1

xx_test, yy_test = np.meshgrid(np.arange(x_min, x_max, 0.1),
                              np.arange(y_min, y_max, 0.1))

Z_test = model.predict(np.c_[xx_test.ravel(), yy_test.ravel()])

Z_test = Z_test.reshape(xx_test.shape)

 

 

Now we have grid values across the feature space.

The contourf() function takes a grid for each axis along with the grid of predictions. Then, we plot the decision surface with a two-color colormap and overlay the scatter plot of the points.

plt.figure(figsize = (20, 6))
plt.subplot(121)
plt.contourf(xx_train, yy_train, Z_train)
plt.scatter(x_train_pca[:, 0], x_train_pca[:, 1], c = y_train, s = 30, edgecolor = 'k')
plt.xlabel('Training 1st Principal Component')
plt.ylabel('Training 2nd Principal Component')
plt.title('Scatter Plot with Decision Boundary for the Training Set')
plt.subplot(122)
plt.contourf(xx_test, yy_test, Z_test)
plt.scatter(x_test_pca[:, 0], x_test_pca[:, 1], c = y_test, s = 30, edgecolor = 'k')
plt.xlabel('Test 1st Principal Component')
plt.ylabel('Test 2nd Principal Component')
plt.title('Scatter Plot with Decision Boundary for the Test Set')
plt.show()

 

 

So we can see how the contourf() function plotted a clean decision boundary. From the plots above, we can see how points in the feature space are assigned their class labels.

 

Frequently Asked Questions

  1. How is the optimal decision boundary determined?
    A classifier is a rule that partitions the feature space and assigns all points in the same partition to the same class. The ‘boundary’ of this partitioning is the decision boundary of the rule; the rule that classifies with the lowest error produces the optimal decision boundary.
     
  2. How do you determine decision boundaries in logistic regression?
    The logistic regression decision boundary is the set of all points x that satisfy P(y=1|x) = P(y=0|x) = 1/2 (a short code sketch illustrating this follows the FAQ list).
     
  3. How does a decision tree's decision boundary differ from that of logistic regression?
    A logistic regression decision boundary divides the feature space into exactly two halves with a single line (or hyperplane), whereas decision trees keep splitting the space into smaller and smaller axis-aligned regions.
     
  4. What kind of decision boundary is built by a logistic regression classifier?
    In the case of logistic regression, the decision boundary is a straight line, i.e., the model produces a hyperplane, like an SVM, that divides the feature space into two classes.
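As a quick illustration of question 2 above (an addition, not from the original article): for the two-component model fitted earlier, the boundary P(y=1|x) = 1/2 is exactly the set of points where the linear score w1*x1 + w2*x2 + b equals zero, so it can be drawn directly from the fitted coefficients.

w1, w2 = model.coef_[0]            # weights of the two principal components
b = model.intercept_[0]
x1 = np.linspace(x_train_pca[:, 0].min(), x_train_pca[:, 0].max(), 100)
x2 = -(w1 * x1 + b) / w2           # solve w1*x1 + w2*x2 + b = 0 for x2
plt.plot(x1, x2, 'r--')            # the straight-line decision boundary
plt.show()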

 

Key Takeaways

Let us briefly summarize the article.

First, we saw what a decision boundary is and how it enhances our visualization by giving a clear picture of how data inputs are classified. Then, we saw how to implement and plot a decision boundary for a logistic regression classifier.

I recommend applying the same steps with another classification model to understand it better. In this way, we can use both the decision surface and performance metrics to evaluate a model's performance.

That is the end of the article. Stay tuned for more exciting articles.

Keep Learning Ninjas!
