## Introduction

Have you ever wondered how your email account accurately segregates regular emails, important emails, and spam emails? It's not a very complex trick, and we'll learn the secret behind it. This is done with a supervised learning model called logistic regression (it can also be done with other machine learning algorithms, but for the sake of this blog we'll stick to logistic regression).

Logistic regression is employed in supervised learning tasks, more specifically in classification tasks. We know the name throws some people off, but the "regression" in logistic regression is misleading: it is NOT a regression model. Logistic regression is a probabilistic classifier, which means it makes use of the probabilities of events to make its predictions.

## Methodology

Suppose we are given a task: we have a customer's banking history and must decide whether the customer can be sanctioned a loan. Essentially, we need to predict whether, if given a loan, the customer will default on payment or not. We can use logistic regression for this purpose; it will be a binary classification between 'Yes' and 'No'. Logistic regression makes use of the sigmoid function, which is of the form:

sigmoid(z) = 1 / (1 + e^{-z})

We know the straight line equation -

y = w_{0} + w_{1}x

A probability y lies between 0 and 1, but the right-hand side of the straight line equation is not confined to that range. So let's first take the odds, dividing y by 1-y:

y / (1-y) : 0 for y = 0 and ∞ for y → 1

But we require our function to range over -∞ to +∞, just like the linear model. For that, we'll take the logarithm, so the new equation is:

log(y / (1-y)) = w_{0} + w_{1}x
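As a quick sanity check of this derivation (a minimal sketch assuming NumPy; `sigmoid` and `log_odds` are helper names of our own), the log-odds transform is exactly the inverse of the sigmoid:

```python
import numpy as np

def sigmoid(z):
    # Maps any real number into the open interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def log_odds(y):
    # Maps a probability in (0, 1) back to (-inf, +inf)
    return np.log(y / (1.0 - y))

z = np.array([-3.0, 0.0, 2.5])
recovered = log_odds(sigmoid(z))
print(np.allclose(recovered, z))  # True: log-odds inverts the sigmoid
```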

Upon simplifying (solving for y), our final equation becomes:

y = 1 / (1 + e^{-(w_{0} + w_{1}x)})
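To make the final equation concrete, here is a minimal sketch assuming NumPy and made-up parameter values (w_{0} = -1.0, w_{1} = 0.8 are purely illustrative, not learned from any data):

```python
import numpy as np

# Hypothetical parameters, for illustration only
w0, w1 = -1.0, 0.8

def predict_proba(x):
    # Final logistic model: y = 1 / (1 + e^-(w0 + w1*x))
    return 1.0 / (1.0 + np.exp(-(w0 + w1 * x)))

x = np.array([0.0, 1.25, 5.0])
print(predict_proba(x))  # every value lies strictly between 0 and 1
```

Note that at x = 1.25 the linear part w0 + w1*x is exactly 0, so the predicted probability is exactly 0.5.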

Here y = the predicted probability of belonging to the default class (the default class is 1, i.e. 'Yes'), and

w_{0} + w_{1}x = the linear model within logistic regression.

Also, the function has the form of a sigmoid: its range is between 0 and 1, and it forms an S-shaped curve.

The logistic function predicts the probability of an outcome, hence its value lies anywhere between 0 and 1, and that's where it gets its name from. We choose a threshold value above which the final prediction is 1, and 0 otherwise.

Let's talk about the linear equation w_{0} + w_{1}x within the logistic function. Why do we need the logistic function in the first place if the model stems from linear regression?

It's because the linear regression equation isn't confined to a range, unlike logistic regression, and it would be very difficult to assign a threshold value for class membership to a raw linear output. Thus we feed the linear prediction to a sigmoid function, which gives logistic regression its range between 0 and 1. Since the output is now always between 0 and 1, it is convenient to do a probabilistic classification.

The linear equation represents a linear relationship between the input features and the output.

Here x = input feature

w_{0} = bias term

w_{1} = weight associated with the input variable

Now suppose we take 0.5 as our threshold value. That means:

a predicted value > 0.5 from the logistic function gives a final prediction of 1, and

a predicted value ≤ 0.5 from the logistic function gives a final prediction of 0.

This cut-off is also called the decision boundary.
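The thresholding rule above can be sketched in a few lines (assuming NumPy; `classify` is a helper name of our own):

```python
import numpy as np

def classify(probs, threshold=0.5):
    # Final label is 1 only when the predicted probability
    # strictly exceeds the threshold; <= threshold maps to 0
    return (np.asarray(probs) > threshold).astype(int)

print(classify([0.2, 0.5, 0.51, 0.9]))  # [0 0 1 1]
```

Note that 0.5 itself maps to 0, matching the "≤ 0.5 gives 0" rule above.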


Plotting the graph makes clear what distinguishes logistic regression from linear regression: the linear fit is an unbounded straight line, while the logistic fit is an S-curve squeezed between 0 and 1.
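Putting it all together, here is a hedged end-to-end sketch of the loan-default idea from earlier, assuming scikit-learn is available. The dataset is entirely synthetic (a single made-up feature such as a debt-to-income ratio), so the numbers illustrate the workflow rather than any real banking data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy, synthetic stand-in for a "will the customer default?" dataset:
# one feature (say, a debt-to-income ratio in [0, 1]) and a 0/1 label.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 1))
y = (X[:, 0] + rng.normal(0, 0.1, size=200) > 0.5).astype(int)

model = LogisticRegression()
model.fit(X, y)

# predict_proba returns [P(class 0), P(class 1)] per row;
# predict() applies the 0.5 threshold internally.
print(model.predict_proba([[0.9]])[0, 1])  # high ratio -> probability near 1
print(model.predict([[0.1]])[0])           # low ratio -> predicted class 0
```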