Need for AdaBoost Algorithm
AdaBoost is best used to boost the performance of decision trees on binary classification problems.
- AdaBoost is a machine learning algorithm that can be used to improve the performance of other machine learning techniques. It works best with weak learners: models that achieve an accuracy just above random chance on a classification task.
- Decision trees with one level are the most suitable and most commonly used algorithm with AdaBoost. These trees are called decision stumps because they are so short and contain only a single decision for classification.
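To make this concrete, here is a minimal sketch of a decision stump built with scikit-learn (the Iris dataset is used purely for illustration):

#a decision stump is a decision tree limited to a single split (max_depth=1)
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
stump = DecisionTreeClassifier(max_depth=1, random_state=0)
stump.fit(X_train, y_train)
#with only two leaves, the stump cannot separate all three classes,
#so on its own it is a weak learner
print("Stump accuracy:", accuracy_score(y_test, stump.predict(X_test)))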
Working of AdaBoost Algorithm
Let's have a look at how the AdaBoost algorithm works. During training, a fixed number of decision trees is built in sequence. As the first decision tree/model is constructed, the records it classifies incorrectly are given priority: their weights are increased, so the second model concentrates on getting them right. The procedure continues until the chosen number of base learners has been built. Remember that all boosting techniques allow records to be repeated across the base learners.
Algorithm
The following steps describe how AdaBoost works; a minimal from-scratch sketch in Python follows the list:
1. Set up the dataset and assign each data point an equal weight.
2. Provide this as input to the model and identify the data points that were incorrectly classified.
3. Increase the weights of the incorrectly classified data points.
4. If the required results have been achieved, proceed to step 5; otherwise, go back to step 2.
5. End
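The loop below is a from-scratch sketch of these steps for binary labels in {-1, +1}, using decision stumps as the weak learners. The dataset, the number of rounds, and the small constant added to the error to avoid division by zero are illustrative assumptions:

#a from-scratch sketch of discrete AdaBoost with decision stumps
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
y = np.where(y == 0, -1, 1)                        #relabel classes as -1/+1
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

n_rounds = 50
w = np.full(len(X_train), 1 / len(X_train))        #step 1: equal weights
stumps, alphas = [], []
for _ in range(n_rounds):
    stump = DecisionTreeClassifier(max_depth=1)
    stump.fit(X_train, y_train, sample_weight=w)   #step 2: fit a weak learner
    pred = stump.predict(X_train)
    err = np.sum(w[pred != y_train])               #weighted error rate
    alpha = 0.5 * np.log((1 - err) / (err + 1e-10))  #this learner's vote weight
    w *= np.exp(-alpha * y_train * pred)           #step 3: up-weight mistakes
    w /= w.sum()                                   #renormalise the weights
    stumps.append(stump)
    alphas.append(alpha)

#final prediction: the sign of the weighted vote of all weak learners
scores = sum(a * s.predict(X_test) for a, s in zip(alphas, stumps))
print("From-scratch AdaBoost accuracy:", np.mean(np.sign(scores) == y_test))

In practice you would use scikit-learn's AdaBoostClassifier (shown later in this article), which implements the same idea with many refinements.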
Example

(Diagram explaining the AdaBoost algorithm)
- The above diagram shows that AdaBoost begins by randomly selecting a training subset. It then trains the model iteratively, choosing each training set based on the accuracy of the previous round's predictions.
- It gives incorrectly classified observations a larger weight so that they have a higher chance of being classified correctly in the next iteration. It also assigns a weight to the trained classifier in each iteration based on that classifier's accuracy: the more accurate the classifier, the greater its weight. A small numeric sketch of this re-weighting follows the list.
- This process is repeated until the training data fits perfectly or the maximum number of estimators is reached.
- To classify a new observation, a weighted "vote" is taken across all of the weak learners that were built.
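To make the re-weighting concrete, here is a tiny numeric sketch; the five correct/incorrect outcomes are assumed values:

#numeric illustration of AdaBoost's re-weighting (assumed outcomes)
import numpy as np

w = np.full(5, 0.2)                                   #five points, equal weights
correct = np.array([True, True, False, True, False])  #assumed predictions

err = w[~correct].sum()                   #weighted error = 0.4
alpha = 0.5 * np.log((1 - err) / err)     #classifier's vote weight, about 0.2
w = np.where(correct, w * np.exp(-alpha), w * np.exp(alpha))
w /= w.sum()                              #renormalise to sum to 1
print("classifier weight:", round(alpha, 3))
print("new sample weights:", np.round(w, 3))
#the two misclassified points now carry weight 0.25 each,
#versus about 0.167 for each correctly classified point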
Implementation in Python
The scikit-learn library provides the AdaBoostClassifier and AdaBoostRegressor classes in Python. In our example, we use AdaBoostClassifier, since ours is a classification task. The dataset is split into training and test sets using the train_test_split technique. We also import the datasets module, from which we will load the Iris dataset in the program.
Step 1 (Import Required Libraries and Load Iris Dataset)
We use the Iris dataset to develop the model; it is a well-known multi-class classification problem. There are four features in this dataset: sepal length, sepal width, petal length, and petal width, as well as a target (the type of flower). "Setosa", "Versicolor", and "Virginica" are the three flower classes represented in this data. You can find the dataset in the scikit-learn library or download it from the UCI Machine Learning Repository.
#program for adaboost classifier to classify iris dataset using sklearn
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn import metrics
from sklearn.ensemble import AdaBoostClassifier
#load the iris dataset
iris = datasets.load_iris()
X = iris.data
y = iris.target
print(iris.data)

Output
(The full 150 × 4 array of iris feature values is printed.)
Step 2 (Split the Dataset)
We divide the data into two sets, a training set and a testing set, because training and testing on the same data does not give a reliable measure of a classifier's performance. To separate the data, we use the train_test_split function.
#split the dataset into train and test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=4)
#shape of train and test objects
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

Output
(120, 4)
(30, 4)
(120,)
(30,)
Step 3 (Create and Train the Classifier)
Let's use scikit-learn to build the AdaBoost model. By default, AdaBoostClassifier uses a decision stump (a decision tree of depth one) as its base classifier. After training, we use the model's predict() method to determine which class each test sample belongs to.
#create the adaboost classifier
clf = AdaBoostClassifier(n_estimators=50, random_state=1)
#train the classifier
clf.fit(X_train, y_train)
#predict the test set
y_pred = clf.predict(X_test)

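The base learner does not have to be a decision stump. As a hedged sketch, any classifier that supports sample weights can be plugged in; note that the keyword argument is estimator in scikit-learn 1.2 and later, while older versions call it base_estimator. This reuses the train/test split from Step 2:

#swapping in a different weak learner (scikit-learn >= 1.2 syntax)
from sklearn.svm import SVC

svc = SVC(probability=True, kernel='linear')
clf_svc = AdaBoostClassifier(estimator=svc, n_estimators=10, random_state=1)
clf_svc.fit(X_train, y_train)
print("Accuracy with SVC base learner:", clf_svc.score(X_test, y_test))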
Step 4 (Evaluate the model)
Let's calculate how well the classifier can predict the species of flower. Accuracy is computed by comparing the actual test-set values with the predicted values.
#print the accuracy
print("Accuracy:",metrics.accuracy_score(y_test, y_pred))

Output
(The accuracy score on the test set is printed.)
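Beyond a single accuracy number, scikit-learn's metrics module can give a per-class breakdown; here is a short optional sketch using the same predictions:

#optional: per-class evaluation of the same predictions
from sklearn.metrics import confusion_matrix, classification_report

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred, target_names=iris.target_names))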

Frequently Asked Questions
Is AdaBoost intended solely for classification?
AdaBoost works by giving more weight to instances that are difficult to classify and less to those that are already classified well. It is not limited to classification: the algorithm can solve both classification and regression problems, as the sketch below illustrates.
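As a minimal sketch of the regression case (the diabetes dataset is chosen purely for illustration), scikit-learn's AdaBoostRegressor follows the same pattern:

#adaboost for regression using sklearn
from sklearn.datasets import load_diabetes
from sklearn.ensemble import AdaBoostRegressor
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
reg = AdaBoostRegressor(n_estimators=50, random_state=1)
reg.fit(X_train, y_train)
print("R^2 on the test set:", reg.score(X_test, y_test))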
In what ways does AdaBoost improve classifier accuracy?
It improves accuracy by combining many classifiers. AdaBoost is an iterative ensemble method: it builds a powerful classifier by combining several low-performing (weak) classifiers, resulting in a high-accuracy ensemble.
What distinguishes the iris dataset?
The Iris dataset contains 50 samples of each of three Iris species (Iris setosa, Iris virginica, and Iris versicolor), described by four features: the length and width of the sepals and petals. These measurements can be used, for example, to build a linear discriminant model to classify the species.
Why is AdaBoost considered adaptive?
AdaBoost is adaptive in the sense that subsequent weak learners are tweaked in favour of the instances misclassified by earlier classifiers. In some situations, it can also be less prone to overfitting than other learning algorithms.
Conclusion
This article extensively discussed the AdaBoost algorithm and its implementation in the Python programming language.
The key points covered in this article on the AdaBoost Algorithm are as follows:
- Ensemble learning
- Types of Ensemble Learning
- Boosting Ensemble Methods
- Need for AdaBoost Algorithm
- Working of AdaBoost Algorithm
- Implementation
We hope that this blog has helped you enhance your knowledge regarding the AdaBoost algorithm. If you want to learn more, check out our articles on "Data Preprocessing," "Python in Data Mining," "Orange in Data Mining," "Applications of Data Mining" and "Outliers in Data Analysis." Do upvote our blog to help other ninjas grow.
Head over to our practice platform Coding Ninjas Studio to practise top problems, attempt mock tests, read interview experiences, explore the interview bundle, follow guided paths for placement preparation, and much more!
Happy Reading!