What is AutoML?
AutoML stands for Automated Machine Learning. It automates parts of the machine learning workflow, which reduces the manual effort they require.
Source: www.wired.com
A Machine Learning pipeline consists of several steps, from Data Acquisition to Making Predictions on a new dataset.
Figure: ML Pipeline
Of this complete pipeline, AutoML can automate two steps: Model Training / Model Selection and Hyper-parameter Optimization.
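To make these two steps concrete, here is a minimal sketch of doing them by hand with scikit-learn, which is the kind of loop AutoML tools automate on a much larger scale. The Iris dataset, the three candidate models, and the hyperparameter grids are illustrative assumptions, not part of any particular AutoML library.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Candidate model families and the hyperparameter grid to try for each.
# Both the candidates and the grids are illustrative choices.
candidates = {
    "logistic_regression": (LogisticRegression(max_iter=1000), {"C": [0.1, 1.0, 10.0]}),
    "knn": (KNeighborsClassifier(), {"n_neighbors": [3, 5, 7]}),
    "decision_tree": (DecisionTreeClassifier(random_state=0), {"max_depth": [2, 4, None]}),
}

results = {}
for name, (model, grid) in candidates.items():
    # Hyperparameter optimization: cross-validated grid search per model family.
    search = GridSearchCV(model, grid, cv=5)
    search.fit(X_train, y_train)
    results[name] = search

# Model selection: keep the family with the best cross-validated score.
best_name = max(results, key=lambda n: results[n].best_score_)
best_search = results[best_name]
test_accuracy = best_search.score(X_test, y_test)

print(best_name, best_search.best_params_, round(test_accuracy, 3))
```

An AutoML library replaces the hand-written `candidates` dictionary with a far larger search space and a smarter search strategy than an exhaustive grid.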
AutoML Libraries
Today, many libraries are available to implement Automated Machine Learning. Depending on the use case, we can choose the library that best suits our needs.
Let’s see some of the widely used AutoML libraries.
Auto-sklearn
Auto-sklearn is built for classical Machine Learning models and works with the scikit-learn ecosystem. It automatically selects the best-performing algorithm for a dataset and tunes its hyperparameters to improve the results.
The Auto-sklearn library can be installed with a single command.
pip install auto-sklearn
The two primary classes in the Auto-sklearn module are AutoSklearnClassifier and AutoSklearnRegressor. We use the former for classification tasks and the latter for regression tasks, which produce numerical predictions.
The official documentation of Auto-Sklearn can be found here.
AutoKeras
The motive behind the development of AutoKeras was to make computer vision tasks simple enough to be accessible to everyone. We use AutoKeras for Deep Learning tasks. Neural networks and computer vision tasks are more complex, so we use AutoKeras to make them easier to implement.
AutoKeras searches over candidate neural network architectures for the task. In Neural Networks, choosing the number of neurons, the number of hidden layers, or other parameters by hand is tedious and takes a great deal of human effort. AutoKeras automates this search and outputs the best model it finds for the dataset.
We can easily install the AutoKeras library in our Python environment.
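To give a feel for what searching over architectures means, the toy sketch below randomly samples the number of hidden layers and neurons for a small scikit-learn `MLPClassifier` and keeps the best one. This is not how AutoKeras works internally; the dataset, the search space, and the number of trials are assumptions made purely for illustration.

```python
import random

from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X = X / 16.0  # pixel values in this dataset range from 0 to 16
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

rng = random.Random(0)
best_score, best_layers = -1.0, None

for trial in range(5):
    # Sample an architecture: 1 to 3 hidden layers, each with 16, 32, or 64 neurons.
    depth = rng.randint(1, 3)
    layers = tuple(rng.choice([16, 32, 64]) for _ in range(depth))

    model = MLPClassifier(hidden_layer_sizes=layers, max_iter=300, random_state=0)
    model.fit(X_train, y_train)
    score = model.score(X_val, y_val)  # validation accuracy

    # Keep the architecture with the best validation accuracy so far.
    if score > best_score:
        best_score, best_layers = score, layers

print(best_layers, round(best_score, 3))
```

Even this tiny search space has dozens of possible architectures, which is why doing the search by hand quickly becomes tedious.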
pip install autokeras
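Here is a small demonstration of importing AutoKeras in a Python script and using it for Image Classification.
import autokeras as autok
from keras.datasets import mnist
# Load a sample image dataset; MNIST is an illustrative choice, any image arrays work
(train_data, train_labels), (test_data, test_labels) = mnist.load_data()
# max_trials limits how many candidate models are tried
classifier = autok.ImageClassifier(max_trials=1)
classifier.fit(train_data, train_labels)
result = classifier.predict(test_data)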
To read the official documentation of AutoKeras, visit here.
TPOT
We use TPOT to automate certain stages of the Machine Learning pipeline. It is a good fit when data preparation is the most critical part of the task, because TPOT puts more emphasis on the data preparation and modeling stages.
The stages that are automated with TPOT are:
- Feature Selection,
- Feature Processing,
- Feature Construction,
- Model Selection, and
- Parameter Optimization
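For comparison, the stages listed above can be sketched as a hand-built scikit-learn pipeline with a small grid search. TPOT searches over many such pipelines automatically; the specific transformers, the fixed model, and the parameter grid below are illustrative assumptions rather than what TPOT would necessarily choose.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.feature_selection import VarianceThreshold
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

pipeline = Pipeline([
    ("select", VarianceThreshold()),     # feature selection: drop constant pixels
    ("scale", StandardScaler()),         # feature processing
    ("construct", PCA(random_state=0)),  # feature construction
    # Model: fixed by hand here, whereas TPOT would also search over models.
    ("model", LogisticRegression(max_iter=2000)),
])

# Parameter optimization: a small grid over two pipeline parameters.
param_grid = {
    "construct__n_components": [20, 40],
    "model__C": [0.1, 1.0],
}
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X_train, y_train)

test_accuracy = search.score(X_test, y_test)
print(search.best_params_, round(test_accuracy, 3))
```

Every step and grid value here was picked manually; TPOT's contribution is to explore these choices for us.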
Like most AutoML libraries, TPOT can take hours to days to produce good quality results.
We can install TPOT using the pip command.
pip install tpot
Figure: TPOT Pipeline
Source: http://epistasislab.github.io/tpot/
Here is a simple illustration of using TPOT to classify digits in the sklearn digits dataset.
from tpot import TPOTClassifier
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
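# Load the digits dataset and hold out 25% of it for testing
digits = load_digits()
train_data, test_data, train_labels, test_labels = train_test_split(digits.data, digits.target, train_size=0.75, test_size=0.25)
# Evolve pipelines for 5 generations of 50 candidates each, using all CPU cores
classifier = TPOTClassifier(generations=5, population_size=50, verbosity=2, n_jobs=-1)
classifier.fit(train_data, train_labels)
# Print the accuracy of the best pipeline on the held-out data
print(classifier.score(test_data, test_labels))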