What is Supervised Learning?
Supervised learning is a machine learning technique in which a model learns to map inputs to known outputs from labeled examples, so that it can predict the output for new, unseen inputs.
Some of the commonly used supervised learning algorithms are:
- Linear Regression
- KNN (K-Nearest Neighbors)
- Logistic Regression
- SVM (Support Vector Machine)
- Decision Trees
- Random Forest
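To make the idea of mapping inputs to labeled outputs concrete, here is a minimal sketch of one algorithm from the list, KNN, written with only the Python standard library. The function name `knn_predict`, the toy height/weight data, and the labels are assumptions made up for illustration; a real project would more likely use a library implementation.

```python
from collections import Counter
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label,
    # then sort so the closest points come first.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    # Keep the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy labeled dataset (hypothetical values): [height_cm, weight_kg] -> label
train_X = [[150, 50], [152, 53], [155, 55], [180, 85], [183, 90], [178, 82]]
train_y = ["small", "small", "small", "large", "large", "large"]

prediction = knn_predict(train_X, train_y, [151, 52])
print(prediction)
```

Note that the prediction is always one of the labels present in `train_y`; the model has no way to return a label it was never shown.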
Drawbacks of Supervised Learning Algorithms
- High Computational Time: Training is computationally expensive, and training on large datasets can take a long time.
- Data Preprocessing: The data usually has to be preprocessed (for example, handling missing values and encoding categorical features) before supervised learning algorithms can be applied.
- Prone to Overfitting: Supervised learning algorithms can easily overfit (i.e., fit the training data too closely and generalize poorly to new data) if they are not applied carefully.
- Labeled Data Required: We need properly labeled data (i.e., a known output for every input row) to apply supervised learning.
- Cannot Give New Information: With unsupervised learning, we can discover patterns that were previously unknown to us; supervised learning only learns the relationship between the inputs and the labels we provide.
- Limited Output: In classification, the output is limited to the labels already present in the target column. The model can never produce a new label; its prediction will always be one of the labels it saw during training.
- Requires Balanced Dataset: For accurate predictions, supervised learning algorithms should generally be trained on balanced datasets; otherwise, the model can become biased toward the label with more occurrences.
- Problem with Big Datasets: Supervised learning works best with balanced datasets (i.e., nearly the same number of rows for each label), and balancing and preprocessing a big dataset can be a bigger challenge than making the predictions themselves.
- Limited Performance: Supervised learning algorithms are trained to reproduce the labels in the training dataset; hence their performance is limited by the quality of that data and its labels.
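The balanced-dataset drawback listed above can be illustrated with a small sketch. It assumes a hypothetical dataset with 95 "ok" rows and 5 "fraud" rows, and a trivial stand-in "model" that always predicts the most frequent label, which is roughly what a classifier biased toward the majority class ends up doing. The label names and counts are made up for illustration.

```python
from collections import Counter

# Hypothetical imbalanced labels: 95 "ok" rows and 5 "fraud" rows.
labels = ["ok"] * 95 + ["fraud"] * 5

# A trivial stand-in "model" that always predicts the most frequent training label.
majority_label = Counter(labels).most_common(1)[0][0]
predictions = [majority_label] * len(labels)

# Accuracy: fraction of rows predicted correctly.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)

# Recall on the minority class: fraction of true "fraud" rows that were caught.
fraud_total = sum(y == "fraud" for y in labels)
fraud_caught = sum(
    p == "fraud" and y == "fraud" for p, y in zip(predictions, labels)
)
fraud_recall = fraud_caught / fraud_total

print(f"accuracy={accuracy:.2f}, fraud recall={fraud_recall:.2f}")
```

The accuracy looks high even though not a single "fraud" row is detected, which is why accuracy alone can be misleading on imbalanced data.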