Table of contents
1. Introduction
2. What is LightGBM?
3. Leaf-wise vs Level-wise
4. LightGBM Architecture
5. GOSS (Gradient-Based One-Side Sampling)
6. EFB (Exclusive Feature Bundling)
7. Applications of LightGBM
8. Advantages of LightGBM
9. FAQs
10. Key takeaways
Last Updated: Mar 27, 2024

LightGBM

Author Tashmit

Introduction

Supervised learning is a vast topic to study; it involves many algorithms for solving classification and regression problems. One such algorithm is the Decision Tree. A Decision Tree uses a graph-like decision model: each internal node represents a test on an attribute, each branch represents an outcome of that test, and each leaf node represents a class label. LightGBM is a boosting framework built on decision trees, designed to increase the model's efficiency and reduce memory usage. To get a deeper understanding of decision trees, visit this article.

What is LightGBM?

LightGBM is short for Light Gradient Boosting Machine. It is an efficient, distributed boosting framework that uses tree-based learning. Unlike a random forest, where trees are built independently and averaged, the trees in LightGBM are built sequentially in the gradient boosting framework, each new tree correcting the errors of the ones before it. LightGBM grows each tree leaf-wise: rather than expanding a whole level at a time, it repeatedly splits the single leaf that most reduces the loss.
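To make the gradient-boosting idea concrete, here is a hypothetical toy sketch in plain Python: one-node "stumps" are fitted, round after round, to the residuals (the negative gradients of squared loss) on 1-D data. The function names and data are invented for illustration; this is the general framework, not LightGBM's actual implementation.

```python
# Toy gradient boosting with decision stumps (illustration only).

def fit_stump(x, residuals):
    """Find the threshold split minimizing squared error of the residuals."""
    best = None
    for t in sorted(set(x)):
        left = [r for xi, r in zip(x, residuals) if xi <= t]
        right = [r for xi, r in zip(x, residuals) if xi > t]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best[1:]  # (threshold, left_value, right_value)

def boost(x, y, n_rounds=20, lr=0.5):
    """Each round fits a stump to the current residuals and adds it in."""
    pred = [0.0] * len(y)
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        t, lv, rv = fit_stump(x, residuals)
        pred = [p + lr * (lv if xi <= t else rv) for p, xi in zip(pred, x)]
    return pred

x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
pred = boost(x, y)  # predictions converge toward y as rounds accumulate
```

Real LightGBM layers many refinements on top of this loop, including histogram-based split finding, leaf-wise growth, GOSS, and EFB.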

Leaf-wise vs Level-wise

There are two ways to grow the trees; the first is level-wise, and the other is leaf-wise. In the level-wise method, the tree grows level by level: every node on the current level is split before the tree moves deeper, so nodes closer to the root are prioritized.

In the leaf-wise method, the tree grows by splitting the leaf with the largest loss reduction, wherever in the tree that leaf happens to be.

The primary difference between the two is that a level-wise tree stays balanced, whereas a leaf-wise tree can become unbalanced.
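The difference can be seen in a small hypothetical simulation (the `grow_leaf_wise` helper and the gain numbers are invented for illustration, not LightGBM code): leaf-wise growth always splits whichever leaf promises the largest gain, so when one side of each split keeps most of the gain, the tree grows lopsided.

```python
import heapq

def grow_leaf_wise(root_gain, child_gains, max_leaves):
    """Repeatedly split the leaf with the largest gain; return leaf depths."""
    heap = [(-root_gain, 0)]  # max-heap of leaves as (negated gain, depth)
    n_leaves = 1
    while n_leaves < max_leaves:
        neg_gain, depth = heapq.heappop(heap)      # best leaf anywhere in tree
        g_left, g_right = child_gains(-neg_gain)   # split it into two leaves
        heapq.heappush(heap, (-g_left, depth + 1))
        heapq.heappush(heap, (-g_right, depth + 1))
        n_leaves += 1
    return sorted(depth for _, depth in heap)

# One child of each split keeps 90% of the gain, so growth chases that side:
depths = grow_leaf_wise(1.0, lambda g: (0.9 * g, 0.1 * g), max_leaves=5)
# depths == [1, 2, 3, 4, 4]: unbalanced, whereas a level-wise tree with
# 5 leaves would be at most 3 levels deep.
```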


LightGBM Architecture

LightGBM grows the tree with the leaf-wise technique: the leaf with the largest loss reduction is chosen to grow. For the same number of leaves, the leaf-wise algorithm achieves a lower loss than level-wise growth. However, leaf-wise growth increases the model's complexity and can lead to overfitting on small datasets.

The LightGBM algorithm aims to reduce this complexity with the help of two techniques, namely Gradient-Based One-Side Sampling and Exclusive Feature Bundling.
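For reference, these are real LightGBM parameter names commonly used to keep leaf-wise complexity in check; the values below are plausible starting points for a small dataset, not tuned recommendations.

```python
# Illustrative values only; tune for your data.
params = {
    "objective": "regression",
    "num_leaves": 31,        # the main cap on complexity under leaf-wise growth
    "max_depth": 7,          # hard depth limit; keep num_leaves < 2 ** max_depth
    "min_data_in_leaf": 20,  # refuse splits that leave too few samples in a leaf
    "learning_rate": 0.05,   # smaller steps generalize better with more trees
}
```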

GOSS (Gradient-Based One-Side Sampling)

Gradient-Based One-Side Sampling, as the name suggests, is a technique to downsample instances based on their gradients. An instance with a large gradient is undertrained, while one with a small gradient is already well trained. GOSS therefore keeps all the large-gradient instances and performs random sampling only on the instances with small gradients, reweighting the sampled ones so the data distribution is roughly preserved.
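The sampling step can be sketched in plain Python. The `goss_sample` helper and its details are an invented simplification for illustration; LightGBM's internal implementation differs.

```python
import random

def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Keep all large-gradient instances; sample and up-weight the rest.

    Sampled small-gradient instances are re-weighted by
    (1 - top_rate) / other_rate so the data distribution stays roughly intact.
    """
    n = len(gradients)
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    top = order[:int(top_rate * n)]   # undertrained instances: always kept
    rest = order[int(top_rate * n):]  # well-trained instances: sampled
    sampled = random.Random(seed).sample(rest, int(other_rate * n))
    weight = (1 - top_rate) / other_rate
    return [(i, 1.0) for i in top] + [(i, weight) for i in sampled]

grads = [0.01 * i for i in range(100)]  # toy gradients
kept = goss_sample(grads)               # 20 large-gradient + 10 sampled rows
```

In the library itself, GOSS is enabled through a parameter (e.g. `boosting="goss"`) rather than written by hand.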

EFB (Exclusive Feature Bundling)

The LightGBM tree is built from the data and its features. To speed up tree learning, LightGBM reduces the number of features with the help of Exclusive Feature Bundling. It picks out mutually exclusive features (features that are rarely nonzero at the same time, such as one-hot encoded columns) and bundles them into a single feature to minimize complexity with little loss of information.
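The bundling idea can be shown with a small hypothetical sketch (the `bundle_exclusive` helper is invented for illustration; LightGBM performs this internally over histogram bins): two columns that are never nonzero on the same row are merged into one by offsetting one of them.

```python
def bundle_exclusive(col_a, col_b, offset):
    """Merge two mutually exclusive columns into one.

    col_b's nonzero values are shifted by `offset` so the two value ranges
    cannot collide and the original feature is recoverable from the merge.
    """
    assert all(a == 0 or b == 0 for a, b in zip(col_a, col_b)), \
        "features must be mutually exclusive to bundle"
    return [a if a != 0 else (b + offset if b != 0 else 0)
            for a, b in zip(col_a, col_b)]

# Two one-hot-style columns that never fire on the same row:
f1 = [1, 0, 0, 2, 0]
f2 = [0, 3, 0, 0, 1]
merged = bundle_exclusive(f1, f2, offset=10)
# merged == [1, 13, 0, 2, 11]; values <= 2 came from f1, values > 10 from f2
```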


Applications of LightGBM

  • It can be used for classification and regression problems
  • Cross-entropy tasks, using the log loss objective function
  • LambdaRank, a method in which a ranking problem is transformed into a classification or regression problem

Advantages of LightGBM

  • Fast training speed and higher efficiency
  • Often better accuracy than other boosting algorithms
  • Requires less memory
  • Compatible with large datasets

FAQs

  1. How is LightGBM so fast?
    LightGBM achieves its speed by downsampling the training instances with GOSS and the features with Exclusive Feature Bundling, both of which speed up tree learning.
     
  2. What is N_estimators LightGBM?
    n_estimators is the number of boosted trees to fit.
     
  3. Is LightGBM better than XGBoost?
    LightGBM is often substantially faster than XGBoost (speedups of up to roughly 7x have been reported), making it a strong choice for large datasets in limited-time competitions; which one achieves better accuracy depends on the problem and tuning.

Key takeaways

LightGBM is a practical boosting framework that uses leaf-wise tree growth. This article gave an in-depth understanding of LightGBM, its architecture, the two techniques it uses to reduce complexity, its applications, and its advantages. Want to build a career in Data Science? Check out our industry-oriented machine learning course curated by our faculty from Stanford University and industry experts.
