Gradient boosting machines, a tutorial


Abstract

Gradient boosting machines are a family of powerful machine-learning techniques that have shown considerable success in a wide range of practical applications. They are highly customizable to the particular needs of the application, for example by being learned with respect to different loss functions. This article gives a tutorial introduction to the methodology of gradient boosting methods, with a strong focus on the machine-learning aspects of modeling. The theory is complemented with descriptive examples and illustrations that cover all stages of gradient boosting model design. Considerations for handling model complexity are discussed. Three practical examples of gradient boosting applications are presented and comprehensively analyzed. © 2013 Natekin and Knoll.
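The abstract's central point, that the same boosting procedure can be learned with respect to different loss functions, can be sketched in a few lines. This is an illustrative sketch, not code from the article; the stump base-learner and all function names are my own:

```python
import numpy as np

def fit_stump(x, r):
    """Least-squares regression stump (single split) on 1-d inputs."""
    best_sse, best_split = np.inf, None
    for t in np.unique(x)[:-1]:          # skip last value so both sides are non-empty
        left, right = r[x <= t], r[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best_sse:
            best_sse, best_split = sse, (t, left.mean(), right.mean())
    return best_split

def predict_stump(split, x):
    t, left_val, right_val = split
    return np.where(x <= t, left_val, right_val)

def fit_gbm(x, y, n_boosts=50, learning_rate=0.1):
    """Gradient boosting with L2 loss: each base-learner is fit to the
    negative gradient of the loss, which for L2 is just the residuals."""
    f0 = y.mean()                        # initial constant model
    pred = np.full(len(y), f0)
    stumps = []
    for _ in range(n_boosts):
        residuals = y - pred             # negative gradient of the L2 loss
        split = fit_stump(x, residuals)
        pred += learning_rate * predict_stump(split, x)
        stumps.append(split)
    return f0, learning_rate, stumps

def predict_gbm(model, x):
    f0, lr, stumps = model
    pred = np.full(len(x), f0)
    for split in stumps:
        pred += lr * predict_stump(split, x)
    return pred
```

Swapping the `residuals = y - pred` line for another loss's negative gradient turns this same loop into L1, Huber, or quantile boosting without touching anything else, which is the customizability the abstract refers to.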

Figures

  • FIGURE 1 | Continuous loss functions: (A) L2 squared loss function; (B) L1 absolute loss function; (C) Huber loss function; (D) Quantile loss function. Demonstration of fitting a smooth GBM to noisy sinc(x) data: (E) original sinc(x) function; (F) smooth GBM fitted with L2 and L1 loss; (G) smooth GBM fitted with Huber loss with δ = {4,2,1}; (H) smooth GBM fitted with Quantile loss with α = {0.5,0.1,0.9}.
  • FIGURE 2 | (A) Bernoulli loss function. (B) Adaboost loss function. (C) GBM 2d classification with Bernoulli loss. (D) GBM 2d classification with Adaboost loss.
  • FIGURE 3 | P-Spline GBM model for different numbers of boosts: (A) M = 1; (B) M = 10; (C) M = 50; (D) M = 100. Decision-tree based GBM model for different numbers of boosts: (E) M = 1; (F) M = 10; (G) M = 50; (H) M = 100.
  • FIGURE 4 | Examples of overfitting in GBMs on: (A) regression task; (B) classification task. Demonstration of fitting a decision-tree GBM to noisy sinc(x) data: (C) M = 100, λ = 1; (D) M = 1000, λ = 1; (E) M = 100, λ = 0.1; (F) M = 1000, λ = 0.1.
  • FIGURE 5 | Error curves for GBM fitting on sinc(x) data: (A) training set error; (B) validation set error. Error curves for learning simulations and number of base-learners M estimation: (C) error curves for cross-validation; (D) error curves for bootstrap estimates.
  • FIGURE 6 | EMG processing: (A) raw EMG data; (B) absolute EMG sensor values; (C) chunked EMG data; (D) moving average smoothed data.
  • FIGURE 7 | Bootstrap estimates of M for the EMG robotic control data. (A) Held-out error for linear GBMs; (B) held-out error for spline GBMs; (C) held-out error for stump-based GBMs; (D) held-out error for tree-based GBMs with interaction depth d = 4. (E) Sample prediction of the additive GBMs for the EMG robotic control data; (F) sample prediction of the
  • Table 1 | Machine learning algorithm accuracy.
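From the boosting loop's perspective, the loss functions shown in Figures 1 and 2 differ only in their negative gradients, the "pseudo-residuals" each base-learner is fit to. A hedged sketch of those gradients for the four regression losses in Figure 1 (the function names are mine, not the article's):

```python
import numpy as np

def neg_grad_l2(y, f):
    """Squared (L2) loss: pseudo-residuals are the plain residuals."""
    return y - f

def neg_grad_l1(y, f):
    """Absolute (L1) loss: only the sign of the residual matters."""
    return np.sign(y - f)

def neg_grad_huber(y, f, delta=1.0):
    """Huber loss: L2-like for small residuals, L1-like beyond delta."""
    r = y - f
    return np.where(np.abs(r) <= delta, r, delta * np.sign(r))

def neg_grad_quantile(y, f, alpha=0.5):
    """Quantile (pinball) loss: asymmetric weights drive f toward the
    alpha-quantile of y, as in Figure 1(H)."""
    return np.where(y >= f, alpha, alpha - 1.0)
```

With α = 0.5 the quantile gradient reduces to a scaled L1 gradient (the median), while α = 0.1 and α = 0.9 produce the lower and upper conditional quantile fits shown in panel (H).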


Citation (APA)
Natekin, A., & Knoll, A. (2013). Gradient boosting machines, a tutorial. Frontiers in Neurorobotics, 7(DEC). https://doi.org/10.3389/fnbot.2013.00021
