An evaluation of eight machine learning regression algorithms for forest aboveground biomass estimation from multiple satellite data products

Abstract

This study provided a comprehensive evaluation of eight machine learning regression algorithms for forest aboveground biomass (AGB) estimation from satellite data, based on leaf area index, canopy height, net primary production, and tree cover data, as well as climatic and topographical data. Some of these algorithms, such as extremely randomized trees, stochastic gradient boosting, and categorical boosting (CatBoost) regression, have not been commonly used for forest AGB estimation. For each algorithm, the hyperparameters were optimized using grid search with cross-validation, the optimal AGB model was developed using the training dataset (80%), and AGB was predicted on the test dataset (20%). Performance metrics, feature importance, and overestimation and underestimation were used as indicators for evaluating the performance of an algorithm. To reduce the impact of the random training-test data split and of the sampling method on performance, these procedures were repeated 50 times for each algorithm under the random sampling, stratified sampling, and separate modeling scenarios. The results showed that the five tree-based ensemble algorithms performed better than the three nonensemble algorithms (multivariate adaptive regression splines, support vector regression, and multilayer perceptron), and that the CatBoost algorithm outperformed the other algorithms for AGB estimation. Compared with the random sampling scenario, the stratified sampling and separate modeling scenarios did not significantly improve the AGB estimates, but modeling AGB for each forest type separately provided stable results in terms of the contributions of the predictor variables to the AGB estimates. With all the algorithms, forest AGB was underestimated when AGB values were larger than 210 Mg/ha and overestimated when AGB values were less than 120 Mg/ha. This study highlighted the capability of ensemble algorithms to improve AGB estimates and the necessity of improving AGB estimates for high and low AGB levels in future studies.
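The evaluation protocol described in the abstract (grid search with cross-validation on an 80% training split, prediction on the 20% test split, 50 repetitions, and a check for over- and underestimation at low and high AGB) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: a one-predictor ridge regression stands in for the eight algorithms, the data are synthetic, the five-fold setting and the penalty grid are arbitrary choices, and all function and variable names are hypothetical. Only the 80/20 split, the 50 repetitions, and the 120 and 210 Mg/ha thresholds come from the abstract.

```python
import random
import statistics

def fit_ridge_1d(xs, ys, alpha):
    """Fit y = a + b * x by least squares with an L2 penalty alpha on the slope."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / (sxx + alpha)
    return my - slope * mx, slope

def rmse(model, xs, ys):
    a, b = model
    return (sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(ys)) ** 0.5

def grid_search_cv(xs, ys, alphas, k=5):
    """Return the alpha with the lowest mean validation RMSE over k folds."""
    def cv_score(alpha):
        scores = []
        for fold in range(k):
            tr = [i for i in range(len(xs)) if i % k != fold]
            va = [i for i in range(len(xs)) if i % k == fold]
            model = fit_ridge_1d([xs[i] for i in tr], [ys[i] for i in tr], alpha)
            scores.append(rmse(model, [xs[i] for i in va], [ys[i] for i in va]))
        return statistics.fmean(scores)
    return min(alphas, key=cv_score)

def mean_bias_by_level(model, xs, ys, low=120.0, high=210.0):
    """Mean (predicted - observed) AGB for observed AGB < low, in between, > high."""
    a, b = model
    errors = {"low": [], "mid": [], "high": []}
    for x, y in zip(xs, ys):
        level = "low" if y < low else "high" if y > high else "mid"
        errors[level].append(a + b * x - y)
    return {lvl: statistics.fmean(e) if e else None for lvl, e in errors.items()}

def repeated_evaluation(xs, ys, alphas, repeats=50, test_fraction=0.2, seed=0):
    """Repeat the random train-test split, tuning, fitting, and scoring."""
    rng = random.Random(seed)
    n_test = round(len(xs) * test_fraction)
    results = []
    for _ in range(repeats):
        idx = list(range(len(xs)))
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        xtr, ytr = [xs[i] for i in train], [ys[i] for i in train]
        xte, yte = [xs[i] for i in test], [ys[i] for i in test]
        alpha = grid_search_cv(xtr, ytr, alphas)   # tune on training data only
        model = fit_ridge_1d(xtr, ytr, alpha)      # refit on the full training split
        results.append({"alpha": alpha,
                        "rmse": rmse(model, xte, yte),
                        "bias": mean_bias_by_level(model, xte, yte)})
    return results

# Synthetic stand-in data (assumption): one canopy-height-like predictor, AGB in Mg/ha.
data_rng = random.Random(42)
heights = [data_rng.uniform(5.0, 30.0) for _ in range(200)]
agb = [max(1.0, 10.0 * h + data_rng.gauss(0.0, 25.0)) for h in heights]

ALPHA_GRID = [0.0, 10.0, 1000.0]  # hypothetical hyperparameter grid
results = repeated_evaluation(heights, agb, ALPHA_GRID)

print("mean test RMSE:", round(statistics.fmean(r["rmse"] for r in results), 1))
for level in ("low", "mid", "high"):
    values = [r["bias"][level] for r in results if r["bias"][level] is not None]
    print(level, "mean bias:", round(statistics.fmean(values), 1) if values else "n/a")
```

In this sketch the bias is grouped by observed AGB, so a positive mean in the "low" group would indicate overestimation and a negative mean in the "high" group would indicate underestimation. Swapping in a real regressor and a stratified split would follow the same loop structure.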

Zhang, Y., Ma, J., Liang, S., Li, X., & Li, M. (2020). An evaluation of eight machine learning regression algorithms for forest aboveground biomass estimation from multiple satellite data products. Remote Sensing, 12(24), 1–26. https://doi.org/10.3390/rs12244015
