GS-BART: Graph split additive decision trees for classification and nonparametric regression of spatial and network data

February 23
Speaker(s): Huiyan Sang, Professor of Statistics, Texas A&M University
Ensemble decision tree methods such as XGBoost, random forests (RF), and BART have gained enormous popularity in data science for their superior performance in machine learning regression and classification tasks. In this paper, we develop a new Bayesian graph-split-based additive decision tree method, called GS-BART, to improve the performance of Bayesian additive decision trees for complex dependent data with graph relations. The new method adopts a highly flexible split rule that complies with the graph structure, relaxing the axis-parallel split rule assumed in most existing ensemble decision tree models. We design a scalable informed MCMC algorithm that leverages a gradient-based recursive algorithm on spanning trees or chains to sample the graph-split-based decision trees. The superior performance of the method over conventional ensemble tree models and Gaussian process models is illustrated in various regression and classification tasks for spatial and network data analysis.
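To make the contrast with axis-parallel splits concrete, here is a minimal illustrative sketch (not the authors' code, and the spanning tree below is hypothetical): a graph-based split can be realized by deleting one edge of a spanning tree over the observation graph, so each child of the split is a connected subgraph, whereas an axis-parallel split simply thresholds a single coordinate and ignores the graph entirely.

```python
# Hypothetical sketch: splitting a spanning tree into two connected
# components by removing one edge. This mimics the idea of a graph
# split complying with graph structure; it is not the GS-BART code.
from collections import deque


def connected_component(adj, start, blocked_edge):
    """Nodes reachable from `start` without crossing `blocked_edge`."""
    u, v = blocked_edge
    seen, queue = {start}, deque([start])
    while queue:
        a = queue.popleft()
        for b in adj[a]:
            if {a, b} == {u, v} or b in seen:
                continue  # skip the removed edge and visited nodes
            seen.add(b)
            queue.append(b)
    return seen


# A toy spanning tree over 6 spatial locations (adjacency list).
tree = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1, 4], 4: [3, 5], 5: [4]}

# Graph split: deleting edge (1, 3) yields two connected blocks,
# unlike an axis-parallel split, which need not respect adjacency.
left = connected_component(tree, 0, (1, 3))
right = connected_component(tree, 3, (1, 3))
print(sorted(left), sorted(right))  # -> [0, 1, 2] [3, 4, 5]
```

Each candidate split of this form partitions the data into spatially contiguous regions, which is the structural constraint axis-parallel rules cannot express on network data.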

Statistical Science