Theory and Practice for Bayesian Regression Tree Ensembles

Antonio Linero, FSU

Friday, January 11, 2019 - 3:30pm

Ensembles of decision trees have become a standard component of the data analyst's toolkit; commonly used algorithms include random forests and boosted decision trees. In this talk, we investigate the properties of regression tree ensembles from a Bayesian standpoint. We focus on the interplay between theory and practice to study the properties of ensembles and obtain insights into (a) why decision tree ensembles are successful in practice and (b) where they might be improved. We provide validation for the long-held hypothesis that BART ensembles perform well due to their ability to detect low-order interactions, a property that describes many real-world settings. Further, we identify two areas in which BART ensembles can be expected to be suboptimal: under sparsity, and when the underlying regression function exhibits higher-order smoothness. We give theoretical support for these insights by establishing posterior contraction at near-optimal rates adaptively across a large family of function spaces, and provide empirical support by applying our methodology to benchmark datasets. We conclude by presenting extensions of our methodology that account for other interesting structures beyond sparsity and smoothness, and discuss how the insights we obtain can be extended to non-Bayesian decision tree ensembling methods.

Seminars generally take place in 116 Old Chemistry Building on Fridays from 3:30 - 4:30 pm. For additional information contact: karen.whitesell@duke.edu or phone 919-684-8029. Sorry, but we do not have reprints available. Please feel free to contact the authors by email for follow-up information, articles, etc. Reception following seminar in 203B Old Chemistry.

Old Chemistry 116
