Improved Feature Importance Computations for Tree Models: Shapley vs. Banzhaf
The Shapley value is one of the main tools used to explain predictions of tree ensemble models. The main alternative to the Shapley value is the Banzhaf value but the difference between both has not been understood well enough. In this paper we make a step towards filling this gap, providing both experimental and theoretical comparison of these model explanation methods. Surprisingly, we show that the Banzhaf value offers several advantages over the Shapley value while providing essentially the same explanations. We verify that the Banzhaf value for tree ensamble models:
* has a more intuitive interpretation,
* allows for more efficient algorithms,
* is much more numerically robust.
We provide an experimental evaluation of these theses. In particular, we show that on real world instances.
Additionally, from a theoretical perspective, we provide a new and improved algorithm for computing the same Shapley value-based explanations as the algorithm of Lundberg et al. [Nat. Mach. Intell. 2020]. Our algorithm runs in O(TLD+n) time, whereas the previous algorithm had O(TLD^2+n) running time bound. Here, T is the number of trees, L is the maximum number of leaves in a tree, and $D$ denotes the maximum depth of a tree in the ensemble. Using the computational techniques developed for the Shapley value, we deliver an optimal O(TL+n) time algorithm for computing the Banzhaf value-based explanations. In our experiments, these algorithms give running times smaller even by order of magnitude.
Bio
Tomasz P. Michalak is a lecturer at the Institute of Informatics, Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, and a Research Team Leader at the IDEAS NCBR research institute, Warsaw, Poland. He worked at the Department of Computer Science, University of Oxford, the School of Engineering and Computer Science, University of Southampton, the Department of Computer Science, University of Liverpool, and the Faculty of Applied Economics, University of Antwerp. He graduated from the Faculty of Economic Sciences, University of Warsaw, and holds a Ph.D. in economics from the Faculty of Applied Economics, University of Antwerp, Belgium. He has been a PI of two national and one international research projects. His results have been published by prestigious conferences and journals such as AAAI, IJCAI, IEEE ICDM, ACM EC, Nature Human Behaviour, Journal of Artificial Intelligence, and Games and Economic Behaviour.
Tomasz Michalak’s research interest focus on Artificial Intelligence, Social Networks, Computational Social Science, Multi-Agent Systems, Game Theory, Fintech, E-Commerce. His specialty is the interface of Computer Science, Game Theory, and Economics, an area often called Algorithmic Game Theory. Recently, he has been especially interested in applications of cooperative and non-cooperative game theory to networks, social networks, machine learning, and blockchain.