Posts by Collection

Publications

Decision Trees That Remember: Gradient-Based Learning of Recurrent Decision Trees with Memory

Published in the New Frontiers in Associative Memory workshop at ICLR 2025

We propose ReMeDe trees, a recurrent decision tree architecture with internal memory, enabling efficient learning for sequential data through hard, axis-aligned decision rules trained via gradient descent.

Recommended citation: Marton, Sascha, et al. (2025). "Decision Trees That Remember: Gradient-Based Learning of Recurrent Decision Trees with Memory." New Frontiers in Associative Memory workshop at ICLR 2025. 1(1).
Download Paper

Mitigating Information Loss in Tree-Based Reinforcement Learning via Direct Optimization

Published in International Conference on Learning Representations (Spotlight), 2025

We propose a novel method for symbolic RL that enables end-to-end gradient-based learning of interpretable, axis-aligned decision trees, combining policy gradient optimization with symbolic decision-making.

Recommended citation: Marton, Sascha, et al. (2025). "Mitigating Information Loss in Tree-Based Reinforcement Learning via Direct Optimization." The Thirteenth International Conference on Learning Representations. 1(1).
Download Paper | Download Slides

Talks

Explaining neural networks without access to training data

Published:

We consider generating explanations for neural networks in cases where the network’s training data is not accessible, for instance due to privacy or safety issues. Recently, Interpretation Nets (I-Nets) have been proposed as a sample-free approach to post-hoc, global model interpretability that does not require access to training data. They formulate interpretation as a machine learning task that maps network representations (parameters) to a representation of an interpretable function. In this paper, we extend the I-Net framework to the cases of standard and soft decision trees as surrogate models. We propose a suitable decision tree representation and design of the corresponding I-Net output layers. Furthermore, we make I-Nets applicable to real-world tasks by considering more realistic distributions when generating the I-Net’s training data. We empirically evaluate our approach against traditional global, post-hoc interpretability approaches and show that it achieves superior results when the training data is not accessible.
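The core idea of interpretation as a learning task — mapping a network's flattened parameters to the parameters of a surrogate decision tree — can be sketched as follows. This is a hypothetical, untrained toy in NumPy; all dimensions, layer shapes, and the tree encoding are illustrative assumptions, not the I-Net architecture from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a target network with 50 parameters and a
# depth-2 surrogate tree over 4 input features.
n_params = 50
depth = 2
n_internal = 2**depth - 1        # 3 split nodes
n_leaves = 2**depth              # 4 leaves
n_features = 4

# The I-Net itself: here just a one-hidden-layer MLP with random
# (untrained) weights, purely to show the input/output mapping.
W1 = rng.normal(size=(n_params, 64))
W2 = rng.normal(size=(64, n_internal * (n_features + 1) + n_leaves))

def inet_forward(theta):
    """Map the target network's flattened parameters `theta`
    to an encoding of a decision tree."""
    h = np.tanh(theta @ W1)
    out = h @ W2
    # Per split node: one logit per feature (argmax picks the split
    # feature) plus one threshold; the remainder encodes leaf values.
    splits = out[: n_internal * (n_features + 1)].reshape(n_internal, n_features + 1)
    leaves = out[n_internal * (n_features + 1):]
    feature_idx = splits[:, :n_features].argmax(axis=1)
    thresholds = splits[:, n_features]
    return feature_idx, thresholds, leaves

theta = rng.normal(size=n_params)          # stand-in for a trained network's weights
feat, thr, leaf_values = inet_forward(theta)
```

In the actual approach, the I-Net is trained on many (network parameters, tree) pairs generated from synthetic function distributions, so that at inference time a single forward pass yields a surrogate tree without any access to the original training data.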

GRANDE: Gradient-Based Decision Tree Ensembles for Tabular Data

Published:

Despite the success of deep learning for text and image data, tree-based ensemble models remain state-of-the-art for machine learning on heterogeneous tabular data. At the same time, the high flexibility of gradient-based methods creates a significant need for tabular-specific variants. In this paper, we propose GRANDE, GRAdieNt-based Decision tree Ensembles, a novel approach for learning hard, axis-aligned decision tree ensembles using end-to-end gradient descent. GRANDE is based on a dense representation of tree ensembles, which makes it possible to use backpropagation with a straight-through operator to jointly optimize all model parameters. Our method combines axis-aligned splits, a useful inductive bias for tabular data, with the flexibility of gradient-based optimization. Furthermore, we introduce an advanced instance-wise weighting that facilitates learning representations for both simple and complex relations within a single model. We conducted an extensive evaluation on a predefined benchmark with 19 classification datasets and demonstrate that our method outperforms existing gradient-boosting and deep learning frameworks on most datasets. The method is available at: https://github.com/s-marton/GRANDE
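The straight-through idea behind hard, gradient-trained splits can be sketched in a few lines. This is a minimal NumPy illustration, not GRANDE's actual implementation: the sigmoid relaxation, the temperature, and the function names are assumptions. The forward pass uses a hard 0/1 decision; in an autograd framework the gradient would flow through the soft relaxation instead:

```python
import numpy as np

def soft_split(x, threshold, temperature=1.0):
    """Differentiable relaxation of an axis-aligned split (sigmoid)."""
    return 1.0 / (1.0 + np.exp(-(x - threshold) / temperature))

def hard_split_st(x, threshold):
    """Straight-through hard split.

    Forward: hard 0/1 routing decision.
    Backward (in an autograd framework): the gradient of the soft
    relaxation, obtained via `hard + soft - stop_gradient(soft)`,
    e.g. `hard - soft.detach() + soft` in PyTorch-style code.
    Here we return both parts to make the trick visible.
    """
    soft = soft_split(x, threshold)
    hard = (soft >= 0.5).astype(float)
    return hard, soft

# Routing two samples on one feature with threshold 0.5:
hard, soft = hard_split_st(np.array([0.2, 0.8]), threshold=0.5)
```

The point of the trick is that the model's predictions are produced by genuinely hard, axis-aligned rules, while the parameters (thresholds, feature weights, leaf values) still receive informative gradients through the smooth surrogate.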