Published August 4, 2023
| Version v1
Publication
Variable importance for causal forests: breaking down the heterogeneity of treatment effects
Creators
Contributors
Others:
- Safran Tech
- Médecine de précision par intégration de données et inférence causale (PREMEDICAL) ; Centre Inria d'Université Côte d'Azur (CRISAM) ; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Institut Desbrest d'Epidémiologie et de Santé Publique (IDESP) ; Institut National de la Santé et de la Recherche Médicale (INSERM)-Université de Montpellier (UM)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Université de Montpellier (UM)
- Institut Desbrest d'Epidémiologie et de Santé Publique (IDESP) ; Institut National de la Santé et de la Recherche Médicale (INSERM)-Université de Montpellier (UM)
Description
Causal random forests provide efficient estimates of heterogeneous treatment effects. However, forest algorithms are also well-known for their black-box nature, and therefore, do not characterize how input variables are involved in treatment effect heterogeneity, which is a strong practical limitation. In this article, we develop a new importance variable algorithm for causal forests, to quantify the impact of each input on the heterogeneity of treatment effects. The proposed approach is inspired from the drop and relearn principle, widely used for regression problems. Importantly, we show how to handle the forest retrain without a confounding variable. If the confounder is not involved in the treatment effect heterogeneity, the local centering step enforces consistency of the importance measure. Otherwise, when a confounder also impacts heterogeneity, we introduce a corrective term in the retrained causal forest to recover consistency. Additionally, experiments on simulated, semi-synthetic, and real data show the good performance of our importance measure, which outperforms competitors on several test cases. Experiments also show that our approach can be efficiently extended to groups of variables, providing key insights in practice.
Additional details
Identifiers
- URL
- https://hal.science/hal-04177493
- URN
- urn:oai:HAL:hal-04177493v1
Origin repository
- Origin repository
- UNICA