Post hoc false positive control for structured hypotheses
- Others:
- Laboratoire de Probabilités, Statistique et Modélisation (LPSM (UMR_8001)) ; Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Cité (UPCité)
- Institut für Mathematik [Potsdam] ; University of Potsdam = Universität Potsdam
- Laboratoire de Mathématiques d'Orsay (LMO) ; Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)
- Centre National de la Recherche Scientifique (CNRS)
- Understanding the Shape of Data (DATASHAPE) ; Inria Sophia Antipolis - Méditerranée (CRISAM) ; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Inria Saclay - Ile de France ; Institut National de Recherche en Informatique et en Automatique (Inria)
- Institut de Mathématiques de Toulouse UMR5219 (IMT) ; Université Toulouse 1 Capitole (UT1) ; Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Institut National des Sciences Appliquées - Toulouse (INSA Toulouse) ; Institut National des Sciences Appliquées (INSA)-Université Fédérale Toulouse Midi-Pyrénées-Institut National des Sciences Appliquées (INSA)-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3) ; Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS)
- ANR-16-CE40-0019, SansSouci, Post hoc approaches for large-scale multiple testing (2016)
- ANR-17-CE40-0001, BASICS, Bayesian nonparametrics, uncertainty quantification and random structures (2017)
Description
In a high-dimensional multiple testing framework, we present new confidence bounds on the number of false positives contained in subsets S of selected null hypotheses. These bounds are post hoc in the sense that the coverage probability holds simultaneously over all S, possibly chosen depending on the data. This article focuses on the common case of structured null hypotheses, for example, along a tree, a hierarchy, or geometrically (spatially or temporally). Following recent advances in post hoc inference, we build confidence bounds for some pre-specified forest-structured subsets and deduce a bound for any subset S by interpolation. The proposed bounds are shown to substantially improve on previous ones when the signal is locally structured. Our findings are supported both by theoretical results and numerical experiments. Moreover, our bounds can be obtained by an algorithm (with complexity bilinear in the sizes of the reference hierarchy and of the selected subset) that is implemented in the open-source R package sansSouci, available from https://github.com/pneuvial/sanssouci, making our approach operational.
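The interpolation step can be illustrated in the simplest structured setting, where the reference subsets are pairwise disjoint regions rather than a full forest. The sketch below is written in Python for illustration only and is not the sansSouci implementation; the function name `post_hoc_bound` and the inputs `regions` and `zetas` are hypothetical. It assumes that, with the desired confidence, each region R_k contains at most zeta_k false positives simultaneously over all k. Under that assumption, any selected subset S contains at most the sum over k of min(zeta_k, |S ∩ R_k|), plus the number of elements of S lying outside every region.

```python
def post_hoc_bound(selected, regions, zetas):
    """Upper bound on the number of false positives in `selected`.

    Assumes (hypothetical inputs) that `regions` are pairwise disjoint sets
    of hypothesis indices and that region k holds at most zetas[k] false
    positives, jointly over all k, with the desired confidence.
    """
    selected = set(selected)
    covered = set()
    bound = 0
    for region, zeta in zip(regions, zetas):
        region = set(region)
        if covered & region:
            raise ValueError("regions must be pairwise disjoint")
        covered |= region
        # Inside a region, false positives are capped both by zeta and by
        # the number of selected hypotheses that fall in the region.
        bound += min(zeta, len(selected & region))
    # Selected hypotheses outside every region get the trivial bound.
    bound += len(selected - covered)
    return bound


# Toy example: two regions of five hypotheses each.
regions = [range(0, 5), range(5, 10)]
zetas = [1, 4]
selected = [0, 1, 2, 7, 12]
bound = post_hoc_bound(selected, regions, zetas)
```

Because the bound is a deterministic function of S given the region-wise guarantees, it remains valid for every S at once, including subsets chosen after looking at the data. The nested (forest) case described above additionally requires minimizing over which reference subsets to use, which this sketch does not attempt.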
Abstract
International audience
Additional details
- URL
- https://hal.archives-ouvertes.fr/hal-01829037
- URN
- urn:oai:HAL:hal-01829037v3
- Origin repository
- UNICA