False discovery proportion envelopes with m-consistency
- Creators
- Meah, Iqraa
- Blanchard, Gilles
- Roquain, Etienne
- Others:
- Laboratoire de Probabilités, Statistique et Modélisation (LPSM (UMR_8001)) ; Sorbonne Université (SU)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Cité (UPCité)
- Laboratoire de Mathématiques d'Orsay (LMO) ; Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)
- Understanding the Shape of Data (DATASHAPE) ; Inria Sophia Antipolis - Méditerranée (CRISAM) ; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Inria Saclay - Ile de France ; Institut National de Recherche en Informatique et en Automatique (Inria)
- ANR-21-CE23-0035,ASCAI,Segmentation, clustering, et seriation actifs et passifs: vers des fondations unifiées en IA(2021)
- ANR-19-CHIA-0021,BISCOTTE,Approches statistiquement et computationnellement efficaces pour l'intelligence artificielle(2019)
Description
We provide new nonasymptotic false discovery proportion (FDP) confidence envelopes in several multiple testing settings relevant for modern high-dimensional data methods. We revisit the multiple testing scenarios considered in the recent work of Katsevich and Ramdas (2020): top-k, preordered (including knockoffs), and online. Our emphasis is on obtaining FDP confidence bounds that both have nonasymptotic coverage and are asymptotically accurate in a specific sense, as the number m of tested hypotheses grows. Namely, we introduce and study the property (which we call m-consistency) that the confidence bound converges to or below the desired level α when applied to a specific reference α-level false discovery rate (FDR) controlling procedure. From this perspective, we derive new bounds that improve on existing ones, both theoretically and practically, and are suitable for situations where at least a moderate number of rejections is expected. These improvements are illustrated with numerical experiments and real data examples. In particular, the improvement is significant in the knockoffs setting, which shows the impact of the method in practical use. As side results, we introduce a new confidence envelope for the empirical cumulative distribution function of i.i.d. uniform variables and we provide new power results in sparse cases, both being of independent interest.
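To make the quantities in the description concrete, the following minimal sketch simulates p-values, applies the Benjamini-Hochberg (BH) step-up procedure at level α, and computes the realized FDP of the rejection set. It assumes BH as the reference α-level FDR controlling procedure for the top-k setting, which is a standard choice but is not stated in this record. It does not implement the confidence envelopes of the paper; the function names, the Gaussian simulation model, and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import numpy as np


def benjamini_hochberg(pvalues, alpha):
    """Return the indices rejected by the BH step-up procedure at level alpha."""
    p = np.asarray(pvalues, dtype=float)
    m = p.size
    order = np.argsort(p)                      # indices sorting p-values increasingly
    thresholds = alpha * np.arange(1, m + 1) / m
    below = np.nonzero(p[order] <= thresholds)[0]
    if below.size == 0:
        return np.array([], dtype=int)
    k = below[-1] + 1                          # largest k with p_(k) <= alpha * k / m
    return order[:k]


def false_discovery_proportion(rejected, is_null):
    """Fraction of rejected hypotheses that are true nulls (0 if nothing is rejected)."""
    is_null = np.asarray(is_null, dtype=bool)
    if len(rejected) == 0:
        return 0.0
    return float(np.mean(is_null[rejected]))


def simulate(m=2000, null_fraction=0.8, signal=3.0, alpha=0.1, seed=0):
    """Hypothetical Gaussian model: nulls are N(0, 1), alternatives are N(signal, 1)."""
    rng = np.random.default_rng(seed)
    is_null = rng.random(m) < null_fraction
    z = rng.standard_normal(m) + np.where(is_null, 0.0, signal)
    # One-sided p-values: P(N(0, 1) >= z) = 0.5 * erfc(z / sqrt(2)).
    pvalues = 0.5 * np.vectorize(math.erfc)(z / math.sqrt(2.0))
    rejected = benjamini_hochberg(pvalues, alpha)
    return len(rejected), false_discovery_proportion(rejected, is_null)


n_rejected, fdp = simulate()
print(f"rejections: {n_rejected}, realized FDP: {fdp:.3f}")
```

BH controls the FDR, that is, the expectation of the FDP, whereas a confidence envelope bounds the FDP itself with high probability. The m-consistency property described above asks that such a bound, evaluated on the rejection set of the reference procedure, converge to or below α as m grows.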
Additional details
- URL
- https://hal.science/hal-04727618
- URN
- urn:oai:HAL:hal-04727618v1
- Origin repository
- UNICA