High-dimensional variables clustering based on sub-asymptotic maxima of a weakly dependent random process
- Others:
- Littoral, Environment: MOdels and Numerics (LEMON) ; Inria Sophia Antipolis - Méditerranée (CRISAM) ; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Institut Montpelliérain Alexander Grothendieck (IMAG) ; Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Hydrosciences Montpellier (HSM) ; Institut de Recherche pour le Développement (IRD)-Institut national des sciences de l'Univers (INSU - CNRS)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Institut de Recherche pour le Développement (IRD)-Institut national des sciences de l'Univers (INSU - CNRS)-Centre National de la Recherche Scientifique (CNRS)
- Université Côte d'Azur (UCA)
- Laboratoire Jean Alexandre Dieudonné (JAD) ; Université Nice Sophia Antipolis (1965 - 2019) (UNS) ; COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UCA)
- IMAG
- Université de Montpellier (UM)
- Centre National de la Recherche Scientifique (CNRS)
Description
The dependence structure between extreme observations can be complex. To address this, we use clustering as a tool for learning the complex extremal dependence structure. We introduce the Asymptotic Independent block (AI-block) model, a model-based clustering in which population-level clusters are clearly defined through the independence of the clusters' maxima of a multivariate random process. This class of models is identifiable, which allows statistical inference. With a dedicated algorithm, we show that sample versions of the extremal correlation can be used to recover the clusters of variables without specifying the number of clusters. Our algorithm has a computational complexity that is polynomial in the dimension, and it is shown to be strongly consistent in growing dimensions where observations are drawn from a stationary mixing process. This implies that groups can be learned through completely nonparametric inference in the study of dependent processes where block maxima are only subasymptotic, i.e., approximately extreme value distributed.
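The idea of recovering clusters from sample extremal correlations can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the madogram-based estimator, the greedy thresholding rule, the function names, the block size, and the threshold `tau` are all assumptions chosen for illustration, and the simulated data are i.i.d. rather than drawn from a mixing process.

```python
import numpy as np


def block_maxima(x, block_size):
    """Componentwise maxima over consecutive blocks of an (n, d) array."""
    n_blocks = x.shape[0] // block_size
    trimmed = x[: n_blocks * block_size]
    return trimmed.reshape(n_blocks, block_size, -1).max(axis=1)


def extremal_correlation(maxima):
    """Pairwise extremal correlation estimated from block maxima.

    Assumption: a rank-based madogram estimator,
    nu = 0.5 * mean|U_a - U_b|, theta = (0.5 + nu) / (0.5 - nu), chi = 2 - theta.
    """
    k, d = maxima.shape
    # Ranks scaled to (0, 1) stand in for the unknown marginal distributions.
    u = (maxima.argsort(axis=0).argsort(axis=0) + 1) / (k + 1.0)
    chi = np.eye(d)
    for a in range(d):
        for b in range(a + 1, d):
            nu = 0.5 * np.mean(np.abs(u[:, a] - u[:, b]))
            theta = (0.5 + nu) / (0.5 - nu)
            chi[a, b] = chi[b, a] = 2.0 - theta
    return chi


def threshold_clusters(chi, tau):
    """Greedy grouping: variables join a cluster when their estimated
    extremal correlation with the seeding pair is at least tau.
    The number of clusters is not specified in advance."""
    remaining = list(range(chi.shape[0]))
    clusters = []
    while remaining:
        if len(remaining) == 1:
            clusters.append(remaining)
            break
        sub = chi[np.ix_(remaining, remaining)].copy()
        np.fill_diagonal(sub, -np.inf)
        i, j = np.unravel_index(np.argmax(sub), sub.shape)
        if sub[i, j] < tau:
            # No remaining pair is extremally dependent: emit a singleton.
            clusters.append([remaining.pop(i)])
            continue
        a, b = remaining[i], remaining[j]
        group = [s for s in remaining
                 if s in (a, b) or (chi[a, s] >= tau and chi[b, s] >= tau)]
        clusters.append(group)
        remaining = [s for s in remaining if s not in group]
    return clusters


def demo(seed=0):
    """Simulated example: two independent groups of two variables each."""
    rng = np.random.default_rng(seed)
    n = 20000
    z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
    noise = 0.01 * rng.standard_normal((n, 4))
    x = np.column_stack([z1, z1, z2, z2]) + noise
    chi = extremal_correlation(block_maxima(x, block_size=20))
    return threshold_clusters(chi, tau=0.5)
```

Calling `demo()` returns a list of index groups. The pairwise loop costs a number of operations quadratic in the dimension, in line with the polynomial complexity mentioned above, although the cost of the authors' procedure may differ from this sketch.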
Abstract
International audience
Additional details
- URL
- https://hal.science/hal-03888395
- URN
- urn:oai:HAL:hal-03888395v1
- Origin repository
- UNICA