The density of expected persistence diagrams and its kernel based estimation
- Creators
- Chazal, Frédéric
- Divol, Vincent
- Others:
- Understanding the Shape of Data (DATASHAPE) ; Inria Sophia Antipolis - Méditerranée (CRISAM) ; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Inria Saclay - Ile de France ; Institut National de Recherche en Informatique et en Automatique (Inria)
- Model selection in statistical learning (SELECT) ; Inria Saclay - Ile de France ; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire de Mathématiques d'Orsay (LMO) ; Université Paris-Sud - Paris 11 (UP11)-Centre National de la Recherche Scientifique (CNRS)-Université Paris-Sud - Paris 11 (UP11)-Centre National de la Recherche Scientifique (CNRS)
- This work was partially supported by the Advanced Grant of the European Research Council GUDHI (Geometric Understanding in Higher Dimensions) and a collaborative research agreement between Inria and Fujitsu.
- European Project: 339025,EC:FP7:ERC,ERC-2013-ADG,GUDHI(2014)
Description
Persistence diagrams play a fundamental role in Topological Data Analysis, where they are used as topological descriptors of filtrations built on top of data. They consist of discrete multisets of points in the plane R^2 that can equivalently be seen as discrete measures on R^2. When the data come as a random point cloud, these discrete measures become random measures whose expectation is studied in this paper. First, we show that for a wide class of filtrations, including the Čech and Rips-Vietoris filtrations, the expected persistence diagram, which is a deterministic measure on R^2, has a density with respect to the Lebesgue measure. Second, building on the previous result, we show that the persistence surface recently introduced in [Adams et al., Persistence images: a stable vector representation of persistent homology] can be seen as a kernel estimator of this density. We propose a cross-validation scheme for selecting an optimal bandwidth, which is proven to be a consistent procedure to estimate the density.
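The kernel-estimator reading of the persistence surface described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `persistence_surface`, the isotropic Gaussian kernel with a scalar `bandwidth`, and the optional `weight` callable are all assumptions made for illustration. Each diagram is treated as a discrete measure, smoothed by the kernel, and the smoothed measures are averaged over the sample of diagrams. Bandwidth selection by cross-validation is not covered here.

```python
import numpy as np

def persistence_surface(diagrams, grid, bandwidth, weight=None):
    """Average of Gaussian-smoothed persistence diagrams (illustrative sketch).

    diagrams  : list of (k_i, 2) arrays of (birth, death) pairs
    grid      : (m, 2) array of points where the estimate is evaluated
    bandwidth : standard deviation of the isotropic Gaussian kernel (assumed scalar)
    weight    : optional callable mapping a (k, 2) array to (k,) weights (hypothetical)
    """
    grid = np.asarray(grid, dtype=float)
    total = np.zeros(len(grid))
    for diagram in diagrams:
        pts = np.asarray(diagram, dtype=float).reshape(-1, 2)
        if len(pts) == 0:
            continue  # an empty diagram contributes the zero measure
        w = np.ones(len(pts)) if weight is None else np.asarray(weight(pts), dtype=float)
        # Squared distances between every grid point and every diagram point: (m, k)
        sq = ((grid[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
        # Two-dimensional Gaussian density centred at each diagram point
        kernel = np.exp(-sq / (2 * bandwidth**2)) / (2 * np.pi * bandwidth**2)
        total += kernel @ w
    # Empirical mean over the sample of diagrams
    return total / len(diagrams)

# Usage with two small hand-written diagrams (made-up values)
diagrams = [np.array([[0.0, 1.0]]), np.array([[0.2, 0.9], [0.1, 0.5]])]
grid = np.array([[0.0, 1.0], [0.2, 0.8]])
surface = persistence_surface(diagrams, grid, bandwidth=0.1)
```

With `weight=None` the estimate integrates to the average number of points per diagram. Passing a weight such as the persistence `death - birth` downweights points near the diagonal, which is one common choice for persistence images.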
Abstract
Extended version of the SoCG proceedings, submitted to a journal
Abstract
International audience
Additional details
- URL
- https://hal.archives-ouvertes.fr/hal-01716181
- URN
- urn:oai:HAL:hal-01716181v3
- Origin repository
- UNICA