Published June 24, 2023
| Version v1
Conference paper
On the Validation of Gibbs Algorithms: Training Datasets, Test Datasets and their Aggregation
Contributors
Others:
- Network Engineering and Operations (NEO) ; Inria Sophia Antipolis - Méditerranée (CRISAM) ; Institut National de Recherche en Informatique et en Automatique (Inria)
- Department of Electrical and Computer Engineering [Princeton] (ECE) ; Princeton University
- Laboratoire de Géométrie Algébrique et Applications à la Théorie de l'Information (GAATI) ; Université de la Polynésie Française (UPF)
- Department of Automatic Control and Systems Engineering [Sheffield] (ACSE) ; University of Sheffield [Sheffield]
Description
The dependence of the Gibbs algorithm (GA) on training data is analytically characterized. By adopting the expected empirical risk as the performance metric, the sensitivity of the GA is obtained in closed form. Here, sensitivity is the difference in performance between the GA and an arbitrary alternative algorithm. This characterization enables the development of explicit expressions involving the training errors and test errors of GAs trained with different datasets. Using these tools, dataset aggregation is studied, and different figures of merit for evaluating the generalization capabilities of GAs are introduced. For particular sizes of such datasets and particular parameters of the GAs, a connection among Jeffrey's divergence, training errors, and test errors is established.
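The quantities named in the description can be illustrated with a minimal sketch. It is not taken from the paper and rests on several assumptions: a finite set of four models, a uniform reference measure, made-up empirical risk values for two datasets, and a single regularization parameter `lam` shared by both GAs. All function and variable names are hypothetical. Under these assumptions, the Gibbs form of the two measures implies that `lam` times Jeffrey's divergence equals the sum of the two (test error minus training error) gaps; whether this matches the exact statement in the paper is not confirmed here. The sign convention used for sensitivity is also an assumption.

```python
import math

def gibbs_measure(reference, risks, lam):
    """Gibbs measure on a finite model set: P(i) proportional to Q(i) * exp(-L(i) / lam)."""
    weights = [q * math.exp(-r / lam) for q, r in zip(reference, risks)]
    total = sum(weights)
    return [w / total for w in weights]

def expected_risk(measure, risks):
    """Expected empirical risk of a probability measure over the models."""
    return sum(p * r for p, r in zip(measure, risks))

def kl_divergence(p, q):
    """Relative entropy D(p || q) for strictly positive q."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jeffrey_divergence(p, q):
    """Symmetrized relative entropy D(p || q) + D(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)

# Illustrative inputs only (not data from the paper).
reference = [0.25, 0.25, 0.25, 0.25]   # assumed uniform reference measure
risks_1 = [0.10, 0.40, 0.35, 0.80]     # empirical risk of each model on dataset 1
risks_2 = [0.30, 0.20, 0.50, 0.70]     # empirical risk of each model on dataset 2
lam = 0.5                              # assumed shared regularization parameter

p1 = gibbs_measure(reference, risks_1, lam)  # GA trained on dataset 1
p2 = gibbs_measure(reference, risks_2, lam)  # GA trained on dataset 2

# Test error minus training error for each GA, using the other dataset as test data.
gap_1 = expected_risk(p1, risks_2) - expected_risk(p1, risks_1)
gap_2 = expected_risk(p2, risks_1) - expected_risk(p2, risks_2)

# Sensitivity on dataset 1 with respect to an alternative measure (here the reference),
# with the assumed sign convention "alternative minus GA".
sensitivity = expected_risk(reference, risks_1) - expected_risk(p1, risks_1)

print("sum of gaps:          ", gap_1 + gap_2)
print("lam * Jeffrey's div.: ", lam * jeffrey_divergence(p1, p2))
print("sensitivity:          ", sensitivity)
```

The first two printed values agree up to floating-point error because the normalization constants of the two Gibbs measures cancel when the two relative entropies are added.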
Abstract
International audience
Additional details
Identifiers
- URL
- https://hal.science/hal-04096054
- URN
- urn:oai:HAL:hal-04096054v1
Origin repository
- Origin repository
- UNICA