A lower bound and a near-optimal algorithm for bilevel empirical risk minimization

Dagréou, Mathieu; Moreau, Thomas; Vaiter, Samuel; Ablin, Pierre

Published November 23, 2023 | Version v1

Publication Metadata-only

Contributors

Others:

Modèles et inférence pour les données de Neuroimagerie (MIND) ; IFR49 - Neurospin - CEA ; Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Commissariat à l'énergie atomique et aux énergies alternatives (CEA)-Inria Saclay - Ile de France ; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)
Centre National de la Recherche Scientifique (CNRS)
Laboratoire Jean Alexandre Dieudonné (LJAD) ; Université Nice Sophia Antipolis (1965 - 2019) (UNS)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UCA)
Apple Inc
ANR-20-THIA-0013,UDOPIA,Programme Doctoral en Intelligence Artificielle de l'Université Paris-Saclay(2020)
ANR-17-CONV-0003,Institut DATAIA (I2-DRIVE),Data Science, Artificial Intelligence and Society(2017)
ANR-18-CE40-0005,GraVa,Méthodes variationnelles pour les signaux sur graphe(2018)

Bilevel optimization problems, in which one optimization problem is nested inside another, are increasingly common in machine learning. In many practical cases, the upper and lower objectives are both empirical risk minimization problems and therefore have a finite-sum structure. In this context, we propose a bilevel extension of the celebrated SARAH algorithm. We show that the algorithm requires $\mathcal{O}((n+m)^{\frac12}\varepsilon^{-1})$ gradient computations to achieve $\varepsilon$-stationarity, where $n+m$ is the total number of samples; this improves over all previous bilevel algorithms. Moreover, we provide a lower bound on the number of oracle calls required to reach an approximate stationary point of the objective function of the bilevel problem. Our algorithm attains this lower bound and is therefore optimal in terms of sample complexity.
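
For readers unfamiliar with SARAH, the sketch below shows the standard single-level recursive gradient estimator on a toy least-squares problem. It is not the bilevel extension proposed in the paper, which this record does not describe in detail. The function names (`sarah`, `grad_i`, `full_grad`), the synthetic data, and the step-size choice are all assumptions made for illustration.

```python
import numpy as np


def sarah(grad_i, n_samples, w0, step_size, n_outer, n_inner, rng):
    """Single-level SARAH sketch for min_w (1/n) * sum_i f_i(w).

    grad_i(w, i) must return the gradient of the i-th term f_i at w.
    """
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(n_outer):
        # Each outer loop starts from a full gradient over all samples.
        v = np.mean([grad_i(w, i) for i in range(n_samples)], axis=0)
        w_prev, w = w, w - step_size * v
        for _ in range(n_inner):
            i = rng.integers(n_samples)
            # Recursive estimator: update the previous estimate with a
            # one-sample gradient difference between consecutive iterates.
            v = grad_i(w, i) - grad_i(w_prev, i) + v
            w_prev, w = w, w - step_size * v
    return w


# Toy problem (hypothetical data): noise-free least squares,
# f_i(w) = 0.5 * (a_i . w - b_i)^2.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))
w_true = rng.standard_normal(5)
b = A @ w_true


def grad_i(w, i):
    return A[i] * (A[i] @ w - b[i])


def full_grad(w):
    return A.T @ (A @ w - b) / len(b)


# Conservative step size based on the largest per-sample smoothness constant.
L = np.max(np.sum(A**2, axis=1))
w0 = np.zeros(5)
w_hat = sarah(grad_i, len(b), w0, 1.0 / (2.0 * L), n_outer=20, n_inner=100, rng=rng)
```

Comparing `np.linalg.norm(full_grad(w_hat))` with `np.linalg.norm(full_grad(w0))` gives a quick check that the iterates moved toward a stationary point. Each inner step costs two per-sample gradient evaluations, which is the quantity counted in complexity bounds of the kind quoted above.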

Additional details

URL: https://hal.science/hal-04302861
URN: urn:oai:HAL:hal-04302861v1

Origin repository: UNICA

	All versions	This version
Views	4	4
Downloads	0	0
Data volume	0 Bytes	0 Bytes
