Published August 12, 2024 | Version v1
Conference paper

Scheduling Machine Learning Compressible Inference Tasks with Limited Energy Budget

Others:
Combinatorics, Optimization and Algorithms for Telecommunications (COATI) ; Inria Sophia Antipolis - Méditerranée (CRISAM) ; Institut National de Recherche en Informatique et en Automatique (Inria) ; COMmunications, Réseaux, systèmes Embarqués et Distribués (Laboratoire I3S - COMRED) ; Laboratoire d'Informatique, Signaux, et Systèmes de Sophia Antipolis (I3S) ; Université Nice Sophia Antipolis (1965 - 2019) (UNS)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UniCA)
Université Côte d'Azur (UniCA)
Laboratoire d'Informatique, Signaux, et Systèmes de Sophia Antipolis (I3S) ; Université Nice Sophia Antipolis (1965 - 2019) (UNS)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UniCA)
ANR-22-PEFT-0002,NF-MUST,end-to-end MUlti-domain Service managemenT architectures (MUST)(2022)
ANR-15-IDEX-0001,UCA JEDI,Idex UCA JEDI(2015)
ANR-17-EURE-0004,UCA DS4H,UCA Systèmes Numériques pour l'Homme(2017)
ANR-19-CE25-0001,ARTIC,Contrôle basé sur l'Intelligence Artificielle de réseau en nuage(2019)
ANR-23-PECL-0003,CARECloud,Comprendre, Améliorer, Réduire les impacts Environnementaux du Cloud computing(2023)

Description

Advancements in cloud computing have boosted Machine Learning as a Service (MLaaS), highlighting the challenge of scheduling tasks under latency and deadline constraints. Neural network compression can reduce latency and energy consumption in data centers at the cost of some accuracy loss, which aligns with efforts to minimize cloud computing's carbon footprint.

This paper investigates the Deadline Scheduling with Compressible Tasks - Energy Aware (DSCT-EA) problem, which addresses the scheduling of compressible machine learning tasks on several machines with different speeds and energy efficiencies, under an energy budget constraint. Solving DSCT-EA involves determining both the machine on which each task will be processed and its processing time, a problem that has been proven to be NP-hard. We formulate DSCT-EA as a Mixed-Integer Programming (MIP) problem and also provide an approximation algorithm for solving it. Extensive experiments demonstrate the efficacy of our approach and its superiority over traditional scheduling techniques. It saves up to 70% of the energy budget on image classification tasks while losing only 2% of accuracy compared with running without compression.
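To make the shape of the decision concrete, the following is a minimal toy sketch and not the paper's MIP formulation or its approximation algorithm. It assumes a simplified model that the abstract does not specify: every task is available at time zero, compression is a choice among a few discrete (work, accuracy) levels, a machine draws constant power while busy, and the objective is total accuracy. The names `Machine`, `Task`, `evaluate`, and `best_schedule` are hypothetical. The search is exhaustive, so it is usable only on tiny instances, which is consistent with the problem being NP-hard.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Machine:
    speed: float  # work units processed per second (assumed model)
    power: float  # watts drawn while busy (assumed model)


@dataclass(frozen=True)
class Task:
    deadline: float  # seconds, measured from time zero
    levels: tuple    # (work, accuracy) pairs, one per compression level


def evaluate(tasks, machines, choice):
    """Return (total_accuracy, energy), or None if some deadline is missed.

    choice[i] is a (machine index, level index) pair for task i.
    """
    accuracy = 0.0
    energy = 0.0
    for m_idx, machine in enumerate(machines):
        # Tasks sharing a machine run back to back, earliest deadline first.
        assigned = sorted(
            (i for i, (m, _) in enumerate(choice) if m == m_idx),
            key=lambda i: tasks[i].deadline,
        )
        clock = 0.0
        for i in assigned:
            work, acc = tasks[i].levels[choice[i][1]]
            duration = work / machine.speed
            clock += duration
            if clock > tasks[i].deadline + 1e-9:
                return None
            energy += machine.power * duration
            accuracy += acc
    return accuracy, energy


def best_schedule(tasks, machines, budget):
    """Exhaustively search assignments; return (accuracy, energy, choice) or None."""
    options = [
        [(m, l) for m in range(len(machines)) for l in range(len(t.levels))]
        for t in tasks
    ]
    best = None
    for choice in product(*options):
        result = evaluate(tasks, machines, choice)
        if result is None or result[1] > budget + 1e-9:
            continue  # misses a deadline or exceeds the energy budget
        if best is None or result[0] > best[0]:
            best = (result[0], result[1], choice)
    return best


# Illustrative instance with made-up numbers: one fast, power-hungry machine
# and one slow, efficient machine; each task has a full and a compressed level.
machines = [Machine(speed=2.0, power=10.0), Machine(speed=1.0, power=2.0)]
tasks = [
    Task(deadline=2.0, levels=((4.0, 0.90), (2.0, 0.85))),
    Task(deadline=2.0, levels=((4.0, 0.90), (2.0, 0.85))),
]
generous = best_schedule(tasks, machines, budget=30.0)
tight = best_schedule(tasks, machines, budget=15.0)
```

In this made-up instance, tightening the budget pushes the search toward compressed levels and the more efficient machine, trading a small amount of accuracy for a lower energy total. That is the kind of trade-off DSCT-EA is concerned with, although the numbers here say nothing about the paper's reported results.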

Abstract

International audience

Additional details

Created:
August 24, 2024
Modified:
August 24, 2024