BORKAR , Vivek

Last name: BORKAR

First name: Vivek

May 2022 (v1)

Journal article

Metadata-only

Whittle index based Q-learning for restless bandits with average reward

Avrachenkov, Konstantin Borkar, Vivek

A novel reinforcement learning algorithm is introduced for multiarmed restless bandits with average reward, using the paradigms of Q-learning and Whittle index. Specifically, we leverage the structure of the Whittle index policy to reduce the search space of Q-learning, resulting in major computational gains. Rigorous convergence analysis is...

Uploaded on: December 3, 2022
June 2019 (v1)

Journal article

Metadata-only

Metastability in Stochastic Replicator Dynamics

Avrachenkov, Konstantin Borkar, Vivek

We consider a novel model of stochastic replicator dynamics for potential games that converts to a Langevin equation on a sphere after a change of variables. This is distinct from the models of stochastic replicator dynamics studied earlier. In particular, it is ill-posed due to non-uniqueness of solutions, but is amenable to a natural...

Uploaded on: December 4, 2022
March 2018 (v1)

Journal article

Metadata-only

Whittle Index Policy for Crawling Ephemeral Content

Avrachenkov, Konstantin Borkar, Vivek

We consider the task of scheduling a crawler to retrieve from several sites their ephemeral content. This is content, such as news or posts at social network groups, for which a user typically loses interest after some days or hours. Thus development of a timely crawling policy for ephemeral information sources is very important. We first...

Uploaded on: December 4, 2022
September 24, 2019 (v1)

Conference paper

Metadata-only

A learning algorithm for the Whittle index policy for scheduling web crawlers

Avrachenkov, Konstantin Borkar, Vivek

We revisit the Whittle index policy for scheduling web crawlers for ephemeral content proposed in Avrachenkov and Borkar, IEEE Trans. Control of Network Systems 5(1), 2016, and develop a reinforcement learning scheme for it based on LSPE(0). The scheme leverages the known structural properties of the Whittle index policy.

Uploaded on: December 4, 2022
December 12, 2017 (v1)

Conference paper

Metadata-only

Controlling G-AIMD by Index Policy

Avrachenkov, Konstantin Borkar, Vivek Pattathil, Sarath

We consider the Generalized Additive Increase Multiplicative Decrease (G-AIMD) dynamics for resource allocation with alpha fairness utility function. This dynamics has a number of important applications such as internet congestion control, charging electric vehicles, and smart grids. We prove indexability for the special case of MIMD model and...

Uploaded on: March 25, 2023
December 20, 2021 (v1)

Conference paper

Metadata-only

Deep Reinforcement Learning for Web Crawling

Avrachenkov, Konstantin Borkar, Vivek Patil, Kishor

A search engine uses a web crawler to crawl the pages from the world wide web (WWW) and aims to maintain its local cache as fresh as possible. Unfortunately, the rates at which different pages change in WWW are highly nonuniform and also, unknown in many real-life scenarios. In addition, the finite available bandwidth and possible server...

Uploaded on: December 3, 2022
June 14, 2023 (v1)

Conference paper

Metadata-only

Full Gradient Deep Reinforcement Learning for Average-Reward Criterion

Pagare, Tejas Borkar, Vivek Avrachenkov, Konstantin

We extend the provably convergent Full Gradient DQN algorithm for discounted reward Markov decision processes from Avrachenkov et al. (2021) to average reward problems. We experimentally compare widely used RVI Q-learning with recently proposed Differential Q-learning in the neural function approximation setting with Full Gradient DQN and DQN....

Uploaded on: January 6, 2024
January 31, 2023 (v1)

Publication

Metadata-only

Special Issue : Multi-Agent Dynamic Decision Making and Learning

Avrachenkov, Konstantin Borkar, Vivek Nair, U. Jayakrishnan

No description

Uploaded on: January 6, 2024
September 19, 2022 (v1)

Conference paper

Metadata-only

Tabular and Deep Learning of Whittle Index

Robledo, Francisco Borkar, Vivek Ayesta, Urtzi

The Whittle index policy is a heuristic that has shown remarkable good performance (with guaranted asymptotic optimality) when applied to the class of problems known as Restless Multi-Armed Bandit Problems (RMABP). In this paper we present QWI and QWINN, two algorithms capable of learning the Whittle indices for the total discounted criterion....

Uploaded on: December 4, 2022
March 19, 2018 (v1)

Journal article

Metadata-only

Revisiting random walk based sampling in networks: Evasion of burn-in period and frequent regenerations

Avrachenkov, Konstantin Borkar, Vivek Kadavankandy, Arun

Background: In the framework of network sampling, random walk (RW) based estimation techniques provide many pragmatic solutions while uncovering the unknown network as little as possible. Despite several theoretical advances in this area, RW based sampling techniques usually make a strong assumption that the samples are in stationary regime,...

Uploaded on: December 4, 2022
June 5, 2021 (v1)

Book section

Metadata-only

Full Gradient DQN Reinforcement Learning: A Provably Convergent Scheme

Avrachenkov, Konstantin Borkar, Vivek Dolhare, Harsh

We analyze the DQN reinforcement learning algorithm as a stochastic approximation scheme using the o.d.e. (for 'ordinary differential equation') approach and point out certain theoretical issues. We then propose a modified scheme called Full Gradient DQN (FG-DQN, for short) that has a sound theoretical basis and compare it with the original...

Uploaded on: December 3, 2022
September 19, 2022 (v1)

Publication

Metadata-only

Tabular and Deep Learning of Whittle Index

Robledo, Francisco Ayesta, Urtzi Avrachenkov, Konstantin

- Whittle index policy is an asymptotically optimal heuristic for solving Restless Multi-Armed Bandit Problems (RMBAP). - We propose two algorithms, QWI and QWINN, for the computation of such indices. - Both employ a two timescale system for the computation of the indices and the Q-values of each state/action.

Uploaded on: December 3, 2022
June 14, 2021 (v1)

Conference paper

Metadata-only

QWI: Q-learning with Whittle Index

Robledo, Francisco Borkar, Vivek Ayesta, Urtzi

The Whittle index policy is a heuristic that has shown remarkable good performance (with guaranted asymptotic optimality) when applied to the class of problems known as multi-armed restless bandits. In this paper we develop QWI, an algorithm based on Q-learning in order to learn the Whittle indices. The key feature is the deployment of two...

Uploaded on: December 3, 2022
September 19, 2022 (v1)

Conference paper

Metadata-only

Tabular and Deep Learning of Whittle Index

Robledo, Francisco Borkar, Vivek Ayesta, Urtzi

The Whittle index policy is a heuristic that has shown remarkable good performance (with guaranted asymptotic optimality) when applied to the class of problems known as Restless Multi-Armed Bandit Problems (RMABP). In this paper we present QWI and QWINN, two algorithms capable of learning the Whittle indices for the total discounted criterion....

Uploaded on: February 22, 2023
September 21, 2021 (v1)

Journal article

Metadata-only

Dynamic social learning under graph constraints

Avrachenkov, Konstantin Borkar, Vivek Moharir, Sharayu

We introduce a model of graph-constrained dynamic choice with reinforcement modeled by positively $\alpha$-homogeneous rewards. We show that its empirical process, which can be written as a stochastic approximation recursion with Markov noise, has the same probability law as a certain vertex reinforced random walk. We use this equivalence to...

Uploaded on: December 4, 2022

BORKAR , Vivek

Recent uploads