Showing items 1 - 2 of 2
  • Publication
    Open access
    Bayesian Reinforcement Learning via Deep, Sparse Sampling
    (2020)
    Divya Grover; Debabrota Basu
    We address the problem of Bayesian reinforcement learning using efficient model-based online planning. We propose an optimism-free Bayes-adaptive algorithm that induces deeper and sparser exploration, with a theoretical bound on its performance relative to the Bayes-optimal policy and a lower computational complexity. The main novelty is a candidate policy generator that produces long-term options in the planning tree (over beliefs), which allows us to build much sparser and deeper trees. Experimental results on different environments show that, compared to the state of the art, our algorithm is both computationally more efficient and obtains significantly higher reward in discrete environments.
  • Publication
    Open access
    Differential Privacy for Multi-armed Bandits: What Is It and What Is Its Cost?
    (2019)
    Debabrota Basu; Aristide Tossou
    Based on the differential privacy (DP) framework, we introduce and unify privacy definitions for multi-armed bandit algorithms. We represent the framework with a unified graphical model and use it to connect the privacy definitions. We derive and contrast lower bounds on the regret of bandit algorithms satisfying these definitions, obtaining all of the lower bounds with a unified proof technique. We show that for all of them, the learner's regret is increased by a multiplicative factor dependent on the privacy level ϵ. We observe that the dependency is weaker when we do not require local differential privacy for the rewards.
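
To make the notion of local differential privacy for rewards in the second abstract more concrete, here is a minimal sketch. It assumes rewards bounded in [0, 1] and uses the standard Laplace mechanism with scale 1/ϵ; the function names `privatize_reward` and `private_mean` are hypothetical, and this is not the algorithm or analysis from the paper, only an illustration of why a smaller ϵ makes the learner's estimates noisier.

```python
import random


def privatize_reward(reward, epsilon, rng=random):
    """Perturb one reward in [0, 1] with Laplace noise of scale 1/epsilon.

    Assumption: rewards are bounded in [0, 1], so their sensitivity is 1.
    """
    if not 0.0 <= reward <= 1.0:
        raise ValueError("reward must lie in [0, 1]")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = 1.0 / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return reward + noise


def private_mean(rewards, epsilon, rng=random):
    """Average of individually privatized rewards, as the learner would see it."""
    noisy = [privatize_reward(r, epsilon, rng) for r in rewards]
    return sum(noisy) / len(noisy)


if __name__ == "__main__":
    rng = random.Random(0)
    rewards = [0.5] * 1000
    for eps in (0.1, 1.0, 10.0):
        print(eps, private_mean(rewards, eps, rng))
```

In this sketch the noise is added before the learner sees each reward, so the learner only ever works with perturbed values. The noise scale grows as ϵ shrinks, so more samples are needed to separate arms with similar means.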