In this talk, I will discuss a range of results, both old and new, concerning the long-run behavior of online learners involved in an unknown repeated game (that is, a game in which a given player is not necessarily aware of the other players' actions or objectives). The talk will focus on a widely studied family of online learning methods known as "regularized learning", with special attention to the information available to the players, ranging from full information to bandit (payoff-based) feedback. In this general context, I will describe a series of results characterizing the possible outcomes of the process, from convergence to a Nash equilibrium to convergence to sets of pure strategies that are (minimally) closed under better replies.
AdONE Seminar: Prof. Panayotis Mertikopoulos (French National Center for Scientific Research)