Data-driven Learning and Control Seminar: João P. Hespanha (UCSB)

The Data-driven Learning and Control seminar series is organized by the Information and Decision Science Lab at Cornell University and aims to explore the latest advancements and interdisciplinary approaches to data-driven learning and control systems.


Reinforcement Learning for Large-Scale Games

This talk addresses the use of reinforcement learning in two-player zero-sum Markov games with finite but large state spaces, for which the goal is to find minimax policies with “modest” computation. We use the qualifier “modest” to mean that we seek to certify policies as optimal without exploring the full state space of the game.

The approach followed is strongly motivated by Q-learning, which was proposed in the late 1980s to extend the single-player dynamic programming principle to model-free reinforcement learning by eliminating the need for a known transition model. Extensions of Q-learning to two-player zero-sum games appeared shortly after. Since then, most of the work devoted to proving correctness of Q-learning has relied on establishing that its iteration converges to the unique fixed point of a Bellman-like equation, which generally requires exploring the full state space. We will see that, for zero-sum games, it is possible to construct provably correct optimal policies using algorithms inspired by Q-learning, without requiring convergence of the Q-function over the whole state space. In fact, the samples used to update the Q-function may not even explore the whole set of reachable states and, for certain classes of games, the fraction of explored states gets smaller and smaller as the size of the state space increases.
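For readers unfamiliar with the starting point, the following is a minimal sketch of classical single-player tabular Q-learning, not the algorithm presented in the talk. It illustrates how the Q-function is updated from sampled transitions alone, with no transition model. The four-state chain environment, the reward, and the names step, q_update, and train are all assumptions made up for illustration. In the two-player zero-sum extensions mentioned above, the max over next actions is typically replaced by the value of a matrix game at the next state.

```python
import random

ACTIONS = (-1, +1)   # move left or right along the chain
GOAL = 3             # hypothetical terminal state; states are 0..3

def step(s, a):
    """Toy deterministic environment (an assumption for illustration)."""
    s_next = min(max(s + a, 0), GOAL)
    reward = 1.0 if s_next == GOAL else 0.0
    return s_next, reward, s_next == GOAL

def q_update(Q, s, a, r, s_next, done, alpha=0.5, gamma=0.9):
    """One Q-learning step from a sampled transition; no model is consulted."""
    best_next = 0.0 if done else max(Q.get((s_next, b), 0.0) for b in ACTIONS)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + alpha * (r + gamma * best_next - old)

def train(episodes=2000, max_steps=100, seed=0):
    rng = random.Random(seed)
    Q = {}  # unvisited (state, action) pairs are treated as 0
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            a = rng.choice(ACTIONS)          # purely random exploration
            s_next, r, done = step(s, a)
            q_update(Q, s, a, r, s_next, done)
            s = s_next
            if done:
                break
    return Q

Q = train()
# Greedy policy read off the learned Q-function for the non-terminal states.
greedy = {s: max(ACTIONS, key=lambda a: Q.get((s, a), 0.0)) for s in range(GOAL)}
print(greedy)
```

In this toy problem every state is visited many times, which is exactly the full-exploration regime that the classical convergence proofs assume and that the talk aims to avoid.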

Bio: João P. Hespanha received the Licenciatura in electrical and computer engineering from the Instituto Superior Técnico, Lisbon, Portugal in 1991 and the Ph.D. degree in electrical engineering and applied science from Yale University in 1998. From 1999 to 2001, he was an assistant professor at the University of Southern California. He moved to the University of California, Santa Barbara in 2002, where he currently holds a Distinguished Professor position with the Department of Electrical and Computer Engineering.

Hespanha is the recipient of Yale University’s Henry Prentiss Becton Graduate Prize for exceptional achievement in research in Engineering and Applied Science, a National Science Foundation CAREER Award, the 2005 best paper award at the 2nd Int. Conf. on Intelligent Sensing and Information Processing, the 2005 Automatica Theory/Methodology best paper prize, the 2006 George S. Axelby Outstanding Paper Award, and the 2009 Ruberti Young Researcher Prize. Hespanha is a Fellow of the International Federation of Automatic Control (IFAC) and of the IEEE. He was an IEEE distinguished lecturer from 2007 to 2013.

His current research interests include hybrid and switched systems; multi-agent control systems; game theory; optimization; distributed control over communication networks (also known as networked control systems); the use of vision in feedback control; stochastic modeling in biology; and network security.