Sophisticated Preferences in Reinforcement Learning
Date: 2018/7/2

Speaker:     Prof. Paul Weng

Time:          17:10–17:50, July 2, 2018

Location:    Room 1A-200, SIST Building

Host:          Prof. Dengji Zhao


We will give an overview of our current research focus on reinforcement learning (RL) and summarize some recent results on using sophisticated preferences in RL. Recall that standard RL makes two assumptions: (1) a trajectory is valued by the sum of its scalar numeric rewards, and (2) the value of a policy is defined as an expectation over trajectory values. Neither assumption necessarily holds in practice. In some problems valuations are vectorial (e.g., multiobjective); in others, preferences are qualitative (e.g., trajectory A is preferred to trajectory B). We will present theoretical results (e.g., bounds), algorithms, and experimental results for both cases.
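The contrast between the standard assumption (1) and the multiobjective relaxation can be sketched in a few lines of Python. This is a minimal illustration, not the speaker's method: the function names and the Pareto-dominance check are assumptions chosen to make the vector-valued case concrete.

```python
from typing import List, Sequence

def scalar_return(rewards: List[float]) -> float:
    """Assumption (1): a trajectory is valued by the sum of scalar rewards."""
    return sum(rewards)

def vector_return(rewards: List[Sequence[float]]) -> List[float]:
    """Multiobjective relaxation: rewards are vectors, summed component-wise."""
    return [sum(components) for components in zip(*rewards)]

def pareto_dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """With vectorial values, trajectories are only partially ordered:
    a dominates b if it is at least as good on every objective and
    strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

# Hypothetical trajectory with two objectives per step (e.g., speed, energy).
trajectory = [(1.0, 0.5), (2.0, 0.0), (0.0, 1.0)]
value = vector_return(trajectory)  # component-wise sums: [3.0, 1.5]
```

Because Pareto dominance is only a partial order, two trajectory values can be incomparable; this is one reason the vector-valued and qualitative settings require the specialized algorithms discussed in the talk.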


Paul Weng is an assistant professor at the University of Michigan-Shanghai Jiao Tong University Joint Institute. He was a faculty member at the SYSU-CMU Joint Institute of Engineering from 2015 to 2017, and during 2015 he was a visiting faculty member at Carnegie Mellon University (CMU). Before that, he was an associate professor of computer science at Sorbonne University (Pierre and Marie Curie University, UPMC), Paris. He received his Master's degree in 2003 and his Ph.D. in 2006, both in artificial intelligence, from UPMC. Before joining academia, he graduated from ENSAI (French National School in Statistics and Information Analysis) and worked as a financial quantitative analyst in London.

His main research interests lie in artificial intelligence and machine learning, notably adaptive control (reinforcement learning, Markov decision processes) and multiobjective optimization (compromise programming, fair optimization).

SIST-Seminar 18060