Value-Gradient Formulation for Optimal Control Problem and its Machine-Learning Algorithm

Published: 2021-10-14

Speaker:    Prof. Xiang Zhou, City University of Hong Kong
Time:         11:00-12:00, Oct. 14, 2021
Location:   SIST 1C 101
Host:          Prof. Qifeng Liao
Abstract:
An optimal control problem is typically cast as a nonlinear Hamilton-Jacobi-Bellman (HJB) PDE satisfied by the value function. In this talk, we motivate a focus on the gradient of the value function and derive a PDE system for this (vector-valued) gradient, the value-gradient function. The system is closed and enjoys a nice component-decoupling property. It can be solved by the method of characteristics, in the same way as the linear HJB equation: one characteristic curve produces data for both the value and the value-gradient. Supplemented by this additional value-gradient data, the value function is then computed by minimizing the sum of two mean square errors between the data and the parametric function approximations. A few numerical examples show that taking the value-gradient into account improves both robustness and accuracy. The linear convergence of the iterative algorithm is proved under mild conditions. This is joint work with A. Bensoussan, P. Yam, and JY Han.
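
The fitting step described in the abstract, minimizing the sum of two mean square errors, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the talk: the quadratic feature model, the synthetic data (generated from a known quadratic function in place of data from characteristic curves), the `weight` parameter, and all function names. Because the assumed model is linear in its parameters, the combined loss reduces to one least-squares problem.

```python
import numpy as np

def features(x):
    """Quadratic features phi(x) for x of shape (n, 2); returns (n, 3)."""
    x1, x2 = x[:, 0], x[:, 1]
    return np.stack([x1**2, x1 * x2, x2**2], axis=1)

def feature_gradients(x):
    """Gradient of every feature with respect to x; returns (n, 2, 3)."""
    x1, x2 = x[:, 0], x[:, 1]
    zero = np.zeros_like(x1)
    d_dx1 = np.stack([2 * x1, x2, zero], axis=1)
    d_dx2 = np.stack([zero, x1, 2 * x2], axis=1)
    return np.stack([d_dx1, d_dx2], axis=1)

def fit_value_function(x, v, g, weight=1.0):
    """Minimize mean (V_theta - v)^2 + weight * mean |grad V_theta - g|^2.

    V_theta(x) = phi(x) @ theta is an assumed parametric form. The two
    mean square errors are stacked into a single least-squares system.
    """
    n, d = g.shape
    a_value = features(x) / np.sqrt(n)
    b_value = v / np.sqrt(n)
    scale = np.sqrt(weight / n)
    a_grad = feature_gradients(x).reshape(n * d, -1) * scale
    b_grad = g.reshape(n * d) * scale
    a = np.vstack([a_value, a_grad])
    b = np.concatenate([b_value, b_grad])
    theta, *_ = np.linalg.lstsq(a, b, rcond=None)
    return theta

# Synthetic stand-in data: V(x) = x^T P x, so grad V(x) = 2 P x.
rng = np.random.default_rng(0)
P = np.array([[2.0, 0.5], [0.5, 1.0]])
x = rng.uniform(-1.0, 1.0, size=(50, 2))
v = np.einsum("ni,ij,nj->n", x, P, x)
g = 2.0 * x @ P  # P is symmetric, so each row equals (2 P x_i)^T

theta = fit_value_function(x, v, g)
# For this noise-free data the exact coefficients are [P11, 2*P12, P22].
theta_true = np.array([P[0, 0], 2.0 * P[0, 1], P[1, 1]])
```

With a neural network or another nonlinear parametrization, the same two-term loss would be minimized by a gradient-based optimizer rather than a single least-squares solve.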

Bio:
Dr Xiang Zhou received his BSc from Peking University and his PhD from Princeton University. Before joining City University in 2012, he worked as a research associate at Princeton University and Brown University. His major research area is the study of rare events. His research interests include the development and analysis of algorithms for transitions in nonlinear stochastic dynamical systems, efficient Monte Carlo simulation of rare events, numerical methods for saddle points, and the exploration of high-dimensional non-convex energy landscapes in physical and machine learning models. His research has been published in peer-reviewed journals, including SIAM journals, Journal of Computational Physics, Journal of Chemical Physics, Nonlinearity, and Annals of Applied Probability.