Value-Gradient Formulation for Optimal Control Problem and its Machine-Learning Algorithm

Release Time: 2021-10-14

Speaker: Prof. Xiang Zhou, City University of Hong Kong

Time: 11:00-12:00, Oct. 14, 2021

Location: SIST 1C 101

Host: Prof. Qifeng Liao

Abstract:

An optimal control problem is typically cast as a nonlinear Hamilton-Jacobi-Bellman (HJB) PDE satisfied by the value function. In this talk, we motivate focusing on the gradient of the value function and derive a PDE system for this vector-valued gradient (the value-gradient function); the system is closed and enjoys a nice component-decoupling property. This value-gradient PDE system can be solved by the method of characteristics, in the same way as the linear HJB equation: a single characteristic curve produces data for both the value and the value-gradient. Supplemented by this additional value-gradient data, the value function is then computed by minimizing the sum of two mean square errors between the data and the parametric function approximations. A few numerical examples show that taking the value-gradient into account improves both robustness and accuracy. Linear convergence of the iterative algorithm is proved under mild conditions. This is joint work with A. Bensoussan, P. Yam, and JY Han.
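As a rough illustration of the fitting step described in the abstract, the sketch below minimizes the sum of two mean square errors, one against value data and one against value-gradient data. Everything in it is an assumption made for illustration and is not taken from the talk: the one-dimensional state, the one-parameter model V_theta(x) = 0.5 * theta * x^2, the synthetic data, and the names combined_loss, fit_theta and P_TRUE. In the method described above, the data would come from characteristic curves rather than from a known formula.

```python
def combined_loss(theta, xs, values, grads):
    """Sum of two mean square errors: value fit plus value-gradient fit."""
    n = len(xs)
    # Model value: V_theta(x) = 0.5 * theta * x^2 (hypothetical parametric form).
    value_mse = sum((0.5 * theta * x * x - v) ** 2 for x, v in zip(xs, values)) / n
    # Model gradient: dV_theta/dx = theta * x.
    grad_mse = sum((theta * x - g) ** 2 for x, g in zip(xs, grads)) / n
    return value_mse + grad_mse


def fit_theta(xs, values, grads):
    """Closed-form minimizer of combined_loss for this one-parameter model.

    Setting the derivative of the loss with respect to theta to zero gives
    theta * sum(0.25 * x^4 + x^2) = sum(0.5 * x^2 * v + x * g).
    """
    num = sum(0.5 * x * x * v + x * g for x, v, g in zip(xs, values, grads))
    den = sum(0.25 * x ** 4 + x * x for x in xs)
    return num / den


# Synthetic data standing in for characteristic-curve output (assumption):
# a quadratic value function V(x) = 0.5 * P_TRUE * x^2 and its gradient.
P_TRUE = 2.0
xs = [-1.0, -0.5, 0.5, 1.0]
values = [0.5 * P_TRUE * x * x for x in xs]
grads = [P_TRUE * x for x in xs]

theta_hat = fit_theta(xs, values, grads)
```

With noise-free data of this form, theta_hat recovers P_TRUE and the combined loss vanishes. For a neural-network parametrization, the same loss would instead be minimized by a gradient-based optimizer.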

Bio:

Dr Xiang Zhou received his BSc from Peking University and his PhD from Princeton University. Before joining City University in 2012, he worked as a research associate at Princeton University and Brown University. His major research area is the study of rare events. His research interests include the development and analysis of algorithms for transitions in nonlinear stochastic dynamical systems, efficient Monte Carlo simulation of rare events, numerical methods for saddle points, and the exploration of high-dimensional non-convex energy landscapes in physical models and machine learning models. His research has been published in peer-reviewed journals including SIAM journals, Journal of Computational Physics, Journal of Chemical Physics, Nonlinearity, and Annals of Applied Probability.
