Paul Weng: Optimizing global performance metrics in deep reinforcement learning

16 June 2025

Monday, June 16, 2025, at 11:00 am in room D308

Speaker: Paul Weng (Duke Kunshan University)

Title: Optimizing global performance metrics in deep reinforcement learning

Summary: Deep reinforcement learning (DRL) is a generic and powerful machine learning approach that has delivered promising results in application domains ranging from robotics to combinatorial optimization. However, applying DRL to real-life problems is not straightforward. The decision model used in DRL is based on the Expected Utility criterion, with utilities assumed to be temporally additively decomposable. This model has limited expressive power and may not be well aligned with the global performance metric that needs to be optimized in a real-life problem. To tackle this issue, we propose to directly optimize a given global performance metric in DRL. In this setting, we present some theoretical results and discuss some DRL algorithms.