Details

Representation Learning for Deep Reinforcement Learning

Applicant: PAUL AN-LIN WENG  Application date: 2023-02-27

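Basic information

27th "Shanghai Jiao Tong University Undergraduate Innovation Practice Program"
Representation Learning for Deep Reinforcement Learning
Innovation training project
Engineering
Electronic information
Creative
UM-SJTU Joint Institute
PAUL AN-LIN WENG
Advisor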

Reinforcement learning (RL) is a general framework for adaptive control, with applications such as autonomous driving, intelligent tutoring, and robotics. In this setting, an agent learns by trial and error while interacting with an environment. The combination of deep learning and reinforcement learning, called deep RL, has recently proved extremely powerful: with such techniques, an agent can learn to play video games from visual inputs, or the game of Go, at a superhuman level. Given this potential, RL and deep RL are now very active research areas in the machine learning community. Ongoing work focuses notably on making these techniques more practical and efficient so that they can be applied to more diverse domains.
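To make the trial-and-error loop concrete, here is a minimal sketch of tabular Q-learning, a standard RL algorithm, on a tiny corridor task. The environment, the function names (`step`, `choose_action`, `train`), and all hyperparameter values are hypothetical and chosen only for illustration; this is not the deep RL method used in the project.

```python
import random

N_STATES = 5        # corridor cells 0..4; reaching cell 4 ends the episode
ACTIONS = (-1, +1)  # action index 0 moves left, action index 1 moves right


def step(state, action_index):
    """Hypothetical toy environment: returns (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action_index], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy choice; ties are broken at random so the agent explores."""
    if rng.random() < epsilon or q_row[0] == q_row[1]:
        return rng.randrange(2)
    return 0 if q_row[0] > q_row[1] else 1


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values purely from interaction with `step`."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            a = choose_action(q[state], epsilon, rng)
            next_state, reward, done = step(state, a)
            # Q-learning update: move the estimate toward the one-step target.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q


if __name__ == "__main__":
    q_table = train()
    greedy = ["left" if row[0] > row[1] else "right" for row in q_table[:-1]]
    print(greedy)
```

Deep RL replaces the table `q` with a neural network, which is what allows learning directly from high-dimensional inputs such as images.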

This proposed project continues an exploratory project that started as a collaboration with Huawei. The goal of that collaboration was to combine reinforcement learning methods and reasoning techniques to learn decision-making policies in the form of first-order logic programs. That research assumed that the input to the deep RL agent was already available in logical form. In this proposed project, we aim to learn object-centric representations from visual inputs so that deep RL agents can learn from high-level, more compact representations. The potential benefits are faster deep RL training, more robust solutions, and interpretable policies.
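As a rough illustration of why an object-centric representation is more compact than raw pixels, the sketch below converts a toy grid "image" into a short list of objects. The extraction here is hand-coded and serves only as a stand-in for a learned encoder; the frame format and the names `make_frame` and `extract_objects` are assumptions made for this example and do not describe the project's method.

```python
def make_frame(size, objects):
    """Build a toy size x size frame; 0 is background, other ints are object kinds."""
    frame = [[0] * size for _ in range(size)]
    for kind, row, col in objects:
        frame[row][col] = kind
    return frame


def extract_objects(frame):
    """Hand-coded stand-in for a learned encoder: list (kind, row, col) per object."""
    return [
        (value, r, c)
        for r, row in enumerate(frame)
        for c, value in enumerate(row)
        if value != 0
    ]


if __name__ == "__main__":
    frame = make_frame(8, [(1, 2, 3), (2, 5, 6)])
    objects = extract_objects(frame)
    pixel_count = sum(len(row) for row in frame)
    object_numbers = 3 * len(objects)
    print(objects, pixel_count, object_numbers)
```

In this toy case, a policy reading the object list processes 6 numbers instead of 64 pixel values, and each number has a direct meaning (what the object is and where it is), which is the kind of property that could support interpretable policies.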


A PhD student in my team has already started work in this direction and designed a method based on visual transformers (paper under review). Undergraduate students joining this project will collaborate with this PhD student to further improve the current method and run more experiments.

The expected work, by quarter, is as follows:

- Q1: Learn the basics of deep learning and deep reinforcement learning; start running experiments with the current code written by my PhD student

- Q2: Improve the current method to make it more generic and to extract object-centric information from images; run further experiments with the new method

- Q3: Analyze the experimental data and tune the new method

- Q4: Demonstrate the effectiveness of the final method across various domains; write a research paper describing it

Project members

1

Advisors

No.  Advisor name      Email                  School                   Role
1    PAUL AN-LIN WENG  (visible after login)  UM-SJTU Joint Institute  First advisor
