Tel: 15652601306
Email: yuchao@sz.tsinghua.edu.cn
Address: Room 1108, Information Building
Chao Yu received her Ph.D. degree from the Department of Electronic Engineering, Tsinghua University, in 2023. She is currently an Assistant Professor at the Shenzhen International Graduate School, Tsinghua University, and a recipient of the Young Talent Support Program of the Chinese Institute of Electronics. Her research focuses on decision intelligence based on reinforcement learning (RL). To date, she has published over 50 papers in top international conferences and journals including ICML, NeurIPS, ICLR, CVPR, ECCV, CoRL, IROS, ICRA, TMLR, and RAL, with more than 5,500 Google Scholar citations. Her representative works include the multi-agent reinforcement learning algorithm MAPPO (over 2,800 Google Scholar citations) and RLinf, a large-scale reinforcement learning infrastructure for embodied intelligence (over 2,600 GitHub stars).
August 2019 to July 2023, Tsinghua University, Electronic Engineering, Ph.D.
August 2016 to July 2019, Tsinghua University, Mechanical Engineering, Master
August 2012 to July 2016, Beijing Institute of Technology, Automation, Bachelor
January 2026 to present, Tsinghua University, Assistant Professor
July 2023 to December 2025, Tsinghua University, Postdoc
Chao Yu's research focuses on decision intelligence based on reinforcement learning (RL), including large-scale RL infrastructure, multi-agent RL algorithms, and embodied intelligence. To date, she has published over 50 papers in top international conferences and journals, and her work has received over 5,500 citations on Google Scholar. Her first-author paper on the multi-agent reinforcement learning algorithm MAPPO, published at NeurIPS 2022, has received over 2,800 citations. Her large language model alignment paper, on which she is a co-corresponding author, was published at ICML 2024 and selected for an Oral Presentation (top 1.5%). She currently leads the development of RLinf, a flexible open-source RL infrastructure designed for embodied intelligence, which has gained over 2,600 stars on GitHub.
Chao Yu has received several honors, including Tsinghua University's Outstanding Doctoral Graduate Award and Outstanding Doctoral Thesis Award. During her postdoctoral period, she was selected for Tsinghua University's Shuimu Scholar program. She is also the principal investigator of a Youth Program project of the National Natural Science Foundation of China (NSFC).
[1] Chao Yu*, Akash Velu*, Eugene Vinitsky, Jiaxuan Gao, Yu Wang+, Alexandre Bayen+, Yi Wu+. The Surprising Effectiveness of PPO in Cooperative Multi-agent Games. In Advances in Neural Information Processing Systems (NeurIPS), 2022.
[2] Chao Yu, Zuxin Liu, Xin-Jun Liu, Fugui Xie, Yi Yang, Qi Wei, Fei Qiao. DS-SLAM: A semantic visual SLAM towards dynamic environments. In International Conference on Intelligent Robots and Systems (IROS), 2018.
[3] Shusheng Xu, Wei Fu, Jiaxuan Gao, Wenjie Ye, Weilin Liu, Zhiyu Mei, Guangju Wang, Chao Yu+, Yi Wu+. Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study. In International Conference on Machine Learning (ICML), 2024.
[4] Tonghe Zhang, Chao Yu+, Sichang Su, Yu Wang. ReinFlow: Fine-tuning Flow Matching Policy with Online Reinforcement Learning. In Advances in Neural Information Processing Systems (NeurIPS), 2025.
[5] Zelai Xu, Chao Yu, Fei Fang, Yu Wang+, Yi Wu+. Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game. In International Conference on Machine Learning (ICML), 2024.
[6] Chao Yu*, Jiaxuan Gao*, Weilin Liu, Botian Xu, Hao Tang, Jiaqi Yang, Yu Wang, Yi Wu. Learning Zero-Shot Cooperation with Humans, Assuming Humans Are Biased. In International Conference on Learning Representations (ICLR), 2023.
[7] Zhenggang Tang*, Chao Yu*, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Du, Yu Wang, Yi Wu. Discovering Diverse Multi-agent Strategic Behavior Via Reward Randomization. In International Conference on Learning Representations (ICLR), 2021.
[8] Botian Xu, Feng Gao, Chao Yu+, Ruize Zhang, Yi Wu, Yu Wang+. OmniDrones: An Efficient and Flexible Platform for Reinforcement Learning in Drone Control. In IEEE Robotics and Automation Letters (RAL), 2024.
[9] Jijia Liu*, Feng Gao*, Bingwen Wei, Xinlei Chen, Qingmin Liao, Yi Wu, Chao Yu+, Yu Wang+. What Can RL Bring to VLA Generalization? An Empirical Study. In Advances in Neural Information Processing Systems (NeurIPS), 2025.
[10] Jijia Liu, Feng Gao, Qingmin Liao, Chao Yu+, Yu Wang+. Learning from Suboptimal Data in Continuous Control via Auto-Regressive Soft Q-Network. In International Conference on Machine Learning (ICML), 2025.
[11] Yixian Zhang*, Shu'ang Yu*, Tonghe Zhang, Mo Guang, Haojia Hui, Kaiwen Long, Yu Wang, Chao Yu+, Wenbo Ding+. SAC Flow: Sample-Efficient Reinforcement Learning of Flow-Based Policies via Velocity-Reparameterized Sequential Modeling. In International Conference on Learning Representations (ICLR), 2026.
[12] Chao Yu, Xinyi Yang, Jiaxuan Gao, Jiayu Chen, Yunfei Li, Jijia Liu, Yunfei Xiang, Ruixin Huang, Huazhong Yang, Yi Wu, Yu Wang. Asynchronous Multi-Agent Reinforcement Learning for Efficient Real-time Multi-robot Cooperative Exploration. In International Conference on Autonomous Agents and Multi-agent Systems (AAMAS), 2023.