Jason Ma
Hi there! I'm a 4th-year PhD student at the UPenn GRASP Laboratory,
where I am fortunate to be advised by Dinesh Jayaraman and Osbert Bastani.
My research focuses on reinforcement learning and robot learning.
These days, I am interested in training and deploying foundation models for robotics and embodied agents, and in particular in how to enable robots to learn from internet-scale data.
Google Scholar
Github
Twitter
jasonyma at seas dot upenn dot edu

(* indicates equal contribution, † indicates equal advising)
Eureka: Human-Level Reward Design via Coding Large Language Models
Jason Ma, William Liang, Guanzhi Wang, De-An Huang, Osbert Bastani, Dinesh Jayaraman, Yuke Zhu, Linxi "Jim" Fan†, Anima Anandkumar†
Preprint
★ Oral Presentation, NeurIPS Agent Learning in Open-Endedness Workshop ★
★ Oral Presentation, CoRL Toward Generalist Robots Workshop ★
★ Oral Presentation, CoRL Language and Robot Learning Workshop ★
Webpage •
Arxiv •
Code

Universal Visual Decomposer: Long-Horizon Manipulation Made Easy
Charles Zhang*, Yunshuang Li*, Osbert Bastani, Abhishek Gupta, Dinesh Jayaraman, Jason Ma†, Luca Weihs†
Preprint
★ Best Paper Award, CoRL LEAP Workshop ★
★ Oral Presentation, NeurIPS Foundation Models for Decision Making Workshop ★
Webpage •
Arxiv •
Code

LIV: Language-Image Representations and Rewards for Robotic Control
Jason Ma, Vikash Kumar, Amy Zhang, Osbert Bastani, Dinesh Jayaraman
International Conference on Machine Learning (ICML), 2023
★ Oral Presentation, RSS Workshop on Language for Robot Learning ★
Webpage •
Arxiv •
Code

VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training
Jason Ma, Shagun Sodhani, Dinesh Jayaraman, Osbert Bastani, Vikash Kumar†, Amy Zhang†
International Conference on Learning Representations (ICLR), 2023
★ Spotlight Presentation, ICLR ★
★ Oral Presentation, NeurIPS Deep RL, Offline RL, and FMDM Workshops ★
★ Best Paper Finalist, NeurIPS Deep RL Workshop ★
Webpage •
Arxiv •
Code

How Far I'll Go: Offline Goal-Conditioned RL via f-Advantage Regression
Jason Ma, Jason Yan, Dinesh Jayaraman, Osbert Bastani
Neural Information Processing Systems (NeurIPS), 2022
★ Nominated for Outstanding Paper, NeurIPS ★
★ Best Paper Finalist, RSS Workshop on Learning from Diverse, Offline Data ★
Webpage •
Arxiv •
Code

TOM: Learning Policy-Aware Models for MBRL via Transition Occupancy Matching
Jason Ma*, Kausik Sivakumar*, Jason Yan, Osbert Bastani, Dinesh Jayaraman
Learning for Dynamics and Control (L4DC), 2023
Webpage •
Arxiv •
Code

SMODICE: Versatile Offline Imitation from Observations and Examples
Jason Ma, Andrew Shen, Dinesh Jayaraman, Osbert Bastani
International Conference on Machine Learning (ICML), 2022
Webpage •
Arxiv •
Code

CAP: Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning
Jason Ma*, Andrew Shen*, Osbert Bastani, Dinesh Jayaraman
AAAI Conference on Artificial Intelligence (AAAI), 2022
Arxiv •
Code

Likelihood-Based Diverse Sampling for Trajectory Forecasting
Jason Ma, Jeevana Priya Inala, Dinesh Jayaraman, Osbert Bastani
International Conference on Computer Vision (ICCV), 2021
Arxiv •
Code

Conservative Offline Distributional Reinforcement Learning
Jason Ma, Dinesh Jayaraman, Osbert Bastani
Neural Information Processing Systems (NeurIPS), 2021
Arxiv •
Code

MIT IAI Lab, 2023
University of Edinburgh RL Seminar, 2022
Co-Organizer, NeurIPS Workshop on Goal-Conditioned Reinforcement Learning, 2023
Co-Organizer, GRASP Student, Faculty, and Industry (SFI) Seminar, 2023
Reviewer: NeurIPS, ICML, ICLR, AAAI, ICRA, IROS, RA-L, 2021+

I am looking to mentor highly motivated students to work on research projects all year long. I especially encourage students from underrepresented groups to get involved! If you are interested, please send me an email with your CV and your research interests.

Current: William Liang
Past: Kausik Sivakumar