Jason Ma

Hi there! I'm a final-year PhD student at the UPenn GRASP Laboratory, where I am fortunate to be advised by Dinesh Jayaraman and Osbert Bastani. During my PhD, I have also spent time at Google DeepMind, NVIDIA AI, and Meta AI.

My research interests span robot learning, reinforcement learning, and foundation models for robotics. In particular, I research algorithms for training and leveraging foundation models from internet data to provide scalable training signals for robots to autonomously learn new tasks.

Google Scholar  •   GitHub  •   Twitter

yechengma at gmail dot com

Some recent research highlights:
Selected publications as first or last author; full list on Google Scholar.

(* indicates equal contribution, † indicates equal advising)

Vision-Language Models are In-Context Value Learners

Jason Ma*, Joey Hejna, Ayzaan Wahid, Chuyuan Fu, Dhruv Shah, Jacky Liang, Zhuo Xu, Sean Kirmani, Peng Xu, Danny Driess, Ted Xiao, Jonathan Tompson, Osbert Bastani, Dinesh Jayaraman, Wenhao Yu, Tingnan Zhang, Dorsa Sadigh, Fei Xia
preprint, 2024
Webpage  •   Arxiv

Eurekaverse: Environment Curriculum Generation via Large Language Models

William Liang, Sam Wang, Hungju Wang, Osbert Bastani, Dinesh Jayaraman†, Jason Ma
Conference on Robot Learning (CoRL) (Oral), 2024
Webpage  •   Arxiv  •   Code

On-Robot Reinforcement Learning with Goal-Contrastive Rewards

Ondrej Biza, Thomas Weng, Lingfeng Sun, Karl Schmeckpeper, Tarik Kelestemur, Jason Ma†, Robert Platt†, Jan-Willem van de Meent†, Lawson L. S. Wong†
preprint, 2024
Arxiv

DrEureka: Language Model Guided Sim-To-Real Transfer

Jason Ma*, William Liang*, Hungju Wang, Sam Wang, Yuke Zhu, Linxi "Jim" Fan, Osbert Bastani, Dinesh Jayaraman
Robotics: Science and Systems (RSS), 2024
Webpage  •   Arxiv  •   Code

Eureka: Human-Level Reward Design via Coding Large Language Models

Jason Ma, William Liang, Guanzhi Wang, De-An Huang, Osbert Bastani, Dinesh Jayaraman, Yuke Zhu, Linxi "Jim" Fan†, Anima Anandkumar†
International Conference on Learning Representations (ICLR), 2024
NVIDIA Top 10 Research Projects of 2023
Webpage  •   Arxiv  •   Code

Universal Visual Decomposer: Long-Horizon Manipulation Made Easy

Charles Zhang*, Yunshuang Li*, Osbert Bastani, Abhishek Gupta, Dinesh Jayaraman, Jason Ma†, Lucas Weihs†
International Conference on Robotics and Automation (ICRA) (Best Paper Finalist), 2024
Best Paper Award, CoRL 2023 LEAP Workshop
Webpage  •   Arxiv  •   Code

LIV: Language-Image Representations and Rewards for Robotic Control

Jason Ma, Vikash Kumar, Amy Zhang, Osbert Bastani, Dinesh Jayaraman
International Conference on Machine Learning (ICML), 2023
Webpage  •   Arxiv  •   Code

VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training

Jason Ma, Shagun Sodhani, Dinesh Jayaraman, Osbert Bastani, Vikash Kumar†, Amy Zhang†
International Conference on Learning Representations (ICLR) (Spotlight), 2023
Best Paper Finalist, NeurIPS 2022 Deep RL Workshop
Webpage  •   Arxiv  •   Code

How Far I'll Go: Offline Goal-Conditioned RL via F-Advantage Regression

Jason Ma, Jason Yan, Dinesh Jayaraman, Osbert Bastani
Neural Information Processing Systems (NeurIPS) (Nominated for Outstanding Paper), 2022
Webpage  •   Arxiv  •   Code

TOM: Learning Policy-Aware Models for MBRL via Transition Occupancy Matching

Jason Ma*, Kausik Sivakumar*, Jason Yan, Osbert Bastani, Dinesh Jayaraman
Learning for Dynamics and Control (L4DC), 2023
Webpage  •   Arxiv  •   Code

SMODICE: Versatile Offline Imitation from Observations and Examples

Jason Ma, Andrew Shen, Dinesh Jayaraman, Osbert Bastani
International Conference on Machine Learning (ICML), 2022
Webpage  •   Arxiv  •   Code

CAP: Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning

Jason Ma*, Andrew Shen*, Osbert Bastani, Dinesh Jayaraman
Association for the Advancement of Artificial Intelligence (AAAI), 2022
Arxiv  •   Code

Likelihood-Based Diverse Sampling for Trajectory Forecasting

Jason Ma, Jeevana Priya Inala, Dinesh Jayaraman, Osbert Bastani
International Conference on Computer Vision (ICCV), 2021
Arxiv  •   Code

Conservative Offline Distributional Reinforcement Learning

Jason Ma, Dinesh Jayaraman, Osbert Bastani
Neural Information Processing Systems (NeurIPS), 2021
Arxiv  •   Code

Selected recent talks:

2024

MIT Embodied Intelligence Seminar (Upcoming)

Brown Robotics Seminar (Upcoming)

USC (Upcoming)

The AI Institute (Upcoming)

Stanford ILIAD Lab

Amazon Robotics

Stanford Vision and Learning Lab

University of Michigan

2023

MIT IAI Lab

UIUC Robot Learning Seminar

Northwestern Ability Lab

Johns Hopkins University Neuro AI

HKUST Info. Hub Seminar

Tsinghua University Yang Gao Lab

UT Austin MIDI Group

Intel AI Seminar

2022

University of Edinburgh RL Seminar

MILA RL Seminar

UPenn GRASP SFI Seminar

Guest Lecture at UPenn CIS 519: Applied Machine Learning

2023

Co-Organizer, NeurIPS Workshop on Goal-Conditioned Reinforcement Learning

2021+

Reviewer, NeurIPS, ICML, ICLR, AAAI, ICRA, IROS, RA-L, CoRL

I am looking to mentor highly motivated students on research projects year-round. I especially encourage students from underrepresented groups to get involved! If you are interested, please send me an email with your CV and your research interests.



Current

William Liang

Johnny Wang

Sam Wang