Jason Ma
Hi there! I'm a final-year PhD student at the UPenn GRASP Laboratory,
where I am fortunate to be advised by Dinesh Jayaraman and Osbert Bastani.
During my PhD, I have also spent time at Google DeepMind, NVIDIA AI, and Meta AI.
My research interests span robot learning, reinforcement learning, and
Selected honors:
- Apple Scholars in AI/ML PhD Fellowship, 2024
- OpenAI Superalignment PhD Fellowship, 2024
- ICRA Best Paper Finalist in Robot Vision, 2024
- NVIDIA Top 10 Research Projects of the Year, 2023
- CoRL LEAP Workshop Best Paper Award, 2023
- The Economist (interview)
- Fox (interview)
- Yahoo (research coverage)
- TechCrunch (research coverage)
Google Scholar Github Twitter
yechengma at gmail dot com
(* indicates equal contribution, † indicates equal advising)
Vision-Language Models are In-Context Value Learners
Jason Ma*, Joey Hejna, Ayzaan Wahid, Chuyuan Fu, Dhruv Shah, Jacky Liang, Zhuo Xu, Sean Kirmani, Peng Xu, Danny Driess, Ted Xiao, Jonathan Tompson, Osbert Bastani, Dinesh Jayaraman, Wenhao Yu, Tingnan Zhang, Dorsa Sadigh, Fei Xia
preprint, 2024
Webpage • Arxiv
Eurekaverse: Environment Curriculum Generation via Large Language Models
William Liang, Sam Wang, Hungju Wang, Osbert Bastani, Dinesh Jayaraman†, Jason Ma†
Conference on Robot Learning (CoRL) (Oral), 2024
Webpage • Arxiv • Code
On-Robot Reinforcement Learning with Goal-Contrastive Rewards
Ondrej Biza, Thomas Weng, Lingfeng Sun, Karl Schmeckpeper, Tarik Kelestemur, Jason Ma†, Robert Platt†, Jan-Willem van de Meent†, Lawson L. S. Wong†
preprint, 2024
Arxiv
DrEureka: Language Model Guided Sim-To-Real Transfer
Jason Ma*, William Liang*, Hungju Wang, Sam Wang, Yuke Zhu, Linxi "Jim" Fan, Osbert Bastani, Dinesh Jayaraman
Robotics: Science and Systems (RSS), 2024
Webpage • Arxiv • Code
Eureka: Human-Level Reward Design via Coding Large Language Models
Jason Ma, William Liang, Guanzhi Wang, De-An Huang, Osbert Bastani, Dinesh Jayaraman, Yuke Zhu, Linxi "Jim" Fan†, Anima Anandkumar†
International Conference on Learning Representations (ICLR), 2024
NVIDIA Top 10 Research Projects of 2023
Webpage • Arxiv • Code
Universal Visual Decomposer: Long-Horizon Manipulation Made Easy
Charles Zhang*, Yunshuang Li*, Osbert Bastani, Abhishek Gupta, Dinesh Jayaraman, Jason Ma†, Lucas Weihs†
International Conference on Robotics and Automation (ICRA) (Best Paper Finalist), 2024
Best Paper Award, CoRL 2023 LEAP Workshop
Webpage • Arxiv • Code
LIV: Language-Image Representations and Rewards for Robotic Control
Jason Ma, Vikash Kumar, Amy Zhang, Osbert Bastani, Dinesh Jayaraman
International Conference on Machine Learning (ICML), 2023
Webpage • Arxiv • Code
VIP: Towards Universal Visual Reward and Representation via Value-Implicit Pre-Training
Jason Ma, Shagun Sodhani, Dinesh Jayaraman, Osbert Bastani, Vikash Kumar†, Amy Zhang†
International Conference on Learning Representations (ICLR) (Spotlight), 2023
Best Paper Finalist, NeurIPS 2022 Deep RL Workshop
Webpage • Arxiv • Code
How Far I'll Go: Offline Goal-Conditioned RL via F-Advantage Regression
Jason Ma, Jason Yan, Dinesh Jayaraman, Osbert Bastani
Neural Information Processing Systems (NeurIPS) (Nominated for Outstanding Paper), 2022
Webpage • Arxiv • Code
TOM: Learning Policy-Aware Models for MBRL via Transition Occupancy Matching
Jason Ma*, Kausik Sivakumar*, Jason Yan, Osbert Bastani, Dinesh Jayaraman
Learning for Decision and Control (L4DC), 2023
Webpage • Arxiv • Code
SMODICE: Versatile Offline Imitation from Observations and Examples
Jason Ma, Andrew Shen, Dinesh Jayaraman, Osbert Bastani
International Conference on Machine Learning (ICML), 2022
Webpage • Arxiv • Code
CAP: Conservative and Adaptive Penalty for Model-Based Safe Reinforcement Learning
Jason Ma*, Andrew Shen*, Osbert Bastani, Dinesh Jayaraman
Association for the Advancement of Artificial Intelligence (AAAI), 2022
Arxiv • Code
Likelihood-Based Diverse Sampling for Trajectory Forecasting
Jason Ma, Jeevana Priya Inala, Dinesh Jayaraman, Osbert Bastani
International Conference on Computer Vision (ICCV), 2021
Arxiv • Code
Conservative Offline Distributional Reinforcement Learning
Jason Ma, Dinesh Jayaraman, Osbert Bastani
Neural Information Processing Systems (NeurIPS), 2021
Arxiv • Code
2024: MIT Embodied Intelligence Seminar (Upcoming)
2023: MIT IAI Lab
2022: University of Edinburgh RL Seminar

2024: Co-Organizer, RSS Workshop on Task Specification for General-Purpose Intelligent Robots
2023: Co-Organizer, NeurIPS Workshop on Goal-Conditioned Reinforcement Learning
2023: Co-Organizer, GRASP Student, Faculty, and Industry (SFI) Seminar
2021+: Reviewer, NeurIPS, ICML, ICLR, AAAI, ICRA, IROS, RA-L, CoRL

I am looking to mentor highly motivated students to work on research projects all year long. I especially encourage students from underrepresented groups to get involved! If you are interested, please send me an email with your CV and your research interests.
Current: William Liang
Past: Kausik Sivakumar