Abstract: Foundation models, such as GPT, have achieved remarkable progress in natural language and vision, demonstrating strong adaptability to new tasks and scenarios. Physical interaction, such as cooking, cleaning, or caregiving, remains a frontier where these models and robotic systems have yet to reach comparable levels of generalization. In this talk, I will discuss opportunities for incorporating foundation models into robotic pipelines to extend capabilities beyond those of traditional methods. The focus will be on two areas: (1) task specification and (2) task-level planning. The central idea is to translate the commonsense knowledge embedded in foundation models into structural priors that can be integrated into robot learning systems. This approach combines the strengths of different modules (for example, VLMs for task interpretation and constrained optimization for motion planning), achieving the best of both worlds. I will show how such integration enables robots to interpret free-form natural language instructions and perform a wide range of real-world manipulation tasks. I will conclude by discussing current limitations of foundation models, key challenges ahead (particularly in multi-modal sensing and world modeling), and potential avenues for progress.
Bio: Yunzhu Li is an Assistant Professor of Computer Science at Columbia University. Before joining Columbia, he was an Assistant Professor at UIUC CS and spent time as a Postdoc at Stanford, collaborating with Fei-Fei Li and Jiajun Wu. Yunzhu earned his PhD from MIT under the guidance of Antonio Torralba and Russ Tedrake. His work has received the Best Paper Award at ICRA and the Best Systems Paper Award, and was a Finalist for the Best Paper Award at CoRL. He is also a recipient of the AAAI New Faculty Highlights, the Sony Faculty Innovation Award, the Amazon Research Award, the Adobe Research Fellowship, and the First Place Ernst A. Guillemin Master’s Thesis Award in AI and Decision Making at MIT. His research has been published in top journals and conferences, including Nature and Science, and featured by major media outlets such as CNN, BBC, and The Wall Street Journal.
Event Details
Date/Time:
Wednesday, October 15, 2025, 12:15pm to 1:15pm
Location:
KLAUS BUILDING 1116 E&W
Extras:
Free Food
For More Information Contact
christa.ernst@research.gatech.edu