World Modeling from Reconstruction, Simulation and Action

Published: 2026-04-17

Time: 13:00-15:00, Apr 21, 2026 (Tue)

Venue: Seminar Room 2, 19th Floor, Tower C, TusPark

Abstract:

Building world models that internally represent how the world looks, evolves, and responds to actions is increasingly recognized as a foundation for embodied intelligence. As Vincent Sitzmann suggests, traditional vision tasks like 3D reconstruction, segmentation, and detection may ultimately serve as intermediate representations within such models rather than as ends in themselves. A complete world model requires understanding 3D structure (reconstruction), predicting dynamics (simulation), and enabling interaction (action). In this talk, I present three works that address each of these pillars.

First, GCT establishes the perception backbone of the world model: a streaming perception system that replaces hand-crafted SLAM pipelines with end-to-end learned attention, encoding geometric priors as inductive biases so that 3D structure can be recovered directly from continuous data streams. Second, LingBot-World adds the dynamics layer: a high-quality interactive world simulator that employs neural simulation to move beyond static geometry and capture how the world evolves over time through physical dynamics and causal reasoning. Third, LingBot-VA directly leverages the world model for robot control: by natively unifying autoregressive world dynamics modeling and action modeling within a single framework, it closes the loop from world understanding to embodied behavior. Together, these works form a progressive path toward building powerful generative models that perceive and generate compact representations of environments, and leverage them to understand, interact with, and reason about the physical world.

Speaker Bio:

Yinghao Xu is an Assistant Professor in the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology (HKUST). Previously, he was a Staff Research Scientist at RobbyAnt, working on world models and embodied AI. Before that, he was a postdoctoral researcher at Stanford University. His research lies at the intersection of 3D computer vision, generative AI, and embodied AI, with a recent focus on building world models that unify 3D reconstruction, world simulation, and embodied action. He received the Yunfan Award at WAIC 2024 and was nominated for the Snap Fellowship in 2022.

Speaker: Yinghao Xu (HKUST)