InSpatio-World: Real-Time 4D World Simulation via Spatiotemporal Autoregressive Modeling
The ability to simulate a 4D world, one that evolves in time and can be viewed from arbitrary perspectives, is a foundational capability for autonomous driving, robotics, and embodied AI. Existing video generation models produce visually compelling sequences but lose spatial consistency when the camera moves, while 3D reconstruction methods achieve geometric fidelity but struggle with dynamic scenes and real-time performance. InSpatio-World bridges this gap through a spatiotemporal autoregressive (STAR) architecture that combines the strengths of both paradigms. ...