Vision-Language-Action Models

Trending

4 papers

5.5 viability

+100% (30d)

Papers

Research Paper · Feb 11, 2026

AugVLA-3D: Depth-Driven Feature Augmentation for Vision-Language-Action Models

Vision-Language-Action (VLA) models have recently achieved remarkable progress in robotic perception and control, yet most existing approaches primarily rely on VLM trained using 2D images, which limi...

7.0 viability

Research Paper · Feb 17, 2026

ActionCodec: What Makes for Good Action Tokenizers

Vision-Language-Action (VLA) models leveraging the native autoregressive paradigm of Vision-Language Models (VLMs) have demonstrated superior instruction-following and training efficiency. Central to ...

7.0 viability

Research Paper · Mar 3, 2026

Chain of World: World Model Thinking in Latent Motion

Vision-Language-Action (VLA) models are a promising path toward embodied intelligence, yet they often overlook the predictive and temporal-causal structure underlying visual dynamics. World-model VLAs...

5.0 viability

Research Paper · Mar 2, 2026

Pri4R: Learning World Dynamics for Vision-Language-Action Models with Privileged 4D Representation

Humans learn not only how their bodies move, but also how the surrounding world responds to their actions. In contrast, while recent Vision-Language-Action (VLA) models exhibit impressive semantic und...

3.0 viability