State of the Field
Recent advancements in robotics AI are increasingly focused on enhancing the generalization and efficiency of robotic manipulation. Researchers are developing frameworks that integrate multimodal inputs, such as vision and tactile feedback, to improve robots' ability to understand and interact with complex environments. For instance, new models address the challenge of "Information Collapse," where robots rely too heavily on visual cues, by incorporating language instructions into their decision-making; this shift is crucial for enabling robots to follow diverse commands in real-world settings.

Innovative training methods, such as reinforcement learning inside learned world models, are also showing promise in reducing reliance on extensive expert demonstrations and improving real-world performance. These developments enhance robot capabilities in tasks like bimanual coordination and pave the way for more efficient, adaptable systems that operate effectively in dynamic environments, with potential applications ranging from manufacturing to healthcare.
Papers
Generalizable Geometric Prior and Recurrent Spiking Feature Learning for Humanoid Robot Manipulation
Humanoid robot manipulation is a crucial research area for executing diverse human-level tasks, involving high-level semantic reasoning and low-level action generation. However, precise scene understanding...
BayesianVLA: Bayesian Decomposition of Vision Language Action Models via Latent Action Queries
Vision-Language-Action (VLA) models have shown promise in robot manipulation but often struggle to generalize to new instructions or complex multi-task scenarios. We identify a critical pathology in c...
Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning
Recent video generation models demonstrate a remarkable ability to capture complex physical interactions and scene evolution over time. To leverage their spatiotemporal priors, robotics works have adapted...
ViTaS: Visual Tactile Soft Fusion Contrastive Learning for Visuomotor Learning
Tactile information plays a crucial role in human manipulation tasks and has recently garnered increasing attention in robotic manipulation. However, existing approaches mostly focus on the alignment ...
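The excerpt above does not spell out the architecture, but "visual tactile fusion contrastive learning" generally implies pulling paired visual and tactile embeddings together in a shared space. As a minimal, generic sketch of that idea only (not the ViTaS method; the encoders, batch pairing, and temperature below are assumptions), a symmetric InfoNCE loss in PyTorch might look like this:

```python
# Generic sketch: contrastive alignment of paired visual and tactile embeddings.
# NOT the ViTaS architecture, whose details are not given in the excerpt.
import torch
import torch.nn.functional as F

def info_nce(visual_emb: torch.Tensor, tactile_emb: torch.Tensor, temperature: float = 0.07):
    """Symmetric InfoNCE over a batch of paired (visual, tactile) embeddings."""
    v = F.normalize(visual_emb, dim=-1)           # (B, D)
    t = F.normalize(tactile_emb, dim=-1)          # (B, D)
    logits = v @ t.T / temperature                # (B, B) cross-modal similarities
    targets = torch.arange(v.size(0), device=v.device)
    # Matched (visual_i, tactile_i) pairs are positives; all other pairs are negatives.
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets))

# Hypothetical usage with placeholder encoders:
# loss = info_nce(visual_encoder(images), tactile_encoder(tactile_frames))
```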
Pretrained Vision-Language-Action Models are Surprisingly Resistant to Forgetting in Continual Learning
Continual learning is a long-standing challenge in robot policy learning, where a policy must acquire new skills over time without catastrophically forgetting previously learned ones. While prior work...
World-Gymnast: Training Robots with Reinforcement Learning in a World Model
Robot learning from interacting with the physical world is fundamentally bottlenecked by the cost of physical interaction. The two alternatives, supervised finetuning (SFT) from expert demonstrations ...
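The excerpt frames world-model RL as a way around the cost of physical interaction and expert demonstrations. Below is a minimal, Dreamer-style sketch of that general idea, training a policy on imagined latent rollouts; it is not the World-Gymnast algorithm, and all modules, dimensions, and hyperparameters are hypothetical placeholders:

```python
# Minimal sketch: optimize a policy on imagined rollouts inside a learned
# world model instead of through physical interaction. Generic illustration only.
import torch
import torch.nn as nn

LATENT, ACTION = 32, 8  # hypothetical dimensions

class WorldModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.dynamics = nn.Linear(LATENT + ACTION, LATENT)  # z' = f(z, a)
        self.reward = nn.Linear(LATENT, 1)                   # r  = g(z)

    def step(self, z, a):
        return self.dynamics(torch.cat([z, a], dim=-1))

world_model = WorldModel()
policy = nn.Sequential(nn.Linear(LATENT, 64), nn.Tanh(), nn.Linear(64, ACTION))
opt = torch.optim.Adam(policy.parameters(), lr=3e-4)

def imagined_return(z, horizon=15):
    """Roll the policy forward purely in the world model's latent space."""
    total = 0.0
    for _ in range(horizon):
        a = torch.tanh(policy(z))                 # act in imagination
        z = world_model.step(z, a)                # predicted next latent state
        total = total + world_model.reward(z).mean()
    return total

z0 = torch.randn(16, LATENT)                      # batch of starting latents
opt.zero_grad()
loss = -imagined_return(z0)                       # maximize imagined return
loss.backward()
opt.step()
```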
Attention-Based Neural-Augmented Kalman Filter for Legged Robot State Estimation
In this letter, we propose an Attention-Based Neural-Augmented Kalman Filter (AttenNKF) for state estimation in legged robots. Foot slip is a major source of estimation error: when slip occurs, kinematic...
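The excerpt identifies foot slip as the point where kinematic measurements become unreliable. One common pattern for neural augmentation of a Kalman filter is to let a learned module inflate the measurement noise when slip is suspected; the toy update below illustrates only that general pattern (it is not the AttenNKF formulation, and the slip score, dimensions, and inflation factor are assumptions):

```python
# Toy illustration: a Kalman measurement update whose noise covariance is
# inflated by a learned slip score, so unreliable kinematic measurements are
# down-weighted. Generic sketch, NOT the AttenNKF formulation.
import numpy as np

def kalman_update(x, P, z, H, R_base, slip_score):
    """Standard Kalman update; slip_score in [0, 1] inflates measurement noise."""
    R = R_base * (1.0 + 10.0 * slip_score)        # hypothetical noise inflation
    y = z - H @ x                                  # innovation
    S = H @ P @ H.T + R                            # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)                 # Kalman gain
    x_new = x + K @ y
    P_new = (np.eye(len(x)) - K @ H) @ P
    return x_new, P_new

# Hypothetical usage: slip_score would come from an attention/NN module fed
# with IMU and joint signals; here it is just a constant for illustration.
x = np.zeros(2); P = np.eye(2)
z = np.array([0.1]); H = np.array([[1.0, 0.0]]); R_base = np.eye(1) * 0.01
x, P = kalman_update(x, P, z, H, R_base, slip_score=0.8)
```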
BiManiBench: A Hierarchical Benchmark for Evaluating Bimanual Coordination of Multimodal Large Language Models
Multimodal Large Language Models (MLLMs) have significantly advanced embodied AI, and using them to benchmark robotic intelligence has become a pivotal trend. However, existing frameworks remain predominantly...
Affordances Enable Partial World Modeling with LLMs
Full models of the world require complex knowledge of immense detail. While pre-trained large models have been hypothesized to contain similar knowledge due to extensive pre-training on vast amounts of...
Off-Policy Actor-Critic with Sigmoid-Bounded Entropy for Real-World Robot Learning
Deploying reinforcement learning in the real world remains challenging due to sample inefficiency, sparse rewards, and noisy visual observations. Prior work leverages demonstrations and human feedback...
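The title suggests bounding the entropy term of an off-policy actor-critic objective with a sigmoid. As a rough illustration of how such a bound could appear in a SAC-style actor loss (not the paper's objective; the network shapes, alpha, and scale below are hypothetical), the entropy bonus is squashed so its magnitude cannot grow without limit:

```python
# Rough sketch: SAC-style actor loss with a sigmoid-bounded entropy bonus.
# Illustration of the idea suggested by the title, NOT the paper's method.
import torch
import torch.nn as nn

OBS, ACT = 10, 3                                  # hypothetical dimensions
mean_head = nn.Linear(OBS, ACT)                   # Gaussian policy mean
log_std = nn.Parameter(torch.zeros(ACT))          # Gaussian policy log-std
q_net = nn.Linear(OBS + ACT, 1)                   # toy critic

def actor_loss(obs, alpha=0.2, scale=5.0):
    dist = torch.distributions.Normal(mean_head(obs), log_std.exp())
    action = dist.rsample()                       # reparameterized sample
    log_prob = dist.log_prob(action).sum(-1)
    q = q_net(torch.cat([obs, action], dim=-1)).squeeze(-1)
    # Bounded entropy bonus: sigmoid keeps the term in (0, scale) rather than
    # letting -log_prob dominate the objective under noisy observations.
    entropy_bonus = scale * torch.sigmoid(-log_prob)
    return (-(q + alpha * entropy_bonus)).mean()

loss = actor_loss(torch.randn(16, OBS))
loss.backward()
```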