Papers
1–3 of 3Research Paper·Jan 16, 2026
Map2Thought: Explicit 3D Spatial Reasoning via Metric Cognitive Maps
We propose Map2Thought, a framework that enables explicit and interpretable spatial reasoning for 3D VLMs. The framework is grounded in two key components: Metric Cognitive Map (Metric-CogMap) and Cog...
7.0 viability
Research Paper·Mar 9, 2026
Boosting MLLM Spatial Reasoning with Geometrically Referenced 3D Scene Representations
While Multimodal Large Language Models (MLLMs) have achieved remarkable success in 2D visual understanding, their ability to reason about 3D space remains limited. To address this gap, we introduce ge...
7.0 viability
Research Paper·Mar 10, 2026
Context-Nav: Context-Driven Exploration and Viewpoint-Aware 3D Spatial Reasoning for Instance Navigation
Text-goal instance navigation (TGIN) asks an agent to resolve a single, free-form description into actions that reach the correct object instance among same-category distractors. We present \textit{Co...
3.0 viability