VLMs

VLMs (vision-language models) is an entry of unknown type in our research taxonomy.

Related papers

Vision-as-Inverse-Graphics Agent via Interleaved Multimodal Reasoning
Securing the Floor and Raising the Ceiling: A Merging-based Paradigm for Multi-modal Search Agents
Forest-Chat: Adapting Vision-Language Agents for Interactive Forest Change Analysis
Zero-shot adaptable task planning for autonomous construction robots: a comparative study of lightweight single and multi-AI agent systems
Multimodal Climate Disinformation Detection: Integrating Vision-Language Models with External Knowledge Sources