State of Video Understanding

4 papers · avg viability 7.0

MLLMs

Top papers

Contribution-aware Token Compression for Efficient Video Understanding via Reinforcement Learning (8.0)
Think-Clip-Sample: Slow-Fast Frame Selection for Video Understanding (7.0)
Hierarchical Long Video Understanding with Audiovisual Entity Cohesion and Agentic Search (7.0)
VideoThinker: Building Agentic VideoLLMs with LLM-Guided Tool Reasoning (6.0)