Papers
Research Paper·Feb 8, 2026
VidVec: Unlocking Video MLLM Embeddings for Video-Text Retrieval
Recent studies have adapted generative Multimodal Large Language Models (MLLMs) into embedding extractors for vision tasks, typically through fine-tuning to produce universal representations. However,...
8.0 viability
Research Paper·Mar 12, 2026
INFACT: A Diagnostic Benchmark for Induced Faithfulness and Factuality Hallucinations in Video-LLMs
Despite rapid progress, Video Large Language Models (Video-LLMs) remain unreliable due to hallucinations, which are outputs that contradict either video evidence (faithfulness) or verifiable world kno...
4.0 viability