Visual Reasoning Comparison Hub
4 papers - avg viability 6.8
Top Papers
- VisDoT: Enhancing Visual Reasoning through Human-Like Interpretation Grounding and Decomposition of Thought (8.0)
VisDoT enhances visual reasoning in charts through human-like interpretation and decomposition of thought.
- CodePercept: Code-Grounded Visual STEM Perception for MLLMs (8.0)
CodePercept enhances visual reasoning in STEM for MLLMs by leveraging executable code as a perceptual medium.
- Through the Lens of Contrast: Self-Improving Visual Reasoning in VLMs (7.0)
This paper improves the visual reasoning accuracy of VLMs by leveraging contrastive learning in a self-improving setup.
- MM-CondChain: A Programmatically Verified Benchmark for Visually Grounded Deep Compositional Reasoning (4.0)
MM-CondChain is a benchmark for evaluating visually grounded deep compositional reasoning in multimodal large language models.