Image Captioning Comparison Hub
4 papers - avg viability 6.8
Top Papers
- Hyperdimensional Cross-Modal Alignment of Frozen Language and Image Models for Efficient Image Captioning (7.0)
HDFLIM offers efficient image captioning by aligning frozen language and image models through symbolic operations in a shared high-dimensional space.
- VIVECaption: A Split Approach to Caption Quality Improvement (7.0)
VIVECaption improves image-caption alignment with a two-sided approach, gold-standard dataset creation plus model alignment, to provide high-quality training data for generative models.
- Imagine How To Change: Explicit Procedure Modeling for Change Captioning (7.0)
ProCap generates change captions by modeling the dynamic procedure between two images using a two-stage encoder-decoder framework, offering a more nuanced description of visual differences.
- CCCaption: Dual-Reward Reinforcement Learning for Complete and Correct Image Captioning (6.0)
CCCaption trains image captioning models with dual-reward reinforcement learning so that generated descriptions are both more complete and more correct.
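The average viability in the header appears consistent with the four listed scores, assuming it is a simple arithmetic mean rounded half-up to one decimal place. The hub does not state how the figure is computed, so the rounding rule and the names `SCORES` and `average_viability` below are illustrative assumptions. A minimal Python sketch:

```python
from decimal import Decimal, ROUND_HALF_UP

# Viability scores copied from the list above (keys are the short
# method names used in the summaries).
SCORES = {
    "HDFLIM": Decimal("7.0"),
    "VIVECaption": Decimal("7.0"),
    "ProCap": Decimal("7.0"),
    "CCCaption": Decimal("6.0"),
}


def average_viability(scores):
    """Arithmetic mean rounded half-up to one decimal (assumed convention)."""
    values = list(scores)
    mean = sum(values, Decimal("0")) / len(values)
    return mean.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


print(average_viability(SCORES.values()))  # prints 6.8
```

The unrounded mean is 27.0 / 4 = 6.75, which rounds to the 6.8 shown in the header.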