4 papers - avg viability 6.5
A benchmark and analysis revealing directional task interference in multimodal LLMs, enabling targeted improvements for more robust conversational AI.
An uncertainty-aware knowledge distillation framework that adaptively balances data and teacher supervision to improve multimodal vision-language models.
A method to enhance multimodal large language models' performance on visual-text tasks by addressing the modality gap.
A multimodal LLM decoder enhancement tool that uses LoRA interventions to improve the emotional accessibility of model outputs.