Papers
1–3 of 3
Research Paper·Mar 10, 2026
ALARM: Audio-Language Alignment for Reasoning Models
Large audio language models (ALMs) extend LLMs with auditory understanding. A common approach freezes the LLM and trains only an adapter on self-generated targets. However, this fails for reasoning LL...
7.0 viability
Research Paper·Mar 15, 2026
Nudging Hidden States: Training-Free Model Steering for Chain-of-Thought Reasoning in Large Audio-Language Models
Chain-of-thought (CoT) prompting has been extended to large audio-language models (LALMs) to elicit reasoning, yet enhancing its effectiveness without training remains challenging. We study inference-...
6.0 viability
Research Paper·Mar 10, 2026
MUGEN: Evaluating and Improving Multi-audio Understanding of Large Audio-Language Models
While multi-audio understanding is critical for large audio-language models (LALMs), it remains underexplored. We introduce MUGEN, a comprehensive benchmark evaluating this capability across speech, g...
4.0 viability