LLM Calibration Comparison Hub
3 papers - avg viability 5.0
Top Papers
- Noise-Response Calibration: A Causal Intervention Protocol for LLM-Judges (7.0)
A calibration protocol for LLMs to improve their reliability as automated judges in low-label settings.
- How do LLMs Compute Verbal Confidence (5.0)
A study of how LLMs compute verbal confidence, aimed at improving our understanding of model uncertainty.
- From Entropy to Calibrated Uncertainty: Training Language Models to Reason About Uncertainty (3.0)
A pipeline for efficiently inferring calibrated uncertainty estimates from LLMs in high-stakes domains.