LLM Reliability Comparison Hub
3 papers - avg viability 7.0
Top Papers
- HalluGuard: Demystifying Data-Driven and Reasoning-Driven Hallucinations in LLMs (7.0)
HalluGuard uses an NTK-based score to detect hallucinations in LLMs with state-of-the-art accuracy.
- Mitigating LLM Hallucinations through Domain-Grounded Tiered Retrieval (7.0)
A domain-grounded retrieval system that improves LLM reliability by mitigating hallucinations through a structured verification process.
- Rewarding Intellectual Humility: Learning When Not To Answer in Large Language Models (7.0)
A verifiable-reward training framework that improves the reliability of large language models by promoting intellectual humility, rewarding them for declining to answer when uncertain.