AI Monitoring Comparison Hub
3 papers - average viability score 5.7
Top Papers
- How does information access affect LLM monitors' ability to detect sabotage? (9.0)
Develop a robust LLM monitoring tool that uses the extract-and-evaluate method to detect sabotage effectively with minimal information.
- Self-Attribution Bias: When AI Monitors Go Easy on Themselves (5.0)
Develop a tool that enhances the reliability of AI monitors by mitigating self-attribution bias in agentic systems.
- Reasoning Models Struggle to Control their Chains of Thought (3.0)
Develop an evaluation suite that measures chain-of-thought controllability in reasoning models to ensure their monitorability.