AI Monitoring Comparison Hub
3 papers - average viability score 5.7
Top Papers
- How does information access affect LLM monitors' ability to detect sabotage? (9.0)
Develop a robust LLM monitoring tool that uses the extract-and-evaluate method to detect sabotage effectively with minimal information.
- Self-Attribution Bias: When AI Monitors Go Easy on Themselves (5.0)
Develop a tool that enhances the reliability of AI monitors by mitigating self-attribution bias in agentic systems.
- Reasoning Models Struggle to Control their Chains of Thought (3.0)
Develop an evaluation suite that measures chain-of-thought controllability in reasoning models to ensure their monitorability.