AI Safety and Ethics Comparison Hub
3 papers - avg viability 3.3
Top Papers
- Language Models Exhibit Inconsistent Biases Towards Algorithmic Agents and Human Experts (5.0)
Investigates LLMs' inconsistent biases in decision-making involving algorithmic agents and human experts.
- Position: Capability Control Should be a Separate Goal From Alignment(3.0)
Proposes a defense-in-depth approach that treats capability control as a goal distinct from alignment in AI systems.
- When Agents Persuade: Propaganda Generation and Mitigation in LLMs(2.0)
Develops tools that mitigate propaganda generation in LLMs using optimization techniques.