AI Safety and Ethics Comparison Hub
3 papers - avg viability 3.3
Top Papers
- Language Models Exhibit Inconsistent Biases Towards Algorithmic Agents and Human Experts (5.0)
Investigates LLMs' inconsistent biases in decision-making involving algorithmic agents and human experts.
- Position: Capability Control Should be a Separate Goal From Alignment(3.0)
Proposes a defense-in-depth approach that treats capability control as a goal distinct from alignment in AI systems.
- When Agents Persuade: Propaganda Generation and Mitigation in LLMs(2.0)
Develops tools that mitigate propaganda generation in LLMs using optimization techniques.