Current research in AI ethics increasingly focuses on understanding and mitigating the risks of deploying large language models (LLMs) in sensitive contexts. Recent work highlights the need for multidimensional evaluation frameworks, such as Social Harm Analysis via Risk Profiles, which move beyond simplistic metrics to capture the nuanced ways models can fail in high-stakes scenarios. Other studies show that contextual influences can significantly alter moral decision-making in LLMs, challenging the assumption that models hold stable preferences. Audits of negation sensitivity expose critical gaps in these models' ability to interpret prohibitions accurately, raising concerns about their reliability in autonomous decision-making. As these insights accumulate, the field is shifting toward frameworks that prioritize responsible governance and the preservation of human expertise, emphasizing relational trust and community involvement in AI development as means of ensuring ethical alignment and mitigating potential harms.
Top papers
- Building Interpretable Models for Moral Decision-Making (5.0)
- SHARP: Social Harm Analysis via Risk Profiles for Measuring Inequities in Large Language Models (5.0)
- Moral Preferences of LLMs Under Directed Contextual Influence (4.0)
- When Prohibitions Become Permissions: Auditing Negation Sensitivity in Language Models (3.0)
- Eroding the Truth-Default: A Causal Analysis of Human Susceptibility to Foundation Model Hallucinations and Disinformation in the Wild (3.0)
- From Future of Work to Future of Workers: Addressing Asymptomatic AI Harms for Dignified Human-AI Interaction (3.0)
- Beyond Abstract Compliance: Operationalising trust in AI as a moral relationship (2.0)
- QueerGen: How LLMs Reflect Societal Norms on Gender and Sexuality in Sentence Completion Tasks (2.0)
- Creativity in the Age of AI: Rethinking the Role of Intentional Agency (2.0)
- Unplugging a Seemingly Sentient Machine Is the Rational Choice -- A Metaphysical Perspective (2.0)
- AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations (2.0)