Current research in AI ethics increasingly focuses on understanding and mitigating the risks of deploying large language models (LLMs) in sensitive contexts. Recent work highlights the need for multidimensional evaluation frameworks, such as Social Harm Analysis via Risk Profiles, which move beyond simplistic metrics to capture the nuanced ways models can fail in high-stakes scenarios. Other studies show that contextual influences can significantly alter moral decision-making in LLMs, challenging the assumption that models hold stable preferences. Audits of negation sensitivity expose critical gaps in these models' ability to interpret prohibitions accurately, raising concerns about their reliability in autonomous decision-making. As these insights accumulate, the field is shifting toward frameworks that prioritize responsible governance and the preservation of human expertise, emphasizing relational trust and community involvement in AI development as means of ensuring ethical alignment and mitigating potential harms.
Top papers
- Building Interpretable Models for Moral Decision-Making (5.0)
- SHARP: Social Harm Analysis via Risk Profiles for Measuring Inequities in Large Language Models (5.0)
- Moral Preferences of LLMs Under Directed Contextual Influence (4.0)
- When Prohibitions Become Permissions: Auditing Negation Sensitivity in Language Models (3.0)
- Eroding the Truth-Default: A Causal Analysis of Human Susceptibility to Foundation Model Hallucinations and Disinformation in the Wild (3.0)
- From Future of Work to Future of Workers: Addressing Asymptomatic AI Harms for Dignified Human-AI Interaction (3.0)
- Beyond Abstract Compliance: Operationalising trust in AI as a moral relationship (2.0)
- QueerGen: How LLMs Reflect Societal Norms on Gender and Sexuality in Sentence Completion Tasks (2.0)
- Creativity in the Age of AI: Rethinking the Role of Intentional Agency (2.0)
- Unplugging a Seemingly Sentient Machine Is the Rational Choice -- A Metaphysical Perspective (2.0)
- AI and My Values: User Perceptions of LLMs' Ability to Extract, Embody, and Explain Human Values from Casual Conversations (2.0)