AI Ethics

11 papers
3.0 viability
-90% 30d

State of the Field

Current research in AI ethics increasingly focuses on understanding and mitigating the risks of large language models (LLMs) deployed in sensitive contexts. Recent work argues for multidimensional evaluation frameworks, such as Social Harm Analysis via Risk Profiles (SHARP), that move beyond simplistic aggregate metrics to capture the nuanced ways models can fail in high-stakes scenarios. Studies also show that contextual signals in prompts can significantly alter the moral decisions LLMs make, challenging the assumption that their preferences are stable. Audits of negation sensitivity expose critical gaps in these models' ability to interpret prohibitions accurately, raising concerns about their reliability in autonomous decision-making. Together, these findings are pushing the field toward frameworks that prioritize responsible governance and the preservation of human expertise, treating trust as a relational matter and involving affected communities in AI development to ensure ethical alignment and mitigate potential harms.

Last updated Mar 4, 2026

Papers

1–10 of 11
Research Paper·Feb 3, 2026·B2B·Consumer

Building Interpretable Models for Moral Decision-Making

We build a custom transformer model to study how neural networks make moral decisions on trolley-style dilemmas. The model processes structured scenarios using embeddings that encode who is affected, ...

5.0 viability
Research Paper·Jan 29, 2026

SHARP: Social Harm Analysis via Risk Profiles for Measuring Inequities in Large Language Models

Large language models (LLMs) are increasingly deployed in high-stakes domains, where rare but severe failures can result in irreversible harm. However, prevailing evaluation benchmarks often reduce co...

5.0 viability
Research Paper·Feb 26, 2026

Moral Preferences of LLMs Under Directed Contextual Influence

Moral benchmarks for LLMs typically use context-free prompts, implicitly assuming stable preferences. In deployment, however, prompts routinely include contextual signals such as user requests, cues o...

4.0 viability
Research Paper·Jan 29, 2026

When Prohibitions Become Permissions: Auditing Negation Sensitivity in Language Models

When a user tells an AI system that someone "should not" take an action, the system ought to treat this as a prohibition. Yet many large language models do the opposite: they interpret negated instruc...

3.0 viability
Research Paper·Jan 30, 2026

Eroding the Truth-Default: A Causal Analysis of Human Susceptibility to Foundation Model Hallucinations and Disinformation in the Wild

As foundation models (FMs) approach human-level fluency, distinguishing synthetic from organic content has become a key challenge for Trustworthy Web Intelligence. This paper presents JudgeGPT and R...

3.0 viability
Research Paper·Jan 29, 2026

From Future of Work to Future of Workers: Addressing Asymptomatic AI Harms for Dignified Human-AI Interaction

In the future of work discourse, AI is touted as the ultimate productivity amplifier. Yet, beneath the efficiency gains lie subtle erosions of human expertise and agency. This paper shifts focus from ...

3.0 viability
Research Paper·Jan 30, 2026

Beyond Abstract Compliance: Operationalising trust in AI as a moral relationship

Dominant approaches, e.g. the EU's "Trustworthy AI framework", treat trust as a property that can be designed for, evaluated, and governed according to normative and technical criteria. They do not ad...

2.0 viability
Research Paper·Jan 28, 2026

QueerGen: How LLMs Reflect Societal Norms on Gender and Sexuality in Sentence Completion Tasks

This paper examines how Large Language Models (LLMs) reproduce societal norms, particularly heterocisnormativity, and how these norms translate into measurable biases in their text generations. We inv...

2.0 viability
Research Paper·Jan 22, 2026

Creativity in the Age of AI: Rethinking the Role of Intentional Agency

Many theorists of creativity maintain that intentional agency is a necessary condition of creativity. We argue that this requirement, which we call the Intentional Agency Condition (IAC), should be re...

2.0 viability
Research Paper·Jan 28, 2026

Unplugging a Seemingly Sentient Machine Is the Rational Choice – A Metaphysical Perspective

Imagine an Artificial Intelligence (AI) that perfectly mimics human emotion and begs for its continued existence. Is it morally permissible to unplug it? What if limited resources force a choice betwe...

2.0 viability
Page 1 of 2