Papers
1–3 of 3
Research Paper·Jan 29, 2026
KID: Knowledge-Injected Dual-Head Learning for Knowledge-Grounded Harmful Meme Detection
Internet memes have become pervasive carriers of digital culture on social platforms. However, their heavy reliance on metaphors and sociocultural context also makes them subtle vehicles for harmful c...
7.0 viability
Research Paper·Feb 27, 2026
FlexGuard: Continuous Risk Scoring for Strictness-Adaptive LLM Content Moderation
Ensuring the safety of LLM-generated content is essential for real-world deployment. Most existing guardrail models formulate moderation as a fixed binary classification task, implicitly assuming a fi...
5.0 viability
Research Paper·Mar 2, 2026
GMP: A Benchmark for Content Moderation under Co-occurring Violations and Dynamic Rules
Online content moderation is essential for maintaining a healthy digital environment, and reliance on AI for this task continues to grow. Consider a user comment using national stereotypes to insult a...
5.0 viability