
BUILDER'S SANDBOX

Build This Paper

Use an AI coding agent to implement this research.

- OpenAI Codex (AI Agent): Lightweight coding agent in your terminal.
- Claude Code (AI Agent): Agentic coding tool for terminal workflows.
- AntiGravity IDE (Scaffolding): AI agent mindset installer and workflow scaffolder.
- Cursor (IDE): AI-first code editor built on VS Code.
- VS Code (IDE): Free, open-source editor by Microsoft.

MVP Investment

$9K - $13K · 6-10 weeks

- Engineering: $8,000
- GPU Compute: $800
- SaaS Stack: $300
- Domain & Legal: $100

6mo ROI: 0.5-1x
3yr ROI: 6-15x

GPU-heavy products have higher costs but command premium pricing. Expect break-even by 12 months, then 40%+ margins at scale.
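As a quick sanity check, the line items above sum to the low end of the stated range; the return figures below are simple arithmetic on the stated ROI multiples, not forecasts:

```python
# Sanity-check the MVP budget and the returns implied by the ROI multiples.
costs = {
    "Engineering": 8_000,
    "GPU Compute": 800,
    "SaaS Stack": 300,
    "Domain & Legal": 100,
}
total = sum(costs.values())  # $9,200, the low end of the $9K-$13K range

# Apply the stated multiples to the budget range (assumed interpretation).
low_end, high_end = 9_000, 13_000
six_month = (0.5 * low_end, 1.0 * high_end)   # 0.5-1x over 6 months
three_year = (6 * low_end, 15 * high_end)     # 6-15x over 3 years
```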



Founder's Pitch

"KID is an AI tool for detecting harmful memes by grounding external knowledge in multimodal contexts, achieving SOTA performance."

Category: Content Moderation · Score: 7

Commercial Viability Breakdown (0-10 scale)

- High Potential: 7.5 (3/4 signals)
- Quick Build: 10 (4/4 signals)
- Series A Potential: 7.5 (3/4 signals)

Sources used for this analysis

- arXiv Paper: Full-text PDF analysis of the research paper
- GitHub Repository: Code availability, stars, and contributor activity
- Citation Network: Semantic Scholar citations and co-citation patterns
- Community Predictions: Crowd-sourced unicorn probability assessments

Analysis model: GPT-4o · Last scored: 1/29/2026


Why It Matters

Detecting harmful memes is crucial for content moderation on social platforms, where memes are increasingly used to convey implicit toxic messages. KID's approach enhances understanding of these messages, improving automated moderation.

Product Angle

Create a SaaS platform offering an API for automated detection of harmful memes, utilizing the dual-head learning mechanism to deliver real-time analyses for social media and online community managers.
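A minimal sketch of what the client side of such an API could look like. The field names, endpoint shape, response schema, and moderation thresholds are all illustrative assumptions, not a real service contract:

```python
import json

def build_request(image_url: str, ocr_text: str = "") -> bytes:
    """Build the JSON payload a client would POST to a hypothetical
    meme-analysis endpoint."""
    return json.dumps({"image_url": image_url, "ocr_text": ocr_text}).encode()

def moderation_action(response: dict,
                      flag_threshold: float = 0.5,
                      remove_threshold: float = 0.9) -> str:
    """Map a harmfulness score in the (assumed) API response to a
    moderation action."""
    score = response["harmful_score"]
    if score >= remove_threshold:
        return "remove"
    if score >= flag_threshold:
        return "flag_for_review"
    return "allow"

# In production the payload would be POSTed (e.g. with urllib.request or
# requests) and the JSON response fed to moderation_action.
payload = build_request("https://cdn.example.com/meme.png", "caption text")
action = moderation_action({"harmful_score": 0.93})
```

Keeping the decision thresholds on the client side lets each platform tune how aggressively it flags versus removes content.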

Disruption

Through knowledge injection and dual-head learning, KID offers a more nuanced understanding of memes and could replace simplistic models that fail to identify context-dependent harmful content.

Product Opportunity

Social media companies and online platforms face challenges with harmful content moderation. Integrating with platforms like Facebook, Instagram, TikTok, or gaming communities presents a significant market, where platform owners will pay for robust moderation solutions.

Use Case Idea

A commercial tool for social media platforms to automatically detect and flag harmful memes for moderation, integrating seamlessly with existing content management systems.

Science

KID uses a dual-head learning framework involving a label-constrained distillation process to break down meme understanding into visual evidence, background knowledge, and classification labels. It introduces knowledge injection to ground external knowledge explicitly in meme contexts, enhancing both semantic generation and classification.
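A toy illustration of the dual-head idea: one shared, knowledge-injected representation feeds both a classification head and a generation head. The embedding scheme, fusion by averaging, dimensions, and heads are all simplifications for illustration, not the paper's actual architecture:

```python
DIM = 8  # toy feature size

def embed(tokens):
    """Deterministic stand-in for a learned embedding."""
    vals = [sum(ord(c) for c in t) % 100 / 100.0 for t in tokens]
    return (vals + [0.0] * DIM)[:DIM]

def shared_encoder(meme_tokens, knowledge_tokens):
    """Knowledge injection: fuse external knowledge with the meme
    features (here by simple averaging) so both heads see a grounded
    representation."""
    meme, knowledge = embed(meme_tokens), embed(knowledge_tokens)
    return [(m + k) / 2 for m, k in zip(meme, knowledge)]

def classification_head(features):
    """Head 1: a harmfulness score in [0, 1]."""
    return sum(features) / len(features)

def generation_head(features):
    """Head 2: stand-in for generating grounded background/explanation
    text conditioned on the same fused features."""
    return f"explanation grounded in {sum(f > 0 for f in features)} fused features"

features = shared_encoder(["meme", "caption"], ["slur", "history"])
score = classification_head(features)
explanation = generation_head(features)
```

The point of the sketch is the shape of the framework: both heads consume the same knowledge-grounded representation, so improving the generation target can also constrain and improve classification.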

Method & Eval

KID was extensively tested on five multilingual datasets, outperforming previous methods by 2.1%-19.7% on harmful meme detection tasks. Ablation studies confirmed the contributions of knowledge injection and dual-head learning.

Caveats

The model could struggle with cultural contexts not covered in its training data, and the datasets would need regular updates to capture emerging memes and symbols. Biases inherent in the training data could also affect results.

Author Intelligence

- Yaocong Li, Beijing University of Posts and Telecommunications
- Le Zhang, Beijing Information Science and Technology University
- Leihan Zhang, Beijing University of Posts and Telecommunications (zhangleihan@gmail.com)
- Qiang Yan, Beijing University of Posts and Telecommunications