Papers
1–5 of 5
Agentic Planning with Reasoning for Image Styling via Offline RL
Direct prompt-based editing often fails on complex transformations because vague and subjective prompts often require nuanced understanding of what should be changed in the image. Our core intuition i...
ImageEdit-R1: Boosting Multi-Agent Image Editing via Reinforcement Learning
With the rapid advancement of commercial multi-modal models, image editing has garnered significant attention due to its widespread applicability in daily life. Despite impressive progress, existing i...
A$^2$-Edit: Precise Reference-Guided Image Editing of Arbitrary Objects and Ambiguous Masks
We propose A$^2$-Edit, a unified inpainting framework for arbitrary object categories, which allows users to replace any target region with a reference object using only a coarse mask. To add...
GRADE: Benchmarking Discipline-Informed Reasoning in Image Editing
Unified multimodal models target joint understanding, reasoning, and generation, but current image editing benchmarks are largely confined to natural images and shallow commonsense reasoning, offering...
InEdit-Bench: Benchmarking Intermediate Logical Pathways for Intelligent Image Editing Models
Multimodal generative models have made significant strides in image editing, demonstrating impressive performance on a variety of static tasks. However, their proficiency typically does not extend to ...