Estimated build cost: $9K-$13K over 6-10 weeks.

Founder's Pitch

"Develop a diffusion-native latent reward model for more efficient and effective preference optimization in vision-language tasks."

AI Alignment
Score: 5

Commercial Viability Breakdown (0-10 scale)

High Potential: 5 (2/4 signals)
Quick Build: 5 (2/4 signals)
Series A Potential: 5 (2/4 signals)
