Papers
1–3 of 3
Research Paper·Jan 30, 2026
DINO-SAE: DINO Spherical Autoencoder for High-Fidelity Image Reconstruction and Generation
Recent studies have explored using pretrained Vision Foundation Models (VFMs) such as DINO for generative autoencoders, showing strong generative performance. Unfortunately, existing approaches often ...
8.0 viability
Research Paper·Mar 8, 2026
How Long Can Unified Multimodal Models Generate Images Reliably? Taming Long-Horizon Interleaved Image Generation via Context Curation
Unified multimodal models hold the promise of generating extensive, interleaved narratives, weaving text and imagery into coherent long-form stories. However, current systems suffer from a critical re...
7.0 viability
Research Paper·Mar 7, 2026
Variational Flow Maps: Make Some Noise for One-Step Conditional Generation
Flow maps enable high-quality image generation in a single forward pass. However, unlike iterative diffusion models, their lack of an explicit sampling trajectory impedes incorporating external constr...
7.0 viability