4 papers - avg viability 7.0
A novel discrete diffusion model for high-dimensional visual generation that unifies understanding and generation tasks, with code available.
A neuro-symbolic framework that uses structured knowledge to bridge the simulation-to-reality gap in image translation, enabling data-efficient zero-shot transfer.
Repurposing geometric foundation models to accelerate and improve multi-view image generation for novel view synthesis.
GenMask enables direct generative training for segmentation by adapting the Diffusion Transformer (DiT) to generate masks alongside images, achieving state-of-the-art performance without complex feature-extraction pipelines.