State of the Field
Recent advances in diffusion models focus on improving efficiency and accessibility while addressing inherent challenges in generative tasks. A unified framework for diffusion language modeling has emerged, streamlining training and deployment, which could significantly improve the reproducibility of research and facilitate broader adoption in commercial applications. Novel decoding strategies are also being developed to optimize the generation process, balancing quality against speed, which is crucial for real-time applications. Researchers are tackling scalability as well, introducing dynamic mechanisms that adapt to the complexity of the content being generated and thereby reduce computational cost. Furthermore, methods that enforce hard constraints during generation are gaining traction, particularly for safety-critical applications. This collective effort not only improves the performance of diffusion models but also opens avenues for their integration into diverse industries, from content creation to automated reasoning systems.
Papers
dLLM: Simple Diffusion Language Modeling
Although diffusion language models (DLMs) are evolving quickly, many recent models converge on a set of shared components. These components, however, are distributed across ad-hoc research codebases o...
Search or Accelerate: Confidence-Switched Position Beam Search for Diffusion Language Models
Diffusion Language Models (DLMs) generate text by iteratively denoising a masked sequence, repeatedly deciding which positions to commit at each step. Standard decoding follows a greedy rule: unmask t...
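The greedy rule described here can be sketched in a few lines. The snippet below is a toy illustration, not the paper's method: `toy_model` is a stand-in for a real DLM forward pass, and the commit budget per step is a simplifying assumption.

```python
# Sketch of greedy confidence-based decoding for masked diffusion LMs:
# at each step, score every still-masked position and commit (unmask)
# the most confident ones. `toy_model` is a hypothetical stand-in for
# a real DLM forward pass.
import numpy as np

MASK = -1

def toy_model(seq):
    """Return (proposed token, confidence) for every position."""
    rng = np.random.default_rng(0)  # fixed seed: deterministic toy proposals
    tokens = rng.integers(0, 100, size=len(seq))
    conf = rng.random(len(seq))
    return tokens, conf

def greedy_unmask(seq, steps):
    seq = np.array(seq)
    per_step = max(1, int(np.ceil((seq == MASK).sum() / steps)))
    for _ in range(steps):
        masked = np.flatnonzero(seq == MASK)
        if masked.size == 0:
            break
        tokens, conf = toy_model(seq)
        # Greedy rule: commit the highest-confidence masked positions.
        order = masked[np.argsort(conf[masked])[::-1]]
        for pos in order[:per_step]:
            seq[pos] = tokens[pos]
    return seq

out = greedy_unmask([MASK] * 8, steps=4)
```

After `steps` iterations every position is committed; the paper's contribution is deciding when to deviate from this purely greedy schedule.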
Error as Signal: Stiffness-Aware Diffusion Sampling via Embedded Runge-Kutta Guidance
Classifier-Free Guidance (CFG) has established the foundation for guidance mechanisms in diffusion models, showing that well-designed guidance proxies significantly improve conditional generation and ...
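For reference, the standard classifier-free guidance update (Ho and Salimans) that this line of work builds on mixes conditional and unconditional noise predictions, with guidance weight $w$ and null condition $\varnothing$:

```latex
\tilde{\epsilon}_\theta(x_t, c) \;=\; \epsilon_\theta(x_t, \varnothing)
  \;+\; w \,\bigl(\epsilon_\theta(x_t, c) - \epsilon_\theta(x_t, \varnothing)\bigr)
```

Setting $w = 1$ recovers the plain conditional prediction; larger $w$ strengthens conditioning at the cost of sample diversity.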
Fast and Scalable Analytical Diffusion
Analytical diffusion models offer a mathematically transparent path to generative modeling by formulating the denoising score as an empirical-Bayes posterior mean. However, this interpretability comes...
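The empirical-Bayes posterior mean mentioned here is computable in closed form for Gaussian noise: by Tweedie's formula evaluated on the empirical data distribution, the denoiser $\mathbb{E}[x_0 \mid x_t]$ is a softmax-weighted average of the training points. The sketch below illustrates that standard identity, not this paper's scalability techniques.

```python
# Empirical-Bayes posterior mean denoiser over a finite dataset:
# with Gaussian noise of scale sigma, E[x0 | xt] is a softmax-weighted
# average of the training points, with weights given by the Gaussian
# likelihood of xt under each point.
import numpy as np

def posterior_mean(xt, data, sigma):
    # log N(xt; x_i, sigma^2 I), up to a constant shared by all i
    logw = -((xt - data) ** 2).sum(axis=1) / (2 * sigma ** 2)
    w = np.exp(logw - logw.max())  # subtract max for numerical stability
    w /= w.sum()
    return w @ data

data = np.array([[0.0, 0.0], [4.0, 4.0]])
# A noisy point near the first datum denoises toward it.
x_hat = posterior_mean(np.array([0.5, -0.2]), data, sigma=0.5)
```

The cost of this transparency is the quadratic interaction with the dataset, which is exactly the bottleneck the paper's "fast and scalable" framing targets.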
DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers
Diffusion Transformers (DiTs) have achieved state-of-the-art performance in image and video generation, but their success comes at the cost of heavy computation. This inefficiency is largely due to th...
One Token Is Enough: Improving Diffusion Language Models with a Sink Token
Diffusion Language Models (DLMs) have emerged as a compelling alternative to autoregressive approaches, enabling parallel text generation with competitive performance. Despite these advantages, there ...
EntRGi: Entropy Aware Reward Guidance for Diffusion Language Models
Reward guidance has been applied to great success in the test-time adaptation of continuous diffusion models; it updates each denoising step using the gradients from a downstream reward model. We stud...
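The reward-guidance recipe described here has a simple generic form: after each denoising update, nudge the sample along the gradient of a downstream reward. The sketch below uses a toy quadratic reward and a stand-in shrinkage "denoiser"; the step sizes and reward are illustrative assumptions, not the paper's entropy-aware scheme.

```python
# Generic test-time reward guidance: interleave denoising updates with
# gradient ascent on a downstream reward. `reward` and the shrinkage
# drift are toy stand-ins for a real reward model and sampler.
import numpy as np

target = np.array([1.0, -1.0])

def reward(x):
    return -((x - target) ** 2).sum()

def reward_grad(x):
    return -2 * (x - target)  # analytic gradient of the toy reward

def guided_denoise(x0, steps=100, shrink=0.9, guidance=0.05):
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = shrink * x                      # stand-in denoising drift
        x = x + guidance * reward_grad(x)   # reward-gradient nudge
    return x

guided = guided_denoise([3.0, 3.0])
plain = guided_denoise([3.0, 3.0], guidance=0.0)
```

With guidance on, the trajectory is pulled toward the high-reward region; the open question the paper studies is how to apply such gradients when the diffusion is over discrete text.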
Conditional Diffusion Guidance under Hard Constraint: A Stochastic Analysis Approach
We study conditional generation in diffusion models under hard constraints, where generated samples must satisfy prescribed events with probability one. Such constraints arise naturally in safety-crit...
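One common baseline for hard constraints, shown below as a hedged sketch rather than the paper's stochastic-analysis method, is to project each intermediate sample onto the constraint set so the final sample satisfies the constraint with probability one. The constraint here (a unit ball) and the update are illustrative assumptions.

```python
# Projection-based hard-constraint sampling: after every denoising
# update, project the sample onto the constraint set (here, the unit
# ball), so the final output satisfies the constraint exactly.
import numpy as np

def project_to_ball(x, radius=1.0):
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def constrained_sample(steps=20, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.normal(size=2)
    for _ in range(steps):
        x = 0.8 * x + 0.2 * rng.normal(size=2)  # stand-in denoising update
        x = project_to_ball(x)                  # enforce the hard constraint
    return x

x = constrained_sample()
```

Naive projection can distort the sampling distribution, which is one reason a principled stochastic analysis of constrained guidance is of interest.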
Coupled Inference in Diffusion Models for Semantic Decomposition
Many visual scenes can be described as compositions of latent factors. Effective recognition, reasoning, and editing often require not only forming such compositional representations, but also solving...
CoDAR: Continuous Diffusion Language Models are More Powerful Than You Think
We study why continuous diffusion language models (DLMs) have lagged behind discrete diffusion approaches despite their appealing continuous generative dynamics. Under a controlled token-recovery stu...