State of the Field
Recent advances in generative modeling focus on improving the quality and diversity of generated outputs while addressing inherent biases and inefficiencies. Adversarial training of diffusion models is enabling the decomposition of complex data into reusable components, which can significantly improve the synthesis of diverse samples across domains such as robotics and image generation. Methods like Bi-stage Flow Refinement refine generative outputs without injecting noise, achieving higher fidelity with fewer computational resources. The integration of multi-source datasets through Wasserstein GANs addresses the limitations of traditional sequential approaches, improving the feasibility of synthetic data for urban planning and agent-based modeling. Frameworks like Ambient Dataloops iteratively refine datasets to improve model training, while conformal prediction methods introduce calibrated uncertainty estimates, which are crucial for high-stakes applications. Collectively, these developments steer the field toward more efficient, reliable, and interpretable generative systems, with significant implications for commercial data synthesis and simulation.
Papers
Rethinking Refinement: Correcting Generative Bias without Noise Injection
Generative models, including diffusion and flow-based models, often exhibit systematic biases that degrade sample quality, particularly in high-dimensional settings. We revisit refinement methods and ...
Beyond Length Scaling: Synergizing Breadth and Depth for Generative Reward Models
Recent advancements in Generative Reward Models (GRMs) have demonstrated that scaling the length of Chain-of-Thought (CoT) reasoning considerably enhances the reliability of evaluation. However, curre...
Unsupervised Decomposition and Recombination with Discriminator-Driven Diffusion Models
Decomposing complex data into factorized representations can reveal reusable components and enable synthesizing new samples via component recombination. We investigate this in the context of diffusion...
JANUS: Structured Bidirectional Generation for Guaranteed Constraints and Analytical Uncertainty
High-stakes synthetic data generation faces a fundamental Quadrilemma: achieving Fidelity to the original distribution, Control over complex logical constraints, Reliability in uncertainty estimation,...
FlashBlock: Attention Caching for Efficient Long-Context Block Diffusion
Generating long-form content, such as minute-long videos and extended texts, is increasingly important for modern generative models. Block diffusion improves inference efficiency via KV caching and bl...
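As a rough illustration of the KV caching that block diffusion builds on, the sketch below caches keys and values so that each newly generated block attends over all previously processed tokens without recomputing their projections. This is a minimal single-head NumPy sketch under our own assumptions (the `KVCache` name and interface are illustrative), not the paper's implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class KVCache:
    """Accumulates keys/values so each new block attends to the full
    prefix without recomputing past projections (single head, no batch)."""
    def __init__(self, d):
        self.K = np.zeros((0, d))
        self.V = np.zeros((0, d))

    def attend(self, Q, K_new, V_new):
        # Append the new block's keys/values, then attend over the
        # whole cached prefix plus the new block.
        self.K = np.concatenate([self.K, K_new], axis=0)
        self.V = np.concatenate([self.V, V_new], axis=0)
        d = Q.shape[-1]
        scores = Q @ self.K.T / np.sqrt(d)  # (block_len, total_len)
        return softmax(scores, axis=-1) @ self.V
```

Processing blocks through the cache gives the same result as block-causal attention over the full sequence, while each step only pays for the new block's projections.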
Enhancing Diversity and Feasibility: Joint Population Synthesis from Multi-source Data Using Generative Models
Generating realistic synthetic populations is essential for agent-based models (ABMs) in transportation and urban planning. Current methods face two major limitations. First, many rely on a single data...
Ambient Dataloops: Generative Models for Dataset Refinement
We propose Ambient Dataloops, an iterative framework for refining datasets that makes it easier for diffusion models to learn the underlying data distribution. Modern datasets contain samples of highl...
Better Source, Better Flow: Learning Condition-Dependent Source Distribution for Flow Matching
Flow matching has recently emerged as a promising alternative to diffusion-based generative models, particularly for text-to-image generation. Despite its flexibility in allowing arbitrary source dist...
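For context, vanilla flow matching trains a velocity network on straight-line paths from a source sample to a data sample; the paper's contribution is to learn a condition-dependent source rather than the fixed Gaussian assumed below. The helper name `flow_matching_targets` and the rectified-flow-style linear path are our own illustrative assumptions.

```python
import numpy as np

def flow_matching_targets(x0, x1, t):
    """Linear interpolation path x_t = (1 - t) * x0 + t * x1 with
    velocity target x1 - x0 (the quantity a velocity network regresses).
    x0: source samples (vanilla FM draws these from N(0, I)),
    x1: data samples, t: per-sample times in [0, 1]."""
    t = np.asarray(t).reshape(-1, 1)
    x_t = (1 - t) * x0 + t * x1
    v_target = x1 - x0
    return x_t, v_target
```

At t = 0 the interpolant sits at the source and at t = 1 at the data; swapping the Gaussian source for a learned, condition-dependent one changes only where x0 comes from, not these targets.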
Conformal Prediction for Generative Models via Adaptive Cluster-Based Density Estimation
Conditional generative models map input variables to complex, high-dimensional distributions, enabling realistic sample generation in a diverse set of domains. A critical challenge with these models i...
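To make the conformal idea concrete, the sketch below shows the standard split-conformal threshold: given nonconformity scores on a held-out calibration set, any region containing all points with score at or below the threshold covers the truth with probability at least 1 - alpha. This is the generic recipe, not the paper's adaptive cluster-based density estimator; the function name is an assumption.

```python
import numpy as np

def conformal_threshold(cal_scores, alpha=0.1):
    """Split-conformal quantile with the finite-sample (n + 1) correction.
    cal_scores: nonconformity scores (e.g., negative model density of the
    true sample) computed on a held-out calibration set."""
    cal_scores = np.asarray(cal_scores)
    n = len(cal_scores)
    q = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(cal_scores, min(q, 1.0), method="higher")
```

A prediction set for a new input is then all candidate outputs whose score falls at or below this threshold; the density-based score is what the paper's clustering would refine.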
Path-Guided Flow Matching for Dataset Distillation
Dataset distillation compresses large datasets into compact synthetic sets that train models to comparable performance. Despite recent progress on diffusion-based distillation, this type of method ...