U-Turn Diffusion
Abstract
1. Introduction
1.1. Our Contributions
1.2. Related Work
2. Technical Introduction: Score-Based Diffusion
2.1. Choice of SBD—Time-Dependent Brownian Diffusion
2.2. Pre-Trained ImageNet-64 and CIFAR-10
3. Analysis of Basic SBD Model
3.1. Kolmogorov–Smirnov Test
3.2. Average of the Normalized Score Function 2-Norm
3.3. Insensitivity to Reverse Process Initialization
4. U-Turn
Algorithm 1: U-Turn
Require: $s_\theta(x, t)$ – NN approximation of the score function defined by Equation (5); the U-Turn time $T_u$; a ground-truth sample $x_0$.
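To make Algorithm 1 concrete, here is a minimal NumPy sketch, assuming the simplest Brownian forward process $dx = dW$ (so that $x(t) \mid x(0) \sim \mathcal{N}(x(0), t I)$) and an Euler–Maruyama discretization of the reverse SDE. The names `u_turn`, `score`, and `n_steps` are ours, not the paper's, and the paper's experiments use the pre-trained ImageNet-64 and CIFAR-10 models of Section 2.2 rather than this toy loop.

```python
import numpy as np

def u_turn(score, x0, T_u, n_steps=200, rng=None):
    """Sketch of the U-Turn sampler (assumed Brownian forward process).

    score(x, t) -- NN approximation of the score grad_x log p(x, t);
    x0          -- a ground-truth sample; T_u -- the U-Turn time.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Forward half in one shot: for dx = dW the increment over [0, T_u]
    # is Gaussian, so x(T_u) = x0 + sqrt(T_u) * eps.
    x = x0 + np.sqrt(T_u) * rng.standard_normal(x0.shape)
    # Reverse half: Euler-Maruyama steps on the reverse-time SDE,
    # x <- x + score(x, t) * dt + sqrt(dt) * noise, from t = T_u down to 0.
    dt = T_u / n_steps
    t = T_u
    for _ in range(n_steps):
        x = x + score(x, t) * dt + np.sqrt(dt) * rng.standard_normal(x.shape)
        t -= dt
    return x
```

The savings relative to standard generation come from the reverse half: its cost is one score evaluation per step over $[0, T_u]$ rather than over the full horizon $[0, T]$.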
4.1. Visual Examination on the ImageNet Model
4.2. FID Test: U-Turn vs. Artificial Initialization
4.3. Auto-Correlation Function of U-Turn
4.3.1. Gaussian Approximation for the U-Turn Auto-Correlation Function
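As a one-dimensional illustration of what the Gaussian approximation buys (our toy notation, not the paper's derivation): take data $x_0 \sim \mathcal{N}(0, \sigma_0^2)$ under the Brownian forward process, so $x_{T_u} = x_0 + \sqrt{T_u}\,\xi$ with $\xi \sim \mathcal{N}(0, 1)$. The exact reverse process draws the output $x_{\mathrm{out}}$ from the posterior $p(\cdot \mid x_{T_u})$; since $x_{\mathrm{out}}$ and $x_0$ are conditionally independent given $x_{T_u}$ and share the same conditional mean,

$$
\mathbb{E}[x_0 \mid x_{T_u}] = \frac{\sigma_0^2}{\sigma_0^2 + T_u}\, x_{T_u},
\qquad
\mathrm{Corr}(x_{\mathrm{out}}, x_0)
= \frac{\mathrm{Cov}\big(\mathbb{E}[x_{\mathrm{out}} \mid x_{T_u}],\, \mathbb{E}[x_0 \mid x_{T_u}]\big)}{\sigma_0^2}
= \frac{\sigma_0^2}{\sigma_0^2 + T_u}.
$$

The correlation thus decays monotonically from 1 at $T_u = 0$ (memorization) toward 0 as $T_u \to \infty$ (full independence), which is the qualitative shape the auto-correlation diagnostic tracks.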
5. Multi-Class—The Case of CIFAR-10
5.1. Visual Inspection and FID
- Both the memorization and speciation times vary depending on the initial GT sample.
- The region between these two times appears somewhat blurry, with the potential for some newly generated samples to be noisy or ambiguous (a mix of images).
- There may be more than one Speciation Transition—possibly a hierarchy of transitions, each corresponding to diffusion from the domain of the initial GT sample to a domain associated with a different class.
5.2. Conditional U-Turn: Quantitative Analysis
5.3. From Sample-Conditional to Averaged U-Turn
6. Gaussian Analysis and Gaussian- (G-) Turn
Algorithm 2: G-Turn
Require: $s_\theta(x, t)$: NN approximation of the score function from Equation (5); the U-Turn time $T_u$.
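A minimal sketch of the G-Turn idea under the same assumed Brownian forward process: the reverse SDE is initialized at time $T_u$ not by noising a particular GT sample, but by drawing from a Gaussian whose first two moments match the noised data distribution. The function `g_turn` and the moment-matching details below are our illustration.

```python
import numpy as np

def g_turn(score, data, T_u, n_steps=200, rng=None):
    """Sketch of G-Turn: start the reverse SDE from a moment-matched Gaussian.

    Under forward dx = dW, the time-T_u marginal has the data mean and
    covariance Cov(data) + T_u * I, so we sample the initialization from
    that Gaussian instead of noising one GT image.
    """
    rng = np.random.default_rng() if rng is None else rng
    flat = data.reshape(len(data), -1)          # (N, d) data matrix
    mu = flat.mean(axis=0)
    cov = np.cov(flat, rowvar=False) + T_u * np.eye(flat.shape[1])
    x = rng.multivariate_normal(mu, cov).reshape(data.shape[1:])
    # Reverse half: same Euler-Maruyama loop as in the U-Turn sketch.
    dt = T_u / n_steps
    t = T_u
    for _ in range(n_steps):
        x = x + score(x, t) * dt + np.sqrt(dt) * rng.standard_normal(x.shape)
        t -= dt
    return x
```

For image dimensions the dense $d \times d$ covariance is impractical; the snippet is meant for low-dimensional experiments, and a diagonal or low-rank approximation would be the natural substitute.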
7. U-Turn for Deterministic Samplers
- Lack of randomness in the output images: Examining the deterministic sampler for CIFAR-10 (Figure 15) and FFHQ (Figure 16), we observe that when the U-Turn occurs at $T_u$, there is no randomness (uncertainty) in the output image. For deterministic reverse processes, the resulting image remains unchanged with further increases in $T_u$ (at least for a sufficiently large $T_u$). This is in sharp contrast to stochastic reverse processes, which are the primary focus of this paper, where images generated for different values of $T_u$ exhibit noticeable variation. (A minimal sketch of this deterministic variant follows this list.)
- Absence of Speciation Transitions in some cases: In some instances, no Speciation Transition is observed (e.g., in the case of a car, as shown in the third column of Figure 15). This suggests that Speciation Transitions may not always manifest in individual dynamics but could emerge in some form of averaged behavior.
- Intermediate changes: In several cases (though not all), there is a range of U-Turn times during which changes in the output image are observed, despite the reverse dynamics being deterministic.
- Class changes with increasing U-Turn time: In some instances, we observe that classes change multiple times as $T_u$ increases, potentially due to spontaneous and dynamic symmetry breaking.
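Here is the sketch referenced in the first bullet: the reverse half is replaced by Euler steps on the probability-flow ODE $\dot{x} = -\tfrac{1}{2}\, s(x, t)$ associated (under our assumed Brownian forward process) with the same marginals; freezing one noise draw `eps` across different $T_u$ values reproduces the "no randomness in the output" behavior noted above. All names are ours.

```python
import numpy as np

def u_turn_ode(score, x0, T_u, eps, n_steps=200):
    """Deterministic U-Turn sketch via the probability-flow ODE.

    eps: a fixed standard-normal array (same shape as x0); reusing the
    same eps for different T_u removes all randomness from the output.
    """
    # Forward half with a frozen noise realization.
    x = x0 + np.sqrt(T_u) * eps
    # Reverse half: Euler steps on dx/dt = -0.5 * score(x, t),
    # integrated from t = T_u down to t = 0 (hence the + sign below).
    dt = T_u / n_steps
    t = T_u
    for _ in range(n_steps):
        x = x + 0.5 * score(x, t) * dt
        t -= dt
    return x
```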
8. Conclusions and Future Work
- Computational efficiency: By reducing the duration of both the forward and reverse diffusion processes, the U-Turn model significantly lowers computational overhead while maintaining high image quality.
- Controlled diversity: The U-Turn framework enables a controlled transition from memorization to speciation. This mechanism allows the generation of diverse synthetic images that preserve key features of the ground truth, thereby enhancing the overall utility of the generated data.
- Integration with feature-based augmentation: Our method can be readily integrated into feature-based augmentation pipelines. By increasing the diversity and quality of augmented datasets, the U-Turn model has the potential to improve downstream tasks such as image classification and object detection.
- Robustness across settings: Extensive experiments on datasets such as ImageNet and CIFAR-10 demonstrate that the U-Turn approach consistently improves performance across different settings and model architectures.
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Quality Metrics
Appendix A.1. The Fréchet Inception Distance
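For reference, the FID of Heusel et al. is the Fréchet (2-Wasserstein) distance between Gaussians fitted to Inception-v3 features of the real and generated image sets. A minimal sketch, assuming the $(N, d)$ feature matrices have already been extracted (feature extraction itself is omitted):

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_*: (N, d) arrays of Inception features.
    FID = ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2 (C_r C_g)^{1/2}).
    """
    mu_r, mu_g = feats_real.mean(0), feats_gen.mean(0)
    c_r = np.cov(feats_real, rowvar=False)
    c_g = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(c_r @ c_g)
    if np.iscomplexobj(covmean):      # drop numerical imaginary residue
        covmean = covmean.real
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(c_r + c_g - 2.0 * covmean))
```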
Appendix A.2. Kolmogorov–Smirnov Test
- $D_n > 1.63/\sqrt{n}$ (1% significance level).
- $D_n > 1.36/\sqrt{n}$ (5% significance level).
- $D_n > 1.22/\sqrt{n}$ (10% significance level).
Sample Size n | α = 0.10 | α = 0.05 | α = 0.01
---|---|---|---
10 | 0.368 | 0.410 | 0.490
50 | 0.172 | 0.192 | 0.230
100 | 0.122 | 0.136 | 0.163
1000 | 0.038 | 0.043 | 0.052
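A short sketch of the one-sample KS statistic against the standard normal CDF, with the decision rule given by the thresholds above (function names are ours):

```python
import numpy as np
from scipy.stats import norm

def ks_statistic(samples):
    """One-sample KS statistic D_n against the standard normal CDF."""
    x = np.sort(np.asarray(samples))
    n = len(x)
    cdf = norm.cdf(x)
    # D_n = sup |F_n - F|; check both sides of each empirical-CDF jump.
    d_plus = np.max(np.arange(1, n + 1) / n - cdf)
    d_minus = np.max(cdf - np.arange(0, n) / n)
    return max(d_plus, d_minus)

# Reject Gaussianity at the 5% level when D_n > 1.36 / sqrt(n).
samples = np.random.default_rng(0).standard_normal(1000)
d_n = ks_statistic(samples)
print(d_n, d_n > 1.36 / np.sqrt(len(samples)))
```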
Appendix A.3. Average of the Score Function 2-Norm
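A sketch of the diagnostic named here, under our assumed Brownian forward process: if the time-$t$ marginal were exactly $\mathcal{N}(0, t I_d)$, the score would be $-x/t$ with $\mathbb{E}\,\|s\|_2 \approx \sqrt{d/t}$, so the normalized average below should approach 1 once the dynamics have "forgotten" the data. The normalization convention is our assumption:

```python
import numpy as np

def avg_normalized_score_norm(score, samples, t):
    """Monte Carlo estimate of sqrt(t/d) * E[ ||score(x, t)||_2 ].

    Approaches 1 when the time-t marginal is close to N(0, t I_d),
    where the exact score is -x/t and E||x/t||_2 ~ sqrt(d/t).
    """
    d = samples[0].size
    norms = [np.linalg.norm(score(x, t)) for x in samples]
    return np.sqrt(t / d) * float(np.mean(norms))
```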
Appendix B. Brownian Diffusion
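For orientation, the standard relations for a Brownian forward process, stated here as a sketch in the notation used throughout our snippets (the appendix itself presumably gives the full derivation):

$$
dx(t) = dW(t), \qquad x(t) \mid x(0) \sim \mathcal{N}\big(x(0),\, t I\big),
\qquad
p(x, t) = \int p_{\mathrm{GT}}(x_0)\, \mathcal{N}\big(x;\, x_0,\, t I\big)\, dx_0,
$$

with score $s(x, t) = \nabla_x \log p(x, t)$, and the reverse-time process discretized as

$$
x(t - dt) = x(t) + s\big(x(t), t\big)\, dt + \sqrt{dt}\,\xi_t, \qquad \xi_t \sim \mathcal{N}(0, I),
$$

whose marginals match those of the forward process run backward from $t = T$.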