Isotropic Reconstruction of Anisotropic vEM Volumes with ViT-Guided Diffusion
Abstract
1. Introduction
- We propose a two-stage training framework for cross-axial super-resolution in vEM. The framework recovers axial detail, improves cross-slice structural consistency, and reduces biologically implausible pseudo-textures.
- We perform self-supervised pretraining on large-scale high-resolution slices. This yields representation priors adapted to vEM texture statistics and ultrastructural patterns.
- We validate the framework’s effectiveness on vEM datasets through 3D reconstruction experiments. Results demonstrate improvements in both quantitative metrics and visual quality.
2. Related Work
3. Preliminaries
4. Methodology
4.1. Framework Overview
- Axial information loss is an irreversible, systematic degradation. Section thickness, the point spread function (PSF), and the sampling interval jointly cause low-pass blurring along the axial direction, which blunts boundaries and makes structures discontinuous.
- Cross-slice consistency serves as a critical constraint. Inconsistent generated details between adjacent slices cause organelle boundary jumps. They also break thin elongated structures, creating 3D artifacts that affect segmentation and tracing.
- The domain distribution differs significantly from that of natural images. vEM images exhibit distinct contrast, noise patterns, and texture statistics. Directly transferring generic perceptual networks often fails to provide reliable structural priors and may even amplify pseudo-textures.
- A conditional diffusion model generates multiple plausible solutions through progressive denoising, while its denoiser backbone provides a multi-scale local inductive bias. This alleviates the over-smoothing and artifact accumulation of single-step reconstruction and makes fine details more perceptually realistic.
- Self-supervised ViT features serve as domain priors. They construct perceptual constraints in feature space. These constraints suppress biologically implausible hallucinated details. They also reduce cross-slice drift by enforcing structural consistency.
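The axial degradation described in the first point of this list can be made concrete with a small simulation. The sketch below is only an illustration under stated assumptions: a 1D Gaussian kernel stands in for the combined effect of section thickness and PSF, and the names `degrade_axially`, `sigma`, and `factor` are hypothetical and not taken from the paper.

```python
import numpy as np


def gaussian_kernel_1d(sigma: float, radius: int) -> np.ndarray:
    """Normalized 1D Gaussian kernel with 2 * radius + 1 taps."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()


def degrade_axially(volume: np.ndarray, factor: int = 4, sigma: float = 1.5) -> np.ndarray:
    """Blur a (Z, Y, X) volume along Z only, then keep every `factor`-th slice.

    Assumption: the Gaussian is a stand-in for the real axial degradation,
    which the source attributes to section thickness, PSF, and sampling interval.
    """
    radius = int(np.ceil(3 * sigma))
    kernel = gaussian_kernel_1d(sigma, radius)
    # Reflect-pad along Z so the blurred volume keeps the input depth.
    padded = np.pad(volume, ((radius, radius), (0, 0), (0, 0)), mode="reflect")
    blurred = np.zeros(volume.shape, dtype=np.float64)
    depth = volume.shape[0]
    for i, w in enumerate(kernel):
        # Tap i corresponds to the axial offset i - radius.
        blurred += w * padded[i : i + depth]
    # Sub-sampling models the coarse axial sampling interval.
    return blurred[::factor]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    iso = rng.random((16, 8, 8))      # toy isotropic volume
    aniso = degrade_axially(iso, factor=4, sigma=1.5)
    print(aniso.shape)                # (4, 8, 8)
```

Because the blur acts only along Z, the lateral (XY) content of each retained slice is mixed with its axial neighbours but is not resampled laterally, which is the asymmetry that cross-axial super-resolution tries to undo.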
4.2. Self-Supervised ViT Feature Pretraining
4.3. Conditional Diffusion Denoising Reconstruction with ViT Perceptual Constraints
Algorithm 1. Two-stage Training for Cross-axial SR
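The steps of Algorithm 1 are not reproduced here, so the following is only a hedged sketch of how a second-stage objective of this kind could look: a standard noise-prediction MSE plus a feature-space term computed with a frozen encoder. Everything specific is an assumption for illustration: the linear noise schedule, the cosine-distance form of the feature term, the weight `lam`, and the callables `denoiser` and `encoder`, which stand in for the conditional denoiser and the pretrained ViT.

```python
import numpy as np


def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta_t) for a linear schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def q_sample(x0, t, alpha_bar, noise):
    """Forward diffusion: noisy sample x_t from clean x0."""
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise


def predict_x0(x_t, t, alpha_bar, eps_hat):
    """Invert q_sample using the predicted noise."""
    return (x_t - np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alpha_bar[t])


def patch_tokens(image, patch=4):
    """Split an (H, W) image into flattened non-overlapping patches."""
    h, w = image.shape
    tiles = image.reshape(h // patch, patch, w // patch, patch).swapaxes(1, 2)
    return tiles.reshape(-1, patch * patch)


def feature_loss(encoder, a, b):
    """Mean cosine distance between per-token features of two images."""
    fa, fb = encoder(a), encoder(b)
    fa = fa / (np.linalg.norm(fa, axis=-1, keepdims=True) + 1e-8)
    fb = fb / (np.linalg.norm(fb, axis=-1, keepdims=True) + 1e-8)
    return float(np.mean(1.0 - np.sum(fa * fb, axis=-1)))


def stage2_loss(x0, cond, t, alpha_bar, denoiser, encoder, rng, lam=0.1):
    """Noise-prediction MSE plus a weighted feature-space term (hypothetical form)."""
    noise = rng.standard_normal(x0.shape)
    x_t = q_sample(x0, t, alpha_bar, noise)
    eps_hat = denoiser(x_t, cond, t)          # conditional noise prediction
    mse = float(np.mean((eps_hat - noise) ** 2))
    x0_hat = predict_x0(x_t, t, alpha_bar, eps_hat)
    return mse + lam * feature_loss(encoder, x0_hat, x0)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha_bar = make_alpha_bar(100)
    x0 = rng.standard_normal((16, 16))
    proj = np.random.default_rng(1).standard_normal((16, 8))
    toy_encoder = lambda img: patch_tokens(img, 4) @ proj   # stand-in for a frozen ViT
    toy_denoiser = lambda x_t, c, t: np.zeros_like(x_t)     # untrained placeholder
    print(stage2_loss(x0, x0, 50, alpha_bar, toy_denoiser, toy_encoder, rng))
```

In a real pipeline the encoder would be the frozen network from the self-supervised stage and gradients would flow only into the denoiser; the NumPy version above just makes the structure of the combined objective explicit.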
Algorithm 2. Inference (Cross-axial SR via Reverse Diffusion)
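For orientation, reverse-diffusion inference of the kind named in Algorithm 2 can be sketched with the generic DDPM ancestral sampler. This is the textbook sampler, not necessarily the authors' exact procedure; the `denoiser(x, cond, t)` signature, the linear `betas` schedule, and the use of `cond` as the conditioning slice are assumptions made for illustration.

```python
import numpy as np


def ddpm_sample(denoiser, cond, betas, rng):
    """Generic conditional DDPM ancestral sampling.

    denoiser(x, cond, t) is assumed to return the predicted noise for step t;
    cond is the conditioning input (for example an interpolated axial slice).
    """
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)
    x = rng.standard_normal(cond.shape)       # start from pure Gaussian noise
    for t in range(len(betas) - 1, -1, -1):
        eps_hat = denoiser(x, cond, t)
        # Posterior mean of x_{t-1} given x_t and the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bar[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:
            # Posterior variance; no noise is added at the final step.
            var = betas[t] * (1.0 - alpha_bar[t - 1]) / (1.0 - alpha_bar[t])
            x = mean + np.sqrt(var) * rng.standard_normal(x.shape)
        else:
            x = mean
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    betas = np.linspace(1e-4, 0.02, 50)
    alpha_bar = np.cumprod(1.0 - betas)
    target = rng.random((8, 8))
    # Toy "oracle" denoiser that knows the target; a trained network replaces it in practice.
    oracle = lambda x, c, t: (x - np.sqrt(alpha_bar[t]) * c) / np.sqrt(1.0 - alpha_bar[t])
    out = ddpm_sample(oracle, target, betas, rng)
    print(float(np.max(np.abs(out - target))))
```

The oracle in the demo exists only to show that the update rule is self-consistent; with a learned denoiser, different random seeds yield different plausible reconstructions for the same condition.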
5. Experiments
5.1. Experimental Setup
5.2. Quantitative Results and Comparison
| Model | Parameters (M) | PSNR (↑) | SSIM (↑) | MSE (↓) | MAE (↓) |
|---|---|---|---|---|---|
| Bicubic | / | 12.9804 ± 0.4148 | 0.2799 ± 0.0213 | 0.3766 ± 0.0366 | 0.4894 ± 0.0252 |
| SRCNN [23] | 0.0573 | 15.4597 ± 0.4392 | 0.3669 ± 0.0226 | 0.2008 ± 0.0161 | 0.3500 ± 0.0165 |
| Subpixel CNN [24] | 0.2270 | 15.4310 ± 0.4087 | 0.3646 ± 0.0214 | 0.2017 ± 0.0147 | 0.3506 ± 0.0154 |
| FNO [37] | 4.7520 | 14.5686 ± 0.2274 | 0.2967 ± 0.0146 | 0.2405 ± 0.0059 | 0.3880 ± 0.0078 |
| EDSR [35] | 1.3676 | 15.5603 ± 0.3947 | 0.3737 ± 0.0208 | 0.1957 ± 0.0132 | 0.3452 ± 0.0144 |
| WDSR [36] | 1.3345 | 15.4533 ± 0.3863 | 0.3651 ± 0.0204 | 0.2002 ± 0.0133 | 0.3496 ± 0.0143 |
| Ours | 15.677 | 20.2509 ± 0.9655 | 0.4510 ± 0.0394 | 0.0387 ± 0.0091 | 0.1546 ± 0.0137 |
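The pixel-wise metrics in the table above follow standard definitions, and the mean ± std entries can be obtained by aggregating per-sample scores. The sketch below illustrates this for PSNR, MSE, and MAE; the intensity range `data_range` is an assumption, since the normalization used for the reported numbers is not stated here, and the helper names are hypothetical. SSIM is omitted because it is usually taken from an existing library implementation.

```python
import numpy as np


def mse(pred, target):
    """Mean squared error."""
    return float(np.mean((pred - target) ** 2))


def mae(pred, target):
    """Mean absolute error."""
    return float(np.mean(np.abs(pred - target)))


def psnr(pred, target, data_range):
    """Peak signal-to-noise ratio in dB for a given intensity range."""
    err = mse(pred, target)
    if err == 0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / err))


def summarize(preds, targets, data_range):
    """Per-sample metrics aggregated as (mean, std), mirroring the table layout."""
    rows = np.array([
        [psnr(p, t, data_range), mse(p, t), mae(p, t)]
        for p, t in zip(preds, targets)
    ])
    names = ["PSNR", "MSE", "MAE"]
    return {n: (float(rows[:, i].mean()), float(rows[:, i].std())) for i, n in enumerate(names)}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    targets = [rng.random((32, 32)) for _ in range(5)]
    preds = [t + 0.05 * rng.standard_normal(t.shape) for t in targets]
    for name, (mean, std) in summarize(preds, targets, data_range=1.0).items():
        print(f"{name}: {mean:.4f} ± {std:.4f}")
```

Note that averaging PSNR over samples is not the same as computing PSNR from the averaged MSE, so the PSNR and MSE columns of such a table need not be related by the single-image formula.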
5.3. Qualitative Visual Analysis
5.4. Ablation Study
5.5. ViT Feature Similarity Heatmap Visualization
5.6. Training Dynamics Analysis
6. Limitations
7. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Denk, W.; Horstmann, H. Serial block-face scanning electron microscopy to reconstruct three-dimensional tissue nanostructure. PLoS Biol. 2004, 2, e329.
- Peddie, C.J.; Collinson, L.M. Exploring the third dimension: Volume electron microscopy comes of age. Micron 2014, 61, 9–19.
- Lichtman, J.W.; Pfister, H.; Shavit, N. The big data challenges of connectomics. Nat. Neurosci. 2014, 17, 1448–1454.
- Kasthuri, N.; Hayworth, K.J.; Berger, D.R.; Schalek, R.L.; Conchello, J.A.; Knowles-Barley, S.; Lee, D.; Vázquez-Reina, A.; Kaynig, V.; Jones, T.R.; et al. Saturated reconstruction of a volume of neocortex. Cell 2015, 162, 648–661.
- Helmstaedter, M.; Briggman, K.L.; Turaga, S.C.; Jain, V.; Seung, H.S.; Denk, W. Connectomic reconstruction of the inner plexiform layer in the mouse retina. Nature 2013, 500, 168–174.
- Motta, A.; Berning, M.; Boergens, K.M.; Staffler, B.; Beining, M.; Loomba, S.; Hennig, P.; Wissler, H.; Helmstaedter, M. Dense connectomic reconstruction in layer 4 of the somatosensory cortex. Science 2019, 366, eaay3134.
- Lichtman, J.W.; Denk, W. The big and the small: Challenges of imaging the brain’s circuits. Science 2011, 334, 618–623.
- Helmstaedter, M. Cellular-resolution connectomics: Challenges of dense neural circuit reconstruction. Nat. Methods 2013, 10, 501–507.
- Hua, Y.; Laserstein, P.; Helmstaedter, M. Large-volume en-bloc staining for electron microscopy-based connectomics. Nat. Commun. 2015, 6, 7923.
- Briggman, K.L.; Bock, D.D. Volume electron microscopy for neuronal circuit reconstruction. Curr. Opin. Neurobiol. 2012, 22, 154–161.
- Hayworth, K.J.; Xu, C.S.; Lu, Z.; Knott, G.W.; Fetter, R.D.; Tapia, J.C.; Lichtman, J.W.; Hess, H.F. Ultrastructurally smooth thick partitioning and volume stitching for large-scale connectomics. Nat. Methods 2015, 12, 319–322.
- Keys, R. Cubic convolution interpolation for digital image processing. IEEE Trans. Acoust. Speech Signal Process. 1981, 29, 1153–1160.
- Unser, M. Splines: A perfect fit for signal and image processing. IEEE Signal Process. Mag. 1999, 16, 22–38.
- Saalfeld, S.; Fetter, R.; Cardona, A.; Tomancak, P. Elastic volume reconstruction from series of ultra-thin microscopy sections. Nat. Methods 2012, 9, 717–720.
- Deng, S.; Fu, X.; Xiong, Z.; Chen, C.; Liu, D.; Chen, X.; Ling, Q.; Wu, F. Isotropic reconstruction of 3D EM images with unsupervised degradation learning. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Berlin/Heidelberg, Germany, 2020; pp. 163–173.
- Pan, M.; Gan, Y.; Zhou, F.; Liu, J.; Zhang, Y.; Wang, A.; Zhang, S.; Li, D. DiffuseIR: Diffusion models for isotropic reconstruction of 3D microscopic images. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Berlin/Heidelberg, Germany, 2023; pp. 323–332.
- Yang, H.; Wei, Q.; Sang, Y. Transform Domain Based GAN with Deep Multi-Scale Features Fusion for Medical Image Super-Resolution. Electronics 2025, 14, 3726.
- Liu, Q.; Chen, L.; Sun, Y.; Liu, L. SwinT-SRGAN: Swin Transformer Enhanced Generative Adversarial Network for Image Super-Resolution. Electronics 2025, 14, 3511.
- Lu, C.; Chen, K.; Qiu, H.; Chen, X.; Chen, G.; Qi, X.; Jiang, H. Diffusion-based deep learning method for augmenting ultrastructural imaging and volume electron microscopy. Nat. Commun. 2024, 15, 4677.
- Kazimi, B.; Ruzaeva, K.; Sandfeld, S. Self-supervised learning with generative adversarial networks for electron microscopy. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 71–81.
- Siméoni, O.; Vo, H.V.; Seitzer, M.; Baldassarre, F.; Oquab, M.; Jose, C.; Khalidov, V.; Szafraniec, M.; Yi, S.; Ramamonjisoa, M.; et al. DINOv3. arXiv 2025, arXiv:2508.10104.
- Lee, K.; Jeong, W.K. Reference-free isotropic 3D EM reconstruction using diffusion models. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention; Springer: Berlin/Heidelberg, Germany, 2023; pp. 235–245.
- Dong, C.; Loy, C.C.; He, K.; Tang, X. Image super-resolution using deep convolutional networks. IEEE Trans. Pattern Anal. Mach. Intell. 2015, 38, 295–307.
- Shi, W.; Caballero, J.; Huszár, F.; Totz, J.; Aitken, A.P.; Bishop, R.; Rueckert, D.; Wang, Z. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 1874–1883.
- Liang, J.; Cao, J.; Sun, G.; Zhang, K.; Van Gool, L.; Timofte, R. SwinIR: Image restoration using Swin Transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 1833–1844.
- Huang, H.; Abbas, H. Cross-Modality Guided Super-Resolution for Weak-Signal Fluorescence Imaging via a Multi-Channel SwinIR Framework. Electronics 2026, 15, 204.
- Shou, J.; Xiao, Z.; Deng, S.; Huang, W.; Shi, P.; Zhang, R.; Xiong, Z.; Wu, F. Learning large-factor EM image super-resolution with generative priors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 11313–11322.
- Ferede, F.A.; Khalighifar, A.; John, J.; Venkataraman, K.; Khairy, K. Z-upscaling: Optical Flow Guided Frame Interpolation for Isotropic Reconstruction of 3D EM Volumes. In Proceedings of the 2025 IEEE 22nd International Symposium on Biomedical Imaging (ISBI); IEEE: Piscataway, NJ, USA, 2025; pp. 1–5.
- Troidl, J.; Liang, Y.; Beyer, J.; Tavakoli, M.; Danzl, J.; Hadwiger, M.; Pfister, H.; Tompkin, J. niiv: Interactive Self-supervised Neural Implicit Isotropic Volume Reconstruction. In Proceedings of the International Workshop on Efficient Medical Artificial Intelligence; Springer: Berlin/Heidelberg, Germany, 2025; pp. 257–267.
- He, Y.; Zhou, Z.; Zheng, Y.; Liang, C.; Wang, Y.; Yang, X. EMGauss: Continuous Slice-to-3D Reconstruction via Dynamic Gaussian Modeling in Volume Electron Microscopy. arXiv 2025, arXiv:2512.06684.
- Zhang, Y.; Zhen, J.; Sun, S.; Liu, T.; Huo, L.; Wang, T. SCAFNet: A Semantic Compensated Adaptive Fusion Network for Remote Sensing Images Change Detection. IEEE Geosci. Remote Sens. Lett. 2026, 23, 6003405.
- Zhang, Y.; Wang, T.; Xue, L.; Lian, W.; Tao, R. ORSI Salient Object Detection via Progressive Interaction and Saliency-Guided Enhancement. IEEE Geosci. Remote Sens. Lett. 2025, 23, 6002105.
- Zhang, Y.; Liu, T.; Zhen, J.; Kang, Y.; Cheng, Y. Adaptive downsampling and scale enhanced detection head for tiny object detection in remote sensing image. IEEE Geosci. Remote Sens. Lett. 2025, 22, 6003605.
- Phelps, J.S.; Hildebrand, D.G.C.; Graham, B.J.; Kuan, A.T.; Thomas, L.A.; Nguyen, T.M.; Buhmann, J.; Azevedo, A.W.; Sustar, A.; Agrawal, S.; et al. Reconstruction of motor control circuits in adult Drosophila using automated transmission electron microscopy. Cell 2021, 184, 759–774.
- Lim, B.; Son, S.; Kim, H.; Nah, S.; Mu Lee, K. Enhanced deep residual networks for single image super-resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 136–144.
- Fan, Y.; Yu, J.; Huang, T.S. Wide-activated deep residual networks based restoration for BPG-compressed images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 2621–2624.
- Li, Z.; Kovachki, N.; Azizzadenesheli, K.; Liu, B.; Bhattacharya, K.; Stuart, A.; Anandkumar, A. Fourier neural operator for parametric partial differential equations. arXiv 2020, arXiv:2010.08895.







| Method | PSNR ↑ | SSIM ↑ | LPIPS ↓ | DISTS ↓ |
|---|---|---|---|---|
| MSE only | 19.8774 | 0.2550 | 0.5407 | 0.3682 |
| MSE + (Ours) | 19.9742 | 0.4060 | 0.4627 | 0.2930 |
| Method | MAE ↓ | MSE ↓ | PSNR ↑ | SSIM ↑ | FSIM ↑ | LPIPS ↓ | DISTS ↓ |
|---|---|---|---|---|---|---|---|
| , | 0.1594 | 0.0406 | 19.9318 | 0.1991 | 0.7018 | 0.5579 | 0.3677 |
| , | 0.2063 | 0.0716 | 17.4730 | 0.1375 | 0.6462 | 0.6252 | 0.4098 |
| , (Ours) | 0.1644 | 0.0413 | 19.9742 | 0.4060 | 0.7006 | 0.4627 | 0.2930 |
| , | 0.1560 | 0.0404 | 19.9602 | 0.2217 | 0.7270 | 0.5561 | 0.3659 |
| , | 0.1458 | 0.0342 | 20.6844 | 0.2752 | 0.7552 | 0.5395 | 0.3624 |
| , | 0.1610 | 0.0424 | 19.7508 | 0.2492 | 0.7403 | 0.5164 | 0.3368 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Qiu, J.; Wan, G.; Zhou, Z.; Liao, M.; Liu, X.; Li, X.; Du, B. Isotropic Reconstruction of Anisotropic vEM Volumes with ViT-Guided Diffusion. Electronics 2026, 15, 1181. https://doi.org/10.3390/electronics15061181

