LD-DEM: Latent Diffusion with Conditional Decoding for High-Precision Planetary DEM Generation from RGB Satellite Images
Abstract
1. Introduction
- We propose LD-DEM, a DEM generation model based on latent diffusion. It uses a controllable denoising algorithm to stabilize training and improve DEM accuracy. In our experiments, LD-DEM reduces the mean absolute error (MAE) by 17.8–37.5% relative to the baseline algorithms (pix2pix and VAE-LD).
- We develop a conditional decoder module that integrates RGB image features to enhance terrain detail while balancing reconstruction quality against computational cost.
- We construct datasets for DEM generation on the Moon and Mars. Building them involved correcting mismatches between satellite images and DEM data, normalizing surface heights, and selecting representative terrains, so that the datasets can support both this research and future studies.
2. Related Works
3. Methods
3.1. Denoising Diffusion Probabilistic Model
3.2. Latent Diffusion-Based DEM Generation Algorithm
3.3. Translator
3.4. Denoiser
3.5. Conditional Decoder
3.6. Model Training
Algorithm 1. Inference process
4. Experiment and Results
4.1. Experimental Settings and Environment
4.1.1. Dataset
4.1.2. Latent Vector Data Generation
4.1.3. Environment
4.1.4. Evaluation Metrics
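The results tables report MAE and RMSE in metres. The snippet below is a minimal sketch of these two metrics, assuming the standard per-pixel definitions over flattened height values. The function names and the sample heights are illustrative and are not taken from the paper.

```python
import math
from typing import Sequence


def mae(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Mean absolute error between predicted and reference heights (metres)."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(pred)


def rmse(pred: Sequence[float], truth: Sequence[float]) -> float:
    """Root mean square error between predicted and reference heights (metres)."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))


# Hypothetical heights for four pixels, for illustration only.
predicted = [10.0, 12.0, 15.0, 11.0]
reference = [11.0, 12.0, 13.0, 11.0]

print(mae(predicted, reference))   # absolute errors are 1, 0, 2, 0
print(rmse(predicted, reference))  # squared errors are 1, 0, 4, 0
```

RMSE weights large deviations more heavily than MAE, so it is never smaller than MAE on the same data.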
4.2. Results and Analysis
Ablation Experiments
5. Summary
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
DEM | Digital Elevation Model |
CNNs | Convolutional Neural Networks |
GANs | Generative Adversarial Networks |
LD-DEM | Latent Diffusion-based DEM |
DDPM | Denoising Diffusion Probabilistic Model |
VAE | Variational Autoencoder |
DDIM | Denoising Diffusion Implicit Model |
Dataset | Training Data | Testing Data |
---|---|---|
Moon | 3049 | 338 |
Mars | 9615 | 2410 |
Dataset | Metrics | pix2pix | VAE-LD | LD-DEM (Ours) |
---|---|---|---|---|
Lunar | MAE (m) | 2.978 | 2.264 | 1.861 |
Lunar | RMSE (m) | 3.165 | 2.719 | 2.256 |
Mars | MAE (m) | 2.953 | 2.714 | 2.158 |
Mars | RMSE (m) | 3.045 | 3.134 | 2.507 |
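The reported 17.8–37.5% reduction in MAE can be checked against the comparison table. The sketch below copies the MAE values from that table and computes the relative reduction of LD-DEM against each baseline. The helper names are illustrative.

```python
# MAE values in metres, copied from the comparison table.
mae_table = {
    "Lunar": {"pix2pix": 2.978, "VAE-LD": 2.264, "LD-DEM": 1.861},
    "Mars": {"pix2pix": 2.953, "VAE-LD": 2.714, "LD-DEM": 2.158},
}


def relative_reduction(baseline: float, ours: float) -> float:
    """Percentage by which `ours` is lower than `baseline`."""
    return 100.0 * (baseline - ours) / baseline


def mae_reductions(table: dict) -> dict:
    """Relative MAE reduction of LD-DEM against every baseline, per dataset."""
    result = {}
    for dataset, row in table.items():
        ours = row["LD-DEM"]
        for method, value in row.items():
            if method != "LD-DEM":
                result[(dataset, method)] = round(relative_reduction(value, ours), 1)
    return result


reductions = mae_reductions(mae_table)
for (dataset, method), pct in reductions.items():
    print(f"{dataset} vs {method}: {pct}% lower MAE")
```

Both ends of the reported range come from the lunar dataset (17.8% against VAE-LD and 37.5% against pix2pix). The Mars reductions, 20.5% and 26.9%, fall inside it.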
Decoder | Metrics | 50 Iterations | 100 Iterations |
---|---|---|---|
VAE-LD | MAE (m) ↓ | | |
VAE-LD | RMSE (m) ↓ | | |
LD-DEM | MAE (m) ↓ | | |
LD-DEM | RMSE (m) ↓ | | |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Sun, L.; Zhou, H.; Yang, L.; Zhao, D.; Zhang, D. LD-DEM: Latent Diffusion with Conditional Decoding for High-Precision Planetary DEM Generation from RGB Satellite Images. Aerospace 2025, 12, 658. https://doi.org/10.3390/aerospace12080658