DiGS: Depth-Initialized Gaussian Splatting for Single-Object Reconstruction
Abstract
1. Introduction
- We provide a systematic analysis of the impact of initialization in 3D Gaussian Splatting, demonstrating its critical role in both early-stage optimization and final reconstruction quality.
- We propose DiGS, a depth-based initialization pipeline that leverages RGB-D data to generate dense, scale-consistent, and noise-filtered point clouds, improving the quality of the initial Gaussian distribution.
- We introduce an initialization strategy that operates independently of the optimization process, enabling seamless integration with existing 3DGS pipelines without additional training overhead.
- We design a pipeline tailored for single-object reconstruction, combining segmentation and depth filtering to reduce background artifacts and improve geometric consistency.
- We conduct extensive experiments on both synthetic and real-world datasets, including a user study, showing that our method significantly improves reconstruction quality in early iterations while maintaining comparable final performance.
2. Related Work
3. Proposed Method
3.1. Image Preprocessing
3.2. Camera Pose Estimation
- Feature detection and matching: detecting and matching local descriptors, e.g., SIFT and ORB [73], to establish correspondences across image pairs.
- Initial reconstruction: estimating the relative pose of an initial image pair and triangulating an initial sparse point cloud using robust estimation, e.g., RANSAC [74].
- Incremental registration: adding new views via Perspective-n-Point (PnP) techniques [75] and expanding the 3D point cloud.
- Bundle adjustment: refining all camera parameters and 3D point locations by minimizing the reprojection error in problem (2), whose projection operator maps 3D points to image coordinates. For readability, the solutions of the optimization problem (2) are referred to as .
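The quantity minimized in the bundle adjustment step can be illustrated with a minimal NumPy sketch. It assumes a standard pinhole camera with intrinsics `K`, rotation `R`, and translation `t`; the function names `project` and `reprojection_error` and all numeric values are hypothetical and are not taken from the paper or from any SfM library. A real pipeline would sum this residual over all cameras and hand it to a nonlinear least-squares solver.

```python
import numpy as np

def project(K, R, t, X):
    """Pinhole projection of world points X (N, 3) to pixel coordinates (N, 2)."""
    Xc = X @ R.T + t                    # world frame -> camera frame
    uvw = Xc @ K.T                      # apply the intrinsic matrix
    return uvw[:, :2] / uvw[:, 2:3]     # perspective divide

def reprojection_error(K, R, t, X, x_obs):
    """Sum of squared pixel residuals between projections and observations (one camera)."""
    residuals = project(K, R, t, X) - x_obs
    return float(np.sum(residuals ** 2))

# Illustrative values only (assumed, not from the paper).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.zeros(3)
X = np.array([[0.0, 0.0, 2.0],
              [0.2, -0.1, 4.0]])

x_obs = project(K, R, t, X)                                  # noise-free observations
err_exact = reprojection_error(K, R, t, X, x_obs)            # perfect fit
err_shifted = reprojection_error(K, R, t, X, x_obs + 1.0)    # 1 px offset on every coordinate
```

With noise-free observations the error vanishes, and it grows quadratically as the observed keypoints move away from the projected points.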
3.3. Initialization
3.3.1. Depth-Based Initialization
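As an illustration of how an RGB-D frame can seed a point cloud, the following sketch back-projects a depth map into camera-frame 3D points and discards pixels outside an object mask or beyond a depth cutoff. It is a generic pinhole back-projection under stated assumptions, not the DiGS pipeline itself: the function name `backproject_depth`, the `max_depth` cutoff, and the toy intrinsics are hypothetical, and the segmentation, noise filtering, and scale alignment of the actual method are not reproduced here.

```python
import numpy as np

def backproject_depth(depth, K, mask=None, max_depth=None):
    """Lift a depth map (H, W) to camera-frame 3D points (N, 3) with a pinhole model."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]           # pixel row (v) and column (u) indices
    valid = depth > 0                    # zero depth is treated as a missing measurement
    if mask is not None:
        valid &= mask.astype(bool)       # keep only pixels inside the object mask
    if max_depth is not None:
        valid &= depth <= max_depth      # drop far-away background readings
    z = depth[valid]
    x = (u[valid] - K[0, 2]) * z / K[0, 0]
    y = (v[valid] - K[1, 2]) * z / K[1, 1]
    return np.stack([x, y, z], axis=1)

# Toy 3x3 example (assumed values, for illustration only).
K = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 1.0]])
depth = np.array([[0.0, 1.0, 1.0],
                  [1.0, 2.0, 9.0],
                  [1.0, 1.0, 1.0]])
mask = np.array([[1, 1, 1],
                 [1, 1, 1],
                 [0, 0, 0]])             # bottom row is labelled background

points = backproject_depth(depth, K, mask=mask, max_depth=5.0)
```

The resulting points are expressed in the camera frame; placing them in a common world frame would additionally require the camera poses estimated in Section 3.2.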
3.3.2. Fusion Initialization
3.4. 3DGS Optimization
4. Experiments and Evaluation
4.1. Experimental Setup
- Mask threshold is set as .
- Laplacian kernel size equals 5.
- The parameter is set to of the standard deviation of image gradients.
- KDE scale factor precision equals .
- The k-means weights used for centroid assignment equal and .
4.2. Evaluation Metrics
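For reference, PSNR can be computed from the mean squared error between a rendered view and its ground-truth image. The sketch below uses the standard definition and assumes images stored as floating-point arrays in the range [0, 1]; the function name `psnr` and the test images are illustrative and do not come from the paper's evaluation code.

```python
import numpy as np

def psnr(reference, rendered, max_value=1.0):
    """Peak Signal-to-Noise Ratio in dB between two images of identical shape."""
    diff = reference.astype(np.float64) - rendered.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")              # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

# Illustrative images: a uniform grey image and a copy brightened by 0.1.
reference = np.full((4, 4, 3), 0.5)
rendered = reference + 0.1
value = psnr(reference, rendered)        # MSE = 0.01, i.e. about 20 dB
```

Higher PSNR indicates a closer match to the reference image.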
4.3. Results and Discussion
5. Limitations and Future Work
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
| DiGS | Depth-initialized Gaussian Splatting |
| SfM | Structure from Motion |
| 3DGS | 3D Gaussian Splatting |
| SuGaR | Surface-aligned Gaussian Splatting |
| NeRF | Neural Radiance Field |
| KDE | Kernel Density Estimation |
| RGB | Red Green Blue |
| RGB-D | Red Green Blue Depth |
| SIFT | Scale-Invariant Feature Transform |
| ORB | Oriented FAST and Rotated BRIEF |
| RANSAC | RANdom SAmple Consensus |
| PSNR | Peak Signal-to-Noise Ratio |
| LPIPS | Learned Perceptual Image Patch Similarity |
| SSIM | Structural Similarity Index Measure |
| RAM | Random Access Memory |
| GPU | Graphics Processing Unit |
| ADC | Adaptive Density Control |
| RaDe-GS | Rasterizing Depth in Gaussian Splatting |
| DN-Splatter | Depth and Normal Priors for Gaussian Splatting and Meshing |
| RAIN-GS | Relaxing Accurate Initialization Constraint for 3D Gaussian Splatting |
| MVS | Multi-view Stereo |
| CAD | Computer-aided Design |
| DCC | Digital Content Creation |
References
- Li, K.; Cui, Y.; Li, W.; Lv, T.; Yuan, X.; Li, S.; Ni, W.; Simsek, M.; Dressler, F. When internet of things meets metaverse: Convergence of physical and cyber worlds. IEEE Internet Things J. 2022, 10, 4148–4173.
- Visconti, R.M. From physical reality to the Metaverse: A Multilayer Network Valuation. J. Metaverse 2022, 2, 16–22.
- Vallasciani, G.; Stacchio, L.; Cascarano, P.; Marfia, G. CreAIXR: Fostering creativity with generative AI in XR environments. In Proceedings of the 2024 IEEE International Conference on Metaverse Computing, Networking, and Applications (MetaCom); IEEE: Piscataway, NJ, USA, 2024; pp. 1–8.
- Hajahmadi, S.; Calvi, I.; Stacchiotti, E.; Cascarano, P.; Marfia, G. Heritage elements and Artificial Intelligence as storytelling tools for virtual retail environments. Digit. Appl. Archaeol. Cult. Herit. 2024, 34, e00368.
- Hajahmadi, S.; Stacchio, L.; Giacché, A.; Cascarano, P.; Marfia, G. Investigating extended reality-powered digital twins for sequential instruction learning: The case of the rubik’s cube. In Proceedings of the 2024 IEEE International Symposium on Mixed and Augmented Reality (ISMAR); IEEE: Piscataway, NJ, USA, 2024; pp. 259–268.
- Rodríguez-García, B.; Guillen-Sanz, H.; Checa, D.; Bustillo, A. A systematic review of virtual 3D reconstructions of Cultural Heritage in immersive Virtual Reality. Multimed. Tools Appl. 2024, 83, 89743–89793.
- Phang, J.T.S.; Lim, K.H.; Chiong, R.C.W. A review of three dimensional reconstruction techniques. Multimed. Tools Appl. 2021, 80, 17879–17891.
- Cascarano, P.; Meglioraldi, J.; Vallasciani, G.; Armandi, V.; Augello, G.; Carradori, S.; Hajahmadi, S.; Marfia, G. A Comparative Analysis of 3D Modeling Methods for Integration into an Extended Reality Platform. In Proceedings of the 2025 IEEE International Conference on Artificial Intelligence and eXtended and Virtual Reality (AIxVR); IEEE: Piscataway, NJ, USA, 2025; pp. 213–217.
- Bruno, F.; Bruno, S.; De Sensi, G.; Luchi, M.L.; Mancuso, S.; Muzzupappa, M. From 3D reconstruction to virtual reality: A complete methodology for digital archaeological exhibition. J. Cult. Herit. 2010, 11, 42–49.
- Collins, J.; Goel, S.; Deng, K.; Luthra, A.; Xu, L.; Gundogdu, E.; Zhang, X.; Vicente, T.F.Y.; Dideriksen, T.; Arora, H.; et al. Abo: Dataset and benchmarks for real-world 3d object understanding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 21126–21136.
- Calli, B.; Singh, A.; Walsman, A.; Srinivasa, S.; Abbeel, P.; Dollar, A.M. The ycb object and model set: Towards common benchmarks for manipulation research. In Proceedings of the 2015 International Conference on Advanced Robotics (ICAR); IEEE: Piscataway, NJ, USA, 2015; pp. 510–517.
- Agnew, W.; Xie, C.; Walsman, A.; Murad, O.; Wang, Y.; Domingos, P.; Srinivasa, S. Amodal 3d reconstruction for robotic manipulation via stability and connectivity. In Proceedings of the Conference on Robot Learning; PMLR: Cambridge, MA, USA, 2021; pp. 1498–1508.
- Iwase, S.; Irshad, Z.; Liu, K.; Guizilini, V.; Lee, R.; Ikeda, T.; Amma, A.; Nishiwaki, K.; Kitani, K.; Ambrus, R.; et al. ZeroGrasp: Zero-Shot Shape Reconstruction Enabled Robotic Grasping. arXiv 2025, arXiv:2504.10857.
- Thrun, S. Robotic mapping: A survey. In Exploring Artificial Intelligence in the New Millennium; Morgan Kaufmann Publishers: San Francisco, CA, USA, 2002; Volume 1, pp. 1–35.
- Wang, T.W.; Huang, H.P.; Zhao, Y.L. Vision-Guided Autonomous Robot Navigation in Realistic 3D Dynamic Scenarios. Appl. Sci. 2025, 15, 2323.
- Xu, Z.; Zhan, X.; Chen, B.; Xiu, Y.; Yang, C.; Shimada, K. A real-time dynamic obstacle tracking and mapping system for UAV navigation and collision avoidance with an RGB-D camera. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA); IEEE: Piscataway, NJ, USA, 2023; pp. 10645–10651.
- Gomes, L.; Bellon, O.R.P.; Silva, L. 3D reconstruction methods for digital preservation of cultural heritage: A survey. Pattern Recognit. Lett. 2014, 50, 3–14.
- Kargas, A.; Karitsioti, N.; Loumos, G. Reinventing museums in 21st century: Implementing augmented reality and virtual reality technologies alongside social Media’s logics. In Virtual and Augmented Reality in Education, Art, and Museums; IGI Global Scientific Publishing: Hershey, PA, USA, 2020; pp. 117–138.
- Kantaros, A.; Ganetsos, T.; Petrescu, F.I.T. Three-dimensional printing and 3D scanning: Emerging technologies exhibiting high potential in the field of cultural heritage. Appl. Sci. 2023, 13, 4777.
- Wachowiak, M.J.; Karas, B.V. 3D scanning and replication for museum and cultural heritage applications. J. Am. Inst. Conserv. 2009, 48, 141–158.
- Weng, J.; Sun, J. Green landscape 3D reconstruction and VR interactive art design experience using digital entertainment technology and entertainment gesture robots. Entertain. Comput. 2025, 52, 100854.
- Zioulis, N.; Alexiadis, D.; Doumanoglou, A.; Louizis, G.; Apostolakis, K.; Zarpalas, D.; Daras, P. 3D tele-immersion platform for interactive immersive experiences between remote users. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP); IEEE: Piscataway, NJ, USA, 2016; pp. 365–369.
- Li, L.; Carnell, S.; Harris, K.; Walters, L.; Reiners, D.; Cruz-Neira, C. LIFT-A System to Create Mixed 360 Video and 3D Content for Live Immersive Virtual Field Trip. In Proceedings of the 2023 ACM International Conference on Interactive Media Experiences, Nantes, France, 12–15 June 2023; pp. 83–93.
- Richlan, F.; Weiß, M.; Kastner, P.; Braid, J. Virtual training, real effects: A narrative review on sports performance enhancement through interventions in virtual reality. Front. Psychol. 2023, 14, 1240790.
- Huang, X.; Yin, M.; Xia, Z.; Xiao, R. VirtualNexus: Enhancing 360-Degree Video AR/VR Collaboration with Environment Cutouts and Virtual Replicas. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, Pittsburgh, PA, USA, 13–16 October 2024; pp. 1–12.
- Wu, Y.; Yi, A.; Ma, C.; Chen, L. Artificial intelligence for video game visualization, advancements, benefits and challenges. Math. Biosci. Eng. 2023, 20, 15345–15373.
- Huang, Y. 3D special effects modelling based on computer graphics technology. Appl. Comput. Eng. 2024, 50, 106–112.
- Gui, Z.; Jha, S.; Delbos, B.; Moreau, R.; Chalard, R.; Lelevé, A.; Cheng, I. Interactive Manipulation and Visualization of 3D Brain MRI for Surgical Training. In Proceedings of the 2024 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC); IEEE: Piscataway, NJ, USA, 2024; pp. 1–4.
- Pathak, K.; Saikia, R.; Das, A.; Das, D.; Islam, M.A.; Pramanik, P.; Parasar, A.; Borthakur, P.P.; Sarmah, P.; Saikia, M.; et al. 3D printing in biomedicine: Advancing personalized care through additive manufacturing. Explor. Med. 2023, 4, 1135–1167.
- Clarke, E. Virtual reality simulation—The future of orthopaedic training? A systematic review and narrative analysis. Adv. Simul. 2021, 6, 2.
- Sarmah, M.; Neelima, A.; Singh, H.R. Survey of methods and principles in three-dimensional reconstruction from two-dimensional medical images. Vis. Comput. Ind. Biomed. Art 2023, 6, 15.
- Bhuskute, H.; Shende, P.; Prabhakar, B. 3D printed personalized medicine for cancer: Applications for betterment of diagnosis, prognosis and treatment. AAPS PharmSciTech 2022, 23, 8.
- Artec Europe. Artec 3D Portable Scanners. 2024. Available online: https://www.artec3d.com (accessed on 20 November 2024).
- Haleem, A.; Javaid, M.; Singh, R.P.; Rab, S.; Suman, R.; Kumar, L.; Khan, I.H. Exploring the potential of 3D scanning in Industry 4.0: An overview. Int. J. Cogn. Comput. Eng. 2022, 3, 161–171.
- Rieke-Zapp, D.; Royo, S. Structured light 3D scanning. In Digital Techniques for Documenting and Preserving Cultural Heritage; Arc Humanities Press: Yorkshire, UK, 2017; pp. 247–251.
- Scaniverse Review: Free 3D Laser Scans with Your iPhone—Structural Basics—Structuralbasics.com. Available online: https://www.structuralbasics.com/scaniverse-review/ (accessed on 25 July 2024).
- Goesele, M.; Snavely, N.; Curless, B.; Hoppe, H.; Seitz, S.M. Multi-view stereo for community photo collections. In Proceedings of the 2007 IEEE 11th International Conference on Computer Vision; IEEE: Piscataway, NJ, USA, 2007; pp. 1–8.
- Brière-Côté, A.; Rivest, L.; Maranzana, R. Comparing 3D CAD models: Uses, methods, tools and perspectives. Comput. Aided Des. Appl. 2012, 9, 771–794.
- Samavati, T.; Soryani, M. Deep learning-based 3D reconstruction: A survey. Artif. Intell. Rev. 2023, 56, 9175–9219.
- Tachella, J.; Altmann, Y.; Mellado, N.; McCarthy, A.; Tobin, R.; Buller, G.S.; Tourneret, J.Y.; McLaughlin, S. Real-time 3D reconstruction from single-photon lidar data using plug-and-play point cloud denoisers. Nat. Commun. 2019, 10, 4984.
- Yin, X.; He, J.; Cheng, Z. Efficient and lightweight 3D building reconstruction from drone imagery using sparse line and point clouds. Virtual Real. Intell. Hardw. 2025, 7, 111–126.
- Sitzmann, V.; Martel, J.; Bergman, A.; Lindell, D.; Wetzstein, G. Implicit neural representations with periodic activation functions. Adv. Neural Inf. Process. Syst. 2020, 33, 7462–7473.
- Mildenhall, B.; Srinivasan, P.P.; Tancik, M.; Barron, J.T.; Ramamoorthi, R.; Ng, R. Nerf: Representing scenes as neural radiance fields for view synthesis. Commun. ACM 2021, 65, 99–106.
- Kerbl, B.; Kopanas, G.; Leimkühler, T.; Drettakis, G. 3d gaussian splatting for real-time radiance field rendering. ACM Trans. Graph. 2023, 42, 139.
- Wu, T.; Yuan, Y.J.; Zhang, L.X.; Yang, J.; Cao, Y.P.; Yan, L.Q.; Gao, L. Recent advances in 3d gaussian splatting. Comput. Vis. Media 2024, 10, 613–642.
- Fei, B.; Xu, J.; Zhang, R.; Zhou, Q.; Yang, W.; He, Y. 3d gaussian splatting as new era: A survey. IEEE Trans. Vis. Comput. Graph. 2024, 31, 4429–4449.
- Schonberger, J.L.; Frahm, J.M. Structure-from-motion revisited. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; IEEE: Piscataway, NJ, USA, 2016; pp. 4104–4113.
- Chang, A.X.; Funkhouser, T.; Guibas, L.; Hanrahan, P.; Huang, Q.; Li, Z.; Savarese, S.; Savva, M.; Song, S.; Su, H.; et al. Shapenet: An information-rich 3d model repository. arXiv 2015, arXiv:1512.03012.
- Wang, X.; Li, P. Extraction of urban building damage using spectral, height and corner information from VHR satellite images and airborne LiDAR data. ISPRS J. Photogramm. Remote Sens. 2020, 159, 322–336.
- Altuntas, C. Review of Scanning and Pixel Array-Based LiDAR Point-Cloud Measurement Techniques to Capture 3D Shape or Motion. Appl. Sci. 2023, 13, 6488.
- Xu, J.; Xi, N.; Zhang, C.; Zhao, J.; Gao, B.; Shi, Q. Rapid 3D surface profile measurement of industrial parts using two-level structured light patterns. Opt. Lasers Eng. 2011, 49, 907–914.
- Weinmann, M.; Schwartz, C.; Ruiters, R.; Klein, R. A multi-camera, multi-projector super-resolution framework for structured light. In Proceedings of the 2011 International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission; IEEE: Piscataway, NJ, USA, 2011; pp. 397–404.
- Geiger, A.; Ziegler, J.; Stiller, C. Stereoscan: Dense 3d reconstruction in real-time. In Proceedings of the 2011 IEEE Intelligent Vehicles Symposium (IV); IEEE: Piscataway, NJ, USA, 2011; pp. 963–968.
- Furukawa, Y.; Hernández, C. Multi-view stereo: A tutorial. Found. Trends Comput. Graph. Vis. 2015, 9, 1–148.
- Khot, T.; Agrawal, S.; Tulsiani, S.; Mertz, C.; Lucey, S.; Hebert, M. Learning unsupervised multi-view stereopsis via robust photometric consistency. arXiv 2019, arXiv:1905.02706.
- Yu, A.; Ye, V.; Tancik, M.; Kanazawa, A. pixelnerf: Neural radiance fields from one or few images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Online, 19–25 June 2021; pp. 4578–4587.
- Barron, J.T.; Mildenhall, B.; Verbin, D.; Srinivasan, P.P.; Hedman, P. Mip-nerf 360: Unbounded anti-aliased neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 5470–5479.
- Li, Z.; Müller, T.; Evans, A.; Taylor, R.H.; Unberath, M.; Liu, M.Y.; Lin, C.H. Neuralangelo: High-fidelity neural surface reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 8456–8465.
- Garbin, S.J.; Kowalski, M.; Johnson, M.; Shotton, J.; Valentin, J. Fastnerf: High-fidelity neural rendering at 200 fps. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Online, 11–17 October 2021; pp. 14346–14355.
- Deng, K.; Liu, A.; Zhu, J.Y.; Ramanan, D. Depth-supervised nerf: Fewer views and faster training for free. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 12882–12891.
- Kwak, M.S.; Song, J.; Kim, S. GeCoNeRF: Few-shot Neural Radiance Fields via Geometric Consistency. In Proceedings of the International Conference on Machine Learning; PMLR: Cambridge, MA, USA, 2023; pp. 18023–18036.
- Takikawa, T.; Litalien, J.; Yin, K.; Kreis, K.; Loop, C.; Nowrouzezahrai, D.; Jacobson, A.; McGuire, M.; Fidler, S. Neural geometric level of detail: Real-time rendering with implicit 3d shapes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Online, 19–25 June 2021; pp. 11358–11367.
- Hu, T.; Liu, S.; Chen, Y.; Shen, T.; Jia, J. Efficientnerf: Efficient neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 12902–12911.
- Yuan, Y.J.; Sun, Y.T.; Lai, Y.K.; Ma, Y.; Jia, R.; Gao, L. Nerf-editing: Geometry editing of neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–24 June 2022; pp. 18353–18364.
- Ye, V.; Li, R.; Kerr, J.; Turkulainen, M.; Yi, B.; Pan, Z.; Seiskari, O.; Ye, J.; Hu, J.; Tancik, M.; et al. gsplat: An open-source library for Gaussian splatting. J. Mach. Learn. Res. 2025, 26, 1–17.
- Guédon, A.; Lepetit, V. Sugar: Surface-aligned gaussian splatting for efficient 3d mesh reconstruction and high-quality mesh rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 5354–5363.
- Zhang, B.; Fang, C.; Shrestha, R.; Liang, Y.; Long, X.; Tan, P. Rade-gs: Rasterizing depth in gaussian splatting. arXiv 2024, arXiv:2406.01467.
- Turkulainen, M.; Ren, X.; Melekhov, I.; Seiskari, O.; Rahtu, E.; Kannala, J. Dn-splatter: Depth and normal priors for gaussian splatting and meshing. In Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV); IEEE: Piscataway, NJ, USA, 2025; pp. 2421–2431.
- Chung, J.; Oh, J.; Lee, K.M. Depth-regularized optimization for 3d gaussian splatting in few-shot images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 811–820.
- Thai, A.; Peng, S.; Genova, K.; Guibas, L.; Funkhouser, T. Splattalk: 3d vqa with gaussian splatting. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Honolulu, HI, USA, 19–23 October 2025; pp. 4712–4721.
- Jung, J.; Han, J.; An, H.; Kang, J.; Park, S.; Kim, S. Relaxing accurate initialization constraint for 3d gaussian splatting. arXiv 2024, arXiv:2403.09413.
- Sauvalle, B.; de La Fortelle, A. Autoencoder-based background reconstruction and foreground segmentation with background noise estimation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Vancouver, BC, Canada, 17–24 June 2023; pp. 3244–3255.
- Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the 2011 International Conference on Computer Vision; IEEE: Piscataway, NJ, USA, 2011; pp. 2564–2571.
- Fischler, M.A.; Bolles, R.C. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 1981, 24, 381–395.
- Lu, X.X. A review of solutions for perspective-n-point problem in camera pose estimation. J. Phys. Conf. Ser. 2018, 1087, 052009.
- Mustaniemi, J.; Kannala, J.; Särkkä, S.; Matas, J.; Heikkilä, J. Inertial-based scale estimation for structure from motion on mobile devices. In Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); IEEE: Piscataway, NJ, USA, 2017; pp. 4394–4401.
- Węglarczyk, S. Kernel density estimation and its application. ITM Web Conf. 2018, 23, 00037.
- Mullen, T. Mastering Blender; John Wiley & Sons: Hoboken, NJ, USA, 2011.
- Padberg, T.; Heikkonen, J.; Kanth, R. Study on Stereo AI Based Zed-2i Camera. In Proceedings of the International Conference on Information Technology & Systems; Springer: Berlin/Heidelberg, Germany, 2024; pp. 46–56.
- Qin, X.; Dai, H.; Hu, X.; Fan, D.P.; Shao, L.; Van Gool, L. Highly accurate dichotomous image segmentation. In Proceedings of the European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2022; pp. 38–56.
- Setiadi, D.R.I.M. PSNR vs. SSIM: Imperceptibility quality assessment for image steganography. Multimed. Tools Appl. 2021, 80, 8423–8444.









| Method | FS | ME | IS | RGB-D | SOR | RTR |
|---|---|---|---|---|---|---|
| NeRF | | | | | | |
| Barron et al. [57] | ✘ | ✘ | ✘ | ✘ | ✘ | ✘ |
| Garbin et al. [59] | ✘ | ✘ | ✘ | ✘ | ✘ | ✔ |
| Li et al. [58] | ✘ | ✔ | ✘ | ✘ | ✘ | ✘ |
| Deng et al. [60] | ✘ | ✘ | ✘ | ✔ | ✘ | ✘ |
| Kwak et al. [61] | ✔ | ✘ | ✘ | ✔ | ✘ | ✘ |
| 3DGS | | | | | | |
| Guédon et al. [66] | ✘ | ✔ | ✘ | ✘ | ✘ | ✔ |
| Zhang et al. [67] | ✘ | ✔ | ✘ | ✘ | ✘ | ✔ |
| Turkulainen et al. [68] | ✘ | ✔ | ✘ | ✔ | ✘ | ✔ |
| Chung et al. [69] | ✔ | ✘ | ✘ | ✔ | ✘ | ✔ |
| Jung et al. [71] | ✘ | ✘ | ✔ | ✘ | ✘ | ✔ |
| Ours | ✘ | ✘ | ✔ | ✔ | ✔ | ✔ |


| Initialization | 100 it.: PSNR ↑ | 100 it.: SSIM ↑ | 100 it.: LPIPS ↓ | 500 it.: PSNR ↑ | 500 it.: SSIM ↑ | 500 it.: LPIPS ↓ | 1000 it.: PSNR ↑ | 1000 it.: SSIM ↑ | 1000 it.: LPIPS ↓ |
|---|---|---|---|---|---|---|---|---|---|
| Default | 17.553 | 0.9027 | 0.1510 | 23.206 | 0.9300 | 0.0995 | 24.540 | 0.9419 | 0.0833 |
| Depth | 22.673 | 0.9275 | 0.1031 | 24.376 | 0.9400 | 0.0859 | 24.901 | 0.9450 | 0.0793 |
| Highdepth | 23.629 | 0.9363 | 0.0909 | 24.922 | 0.9462 | 0.0777 | 25.235 | 0.9490 | 0.0740 |
| Fusion | 22.885 | 0.9308 | 0.1023 | 24.711 | 0.9438 | 0.0810 | 25.130 | 0.9477 | 0.0753 |
| Random | 8.466 | 0.8019 | 0.3970 | 16.542 | 0.8977 | 0.1373 | 22.088 | 0.9229 | 0.1112 |

| Initialization | 3000 it.: PSNR ↑ | 3000 it.: SSIM ↑ | 3000 it.: LPIPS ↓ | 7000 it.: PSNR ↑ | 7000 it.: SSIM ↑ | 7000 it.: LPIPS ↓ | 12,000 it.: PSNR ↑ | 12,000 it.: SSIM ↑ | 12,000 it.: LPIPS ↓ |
|---|---|---|---|---|---|---|---|---|---|
| Default | 25.697 | 0.9525 | 0.0696 | 26.675 | 0.9590 | 0.0611 | 27.393 | 0.9632 | 0.0564 |
| Depth | 25.848 | 0.9527 | 0.0688 | 26.764 | 0.9587 | 0.0608 | 27.519 | 0.9626 | 0.0563 |
| Highdepth | 25.986 | 0.9544 | 0.0657 | 26.860 | 0.9602 | 0.0581 | 27.581 | 0.9636 | 0.0539 |
| Fusion | 25.963 | 0.9540 | 0.0664 | 26.811 | 0.9597 | 0.0592 | 27.552 | 0.9633 | 0.0550 |
| Random | 25.207 | 0.9476 | 0.0767 | 26.563 | 0.9572 | 0.0636 | 27.392 | 0.9617 | 0.0581 |


Moschino Dataset

| Iterations | Highdepth PSNR ↑ | Highdepth SSIM ↑ | Highdepth LPIPS ↓ | Default PSNR ↑ | Default SSIM ↑ | Default LPIPS ↓ |
|---|---|---|---|---|---|---|
| 100 | 19.243 | 0.8949 | 0.1798 | 15.045 | 0.8648 | 0.2204 |
| 500 | 20.011 | 0.9004 | 0.1607 | 19.630 | 0.8941 | 0.1742 |
| 1000 | 20.140 | 0.9023 | 0.1569 | 20.087 | 0.8993 | 0.1656 |
| 3000 | 20.280 | 0.9068 | 0.1488 | 20.372 | 0.9054 | 0.1547 |
| 7000 | 20.498 | 0.9141 | 0.1382 | 20.630 | 0.9115 | 0.1455 |
| 12,000 | 20.685 | 0.9193 | 0.1302 | 20.851 | 0.9175 | 0.1379 |

Synthetic Dataset

| Iterations | Highdepth PSNR ↑ | Highdepth SSIM ↑ | Highdepth LPIPS ↓ | Default PSNR ↑ | Default SSIM ↑ | Default LPIPS ↓ |
|---|---|---|---|---|---|---|
| 100 | 24.944 | 0.9487 | 0.0642 | 18.306 | 0.9140 | 0.1302 |
| 500 | 26.395 | 0.9599 | 0.0528 | 24.279 | 0.9408 | 0.0770 |
| 1000 | 26.764 | 0.9630 | 0.0491 | 25.8763 | 0.9547 | 0.0586 |
| 3000 | 27.698 | 0.9687 | 0.0408 | 27.295 | 0.9666 | 0.0441 |
| 7000 | 28.769 | 0.9740 | 0.0341 | 28.488 | 0.9733 | 0.0357 |
| 12,000 | 29.650 | 0.9769 | 0.0311 | 29.356 | 0.9769 | 0.0320 |

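
To make the early-iteration advantage concrete, the short calculation below takes the PSNR values reported for the Synthetic Dataset (Highdepth versus Default) and computes the gap at each checkpoint. The variable names are illustrative; the numbers are copied from the table rows above.

```python
# PSNR values (dB) copied from the Synthetic Dataset rows, Highdepth vs. Default.
iterations = [100, 500, 1000, 3000, 7000, 12000]
psnr_highdepth = [24.944, 26.395, 26.764, 27.698, 28.769, 29.650]
psnr_default = [18.306, 24.279, 25.8763, 27.295, 28.488, 29.356]

# Gain of Highdepth over Default at each checkpoint, rounded to 3 decimals.
gain = {
    it: round(h - d, 3)
    for it, h, d in zip(iterations, psnr_highdepth, psnr_default)
}

for it, g in gain.items():
    print(f"{it:>6} iterations: {g:+.3f} dB")
```

The gap shrinks from several dB at 100 iterations to a fraction of a dB at 12,000 iterations, which matches the claim that the initialization mainly benefits early optimization while final quality remains comparable.
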

| Stage | Iterations | Default | Depth | Highdepth | Fusion | Random |
|---|---|---|---|---|---|---|
| Processing (COLMAP & masking) | | 759 ± 60 s | 759 ± 60 s | 759 ± 60 s | 759 ± 60 s | 759 ± 60 s |
| Initialization | | − | 12 ± 1 s | 13 ± 1 s | 13 ± 1 s | 0 s |
| Iteration | 100 | 4 ± 2 s | 2 ± 1 s | 2 ± 1 s | 2 ± 1 s | 10 ± 3 s |
| | 300 | 10 ± 5 s | 6 ± 3 s | 8 ± 4 s | 7 ± 4 s | 26 ± 9 s |
| | 500 | 16 ± 8 s | 10 ± 5 s | 13 ± 6 s | 11 ± 6 s | 37 ± 13 s |
| | 1000 | 27 ± 11 s | 20 ± 10 s | 25 ± 13 s | 22 ± 11 s | 48 ± 17 s |
| | 3000 | 67 ± 26 s | 60 ± 28 s | 70 ± 38 s | 63 ± 29 s | 86 ± 31 s |
| | 7000 | 152 ± 54 s | 144 ± 60 s | 163 ± 79 s | 149 ± 62 s | 171 ± 60 s |
| | 12,000 | 271 ± 96 s | 262 ± 104 s | 290 ± 134 s | 268 ± 107 s | 290 ± 103 s |
| Total time | | 1030 ± 121 s | 1035 ± 127 s | 1064 ± 154 s | 1041 ± 131 s | 1049 ± 126 s |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Meglioraldi, J.; Cascarano, P.; Marfia, G. DiGS: Depth-Initialized Gaussian Splatting for Single-Object Reconstruction. J. Imaging 2026, 12, 183. https://doi.org/10.3390/jimaging12050183

