Gaussian Splatting-Based Color and Shape Deformation Fields for Dynamic Scene Reconstruction
Abstract
1. Introduction
- We propose GBC (Gaussian Splatting-Based Color and Shape Deformation Fields), a framework that integrates motion deformation fields, color deformation fields, and 3DGS to handle dynamic scenes with material color variations.
- We construct a multi-stage training model with dedicated dynamic and color components, which lets the framework flexibly decompose motion and appearance and specialize to different environments.
- Our approach achieves real-time rendering while maintaining high-fidelity 4D scene reconstruction.
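The decomposition described in the bullets above can be sketched in code. What follows is a minimal, hypothetical NumPy illustration, not the paper's actual architecture (layer sizes, encodings, and heads are assumptions): a sinusoidal positional encoding of a Gaussian's center and of time feeds a small MLP; a dynamic head predicts position, rotation, and scale offsets, while a separate color head predicts a per-Gaussian color offset.

```python
import numpy as np

def positional_encoding(v, num_freqs=4):
    """NeRF-style sinusoidal encoding of a batch of vectors."""
    out = [v]
    for i in range(num_freqs):
        out.append(np.sin((2.0 ** i) * np.pi * v))
        out.append(np.cos((2.0 ** i) * np.pi * v))
    return np.concatenate(out, axis=-1)

class DeformationFieldSketch:
    """Toy stand-in for the dynamic and color components.

    Maps (encoded center, encoded time) to shape offsets (dx, dr, ds)
    and a color offset dc. Weights are random here; a real model would
    be trained end to end against rendered images.
    """
    def __init__(self, num_freqs=4, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        # encoded 3-D center (3 * (2F + 1)) plus encoded scalar time (2F + 1)
        in_dim = 3 * (2 * num_freqs + 1) + (2 * num_freqs + 1)
        self.num_freqs = num_freqs
        self.w1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        # dynamic head: 3 position + 4 rotation (quaternion) + 3 scale offsets
        self.w_dyn = rng.normal(0.0, 0.1, (hidden, 10))
        # color head: 3-channel RGB offset
        self.w_col = rng.normal(0.0, 0.1, (hidden, 3))

    def __call__(self, centers, t):
        n = centers.shape[0]
        feat = np.concatenate(
            [positional_encoding(centers, self.num_freqs),
             positional_encoding(np.full((n, 1), t), self.num_freqs)],
            axis=-1)
        h = np.maximum(feat @ self.w1 + self.b1, 0.0)  # ReLU hidden layer
        dyn = h @ self.w_dyn
        dc = h @ self.w_col
        return dyn[:, :3], dyn[:, 3:7], dyn[:, 7:10], dc

field = DeformationFieldSketch()
centers = np.zeros((5, 3))             # five Gaussians at the origin
dx, dr, ds, dc = field(centers, t=0.5)
print(dx.shape, dr.shape, ds.shape, dc.shape)  # (5, 3) (5, 4) (5, 3) (5, 3)
```

Keeping the color head separate from the dynamic head mirrors the multi-stage idea: each component can be trained or ablated independently.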
2. Related Works
2.1. Novel View Synthesis
2.2. Dynamic Scene Representation
3. Method
3.1. Preliminary
3.1.1. Original 3DGS
3.1.2. Geometric Normal Optimization
3.2. Multi-Stage Network Architecture
3.2.1. First Stage with 3DGS
3.2.2. Dynamic Stage with Dynamic Component
3.2.3. Color Stage with Color Component
3.2.4. Shape Disruption
3.3. Loss Function
4. Results and Discussion
4.1. Experimental Details
4.2. Experimental Results
4.2.1. Comparisons on Synthetic Dataset
4.2.2. Comparisons on Real-World Dataset
4.2.3. Depth and Normal Maps Visualization
4.3. Ablation Study
4.4. Case Study
4.5. Analysis and Limitations
4.6. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Barron, J.T.; Mildenhall, B.; Tancik, M.; Hedman, P.; Martin-Brualla, R.; Srinivasan, P.P. Mip-nerf: A multiscale representation for anti-aliasing neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Virtual, 11–17 October 2021; pp. 5855–5864.
- Barron, J.T.; Mildenhall, B.; Verbin, D.; Srinivasan, P.P.; Hedman, P. Mip-nerf 360: Unbounded anti-aliased neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5470–5479.
- Zhang, J.; Yao, Y.; Li, S.; Liu, J.; Fang, T.; McKinnon, D.; Tsin, Y.; Quan, L. Neilf++: Inter-reflectable light fields for geometry and material estimation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 3601–3610.
- Yao, Y.; Zhang, J.; Liu, J.; Qu, Y.; Fang, T.; McKinnon, D.; Tsin, Y.; Quan, L. Neilf: Neural incident light field for physically-based material estimation. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 700–716.
- Liu, Y.; Wang, P.; Lin, C.; Long, X.; Wang, J.; Liu, L.; Komura, T.; Wang, W. Nero: Neural geometry and brdf reconstruction of reflective objects from multiview images. ACM Trans. Graph. 2023, 42, 1–22.
- Luiten, J.; Kopanas, G.; Leibe, B.; Ramanan, D. Dynamic 3d gaussians: Tracking by persistent dynamic view synthesis. In Proceedings of the 2024 International Conference on 3D Vision (3DV), Davos, Switzerland, 18–21 March 2024; pp. 800–809.
- Yang, Z.; Gao, X.; Zhou, W.; Jiao, S.; Zhang, Y.; Jin, X. Deformable 3d gaussians for high-fidelity monocular dynamic scene reconstruction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 20331–20341.
- Yan, Z.; Li, C.; Lee, G.H. Nerf-ds: Neural radiance fields for dynamic specular objects. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 8285–8295.
- Jiang, Y.; Tu, J.; Liu, Y.; Gao, X.; Long, X.; Wang, W.; Ma, Y. Gaussianshader: 3d gaussian splatting with shading functions for reflective surfaces. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 16–22 June 2024; pp. 5322–5332.
- Wood, D.N.; Azuma, D.I.; Aldinger, K.; Curless, B.; Duchamp, T.; Salesin, D.H.; Stuetzle, W. Surface light fields for 3D photography. In Seminal Graphics Papers: Pushing the Boundaries; Association for Computing Machinery: New York, NY, USA, 2023; Volume 2, pp. 487–496.
- Lombardi, S.; Simon, T.; Saragih, J.; Schwartz, G.; Lehrmann, A.; Sheikh, Y. Neural volumes: Learning dynamic renderable volumes from images. ACM Trans. Graph. 2019, 38, 65.
- Flynn, J.; Broxton, M.; Debevec, P.; DuVall, M.; Fyffe, G.; Overbeck, R.; Snavely, N.; Tucker, R. Deepview: View synthesis with learned gradient descent. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2367–2376.
- Mildenhall, B.; Srinivasan, P.P.; Tancik, M.; Barron, J.T.; Ramamoorthi, R.; Ng, R. Nerf: Representing scenes as neural radiance fields for view synthesis. Commun. ACM 2021, 65, 99–106.
- Yu, A.; Ye, V.; Tancik, M.; Kanazawa, A. pixelnerf: Neural radiance fields from one or few images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 4578–4587.
- Chen, A.; Xu, Z.; Zhao, F.; Zhang, X.; Xiang, F.; Yu, J.; Su, H. Mvsnerf: Fast generalizable radiance field reconstruction from multi-view stereo. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 14124–14133.
- Wang, P.; Liu, L.; Liu, Y.; Theobalt, C.; Komura, T.; Wang, W. NeuS: Learning neural implicit surfaces by volume rendering for multi-view reconstruction. In Proceedings of the 35th International Conference on Neural Information Processing Systems (NIPS’21), Red Hook, NY, USA, 6–14 December 2021.
- Oechsle, M.; Peng, S.; Geiger, A. Unisurf: Unifying neural implicit surfaces and radiance fields for multi-view reconstruction. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 5589–5599.
- Fridovich-Keil, S.; Yu, A.; Tancik, M.; Chen, Q.; Recht, B.; Kanazawa, A. Plenoxels: Radiance fields without neural networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5501–5510.
- Müller, T.; Evans, A.; Schied, C.; Keller, A. Instant neural graphics primitives with a multiresolution hash encoding. ACM Trans. Graph. 2022, 41, 1–15.
- Schwarz, K.; Liao, Y.; Niemeyer, M.; Geiger, A. Graf: Generative radiance fields for 3d-aware image synthesis. Adv. Neural Inf. Process. Syst. 2020, 33, 20154–20166.
- Xie, J.; Ouyang, H.; Piao, J.; Lei, C.; Chen, Q. High-fidelity 3d gan inversion by pseudo-multi-view optimization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 321–331.
- Lin, C.H.; Gao, J.; Tang, L.; Takikawa, T.; Zeng, X.; Huang, X.; Kreis, K.; Fidler, S.; Liu, M.Y.; Lin, T.Y. Magic3d: High-resolution text-to-3d content creation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 300–309.
- Ouyang, H.; Zhang, B.; Zhang, P.; Yang, H.; Yang, J.; Chen, D.; Chen, Q.; Wen, F. Real-time neural character rendering with pose-guided multiplane images. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 192–209.
- Wang, C.; Chai, M.; He, M.; Chen, D.; Liao, J. Clip-nerf: Text-and-image driven manipulation of neural radiance fields. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 3835–3844.
- Kerbl, B.; Kopanas, G.; Leimkühler, T.; Drettakis, G. 3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM Trans. Graph. 2023, 42, 139.
- Yifan, W.; Serena, F.; Wu, S.; Öztireli, C.; Sorkine-Hornung, O. Differentiable surface splatting for point-based geometry processing. ACM Trans. Graph. 2019, 38, 230.
- Park, K.; Sinha, U.; Hedman, P.; Barron, J.T.; Bouaziz, S.; Goldman, D.B.; Martin-Brualla, R.; Seitz, S.M. HyperNeRF: A higher-dimensional representation for topologically varying neural radiance fields. ACM Trans. Graph. 2021, 40, 238.
- Pumarola, A.; Corona, E.; Pons-Moll, G.; Moreno-Noguer, F. D-nerf: Neural radiance fields for dynamic scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 10318–10327.
- Park, K.; Sinha, U.; Barron, J.T.; Bouaziz, S.; Goldman, D.B.; Seitz, S.M.; Martin-Brualla, R. Nerfies: Deformable neural radiance fields. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 5865–5874.
- Fang, J.; Yi, T.; Wang, X.; Xie, L.; Zhang, X.; Liu, W.; Nießner, M.; Tian, Q. Fast dynamic radiance fields with time-aware neural voxels. In Proceedings of the SIGGRAPH Asia 2022 Conference Papers, Daegu, Republic of Korea, 6–9 December 2022; pp. 1–9.
- Guo, X.; Chen, G.; Dai, Y.; Ye, X.; Sun, J.; Tan, X.; Ding, E. Neural deformable voxel grid for fast optimization of dynamic view synthesis. In Proceedings of the Asian Conference on Computer Vision, Macao, China, 4–8 December 2022; pp. 3757–3775.
- Liu, J.W.; Cao, Y.P.; Mao, W.; Zhang, W.; Zhang, D.J.; Keppo, J.; Shan, Y.; Qie, X.; Shou, M.Z. Devrf: Fast deformable voxel radiance fields for dynamic scenes. Adv. Neural Inf. Process. Syst. 2022, 35, 36762–36775.
- Song, L.; Chen, A.; Li, Z.; Chen, Z.; Chen, L.; Yuan, J.; Xu, Y.; Geiger, A. Nerfplayer: A streamable dynamic scene representation with decomposed neural radiance fields. IEEE Trans. Vis. Comput. Graph. 2023, 29, 2732–2742.
- Fridovich-Keil, S.; Meanti, G.; Warburg, F.R.; Recht, B.; Kanazawa, A. K-planes: Explicit radiance fields in space, time, and appearance. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 12479–12488.
- Cao, A.; Johnson, J. Hexplane: A fast representation for dynamic scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 130–141.
- Duan, Y.; Wei, F.; Dai, Q.; He, Y.; Chen, W.; Chen, B. 4D-Rotor Gaussian Splatting: Towards Efficient Novel View Synthesis for Dynamic Scenes. In Proceedings of the ACM SIGGRAPH 2024 Conference Papers (SIGGRAPH’24), New York, NY, USA, 27 July–1 August 2024.
- Bosch, M.T. N-dimensional rigid body dynamics. ACM Trans. Graph. 2020, 39, 55.
- Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. Pytorch: An imperative style, high-performance deep learning library. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Volume 32.
- Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. In Proceedings of the 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015.
| Method | Bouncing Balls | | | Hell Warrior | | | Hook | | | Jumping Jacks | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | PSNR↑ | SSIM↑ | LPIPS↓ | PSNR↑ | SSIM↑ | LPIPS↓ | PSNR↑ | SSIM↑ | LPIPS↓ | PSNR↑ | SSIM↑ | LPIPS↓ |
| 3D-GS | 23.20 | 0.9591 | 0.0600 | 24.53 | 0.9336 | 0.0580 | 21.71 | 0.8876 | 0.1034 | 20.64 | 0.9297 | 0.0828 |
| D-NeRF | 38.17 | 0.9891 | 0.0323 | 24.06 | 0.9440 | 0.0707 | 29.02 | 0.9595 | 0.0546 | 32.70 | 0.9779 | 0.0388 |
| K-Planes | 40.05 | 0.9934 | 0.0322 | 24.58 | 0.9520 | 0.0824 | 28.12 | 0.9489 | 0.0662 | 31.11 | 0.9708 | 0.0468 |
| HexPlane | 39.69 | 0.9915 | 0.0323 | 24.24 | 0.9443 | 0.0732 | 28.71 | 0.9572 | 0.0505 | 31.65 | 0.9729 | 0.0398 |
| Deformable3DGS | 41.01 | 0.9953 | 0.0093 | 41.54 | 0.9873 | 0.0234 | 37.42 | 0.9867 | 0.0144 | 37.72 | 0.9897 | 0.0126 |
| Ours | 43.98 | 0.9972 | 0.0108 | 41.60 | 0.9815 | 0.0223 | 37.77 | 0.9883 | 0.0120 | 39.02 | 0.9914 | 0.0103 |
| Method | Lego | | | Mutant | | | Stand Up | | | Trex | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | PSNR↑ | SSIM↑ | LPIPS↓ | PSNR↑ | SSIM↑ | LPIPS↓ | PSNR↑ | SSIM↑ | LPIPS↓ | PSNR↑ | SSIM↑ | LPIPS↓ |
| 3D-GS | 22.10 | 0.9384 | 0.0607 | 24.53 | 0.9336 | 0.0580 | 21.91 | 0.9301 | 0.0785 | 21.93 | 0.9539 | 0.0487 |
| D-NeRF | 25.56 | 0.9363 | 0.8210 | 30.31 | 0.9672 | 0.0392 | 33.13 | 0.9781 | 0.0355 | 30.61 | 0.9671 | 0.0535 |
| K-Planes | 28.91 | 0.9695 | 0.0331 | 32.50 | 0.9713 | 0.0362 | 33.10 | 0.9793 | 0.0310 | 30.43 | 0.9737 | 0.0343 |
| HexPlane | 25.22 | 0.9388 | 0.0437 | 33.79 | 0.9802 | 0.0261 | 34.36 | 0.9839 | 0.0261 | 30.67 | 0.9749 | 0.0273 |
| Deformable3DGS | 33.07 | 0.9794 | 0.0183 | 42.63 | 0.9951 | 0.0052 | 44.62 | 0.9951 | 0.0063 | 38.10 | 0.9933 | 0.0098 |
| Ours | 26.60 | 0.9484 | 0.0507 | 41.18 | 0.9921 | 0.0170 | 42.61 | 0.9958 | 0.0052 | 39.13 | 0.9940 | 0.0248 |
| Model | PSNR↑ | SSIM↑ | LPIPS↓ | Time↓ | FPS↑ |
|---|---|---|---|---|---|
| 3DGS | 23.19 | 0.93 | 0.08 | 10 min | 170 |
| K-Planes | 31.61 | 0.97 | 0.03 | 52 min | 0.97 |
| HexPlane | 31.04 | 0.97 | 0.04 | 12 min | 2.5 |
| Deformable3DGS | 33.29 | 0.98 | 0.02 | 12 min | 65.8 |
| Ours | 34.15 | 0.98 | 0.02 | 10 min | 80 |
| Model | PSNR↑ | SSIM↑ | LPIPS↓ | Time↓ |
|---|---|---|---|---|
| Ours w/o Dynamic stage | 28.48 | 0.95 | 0.04 | 7 min |
| Ours w/o Color stage | 33.18 | 0.97 | 0.03 | 9 min |
| Ours w/o Shape disruption | 35.60 | 0.98 | 0.02 | 12 min |
| Ours w/o Normal optimization | 36.29 | 0.98 | 0.02 | 12 min |
| Ours | 36.46 | 0.99 | 0.02 | 14 min |
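The tables above rank methods by PSNR, SSIM, and LPIPS. As a reminder of what the headline metric measures, here is the standard PSNR definition in a few lines of NumPy (a generic illustration, not code from the paper):

```python
import numpy as np

def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB; higher means a closer match.

    max_val is the dynamic range of the images (1.0 for [0, 1] floats,
    255 for 8-bit images).
    """
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10((max_val ** 2) / mse)

target = np.zeros((4, 4))
pred = np.full((4, 4), 0.1)     # uniform error of 0.1 -> MSE = 0.01
print(round(psnr(pred, target), 2))  # 20.0
```

SSIM and LPIPS complement PSNR by measuring structural and perceptual similarity, respectively, which is why all three are reported per scene.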
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Bao, K.; Wu, W.; Hao, Y. Gaussian Splatting-Based Color and Shape Deformation Fields for Dynamic Scene Reconstruction. Electronics 2025, 14, 2347. https://doi.org/10.3390/electronics14122347