SA-Encoder: A Learnt Spatial Autocorrelation Representation to Inform 3D Geospatial Object Detection
Abstract
1. Introduction
- Enhanced geospatial object detection through spatial autocorrelation: The study demonstrates the effectiveness of spatial autocorrelation features for detecting diverse geospatial objects, drawing together geographic theory, statistical methods, and advances in deep learning. It highlights the pivotal role of spatial theory in enriching AI techniques for geospatial object detection in complex environments.
- Automated extraction of spatial autocorrelation representations: By developing the SA-Encoder, which extracts spatially explicit contextual representations, this research demonstrates an innovative integration of AI into geospatial analysis. The approach streamlines traditional semivariance estimation into a dataset-specific learning mechanism that informs an existing deep neural network for geospatial object detection (see the sketch below).
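To make the second contribution concrete, the sketch below illustrates the general idea under stated assumptions: lag-ordered pairwise differences of a point attribute are encoded by a small learnt network instead of a hand-fitted semivariogram model. Everything here (the class name SAEncoder, the neighbourhood size, the layer widths) is a hypothetical sketch, not the authors' implementation.

```python
# Minimal sketch (PyTorch) of one plausible reading of the SA-Encoder idea:
# per point, semivariance-style terms (half the squared attribute difference
# to each of the k nearest neighbours) are ordered by lag distance and fed
# through a small shared MLP trained end to end. Class name, k, and layer
# widths are illustrative assumptions, not the authors' implementation.
import torch
import torch.nn as nn

class SAEncoder(nn.Module):
    def __init__(self, k: int = 16, out_dim: int = 32):
        super().__init__()
        self.k = k
        # Shared MLP: lag-ordered difference profile -> learnt SA representation.
        self.mlp = nn.Sequential(
            nn.Linear(k, 64), nn.ReLU(),
            nn.Linear(64, out_dim), nn.ReLU(),
        )

    def forward(self, xyz: torch.Tensor, feat: torch.Tensor) -> torch.Tensor:
        # xyz: (N, 3) coordinates; feat: (N,) scalar attribute, e.g. height.
        dist = torch.cdist(xyz, xyz)  # (N, N) pairwise lag distances
        # topk(largest=False) returns distances in ascending order, so the
        # neighbour columns come out already ordered by increasing lag.
        _, idx = dist.topk(self.k + 1, largest=False)
        idx = idx[:, 1:]  # drop the self-match at lag zero
        # Semivariance-style terms: 0.5 * (z_i - z_j)^2 per neighbour pair.
        profile = 0.5 * (feat.unsqueeze(1) - feat[idx]) ** 2  # (N, k)
        return self.mlp(profile)  # (N, out_dim) per-point SA representation

# Toy usage: encode 1024 random points, using height (z) as the attribute.
pts = torch.rand(1024, 3)
sa_feat = SAEncoder()(pts, pts[:, 2])
print(sa_feat.shape)  # torch.Size([1024, 32])
```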
2. Literature Review
3. Methodology
3.1. Spatial Autocorrelation and Lag-Ordered Pairwise Differences
3.2. Architecture Design of the Spatial Autocorrelation Encoder
3.3. Feature Grouping to Embed the Encoder in a Neural Network Architecture
4. Experiments
4.1. Dataset
4.2. Experiment Design
5. Results and Discussion
5.1. Investigating the Effectiveness of Lag-Ordered Pairwise Differences
5.2. Performance of the Spatial Autocorrelation Encoder
5.3. Comparative Ablation Analysis
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Per-class IoU, overall accuracy (OA), and mean IoU (mIoU) for ten repeated training runs (treatments) of the SA-Encoder-informed model.

Treatment | OA | mIoU | Man-Made Terrain | Natural Terrain | High Vegetation | Low Vegetation | Buildings | Hardscape | Scanning Artefacts |
---|---|---|---|---|---|---|---|---|---|
1 | 85.7% | 57.7% | 93.0% | 79.6% | 67.0% | 28.4% | 84.3% | 28.3% | 26.4% |
2 | 85.6% | 57.3% | 92.5% | 78.8% | 66.2% | 28.3% | 84.1% | 28.9% | 25.2% |
3 | 85.8% | 58.4% | 92.8% | 80.4% | 66.8% | 29.7% | 84.4% | 28.3% | 26.9% |
4 | 85.6% | 57.5% | 92.8% | 78.4% | 66.5% | 25.8% | 84.0% | 27.2% | 28.6% |
5 | 85.3% | 57.4% | 92.7% | 76.4% | 66.3% | 25.7% | 83.8% | 27.6% | 29.0% |
6 | 85.4% | 56.9% | 92.4% | 78.0% | 68.3% | 25.7% | 84.7% | 26.1% | 27.4% |
7 | 85.4% | 57.6% | 92.8% | 77.3% | 63.9% | 31.2% | 84.2% | 28.3% | 27.0% |
8 | 85.1% | 57.1% | 92.5% | 77.2% | 66.0% | 29.4% | 83.8% | 26.5% | 25.8% |
9 | 85.8% | 58.0% | 93.1% | 80.4% | 64.8% | 33.2% | 83.0% | 28.3% | 23.9% |
10 | 85.7% | 57.9% | 93.1% | 80.2% | 65.4% | 30.8% | 84.1% | 28.0% | 28.1% |
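The summary statistics reported for these runs (mean, standard deviation, minimum, and maximum in the table further below) follow directly from this appendix; as a sanity check, a few lines of Python reproduce the mIoU row:

```python
# Reproduce the summary statistics for the mIoU column of the table above
# (ten treatments of the SA-Encoder-informed model).
from statistics import mean, stdev

miou = [57.7, 57.3, 58.4, 57.5, 57.4, 56.9, 57.6, 57.1, 58.0, 57.9]  # per-run mIoU (%)
print(f"mean={mean(miou):.1f}%  std={stdev(miou):.1f}%  "
      f"min={min(miou):.1f}%  max={max(miou):.1f}%")
# -> mean=57.6%  std=0.4%  min=56.9%  max=58.4%
```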
Measurements | Values |
---|---|
OA | 63% |
mIoU | 28% |
Man-made terrain | 39% |
Natural terrain | 38% |
High vegetation | 54% |
Low vegetation | 8% |
Buildings | 63% |
Hardscape | 12% |
Scanning artefacts | 0% |
Cars | 11% |
Statistics | Mean | Std. | Min | Max |
---|---|---|---|---|
OA | 85.5% | 0.2% | 85.1% | 85.8% |
mIoU | 57.6% | 0.4% | 56.9% | 58.4% |
Man-made terrain | 92.8% | 0.3% | 92.4% | 93.1% |
Natural terrain | 78.7% | 1.4% | 76.4% | 80.4% |
High vegetation | 66.1% | 1.2% | 63.9% | 68.3% |
Low vegetation | 28.8% | 2.6% | 25.7% | 33.2% |
Buildings | 84.0% | 0.4% | 83.0% | 84.7% |
Hardscape | 27.7% | 0.9% | 26.1% | 28.9% |
Scanning artefacts | 26.8% | 1.6% | 23.9% | 29.0% |
Cars | 55.6% | 1.8% | 52.6% | 57.8% |
Statistics | Baseline | Matheron’s Semivariance (Gain, p-value) | SA-Encoder Informed (Gain, p-value) |
---|---|---|---|
OA | 81.95% | 83.32% (+1.37%, 0.01) | 85.55% (+3.60%, <0.01) |
mIoU | 51.58% | 54.05% (+2.47%, 0.00) | 57.58% (+6.00%, <0.01) |
Man-made terrain | 90.69% | 91.02% (+0.33%, 0.19) | 92.78% (+2.09%, <0.01) |
Natural terrain | 72.59% | 73.45% (+0.86%, 0.29) | 78.67% (+6.08%, <0.01) |
High vegetation | 57.01% | 60.15% (+3.14%, 0.01) | 66.10% (+9.09%, <0.01) |
Low vegetation | 24.12% | 26.41% (+2.29%, <0.01) | 28.81% (+4.69%, <0.01) |
Buildings | 79.85% | 81.38% (+1.53%, 0.03) | 84.05% (+4.20%, <0.01) |
Hardscape | 21.55% | 23.19% (+1.64%, 0.07) | 27.73% (+6.18%, <0.01) |
Scanning artefacts | 20.25% | 25.10% (+4.85%, <0.01) | 26.83% (+6.58%, <0.01) |
Cars | 46.55% | 51.72% (+5.17%, 0.01) | 55.64% (+9.09%, <0.01) |
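For reference, the Matheron’s Semivariance treatment in the table above corresponds to the classical method-of-moments semivariogram estimator, computed per lag rather than learnt from data:

$$\hat{\gamma}(h) = \frac{1}{2\,N(h)} \sum_{i=1}^{N(h)} \bigl( z(\mathbf{x}_i) - z(\mathbf{x}_i + \mathbf{h}) \bigr)^2$$

where $N(h)$ is the number of point pairs separated by lag $\mathbf{h}$ and $z(\cdot)$ is the attribute value at a location. The SA-Encoder-informed treatment replaces this fixed estimator with the learnt representation described in Section 3.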