Hierarchical Object-Focused and Grid-Based Deep Unsupervised Segmentation Method for High-Resolution Remote Sensing Images
Abstract
1. Introduction
- (1) Aiming at practical use, we analyze the obstacles encountered when attempting to use a UDNN to obtain unsupervised remote sensing image segmentation results.
- (2) A hierarchical object-focused and grid-based method is proposed to address these obstacles. Specifically, lazy and iterative processes are used to gradually achieve unsupervised segmentation targets, rather than pursuing a powerful UDNN that can obtain excellent results in a single shot.
- (3) The unsupervised classification ability of UDNNs is transformed into a controllable and stable segmentation ability, making UDNN models suitable for practical remote sensing applications.
- (4) The proposed method can serve as a new framework for solving the problem of deep unsupervised segmentation in remote sensing. The corresponding source code is shared to motivate and facilitate further research on deep unsupervised remote sensing image segmentation.
2. Related Work
2.1. Unsupervised Segmentation
2.2. Semantic Segmentation
- (i) Accurate boundary information and few cross-boundary segments
- (ii) Reasonably sized segments containing spatial information
3. Obstacles Hindering Deep Unsupervised Remote Sensing Image Segmentation
4. Methodology
4.1. Overall Design of the Method
- (1) Hierarchical: one may consider distinguishing the most obvious objects first, then further separating the most obvious region within each object, continuing this process until all boundary details can be distinguished. In this hierarchical iterative process, it is not necessary to recognize all segments in a single shot; it is only necessary to subdivide the segments generated in the previous round.
- (2) Superpixel/grid-based: segment labels can be assigned to units based on superpixels or grid cells rather than to single pixels. This effectively prevents the formation of overly small fragments, and the established superpixel/grid boundaries also suppress the jagged borders caused by the resizing process. At present, oversegmented results of uniform size are easy to obtain with shallow unsupervised segmentation methods, and these results can be used directly as a set of superpixels or grid cells for this purpose.
- (1) Uncontrollable refinement process: as the iteration proceeds, Ri is refined on the basis of Ri−1, so the number of segments in Ri must be greater than or equal to that in Ri−1. Therefore, although the number of segments identified by a UDNN is uncontrollable, the HOFG segmentation process is progressively refined as the number of iterations increases. This solves Obstacle 1.
- (2) Excessive fragmentation at the border: because of the grid-based strategy, the segment label of each grid cell is determined by the label of the majority of its internal pixels, so small pixel-level fragments do not appear in the results. Therefore, excessive fragmentation at the border does not occur, overcoming Obstacle 2.
- (3) Excessive computing resource requirements: the use of downscaled input images in the LDSM prevents the UDNN from receiving excessively large images. This avoids introducing excessively large deep models during HOFG iterations, so the demand for GPU memory always remains within a reasonable range. Additionally, the segment boundaries in the output are based directly on the grid cells rather than on the pixel-level UDNN results; therefore, the downscaled input of the UDNN (with its jagged boundary results) has little impact on the final results. These characteristics address Obstacle 3.
4.2. Construction of the Grid and Initialization of the Set of Segmentation Results
Algorithm 1: HOFG initialization algorithm (HOFG-Init)
Input: Iimg
Output: Rgrid, R0
Begin
  Cinit-seg = Segment Iimg via SLIC with segments = Pslicnum and compactness = Pcompact;
  labelid = 0; Rgrid = ø; R0 = ø;
  foreach ci in Cinit-seg
    ri = {labelid, {pixel positions of ci}};
    Rgrid ← ri;
    labelid = labelid + 1;
  r1 = {0, {all pixel positions of Iimg}};
  R0 ← r1;
  return Rgrid, R0;
End
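A runnable sketch of HOFG-Init is shown below. To keep it self-contained, it takes an already-computed oversegmentation label map as input (in the paper this comes from SLIC); the function and variable names are assumptions for illustration.

```python
import numpy as np

def hofg_init(init_seg):
    """Sketch of Algorithm 1 (HOFG-Init). `init_seg` is assumed to be a
    label map produced by a shallow oversegmentation such as SLIC.
    Returns R_grid (one entry per grid cell) and R0 (one segment
    covering the whole image)."""
    r_grid = []
    for label_id, c in enumerate(np.unique(init_seg)):
        positions = np.argwhere(init_seg == c)      # pixel positions of cell c
        r_grid.append((label_id, positions))
    h, w = init_seg.shape
    all_positions = np.argwhere(np.ones((h, w), dtype=bool))
    r0 = [(0, all_positions)]                       # all pixel positions of Iimg
    return r_grid, r0

seg = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1]])                      # toy 2-cell oversegmentation
r_grid, r0 = hofg_init(seg)
```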
4.3. Lazy Deep Segmentation Method (LDSM)
4.3.1. Unsupervised Segmentation by a Deep Neural Network
- (1) Feature extraction component: this component contains N groups of layers, where each group consists of a convolutional layer (filter size 3 × 3, ‘same’ padding, Nmax-category output channels), a rectified linear unit (ReLU) activation layer, and a batch normalization layer. Through this component, the high-level features of the input image are extracted.
- (2) Output component: this component contains a convolutional layer (filter size 1 × 1, ‘same’ padding, Nmax-category output channels) and a batch normalization layer. These layers produce a Widthsub × Heightsub × Nmax-category matrix MXsub-map, in which a number between 0 and 1 expresses the degree to which each pixel belongs to each category.
- (3) Argmax component: this component applies the argmax function to the matrix MXsub-map from the previous component to obtain Csub-label, a Widthsub × Heightsub × 1 matrix containing a discrete class label for each pixel.
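The three components above can be sketched in PyTorch as follows; the group count and Nmax-category below are assumed values, not the paper's tuned settings.

```python
import torch
import torch.nn as nn

def build_udnn(in_channels=3, n_max_category=32, n_groups=3):
    """Sketch of the UDNN layer structure described above: N groups of
    [3x3 conv -> ReLU -> batch norm] (feature extraction), then a 1x1
    conv with batch norm (output component)."""
    layers = []
    c = in_channels
    for _ in range(n_groups):                                     # feature extraction
        layers += [nn.Conv2d(c, n_max_category, 3, padding='same'),
                   nn.ReLU(),
                   nn.BatchNorm2d(n_max_category)]
        c = n_max_category
    layers += [nn.Conv2d(c, n_max_category, 1, padding='same'),   # output component
               nn.BatchNorm2d(n_max_category)]
    return nn.Sequential(*layers)

model = build_udnn()
x = torch.rand(1, 3, 16, 16)
mx_sub_map = model(x)                    # 1 x Nmax-category x H x W response map
c_sub_label = mx_sub_map.argmax(dim=1)   # argmax component: per-pixel class labels
```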
Algorithm 2: Unsupervised classification model training (UCM-Train)
Input: Isub, Nmax-train, Nmin-category, Nmax-category
Output: Muc
Begin
  Muc = Create model based on the size of Isub and Nmax-category;
  while (Nmax-train > 0)
    [MXsub-map, MXsub-label] = Process Isub with Muc;
    loss = loss(MXsub-map, MXsub-label);
    Update weights in Muc ← backpropagation with loss;
    Nmax-train = Nmax-train − 1;
    Ncurrent-category = number of categories in MXsub-label;
    if (Nmin-category >= Ncurrent-category) break;
  return Muc;
End
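The key control logic of UCM-Train is its stopping rule: training halts either after Nmax-train steps or as soon as the number of categories found drops to Nmin-category. A toy sketch of that rule, where `process` is a hypothetical stand-in for one forward/backward step of the real UDNN:

```python
def ucm_train(process, n_max_train, n_min_category):
    """Sketch of UCM-Train's stopping rule (Algorithm 2). `process`
    performs one training step and returns that step's per-pixel labels."""
    labels = None
    for _ in range(n_max_train):
        labels = process()                    # one train step -> MXsub-label
        n_current = len(set(labels))          # Ncurrent-category
        if n_current <= n_min_category:       # stop before merging categories too far
            break
    return labels

# toy stand-in: each step the "model" merges one more category
state = {"n": 6}
def fake_step():
    state["n"] = max(1, state["n"] - 1)
    return list(range(state["n"]))

labels = ucm_train(fake_step, n_max_train=100, n_min_category=3)
```

With this rule, training terminates as soon as three categories remain, long before the 100-step budget is exhausted.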
Algorithm 3: Subimage segmentation by a deep neural network (SUB-Seg)
Input: Isub
Output: Rsub
Begin
  Muc = UCM-Train(Isub, Nmax-train, Nmin-category, Nmax-category);
  MXsub-label = Use Muc to process Isub;
  Rsub = ø; segid = 1;
  seedpixel = first pixel in MXsub-label with label ≠ −1;
  while (seedpixel is not null)
    pixelpositions = Perform flood filling at seedpixel based on the same category label;
    MXsub-label[pixelpositions] = −1;
    Rsub ← {segid, {pixelpositions}};
    seedpixel = first pixel in MXsub-label with label ≠ −1;
    segid = segid + 1;
  return Rsub;
End
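The flood-filling stage of SUB-Seg, which turns the per-pixel class map into connected segments, can be sketched with a 4-neighbour breadth-first fill (the function name is an assumption):

```python
import numpy as np
from collections import deque

def extract_segments(class_labels):
    """Sketch of SUB-Seg's labeling stage: convert a per-pixel class map
    into connected segments via 4-neighbour flood filling. Visited
    pixels are marked -1, mirroring Algorithm 3."""
    lab = np.array(class_labels, dtype=int)
    h, w = lab.shape
    segments, seg_id = [], 1
    for y in range(h):
        for x in range(w):
            if lab[y, x] == -1:
                continue                           # already assigned to a segment
            cls, queue, positions = lab[y, x], deque([(y, x)]), []
            lab[y, x] = -1
            while queue:                           # flood fill over equal labels
                cy, cx = queue.popleft()
                positions.append((cy, cx))
                for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                    if 0 <= ny < h and 0 <= nx < w and lab[ny, nx] == cls:
                        lab[ny, nx] = -1
                        queue.append((ny, nx))
            segments.append((seg_id, positions))
            seg_id += 1
    return segments

segs = extract_segments([[0, 0, 1],
                         [1, 1, 1]])               # two connected segments
```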
4.3.2. LDSM Process
- (1) Low segmentation requirements: the LDSM focuses on refining each segment obtained from the previous HOFG step. Therefore, in the current step, SUB-Seg does not need to perform a perfect and thorough segmentation. Instead, for a separable ri, it is sufficient to find the most obvious distinct objects in it and separate it into two or more segments accordingly. This significantly reduces the difficulty of the SUB-Seg task.
- (2) Low segment border and pixel precision requirements: the basic segmentation units of the LDSM are the grid cells from Rgrid. Because pixel-level fragmentation and the formation of jagged boundaries during resizing are prevented by the grid-based strategy, the LDSM does not need to pursue precision for every border and pixel.
- (3) Low image size requirements: owing to the relatively loose requirements in (1) and (2), when a large segment/image needs to be processed, the LDSM can directly downscale the image before processing it. This makes the LDSM easier to adapt to larger remote sensing images.
- (1) Processing of all segments
- (2) Correction based on Rgrid
Algorithm 4: Lazy deep segmentation method (LDSM)
Input: Iimg, Rgrid, Ri−1
Output: Ri
Begin
  # Processing of all segments
  Iseg-label = Empty label image; globallabel = 1;
  foreach ri in Ri−1
    if (ri is too small)
      Iseg-label[pixel locations of ri] = globallabel++;
      continue;
    Isub-ori = Cut a subimage from Iimg based on a rectangle completely containing ri;
    Isub = Downscale Isub-ori if Isub-ori is larger than Pmax-size;
    Rsub = SUB-Seg(Isub);
    Rsub-ori = Upscale Rsub to the original size;
    foreach ri′ in Rsub-ori
      Iseg-label[pixel locations of ri′] = globallabel++;
  # Correction based on Rgrid
  Ri = ø; globallabel = 1;
  foreach ri in Rgrid
    mlabel = Obtain the majority label among the pixels from Iseg-label[pixel locations of ri];
    Iseg-label[pixel locations of ri] = mlabel;
  labellist = unique(Iseg-label);
  foreach li in labellist
    pixelpositions = find pixels in Iseg-label where label = li;
    Ri ← {globallabel++, {pixelpositions}};
  return Ri;
End
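The "Correction based on Rgrid" step is the heart of the grid-based strategy: each cell adopts the majority label among its pixels, which erases sub-cell fragments. A minimal numpy sketch of that step (function name assumed):

```python
import numpy as np

def correct_with_grid(seg_label, grid_ids):
    """Sketch of the LDSM correction step: every grid cell takes the
    majority segment label among its pixels, so fragments smaller than
    a cell cannot survive."""
    out = np.array(seg_label)
    for cell in np.unique(grid_ids):
        mask = grid_ids == cell
        values, counts = np.unique(out[mask], return_counts=True)
        out[mask] = values[np.argmax(counts)]   # majority label of this cell
    return out

seg = np.array([[1, 1, 2, 2],
                [1, 2, 2, 2]])                  # one stray pixel inside cell 0
grid = np.array([[0, 0, 1, 1],
                 [0, 0, 1, 1]])
fixed = correct_with_grid(seg, grid)            # stray pixel absorbed by majority
```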
4.4. Hierarchical Object-Focused Segmentation Process
Algorithm 5: Hierarchical object-focused and grid-based deep unsupervised segmentation method for high-resolution remote sensing images (HOFG)
Input: Iimg, Nmax-iteration
Output: Rresult
Begin
  [Rgrid, R0] = HOFG-Init(Iimg);
  Ri−1 = R0; inum = 1;
  while (inum < Nmax-iteration)
    Ri = LDSM(Iimg, Rgrid, Ri−1);
    if (Ri has not changed compared to Ri−1) break;
    Ri−1 = Ri;
    inum = inum + 1;
  Rresult = Ri;
  return Rresult;
End
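The outer HOFG loop reduces to "refine, compare, stop if stable". The sketch below captures that control flow; `ldsm` is a hypothetical stand-in for Algorithm 4 so the example stays self-contained.

```python
def hofg(r0, ldsm, n_max_iteration=5):
    """Sketch of the HOFG outer loop (Algorithm 5): repeatedly refine
    the previous result with the LDSM, stopping early once the result
    stops changing."""
    r_prev = r0
    for _ in range(n_max_iteration):
        r_next = ldsm(r_prev)
        if r_next == r_prev:          # converged: no segment changed
            break
        r_prev = r_next
    return r_prev

# toy refinement: split the first segment until four segments exist
def fake_ldsm(r):
    return r if len(r) >= 4 else [r[0], r[0]] + r[1:]

result = hofg([["whole image"]], fake_ldsm)
```

Because each iteration can only subdivide existing segments, the loop is guaranteed to refine monotonically; the early-exit check is what makes the process controllable.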
5. Experiments
5.1. Method Implementation
- (1) SLIC: SLIC works in the CIELAB color space and is computationally efficient. It is a relatively easy-to-control traditional shallow segmentation method, in which the target number of segments and the compactness of the segments can be specified [52].
- (2) Watershed: the watershed algorithm treats a gradient image as a landscape, which is then flooded from given markers. The markers parameter determines the initial number of markers, which, in turn, determines the final output [53].
- (3) Felzenszwalb: the Felzenszwalb algorithm is a graph-based image segmentation method, in which the scale parameter influences the size of the segments in the results [54].
- (4) UDNN: a UDNN is used to classify a remote sensing image in an unsupervised manner [16,17], and segment labels are then assigned via the flood fill algorithm. As analyzed in Section 3, a UDNN cannot handle overly large remote sensing images. Therefore, for large remote sensing images, we resize them to a size suitable for processing on our computer and then restore the results to the original size before assigning the segment labels.
- (5) HOFG: for the method proposed in this article, the maximum input size Pmax-size of the LDSM is set to 600 × 600, and the maximum number of iterations, Nmax-iteration, is specified as five.
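For reference, the three shallow baselines can be invoked through scikit-image roughly as follows; the image and parameter values here are illustrative placeholders, not the tuned settings used in the experiments.

```python
# Illustrative invocation of the shallow baselines with scikit-image
# (parameter values are assumptions, not the paper's tuned ones).
import numpy as np
from skimage.filters import sobel
from skimage.segmentation import slic, watershed, felzenszwalb

rng = np.random.default_rng(0)
img = rng.random((64, 64, 3))                 # stand-in for a remote sensing tile

slic_labels = slic(img, n_segments=50, compactness=10)
ws_labels = watershed(sobel(img.mean(axis=2)),  # gradient image as landscape
                      markers=50, compactness=0.001)
fz_labels = felzenszwalb(img, scale=100)
```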
5.2. Reference-Based Evaluation Strategy
5.3. Iterative Process of HOFG and Comparison of Results
5.4. Performing Segmentation on Larger Remote Sensing Images
5.5. Execution Time Comparison
5.6. Experiments on More Remote Sensing Images
6. Conclusions
- (1) Inheriting the recognition ability of deep models: in place of shallow models, a UDNN is adopted in HOFG to identify and segment objects based on the high-level features of remote sensing images, enabling the generation of larger and more complete segments to represent typical objects such as buildings, roads and trees.
- (2) Enabling a controllable refinement process: with the direct use of a UDNN, it is difficult to control the oversegmentation or undersegmentation of the results; consequently, a UDNN cannot achieve a given OA target. In HOFG, the LDSM is used in each iteration to refine the results from the previous iteration; this ensures that the segments are progressively refined as the iterative process proceeds, making the HOFG segmentation process controllable.
- (3) Reducing fragmentation at the border: a reference grid is used in the LDSM, and the UDNN results are indirectly transformed into segments on the basis of this grid. In this way, segments that are too small, stripe shaped or inappropriately connected can be filtered out.
- (4) Providing the ability to process large remote sensing images: as the input image size increases, the corresponding UDNN model also grows, making it difficult for ordinary GPUs to load. Consequently, the direct use of a UDNN requires resizing the input image, which can lead to jagged segment boundaries. The grid-based strategy of the LDSM and the iterative improvement process in HOFG reduce the detrimental influence of resizing, allowing HOFG to obtain high-quality segmentation results even when processing very large images.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Pan, X.; Zhao, J.; Xu, J. An object-based and heterogeneous segment filter convolutional neural network for high-resolution remote sensing image classification. Int. J. Remote Sens. 2019, 40, 5892–5916.
- Castillejo-González, I.L.; López-Granados, F.; García-Ferrer, A.; Peña-Barragán, J.M.; Jurado-Expósito, M.; de la Orden, M.S.; González-Audicana, M. Object- and pixel-based analysis for mapping crops and their agro-environmental associated measures using QuickBird imagery. Comput. Electron. Agric. 2009, 68, 207–215.
- Castilla, G.; Hay, G.J.; Ruiz-Gallardo, J.R. Size-constrained region merging (SCRM). Photogramm. Eng. Remote Sens. 2008, 74, 409–419.
- Blaschke, T. Object based image analysis for remote sensing. ISPRS J. Photogramm. Remote Sens. 2010, 65, 2–16.
- Blaschke, T.; Hay, G.J.; Kelly, M.; Lang, S.; Hofmann, P.; Addink, E.; Feitosa, R.Q.; van der Meer, F.; van der Werff, H.; van Coillie, F. Geographic object-based image analysis–towards a new paradigm. ISPRS J. Photogramm. Remote Sens. 2014, 87, 180–191.
- Fu, B.; Wang, Y.; Campbell, A.; Li, Y.; Zhang, B.; Yin, S.; Xing, Z.; Jin, X. Comparison of object-based and pixel-based Random Forest algorithm for wetland vegetation mapping using high spatial resolution GF-1 and SAR data. Ecol. Indic. 2017, 73, 105–117.
- Gong, M.; Zhan, T.; Zhang, P.; Miao, Q. Superpixel-based difference representation learning for change detection in multispectral remote sensing images. IEEE Trans. Geosci. Remote Sens. 2017, 55, 2658–2673.
- Pekkarinen, A. A method for the segmentation of very high spatial resolution images of forested landscapes. Int. J. Remote Sens. 2002, 23, 2817–2836.
- Wang, M.; Li, R. Segmentation of high spatial resolution remote sensing imagery based on hard-boundary constraint and two-stage merging. IEEE Trans. Geosci. Remote Sens. 2014, 52, 5712–5725.
- Dey, V.; Zhang, Y.; Zhong, M. A Review on Image Segmentation Techniques with Remote Sensing Perspective. In Proceedings of the ISPRS TC VII Symposium–100 Years ISPRS, Vienna, Austria, 5–7 July 2010; Volume 38.
- Yi, L.; Zhang, G.; Wu, Z. A scale-synthesis method for high spatial resolution remote sensing image segmentation. IEEE Trans. Geosci. Remote Sens. 2012, 50, 4062–4070.
- Wang, F.; Piao, S.; Xie, J. CSE-HRNet: A context and semantic enhanced high-resolution network for semantic segmentation of aerial imagery. IEEE Access 2020, 8, 182475–182489.
- Ma, L.; Liu, Y.; Zhang, X.; Ye, Y.; Yin, G.; Johnson, B.A. Deep learning in remote sensing applications: A meta-analysis and review. ISPRS J. Photogramm. Remote Sens. 2019, 152, 166–177.
- Zhang, C.; Yue, P.; Tapete, D.; Jiang, L.; Shangguan, B.; Huang, L.; Liu, G. A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images. ISPRS J. Photogramm. Remote Sens. 2020, 166, 183–200.
- Zhang, L.; Hu, X.; Zhang, M.; Shu, Z.; Zhou, H. Object-level change detection with a dual correlation attention-guided detector. ISPRS J. Photogramm. Remote Sens. 2021, 177, 147–160.
- Kanezaki, A. Unsupervised Image Segmentation by Backpropagation. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 1543–1547.
- Kim, W.; Kanezaki, A.; Tanaka, M. Unsupervised learning of image segmentation based on differentiable feature clustering. IEEE Trans. Image Process. 2020, 29, 8055–8068.
- Comaniciu, D.; Meer, P. Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 603–619.
- Benz, U.C.; Hofmann, P.; Willhauck, G.; Lingenfelder, I.; Heynen, M. Multi-resolution, object-oriented fuzzy analysis of remote sensing data for GIS-ready information. ISPRS J. Photogramm. Remote Sens. 2004, 58, 239–258.
- Ciecholewski, M. River channel segmentation in polarimetric SAR images: Watershed transform combined with average contrast maximisation. Expert Syst. Appl. 2017, 82, 196–215.
- Hadavand, A.; Saadatseresht, M.; Homayouni, S. Segmentation parameter selection for object-based land-cover mapping from ultra high resolution spectral and elevation data. Int. J. Remote Sens. 2017, 38, 3586–3607.
- Wang, Y.; Qi, Q.; Liu, Y.; Jiang, L.; Wang, J. Unsupervised segmentation parameter selection using the local spatial statistics for remote sensing image segmentation. Int. J. Appl. Earth Obs. Geoinf. 2019, 81, 98–109.
- Tetteh, G.O.; Gocht, A.; Schwieder, M.; Erasmi, S.; Conrad, C. Unsupervised Parameterization for Optimal Segmentation of Agricultural Parcels from Satellite Images in Different Agricultural Landscapes. Remote Sens. 2020, 12, 3096.
- Johnson, B.; Xie, Z. Unsupervised image segmentation evaluation and refinement using a multi-scale approach. ISPRS J. Photogramm. Remote Sens. 2011, 66, 473–483.
- Chen, J.; Deng, M.; Mei, X.; Chen, T.; Shao, Q.; Hong, L. Optimal segmentation of a high-resolution remote-sensing image guided by area and boundary. Int. J. Remote Sens. 2014, 35, 6914–6939.
- Liu, Y.; Shan, C.; Gao, Q.; Gao, X.; Han, J.; Cui, R. Hyperspectral image denoising via minimizing the partial sum of singular values and superpixel segmentation. Neurocomputing 2019, 330, 465–482.
- Dao, P.D.; Mantripragada, K.; He, Y.; Qureshi, F.Z. Improving hyperspectral image segmentation by applying inverse noise weighting and outlier removal for optimal scale selection. ISPRS J. Photogramm. Remote Sens. 2021, 171, 348–366.
- Tong, H.; Maxwell, T.; Zhang, Y.; Dey, V. A supervised and fuzzy-based approach to determine optimal multi-resolution image segmentation parameters. Photogramm. Eng. Remote Sens. 2012, 78, 1029–1044.
- Zhang, X.; Xiao, P.; Feng, X.; Wang, J.; Wang, Z. Hybrid region merging method for segmentation of high-resolution remote sensing images. ISPRS J. Photogramm. Remote Sens. 2014, 98, 19–28.
- Yang, J.; He, Y.; Caspersen, J. Region merging using local spectral angle thresholds: A more accurate method for hybrid segmentation of remote sensing images. Remote Sens. Environ. 2017, 190, 137–148.
- Cho, J.H.; Mall, U.; Bala, K.; Hariharan, B. PiCIE: Unsupervised Semantic Segmentation Using Invariance and Equivariance in Clustering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 16794–16804.
- Hamilton, M.; Zhang, Z.; Hariharan, B.; Snavely, N.; Freeman, W.T. Unsupervised Semantic Segmentation by Distilling Feature Correspondences. arXiv 2022, arXiv:2203.08414.
- Mou, L.; Zhu, X.X. Vehicle instance segmentation from aerial image and video using a multitask learning residual fully convolutional network. IEEE Trans. Geosci. Remote Sens. 2018, 56, 6699–6711.
- Hua, Y.; Mou, L.; Zhu, X.X. Relation network for multilabel aerial image classification. IEEE Trans. Geosci. Remote Sens. 2020, 58, 4558–4572.
- Audebert, N.; Le Saux, B.; Lefèvre, S. Beyond RGB: Very high resolution urban remote sensing with multimodal deep networks. ISPRS J. Photogramm. Remote Sens. 2018, 140, 20–32.
- Shi, Y.; Li, Q.; Zhu, X.X. Building segmentation through a gated graph convolutional neural network with deep structured feature embedding. ISPRS J. Photogramm. Remote Sens. 2020, 159, 184–197.
- Waser, L.T.; Rüetschi, M.; Psomas, A.; Small, D.; Rehush, N. Mapping dominant leaf type based on combined Sentinel-1/-2 data–Challenges for mountainous countries. ISPRS J. Photogramm. Remote Sens. 2021, 180, 209–226.
- Pan, X.; Zhao, J.; Xu, J. Conditional generative adversarial network-based training sample set improvement model for the semantic segmentation of high-resolution remote sensing images. IEEE Trans. Geosci. Remote Sens. 2020, 59, 7854–7870.
- Hua, Y.; Marcos, D.; Mou, L.; Zhu, X.X.; Tuia, D. Semantic segmentation of remote sensing images with sparse annotations. IEEE Geosci. Remote Sens. Lett. 2021, 19, 8006305.
- Saha, S.; Shahzad, M.; Mou, L.; Song, Q.; Zhu, X.X. Unsupervised Single-Scene Semantic Segmentation for Earth Observation. IEEE Trans. Geosci. Remote Sens. 2022, 60, 5228011.
- Tian, J.; Chen, D.M. Optimization in multi-scale segmentation of high-resolution satellite images for artificial feature recognition. Int. J. Remote Sens. 2007, 28, 4625–4644.
- Li, D.; Zhang, G.; Wu, Z.; Yi, L. An edge embedded marker-based watershed algorithm for high spatial resolution remote sensing image segmentation. IEEE Trans. Image Process. 2010, 19, 2781–2787.
- Yang, J.; He, Y.; Weng, Q. An automated method to parameterize segmentation scale by enhancing intrasegment homogeneity and intersegment heterogeneity. IEEE Geosci. Remote Sens. Lett. 2015, 12, 1282–1286.
- Mishra, N.B.; Crews, K.A. Mapping vegetation morphology types in a dry savanna ecosystem: Integrating hierarchical object-based image analysis with Random Forest. Int. J. Remote Sens. 2014, 35, 1175–1198.
- Geiß, C.; Klotz, M.; Schmitt, A.; Taubenböck, H. Object-based morphological profiles for classification of remote sensing imagery. IEEE Trans. Geosci. Remote Sens. 2016, 54, 5952–5963.
- Su, T.; Zhang, S. Local and global evaluation for remote sensing image segmentation. ISPRS J. Photogramm. Remote Sens. 2017, 130, 256–276.
- Pan, X.; Zhao, J. High-resolution remote sensing image classification method based on convolutional neural network and restricted conditional random field. Remote Sens. 2018, 10, 920.
- Sutha, J. Object based classification of high resolution remote sensing image using HRSVM-CNN classifier. Eur. J. Remote Sens. 2020, 53, 16–30.
- Papadomanolaki, M.; Vakalopoulou, M.; Karantzalos, K. A novel object-based deep learning framework for semantic segmentation of very high-resolution remote sensing data: Comparison with convolutional and fully convolutional networks. Remote Sens. 2019, 11, 684.
- Nalepa, J.; Myller, M.; Imai, Y.; Honda, K.-i.; Takeda, T.; Antoniak, M. Unsupervised segmentation of hyperspectral images using 3-D convolutional autoencoders. IEEE Geosci. Remote Sens. Lett. 2020, 17, 1948–1952.
- Pan, X.; Zhang, C.; Xu, J.; Zhao, J. Simplified object-based deep neural network for very high resolution remote sensing image classification. ISPRS J. Photogramm. Remote Sens. 2021, 181, 218–237.
- Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 2274–2282.
- Neubert, P.; Protzel, P. Compact Watershed and Preemptive SLIC: On Improving Trade-Offs of Superpixel Segmentation Algorithms. In Proceedings of the 2014 22nd International Conference on Pattern Recognition, Stockholm, Sweden, 24–28 August 2014; pp. 996–1001.
- Felzenszwalb, P.F.; Huttenlocher, D.P. Efficient graph-based image segmentation. Int. J. Comput. Vis. 2004, 59, 167–181.
| Feature | Vaihingen | Potsdam |
|---|---|---|
| Resolution | 9 cm | 5 cm |
| Categories | Impervious surfaces, buildings, low vegetation, trees, cars and clutter/background | Impervious surfaces, buildings, low vegetation, trees, cars and clutter/background |
| Image size | All images are larger than 1500 × 1500 | All images are 6000 × 6000 |
| Selected images | top_mosaic_09cm_area1.tif, top_mosaic_09cm_area2.tif, top_mosaic_09cm_area3.tif, top_mosaic_09cm_area4.tif, top_mosaic_09cm_area5.tif, top_mosaic_09cm_area11.tif | top_potsdam_2_10_rgb.tif, top_potsdam_2_11_rgb.tif, top_potsdam_2_12_rgb.tif, top_potsdam_3_10_rgb.tif, top_potsdam_3_11_rgb.tif, top_potsdam_3_12_rgb.tif |
| Method | Role | Parameter Settings |
|---|---|---|
| SLIC | Reference method | For an input image, the compactness is set to 10, and the target number of segments is increased until the OA of the result exceeds 90%. |
| Watershed | To be evaluated | The marker compactness is set to 0.001, and the number of markers is increased until the SLIC accuracy interval is reached. |
| Felzenszwalb | To be evaluated | The scale parameter is decreased until the SLIC accuracy interval is reached. |
| UDNN | To be evaluated | The UDNN is run for 5 to 100 iterations, and typical results with no more than 5 times the number of segments of SLIC are selected for comparison (a UDNN cannot reach the accuracy of SLIC while approximating its number of segments). |
| HOFG | To be evaluated | The Pmax-size of the LDSM is set to 600 × 600, Nmax-iteration is specified as 5, and the method stops iterating when the SLIC accuracy interval is reached. |
| Method Type | Method | OA (%) | mIoU (%) | OA Target Achieved? | Number of Segments |
|---|---|---|---|---|---|
| Shallow methods | SLIC | 91.75 | 85.94 | / | 331 |
| Shallow methods | Watershed | 91.49 | 85.13 | Y | 484 |
| Shallow methods | Felzenszwalb | 91.56 | 84.05 | Y | 428 |
| Deep methods | UDNN (5 iterations) | 84.81 | 78.45 | N | 1346 |
| Deep methods | UDNN (15 iterations) | 77.40 | 70.51 | N | 773 |
| Deep methods | HOFG | 92.04 | 87.11 | Y | 247 |
| Test Image | Method | OA (%) | mIoU (%) | OA Target Achieved? | Number of Segments |
|---|---|---|---|---|---|
| 1 | SLIC | 91.23 | 85.98 | / | 5799 |
| 1 | Watershed | 91.69 | 86.22 | Y | 9184 |
| 1 | Felzenszwalb | 91.14 | 83.93 | Y | 6284 |
| 1 | UDNN | 84.64 | 76.85 | N | 8051 |
| 1 | HOFG | 91.29 | 85.66 | Y | 4186 |
| 2 | SLIC | 91.67 | 86.57 | / | 7882 |
| 2 | Watershed | 91.31 | 86.03 | Y | 9409 |
| 2 | Felzenszwalb | 91.84 | 85.34 | Y | 7126 |
| 2 | UDNN | 80.68 | 73.49 | N | 13,211 |
| 2 | HOFG | 91.45 | 86.85 | Y | 6393 |
| Test Image | Method | Average Execution Time (s) |
|---|---|---|
| 1 | SLIC | 8.4 |
| 1 | Watershed | 10.6 |
| 1 | Felzenszwalb | 11.0 |
| 1 | UDNN | 22.8 |
| 1 | HOFG | 293.4 |
| 2 | SLIC | 53.2 |
| 2 | Watershed | 100.8 |
| 2 | Felzenszwalb | 124.6 |
| 2 | UDNN | 46.6 |
| 2 | HOFG | 904.0 |
| Image | Dataset | Filename | Size |
|---|---|---|---|
| 1 | Vaihingen | top_mosaic_09cm_area1.tif | 1919 × 2569 |
| 2 | Vaihingen | top_mosaic_09cm_area2.tif | 2428 × 2767 |
| 3 | Vaihingen | top_mosaic_09cm_area3.tif | 2006 × 3007 |
| 4 | Vaihingen | top_mosaic_09cm_area4.tif | 1887 × 2557 |
| 5 | Vaihingen | top_mosaic_09cm_area5.tif | 1887 × 2557 |
| 6 | Potsdam | top_potsdam_2_11_rgb.tif | 6000 × 6000 |
| 7 | Potsdam | top_potsdam_2_12_rgb.tif | 6000 × 6000 |
| 8 | Potsdam | top_potsdam_3_10_rgb.tif | 6000 × 6000 |
| 9 | Potsdam | top_potsdam_3_11_rgb.tif | 6000 × 6000 |
| 10 | Potsdam | top_potsdam_3_12_rgb.tif | 6000 × 6000 |
| Image | SLIC OA (%) | SLIC mIoU (%) | SLIC Segments | HOFG OA (%) | HOFG mIoU (%) | HOFG Segments | HOFG Iterations | HOFG/SLIC Segments (%) |
|---|---|---|---|---|---|---|---|---|
| 1 | 91.77 | 86.16 | 5495 | 91.82 | 86.99 | 4235 | 4 | 77.07 |
| 2 | 91.25 | 85.86 | 5516 | 91.46 | 86.27 | 4014 | 4 | 72.77 |
| 3 | 89.32 | 84.81 | 5777 | 89.59 | 84.78 | 4632 | 4 | 80.18 |
| 4 | 93.83 | 88.21 | 5809 | 93.61 | 88.51 | 4164 | 4 | 71.68 |
| 5 | 93.48 | 89.46 | 5796 | 93.27 | 88.89 | 4022 | 4 | 69.39 |
| 6 | 89.93 | 84.59 | 8092 | 89.82 | 85.42 | 7147 | 4 | 88.32 |
| 7 | 89.75 | 84.43 | 8261 | 89.80 | 84.85 | 8142 | 5 | 98.56 |
| 8 | 88.25 | 82.61 | 8110 | 87.62 | 82.95 | 6889 | 4 | 84.94 |
| 9 | 90.57 | 85.33 | 8211 | 90.81 | 85.86 | 7125 | 4 | 86.77 |
| 10 | 88.10 | 83.62 | 7887 | 87.71 | 83.55 | 6912 | 4 | 87.64 |
| Average | | | | | | | | 81.73 |
Share and Cite
Pan, X.; Xu, J.; Zhao, J.; Li, X. Hierarchical Object-Focused and Grid-Based Deep Unsupervised Segmentation Method for High-Resolution Remote Sensing Images. Remote Sens. 2022, 14, 5768. https://doi.org/10.3390/rs14225768