Abstract
This paper introduces a new method for the region segmentation of images. The approach is based on the raster-scan paradigm and builds the segments incrementally. The pixels are processed in the raster-scan order, while the construction of the segments is based on a distance metric in regard to the already segmented pixels in the neighbourhood. The segmentation procedure operates in linear time according to the total number of pixels. The proposed method, named the RSM (raster-scan segmentation method), was tested on selected images from the popular benchmark datasets MS COCO and DIV2K. The experimental results indicate that our method successfully extracts regions with similar pixel values. Furthermore, a comparison with two of the well-known segmentation methods—Watershed and DBSCAN—demonstrates that the proposed approach is superior in regard to efficiency while yielding visually similar results.
1. Introduction
Image segmentation denotes a process of dividing an image into a set of non-overlapping regions [1]. It is one of the most basic image preprocessing methods and is utilised in many applications, including object detection [2,3], medical imaging [4,5,6], remote sensing [7,8], and biometric recognition [9,10,11]. There are numerous methods available for this task, with the main objective of extracting adjacent pixel groups that share similar features and properties [12]. Different approaches can be used for performing image segmentation, which are divided into three groups [13]:
- Region segmentation. Region segmentation methods, also referred to as classic segmentation methods, exploit the similarity between pixels to arrange them into regions [13]. The most common region segmentation methods and their principles are considered briefly in Section 2.
- Collaborative segmentation. Collaborative segmentation (or co-segmentation) methods deal with the segmentation of similar objects in multiple correlated images [14]. The concept was introduced in [15], and has been used primarily for tracking objects in videos [16,17,18].
- Semantic segmentation. The main goal of semantic segmentation methods is to divide an image into meaningful regions that are assigned the most suitable predefined category labels [19,20]. Nowadays, the great majority of semantic segmentation methods are based on deep neural networks and other machine learning techniques. The most popular architectures include SegNet [21], ReSeg [22], DeepLab [23], CFNet [24], and HRViT [25]. Despite their high segmentation performance [26], the downside of such methods is the demand for huge pre-annotated datasets during the training phase of the models.
The existing segmentation methods have proven their strengths in various tasks of image analysis [13]. However, depending on the technique, the available methods have several disadvantages: inefficiency, troublesome implementation, difficult manual tuning of parameters, and a prolonged training phase (machine learning approaches) [12]. The approach proposed in this paper addresses these issues: it is efficient and simple to implement, while still yielding results comparable to related region segmentation methods. It is especially suitable for deployment on edge devices [27] (e.g., surveillance cameras) due to its low computational resource demands.
The main contributions of this paper are as follows:
- A new, highly efficient method, the raster-scan segmentation method (RSM), for the region segmentation of images based on a raster scan.
- The design of different distance metrics, which define the similarity of pixels to segments, and actions, which are utilised for the incremental building of segments.
- Extensive experimental work, which demonstrates the efficiency of the RSM, and a comparison with state-of-the-art region segmentation methods (i.e., Watershed and DBSCAN).
The remainder of the paper is structured in the following way. Section 2 highlights the usage of image segmentation in sensor and actuator networks, and summarises the related methods for performing image segmentations. Section 3 describes the proposed method for the segmentation of images using the considered raster-scan approach, the RSM (raster-scan segmentation method). In Section 4, the results of the proposed method are shown and compared with other popular region segmentation methods. Section 5 concludes the paper, summarises the principles of the RSM, and describes future work.
2. Background and Related Work
Today, edge computing is one of the most popular paradigms in Computer Science [28]. The main characteristic of edge computing is that the majority of computations are performed on edge devices with low computational resources [29]. Sensor and actuator networks frequently include edge devices that gather image data. The segmentation methods should be efficient in terms of time and memory consumption in order to enable fast enough performance of various operations on images (e.g., tracking objects in the scene). Image segmentation methods with low computational complexity have been used successfully on different edge devices in sensor and actuator networks, including remote sensing data [8,30,31], handwritten text [20], medical scans [32,33], and footage from surveillance cameras [34,35]. In robotics, image segmentation is often used for the recognition and analysis of objects within the robot’s surroundings [36,37,38].
The proposed raster-scan segmentation method belongs to the region segmentation methods, as its main goal is to split an image into regions with similar pixel values. In Table 1, the most popular related region segmentation methods and their main advantages and disadvantages are summarised. In the continuation of this section, a brief description of each listed segmentation method is provided.
Table 1.
Related region segmentation methods.
2.1. Edge Detection
Edge detectors are the earliest and most basic methods for region segmentation. Their main goal is to detect pixels with high gradients that represent boundaries of regions [13]. There are several approaches to this task, among which the least complex are differential operators. Their main advantage is low computational complexity. However, their main disadvantage is that they are very prone to producing visually peculiar results if even a small amount of noise is present in an image [55]. Many differential operators have been proposed throughout the years, among which the most popular are the Roberts, Kirsch, Deriche, Prewitt, Laplacian, Sobel, and Canny operators [39]. Their time complexity is O(n), where n signifies the total number of pixels in an image.
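To illustrate why differential operators run in linear time, a 3×3 operator such as Sobel touches each pixel only a constant number of times. A minimal NumPy sketch (the function and kernel names are ours, not taken from the referenced implementations):

```python
import numpy as np

# 3x3 Sobel kernels for the horizontal and vertical gradients.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
SOBEL_Y = SOBEL_X.T

def sobel_magnitude(image):
    """Gradient magnitude of a 2D greyscale image (valid region only)."""
    img = np.asarray(image, dtype=float)
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    # Accumulate the 3x3 correlation: each pixel is read 9 times, so the
    # whole pass is O(n) in the number of pixels.
    for i in range(3):
        for j in range(3):
            patch = img[i:i + h - 2, j:j + w - 2]
            gx += SOBEL_X[i, j] * patch
            gy += SOBEL_Y[i, j] * patch
    return np.hypot(gx, gy)
```

A step edge produces a large magnitude exactly at the boundary columns, while flat regions yield zero; this is the gradient map that edge-based segmentation thresholds.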
The Hough transform is a popular method for the detection of contours in an image. It was proposed by Duda and Hart [56], and has been adopted for many computer vision tasks, including image segmentation [40,57]. The main advantage of the Hough transform is its high accuracy for the detection of regular polygons. On the other hand, it may not produce good segmentation results for concave and wavy structures in an image. Furthermore, the Hough transform can be significantly demanding in terms of time complexity when dealing with such structures; the time complexity of its basic form for detecting lines is analysed in [58].
An iterative active contour approach (also referred to as snakes) was proposed by Kass et al. [59]. It is used for extracting regions from an image by fitting simple polylines to region edges. The latter is performed by energy minimisation [59]: the image forces pull the contour towards important image features (i.e., edges, vertices, and lines) while obeying the constraints (either user-given or determined computationally). This method has been proven to yield good results in some cases [41,42]. However, the active contours method has two major disadvantages: sensitivity to the initial positions of the control points, and difficult initialisation of the input parameters [60]. The time complexity of the basic approach is O(i·c·m), where i signifies the number of iterations, c the selected number of control points, and m the number of scales (the contour is fitted to image regions at several resolutions).
Other image segmentation methods based on edge detection have been proposed in the past, such as graph cut optimisations [61,62]. Although they can produce high-accuracy segmentation results, they are only able to divide an image into two regions (foreground and background).
2.2. Region Division
Region division segmentation methods are based either on the serial or the parallel region division [13]. Serial region division splits an image into regions sequentially, while regions are formed simultaneously when parallel region division is applied.
A typical parallel region division technique is thresholding. In the first step, a threshold value is determined either manually or heuristically. After that, each pixel is assigned a binary value [63], based on whether the pixel’s value is above or below the threshold. Consequently, the obtained binary image consists of two segments. The described approach has low computational complexity. Division into two regions, however, is not particularly useful for images with many details. Therefore, thresholding with multiple thresholds [64] and adaptive local thresholding [43,44] were proposed in order to increase the usability of this segmentation method in practice. The time complexity of the described method is O(n).
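The single-pass nature of thresholding can be sketched in a few lines (a hedged NumPy sketch; the function name and the mean-based heuristic for choosing the threshold are illustrative):

```python
import numpy as np

def threshold_segment(image, t):
    """Parallel region division by thresholding: each pixel is mapped
    to one of two segments in a single pass, i.e. in O(n) time."""
    return (np.asarray(image) >= t).astype(np.uint8)

# A simple heuristic for choosing t automatically is the image mean;
# adaptive local variants recompute t inside local windows instead.
img = np.array([[10, 200], [30, 220]])
binary = threshold_segment(img, t=img.mean())
```

Multi-threshold variants generalise this by mapping each pixel to the bin its value falls into, at the same linear cost.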
Watershed techniques utilise the geology concept of drainage divides for data segmentation [65,66]. A similar approach, based upon serial region division, is used for region segmentation [67,68]. A greyscale image is interpreted as a 3D topographic relief, which is split into valleys, ridges, and slopes [69]. The main advantage of Watershed is its low time complexity, as it operates in O(n) on average. However, it tends to over-segment an image. In order to solve this issue, a marker-controlled Watershed was proposed [45], where significant parts of the image are marked either by a user or using a heuristic.
2.3. Clustering
Clustering is a popular technique for organising data by grouping nearby data samples [70]. It is also used in different fields of image analysis [71,72,73]. One of the earliest and most well-known clustering algorithms is k-means [74]. Its main advantage is low computational complexity. On the other hand, major challenges arise concerning the choice of the parameter k that would yield the best segmentation, and the initialisation of the centroids’ locations. Therefore, several analyses of an image are usually performed before the segmentation procedure [46,47,48,75,76,77]. The time complexity of k-means is O(n·k·t), where t specifies the number of iterations, while k signifies the number of clusters.
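To make the O(n·k·t) bound concrete, a toy k-means over greyscale intensities can be sketched as follows (a simplified illustration, not the implementations cited above; the random initialisation and the fixed iteration count are our assumptions):

```python
import numpy as np

def kmeans_intensity(pixels, k, iters=10, seed=0):
    """Cluster greyscale pixel values with 1D k-means.
    Each of the iters rounds compares n values against k centroids,
    giving the O(n * k * t) cost mentioned in the text."""
    rng = np.random.default_rng(seed)
    data = np.asarray(pixels, dtype=float).ravel()
    centroids = rng.choice(data, size=k, replace=False)
    for _ in range(iters):
        # Assignment step: nearest centroid per pixel value.
        labels = np.abs(data[:, None] - centroids[None, :]).argmin(axis=1)
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            if np.any(labels == c):
                centroids[c] = data[labels == c].mean()
    return labels, centroids
```

The sensitivity to initial centroid locations, noted above, is exactly why seeded or repeated initialisations are used in practice.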
Density-based spatial clustering of applications with noise (DBSCAN) was proposed by Ester et al. [78]. Compared to k-means, its major advantage is its ability to process non-linearly separable image structures. DBSCAN has been used successfully for image segmentation [49,50,51]. Its main disadvantage is that the input parameters have to be tuned precisely in order to obtain the expected results. In the worst case, DBSCAN has a time complexity of O(n²), although, with the proper data structures, its time complexity is reduced to O(n log n) [79].
Superpixel segmentation is a clustering method that splits a weighted graph into subgraphs with low coupling and high cohesion [13]. Such an approach has been used in many image segmentation applications [52,53,54]. Despite the low time complexity of a proper implementation [54], different superpixel segmentation methods have several drawbacks: no explicit control over the number and compactness of superpixels [52], and a failure to correctly segment image regions with high intensity variety [80]. The time complexity of the most efficient superpixel segmentation methods is O(n) [54].
3. Raster-Scan Segmentation Method
The proposed method is based on a raster-scan paradigm that derives from the era of analogue television [81] based on a cathode-ray tube (CRT). The CRT directs a beam of electrons to the phosphorescent coating on the screen. As a result, a spot of light is produced at a place where the electron beam hits the screen coating. In order to display the picture on the screen, the CRT directs the beam from the left to the right, and, after the whole line is drawn, the procedure is repeated in the next row. This drawing order is known as the scan-line order [82], while the covering of the entire screen in this way is referred to as a raster scan [81].
Let I be a raster image with X columns and Y rows. Consequently, the number of pixels in I is n = X · Y. With no loss of generality, I is considered to be greyscale with b bit-planes, yielding 2^b shades of grey. The goal of region image segmentation is to divide I into a set of segments S = {S_1, S_2, …, S_N}, so that ⋃_{i=1}^{N} S_i = I and S_i ∩ S_j = ∅ for i ≠ j. However, it is expected that N ≪ n.
The segmentation of I is performed in the scan-line order, which is why the method is referred to as the raster-scan segmentation method (RSM) in the continuation. The block diagram that represents the sequence of steps in the RSM is displayed in Figure 1.
Figure 1.
Block diagram of RSM.
In the beginning, the segment set is initialised as an empty set (block A in Figure 1). Then, the RSM iterates through I in the scan-line order (see Figure 2). When a pixel is reached, the segments of its neighbouring, already processed pixels are obtained (block B). The distances to these neighbours are calculated with the selected distance metric d (block C), based on which an action A is determined (block D). According to the local neighbourhood, the current pixel is assigned to a segment (either an already existing one or a newly created one). The partial result of the segmentation is stored in a set of edge-connected segments (block E). After all pixels in I have been processed, this set contains the final result of the segmentation (block F). The pseudocode at the end of this section gives the details of the algorithm.
Figure 2.
The processing order of the RSM in I. Already processed pixels are marked with green circles, the current pixel is marked with a yellow circle, and non-processed pixels are marked with red circles.
Edge-connected segments are created and built incrementally during the segmentation procedure with the RSM. Each pixel is assigned to exactly one segment. The selection of the segment depends on the already processed pixels, and is based on the selected distance metric d. The following three distance metrics were used for the calculation of the distances between the current pixel and its neighbourhood:
- One-Neighbour Metric. This metric is based on the differences between the current pixel value and the pixel values of its left and top neighbours. The two differences are calculated according to Equation (1).
- Two-Neighbour Metric. This metric considers the differences between the current pixel value and the average values of the two pixels to the left and the two pixels at the top. The two differences are calculated according to Equation (2).
- Adjacent Segments’ Average Metric. Unlike the other two, this metric assesses larger areas in I instead of only separate adjacent pixels’ values. If the left and the upper neighbours belong to (possibly different) segments, the differences between the current pixel value and the average pixel values of those segments are calculated according to Equation (3).
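The three metrics can be sketched as follows (the exact formulas are Equations (1)–(3) of the paper; the function names and argument conventions here are our assumptions, and the segment averages for the third metric would be maintained incrementally during segmentation):

```python
def one_neighbour(img, x, y):
    """Differences to the immediate left and upper neighbours (Eq. (1))."""
    p = img[y][x]
    return abs(p - img[y][x - 1]), abs(p - img[y - 1][x])

def two_neighbour(img, x, y):
    """Differences to the averages of the two nearest left/upper pixels (Eq. (2))."""
    p = img[y][x]
    left = (img[y][x - 1] + img[y][x - 2]) / 2
    up = (img[y - 1][x] + img[y - 2][x]) / 2
    return abs(p - left), abs(p - up)

def segment_average(img, x, y, mean_left, mean_up):
    """Differences to the average values of the adjacent segments (Eq. (3)).
    mean_left/mean_up are the running averages of the segments containing
    the left and the upper neighbour, respectively."""
    p = img[y][x]
    return abs(p - mean_left), abs(p - mean_up)
```

Averaging over two pixels (or over a whole segment) smooths out single-pixel noise, which is why the latter metrics behave differently on weakly defined edges.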
In order to control the segmentation process, the user needs to provide two parameters. The first parameter is the distance threshold, below which the current pixel can be added to an adjacent segment. The second parameter is the maximum allowed deviation from a segment’s average value. There are five possible actions A when assigning a pixel to a segment:
- Merge. If the distances to both the left and the upper neighbours are below the threshold, the deviations from both segments’ average values are within the allowed maximum, and the two neighbours belong to different segments, those segments are merged into a single segment. After that, the merged segment is expanded with the current pixel using a set union (Figure 3).
Figure 3. The action Merge: before (a) and after (b) the processing of the current pixel, marked by a red square. The green arrows indicate distances below the threshold.
- Add. If both distances are below the threshold, both deviations are within the allowed maximum, and both neighbours belong to the same segment, that segment is expanded with the current pixel (Figure 4).
Figure 4. The action Add: before (a) and after (b) the processing of the current pixel, marked by a red square. The green arrows indicate distances below the threshold.
- Add-Left. If only the conditions for the left neighbour are satisfied, the left neighbour’s segment is expanded with the current pixel (Figure 5).
Figure 5. The action Add-Left: before (a) and after (b) the processing of the current pixel, marked by a red square. The green arrow indicates a distance below the threshold, while the red arrow indicates a distance above it.
- Add-Up. If only the conditions for the upper neighbour are satisfied, the upper neighbour’s segment is expanded with the current pixel (Figure 6).
Figure 6. The action Add-Up: before (a) and after (b) the processing of the current pixel, marked by a red square. The green arrow indicates a distance below the threshold, while the red arrow indicates a distance above it.
- New. In all other cases, a new segment containing only the current pixel is added to the segment set (Figure 7).
Figure 7. The action New: before (a) and after (b) the processing of the current pixel, marked by a red square. The red arrows indicate distances above the threshold.
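The choice among the five actions can be condensed into a small decision function (a sketch: for brevity, the additional check of the deviation from a segment’s average value is folded into the two boolean conditions):

```python
def choose_action(d_left, d_up, seg_left, seg_up, theta):
    """Select one of the five RSM actions for the current pixel.
    d_left/d_up are the distances given by the chosen metric,
    seg_left/seg_up are the segment indices of the neighbours
    (None at the image border), and theta is the distance threshold."""
    left_ok = seg_left is not None and d_left < theta
    up_ok = seg_up is not None and d_up < theta
    if left_ok and up_ok:
        # Same segment on both sides -> Add; different segments -> Merge.
        return "Add" if seg_left == seg_up else "Merge"
    if left_ok:
        return "Add-Left"
    if up_ok:
        return "Add-Up"
    return "New"
```

Note that Merge and Add differ only in whether the two neighbours already share a segment, which is why both require both distances to pass the threshold.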
The pseudocode of the RSM is presented in Algorithm 1. In the initialisation phase, an empty set of segments is created. In order to enable direct access to the segments of already processed pixels, a matrix M of segment indices is used, with all entries initialised to −1. The loops in Lines 11 and 12 iterate through I. When a pixel is reached, the indices of the left and the upper segments are obtained from M in Lines 13 and 14, and the corresponding segments are obtained in Lines 15 and 16. The distances to the left and the upper neighbours are calculated in Lines 17 and 18 using the provided distance metric d (according to Equation (1), (2), or (3)). In Line 20, one of the five actions A is selected, based on the calculated distances, the two user-given parameters, and the neighbouring segments. If the action Merge is chosen, the segments of the left and the upper pixels are merged into a new segment, to which the current pixel is added in Line 24. The newly created segment is added to the segment set in Line 25, while the matrix M is updated in Line 26. In the case of the action Add, the current pixel is added to the common segment of the left and top pixels in Line 28. After that, the segment set and M are updated. The actions Add-Left and Add-Up add the current pixel to the left or the top segment in Line 32 and Line 36, respectively. In both cases, the segment set and M are then updated accordingly. In the case of the action New, a new segment is created and added to the segment set in Line 41. After that, in Line 42, M is also updated. The final result of the algorithm is the set of segments.
Algorithm 1. Region image segmentation with the RSM.
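The procedure can be condensed into the following sketch (a simplified illustration of Algorithm 1: it uses the one-neighbour metric only, omits the maximum-deviation check, and implements Merge with a union-find forest; all names are ours):

```python
def rsm_segment(img, theta):
    """Single-pass raster-scan segmentation of a 2D greyscale image.
    Returns a matrix of canonical segment labels."""
    h, w = len(img), len(img[0])
    M = [[-1] * w for _ in range(h)]      # segment index per pixel
    parent = []                           # union-find forest over segment ids

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for y in range(h):                     # scan-line order: row by row,
        for x in range(w):                 # left to right within a row
            p = img[y][x]
            sl = find(M[y][x - 1]) if x > 0 else None
            su = find(M[y - 1][x]) if y > 0 else None
            left_ok = sl is not None and abs(p - img[y][x - 1]) < theta
            up_ok = su is not None and abs(p - img[y - 1][x]) < theta
            if left_ok and up_ok:
                if sl != su:
                    parent[su] = sl        # Merge the two segments
                M[y][x] = sl               # ... then Add the pixel
            elif left_ok:
                M[y][x] = sl               # Add-Left
            elif up_ok:
                M[y][x] = su               # Add-Up
            else:
                parent.append(len(parent))  # New segment
                M[y][x] = len(parent) - 1
    # Resolve merged labels to their canonical roots.
    return [[find(M[y][x]) for x in range(w)] for y in range(h)]
```

Each pixel is visited once, segment lookups in M are O(1), and the union-find merges are amortised near-constant, which is where the overall linear behaviour comes from.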
4. Results
The results of the RSM are presented and discussed in this section. For testing purposes, the popular image datasets MS COCO [83] and DIV2K [84] were used. MS COCO is a well-known image dataset that is often used for the benchmarking of segmentation algorithms [85,86]. It consists of more than 330,000 images. Meanwhile, DIV2K is a considerably smaller dataset as it consists of 1000 images. Primarily, DIV2K is used as a benchmark dataset for super-resolution tasks [87]. In our case, we used both datasets to diversify the test image samples. The performance of the RSM is demonstrated on 10 greyscale images that were selected randomly from MS COCO and DIV2K. The high-resolution images belong to DIV2K, while the images with lower resolutions belong to MS COCO. The names of the selected images, the datasets that they belong to, their resolutions, and the total numbers of pixels n are shown in Table 2.
Table 2.
Names, source datasets, resolutions, and total number of pixels of the selected test images I.
In Figure 8, the results of image segmentation with the RSM are displayed. Different distance metrics d, described in Section 3, were used. For visualisation purposes, the segments are coloured with randomly selected colours, with the limitation that adjacent segments do not have the same colour. Such colouring enables distinguishing between adjacent segments.
Figure 8.
Image segmentation with the RSM using different distance metrics d: (a) input image, (b–d) segmentations with the three distance metrics described in Section 3. Each segment is coloured with a random colour.
The numbers of segments for each test image using the different distance metrics are shown in Table 3. During the segmentation, a suitable action A (Merge, Add, Add-Left, Add-Up, or New) is selected according to the local neighbourhood of the current pixel. For all test images, one of the metrics consistently produces the highest number of segments, while another consistently produces the lowest. In Figure 8b, it can be seen that the corresponding metric produces fewer segments on the ground than the other two. One of the metrics is also able to extract segments that do not contain pixels with large gradients (e.g., segments that represent the sky). Therefore, the most appropriate distance metric d depends on the properties of I, in particular on whether the edges in I are clearly defined with large gradients or not.
Table 3.
Number of segments for test images according to d.
Figure 9 shows region segmentations of the image Train with different values of the distance threshold (limited to the range that produces sensible segmentation results). The same distance metric was used for all the segmentations. Table 4 shows the number of segments for the test images according to different threshold values. Smaller threshold values cause the segmentation to yield more segments. Furthermore, by selecting too small a threshold, the RSM tends to over-segment the image (as seen in Figure 9b). Therefore, it is sensible not to choose values that are close to 0. On the other hand, a threshold that is too large can lead to a failure in extracting segments with soft edges (e.g., in Figure 9d, where the ground and the bottom part of the train are merged into a single segment).
Figure 9.
Image segmentation with the RSM according to different values of the distance threshold: (a) input image, (b–d) increasing threshold values. Each segment is coloured with a random colour.
Table 4.
Number of segments for the test images according to the distance threshold.
Figure 10 displays a comparison between the segmentation results with different values of the maximum deviation from a segment’s average value. The numbers of segments for the test images are shown in Table 5. Effectively, a small maximum deviation impacts the segmentation so that regions with small pixel gradients but large overall differences in pixel values are split into several segments. Consequently, it can be observed that increasing the maximum deviation decreases the number of segments. The decrease in the number of segments is especially noticeable when dealing with smaller values of this parameter.
Figure 10.
Image segmentation with the RSM according to different values of the maximum deviation: (a) input image, (b–d) increasing values of the parameter. Each segment is coloured with a random colour.
Table 5.
Number of segments for the test images according to the maximum-deviation parameter.
Examples of region image segmentations of the test images Baseball, Stop Sign, and Taxi are displayed in Figure 11, Figure 12, and Figure 13, respectively. The distance metric, the distance threshold, and the maximum deviation were tuned manually, so that the segmentation yielded the best visual results. It can be seen that regions with low variance in pixel values form common segments, while the important image details, such as edges and vertices, are preserved. Although the majority of the regions were detected as expected, some parts of the image are over-segmented (especially noticeable in Figure 12 below the Stop Sign).
Figure 11.
Region segmentation of Baseball: (a) input image, (b) segmented image.
Figure 12.
Region segmentation of Stop Sign: (a) input image, (b) segmented image.
Figure 13.
Region segmentation of Taxi: (a) input image, (b) segmented image.
In Figure 14, different region segmentation methods are compared on the test image Canal. Besides the RSM, we used the marker-controlled Watershed method from the OpenCV library [88] and DBSCAN from the popular C++ library mlpack [89]. In Table 6, the numbers of segments are summarised for the test image Canal. Although Watershed yielded the fewest segments among the three methods, it over-segmented I in some areas (e.g., the water surface). On the other hand, the result of DBSCAN was considerably better, as larger areas are grouped to form segments. The RSM produces visually similar results to DBSCAN.
Figure 14.
Comparison of different region segmentation methods: (a) input image, (b) Watershed, (c) DBSCAN, (d) RSM.
Table 6.
Number of segments using different segmentation methods.
In terms of theoretical time complexity, with n denoting the total number of pixels, the worst case for Watershed segmentation is O(n log n); on average, though, the algorithm performs in linear time [90]. The worst-case time complexity of DBSCAN with a proper implementation is O(n log n). The RSM processes each pixel only once, which is performed in linear time O(n). Each classified pixel is assigned to a certain segment whose index is stored in a 2D matrix; consequently, a query for a segment during the segmentation procedure is performed in O(1). The operation of merging can be implemented with the union-find data structure, which has an amortised time complexity of O(α(n)) per operation, where α is the inverse Ackermann function [91]. Based on this, it can be concluded that the time complexity of the RSM is O(n).
The average elapsed CPU times of region segmentation on the test images with Watershed, DBSCAN, and the RSM are shown in Table 7. The experiments were conducted on a personal computer with an Intel Core i9-12900K CPU, 64 GB of DDR5 RAM, and Windows 11 OS. Watershed’s performance turned out to be almost on a par with the RSM, while DBSCAN was considerably slower for images with large numbers of pixels. Despite the good performance of Watershed, the RSM was faster in all the test cases.
Table 7.
Time measurements of different region segmentation methods on test images. The measurements that are bold indicate the method that performed the fastest for a given test image.
5. Conclusions
This paper introduces a new method, the raster-scan segmentation method (RSM), for the region segmentation of images based on a raster scan. During the segmentation in scan-line order, image segments are built incrementally according to the similarity in the pixels’ local neighbourhoods. Different distance metrics are used to estimate the similarity between pixels. The RSM was extensively tested on images from the public image datasets MS COCO and DIV2K. Five images were selected randomly from each dataset, and were used to demonstrate the results of the proposed method. The RSM was compared with the widely used region segmentation methods Watershed and DBSCAN. Extensive experimental work revealed that the RSM brings several advantages. Besides its ability to extract segments with pixels that have small gradients, the RSM demonstrated comparable results to other state-of-the-art region segmentation methods while being faster in terms of elapsed CPU time.
Region image segmentation could be used as a preprocessing step in different applications of image processing. The RSM could be applied on real-world datasets, or could be used as a real-time segmentation method in sensor and actuator networks due to its low computational complexity. Another possible use case would be remote sensing. Our method could be used as an alternative to other segmentation methods in order to extract different landscape elements from satellite images. Lastly, a possible domain where the RSM could be applied is object-based image compression, where compression could be performed on extracted segments instead of in the pixel space.
There are several ways to implement improvements to the RSM in the future. To start with, the RSM has no mechanisms to deal with noise present in an image. Additional studies on various types and intensities of noise, applied on images, should be conducted. After that, the most suitable noise-removing methods in the preprocessing step should be identified. In addition, currently, the method’s parameters are tuned manually, which can be inefficient in real-life applications. Consequently, an automatic approach for selecting the parameters, based on the extracted image’s characteristics, should be developed.
Author Contributions
Conceptualization, L.L. and B.Ž.; methodology, L.L., A.N., D.S. and B.Ž.; software, L.L. and Š.H.; validation, L.L., A.N. and B.Ž.; formal analysis, D.S. and B.Ž.; investigation, A.N. and D.S.; resources, Š.H. and B.Ž.; data curation, L.L. and Š.H.; writing—original draft preparation, L.L.; writing—review and editing, L.L., Š.H. and B.Ž.; visualization, L.L.; supervision, B.Ž. and D.S.; project administration, B.Ž.; funding acquisition, B.Ž. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded by the Slovenian Research and Innovation Agency under Research Project J2-4458 and Research Programme P2-0041.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
The source code of the image segmentation software is available at https://github.com/luckyLukac/ScanLineImageSegmentation, accessed on 30 September 2024. The datasets MS COCO and DIV2K are available at https://cocodataset.org/ and https://data.vision.ee.ethz.ch/cvl/DIV2K, respectively (accessed on 30 September 2024).
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Cheng, H.; Jiang, X.; Sun, Y.; Wang, J. Color image segmentation: Advances and prospects. Pattern Recognit. 2001, 34, 2259–2281.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Region-based convolutional networks for accurate object detection and segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 38, 142–158.
- Hoeser, T.; Kuenzer, C. Object detection and image segmentation with deep learning on Earth observation data: A review-Part I: Evolution and recent trends. Remote Sens. 2020, 12, 1667.
- Pham, D.L.; Xu, C.; Prince, J.L. Current methods in medical image segmentation. Annu. Rev. Biomed. Eng. 2000, 2, 315–337.
- Fenster, A.; Chiu, B. Evaluation of Segmentation Algorithms for Medical Imaging. In Proceedings of the 2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, Shanghai, China, 1–4 September 2005; pp. 7186–7189.
- Guo, Z.; Li, X.; Huang, H.; Guo, N.; Li, Q. Deep learning-based image segmentation on multimodal medical imaging. IEEE Trans. Radiat. Plasma. Med. Sci. 2019, 3, 162–169.
- Wang, Z.; Jensen, J.R.; Im, J. An automatic region-based image segmentation algorithm for remote sensing applications. Environ. Modell. Softw. 2010, 25, 1149–1165.
- Kotaridis, I.; Lazaridou, M. Remote sensing image segmentation advances: A meta-analysis. ISPRS J. Photogramm. Remote Sens. 2021, 173, 309–322.
- El-Hajj-Chehade, W.; Abdel Kader, R.; Kassem, R.; El-Zaart, A. Image Segmentation for Fingerprint Recognition. In Proceedings of the 2018 IEEE Applied Signal Processing Conference (ASPCON), Kolkata, India, 7–9 December 2018; pp. 314–319.
- Tobji, R.; Di, W.; Ayoub, N. FMnet: Iris segmentation and recognition by using fully and multi-scale CNN for biometric security. Appl. Sci. 2019, 9, 2042.
- Trokielewicz, M.; Czajka, A.; Maciejewicz, P. Post-mortem iris recognition with deep-learning-based image segmentation. Image Vision Comput. 2020, 94, 103866.
- Kaur, D.; Kaur, Y. Various image segmentation techniques: A review. Int. J. Comput. Sci. Mob. Comput. 2014, 3, 809–814.
- Yu, Y.; Wang, C.; Fu, Q.; Kou, R.; Huang, F.; Yang, B.; Yang, T.; Gao, M. Techniques and challenges of image segmentation: A review. Electronics 2023, 12, 1199.
- Vicente, S.; Rother, C.; Kolmogorov, V. Object Cosegmentation. In Proceedings of the CVPR 2011, Colorado Springs, CO, USA, 20–25 June 2011; pp. 2217–2224.
- Rother, C.; Minka, T.; Blake, A.; Kolmogorov, V. Cosegmentation of Image Pairs by Histogram Matching—Incorporating a Global Constraint into MRFs. In Proceedings of the 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’06), New York, NY, USA, 17–22 June 2006; pp. 993–1000.
- Chen, D.J.; Chen, H.T.; Chang, L.W. Video Object Cosegmentation. In Proceedings of the 20th ACM International Conference on Multimedia, Nara, Japan, 29 October–2 November 2012; pp. 805–808.
- Liu, Z.; Wang, L.; Hua, G.; Zhang, Q.; Niu, Z.; Wu, Y.; Zheng, N. Joint video object discovery and segmentation by coupled dynamic Markov networks. IEEE Trans. Image Process. 2018, 27, 5840–5853.
- Wang, L.; Lv, X.; Zhang, Q.; Niu, Z.; Zheng, N.; Hua, G. Object cosegmentation in noisy videos with multilevel hypergraph. IEEE Trans. Multimedia 2021, 23, 1287–1300.
- Hao, S.; Zhou, Y.; Guo, Y. A brief survey on semantic segmentation with deep learning. Neurocomputing 2020, 406, 302–321.
- Patil, S.; Varadarajan, V.; Mahadevkar, S.; Athawade, R.; Maheshwari, L.; Kumbhare, S.; Garg, Y.; Dharrao, D.; Kamat, P.; Kotecha, K. Enhancing optical character recognition on images with mixed text using semantic segmentation. J. Sens. Actuator Netw. 2022, 11, 63.
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 2481–2495.
- Visin, F.; Ciccone, M.; Romero, A.; Kastner, K.; Cho, K.; Bengio, Y.; Matteucci, M.; Courville, A. ReSeg: A Recurrent Neural Network-Based Model for Semantic Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Las Vegas, NV, USA, 26 June–1 July 2016.
- Chen, L.C.; Papandreou, G.; Kokkinos, I.; Murphy, K.; Yuille, A.L. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Trans. Pattern Anal. Mach. Intell. 2018, 40, 834–848.
- Zhang, H.; Zhang, H.; Wang, C.; Xie, J. Co-Occurrent Features in Semantic Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 548–557.
- Gu, J.; Kwon, H.; Wang, D.; Ye, W.; Li, M.; Chen, Y.H.; Lai, L.; Chandra, V.; Pan, D.Z. Multi-Scale High-Resolution Vision Transformer for Semantic Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 12094–12103. [Google Scholar] [CrossRef]
- Kulhandjian, H.; Barron, J.; Tamiyasu, M.; Thompson, M.; Kulhandjian, M. AI-based pedestrian detection and avoidance at night using multiple sensors. J. Sens. Actuator Netw. 2024, 13, 34. [Google Scholar] [CrossRef]
- Chen, J.; Ran, X. Deep learning with edge computing: A review. Proc. IEEE 2019, 107, 1655–1674. [Google Scholar] [CrossRef]
- Khan, W.Z.; Ahmed, E.; Hakak, S.; Yaqoob, I.; Ahmed, A. Edge computing: A survey. Future Gener. Comput. Syst. 2019, 97, 219–235. [Google Scholar] [CrossRef]
- Nayak, S.; Patgiri, R.; Waikhom, L.; Ahmed, A. A review on edge analytics: Issues, challenges, opportunities, promises, future directions, and applications. Digit. Commun. Netw. 2024, 10, 783–804. [Google Scholar] [CrossRef]
- Hossain, M.D.; Chen, D. Segmentation for Object-Based Image Analysis (OBIA): A review of algorithms and challenges from remote sensing perspective. ISPRS J. Photogramm. Remote Sens. 2019, 150, 115–134. [Google Scholar] [CrossRef]
- Jahanbakht, M.; Xiang, W.; Waltham, N.J.; Azghadi, M.R. Distributed deep learning and energy-efficient real-time image processing at the edge for fish segmentation in underwater videos. IEEE Access 2022, 10, 117796–117807. [Google Scholar] [CrossRef]
- Wang, E.K.; Chen, C.M.; Hassan, M.M.; Almogren, A. A deep learning based medical image segmentation technique in Internet-of-Medical-Things domain. Future Gener. Comput. Syst. 2020, 108, 135–144. [Google Scholar] [CrossRef]
- Ali, O.; Ali, H.; Shah, S.A.A.; Shahzad, A. Implementation of a modified U-Net for medical image segmentation on edge devices. IEEE Trans. Circuits Syst. II Express Briefs 2022, 69, 4593–4597. [Google Scholar] [CrossRef]
- Gruosso, M.; Capece, N.; Erra, U. Human segmentation in surveillance video with deep learning. Multimed. Tools Appl. 2021, 80, 1175–1199. [Google Scholar] [CrossRef]
- Muhadi, N.A.; Abdullah, A.F.; Bejo, S.K.; Mahadi, M.R.; Mijic, A. Deep learning semantic segmentation for water level estimation using surveillance camera. Appl. Sci. 2021, 11, 9691. [Google Scholar] [CrossRef]
- Bruce, J.; Balch, T.; Veloso, M. Fast and Inexpensive Color Image Segmentation for Interactive Robots. In Proceedings of the 2000 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2000), Takamatsu, Japan, 31 October–5 November 2000; pp. 2061–2066. [Google Scholar] [CrossRef]
- Wan, H.; Fan, Z.; Yu, X.; Kang, M.; Wang, P.; Zeng, X. A real-time branch detection and reconstruction mechanism for harvesting robot via convolutional neural network and image segmentation. Comput. Electron. Agric. 2022, 192, 106609. [Google Scholar] [CrossRef]
- Podgorelec, D.; Uran, S.; Nerat, A.; Bratina, B.; Pečnik, S.; Dimec, M.; Žaberl, F.; Žalik, B.; Šafarič, R. LiDAR-based maintenance of a safe distance between a human and a robot arm. Sensors 2023, 23, 4305. [Google Scholar] [CrossRef] [PubMed]
- Nixon, M.S.; Aguado, A.S. Feature Extraction and Image Processing, 2nd ed.; Academic Press: Oxford, UK, 2013. [Google Scholar]
- Ding, Y.; Ping, X.; Hu, M.; Wang, D. Range image segmentation based on randomized Hough transform. Pattern Recognit. Lett. 2005, 26, 2033–2041. [Google Scholar] [CrossRef]
- Sagiv, C.; Sochen, N.; Zeevi, Y. Integrated active contours for texture segmentation. IEEE Trans. Image Process. 2006, 15, 1633–1646. [Google Scholar] [CrossRef] [PubMed]
- Zhang, K.; Zhang, L.; Song, H.; Zhou, W. Active contours with selective local or global segmentation: A new formulation and level set method. Image Vision Comput. 2010, 28, 668–676. [Google Scholar] [CrossRef]
- Jiang, X.; Mojon, D. Adaptive local thresholding by verification-based multithreshold probing with application to vessel detection in retinal images. IEEE Trans. Pattern Anal. Mach. Intell. 2003, 25, 131–137. [Google Scholar] [CrossRef]
- Phansalkar, N.; More, S.; Sabale, A.; Joshi, M. Adaptive Local Thresholding for Detection of Nuclei in Diversity Stained Cytology Images. In Proceedings of the 2011 International Conference on Communications and Signal Processing, Kerala, India, 10–12 February 2011; pp. 218–220. [Google Scholar] [CrossRef]
- Benson, C.C.; Rajamani, K.; Lajish, V.L. A review on automatic marker identification methods in watershed algorithms used for medical image segmentation. Int. J. Innov. Sci. Eng. Technol. 2015, 2, 829–839. [Google Scholar]
- Dhanachandra, N.; Manglem, K.; Chanu, Y.J. Image segmentation using K-means clustering algorithm and subtractive clustering algorithm. Procedia Comput. Sci. 2015, 54, 764–771. [Google Scholar] [CrossRef]
- Zheng, X.; Lei, Q.; Yao, R.; Gong, Y.; Yin, Q. Image segmentation based on adaptive K-means algorithm. J. Image Video Proc. 2018, 2018, 1–10. [Google Scholar] [CrossRef]
- Debelee, T.G.; Schwenker, F.; Rahimeto, S.; Yohannes, D. Evaluation of modified adaptive K-means segmentation algorithm. Comput. Vis. Media. 2019, 5, 347–361. [Google Scholar] [CrossRef]
- Manavalan, R.; Thangavel, K. TRUS Image Segmentation Using Morphological Operators and DBSCAN Clustering. In Proceedings of the 2011 World Congress on Information and Communication Technologies, Mumbai, India, 11–14 December 2011; pp. 898–903. [Google Scholar] [CrossRef]
- Baselice, F.; Coppolino, L.; D’Antonio, S.; Ferraioli, G.; Sgaglione, L. A DBSCAN Based Approach for Jointly Segment and Classify Brain MR Images. In Proceedings of the 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Milan, Italy, 25–29 August 2015; pp. 2993–2996. [Google Scholar] [CrossRef]
- Kurumalla, S.; Rao, P.S. K-nearest neighbor based DBSCAN clustering algorithm for image segmentation. J. Theor. Appl. Inf. Technol. 2016, 92, 395–402. [Google Scholar]
- Comaniciu, D.; Meer, P. Mean shift: A robust approach toward feature space analysis. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 603–619. [Google Scholar] [CrossRef]
- Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC superpixels compared to state-of-the-art superpixel methods. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 2274–2282. [Google Scholar] [CrossRef]
- Li, Z.; Chen, J. Superpixel Segmentation Using Linear Spectral Clustering. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1356–1363. [Google Scholar]
- Muñoz, X.; Freixenet, J.; Cufı, X.; Martı, J. Strategies for image segmentation combining region and boundary information. Pattern Recognit. Lett. 2003, 24, 375–392. [Google Scholar] [CrossRef]
- Duda, R.O.; Hart, P.E. Use of the Hough transformation to detect lines and curves in pictures. Commun. ACM 1972, 15, 11–15. [Google Scholar] [CrossRef]
- Tian, Q.C.; Pan, Q.; Cheng, Y.M.; Gao, Q.X. Fast Algorithm and Application of Hough Transform in Iris Segmentation. In Proceedings of the 2004 International Conference on Machine Learning and Cybernetics, Shanghai, China, 26–29 August 2004; pp. 3977–3980. [Google Scholar] [CrossRef]
- Asano, T.; Katoh, N. Variants for the Hough transform for line detection. Comput. Geom. 1996, 6, 231–252. [Google Scholar] [CrossRef]
- Kass, M.; Witkin, A.; Terzopoulos, D. Snakes: Active contour models. Int. J. Comput. Vision 1988, 1, 321–331. [Google Scholar] [CrossRef]
- Xu, G.; Segawa, E.; Tsuji, S. Robust active contours with insensitive parameters. Pattern Recognit. 1994, 27, 879–884. [Google Scholar] [CrossRef]
- Freedman, D.; Zhang, T. Interactive Graph Cut Based Segmentation with Shape Priors. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), San Diego, CA, USA, 20–25 June 2005; pp. 755–762. [Google Scholar] [CrossRef]
- Boykov, Y.; Funka-Lea, G. Graph cuts and efficient ND image segmentation. Int. J. Comput. Vision 2006, 70, 109–131. [Google Scholar] [CrossRef]
- Shapiro, L.G.; Stockman, G.C. Computer Vision; Prentice Hall: Hoboken, NJ, USA, 2001. [Google Scholar]
- Reddi, S.S.; Rudin, S.F.; Keshavan, H.R. An optimal multiple threshold scheme for image segmentation. IEEE Trans. Syst. Man Cybern. 1984, SMC-14, 661–665. [Google Scholar] [CrossRef]
- Mongus, D.; Žalik, B. An efficient approach to 3D single tree-crown delineation in LiDAR data. ISPRS J. Photogramm. Remote Sens. 2015, 108, 219–233. [Google Scholar] [CrossRef]
- Forte, A.M.; Whipple, K.X. Criteria and tools for determining drainage divide stability. Earth Planet. Sci. Lett. 2018, 493, 102–117. [Google Scholar] [CrossRef]
- Vincent, L.; Soille, P. Watersheds in digital spaces: An efficient algorithm based on immersion simulations. IEEE Trans. Pattern Anal. Mach. Intell. 1991, 13, 583–598. [Google Scholar] [CrossRef]
- Meijster, A.; Roerdink, J.B.T.M. A Proposal for the Implementation of a Parallel Watershed Algorithm. In Proceedings of the Computer Analysis of Images and Patterns, Prague, Czech Republic, 6–8 September 1995; pp. 790–795. [Google Scholar] [CrossRef]
- Kornilov, A.S.; Safonov, I.V. An overview of watershed algorithm implementations in open source libraries. J. Imaging 2018, 4, 123. [Google Scholar] [CrossRef]
- El Habib Kahla, M.; Beggas, M.; Laouid, A.; Hammoudeh, M. A nature-inspired partial distance-based clustering algorithm. J. Sens. Actuator Netw. 2024, 13, 36. [Google Scholar] [CrossRef]
- Žalik, K.R.; Žalik, B. A sweep-line algorithm for spatial clustering. Adv. Eng. Softw. 2009, 40, 445–451. [Google Scholar] [CrossRef]
- Lukač, N.; Žalik, B.; Žalik, K.R. Sweep-hyperplane clustering algorithm using dynamic model. Informatica 2014, 25, 563–580. [Google Scholar] [CrossRef]
- Mittal, M.; Goyal, L.M.; Hemanth, D.J.; Sethi, J.K. Clustering approaches for high-dimensional databases: A review. WIREs Data Min. Knowl. Discov. 2019, 9, e1300. [Google Scholar] [CrossRef]
- MacQueen, J. Some Methods for Classification and Analysis of Multivariate Observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, CA, USA, 21 June–18 July 1965 and 27 December 1965–7 January 1966; University of California Press: Berkeley, CA, USA, 1967; pp. 281–297. [Google Scholar]
- Luo, M.; Ma, Y.F.; Zhang, H.J. A Spatial Constrained K-means Approach to Image Segmentation. In Proceedings of the Fourth International Conference on Information, Communications and Signal Processing and the Fourth Pacific Rim Conference on Multimedia, Singapore, 15–18 December 2003; pp. 738–742. [Google Scholar] [CrossRef]
- Žalik, K.R. An efficient k’-means clustering algorithm. Pattern Recognit. Lett. 2008, 29, 1385–1391. [Google Scholar] [CrossRef]
- Mat Isa, N.A.; Salamah, S.A.; Ngah, U.K. Adaptive fuzzy moving K-means clustering algorithm for image segmentation. IEEE Trans. Consum. Electron. 2009, 55, 2145–2153. [Google Scholar] [CrossRef]
- Ester, M.; Kriegel, H.P.; Sander, J.; Xu, X. A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. In Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, USA, 2–4 August 1996; pp. 226–231. [Google Scholar]
- Ali, T.; Asghar, S.; Sajid, N.A. Critical Analysis of DBSCAN Variations. In Proceedings of the 2010 International Conference on Information and Emerging Technologies, Karachi, Pakistan, 14–16 June 2010; pp. 1–6. [Google Scholar] [CrossRef]
- Felzenszwalb, P.F.; Huttenlocher, D.P. Efficient graph-based image segmentation. Int. J. Comput. Vis. 2004, 59, 167–181. [Google Scholar] [CrossRef]
- Jones, P.F. The graphics screen. In CAD/CAM: Features, Applications and Management; Macmillan Education: London, UK, 1992; pp. 21–30. [Google Scholar] [CrossRef]
- Martin, A. Cathode ray tubes for industrial and military applications. Adv. Imaging Electron Phys. 1986, 67, 183–328. [Google Scholar] [CrossRef]
- Lin, T.Y.; Maire, M.; Belongie, S.; Hays, J.; Perona, P.; Ramanan, D.; Dollár, P.; Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Proceedings of the Computer Vision–ECCV 2014, Zurich, Switzerland, 6–12 September 2014; Springer: Cham, Switzerland; pp. 740–755. [Google Scholar] [CrossRef]
- Agustsson, E.; Timofte, R. NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Honolulu, HI, USA, 21–26 July 2017; pp. 1122–1131. [Google Scholar] [CrossRef]
- Zhang, H.; Tian, Y.; Wang, K.; Zhang, W.; Wang, F.Y. Mask SSD: An effective single-stage approach to object instance segmentation. IEEE Trans. Image Process. 2020, 29, 2078–2093. [Google Scholar] [CrossRef]
- Masubuchi, S.; Watanabe, E.; Seo, Y.; Okazaki, S.; Sasagawa, T.; Watanabe, K.; Taniguchi, T.; Machida, T. Deep-learning-based image segmentation integrated with optical microscopy for automatically searching for two-dimensional materials. NPJ 2D Mater. Appl. 2020, 4, 3. [Google Scholar] [CrossRef]
- Timofte, R.; Agustsson, E.; Van Gool, L.; Yang, M.H.; Zhang, L.; Lim, B.; Son, S.; Kim, H.; Nah, S.; Lee, K.M.; et al. NTIRE 2017 Challenge on Single Image Super-Resolution: Methods and Results. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
- OpenCV: Image Segmentation with Watershed Algorithm. Available online: https://docs.opencv.org/4.x/d3/db4/tutorial_py_watershed.html (accessed on 1 October 2024).
- Curtin, R.R.; Edel, M.; Shrit, O.; Agrawal, S.; Basak, S.; Balamuta, J.J.; Birmingham, R.; Dutt, K.; Eddelbuettel, D.; Garg, R.; et al. mlpack 4: A fast, header-only C++ machine learning library. J. Open Source Softw. 2023, 8, 5026. [Google Scholar] [CrossRef]
- Bieniek, A.; Moga, A. An efficient watershed algorithm based on connected components. Pattern Recognit. 2000, 33, 907–916. [Google Scholar] [CrossRef]
- Cormen, T.; Leiserson, C.; Rivest, R.; Stein, C. Introduction to Algorithms, 4th ed.; MIT Press: London, UK, 2022. [Google Scholar]
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).