
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

An approach based on an improved quadtree structure and a region adjacency graph for the segmentation of high-resolution remote sensing images is proposed in this paper. To obtain the initial segmentation, the image is first iteratively split into quarter sections and a quadtree structure is constructed. In this process, an improved fast calculation method for the image standard deviation is proposed, which significantly increases the speed of quadtree segmentation under the standard deviation criterion. A spatial indexing structure is then built on the quadtree using improved Morton encoding, which provides the merging process with a data structure for neighborhood queries. Then, to obtain the final segmentation result, we construct a feature vector using both spectral and texture factors and propose an algorithm for region merging based on the region adjacency graph technique. Finally, to validate the method, experiments were performed on GeoEye-1 and IKONOS color images, and the segmentation results were compared with two typical algorithms: multi-resolution segmentation and Mean-Shift segmentation. The experimental results showed that: (1) compared with multi-resolution and Mean-Shift segmentation, our method increased efficiency by 3–5 times and 10 times, respectively; (2) compared with the typical algorithms, the new method significantly improved the accuracy of segmentation.

Image segmentation divides an image into partitions and is typically used to recognize objects or other relevant information in digital images [

Traditional pixel-based segmentation methods, such as K-Means and ISODATA, produce a large number of trivial segments. Many of these segments are actually noise, which seriously affects subsequent analysis. The main difficulty of image segmentation lies in efficient region generation and merging [

The quadtree structure was proposed in 1974 by Raphael Finkel and J.L. Bentley [

Currently, typical bottom-up approaches (such as the MS algorithm and Full Lambda-Schedule) start from pixels and are likely to produce many trivial segments, which makes post-processing difficult. At the same time, remote sensing images are usually very large, so a direct merging process based on pixels is time-consuming. For many existing algorithms, high complexity is one of the bottlenecks to practical application. Top-down approaches such as quadtree decomposition are usually highly efficient. However, they produce regions of inconsistent sizes, which makes it difficult to conduct the neighborhood search needed for region merging. To overcome this difficulty and improve the efficiency of the neighborhood search, we used the quadtree as the basic structure for analysis and added a spatial indexing mechanism based on improved Morton coding. Finally, we conducted a region merging process based on the Region Adjacency Graph (RAG).

In general, our method has two major steps: a top-down initial segmentation step and a bottom-up region merging step. In the top-down step, a quadtree initial segmentation is performed first, providing the follow-up merging process with basic region elements. In this step, we also create the spatial index and calculate the region features, which provide the RAG creation with neighborhood information and a basis for similarity. In the second, bottom-up step, we use the RAG to express the relationships between regions. In the RAG, regions are represented by nodes. If two nodes are not adjacent (judged by the region neighborhood relationship obtained in the previous step), no edge connects them. Otherwise, they are connected by an edge whose weight represents their similarity (during RAG creation, similarities are calculated from the region features obtained in the previous step). Region merging is performed between the most similar adjacent regions in the RAG, and stops when the smallest similarity value in the RAG is larger than a given threshold. After the merging process, the final segmentation result is obtained.

Quadtree segmentation is a top-down approach. It treats the whole image as the root node at the beginning and then splits the region into four rectangular sub-regions under a certain splitting criterion. The same test is then applied to each sub-region until all regions satisfy the given criterion. Standard deviation is a statistical value that effectively measures the dispersion of a random sequence, so it can characterize the variation of the grey values of an image. Therefore, we use the local standard deviation as the consistency criterion for the splitting judgment. Since quadtree segmentation is an iterative algorithm, it requires heavy computation of the standard deviation for each layer and each sub-region. We therefore propose a fast algorithm for standard deviation computation based on the integral image [

Quadtree segmentation based on the standard deviation splitting criterion needs to calculate the standard deviation of the sub-regions repeatedly. As the tree depth increases, the computation will also increase dramatically. In recent years, the Haar-Like feature has proven to be the most effective technique of face detection, and has been successfully applied to real-time face detection systems [

Through the combined use of the integral image and the squared integral image, we can calculate the standard deviation of any region in constant time with the following equation:
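$$\sigma = \sqrt{\frac{Q_{2,1}}{n} - \left(\frac{S_{2,1}}{n}\right)^{2}}$$
where $n = (x_2 - x_1)(y_2 - y_1)$ is the number of pixels in the region, $S_{2,1} = [I(x_2, y_2) + I(x_1, y_1)] - [I(x_1, y_2) + I(x_2, y_1)]$ is the pixel sum obtained from the integral image $I$, $Q_{2,1} = [I^{(2)}(x_2, y_2) + I^{(2)}(x_1, y_1)] - [I^{(2)}(x_1, y_2) + I^{(2)}(x_2, y_1)]$ is the pixel squared sum obtained from the squared integral image $I^{(2)}$, and $S_{2,1}/n$ is the region mean.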
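The constant-time lookup can be illustrated with a minimal Python sketch. It assumes a single-band image stored as a list of rows and zero-padded integral images; the function names `integral_images`, `box_sum`, and `region_std` are hypothetical and are not taken from the paper's Visual C++ implementation.

```python
import math

def integral_images(img):
    """Build zero-padded integral images of the pixel values and their squares."""
    rows, cols = len(img), len(img[0])
    ii = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    ii2 = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            v = float(img[r][c])
            ii[r + 1][c + 1] = v + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
            ii2[r + 1][c + 1] = v * v + ii2[r][c + 1] + ii2[r + 1][c] - ii2[r][c]
    return ii, ii2

def box_sum(table, r1, c1, r2, c2):
    """Sum over rows r1..r2-1 and columns c1..c2-1 using four table lookups."""
    return (table[r2][c2] + table[r1][c1]) - (table[r1][c2] + table[r2][c1])

def region_std(ii, ii2, r1, c1, r2, c2):
    """Population standard deviation of a rectangle, independent of its size."""
    n = (r2 - r1) * (c2 - c1)
    mean = box_sum(ii, r1, c1, r2, c2) / n
    variance = box_sum(ii2, r1, c1, r2, c2) / n - mean * mean
    return math.sqrt(max(variance, 0.0))  # clamp tiny negative rounding errors

image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
ii, ii2 = integral_images(image)
sub_std = region_std(ii, ii2, 1, 1, 3, 4)  # rows 1-2, columns 1-3
```

The two tables are built once per band; every later standard deviation query costs eight lookups regardless of the region size, which is why the saving grows with the depth of the quadtree.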

Consider for example an image 512 × 512 in size. Quadtree segmentation was performed on the structure using the standard deviation criterion and the result is shown in

The comparison shows a clear improvement in efficiency. When a larger image is processed and the quadtree structure is more complex, this advantage becomes more apparent.

Using the average of the standard deviations of all bands as the splitting criterion, the entire image is treated as the quadtree root, and quadtree segmentation is then applied iteratively to each sub-region until all regions satisfy the given criterion threshold.
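The splitting loop described above can be sketched as follows. This is a simplified illustration under stated assumptions: it handles a single band (the paper averages the standard deviation over all bands), it recomputes the standard deviation directly with `statistics.pstdev` instead of using integral images, and the names `quadtree_split` and `min_size` are hypothetical.

```python
import statistics

def quadtree_split(img, r0, c0, h, w, threshold, min_size=1):
    """Return the leaf blocks (row, col, height, width) of a quadtree split."""
    pixels = [img[r][c] for r in range(r0, r0 + h) for c in range(c0, c0 + w)]
    homogeneous = statistics.pstdev(pixels) <= threshold
    if homogeneous or h <= min_size or w <= min_size:
        return [(r0, c0, h, w)]
    h2, w2 = h // 2, w // 2
    quadrants = (
        (r0, c0, h2, w2),                    # top-left
        (r0, c0 + w2, h2, w - w2),           # top-right
        (r0 + h2, c0, h - h2, w2),           # bottom-left
        (r0 + h2, c0 + w2, h - h2, w - w2),  # bottom-right
    )
    leaves = []
    for qr, qc, qh, qw in quadrants:
        leaves += quadtree_split(img, qr, qc, qh, qw, threshold, min_size)
    return leaves

image = [
    [0, 0, 5, 5],
    [0, 0, 5, 5],
    [0, 9, 5, 5],
    [0, 0, 5, 5],
]
leaves = quadtree_split(image, 0, 0, 4, 4, threshold=1.0)
```

In this toy image only the bottom-left quadrant contains an outlier, so it is the only quadrant that is split further; the other three quadrants become leaves immediately.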

Quadtree segmentation splits an image into four parts of the same size. However, for an image with a large length-width ratio (e.g., an image strip along an imaging orbit), this method produces narrow strip-shaped initial segments. Therefore, before segmentation, we check the image length-width ratio. Based on experience, we use 1.5 as the length-width ratio threshold. When the ratio is larger than 1.5, the image is cut along the direction of the longer edge so that the length-width ratio of each slice is less than 1.5. We then perform quadtree segmentation on each image slice independently, which avoids narrow strip segments. The following study focuses on images with a length-width ratio of less than 1.5.
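One way to realize the slicing rule is sketched below. The paper does not state how many slices are used or where the cuts fall, so the choice of the smallest number of near-equal slices, and the name `slice_image`, are assumptions made for illustration.

```python
import math

def slice_image(height, width, max_ratio=1.5):
    """Cut an image along its longer edge into (row, col, height, width) slices."""
    long_side, short_side = max(height, width), min(height, width)
    # Smallest slice count that keeps each slice within the ratio limit.
    n = max(1, math.ceil(long_side / (max_ratio * short_side)))
    bounds = [i * long_side // n for i in range(n + 1)]
    slices = []
    for start, end in zip(bounds, bounds[1:]):
        if width >= height:
            slices.append((0, start, height, end - start))
        else:
            slices.append((start, 0, end - start, width))
    return slices

strip = slice_image(1000, 4000)  # a strip with length-width ratio 4
single = slice_image(961, 747)   # ratio below 1.5, so no cut is made
```

Each returned slice can then be passed to the quadtree segmentation independently.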

Region merging is performed between neighbouring nodes. Because the segments generated by quadtree segmentation have different sizes, it is difficult to search for neighbours directly in the quadtree, and direct node traversal requires heavy computation. Hence, an indexing mechanism for efficient neighbourhood searching needs to be constructed. We use Morton coding [

Morton coding assumes that the code of the root node (represents the whole image) is 0. For each of the inner nodes encoding as _{4}(3

Neighborhood searching based on Morton coding is completed through the construction of the Virtual Complete Quadtree (VCQ). The VCQ is constructed by adding virtual nodes to the original incomplete quadtree. In the VCQ, all leaves are at the same layer and every parent has four children.

A State Lookup Table (SLT) with

RI: Real inner nodes in VCQ, including real nodes that have virtual offspring.

RL: Real leaf nodes in VCQ (real nodes at the deepest layer).

VN: Virtual nodes in VCQ, including virtual inner nodes.

By using the VCQ and SLT, the neighborhood searching problem on quadtree is transformed into neighborhood searching on a uniform grid.
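To illustrate why a uniform grid makes neighbour lookup simple, the sketch below uses the standard Z-order (bit-interleaving) form of Morton coding on the leaf layer of a complete quadtree. This is an assumption made for illustration: it numbers only the cells of the deepest layer, whereas the coding in this paper starts from code 0 at the root, and the function names are hypothetical.

```python
def morton_encode(row, col, depth):
    """Interleave the bits of (row, col) into a Z-order code."""
    code = 0
    for bit in range(depth):
        code |= ((col >> bit) & 1) << (2 * bit)
        code |= ((row >> bit) & 1) << (2 * bit + 1)
    return code

def morton_decode(code, depth):
    """Recover (row, col) from a Z-order code."""
    row = col = 0
    for bit in range(depth):
        col |= ((code >> (2 * bit)) & 1) << bit
        row |= ((code >> (2 * bit + 1)) & 1) << bit
    return row, col

def grid_neighbors(code, depth):
    """Codes of the 4-connected neighbours inside a 2**depth x 2**depth grid."""
    size = 1 << depth
    row, col = morton_decode(code, depth)
    neighbors = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < size and 0 <= c < size:
            neighbors.append(morton_encode(r, c, depth))
    return neighbors

inner = sorted(grid_neighbors(3, depth=2))   # cell at row 1, column 1
corner = sorted(grid_neighbors(0, depth=2))  # top-left corner cell
```

Because every cell of the complete grid has a code, a neighbour query reduces to decoding, shifting by one cell, and encoding again; no tree traversal is needed at this stage.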

For a given node of code _{D}_{D}

_{D}_{D}_{D}

However, the Morton algorithm has two main shortcomings. First, in each neighborhood judgment, the code of a node must be obtained first and then used as an index to query its state in the SLT. This mechanism requires maintaining both the VCQ and the SLT, and also requires frequent interaction between them, which makes the operation complicated. Second, processing a node in the VN state requires traversing up to its parent nodes. When the quadtree is deep, this traversal is inefficient.

To solve the above problem, we improved the Morton algorithm by adding a pointer field to each node structure.

The pointer refers to a quadtree node and takes one of three values:

Case 1: If the current node state is RI, then the pointer is set to NULL;

Case 2: If the current node state is RL, then the pointer points to itself;

Case 3: If the current node state is VN and at the deepest layer, then the pointer points to its nearest ancestor node with state RI. If the node is not at the deepest layer, then the pointer is set to NULL.

Value assignment of the pointer can be completed during quadtree construction. By using the pointer value from Case 3, we can reach the ancestor node with state RI from the current node in one step, avoiding a multi-step upward traversal and thus greatly improving the search speed. Also, Cases 2 and 3 can be handled in a unified way in neighborhood discrimination. At the same time, the three-valued pointer can completely replace the role of the SLT without increasing storage space, thereby avoiding additional maintenance work.
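The three cases can be written down as a short recursive assignment. The sketch below is only an illustration: the `Node` class, the state strings, and the function `assign_pointers` are hypothetical names, and the pointers are assigned in a separate pass rather than during quadtree construction as in the paper.

```python
class Node:
    def __init__(self, state, children=None):
        self.state = state              # "RI", "RL" or "VN"
        self.children = children or []
        self.pointer = None

def assign_pointers(node, depth, max_depth, nearest_ri=None):
    """Set the pointer field of every node of a virtual complete quadtree."""
    if node.state == "RI":
        node.pointer = None             # Case 1
        nearest_ri = node
    elif node.state == "RL":
        node.pointer = node             # Case 2: points to itself
    else:
        # Case 3: only deepest-layer virtual nodes point to an RI ancestor.
        node.pointer = nearest_ri if depth == max_depth else None
    for child in node.children:
        assign_pointers(child, depth + 1, max_depth, nearest_ri)

# A two-layer example: one quadrant was split into real leaves, the other
# three were not split and therefore only have virtual offspring.
unsplit = [Node("RI", [Node("VN") for _ in range(4)]) for _ in range(3)]
split = Node("RI", [Node("RL") for _ in range(4)])
root = Node("RI", [unsplit[0], split, unsplit[1], unsplit[2]])
assign_pointers(root, depth=0, max_depth=2)
```

After this pass, the real region that covers any deepest-layer cell is available through a single pointer dereference, which is the property the improved scheme relies on.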

In our method, region merging is performed on adjacent and similar regions. Adjacency can be judged using the spatial indexing mechanism discussed in the above section. Similarity is calculated from region features. In this section, we discuss the calculation of region features.

We use both spectral and texture features to construct the feature vector of each region. We use the pixel mean value and the entropy of the region to represent the spectral feature, and use directionality and line-likeness of the Tamura [

In the spectral aspect, the pixel mean value and entropy are used for representing regional grey scale and spectral homogeneity, respectively. The entropy is defined as:
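$$E = -\sum_{i} p_i \log p_i$$
where $p_i$ is the proportion of pixels in the region whose grey level is $i$.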
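As a small worked example, the region entropy can be computed from the grey-level histogram. The sketch assumes the Shannon form with a base-2 logarithm (the base is an assumption), and the function name `region_entropy` is hypothetical.

```python
import math
from collections import Counter

def region_entropy(pixels):
    """Shannon entropy (base 2, by assumption) of the grey levels in a region."""
    counts = Counter(pixels)
    n = len(pixels)
    total = 0.0
    for count in counts.values():
        p = count / n
        total -= p * math.log2(p)
    return total

flat = region_entropy([5, 5, 5, 5])         # one grey level only
two_level = region_entropy([0, 0, 255, 255])
four_level = region_entropy([1, 2, 3, 4])
```

A perfectly homogeneous region has zero entropy, and the value grows as the grey levels spread over more histogram bins, which is why entropy can serve as a measure of spectral homogeneity.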

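In the textural aspect, considering that edge direction and intensity are the most important textural features, we use directionality and line-likeness as features to characterize the region texture. The directionality $F_{dir}$ is computed from the histogram of local edge directions $H_D(\phi)$, where $n_p$ is the number of peaks in the histogram, $\phi_p$ is the position of the $p$-th peak, and $w_p$ is the range of the $p$-th peak between its neighbouring valleys.

The line-likeness $F_{lin}$ is computed from the direction co-occurrence matrix $P_{Dd}$, whose element $(i, j)$ is the relative frequency with which two pixels separated by a distance $d$ along the edge direction have direction codes $i$ and $j$.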

Region merging is a bottom-up process in which large regions are obtained by combining small regions under certain criteria. The regions involved are required to be spatially adjacent and similar in features. The difficulty of region merging lies in the adjacency judgment and the similarity measurement. Currently, most existing methods treat each pixel as an initial region from which merging starts. However, merging at the pixel level requires a large amount of computation and huge storage space. Therefore, we take the segments generated by quadtree segmentation as the initial regions and obtain the final segmentation result by merging them. The advantage of this method is that it avoids pixel-level region merging and thus needs less computation and storage. We describe the relationships between regions that can be merged by constructing a RAG.

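RAG is defined as a weighted undirected graph, in which each node $v_i$ represents a region $R_i$, an edge $e_{ij}$ connects nodes $v_i$ and $v_j$ only if regions $R_i$ and $R_j$ are adjacent, and the edge weight $w_{ij}$ measures the similarity between the feature vectors of $R_i$ and $R_j$.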

In [

The procedure for RAG construction has two main steps:

For a virtual grid of $2^n \times 2^n$ cells, $2^{2n+1} - 2^{n+1}$ virtual edges need to be constructed. There are 24 virtual edges in the example depicted in

After above processing, the initial RAG is created, as shown in

After the initial construction, region merging is performed on the RAG. The topological structure of the RAG changes during the merging process. Merging always occurs between the most similar adjacent regions. After two regions are merged, a new ID is given to the new region and its feature vector is updated. The feature vector of the newly merged region is calculated as the area-weighted average of the feature vectors of the pre-merging regions:
$$F_k = \frac{A_i F_i + A_j F_j}{A_i + A_j}$$
where $F_k$ is the feature vector of the new region $k$, and $A_i$, $A_j$ and $F_i$, $F_j$ are the areas and feature vectors of the pre-merging regions $i$ and $j$.
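The merging loop and the area-weighted feature update can be sketched as follows. This is a minimal illustration under stated assumptions: the edge weight is taken to be the Euclidean distance between feature vectors (the paper's exact similarity measure is not reproduced here), all edges are rescanned at every step for simplicity, and the names `merge_features` and `merge_rag` are hypothetical.

```python
import math

def merge_features(area_i, feat_i, area_j, feat_j):
    """Area-weighted average of two feature vectors; returns (area, features)."""
    total = area_i + area_j
    merged = [(area_i * a + area_j * b) / total for a, b in zip(feat_i, feat_j)]
    return total, merged

def merge_rag(areas, feats, adjacency, threshold):
    """Greedily merge the closest adjacent regions until none is close enough."""
    areas, feats = dict(areas), dict(feats)
    adj = {k: set(v) for k, v in adjacency.items()}
    next_id = max(areas) + 1
    while True:
        # Assumed edge weight: Euclidean distance between feature vectors.
        edges = [(math.dist(feats[i], feats[j]), i, j)
                 for i in adj for j in adj[i] if i < j]
        if not edges:
            break
        weight, i, j = min(edges)
        if weight > threshold:
            break
        k, next_id = next_id, next_id + 1   # the merged region gets a new ID
        areas[k], feats[k] = merge_features(areas[i], feats[i], areas[j], feats[j])
        adj[k] = (adj[i] | adj[j]) - {i, j}
        for n in adj[k]:                    # rewire the neighbours to the new node
            adj[n] -= {i, j}
            adj[n].add(k)
        for old in (i, j):
            del areas[old], feats[old], adj[old]
    return areas, feats, adj

areas = {0: 4, 1: 4, 2: 8}
feats = {0: [10.0], 1: [12.0], 2: [50.0]}
adjacency = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
out_areas, out_feats, out_adj = merge_rag(areas, feats, adjacency, threshold=5.0)
```

In the toy graph, regions 0 and 1 are merged into a new region 3 with feature value 11.0, and merging then stops because the remaining edge weight exceeds the threshold. A practical implementation would keep the edges in a priority queue instead of rescanning them.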

In order to examine the proposed method, we applied it to GeoEye-1 (0.41 m resolution, Experiment A) and IKONOS (1.0 m resolution, Experiment B) images, two common types of high-resolution satellite remote sensing imagery. The image sizes are 961 × 747 and 945 × 734 pixels, respectively. The experiments were implemented in Visual C++ 9.0 and run on the Windows 7 operating system. The hardware configuration is 2 GB of memory and a 2.10 GHz CPU.

Images used in our experiment are all true color fusion images. So, standard deviation averages of the red, green and blue bands, _{s}_{s}

Comparison of the above quadtree initial segmentation is given in the following

Comparison of quadtree initial segmentation with different _{s}

| Experiment | Splitting threshold | | | | | |
|---|---|---|---|---|---|---|
| Exp. A | 3 | 2.3872 | 0.5140 | 2.9012 | 9 | 147586 |
| | 10 | 1.2793 | 0.3252 | 1.6045 | 8 | 98185 |
| | 30 | 0.5634 | 0.1173 | 0.6807 | 8 | 18147 |
| Exp. B | 4 | 2.0366 | 0.6102 | 2.6468 | 9 | 186034 |
| | 12 | 1.4622 | 0.5615 | 2.0237 | 9 | 97336 |
| | 25 | 0.4621 | 0.2614 | 0.7235 | 8 | 33859 |

In the above experiment, we used the proposed fast calculation of the standard deviation criterion for the initial quadtree segmentation. In order to test the efficiency of the proposed calculation, we compared it with the traditional standard deviation calculation method, using the same image and parameter settings as depicted in

Initial segmentation completed with a too small _{s}

Region merging is performed after the quadtree initial segmentation. The merging scale can be controlled by using different merging thresholds _{m}

In Experiment A (_{m}_{m}_{m}_{m}

Two merging results of building roofs in Experiment A were used as typical examples, and their quadtree nodes and merging results are shown in

To validate the method, our method was compared with the MR and MS algorithms for precision and speed analysis. The MR segmentation algorithm was performed with the commercial image analysis software, eCognition, using software recommended settings for urban image segmentation. _{Scale}_{Scale}_{Scale}_{Scale}_{s}_{m}

Number of regions obtained and time consumption of three segmentation algorithms are shown in

To test the precision of the algorithm proposed in this paper, we used a vector map produced by a professional image interpreter as ground truth data, containing building and vegetation as two classes of ground targets. The ground truth map includes 47 building objects and 38 vegetation objects for Experiment A, and 98 building objects and 53 vegetation objects for Experiment B. Each region generated by segmentation was marked as building or vegetation category through human-computer interaction for the three methods. The segmentation results with the category mark were used for precision analysis and comparison.

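We defined segmentation precision from two aspects: segmentation accuracy and object integrity. For a given type of ground target, segmentation accuracy is defined as the ratio of the area of the correct regions (regions generated by the segmentation process) to the total area of the ground targets:
$$SA = \frac{A_{seg}}{A_{gt}} \times 100\%$$
where $A_{gt}$ is the total area of the ground truth targets and $A_{seg}$ is the area of the correct regions.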

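Whether the regions obtained by segmentation can express real ground objects is a key question for the subsequent object-oriented information extraction. For a given type of ground target, we defined object integrity as the ratio of the ground truth object count to the count of correct regions (regions generated by the segmentation process):
$$OI = \frac{N_{gt}}{N_{seg}} \times 100\%$$
where $N_{gt}$ is the number of ground truth objects and $N_{seg}$ is the number of correct regions.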
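Both indicators are simple ratios, as the short sketch below shows. The function names and the sample values are hypothetical and are not taken from the experiments.

```python
def segmentation_accuracy(correct_area, ground_truth_area):
    """Correctly segmented area as a percentage of the ground truth area."""
    return 100.0 * correct_area / ground_truth_area

def object_integrity(ground_truth_count, correct_region_count):
    """Ground truth object count as a percentage of the correct region count."""
    return 100.0 * ground_truth_count / correct_region_count

# Hypothetical example: 10 ground truth objects covered by 20 correct regions
# that together account for 9,000 of 10,000 ground truth pixels.
accuracy = segmentation_accuracy(9000, 10000)
integrity = object_integrity(10, 20)
```

An object integrity of 100% would mean that each ground truth object is represented by exactly one region, so lower values indicate that objects are fragmented into several regions.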

The precision analysis results of two experiments are shown in

Comparison of segmentation precision of MR, MS, and our algorithm.

| Experiment | Method | Segmentation accuracy | Object integrity |
|---|---|---|---|
| Exp. A | MR method | 90.50% | 31.79% |
| | MS method | 90.77% | 29.35% |
| | Our method | 92.45% | 51.63% |
| Exp. B | MR method | 91.36% | 27.92% |
| | MS method | 89.32% | 25.11% |
| | Our method | 95.74% | 48.44% |

It can be seen from the precision comparison that all three algorithms have high segmentation accuracy. However, in terms of object integrity, our algorithm has a more obvious advantage, which indicates stronger object-oriented characteristics.

Both MR and MS segmentation are bottom-up algorithms that start from pixels. Therefore, inadequate region merging is unavoidable, as can be seen from ^{2}). It is difficult to apply them effectively to the segmentation of large remote sensing images. The first step of our algorithm is a top-down procedure that avoids subsequent pixel-level region merging. To some extent, this mechanism inhibits the over-merging phenomenon. It is also more conducive to improving the efficiency of region merging.

This paper presents a high-resolution remote sensing image segmentation algorithm based on an improved quadtree structure and the RAG technique. Our algorithm includes two steps: top-down initial segmentation and bottom-up region merging. In the first step, we proposed a fast method for standard deviation calculation, which improves the efficiency of quadtree segmentation by 4–6 times. This improvement is significant for processing huge remote sensing images. After the creation of the quadtree structure, a spatial index mechanism based on improved Morton coding was added to it, providing fast neighborhood data access. In the second step, we use a region merging algorithm based on RAG to obtain the final segmentation result. Our algorithm was tested on true color fusion GeoEye-1 and IKONOS images. We use both segmentation accuracy and object integrity as indicators for evaluating the segmentation results. The results show that the accuracy of our method is higher than 90% and the object integrity is about 50% (51.63% and 48.44% in the two experiments). Our segmentation method will provide subsequent processes, such as target object classification and other applications, with more accurate data.

This work is partially funded by the National High Technology Research and Development Program of China (863 program, Grant Number 2012AA121303) and the National Natural Science Foundation of China (Grant Number 71273154).

The authors declare no conflict of interest.

Flowchart of proposed method (use rectangular and round rectangular block to denote processing and input/output, respectively).

An example of quadtree segmentation with four layers.

Quadtree segmentation and Morton coding.

Virtual Complete Quadtree (VCQ), uniform grid and the Morton coding of the example shown in

Quadtree structure with pointer field (grey part is the virtual structure).

A two-layered quadtree segmentation and its corresponding Region Adjacency Graph (RAG).

Initial RAG construction using the virtual grid (the virtual structure is represented in grey).

Region merging schematic diagram.

Quadtree initial segmentation results for GeoEye-1 image in Experiment A.

Quadtree initial segmentation results for IKONOS image in Experiment B.

Region merging results with different threshold values.

Quadtree nodes and region merging of two building roofs in Experiment A.

Segmentation results of MR, MS method and our method.

Object integrity calculation illustration.

Complexity comparison between traditional method and our method for quadtree segmentation using standard deviation criterion (using example shown in

| Method | | | |
|---|---|---|---|
| Traditional method | 2015198 | 671778 | 1572864 |
| Our method | 524422 | 102 | 524228 |

A State Lookup Table (SLT) for above quadtree example.

RI | RI | RI | RI | RI | VN | VN | VN | VN | RL | RL | |

| |||||||||||

RL | RL | VN | VN | VN | VN | VN | VN | VN | VN |

Region number and time consumption comparison of MR, MS, and our method.

| Experiment | Method | Number of regions | Time consumption |
|---|---|---|---|
| Exp. A | MR method | 1487 | about 8 |
| | MS method | 1853 | 29.77 |
| | Our method | 657 | 3.7530 |
| Exp. B | MR method | 1659 | about 10 |
| | MS method | 1968 | 22.13 |
| | Our method | 711 | 4.57 |