Fusion Feature Multi-Scale Pooling for Water Body Extraction from Optical Panchromatic Images

Qi, Baogui; Zhuang, Yin; Chen, He; Dong, Shan; Li, Lianlin

doi:10.3390/rs11030245


by Baogui Qi ¹, Yin Zhuang ²,*, He Chen ¹, Shan Dong ³ and Lianlin Li ²

¹ Beijing Key Laboratory of Embedded Real-time Information Processing Technology, Beijing Institute of Technology, Beijing 100081, China

² School of Electronic Engineering and Computer Science, Peking University, Beijing 100087, China

³ Engineering Center of Digital Audio and Video, Communication University of China, Beijing 100024, China

* Author to whom correspondence should be addressed.

Remote Sens. 2019, 11(3), 245; https://doi.org/10.3390/rs11030245

Submission received: 16 December 2018 / Revised: 18 January 2019 / Accepted: 21 January 2019 / Published: 24 January 2019

(This article belongs to the Special Issue Advanced Modelling in Water Resources Using GIS and Remote Sensing Techniques)


Abstract:

Water body extraction is an active research topic in remote sensing applications. Extracting water bodies from panchromatic optical remote sensing images is challenging because these images carry only a single band of gray-level information and are subject to variable imaging conditions and complex scene content. Refined water body extraction from optical panchromatic images often suffers from serious under- or over-segmentation. In this paper, to produce refined water body extraction results from optical panchromatic images, we propose a Markov modeling method based on fusion feature multi-scale pooling. Markov modeling includes two aspects: label field initialization and feature field establishment. Both are created jointly by the fusion feature multi-scale pooling process, which is proposed to enhance the feature difference between water bodies and land cover. Then, the greedy algorithm in the iteration conditional method is used to extract refined water bodies according to the rebuilt Markov initial label and feature fields. Finally, extensive experiments on collected 2.5 m SPOT 5 and 1 m GF-2 optical panchromatic images, evaluated with precision, recall, overall accuracy, kappa coefficient, and boundary detection ratios, demonstrate that the proposed method produces more refined water body extraction results than state-of-the-art methods. The global and local refinement indexes are improved by about 7% and 10%, respectively.

Keywords:

fusion feature; multi-scale pooling; Markov modeling; panchromatic optical images; remote sensing; water body extraction

1. Introduction

Remote sensing techniques are widely used for large-area observation of the Earth. One important application is harbor water resource management and monitoring [1,2,3,4,5]. Refined water body extraction can better support water resource analysis such as disaster and pollution warnings [6]. Therefore, water body extraction from remote sensing images has become an active research topic [7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37]. Researchers have proposed a range of methods for refined water body extraction, which can be roughly divided into two categories: spectral analysis-based and object segmentation-based. Spectral analysis-based methods [7,8,9,10,11,12,13,14,15,16,17,18,19,20] are widely used to extract water bodies. The near-infrared (NIR) and shortwave infrared (SWIR) bands are usually used because water strongly absorbs radiation in these bands. However, the NIR and SWIR bands cannot provide sufficiently accurate water body extraction results for water resource analysis. Therefore, the normalized difference water index (NDWI) and modified normalized difference water index (MNDWI) were applied to water body extraction [7,13]. These indexes enhance water features by exploiting the high reflectance of water in the green band and its low reflectance in the NIR band. NDWI and MNDWI are introduced in Table 1. The automated water extraction index (AWEI), proposed by Feyisa et al. [11], uses a multi-band index to improve the performance of water body extraction. However, this method does not provide accurate results for complex scenes. To improve inland water body extraction accuracy, many high-spectral-resolution sub-pixel methods [16,17,18,19,20] have been proposed. Huan et al. [19] proposed an automatic subpixel water mapping (ASWM) method to extract water bodies from urban areas, and Zhonghua et al. [17] proposed an automatic sub-pixel coastline extraction (ASPCE) method to complete water body boundary modeling and refined inland water area extraction. However, given the interference of vegetation, soil, and artificial buildings on land, endmember selection and water abundance estimation for the unmixing process are challenging, and these methods struggle to establish a detailed water-land boundary.
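As a brief illustration of the band-ratio indexes mentioned above, the sketch below uses the commonly cited definitions NDWI = (Green − NIR) / (Green + NIR) and MNDWI = (Green − SWIR) / (Green + SWIR). The function names, the small `eps` guard, and the reflectance values in the usage note are illustrative assumptions, not values taken from Table 1 or from the experiments.

```python
def normalized_difference(a, b, eps=1e-12):
    """Return (a - b) / (a + b); eps is an assumed guard against a zero denominator."""
    return (a - b) / (a + b + eps)


def ndwi(green, nir):
    """NDWI: positive where green reflectance exceeds NIR reflectance (typical of water)."""
    return normalized_difference(green, nir)


def mndwi(green, swir):
    """MNDWI: same form as NDWI with the SWIR band in place of NIR."""
    return normalized_difference(green, swir)
```

With assumed reflectances, a water-like pixel such as `ndwi(0.10, 0.02)` yields a positive index, while a vegetation-like pixel such as `ndwi(0.08, 0.40)` yields a negative one. Panchromatic images offer no such band pairs, which is why the remainder of the paper relies on spatial features instead.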

Object segmentation-based methods rely on high spatial resolution, at which water body features are more visible, to delineate water bodies. Some researchers have used higher-spatial-resolution pan-sharpened remote sensing images [21,22,23,24,25,26,27] to study water body extraction. Tiagrajah et al. [21] used pixel-level water body features to train a self-organized machine (SOM) and extract water bodies. We previously [22] built a defined circle model based on the water area color space description to extract harbor area water bodies. Cheng et al. [23] proposed the structure edge network (SeNet), which adds edge and neighbor relation constraints to the loss function to prevent under-segmentation when extracting water bodies from pan-sharpened water-land images. To the best of our knowledge, spectral and spatial resolutions vary inversely [28,29]. Optical panchromatic remote sensing images provide high spatial resolution but no spectral information. Owing to their high spatial resolution and clearly visible features, optical panchromatic remote sensing images have the potential to produce refined water body extraction results. Therefore, we follow object segmentation methods to study refined water body extraction from panchromatic remote sensing images. In object segmentation-based methods [30,31,32,33,34,35,36,37,38,39], thresholding based on gray level histograms is widely used for water body extraction.

Figure 1 shows two kinds of panchromatic water-land scenes and their gray level histograms. (In this paper, Figure 1a is called the easy sample and Figure 1b the hard sample.) In Figure 1, the water area mostly falls in the darker part of the gray level histograms. When the gray level histogram of a scene resembles that of Figure 1a, a suitable threshold can be found to extract the water body, but thresholding still cannot produce a refined extraction because of interference from land area objects. As gray level histogram features are insufficient to produce refined water body extraction from panchromatic images, many feature descriptions (e.g., topology, texture, wavelet, and morphological features) have been considered for object segmentation-based methods [30,31,32,33,34,35,36,37,38,39]. Xia et al. [37] used the local binary pattern (LBP) as a texture descriptor to extract water bodies from water-land scenes. Liu et al. [30] proposed a method for water-land segmentation of panchromatic images based on the entropy feature combined with improved MNcut and Chan-Vese methods. Zewen H. et al. [31] applied graph theory to water area extraction from optical remote sensing images. Ma L. et al. [32] proposed a hierarchical water body extraction method based on texture and gray level information. These object segmentation-based methods all focus on water body feature delineation. However, their performance shows that the limitations of their water body feature descriptions lead to poorly refined extraction results. State-of-the-art object segmentation-based methods commonly apply morphology operations to prevent under- or over-segmentation. Although morphological dilation and erosion can effectively eliminate under- or over-segmentation, they simultaneously destroy edge structure information. Therefore, producing refined water body extraction results from panchromatic images remains a challenging task.

In this paper, aiming at accurate water body extraction from panchromatic images, we propose a method involving Markov modeling of fusion feature multi-scale pooling. The workflow is shown in Figure 2. First, we consider the discriminative feature description of the water body and the probable land cover. The fusion feature map is composed of gray level, gradient, and entropy features, which have been proven effective in previous studies [30,32,37]. Then, the proposed fusion feature multi-scale pooling operation further enhances the difference between water and land areas to build the multi-scale description. Second, based on this multi-scale delineation of water bodies, the initial label field and feature field are rebuilt for Markov modeling. Third, the iterative conditional model (ICM) is employed to generate the final refined water body extraction result. Finally, the collected 2.5 m SPOT 5 and 1 m GF-2 data are used to demonstrate the proposed method. The contributions of this paper can be summarized as follows: (1) a novel and powerful feature description method is proposed for the multi-scale delineation of water bodies; (2) based on the fusion feature multi-scale pooling operation, the initial label field and feature field can be jointly set up for Markov modeling, and the proposed label field initialization and feature field calculation achieve rapid model convergence; (3) for refined water body extraction, an optimal iteration process without any morphology operations is proposed to gradually achieve refined water body extraction.

2. Study Areas and Data Sources

Seven Chinese ports were chosen as study areas, as shown in Figure 3. Shanghai is a famous port in China, and it is one of the largest freight ports and economic centers. Qingdao port is located in Jiaozhou Bay on the Shandong Peninsula, and is the second largest port of external trade, with an annual capacity of 100 million tons. Qingdao port is an important international trade port and a maritime transport hub on the western coast of the Pacific Ocean. Tianjin port is a comprehensive and important external trade port, and is the highest artificial deep-water harbor in the world, built by dredging and land reclamation on muddy shoals. Shenzhen port is a world-famous global container port, with wharf berths for the largest cruise ships. Dalian port is the most advanced transshipment base for bulk liquid chemical products in Asia. Xiamen port is a natural harbor on the southeastern coast of China. In 2017, its annual throughput was 10.38 million TEUs, ranking 14th in the world. Haikou is a new port area that is expected to become an important logistics center. All panchromatic water-land scene images were collected from these ports, and the images were selected to include varied water areas (i.e., calm water, weak waves, strong waves, and polluted water) and land areas (i.e., vegetation, buildings, shadows, and thin cloud cover) to demonstrate the effectiveness of the proposed method.

The information on the data collected from SPOT 5 and GF-2 is outlined in Table 2. SPOT 5 belongs to the SPOT series of satellites from France and was launched in 2002. Its high-resolution stereo imaging device obtains 120 × 120 km panchromatic images with a spatial resolution of either 2.5 m or 5 m. GF-2, which was independently developed in China and launched in 2014, produces satellite images with a spatial resolution of 0.8 m or 1 m.

3. Methodology

When extracting water bodies from panchromatic remote sensing images, the discriminative feature description of a water body is an important step. For water body feature delineation in panchromatic images, the gray level feature alone is not enough to produce a refined extraction. Most studies have used texture and gradient as the discriminating features for refined water body extraction [32,33,37], because they assumed that water areas have a smooth texture. However, water areas often include strong waves and cloud cover, which create a complex texture. In these situations, the texture descriptor produces a poorly refined extraction result [37]. In this paper, inspired by the deep learning feature extraction framework, we used the pooling operation principle to aggregate the fusion feature and further enhance water body feature delineation [25,27]. Next, the fusion feature multi-scale pooling description of the water body was applied to Markov modeling, which gradually generates refined water body extraction results via an optimized iteration process.

3.1. Fusion Feature Map Generation

First, we built a fusion feature map that comprehensively uses gray level, gradient, and entropy information to achieve water area feature description. The fusion image can be expressed as:

I_{FusionImg} = I_{gray} + I_{Gradient} + I_{Entropy},

(1)

where I_FusionImg is the expected fusion image, I_gray represents the gray level feature map, I_Gradient is the gradient feature map produced by the Sobel gradient operator, and I_Entropy is the entropy feature map calculated from the gray level image. Fusion feature map generation enhances the water area delineation because the fusion feature contains three dimensions (gray level, gradient, and entropy) whose usefulness has been shown in previous works [30,32,37]. According to the performances reported in those works, the generated fusion feature map can be used to extract water bodies with threshold segmentation methods, but it cannot produce a refined water body extraction result. Therefore, we introduce an enhanced method, fusion feature multi-scale pooling, to achieve a more powerful multi-scale delineation of water bodies.
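A minimal sketch of Equation (1) is given below, under several assumptions that the text does not specify: the input is an 8-bit gray image, each of the three maps is rescaled to [0, 1] before summation so that no single feature dominates, and the entropy is computed in a 5 × 5 window over 16 quantized gray levels. The helper names `rescale`, `sobel_magnitude`, `local_entropy`, and `fusion_feature_map` are hypothetical.

```python
import numpy as np


def rescale(a):
    """Map an array linearly onto [0, 1]; a constant array becomes all zeros."""
    a = a.astype(float)
    span = a.max() - a.min()
    return (a - a.min()) / span if span > 0 else np.zeros_like(a)


def sobel_magnitude(gray):
    """Sobel gradient magnitude with edge-replicated borders."""
    p = np.pad(gray.astype(float), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) \
        - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return np.hypot(gx, gy)


def local_entropy(gray, win=5, bins=16):
    """Shannon entropy (in bits) of quantized gray levels in a win x win window.

    Assumes 8-bit input (values 0..255) and an odd window size.
    """
    q = np.clip((gray.astype(float) / 256.0 * bins).astype(int), 0, bins - 1)
    r = win // 2
    p = np.pad(q, r, mode="edge")
    out = np.zeros(gray.shape)
    for y in range(gray.shape[0]):
        for x in range(gray.shape[1]):
            counts = np.bincount(p[y:y + win, x:x + win].ravel(), minlength=bins)
            prob = counts[counts > 0] / counts.sum()
            out[y, x] = -(prob * np.log2(prob)).sum()
    return out


def fusion_feature_map(gray):
    """Equation (1): gray + gradient + entropy, each rescaled (an assumption)."""
    return (rescale(gray)
            + rescale(sobel_magnitude(gray))
            + rescale(local_entropy(gray)))
```

On a synthetic scene with a flat dark region next to a textured bright region, the flat region receives a fused response of zero while the textured region receives a clearly positive response, which is the contrast that the later pooling step is meant to amplify.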

3.2. Fusion Feature Multi-Scale Pooling

The generated fusion feature maps contain gray level, gradient, and entropy information. Each of these fused features can individually help discriminate a water body, and the pooling operation, which is typically used in convolutional neural networks (CNNs) [23,25], has a powerful feature aggregation ability. Therefore, we propose a fusion feature multi-scale pooling operation to further enhance the fusion feature description of water bodies. The fusion feature multi-scale pooling process is shown in Figure 4.

In Figure 4, pooling areas of different sizes are applied to the fusion feature to generate a multi-scale delineation. Small-scale pooling retains more detailed edge information than large-scale pooling. In the pooling operation, the land and water area features are aggregated as:

Img(L) = pooling(α_{L}, I_{FusionImg}),

(2)

where Img(·) is the multi-scale delineation, L is the index of the scale, pooling(·) is the average pooling operation, and α_L is the scale factor that controls the pooling size. Setting a series of scale factors completes the multi-scale delineation of the fusion feature I_FusionImg. The multi-scale pooling feature maps and their corresponding binary images are shown in Figure 5, which depicts the different results obtained with different feature pooling sizes. The quality of the water body delineation at each pooling scale can be read directly from the corresponding binary result.
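Equation (2) can be sketched as follows. The text does not state the stride or the border handling, so this sketch assumes stride-1 average pooling with edge-replicated borders, which keeps every pooled map at the input size so that each pixel later carries one response per scale. The names `average_pool_same` and `multi_scale_pooling` and the default scale values are hypothetical.

```python
import numpy as np


def average_pool_same(img, size):
    """Stride-1 average pooling over a size x size window; output keeps the input shape."""
    before = size // 2
    after = size - 1 - before
    p = np.pad(img.astype(float), ((before, after), (before, after)), mode="edge")
    # Summed-area table with a leading row and column of zeros.
    sat = np.zeros((p.shape[0] + 1, p.shape[1] + 1))
    sat[1:, 1:] = p.cumsum(axis=0).cumsum(axis=1)
    h, w = img.shape
    total = (sat[size:size + h, size:size + w] - sat[:h, size:size + w]
             - sat[size:size + h, :w] + sat[:h, :w])
    return total / (size * size)


def multi_scale_pooling(fused, scales=(3, 7, 15)):
    """Stack one pooled map per scale factor alpha_L: shape (len(scales), H, W)."""
    return np.stack([average_pool_same(fused, s) for s in scales])
```

The summed-area table makes the cost per scale independent of the window size, which matters when the scale factors are as large as the values discussed in Section 3.5.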

Figure 5 provides an example of different pooling scales in the proposed fusion feature multi-scale pooling delineation method. The first row of Figure 5 presents the series of multi-scale pooling descriptions from small to large scales. For the large pooling scale descriptions in the first row (Figure 5c,d), the land area has a stronger aggregated response than at the small scales (Figure 5a,b), which means there is an obvious feature difference between water and land areas. The second row of Figure 5 shows the binary masks corresponding to the different pooling scale feature maps. The red circled area shows that the edge structure location is maintained in the small pooling scale descriptions (Figure 5a,b), and that area integrity is maintained in the large pooling scale descriptions (Figure 5c,d). In general, the proposed water body fusion feature multi-scale pooling delineation contains detailed local edge structure as well as global area integrity information.

We also analyzed the aggregated feature responses of land cover, water area, and boundary pixels across the series of multi-scale pooling descriptions. In Figure 6, the first row shows the original panchromatic optical remote sensing images, and the second row shows the corresponding gray level histograms. No single threshold on the gray level histogram adapts to these complex scenes well enough to achieve refined water body extraction. The third row analyzes the pixel responses of fusion feature multi-scale pooling at different pooling scales. The horizontal axis denotes the description scale size, and the vertical axis represents the pixel response value at each scale.

The red line shows the aggregated feature responses of land area pixels, the yellow line those of boundary pixels, and the blue line those of water body pixels. The separation between these lines across the multi-scale pooling operation helps to distinguish water body area features. A comprehensive analysis of Figure 5 and Figure 6 shows that the constructed fusion feature multi-scale pooling description of water bodies has powerful delineation ability. However, a refined water body extraction result cannot be produced from any single pooling scale. Therefore, we use the Markov framework to merge the advantages of the different description scales and extract refined water bodies using fusion feature multi-scale pooling delineation followed by the iterative conditional model (ICM) optimization process. The refined water body extraction method proposed in this paper does not include dilation and erosion operations because these morphology operations would severely destroy detailed edge structure.

3.3. Markov Modeling for Refined Water Body Extraction

Based on the fusion feature multi-scale pooling description, we propose a Markov modeling method combined with an ICM optimal process to generate the final refined water body extraction result. Figure 7 shows the proposed Markov modeling framework for extracting refined water bodies.

In Figure 7, label field initialization and feature field description are the main steps of Markov modeling. The feature field S = {F(N_1), ..., F(N_L)} can be set up using the proposed fusion feature multi-scale pooling description. Each pixel can be seen as a feature vector, and each feature vector has a series of response values at the different pooling scales. In addition to the feature field setting, label field initialization is an important step that affects the final refined water body extraction performance, because a poor initial label field requires considerable time to reach Markov model convergence or guides the model to a local optimum. Therefore, a suitable initial label field is necessary. Classical label field initialization employs classification-based methods such as K-means and other clustering methods [40,41], but these often lead to a local optimum. Therefore, we used the binary masks of the series of multi-scale pooling feature maps to set the initial label field according to the observation function. Before we describe the label field initialization, the Markov modeling method should be introduced. In the Markov model, the neighborhood system is an important concept. For water body extraction, the neighborhood system can be expressed as follows:

S = C_{1} \cup C_{2} .

(3)

where C₁ and C₂ represent the two neighborhood systems of the land cover and the water area of the observation image, respectively. In the general neighborhood system concept, any image can be composed of several neighborhood systems, S = {N_1, ..., N_L}, where N_L represents one neighborhood system. During the optimal iteration process, whether the pixel groups in each neighborhood system are updated into other neighborhood systems depends on the neighborhood relation model setting and their feature field calculation. For Markov modeling to describe the neighborhood system, the Markov properties are employed, which can be expressed as:

S = {X_{1}, X_{2}, X_{3}, \dots, X_{n}} .

(4)

P (X = x) > 0, \forall X \in Ω .

(5)

P(x_{i} | X_{N_{i}}) = P(x_{j} | X_{N_{j}}), \quad \text{if } \forall j \neq i, \; N(j) \in N_{j}.

(6)

where S is the observation image, which can also be represented as a combination of the cliques X_n. Here, clique groups make up one neighborhood system. Equations (5) and (6) define the positivity and homogeneity properties of the Markov principle, where x_i and x_j represent two different pixels and the conditional probability models whether two pixels belong to one neighborhood system. Then, when all clique groups of an image follow the Gibbs distribution, the Markov probability model can be calculated. Based on the Hammersley-Clifford theorem [42,43], the Gibbs distribution is equivalent to the Markov random field, and the clique group in a neighborhood system can be expressed as:

P(X = x) = \frac{1}{Z} \exp(-U(x)).

(7)

which is the Gibbs distribution expression for a clique group X in a neighborhood system, where x represents a pixel in any clique group, Z is a normalized constant, and U(x) is an energy-splitting function of the Gibbs distribution, which can be expressed as:

U (x) = \sum_{c \in C} V_{c} (x),

(8)

where C is the set of all clique groups, so the Gibbs energy is the sum of the energies V_c(x) of all clique groups in the neighborhood system. When the adjacent pixels x_i and x_j belong to the same neighborhood system, there is no penalty, V(x_i, x_j) = 0. Otherwise, a penalty V(x_i, x_j) = β is incurred. Therefore, the clique-potential function can be expressed as:

V(x_{i}, x_{j}) = \begin{cases} 0, & x_{i} = x_{j} \\ β, & x_{i} \neq x_{j} \end{cases}

(9)

Equation (9) is the Potts cost model [42], and β is the potential function parameter, which controls the impact of the label field in the Markov model. Combining Equations (7)–(9), the Gibbs distribution energy function can be expressed as:

P (x_{i} | x_{N_{i}}) = \frac{\exp (- β n_{i} (x_{i}))}{\sum_{x_{i} \in L} \exp [- β n_{i} (x_{i})]} .

(10)

Equation (10) determines the cost probability model P(x_i | x_{N_i}) for judging whether or not a clique's pixel x_i belongs to the current neighborhood system n_i(·). Equation (10) enables the computation of Markov modeling. Then, we chose first- and second-order clique models to describe cliques in the neighborhood system, which can be expressed as:

U_{i} (x_{i}, x_{N_{i}}) = V_{1} (x_{i}) + \sum_{j \in N_{i}} V_{2} (x_{i}, x_{j}),

(11)

where V_1(x_i) is the first-order system, which only considers whether the pixel itself belongs to the current neighborhood system, and V_2(x_i, x_j) is the second-order system, which considers not only the pixel itself but also whether its first-order neighborhood pixels belong to the current neighborhood system. Higher-order clique models were considered, but they would require excessive computation resources. Based on Equations (10) and (11), the Markov probability model can be expressed as:

P (x_{i} | x_{N_{i}}) = \frac{\exp [- V_{1} (x_{i}) - \sum_{j \in N_{i}} V_{2} (x_{i}, x_{j})]}{\sum_{x_{i}} \exp [- V_{1} (x_{i}) - \sum_{j \in N_{i}} V_{2} (x_{i}, x_{j})]} .

(12)
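The Potts-based conditional probability of Equations (9), (10), and (12) can be illustrated numerically. The sketch below interprets n_i(x_i) as the number of neighbors whose label differs from the candidate label, which is an assumption consistent with the Potts penalty in Equation (9); the optional `unary` argument stands in for V_1 and defaults to zero. The function name `potts_conditional` is hypothetical.

```python
import math


def potts_conditional(neighbor_labels, beta, labels=(0, 1), unary=None):
    """Conditional label probabilities for one pixel given its neighbors' labels.

    Each neighbor with a different label adds a penalty beta (Potts model);
    `unary` optionally maps a label to a first-order cost V_1.
    """
    unary = unary or {}
    energy = {
        lab: unary.get(lab, 0.0)
        + beta * sum(1 for n in neighbor_labels if n != lab)
        for lab in labels
    }
    z = sum(math.exp(-e) for e in energy.values())  # normalizing constant
    return {lab: math.exp(-energy[lab]) / z for lab in labels}
```

For a pixel with three water neighbors and one land neighbor, `potts_conditional([0, 0, 0, 1], beta=0.5)` assigns the water label a probability of 1 / (1 + e⁻¹), about 0.73, which shows how β sets the strength of the smoothing prior.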

Refined water body extraction has to observe two neighborhood systems (i.e., water area or land cover) of the initial label field from panchromatic optical remote sensing images. Here, we applied the Bayesian framework as the parameter optimization method to minimize the cost function. The Bayesian principle can be expressed as:

P (X = x | Y = y) = \frac{\prod_{i = 1}^{L} [P (y^{i} | X = x) P (X = x)]}{P (Y = y)},

(13)

where P(Y = y) represents the probability model of the observed panchromatic optical remote sensing image, P(y^i | X = x) is the probability model that the current pixel x belongs to clique X and to one neighborhood system y^i, and L is the number of neighborhood systems in the observation image. For refined water body extraction applications, water area pixels are labeled “0” and land area pixels are labeled “1”. Let the current prior probability meet the conditional independence and Gaussian distribution hypotheses. The refined water body extraction can be expressed as:

\hat{x}_{L} = \arg\max_{x_{L}} \{ P(y_{L}^{i} | X_{L} = i) \, P(X_{L} = i | X_{N_{L}}) \}.

(14)

P(y_{L}^{i} | X_{L} = i) = \frac{1}{\sqrt{2πσ_{i}^{2}}} \exp\left[-\frac{(y_{L}^{i} - μ_{i})^{2}}{2σ_{i}^{2}}\right].

(15)

where \hat{x}_L represents the optimal label field, which is the refined water body extraction result with the maximum posterior probability. Equation (15) is then used in the parameter estimation process. Here, a Gaussian distribution is assumed for each label field neighborhood system in each pooling scale feature map. Therefore, in this paper, the observation function of the refined water body extraction result can be expressed as:

\begin{aligned}
\hat{x}_{L} &= \arg\max_{x_{L}} \left\{ \frac{1}{\sqrt{2πσ_{i}^{2}}} \cdot \frac{\exp\left[-\left(\frac{(y_{L}^{i} - μ_{i})^{2}}{2σ_{i}^{2}} + V_{1}(x_{i}) + V_{2}(x_{i}, x_{j})\right)\right]}{\sum_{x_{i}} \exp\left[-\left(V_{1}(x_{i}) + V_{2}(x_{i}, x_{j})\right)\right]} \right\} \\
&= \arg\max_{x_{L}} \left\{ \exp\left[-\left(\frac{(y_{L}^{i} - μ_{i})^{2}}{2σ_{i}^{2}} + V_{1}(x_{i}) + V_{2}(x_{i}, x_{j})\right)\right] \right\} \\
&= \arg\max_{x_{L}} \left\{ \exp\left[-\left(\frac{(y_{L}^{i} - μ_{i})^{2}}{σ_{i}^{2}} + β N_{x_{i}}(x_{i}, x_{j})\right)\right] \right\} \\
&= \arg\min_{x_{L}} \left( \frac{(y_{L}^{i} - μ_{i})^{2}}{σ_{i}^{2}} + β N_{x_{i}}(x_{i}, x_{j}) \right).
\end{aligned}

(16)

Equation (16) is the observation function that is minimized to find the optimal label field according to the feature field calculation. β N_{x_i}(x_i, x_j) represents the cost energy of the label field, which is related to Equation (9), and (y_L^i − μ_i)² / σ_i² is computed from the feature field mean value μ_i and variance σ_i² in one neighborhood system, based on the Gaussian distribution and conditional independence assumptions. Given the fusion feature multi-scale pooling description, each neighborhood system across the different pooling scales can be expressed as y_L^i = [y_L^i(1), y_L^i(2), ..., y_L^i(S)]. Therefore, for the initial label field, the mean and variance values of each neighborhood system can be calculated as μ_i = [μ_i(1), μ_i(2), ..., μ_i(S)] and σ_i² = [σ_i²(1), σ_i²(2), ..., σ_i²(S)]. Let d_L = y_L^i − μ_i, with d_L = [d_L(1), d_L(2), ..., d_L(S)]. Then, the minimized Equation (16) can be expressed as:

Energy = \sum_{S} d_{L} Σ_{i}^{-1} d_{L}^{T} + \log |\det(Σ_{i})| + β N_{x_{i}}(x_{i}, x_{j}).

(17)

where Σ_i denotes the conditional covariance matrix, and the second term prevents the first term from becoming trivially small. In this paper, ICM was used to find the minimum of Equation (17) through an iterative optimization process, but this process is time consuming. Label field initialization therefore becomes the important problem mentioned previously: when a suitable initial label field is chosen, the model converges rapidly. Following this idea, once the fusion image multi-scale pooling process is complete, each pooling scale is converted into a binary mask by applying the OTSU algorithm [44] to the generated fusion feature multi-scale pooling description. Then, using the observation function in Equation (17), we choose the mask with the minimum energy value from the generated series of binary masks. Using this minimum-energy binary mask as the initial label field achieves rapid Markov model convergence. Experiments were also carried out, as discussed in Section 3.5. This label field initialization method requires fewer iterations to reach model convergence and can avoid the local optimum. Figure 8 shows the initial label field setting process proposed in this paper.

In Figure 8, the small red arrow represents the feature field calculation, and the initial label field is set to the binary mask that has the lowest energy of the observation function. Once the label field initialization is finished, the ICM process is used to produce the final water body extraction result, as shown in Figure 7.
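The initialization procedure described above can be sketched as follows, under stated assumptions: the multi-scale maps are stacked as an array of shape (S, H, W), the label cost counts disagreeing 4-neighbors, the energy of Equation (17) is summed over all pixels, and a small ridge term is added to the covariance for numerical stability. The Otsu routine is a plain histogram implementation, and all function names are hypothetical.

```python
import numpy as np


def otsu_threshold(values, bins=256):
    """Threshold that maximizes the between-class variance of a histogram."""
    hist, edges = np.histogram(values, bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2
    w0 = np.cumsum(hist)                      # pixels up to and including each bin
    w1 = w0[-1] - w0                          # pixels above each bin
    s0 = np.cumsum(hist * centers)
    mu0 = s0 / np.maximum(w0, 1)
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)
    between = w0 * w1 * (mu0 - mu1) ** 2
    return edges[np.argmax(between) + 1]      # upper edge of the best split bin


def disagreement_count(labels):
    """Per pixel, the number of 4-neighbors carrying a different label."""
    n = np.zeros(labels.shape)
    n[1:, :] += labels[1:, :] != labels[:-1, :]
    n[:-1, :] += labels[:-1, :] != labels[1:, :]
    n[:, 1:] += labels[:, 1:] != labels[:, :-1]
    n[:, :-1] += labels[:, :-1] != labels[:, 1:]
    return n


def field_energy(features, labels, beta, ridge=1e-6):
    """Energy in the spirit of Equation (17), summed over all pixels (an assumption)."""
    scales = features.shape[0]
    y = features.reshape(scales, -1).T        # (pixels, S) feature vectors
    flat = labels.ravel()
    data = np.zeros(flat.size)
    for c in (0, 1):                          # 0 = water, 1 = land
        sel = flat == c
        if sel.sum() < 2:
            return np.inf                     # degenerate mask: one class is empty
        d = y[sel] - y[sel].mean(axis=0)
        cov = np.atleast_2d(np.cov(y[sel], rowvar=False)) + ridge * np.eye(scales)
        maha = np.einsum("ij,jk,ik->i", d, np.linalg.inv(cov), d)
        data[sel] = maha + np.linalg.slogdet(cov)[1]
    return data.sum() + beta * disagreement_count(labels).sum()


def initial_label_field(features, beta=1.0):
    """Binarize every scale with Otsu and keep the mask with the lowest energy."""
    masks = [(f >= otsu_threshold(f)).astype(int) for f in features]
    energies = [field_energy(features, m, beta) for m in masks]
    return masks[int(np.argmin(energies))]
```

Because every candidate mask is scored against the full multi-scale feature field, the selected mask is the one whose two classes are most compact across all scales, not just at the scale that produced it.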

Another problem is the setting of the potential parameter β. As the label field approaches an optimum solution, the impact of the label field has to be reduced. In general, the potential parameter is set to a small number between 0 and 1. However, we treat β as a variable in the ICM process: β decreases as the number of iterations increases. As the iterative optimization proceeds, the feature field calculation therefore increasingly drives the refined water body extraction result. Thus, the potential parameter β can be expressed as:

β = \exp (\frac{1}{T (t)}),

(18)

where T(t) is a variable value that increases with iterations. Substituting Equation (18) into Equation (17), the proposed final observation function of Markov modeling can be represented as:

Energy = \sum_{S} d_{L} Σ_{i}^{-1} d_{L}^{T} + \log |\det(Σ_{i})| + \exp\left(\frac{1}{T(t)}\right) N_{x_{i}}(x_{i}, x_{j}).

(19)
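A compact sketch of the ICM optimization with the decaying potential parameter of Equation (18) is shown below. Several details are assumptions, since the text does not fix them: T(t) = t + 1, class statistics re-estimated once per iteration, a 4-neighborhood, and checkerboard half-sweeps so that no two simultaneously updated pixels are neighbors. The function names are hypothetical.

```python
import numpy as np


def class_data_cost(features, labels, c, ridge=1e-6):
    """Per-pixel Gaussian cost d Sigma^-1 d^T + log|det Sigma| for class c."""
    scales = features.shape[0]
    y = features.reshape(scales, -1).T
    sel = labels.ravel() == c
    mu = y[sel].mean(axis=0)
    cov = np.atleast_2d(np.cov(y[sel], rowvar=False)) + ridge * np.eye(scales)
    d = y - mu
    maha = np.einsum("ij,jk,ik->i", d, np.linalg.inv(cov), d)
    return (maha + np.linalg.slogdet(cov)[1]).reshape(labels.shape)


def neighbor_disagreement(labels, c):
    """Number of 4-neighbors whose label differs from the candidate label c."""
    diff = (labels != c).astype(float)
    p = np.pad(diff, 1, mode="constant")      # outside the image: no penalty
    return p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]


def icm(features, labels, iterations=10, temperature=lambda t: t + 1.0):
    """Greedy label refinement; beta = exp(1 / T(t)) shrinks as t grows."""
    labels = labels.copy()
    rows, cols = np.indices(labels.shape)
    for t in range(iterations):
        if min((labels == 0).sum(), (labels == 1).sum()) < 2:
            break                             # cannot estimate class statistics
        beta = np.exp(1.0 / temperature(t))   # Equation (18)
        data = [class_data_cost(features, labels, c) for c in (0, 1)]
        changed = 0
        for parity in (0, 1):                 # checkerboard half-sweeps
            cost = [data[c] + beta * neighbor_disagreement(labels, c)
                    for c in (0, 1)]
            best = (cost[1] < cost[0]).astype(labels.dtype)
            site = (rows + cols) % 2 == parity
            changed += int((best[site] != labels[site]).sum())
            labels[site] = best[site]
        if changed == 0:
            break                             # converged: no label changed
    return labels
```

Starting from a label field in which roughly 10% of the pixels are wrong, a few sweeps of this procedure are usually enough to restore the clean two-region partition on well-separated synthetic features; the behavior on real panchromatic scenes depends on the feature field and is what the experiments evaluate.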

In general, our proposed fusion feature multi-scale pooling description and Markov modeling processes produce a refined water body extraction result. Then, extensive experiments were used to demonstrate that our proposed method can produce refined water body extraction results for any complex water-land harbor scenes. We chose some state-of-the-art methods and employed evaluation indexes to prove that our proposed method can produce better results than other methods.

3.4. Evaluation Indexes

To evaluate the refined water body extraction performance, we used Precision, Recall, and Overall Accuracy global indexes, and the local refinement indexes of Kappa coefficient and boundary detection ratios to prove our proposed method. First, these global evaluation indexes can be expressed with Equations (20)–(22), and they show the water body extraction completeness.

Precision = \frac{TP}{TP + FP}.

(20)

Recall = \frac{TP}{TP + FN}.

(21)

OverallAccuracy = \frac{TP + TN}{TP + TN + FP + FN}.

(22)

where TP is the number of pixels that are correctly predicted as water area, TN is the correctly predicted land area pixels, FP is the number of pixels where land area pixels are incorrectly predicted as water area pixels, and FN is the number of pixels where water pixels are incorrectly predicted as land area pixels. Therefore, ignoring water body boundary extraction performance, these indexes can provide an overall performance evaluation. Next, the local refinement evaluation indexes of boundary keeping performance were calculated using Equations (23)–(25), which highlight the accuracy of the refined water body extraction.

κ = \frac{N \sum_{l = 1}^{M} N_{l l} - \sum_{l = 1}^{M} (N_{l +} \cdot N_{+ l})}{N^{2} - \sum_{l = 1}^{M} (N_{l +} \cdot N_{+ l})} .

(23)

R_{b} = \frac{T_{a}}{T_{seg}} .

(24)

R_{c} = \frac{T_{a}}{T_{real}} .

(25)

where κ is the Kappa coefficient, N is the total number of pixels in a panchromatic optical remote sensing image, N_ll is the number of correctly predicted pixels of the water and land areas, N_l+ represents the total number of water body area pixels in the testing dataset, and N_+l is the total number of land area pixels in the testing dataset. M represents the number of classes; for refined water body extraction, M = 2. In Equations (24) and (25), which define the boundary detection ratios, T_a is the number of true water-land boundary pixels found in the water body extraction result, T_seg is the number of boundary pixels in the water body extraction result, and T_real is the number of true water-land boundary pixels in the panchromatic optical remote sensing image. Using the Kappa coefficient and the boundary detection ratios, we evaluated the local boundary structure produced by the proposed and other methods.
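The indexes of Equations (20)–(25) can be computed from a predicted mask and a reference mask as in the following sketch. Several choices here are assumptions made for illustration: masks use 1 for water and 0 for land (the binary masks shown later in this paper use the opposite display convention), the Kappa coefficient is computed from the usual confusion-matrix marginals for M = 2, and a boundary pixel is defined as any pixel with a 4-neighbor of the other class. The function names are hypothetical.

```python
import numpy as np

def confusion_counts(pred, truth):
    """Return TP, TN, FP, FN for binary masks where 1 marks water (assumed convention)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))
    tn = int(np.sum(~pred & ~truth))
    fp = int(np.sum(pred & ~truth))
    fn = int(np.sum(~pred & truth))
    return tp, tn, fp, fn

def global_indexes(pred, truth):
    """Precision, Recall, and Overall Accuracy, Equations (20)-(22)."""
    tp, tn, fp, fn = confusion_counts(pred, truth)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    overall = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, overall

def kappa(pred, truth):
    """Kappa coefficient, Equation (23), for two classes present in the data."""
    tp, tn, fp, fn = confusion_counts(pred, truth)
    n = tp + tn + fp + fn
    agree = tp + tn                                  # sum of the diagonal entries
    # Per class: reference total times predicted total, summed over both classes.
    chance = (tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)
    return (n * agree - chance) / (n * n - chance)

def boundary_mask(mask):
    """Mark pixels that have a 4-neighbor of the other class (assumed definition)."""
    m = np.asarray(mask, dtype=bool)
    b = np.zeros_like(m)
    b[:-1, :] |= m[:-1, :] != m[1:, :]
    b[1:, :] |= m[1:, :] != m[:-1, :]
    b[:, :-1] |= m[:, :-1] != m[:, 1:]
    b[:, 1:] |= m[:, 1:] != m[:, :-1]
    return b

def boundary_ratios(pred, truth):
    """R_b = T_a / T_seg and R_c = T_a / T_real, Equations (24)-(25)."""
    bp, bt = boundary_mask(pred), boundary_mask(truth)
    t_a = int(np.sum(bp & bt))                       # boundary pixels found in both
    t_seg = int(np.sum(bp))
    t_real = int(np.sum(bt))
    r_b = t_a / t_seg if t_seg else 0.0
    r_c = t_a / t_real if t_real else 0.0
    return r_b, r_c
```

Exact pixel coincidence is a strict matching rule for boundaries; a tolerance of a few pixels is a common relaxation, but the source does not state which rule was used.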

3.5. Optimal Parameter Setting

With the evaluation indexes defined, and before comparing against state-of-the-art methods, we discuss the optimal parameter settings of the proposed refined water body extraction method. First, different settings of the scale factor α_L are examined with a fixed number of ICM iterations.

In Figure 9a, for a single pooling scale description, when the scale factor α_L is set to a small number (e.g., 25, 50, and 75), the water body extraction results are poor, showing severe under-segmentation that degrades both the global and local indexes. Setting the scale factor to a larger number (e.g., 250, 275, and 300) produces better performance in terms of the local refinement indexes, but the over-segmentation problem produces poor performance in the global indexes. Therefore, setting a suitable scale factor is very important for the fusion feature single pooling scale description, which has to balance the under- and over-segmentation problems. When the best scale factor of α_L = 137 is used, the Precision is 81.64%, the Recall is 85.06%, and the Overall Accuracy is 81.96%. For the boundary performance in Figure 9b, when α_L = 137, κ is 0.81, and the boundary detection ratios are R_b = 0.63 and R_c = 0.75. Figure 9 also shows that a small scale can produce refined boundary maintenance performance, and that a large scale can generate more integrated water body extraction results. However, the single scale pooling description cannot merge the advantages of small and large scales to produce a powerful feature description. Therefore, with the goal of a powerful feature description, we next determined how many pooling scales should be chosen to avoid under- and over-segmentation and improve extraction accuracy.

Figure 10 depicts the performances with different fusion feature multi-scale pooling descriptions for refined water body extraction.

In Figure 10, four different scale sets, S_1 = {α_1, α_2}, S_2 = {α_1, α_2, α_3}, S_3 = {α_1, α_2, α_3, α_4}, and S_4 = {α_1, α_2, α_3, α_4, α_5}, are employed to test the fusion feature multi-scale pooling description for water body extraction. Here, S_1 = {50, 100}, which means that the two smallest scale factors were selected as the fusion feature multi-scale pooling description. Then, S_2 = {50, 100, 150}, S_3 = {50, 100, 150, 200}, and S_4 = {50, 100, 150, 200, 250} were also used to evaluate the different multi-scale pooling effects. Figure 10a shows that using more scales in the multi-scale feature description for Markov modeling produces better performance in both the global and local evaluation indexes. Figure 10a also indicates that the multi-scale pooling description can merge the advantages of small- and large-scale descriptions to generate a better refined water body extraction result than the single pooling scale description. However, the computation times in Figure 10b show that increasing the number of pooling scales increases the computation time of the ICM optimization process, and the additional time does not produce an equal improvement in performance. Therefore, in this paper, we chose the four scales of S_3 as the fusion feature multi-scale pooling description for Markov modeling.
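To make the role of the scale set concrete, the sketch below stacks pooled versions of a fusion feature map for S_3 = {50, 100, 150, 200}. It assumes mean pooling over a square α × α window around each pixel with edge replication; the pooling operator, the padding rule, and the function names are assumptions for illustration, since the operator itself is defined in Section 3.2 and not restated here.

```python
import numpy as np

def mean_pool(feature, alpha):
    """Mean of `feature` over an alpha x alpha window around each pixel (edge-replicated)."""
    lo = alpha // 2
    hi = alpha - lo - 1
    padded = np.pad(feature, ((lo, hi), (lo, hi)), mode="edge")
    # Integral image with a leading row and column of zeros.
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    h, w = feature.shape
    # Window sum from four corner lookups of the integral image.
    window_sum = (ii[alpha:alpha + h, alpha:alpha + w]
                  - ii[:h, alpha:alpha + w]
                  - ii[alpha:alpha + h, :w]
                  + ii[:h, :w])
    return window_sum / (alpha * alpha)

def multi_scale_pool(feature, scales=(50, 100, 150, 200)):
    """Stack one pooled map per scale factor along the last axis."""
    return np.stack([mean_pool(feature, a) for a in scales], axis=-1)
```

With the integral image, the cost of each scale is independent of α, so the per-scale pooling cost in this sketch grows with the number of scales rather than with their size.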

Just as the scale factor and description scale settings for Markov feature field modeling facilitate refined water body extraction, label field initialization is an important element for rapid model convergence and for an accurate water body extraction result [45]. Table 3 outlines the refined water body extraction results obtained with different label field initialization methods and numbers of ICM iterations. The K-means clustering algorithm is widely used to initialize the label field in the Markov model. Song et al. [45] proposed a structure based selective autoencoding (SAE) method for Markov model label field initialization, which generates better segmentation results than K-means label field initialization. Therefore, we selected the classical K-means and SAE methods for comparison with our proposed label field initialization method. In the experiments shown in Table 3, we used the 300 collected scene images as a dataset and evaluated each label field initialization method for different numbers of ICM iterations.

The analysis of these different label field initialization settings showed that, according to the observation function, our proposed fusion feature multi-scale pooling description for label field initialization produced the best performance. Owing to severe under-segmentation and the local optimum problem, the K-means and SAE label field initialization methods performed poorly. Our proposed method yields an initialized label field that is already close to the final refined water body extraction result. Our model needed only 10 iterations for Markov model convergence (in Table 3, the global indexes of the proposed method change by less than one percentage point between 10 and 90 ICM iterations), and the optimization process rapidly produces a more refined water body extraction result from the initialized label field.
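For reference, the classical K-means label field initialization used as a baseline above can be sketched as follows. This is a minimal one-dimensional version that clusters scalar pixel features into two classes; the deterministic center initialization and the function name `kmeans_init` are assumptions of this sketch, not details taken from the compared methods.

```python
import numpy as np

def kmeans_init(feature, n_classes=2, n_iter=20):
    """Cluster scalar pixel features and return an initial label field and the centers."""
    values = np.asarray(feature, dtype=float).ravel()
    # Deterministic start: spread the centers evenly between the min and max feature.
    centers = np.linspace(values.min(), values.max(), n_classes)
    for _ in range(n_iter):
        # Assign every pixel to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Move each center to the mean of its pixels (keep it if the cluster is empty).
        new_centers = np.array([
            values[labels == k].mean() if np.any(labels == k) else centers[k]
            for k in range(n_classes)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels.reshape(np.shape(feature)), centers
```

Because this clustering ignores spatial context, pixels with ambiguous features are assigned independently of their neighbors, which is consistent with the weaker K-means starting point reported in Table 3.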

4. Refined Water Body Extraction Comparisons

In this section, we outline the experiments conducted to evaluate the performance of the proposed method. We selected 200 scene images obtained from SPOT 5 and 100 images obtained from GF-2. The SPOT 5 images were 4096 × 4096 pixels with a spatial resolution of 2.5 m. The GF-2 images were 8192 × 8192 pixels with a spatial resolution of 1 m. To verify the robustness and adaptability of the algorithm, the 300 images, which were 4096 × 4096 pixels, included different contrast ratios, complex scene structures, and interference caused by multiple types of land objects. Several indexes were then employed to evaluate the proposed and other methods. We used the MATLAB working environment, run on an Intel® Core™ i7-4500U processor at 2.00 GHz.

To demonstrate the performance of the proposed method in comparison with other state-of-the-art methods, we chose several panchromatic optical remote sensing harbor scene images. We roughly divided the collected data into two categories: easy and hard samples. In the easy samples, the harbor scene images have visually clear structure and intensity features, whereas the hard samples suffer from poor illumination, a low contrast ratio, and land cover interference.

To demonstrate the effectiveness of our proposed method, we chose gray level based [39], texture feature based [37], and entropy based [30] methods for comparison. The gray level (GL) method [39] uses a morphological operator (e.g., Top-hat) to enhance the land cover area, on the assumption that water areas are darker than land areas. Then, the mean and variance values of the land area are employed to achieve water body extraction. The texture feature based method [37] uses the local binary pattern (LBP) as the texture feature descriptor to produce a discriminative description, on the assumption that water areas have a flatter texture than land areas. The maximum entropy (ME) based method [30] employs a topology segmentation model to achieve coarse water body extraction, and then uses maximum entropy features and morphology operations to merge the separated coarse segmentation areas and achieve refined water body extraction. The hierarchical sea-land segmentation (HSS) method [32] and the convolutional neural network (CNN) based method SeNet [23] were also considered for comparison. The HSS [32] method uses hierarchical analysis and gray level features combined with texture features to achieve water body extraction, and SeNet [23] uses a special constraint loss function and a large manually annotated dataset to achieve refined water body extraction. Their performance evaluations are shown in Table 4 and Figure 11.

From the global and local refinement evaluation indexes shown in Table 4 and Figure 11, we can see that our proposed method produces better performance, especially for refined water body extraction. Compared with the other methods, the global indexes improve by about 7% and the local indexes by about 10%. Some of the comparison results for the easy and hard samples are shown in Figure 12 and Figure 13. In these figures, the white “1” areas in the binary mask represent land area, and the dark “0” areas are water bodies.

In the easy samples in Figure 12 and the hard samples in Figure 13, comparing the results in rows (c), (d), and (e) shows that severe under- and over-segmentation occurred, mainly caused by complex land cover interference and the imaging conditions. Owing to their coarse-to-fine water body extraction procedures, HSS [32] and the proposed method produce better integrated water body extraction results, as shown in Figure 12f,b and Figure 13f,b. The SeNet [23] method also produced more integrated water body extraction results in Figure 12g and Figure 13g.

For the hard samples in Figure 13, our proposed method and SeNet [23] produced integrated results in Figure 13b,g, respectively. Because of poor illumination, complex land area interference, and a low contrast ratio, ME [30] and HSS [32] suffer from under-segmentation in Figure 13e,f, respectively, and GL [39] and LBP [37] fail, as shown in Figure 13c,d, respectively.

Besides the integrity evaluation, the local refinement evaluation is also important for refined water body extraction. Here, the Kappa coefficient and the boundary detection ratios in Table 4 are used to evaluate how well each method maintains the water body boundary. Figure 14 shows the edge structure details of the binary masks produced by the different refined water body extraction methods. Figure 14 shows that the proposed method maintains the edge structure better than the other methods.

5. Discussion

From the analysis of the refined water body extraction results in terms of completeness and refinement, the gray level feature is unstable under variable imaging conditions; therefore, GL [39] performed poorly. GL only works for some easy samples and fails in most hard sample situations. The LBP [37] texture based method can handle a few hard samples with obvious feature differences between water bodies and land cover. However, if strong waves occur in water areas or flat textures exist in land areas, texture based methods perform poorly. The ME [30] method performs reasonably well on water body extraction in complex scenes. However, ME [30] is sometimes sensitive to local information changes (e.g., cloud cover, local gray gradient variance, and noise from poor quality imaging). Therefore, ME [30] produces unstable extraction performance on the easy and hard samples compared with the other methods. By considering both gray level and texture features, the HSS [32] method applies a hierarchical analysis strategy from coarse to fine extraction, which produces better performance than the GL [39], LBP [37], and ME [30] methods. For HSS [32], the hierarchical analysis strategy and comprehensive feature analysis effectively achieve refined water body extraction. However, the morphology operations used in HSS [32] do not accurately maintain edges. In our proposed Markov modeling with the fusion feature multi-scale pooling process, the fusion feature map comprehensively considers gray level, texture, and entropy features for basic water body feature description. Then, the fusion feature multi-scale pooling description is used in the hierarchical iterative Markov optimization process so that the initialized label field automatically evolves toward the refined water body extraction result. Therefore, our proposed method has a more detailed hierarchical analysis strategy and produces better performance than the compared methods, owing to its more powerful multi-scale feature description.

In Table 4 and Figure 11, the CNN based method SeNet [23] combines powerful CNN feature description with a special constraint in the loss function to extract refined water bodies. Yet SeNet [23] still produced poorer refined extraction performance than the proposed method. The reason can be summarized in two points. First, SeNet [23] relies heavily on a refined, manually annotated dataset, but pixel-level annotation of 4096 × 4096 and 8192 × 8192 panchromatic harbor scene optical remote sensing images is very expensive and unreliable, which severely weakens the edge pixel constraint of the loss function when attempting to produce a refined result. Second, the input of the SeNet [23] perception field is 512 × 512, which limits SeNet when learning the global feature information of a harbor scene. Conversely, SeNet is very hard to train with a large input size for the perception field and the correspondingly massive number of network parameters needed for refined water body extraction.

In general, with respect to edge structure maintenance, a hierarchical extraction strategy and powerful feature description both help to generate an accurate result. Because morphology operations destroy edge structure, they have an adverse effect on refined water body extraction. Therefore, a hierarchical iterative process with powerful feature description and without morphology operations should be considered to yield refined extraction results; such a process is a recurrent confirming process that has been likened to human consciousness. Finally, in future work, we will focus on powerful CNN based feature extraction methods and combine them with a feature parameter estimation framework to achieve semi-supervised learning and reduce the dependence of these methods on dataset annotation.

6. Conclusions

In this paper, we propose a novel refined water body extraction method based on fusion feature multi-scale pooling for Markov feature field and label field modeling. To distinguish between water and land cover features in panchromatic optical remote sensing harbor scene images, we used a fusion feature for water body extraction. Then, based on the fusion feature map, the multi-scale pooling operation aggregates fusion features at different scales from local contextual information to generate a powerful feature description. The resulting fusion feature multi-scale pooling description was used for Markov feature field and label field modeling. Next, based on the Markov optimal observation function, label field initialization and feature field calculation were jointly achieved, and the rebuilt label and feature fields in the Markov model converge rapidly under the optimal label field updating process. The main difference and advantage of our method compared with the other methods is that it does not use dilation and erosion operations to achieve refined water body extraction. In the proposed method, the refined water body extraction result is generated gradually by the optimal Markov iterative process. Our experiments demonstrated that the proposed method performs better for refined water body extraction than the other methods, and the evaluation indexes indicated that the global and local refined performance can be improved by about 7% and 10%, respectively.

Author Contributions

Conceptualization, Y.Z. and H.C.; Formal analysis, B.Q. and L.L.; Funding acquisition, H.C. and L.L.; Investigation, S.D., B.Q. and L.L.; Methodology, Y.Z., B.Q. and H.C.; Project administration, H.C. and L.L.; Writing—original draft, Y.Z. and B.Q.

Funding

This research was funded by the National Natural Science Foundation of China (NSF) grant 61471006; the 111 Project grant 111-2-05 and AFOSR grant FA9550-16-1-0386.

Acknowledgments

This work was supported in part by the National Natural Science Foundation of China (NSF) grant 61471006; the 111 Project grant 111-2-05 and AFOSR grant FA9550-16-1-0386.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Jarlan, L.; Khabba, S.; Er-Raki, S.; Le Page, M.; Hanich, L.; Fakir, Y.; Merlin, O.; Mangiarotti, S.; Gascoin, S.; Ezzahar, J.; et al. Remote Sensing of Water Resources in Semi-Arid Mediterranean Areas: The joint international laboratory TREMA. Int. J. Remote Sens. 2015, 36, 4879–4917.
2. Saito, L.; Rosen, M.R.; Roesner, L.; Howard, N. Improving estimates of oil pollution to the sea from land-based sources. Mar. Pollut. Bull. 2010, 60, 990–997.
3. Zhu, X.Y.; Li, Y.; Feng, H.Y.; Liu, B.X.; Xu, J. Oil spill detection method using X-band marine radar imagery. J. Appl. Remote Sens. 2015, 9.
4. Lim, J.; Lee, K.S. Investigating flood susceptible areas in inaccessible regions using remote sensing and geographic information systems. Environ. Monit. Assess. 2017, 189, 96.
5. Müller, R.; Berg, M.; Casey, S.; Ellis, G.; Flingelli, C.; Kiefl, R.; Ansgar, K.; Lechner, K.; Reize, T.; Sándor, G. Optical Satellite Services For EMSA (Opsserve)-Near Real-Time Detection of Vessels And Activities with Optical Satellite Imagery. In Proceedings of the ESA Living Planet Symposium, Edinburgh, UK, 9–13 September 2013.
6. Rajiv Kumar Nath, S.K.D. Water-Body Area Extraction from High Resolution Satellite Images-An Introduction, Review, and Comparison. Int. J. Image Process. 2010, 3, 353–372.
7. Gao, B.C. NDWI—A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sens. Environ. 1996, 58, 257–266.
8. Kwang, C.; Jnr, E.M.O.; Amoah, A.S. Comparing of Landsat 8 and Sentinel 2A using Water Extraction Indexes over Volta River. J. Geogr. Geol. 2017, 10, 1.
9. Kaplan, G.; Avdan, U. Object-based water body extraction model using Sentinel-2 satellite imagery. Eur. J. Remote Sens. 2017, 50, 137–143.
10. Zhang, F.; Li, J.; Shen, Q.; Zhang, B.; Ye, H.; Wang, S.; Lu, Z. Dynamic Threshold Selection for the Classification of Large Water Bodies within Landsat-8 OLI Water Index Images. Remote Sens. 2016.
11. Feyisa, G.L.; Meilby, H.; Fensholt, R.; Proud, S.R. Automated Water Extraction Index: A new technique for surface water mapping using Landsat imagery. Remote Sens. Environ. 2014, 140, 23–35.
12. Li, B.Y.; Zhang, H.; Xu, F.J. Water Extraction in High Resolution Remote Sensing Image Based on Hierarchical Spectrum and Shape Features. In Proceedings of the 35th International Symposium on Remote Sensing of Environment (Isrse35), Beijing, China, 22–26 April 2014; p. 17.
13. Xu, H.Q. Modification of normalised difference water index (NDWI) to enhance open water features in remotely sensed imagery. Int. J. Remote Sens. 2006, 27, 3025–3033.
14. Acharya, T.D.; Lee, D.H.; Yang, I.T.; Lee, J.K. Identification of Water Bodies in a Landsat 8 OLI Image Using a J48 Decision Tree. Sensors (Basel) 2016, 16, 1075.
15. Yu, M.; Lan, T.; Wang, Q.Q.; Guo, G.D. An Improvement Method of Surface Water Extraction Based on Remote Sensing Data. J. Eng. Appl. Sci. 2017.
16. Yang, X.; Chen, L. Evaluation of automated urban surface water extraction from Sentinel-2A imagery using different water indices. J. Appl. Remote Sens. 2017, 11, 026016.
17. Zhonghua, H.; Xuesu, L.; Yanling, H.; Zhang, Y.; Wang, J.; Zhou, R.; Hu, K. Automatic sub-pixel coastline extraction based on spectral mixture analysis using EO-1 Hyperion data. Front. Earth Sci. 2018.
18. Milad, N.J.; Alfonso, V. Reconstruction of River Boundaries at Sub-Pixel Resolution: Estimation and Spatial Allocation of Water Fractions. ISPRS Int. J. Geo-Inf. 2017, 6, 383.
19. Huan, X.; Xin, L.; Xiong, X.; Pan, H.; Tong, X. Automated Subpixel Surface Water Mapping from Heterogeneous Urban Environments Using Landsat 8 OLI Imagery. Remote Sens. 2016, 8, 584.
20. Rudorff, C.M.; Novo, E.M.L.; Galvao, L.S. Spectral Mixture Analysis of Inland Tropical Amazon Floodplain Waters Using EO-1 Hyperion. In Proceedings of the 2006 IEEE International Symposium on Geoscience and Remote Sensing, Denver, CO, USA, 31 July–4 August 2006.
21. Tiagrajah, V.; Win, K. SOM based segmentation method for water region detection in satellite images. World J. Eng. 2013, 10, 95–100.
22. Zhuang, Y.; Wang, P.L.; Yang, Y.D.; Shi, H.; Chen, H.; Bi, F.K. Harbor Water Area Extraction from Pan-Sharpened Remotely Sensed Images Based on the Definition Circle Model. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1690–1694.
23. Cheng, D.; Meng, G.; Cheng, G.; Pan, C. SeNet: Structured Edge Network for Sea–Land Segmentation. IEEE Geosci. Remote Sens. Lett. 2017, 14, 247–251.
24. Huang, X.; Xie, C.; Fang, X.; Zhang, L.P. Combining Pixel- and Object-Based Machine Learning for Identification of Water-Body Types From Urban High-Resolution Remote-Sensing Imagery. IEEE J.-Stars 2015, 8, 2097–2110.
25. Yu, L.; Wang, Z.; Tian, S.; Ye, F.; Ding, J.; Kong, J. Convolutional Neural Networks for Water Body Extraction from Landsat Imagery. Int. J. Comput. Intell. Appl. 2017, 16, 1750001.
26. Lian, S.Z.; Chen, J.P.; Luo, M.H. A Probability-Based Statistical Method to Extract Water Body of Tm Images with Missing Information. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, 41, 21–26.
27. Xiong, L.H.; Deng, R.R.; Li, J.; Liu, X.L.; Qin, Y.; Liang, Y.H.; Liu, Y.F. Subpixel Surface Water Extraction (SSWE) Using Landsat 8 OLI Data. Water 2018, 10, 653.
28. Pagano, T.S.; Chahine, M.T.; Aumann, H.H.; O’Callaghan, F.G.; Broberg, S.E. Advanced Remote-sensing Imaging Emission Spectrometer (ARIES): AIRS spectral resolution with MODIS spatial resolution. In Proceedings of the IEEE International Symposium on Geoscience & Remote Sensing, Denver, CO, USA, 31 July–4 August 2006.
29. Cucci, C.; Casini, A.; Picollo, M.; Poggesi, M.; Stefani, L. Open issues in hyperspectral imaging for diagnostics on paintings: When high-spectral and spatial resolution turns into data redundancy. O3a Opt. Arts Archit. Archaeol. III 2011, 8084, 4131–4140.
30. Liu, W.C.; Ma, L.; Chen, H.; Han, Z.; Soomro, N.Q. Sea-Land Segmentation for Panchromatic Remote Sensing Imagery via Integrating Improved MNcut and Chan-Vese Model. IEEE Geosci. Remote Sens. Lett. 2017, 14, 2443–2447.
31. Zewen, H. A sea-land segmentation algorithm based on graph theory. In Proceedings of the International Conference on Computer Vision, Roorkee, Uttarakhand, 26–28 February 2016.
32. Ma, L.; Soomro, N.Q.; Shen, J.J.; Chen, L.; Mai, Z.H.; Wang, G.Q. Hierarchical Sea-Land Segmentation for Panchromatic Remote Sensing Imagery. Math. Probl. Eng. 2017, 2017.
33. Zhuang, Y.; Guo, D.C.; Chen, H.; Bi, F.K.; Ma, L.; Soomro, N.Q. A novel sea-land segmentation based on integral image reconstruction in MWIR images. Sci. China-Inf. Sci. 2017, 60.
34. Li, J.; Xie, W.X.; Pei, J.H. A sea-land segmentation algorithm based on multi-feature fusion for a large-field remote sensing image. In Proceedings of the Mippr 2017: Remote Sensing Image Processing, Geographic Information Systems, and Other Applications, Xiangyang, China, 28–29 October 2017; Volume 10611.
35. Liu, G.; Chen, E.; Qi, L.; Tie, Y.; Liu, D. A Sea-Land Segmentation Algorithm Based on Sea Surface Analysis. In Proceedings of the Advances in Multimedia, Xi’an, China, 15–16 September 2016; pp. 479–486.
36. Wang, D.; Cui, X.; Xie, F.; Jiang, Z.; Shi, Z. Multi-feature sea–land segmentation based on pixel-wise learning for optical remote-sensing imagery. Int. J. Remote Sens. 2017, 38, 4327–4347.
37. Xia, Y.; Wan, S.; Jin, P.; Yue, L. A Novel Sea-Land Segmentation Algorithm Based on Local Binary Patterns for Ship Detection. Int. J. Signal Process. Image Process. Pattern Recognit. 2014, 7, 237–246.
38. Poggi, G.; Scarpa, G.; Zerubia, J.B. Supervised segmentation of remote sensing images based on a tree-structured MRF model. IEEE Trans. Geosci. Remote Sens. 2005, 43, 1901–1911.
39. Tang, J.X.; Deng, C.W.; Huang, G.B.; Zhao, B.J. Compressed-Domain Ship Detection on Spaceborne Optical Image Using Deep Neural Network and Extreme Learning Machine. IEEE Trans. Geosci. Remote Sens. 2015, 53, 1174–1185.
40. Melas, D.E.; Wilson, S.P. Double Markov random fields and Bayesian image segmentation. IEEE Trans. Signal Process. 2002, 50, 357–365.
41. Derrode, S.; Pieczynski, W. Signal and image segmentation using pairwise Markov chains. IEEE Trans. Signal Process. 2004, 52, 2477–2489.
42. Geman, S.; Geman, D. Stochastic relaxation, gibbs distributions, and the bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 1984, 6, 721–741.
43. Frank, O.; Strauss, D. Markov Graphs. J. Am. Stat. Assoc. 1986, 81, 832–842.
44. Otsu, N. A Threshold Selection Method from Gray-Level Histograms. IEEE Trans. Syst. Man Cybern. 1979, 9, 62–66.
45. Song, S.; Si, B.; Feng, X.; Liu, K. Label field initialization for MRF-based sonar image segmentation by selective autoencoding. In Proceedings of the Oceans, Kobe, Japan, 6–8 October 2016.

Figure 1. Gray level histograms of original optical panchromatic remote sensing images: (a) water-land scene with separable water body gray level histogram; (b) water-land scene with non-separable water body gray level histogram.

Figure 2. The refined water body extraction workflow of our proposed method.

Figure 3. Study area of famous Chinese ports: Dalian, Tianjin, Qingdao, Shanghai, Xiamen, Shenzhen, and Haikou. All SPOT 5 and GF-2 data were collected from these ports.

Figure 4. Fusion feature multi-scale pooling process. α_L denotes the different scale factors, which are the sizes of the pooling operation; H represents a pixel in the pooling area.

Figure 5. Fusion feature multi-scale pooling analysis. (a–d) show small to large pooling scale descriptions with their corresponding binary masks. The red circle shows the detailed edge structure maintained using different pooling description scales.

Figure 6. Discriminative feature description analysis of fusion feature multi-scale pooling for water and land area. (a–c) are original panchromatic images with their corresponding gray level histograms and fusion feature pooling response curve statistics.

Figure 7. The workflow of Markov modeling for refined water body extraction: (a) principle of label field modeling; (b) feature field modeling; (c) ICM optimal iteration process; (d) an example of an extracted refined water body binary mask.

Figure 8. The proposed label field initialization with Markov modeling for refined water body extraction.

Figure 9. Performance of different scale factor α_L settings with the single pooling scale description: (a) global indexes for regional integrity evaluation; (b) local refined indexes for boundary accuracy evaluation.

Figure 10. Multi-scale pooling feature description analysis: (a) performance of the multi-scale pooling feature description with different numbers of pooling scales for refined water body extraction; (b) computation time of the multi-scale pooling feature description with different pooling scales.

Figure 11. Performance comparison analysis based on the selected dataset: (a) precision; (b) recall; (c) overall accuracy; (d) Kappa coefficient; (e) R_b; (f) R_c.

Figure 12. Results for the easy samples for testing refined water body extraction performance: (a) original panchromatic images, (b) proposed method, (c) GL [39], (d) LBP [37], (e) ME [30], (f) HSS [32], and (g) SeNet [23].

Figure 13. Results for the hard samples for testing refined water body extraction performance: (a) original panchromatic images, (b) proposed method, (c) GL [39], (d) LBP [37], (e) ME [30], (f) HSS [32], and (g) SeNet [23].

Figure 14. Analysis of refined water body extraction results for the compared methods. The top of the figure shows the detailed local boundary selection and results formation.

Table 1. Classical Water Area Indexes. G: Green band radiation; NIR: Near-infrared radiation; SWIR: Shortwave infrared radiation; SWIR1: 1300 nm to 1900 nm; SWIR2: 1900 nm to 2500 nm.

Water Index	Expression
NDWI	NDWI = (G − NIR)/(G + NIR)
MNDWI	MNDWI = (G − SWIR)/(G + SWIR)
AWEI	AWEI = 4 × (G − SWIR1) − (0.25 × NIR + 2.75 × SWIR2)
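
The expressions in Table 1 translate directly into array arithmetic, as the following sketch shows. The band arrays are assumed to be reflectance values supplied by the caller, and returning 0 where a denominator is zero is a convention chosen only for this illustration; the function names are hypothetical.

```python
import numpy as np

def _normalized_difference(a, b):
    """(a - b) / (a + b), returning 0 where the denominator is 0 (assumed convention)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

def ndwi(green, nir):
    """NDWI = (G - NIR) / (G + NIR)."""
    return _normalized_difference(green, nir)

def mndwi(green, swir):
    """MNDWI = (G - SWIR) / (G + SWIR)."""
    return _normalized_difference(green, swir)

def awei(green, nir, swir1, swir2):
    """AWEI = 4 * (G - SWIR1) - (0.25 * NIR + 2.75 * SWIR2)."""
    green, nir, swir1, swir2 = (np.asarray(x, dtype=float)
                                for x in (green, nir, swir1, swir2))
    return 4.0 * (green - swir1) - (0.25 * nir + 2.75 * swir2)
```

All three indexes need several spectral bands, which a single-band panchromatic image does not provide.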

Table 2. SPOT 5 and GF-2 satellite data information.

Country	Satellite	Launch Date	Panchromatic Resolution	Multi-Spectral Resolution
France	SPOT-5	2002	2.5 m	10 m
China	GF-2	2014	1 m	4 m

Table 3. Analysis of different methods for Markov label field initialization with different iterations.

	Precision	Recall	Overall Accuracy	Kappa	R_b	R_c
10 ICM Iterations
K-means	54.6%	47.9%	40.2%	0.44	0.374	0.433
SAE	69.2%	62.3%	57.6%	0.52	0.443	0.519
Proposed	87.5%	93.7%	89.2%	0.85	0.774	0.802
50 ICM Iterations
K-means	60.3%	53.4%	44.1%	0.51	0.404	0.477
SAE	77.3%	75.8%	71.7%	0.66	0.535	0.627
Proposed	87.8%	93.7%	89.3%	0.85	0.764	0.813
90 ICM Iterations
K-means	61.2%	53.6%	44.9%	0.51	0.408	0.472
SAE	79.8%	77.1%	74.3%	0.69	0.564	0.649
Proposed	88.1%	93.8%	89.4%	0.85	0.771	0.814

Table 4. Performance comparison for refined water body extraction.

Parameter		GL [39]	LBP [37]	ME [30]	HSS [32]	SeNet [23]	Proposed
Precision		73%	80%	82%	85%	83%	87%
Recall		64%	77%	85%	89%	86%	93%
Overall Accuracy		59%	72%	81%	84%	81%	89%
Kappa		32%	61%	73%	77%	74%	83%
Boundary Detection Ratio	R_b	24%	55%	77%	79%	77%	84%
	R_c	23%	49%	70%	72%	75%	87%
Calculation time (s)	4096 × 4096 images	14.13	82.45	37.75	124	45	93

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
