A Development of a Robust Machine for Removing Irregular Noise with the Intelligent System of Auto-Encoder for Image Classification of Coastal Waste

Wan, Shiuan; Lei, Tsu Chiang

doi:10.3390/environments9090114


by Shiuan Wan ¹ and Tsu Chiang Lei ²,*

¹ Department of Information Technology, Ling Tung University, Taichung 40851, Taiwan

² Department of Urban Planning and Spatial Information, Feng Chia University, Taichung 40724, Taiwan

* Author to whom correspondence should be addressed.

Environments 2022, 9(9), 114; https://doi.org/10.3390/environments9090114

Submission received: 12 August 2022 / Revised: 29 August 2022 / Accepted: 2 September 2022 / Published: 4 September 2022

(This article belongs to the Special Issue Monitoring and Assessment of Environmental Quality in Coastal Ecosystems Volume II)


Abstract: The seashore is currently threatened by climate change and by increasing amounts of coastal waste. In the past, environmental groups relied on a large amount of manpower to manage and maintain the seashore environment, but the time cost and efficiency of this approach are not ideal for such a vast area. With the progress of GIS (Geographic Information System) technology, remote sensing can capture a wide range of data in a short period. This research applies remote sensing technology combined with machine learning to observe the seashore. However, because seashore waste is small, image classification requires high-resolution image data, and removing noise therefore becomes a crucial issue in developing an image classifier. The difficulties include how to adjust parameter values to remove or avoid noise. First, texture information and vegetation indices were employed as ancillary information in the image classification. Second, because the auto-encoder (AE) is a very good tool for denoising an image, it was used to transform the high-resolution images, together with the ancillary information, to extract attributes. Multi-layer perceptron (MLP) and support vector machine (SVM) classifiers were compared in a parallel study. The overall accuracy is about 85.5% and 83.9% for SVM and MLP, respectively. When the AE is applied for preprocessing, the overall accuracy increases by about 10–12%.

Keywords: coastal waste; image classification; auto-encoder; image denoise


1. Introduction

The coastline of Taiwan faces a very troublesome coastal maintenance problem. Maintaining the coastal environment requires the continuous investment of hundreds of millions of dollars and consumes a great deal of time and manpower. It is also difficult to protect and govern the coastal environment efficiently with human power alone, without applying spatial technology. Subsequent maintenance involves a huge amount of spatial data, which is hard to turn into usable information efficiently. In addition to the continuous and regular removal of coastal waste and the one-off beach-cleaning activities that are often organized, a better solution must be developed by combining remote sensing (RS) technology with a supervised machine learning model to identify the ground truth of the coastal area. Using RS image data to detect target categories saves the manpower, time, and expense required for traditional on-site surveys, especially in large areas. RS covers a wide area in real time and is a great help in regional monitoring. Therefore, remote sensing technology is an economical and efficient method for surveying the ground truth of the coastal area.

More specifically, RS technology has been widely used in land-use surveys, especially when integrated with spatial technology, which makes it possible to detect the overall conditions of a large area. High-resolution images have beneficial characteristics that can be applied to large-scale land-use detection problems. Their accuracy has greatly increased, even to the centimeter level, which makes them a good data resource, and their high spatial resolution overcomes the limitations of earlier, coarser images. Consequently, many scholars have applied high-resolution image data together with spectral indicators, textures, multi-period images, radar images, and even GIS data to perform area classification and interpretation. Past studies used SPOT satellite images [1,2] and achieved good results, but they were limited by insufficient image resolution; the fine detail of high-resolution images can make up for this problem.

In image classification, there are two main approaches: (1) if the study area is fully understood, a supervised classification method can be used [3,4]; (2) if the study area is not well understood, unsupervised classification can be used [3]. With the advancement of science and technology, image data with advanced ancillary information obtained from general-spectrum or multi-spectral images should help the progress of information analysis. At the same time, however, noise often occurs in high-resolution image input. Ancillary information (texture information or vegetation indices) is also crucial to image classification. Such ancillary information has been applied successfully in agriculture, and we tried to apply the same concept to the coastal area. Vegetation categories (grass, bushes, and forests) differ in reflectance, and texture information helps to distinguish surfaces such as rocks and vegetation (for example, crops), thus influencing the classification outcomes. In addition, the target categories of the in-situ samples must be reviewed and labeled correctly; otherwise, supervised machine learning classification may produce a large number of misjudgments. If the attributes of the image input data are reduced or rationally reassembled, it will be of great help in generating a thematic map of the coastal waste area.

The optimization strategy for effective classification is also important. Coastal waste in Taiwan is driven by weather changes, which cause massive pollution along the coast, and it has two major sources. Half of the coastal waste comes from the natural environment and is generated by upstream landslides, such as rock and driftwood (Figure 1) [5,6]. The rest is manmade waste (plastic bottles, Styrofoam, cans, etc.), which is irregular [7,8] and therefore quite difficult for a classifier to detect. Hence, the training and testing samples in this research were extracted uniformly at random. The ground truth data were obtained by in-situ observation; nevertheless, when the rules constructed from the training samples are verified against the overall study area, a certain degree of uncertainty remains. Furthermore, image classification approaches are strongly influenced by the background of the data. In our case, coastal waste is to a great extent mixed into the background (such as sand or rocks). Thus, the image data may contain noise, which seriously affects the classification outcomes [9].

Because coastal waste is irregular, removing image noise becomes very important, especially in a high-resolution image. One possible solution is to use a pre-processing tool to remove the noise. Such machine learning approaches operate on representative components (features) of the image: they encode the input data and reconstruct an output that is as similar as possible to the input. Through this process, the system learns the information carried by the input data. The auto-encoder (AE) approach has displayed outstanding results in denoising image data [10]. A denoising AE adds noise to the input image to corrupt the data and mask some of the values, and then reconstructs the image. During reconstruction, the AE is trained on the input features, which improves the overall extraction of latent representations. In other words, the auto-encoder corrupts the inputs before processing them and then reassembles the outputs for analysis. The AE is therefore an appropriate choice here, specifically because of its applications in denoising, and it has great potential for feature extraction. Some researchers have used PCA (principal component analysis), a feature extraction method that performs a change of basis on the data, whereas the AE reassembles the input data to produce the outputs. However, PCA does not seem as beneficial as the auto-encoder [11].

Multi-layer perceptrons (MLP) are perhaps the most useful type of neural network [12]. A perceptron is a single-neuron model that is a precursor to larger neural networks. The field of neural networks investigates how simple models of biological brains can be used to solve difficult computational tasks, such as predictive modeling in machine learning. The goal is not to create realistic models of the brain but rather to develop robust algorithms that can solve difficult classification problems. Mathematically, such networks are capable of learning any mapping function and have been proven to be universal approximators. Their predictive capability comes from the hierarchical, multi-layered structure of the networks. Support vector machine (SVM) is another simple algorithm, against which the results in this study are compared. SVM is preferred by many because it produces significant accuracy with less computation power, and it can be used for both regression and classification tasks. A special feature of these classifiers is that they minimize the empirical classification error and maximize the geometric margin simultaneously; therefore, SVM is also known as a maximum margin classifier. The benefit of SVM is that it can handle many types of image data while attaining relatively good classification accuracy [13,14,15]. Few studies have applied machine learning with a preprocessing tool to coastal waste. To gain a better understanding of the classification results, this study used the two approaches in parallel and compared their outcomes. SVM and MLP are well-known methods for image classification and are widely used for spatial classification [16,17,18,19].

This study used UAV (unmanned aerial vehicle) image data to conduct an exploratory study. UAVs were chosen because they are mobile and fast and can reach remote places. The use of high-resolution UAV imagery is also an attempt to simulate what a person can observe of the coastal waste condition. Because the composition of coastal waste is complex, the purpose of this study is to use image classification methods to quickly classify coastal waste and find its location and distribution in the beach environment. The study has four parts: (1) introduction: the background of the study; (2) research data: a description of the study area and image data; (3) methods: the auto-encoder for preprocessing, followed by the MLP and SVM classifiers; and (4) discussion: the results.

2. Materials and Methods

2.1. Study Area

To study coastal waste, selecting an appropriate zone is an important issue. The study area was selected according to the following criteria:

(1) It must be an area open to the general public and convenient for entry and exit.

(2) The coast must extend at least 100 m.

(3) It is an open-type scenic spot that is very crowded with tourists.

(4) The site is selected one kilometer from the coast. There are dangerous characteristics, such as estuary and reefs.

(5) The front edge of the coast must be able to be maintained from the front edge of the coast to the part of the road bank and its full extent. When the tide is low, the coast slope does not exceed 45°.

After many surveys, the study area was finally selected at 83.2 K of the road on Line 2, located in the northeast of New Taipei City, Beibin East China Sea, Gongliao District in the east and Shuangxi District and Pingxi District in the south. About 10 beaches require monitoring and management; we selected this beach as a prototype zone for analysis. The northwest of Keelung Bin-hai is an excellent zone for analyzing coastal waste with high-resolution UAV image data. More specifically, the Rui-fang District belongs to the Keelung Hills at the northernmost part of the Central Mountain Range, on the coast of the northeastern corner of Taiwan. Except for some coastal areas and river alluvial land, it mainly consists of mountain slopes. This coastal zone satisfies the aforementioned criteria. The location is 25°07′18.2″N, 121°54′08.6″E (see Figure 2).

2.2. Image Format

This study used high-resolution multi-spectral images with a spatial resolution of 20 cm × 20 cm. The images contain B (blue), G (green), R (red), and IR (infrared) bands. In pre-processing, each image must first be cropped and adjusted to a size and range that the equipment can handle easily. Although the scope of the experiment is not large, expanding the selected range is only a matter of automated processing; hence, the approach can be replicated widely on other beach areas. The in-situ ground truth data were investigated, and the distribution of the samples is shown in Figure 3. Samples of the three categories (rock, wood, and waste) were selected uniformly at random.

2.3. Ancillary Information

This study provides characteristic information for the remote image data, which includes the original bands of the image (B, G, R, IR), vegetation information, and texture information. For the coastline environment, the study considered different conditions and used images of different scales to extract the characteristic information of seashore waste, establishing a knowledge base consisting of the images and their features. Table 1 lists the spectral indices used to improve the classification: the ratio vegetation index (RVI), the normalized difference vegetation index (NDVI), the perpendicular vegetation index (PVI), the soil-adjusted vegetation index (SAVI), the transformed soil-adjusted vegetation index (TSAVI), the greenness index (GI), the infrared percentage vegetation index (IPVI), and the modified soil-adjusted vegetation index (MSAVI). In addition, the gray-level co-occurrence matrix (GLCM) method was used to extract image texture (see Figure 3a). Image texture provides information on spatial distribution, which helps the classifier to distinguish categories; in some cases, an appropriate selection of texture images can increase classification accuracy. The following seven texture characteristics were computed for each band: (1) homogeneity, (2) contrast, (3) dissimilarity, (4) entropy, (5) variance, (6) mean, and (7) second moment. All the vegetation indices and texture information are shown in Table 1.
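As an illustration of how band values become ancillary features, the following minimal sketch computes four of the listed vegetation indices for a single pixel. It uses the commonly published definitions of RVI, NDVI, SAVI, and IPVI; the function names, the soil factor of 0.5, and the sample reflectance values are assumptions made for this example and are not taken from Table 1.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index: (NIR - R) / (NIR + R)."""
    total = nir + red
    return (nir - red) / total if total != 0 else 0.0


def rvi(nir, red):
    """Ratio vegetation index: NIR / R."""
    return nir / red if red != 0 else 0.0


def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted vegetation index; soil_factor = 0.5 is an assumed default."""
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def ipvi(nir, red):
    """Infrared percentage vegetation index: NIR / (NIR + R)."""
    total = nir + red
    return nir / total if total != 0 else 0.0


def pixel_features(blue, green, red, nir):
    """Append the four indices to the original band values of one pixel."""
    return [blue, green, red, nir,
            rvi(nir, red), ndvi(nir, red), savi(nir, red), ipvi(nir, red)]


# Hypothetical reflectance values for one vegetated pixel.
features = pixel_features(blue=0.05, green=0.08, red=0.10, nir=0.50)
```

Note that IPVI is a rescaled NDVI, IPVI = (NDVI + 1) / 2, so the two indices carry the same information on different ranges.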

3. Method for Study Plan

The study has two plans. The first plan is to build the standard procedure of image classification in geosciences. Figure 4a shows the research steps for the whole study: the first part is data preprocessing, and the second part is the analysis with different classifiers and the AE preprocessing tool.

3.1. Auto-Encoder (AE)

An auto-encoder is a type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised learning) [20]. The encoding is validated and refined by attempting to regenerate the input from the encoding. In essence, the AE learns a representation (encoding) for a set of data by training the network to ignore insignificant data (“noise”). An AE that executed the copying task perfectly would merely duplicate the signal; instead, AEs are typically forced to reconstruct the input approximately, preserving only the most relevant aspects of the data in the copy. Because the AE is trained to copy the input to the output under such constraints, the latent representation takes on valuable properties. One way to obtain useful features from the AE is to constrain the code to have a smaller dimension than the input; in this case, the AE is called under-complete. By training an under-complete representation, we force the AE to learn the most salient features of the training data, which can achieve better outcomes.

The AE consists of an input layer, a hidden layer, and an output layer. In the first phase, the input layer and the hidden layer form an encoder. In the second phase, the hidden layer and the output layer form a decoder. Between these two, there is a code. They are described as follows:

(a) Encoder has a function of compressing input information into a different latent space;

(b) Code is a part of the network that represents the compressed input, which is fed to the decoder;

(c) Decoder does the reverse work and reconstructs the original information. It moves from the latent space to the original information space.

3.2. Multi-Layer Perceptron (MLP)

Multi-layer perceptron (MLP) is a class of feedforward artificial neural network (ANN) [21]. The term MLP is used ambiguously: sometimes loosely, to mean any feedforward ANN, and sometimes strictly, to refer to networks composed of multiple layers of perceptrons (with threshold activation), especially when they have a single hidden layer.

The MLP consists of at least three layers of nodes: an input layer, a hidden layer, and an output layer. Except for the input nodes, each node is a neuron that uses a nonlinear activation function. The MLP is trained with a supervised learning technique called backpropagation [12,13]. Its multiple layers and non-linear activation distinguish the MLP from a linear perceptron and allow it to distinguish data that are not linearly separable. The backpropagation algorithm has the drawbacks of local convergence and slowness. Thus, in this approach, the alpha parameter and the number of neurons in the hidden layer are adjusted automatically to minimize the amount of noise, which helps to avoid local convergence and slowness. We varied the number of neurons and the alpha value and recorded the resulting amount of image noise (see Figure 4b).

3.3. Support Vector Machine (SVM)

The support vector machine (SVM) algorithm is a popular and well-accepted supervised learning tool for classification problems. SVM classifiers support binary and multiclass classification, and the structured SVM can train a classifier for general structured output labels. More specifically, the SVM builds hyperplanes in the data that may be able to separate the classes. A rational choice for the best hyperplane is the one that represents the largest separation, or margin, between the two classes; that is, the optimal hyperplane maximizes its distance from the nearest data point on each side. A special characteristic of these classifiers is that they minimize the empirical classification error and maximize the geometric margin. The core of the support vector machine is the selection of an appropriate kernel function, which transforms the dataset into the form required for classification [14]. Because some data cannot be separated linearly in the original space, they are projected nonlinearly into a higher-dimensional space where they are easier to separate; common kernels are the linear, polynomial, radial basis function (RBF), and sigmoid functions. Most past studies of image classification in geoscience used RBF as the kernel [13,14,15]; hence, this study employed the radial basis function to produce the classification results, and the value of the bias was set to 0. Grid search is a tuning technique that performs an exhaustive search over a set of specific parameter values for C and gamma of a model. Hence, to attain better model parameters, the grid search method was used to test possible combinations of the parameters C (penalty parameter; C = 2.5) and g (gamma; g = 0.42) repeatedly and to calculate the correct rate of each pair (C, g), with the aim of minimizing the amount of noise (see Figure 4b).
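The exhaustive search over (C, gamma) pairs can be sketched as follows. This is only an illustration of the grid search technique under stated assumptions: `toy_score` is a hypothetical stand-in for the real objective (which would train an RBF-kernel SVM and return its correct rate or a noise-based score), and the candidate values in the grid are invented for the example.

```python
from itertools import product


def grid_search(score_fn, c_values, gamma_values):
    """Evaluate every (C, gamma) pair and return the best pair and its score."""
    best_params, best_score = None, float("-inf")
    for c, gamma in product(c_values, gamma_values):
        score = score_fn(c, gamma)
        if score > best_score:
            best_params, best_score = (c, gamma), score
    return best_params, best_score


def toy_score(c, gamma):
    # Hypothetical objective that peaks at C = 2.5 and gamma = 0.42
    # purely for demonstration; it does not train any classifier.
    return 1.0 - (c - 2.5) ** 2 - (gamma - 0.42) ** 2


best_params, best_score = grid_search(
    toy_score,
    c_values=[0.5, 1.0, 2.5, 5.0],   # assumed candidate penalty values
    gamma_values=[0.1, 0.42, 1.0],   # assumed candidate gamma values
)
```

Because every combination is evaluated, the cost grows with the product of the two list lengths, which is why the candidate lists are usually kept short or spaced logarithmically.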

3.4. Development of Robust Removing Noise Machine

As part of the study, the second plan was to design an auto-adjustment machine, shown in Figure 4b, which is named the robust noise-removing machine [22]. After the training and testing data are selected, the training data set is input to AE + SVM and AE + MLP, respectively. In this study, the SVM used the RBF kernel, and the MLP used the ReLU activation function with the Adam solver. The auto-encoder used the sigmoid activation function with the Adam optimizer. The sigmoid activation function was tested under various conditions and found to be the best selection for this study; other alternatives may exist, but they were not suitable for this set of image data. The robust noise-removing machine iterates to search for the smallest amount of noise over the control variables of SVM and MLP: for SVM, it auto-adjusts the C and gamma values, and for MLP, it auto-adjusts the number of hidden neurons and the alpha value. The program is designed to run 500 epochs with 100 trials of adjustment of the machine learning approach.

4. Results

In this study, a computer program named AE, written in Python, was used to construct the architecture of the AE. Two subroutines, named SVM and MLP, are linked to the main program AE to calculate the results. Each subroutine has two outputs: (a) a confusion matrix and (b) a thematic map. In the main program, the AE can be disabled so that the classifiers run without it, or enabled (the default) so that it is used [23,24,25].

4.1. Preparing Data for SVM and MLP

Step 1: Data for preprocessing AE.

We randomly selected 148 samples as training samples and 1170 as testing samples. Figure 5 shows the architecture of our AE model. The hidden layers may be changed; however, this will not affect the outcome as long as the network computation converges. The inputs are the B, G, R, and IR bands plus 14 pieces of ancillary information (see Table 1), for a total of 18 inputs; hence, there are 18 input neurons. The first and second encoder hidden layers have eight and five neurons, respectively. The center bottleneck has three neurons. The first and second decoder hidden layers have five and eight neurons, respectively. The final output layer has 18 neurons. In general, the numbers of neurons in the architecture have to be symmetrical.
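The 18-8-5-3-5-8-18 layout can be sketched as a plain forward pass to make the shapes concrete. This is a minimal illustration rather than the authors' program: the weights are random and untrained, the names `init_layers` and `forward` are hypothetical, and the use of tanh on all hidden layers with a linear output layer is an assumption made for the sketch.

```python
import math
import random

# Layer widths: input, two encoder layers, bottleneck, two decoder layers, output.
LAYER_SIZES = [18, 8, 5, 3, 5, 8, 18]


def init_layers(sizes, seed=0):
    """Create random (untrained) weights and zero biases for each layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def dense(x, weights, biases, activation):
    """One fully connected layer applied to a single feature vector."""
    return [activation(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def forward(x, layers, sizes=LAYER_SIZES):
    """Return (code, reconstruction) for one 18-feature pixel vector."""
    bottleneck = sizes.index(min(sizes)) - 1  # index of the layer producing the code
    last = len(layers) - 1
    code = None
    for i, (weights, biases) in enumerate(layers):
        activation = (lambda z: z) if i == last else math.tanh  # linear output
        x = dense(x, weights, biases, activation)
        if i == bottleneck:
            code = x
    return code, x


code, reconstruction = forward([0.1] * 18, init_layers(LAYER_SIZES))
```

The three-value `code` is the compressed representation of the pixel; after training, either the code or the 18-value reconstruction could be passed on to a classifier, depending on the design.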

Step 2: Discussion on Convergence.

Reviewing several articles on auto-encoders, we found that each uses a different set of parameters [10,11]. Each study proposes that a particular algorithm or parameter preset is better than another, which makes it difficult to choose between them. It is difficult to find a detailed and objective comparison of the different approaches, and the available results are sometimes contradictory. Accordingly, the choice was made by trial and error, experimenting exhaustively with different criteria until the solution that best suited each scenario was found.

The parameters selected were as follows. We adopted the Tanh activation function for the encoder layers and latent space and a linear activation for the output layer. The optimizer was Adam with a learning rate of 0.0001. The loss function was the mean square error (MSE), and training stopped when the MSE reached the minimal value of 0.02 or after 800 epochs. Figure 6 presents the number of neurons in the different layers.
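The stopping rule (a loss of 0.02 or 800 epochs, whichever comes first) can be sketched as a small training driver. This is a hedged illustration: `train_until_converged` and `make_toy_epoch` are hypothetical names, and the toy epoch function only simulates a decaying loss instead of updating a real network.

```python
def train_until_converged(train_one_epoch, loss_threshold=0.02, max_epochs=800):
    """Run epochs until the loss reaches the threshold or the budget is spent.

    train_one_epoch is any callable that performs one pass over the
    training data and returns the resulting loss.
    """
    loss = float("inf")
    for epoch in range(1, max_epochs + 1):
        loss = train_one_epoch()
        if loss <= loss_threshold:
            return epoch, loss
    return max_epochs, loss


def make_toy_epoch(start=1.0, decay=0.99):
    """Hypothetical stand-in for real training: the loss shrinks by 1% per epoch."""
    state = {"loss": start}

    def step():
        state["loss"] *= decay
        return state["loss"]

    return step


epochs_used, final_loss = train_until_converged(make_toy_epoch())
```

With a real model, `train_one_epoch` would wrap one optimizer pass and return the reconstruction loss on the training samples.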

In this model, statistical accuracy was measured with the MSE index, which gives more weight to larger absolute errors:

MSE = \sqrt{\frac{\sum_{p=1}^{M} \sum_{j=1}^{N} \left(T_{j}^{p} - Y_{j}^{p}\right)^{2}}{M N}} (1)

where T_{j}^{p} is the target output value of the jth output neuron for the pth training (test) sample; Y_{j}^{p} is the inferred output value of the jth output neuron for the pth training (test) sample; M is the number of training (testing) samples; and N is the number of neurons in the output layer.

In general, MSE is commonly used as a measure of statistical accuracy, and it tracks variation well across epochs or time-history data. Other measures, such as the MAE (mean absolute error), CV (coefficient of variation), and R-squared, can also quantify accuracy, but the RMSE (root-mean-square error, the form with the square root shown in Equation (1)) is the best performer for describing variation [21].
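Equation (1) can be evaluated directly from nested lists of targets and outputs. The sketch below follows the equation as written, including the square root, and adds the MAE for comparison; the function names and the toy values are assumptions for illustration only.

```python
import math


def equation_1_error(targets, outputs):
    """Square root of the mean squared difference over M samples and N outputs."""
    m = len(targets)       # M: number of samples
    n = len(targets[0])    # N: number of output neurons
    squared_sum = sum(
        (t - y) ** 2
        for t_row, y_row in zip(targets, outputs)
        for t, y in zip(t_row, y_row)
    )
    return math.sqrt(squared_sum / (m * n))


def mean_absolute_error(targets, outputs):
    """MAE over the same M x N grid, for comparison."""
    m = len(targets)
    n = len(targets[0])
    absolute_sum = sum(
        abs(t - y)
        for t_row, y_row in zip(targets, outputs)
        for t, y in zip(t_row, y_row)
    )
    return absolute_sum / (m * n)


# Toy data: one output out of four is off by 2, the rest are exact.
toy_targets = [[1.0, 2.0], [3.0, 4.0]]
toy_outputs = [[1.0, 2.0], [3.0, 2.0]]
error = equation_1_error(toy_targets, toy_outputs)
mae = mean_absolute_error(toy_targets, toy_outputs)
```

In this toy case the squared form yields 1.0 while the MAE yields 0.5, which shows how squaring emphasizes a single large error.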

Step 3: Considering SVM with/without AE.

In data mining, classification performance is commonly measured with a confusion matrix (also called an error matrix), a table that visualizes the performance of an algorithm on a given set of data. It is typically used to analyze supervised or unsupervised learning. Each row of the matrix presents the number of instances in an actual class, and each column presents the number of instances in a predicted class. The user accuracy (UA) and producer accuracy (PA) are used in this study. For a given class, the UA is the number of correct classifications divided by the total number of samples assigned to that class, and the PA is the proportion of the reference data that is classified correctly.
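The accuracy measures can be computed from a confusion matrix as in the following minimal sketch. It assumes that rows hold the reference (actual) classes and columns hold the predicted classes, so that PA divides by the row total and UA by the column total; the function name and the toy counts are hypothetical and are not the values of Tables 2 to 5. Cohen's kappa is included using its standard definition.

```python
def accuracy_report(matrix):
    """matrix[i][j] = number of reference-class-i samples predicted as class j."""
    k = len(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    total = sum(row_totals)
    correct = [matrix[i][i] for i in range(k)]

    overall = sum(correct) / total
    producer = [correct[i] / row_totals[i] for i in range(k)]  # 1 - omission error
    user = [correct[i] / col_totals[i] for i in range(k)]      # 1 - commission error

    # Cohen's kappa: agreement beyond what the marginals alone would produce.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1.0 - expected)
    return {"overall": overall, "producer": producer, "user": user, "kappa": kappa}


# Hypothetical three-class counts, for illustration only.
toy_matrix = [
    [45, 5, 0],
    [10, 30, 10],
    [0, 5, 45],
]
report = accuracy_report(toy_matrix)
```

Under this convention, the omission error of a class is 1 minus its PA and the commission error is 1 minus its UA.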

Table 2 presents the confusion matrix for the testing data through SVM analysis. The overall accuracy is 85.56%. The UA for driftwood, waste, and non-waste is 82.54%, 95.45%, and 84.81%, respectively. The PA for driftwood, waste, and non-waste is 90.7%, 74.6%, and 86.2%. The coastal waste has the highest UA and the lowest PA.

Table 3 presents the confusion matrix for the testing data when the auto-encoder is used for preprocessing before SVM analysis. The overall accuracy is 94.2%. The UA for driftwood, waste, and non-waste is 92.1%, 99.1%, and 93.7%, respectively, and the PA is 91.6%, 98.2%, and 94.4%, respectively. In Table 2, there is a certain degree of confusion between coastal waste and driftwood; that is, samples of the two categories are misclassified as each other. In the driftwood category, the classification results (Table 3) are effectively improved by AE preprocessing: the omission error of driftwood was reduced from 9.23% to 8.33%, and the commission error was prominently reduced from 17.46% to 7.89%. In the waste category in particular, the producer accuracy was enhanced from 74.67% to 98.21% because most of the tiny noise was removed (see Tables 2 and 3). The progress achieved by using AE in this part is significant: the coastal waste category is greatly improved in both UA and PA by applying the robust noise-removing machine. Therefore, AE is an excellent preprocessing tool for removing noise in tasks such as coastal waste identification.

Step 4: Considering MLP with/without AE.

Table 4 presents the confusion matrix for the testing data through MLP analysis. The overall accuracy is 83.93%. The UA for driftwood, waste, and non-waste is 81.22%, 97.22%, and 81.77%, respectively, and the PA is 92.08%, 70%, and 84.3%, respectively. The PA for waste is only 70%, which is rather low when MLP is used alone. Table 5 presents the confusion matrix for the testing data when the auto-encoder is used for preprocessing before MLP analysis. The overall accuracy is 95.9%. The UA for driftwood, waste, and non-waste is 81.2%, 97.2%, and 81.7%, respectively, and the PA is 94.9%, 99.6%, and 95.0%, respectively. With this classifier, the coastal waste category is greatly improved in both UA and PA by applying the robust noise-removing machine. In the waste category in particular, the producer accuracy was enhanced from 70.00% to 99.21% because most of the tiny noise was removed. The kappa value also increased from 0.75 to 0.94. In the coastal waste categories, the classification results (Table 4 vs. Table 5) were effectively improved by AE preprocessing. The omission error of driftwood was reduced from 7.92% to 7.27%, and the commission error was reduced from 18.78% to 5.05%. Thus, the omission errors decreased slightly, and the commission errors were reduced significantly.

Comparing the results with and without AE preprocessing (Table 2 vs. Table 3 and Table 4 vs. Table 5), the coastal waste classification was greatly improved. The non-waste category, which includes rock and sand, was also improved by AE preprocessing. From the confusion matrices, we found that small-sized targets in the training and test samples require AE for data preprocessing.

4.2. Thematic Maps

An image is made of “pixels”, as shown in Figure 7. When a thematic map is built with an auto-encoder model, the model can successfully clean noisy images that it has never seen before. Figure 7a,b present the thematic maps obtained with the original bands plus seven sets of texture information and seven sets of vegetation information for SVM and MLP classification, respectively. The salt-and-pepper effect is very serious in Figure 7a,b. When a classifier is applied to a high-resolution image, noise can appear in any form; single-pixel (20 × 20 cm) noise is very common, and the total counts are about 108 and 131 for SVM and MLP, respectively. Figure 7c,d present the thematic maps obtained with the same inputs when AE is used as preprocessing for SVM and MLP classification, respectively. With AE (the robust noise-removing machine), the noise counts are reduced to 39 (from 108) and 47 (from 131) for SVM and MLP, respectively. In these two figures, the driftwood is also clearer than when SVM or MLP is used alone.
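One way to count single-pixel noise in a thematic map is sketched below. The text does not state how the noise pixels were counted, so the rule used here is an assumption: a pixel is treated as salt-and-pepper noise when its class differs from all of its in-bounds 8-connected neighbours. The function name and the toy label map are hypothetical.

```python
def count_isolated_pixels(label_map):
    """Count pixels whose class differs from every in-bounds 8-neighbour."""
    rows, cols = len(label_map), len(label_map[0])
    count = 0
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                label_map[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            if neighbours and all(n != label_map[r][c] for n in neighbours):
                count += 1
    return count


# Toy 4 x 4 thematic map: class 0 background with two isolated pixels.
toy_map = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 2],
    [0, 0, 0, 0],
]
noise_pixels = count_isolated_pixels(toy_map)
```

Under this rule, a two-pixel patch of the same class is not counted, so the measure captures only isolated single pixels.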

5. Summary and Conclusions

Coastal waste has traditionally been handled through beach-cleaning activities organized by coastal government agencies. A well-developed imaging system that monitors and manages the coastal environment would save a great deal of manpower and time. Coastal waste has two major sources: (a) rock and driftwood, which are produced by dramatic climate changes upstream, and (b) trash or garbage, which is produced by human beings and consists of, e.g., plastic bottles and metal cans. The second source is influenced by the background material, which causes the salt-and-pepper effect; more specifically, the way waste appears in the image depends on the material it is made of. Some of the waste may also be distorted by sunlight reflections. Some reflection problems can be resolved with ancillary information (texture information and vegetation indices), but without image preprocessing the classification cannot reach very high accuracy. Unfortunately, this study cannot go into further detail about waste types (such as plastic bottles versus metal cans); the thematic map nevertheless presents the locations and distribution of coastal waste.

Because most of the waste on the seashore is tiny, a 20 × 20 cm high-resolution image was used for effective analysis. High-resolution image classification usually produces salt-and-pepper effects, so removing this noise becomes a crucial issue for an effective image classifier. Accordingly, an AE was used as the preprocessing model in this study. To enhance the classification accuracy, seven sets of texture information and seven sets of vegetation indices were added to the four original bands (B, G, R, and IR), giving 18 inputs in total. The AE model was constructed with 18 input neurons, with the first and second encoder hidden layers having eight and five neurons, respectively. The center bottleneck has three neurons. The first and second decoder hidden layers have five and eight neurons, respectively, and the final output layer has 18 neurons. Because the AE serves as preprocessing for different classifiers, its convergence is also important. The MSE was used as the convergence measure, and training stopped when the MSE reached 0.02 or after 800 epochs had been executed. The AE was then linked to a classifier to produce the thematic map and confusion matrix.
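The layer layout and stopping rule described above can be sketched in a few lines of plain Python. This is only an illustration of the 18-8-5-3-5-8-18 shape and of the "MSE of 0.02 or 800 epochs" rule, not the implementation used in the study: the tanh activation, the random initialization, the helper names (`init_layers`, `forward`, `should_stop`), and the sample pixel are all assumptions, and no training is performed.

```python
import math
import random

# Layer widths as described in the text: 18-8-5-3-5-8-18.
LAYER_SIZES = [18, 8, 5, 3, 5, 8, 18]
MSE_TARGET = 0.02   # convergence threshold stated in the text
MAX_EPOCHS = 800    # epoch cap stated in the text


def init_layers(sizes, seed=0):
    """Create (weights, biases) per layer; uniform random init is an assumption."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Return the activations of every layer; tanh is an assumed activation."""
    activations = [x]
    for weights, biases in layers:
        x = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
        activations.append(x)
    return activations


def mse(x, y):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def should_stop(current_mse, epoch):
    """Stop when the MSE reaches the target or the epoch budget is used up."""
    return current_mse <= MSE_TARGET or epoch >= MAX_EPOCHS


layers = init_layers(LAYER_SIZES)
rng = random.Random(1)
pixel = [rng.random() for _ in range(18)]   # hypothetical 18-feature pixel

acts = forward(layers, pixel)
code = acts[3]              # bottleneck activations (3 values)
reconstruction = acts[-1]   # reconstructed 18-feature vector
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)

print([len(a) for a in acts])   # [18, 8, 5, 3, 5, 8, 18]
print(n_params)                 # 445
```

With these widths the network has 398 weights and 47 biases, so the three-neuron bottleneck forces each 18-feature pixel through a much smaller representation before it is reconstructed.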

The study compared SVM and MLP classifiers. Without preprocessing, their overall accuracies were about 85.5% and 83.9%, respectively. With the AE applied as preprocessing, the overall accuracy increased by about 9 and 12 percentage points, to about 94.2% and 96.0%, respectively (Tables 2–5). This study found that the AE can reduce the commission errors significantly. It also reduces the omission errors, but not by as much. For instance, for the waste category with SVM, the producer accuracy improved from 74.67% to 98.21%, because most of the tiny noise was removed. In the thematic maps, the driftwood was also displayed more cleanly after applying the AE. In addition, in our study case, the number of single-pixel noise occurrences was reduced from 108 and 131 to 39 and 47, respectively. That is, about two-thirds of the salt-and-pepper effects were eliminated. Moreover, this study used a relatively small area to verify that the AE is capable of removing small noise in the image. In future studies, generative adversarial networks (GAN) or convolutional neural networks (CNN) could be applied to larger areas for better coastal management.
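The figures quoted in this paragraph can be checked with a few lines of arithmetic. The sketch below only reuses numbers stated in the text and in Tables 2–5; the variable names are illustrative.

```python
# Single-pixel noise counts quoted in the text, before and after AE preprocessing.
noise_before = [108, 131]
noise_after = [39, 47]

removed = [(b - a) / b for b, a in zip(noise_before, noise_after)]
for b, a, frac in zip(noise_before, noise_after, removed):
    print(f"{b} -> {a}: {frac:.1%} removed")
# 108 -> 39: 63.9% removed
# 131 -> 47: 64.1% removed

# Overall accuracies (%) from Tables 2-5, without and with the AE.
overall = {"SVM": (85.56, 94.19), "MLP": (83.93, 95.98)}
gains = {name: round(after - before, 2) for name, (before, after) in overall.items()}
print(gains)   # {'SVM': 8.63, 'MLP': 12.05}
```

Both noise counts drop by roughly 64%, which is the "about two-thirds" reduction reported above.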

Author Contributions

Conceptualization, S.W. and T.C.L.; methodology, S.W.; validation, S.W. and T.C.L.; formal analysis, S.W.; investigation, S.W.; writing—original draft preparation, S.W.; writing—review and editing, S.W. and T.C.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Ministry of Science and Technology (MOST 109-2121-M-275-001).

Data Availability Statement

The data were provided by the GIS center, Taiwan.

Acknowledgments

The authors express their gratitude to the Ministry of Science and Technology (MOST 109-2121-M-275-001) for sponsoring this work.

Conflicts of Interest

The authors declare no conflict of interest.

Figure 1. The driftwood in coastal Taiwan.

Figure 2. Study area.

Figure 3. Sampling by (a) GLCM (b) color image.

Figure 4. (a) Research steps; (b) Development of Robust Machine to remove image classification noise.

Figure 5. The architecture of our AE model.

Figure 6. (a) The performance of the model using the training/test dataset for SVM; (b) the performance of the model using the training/test dataset for MLP.

Figure 7. Thematic maps for different model outputs.

Table 1. Formula list of vegetation indices and texture information.

Vegetation Indices	Formula	Texture Indices	Formula
RVI	$\frac{R}{\mathrm{NIR}}$	Homogeneity	$\sum_{i=0}^{N} \sum_{j=0}^{N} \frac{1}{1 + (i - j)^{2}} C_{ij}(d, \theta)$
NDVI	$\frac{\mathrm{NIR} - R}{\mathrm{NIR} + R}$	Contrast	$\sum_{i,j} |i - j|^{2} \, p(i, j)$
PVI	$\frac{\mathrm{NIR} - \mathrm{NIR}_{Soil}}{\sqrt{1 + B^{2}}}$	Dissimilarity	$\sum_{i=0}^{n} \sum_{j=0}^{n} C_{ij} \, |i - j|$
SAVI	$(1 + L) \times \frac{\mathrm{NIR} - R}{\mathrm{NIR} + R + L}$	Entropy	$\sum_{i=0}^{n} \sum_{j=0}^{n} C_{ij} \log C_{ij}$
GI	$\frac{\mathrm{NIR}}{G}$	Variance	$\sum_{i=0}^{n} \sum_{j=0}^{n} (i - \mu)^{2} \, p(i, j)$
IPVI	$\frac{\mathrm{NIR}}{\mathrm{NIR} + R}$	Mean	$\frac{1}{n} \sum_{i=0}^{n} \sum_{j=0}^{n} P_{ij}$
TSAVI	$\frac{B (\mathrm{NIR} - \mathrm{NIR}_{Soil})}{R + B (\mathrm{NIR} - A) + X (1 + B^{2})}$	Second Moment	$\sum_{i=0}^{n} \sum_{j=0}^{n} P(i, j)^{2}$
Empirical factors: $L = 0.5$; $X = 0.08$; $Y = 0.16$; $Z = 0.35$. Soil line equation accounting for multiple scattering conditions: $R_{Soil} = A + B \times R$ ($A = 0.011$, $B = 1.16$).
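As a rough illustration of how the entries in Table 1 translate into per-pixel features, the sketch below evaluates the band-ratio indices that need no soil line (RVI, NDVI, SAVI, GI, IPVI) and four of the texture measures on a small grey-level co-occurrence matrix (GLCM). The reflectance values, the 4 × 4 window, the four grey levels, and the single horizontal offset (distance 1, angle 0°, non-symmetric counting) are all assumptions chosen for the example, and the function names are hypothetical.

```python
def vegetation_indices(nir, r, g, L=0.5):
    """Band-ratio indices as written in Table 1 (L = 0.5 from the table note)."""
    return {
        "RVI": r / nir,                               # Table 1 defines RVI as R / NIR
        "NDVI": (nir - r) / (nir + r),
        "SAVI": (1 + L) * (nir - r) / (nir + r + L),
        "GI": nir / g,
        "IPVI": nir / (nir + r),
    }


def glcm(window, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for one pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(window), len(window[0])
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                counts[window[y][x]][window[y2][x2]] += 1
    total = sum(map(sum, counts))
    return [[c / total for c in row] for row in counts]


def texture_measures(p):
    """Homogeneity, contrast, dissimilarity, and second moment of a GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "second_moment": sum(v ** 2 for i, j, v in cells),
    }


indices = vegetation_indices(nir=0.5, r=0.1, g=0.2)   # hypothetical reflectances
window = [[0, 0, 1, 1],
          [0, 0, 1, 1],
          [0, 2, 2, 2],
          [2, 2, 3, 3]]                               # hypothetical 4-level window
texture = texture_measures(glcm(window, levels=4))

for name, value in {**indices, **texture}.items():
    print(f"{name}: {value:.4f}")
```

PVI and TSAVI are left out because they depend on the soil line parameters, and entropy, variance, and mean follow the same pattern as the four texture measures shown.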

Table 2. Confusion matrix for SVM.

Real dataset \ Predicted dataset	Driftwood	Waste	Non-Waste	Producer Accuracy	Omission Error
Driftwood	364	2	35	90.77%	9.23%
Waste	8	168	49	74.67%	25.33%
Non-Waste	69	6	469	86.21%	13.79%
User Accuracy	82.54%	95.45%	84.81%	Overall accuracy	85.56%
Commission Error	17.46%	4.55%	15.19%	Kappa	0.77

Table 3. Confusion matrix for AE + SVM.

Real dataset \ Predicted dataset	Driftwood	Waste	Non-Waste	Producer Accuracy	Omission Error
Driftwood	385	2	33	91.67%	8.33%
Waste	4	219	0	98.21%	1.79%
Non-Waste	29	0	498	94.50%	5.50%
User Accuracy	92.11%	99.10%	93.79%	Overall accuracy	94.19%
Commission Error	7.89%	0.90%	6.21%	Kappa	0.91

Table 4. Confusion matrix for MLP.

Real dataset \ Predicted dataset	Driftwood	Waste	Non-Waste	Producer Accuracy	Omission Error
Driftwood	372	1	31	92.08%	7.92%
Waste	9	175	66	70.00%	30.00%
Non-Waste	77	4	435	84.30%	15.70%
User Accuracy	81.22%	97.22%	81.77%	Overall accuracy	83.93%
Commission Error	18.78%	2.78%	18.23%	Kappa	0.75

Table 5. Confusion matrix for AE + MLP.

Real dataset \ Predicted dataset	Driftwood	Waste	Non-Waste	Producer Accuracy	Omission Error
Driftwood	357	1	27	92.73%	7.27%
Waste	2	250	0	99.21%	0.79%
Non-Waste	17	0	516	96.81%	3.19%
User Accuracy	94.95%	99.60%	95.03%	Overall accuracy	95.98%
Commission Error	5.05%	0.40%	4.97%	Kappa	0.94
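The summary statistics in Tables 2–5 can be reproduced from the raw counts. The sketch below uses the standard definitions of producer accuracy (correct count divided by the row total), user accuracy (correct count divided by the column total), overall accuracy, and Cohen's kappa; the function name and dictionary layout are illustrative and are not taken from the study's own code.

```python
def confusion_metrics(matrix):
    """Rows are real (reference) classes, columns are predicted classes."""
    n = len(matrix)
    total = sum(map(sum, matrix))
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diag = [matrix[i][i] for i in range(n)]
    overall = sum(diag) / total
    # Chance agreement expected from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return {
        "producer": [d / r for d, r in zip(diag, row_sums)],   # 1 - omission error
        "user": [d / c for d, c in zip(diag, col_sums)],       # 1 - commission error
        "overall": overall,
        "kappa": (overall - expected) / (1 - expected),
    }


# Counts from Tables 2-5; class order is Driftwood, Waste, Non-Waste.
TABLES = {
    "SVM": [[364, 2, 35], [8, 168, 49], [69, 6, 469]],
    "AE + SVM": [[385, 2, 33], [4, 219, 0], [29, 0, 498]],
    "MLP": [[372, 1, 31], [9, 175, 66], [77, 4, 435]],
    "AE + MLP": [[357, 1, 27], [2, 250, 0], [17, 0, 516]],
}

results = {name: confusion_metrics(m) for name, m in TABLES.items()}
for name, res in results.items():
    print(f"{name}: overall={res['overall']:.2%}, kappa={res['kappa']:.2f}")
# SVM: overall=85.56%, kappa=0.77
# AE + SVM: overall=94.19%, kappa=0.91
# MLP: overall=83.93%, kappa=0.75
# AE + MLP: overall=95.98%, kappa=0.94
```

The waste-class producer accuracy for SVM is `results["SVM"]["producer"][1]`, which evaluates to 168/225 (74.67%) and rises to 219/223 (98.21%) with the AE.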

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
