A New Approach of Ensemble Learning Technique to Resolve the Uncertainties of Paddy Area through Image Classification

Remote sensing technology has provided a wealth of information for agriculture and has been widely used over the past few decades to monitor paddy-growing ecosystems. However, data fusion techniques carry uncertainties that must be resolved in the image classification of paddy rice. This study presents a series of learning concepts integrated through a probabilistic Fuzzy Dempster-Shafer (FDS) analysis, with the goal of combining various models and different types of image data. More specifically, the study uses the FDS to generate a series of probability models for the classification system. Logistic Regression (LR), Support Vector Machine (SVM), and Neural Network (NN) approaches are incorporated into the developed FDS system, and two image types, satellite images and aerial photos, are used as the analysis material. The overall accuracy of paddy field image classification is between 85% and 90% for multi-period mid-scale satellite images and between 91% and 95% for multi-spectral numerical aerial photos. The FDS improves on these results, raising the overall classification accuracy to 97.27% with a kappa value of 0.93.


Introduction
Paddy rice is a major food source for more than half of the world's population, mostly in regions of Asia, Africa, and Latin America. Consequently, paddy cultivation has drawn increasing attention from governments. Paddy rice is one of the major crops cultivated all over Taiwan, and the investigation of paddy areas calls for the integration of different technologies. Hence, data sources and different classifiers need to be integrated into a sound solution. One use of satellite image data is the precise management of paddy areas. Satellite image data provide the essential technology and methodology for monitoring, mapping, and observing variation in paddy areas. Their repeated acquisition intervals also allow paddy-growing areas to be interpreted under a variety of aspects [1]. Furthermore, extensive research has been dedicated to employing satellite optical data to construct GIS maps that delineate paddy field areas by means of image classification. These problems have drawn great attention from the past half-century to recent times. Classification methods have used data from Landsat TM and ETM+ series [2], SPOT series [3], MODIS [4], RADARSAT series [5], ERS-1 and ERS-2 [6], ENVISAT/ASAR [7], IRS [8], AVHRR [9], Aerial-Photo [10], UAV [11], etc. Classifiers may
1. The first step is identification, where the core issue is to understand data fusion. To handle the data sources, one should be familiar with the methods and techniques of data analysis.

2. In the second step, estimation, processing can be divided into four levels: (a) signal level; (b) pixel level; (c) feature level; and (d) symbol level. The signal and pixel levels represent the data in different ways. Pixel-level data are more intuitive for humans, but both levels lack a relationship to the mathematical model of the measured object [23]. Feature-level processing extracts features from the original data and then carries out the image fusion. Its methods fall into two mainstreams: (a) feature extraction and (b) feature selection [24]. Principal Component Analysis (PCA) is a common feature extraction method. Symbol-level processing, in turn, merges the data using statistical and logical inference methods to facilitate the subsequent mathematical modeling or data analysis. At this last stage of fusion, the data are combined with the aid of a mathematical model, and the analysis is based on statistical and logical inference. Symbol-level processing yields the result of the decision analysis. Its algorithms can be roughly divided into three categories: (a) physical model recognition algorithms (Kalman filter, maximum likelihood estimation, generalized least squares, etc.); (b) parameter classification and recognition algorithms (Bayesian estimation, Dempster-Shafer evidence theory, entropy estimation, supervised classification, and unsupervised classification); and (c) cognitive architecture models (expert systems, fuzzy sets, LIDA [25], ACT-R [26], SOAR [27], etc.).

3. In the third step, validation covers the following: (a) the uncertainty in the solution content (probability measure, false alarm rate, or classification accuracy), which includes assessing the performance of the data fusion model so that the uncertainty in the solution can be measured; (b) establishing a benchmark program to improve data fusion; and (c) processing the data and the fused information at the validation stage. The information can thereby be integrated effectively through the above process.
Remote Sens. 2020, 12, 3666
Finally, data fusion is now applied in a wide range of fields, for instance pattern recognition and radar tracking [28], robotics [29], traffic control [30], remote sensing [31], and geosciences [21]. Therefore, this research developed a series of ensemble learning concepts through transformations of information processing, classification procedures, and the concept of data uncertainty analysis. To solve the aforementioned problems, this research developed a systematic method, which can be stated as follows: (a) In the era of multiple sensors, environmental monitoring no longer depends on a single device, and satellite images and aerial photos are both suitable for monitoring landform changes. Similarly, different classifiers have their own pros and cons. We therefore defined the data from the monitoring equipment as Pixel Level information. In this study, we used "multi-period and multi-spectral satellite imagery" together with "single-period and multi-spectral numerical aerial photography" as sources of information from different sensors. (b) We used spectrum, spectral index, and texture information for the Feature Level. (c) Different from other studies, we used three different classifiers as tools to generate the Symbol Level: the statistical "Logistic Regression (LR)", the "Support Vector Machine (SVM)" from machine learning, and the "Artificial Neural Network (ANN)" from the field of artificial intelligence. Each of these three classifiers has its advantages and disadvantages, and they produce different results for this paddy area classification problem. We therefore use the concept of Dempster-Shafer (DS) theory: through an Evidential Reasoning (ER) algorithm, the three pieces of classification information are integrated into a single piece of decision-making information using the Fuzzy-DS (FDS) theory of evidence, thereby reducing the uncertainty of the classification results. The strength of this method is its ability to express what is "uncertain" and "not known" directly. It belongs to the category of artificial intelligence and was first applied to expert systems [32].
Finally, our study has six major steps: (a) data collection and pre-processing; (b) extraction and analysis of the spectral characteristics of paddy and non-paddy fields in satellite and aerial images; (c) integration of new satellite images and the spectral characteristics of numerical aerial images, subdivided into mound patch units; (d) multi-feature classification analysis of multi-scale images; (e) establishment of the FDS decision-making module to present the uncertainty outcomes of patches; and (f) confusion matrix analysis for rice field image classification.
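Step (f) relies on a confusion matrix, from which the overall accuracy and the kappa value are usually derived. The following is a minimal sketch of that computation, assuming the standard definition of Cohen's kappa; the function name and the example counts are hypothetical and are not taken from the study.

```python
def overall_accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of patches whose reference class is i
    and whose predicted class is j.
    """
    size = len(matrix)
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of patches on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total
    # Chance agreement from the row and column marginals.
    row_totals = [sum(matrix[i]) for i in range(size)]
    col_totals = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(size)) / (total * total)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts for illustration only (rows: reference, columns: predicted).
example_matrix = [[50, 10],
                  [5, 35]]
accuracy, kappa = overall_accuracy_and_kappa(example_matrix)
```

Kappa discounts the agreement expected by chance, which is why it is reported alongside the overall accuracy when the two classes are unevenly represented.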

Material, Study Area, and Research Design
The Meinong area of Kaohsiung City was selected as the study case (Figure 2). Meinong lies northeast of the geographical center of Kaohsiung City, and its terrain is mostly mountainous. It is located on the alluvial fan of the Laku-laku stream, and the whole region has a rich hydrological system: the Laku-laku stream and its tributary, the Meinong stream, run throughout the territory. The region is an important paddy-growing area in Taiwan. The images analyzed are grid data divided into satellite images and numerical aerial images. The satellite images are optical image data from Formosa II, comprising a panchromatic image (spatial resolution 2 m) and a multi-spectral image (resolution 8 m). Texture, shape, edge, and other characteristics can be used in interpreting the ground surface. The images used for the interpretation date from 1 February 2015, the paddy transplanting stage (Figure 3a), and 2 April 2015, the paddy tillering stage (Figure 3b). For the high-resolution aerial photographs, the study used the Digital Mapping Camera (DMC), which has eight lenses: four central lenses forming a high-resolution panchromatic image of 13,824 × 7680 cells, and four outer wide-angle lenses for blue, green, red, and infrared. The Institute captured a total of 33 DMC images of the first paddy phase of 2015 (Figure 3c). In addition, the ground truth consists of the 2015 paddy area interpretation results and the latest version of the cultivated-land status map by the Agriculture and Agriculture Administration (see Figure 2). The red patches are the rice paddies and the green patches are the non-paddy fields. In this study, we use the concept of cross-validation to train and verify the model. The overall data comprise 26,815 patches, of which 20,031 are non-rice.
We initially divide the samples into training and testing data of suitable size. The sample locations in the satellite data and the aerial photograph data are carefully checked for consistency. The training data total 8200 patches: 3193 rice (labeled 1) and 5007 non-rice (labeled 0). Because non-rice patches (label 0) are far more numerous, we selected enough rice samples (label 1) to meet the classification criteria; this ratio addresses the problem of unevenly sized sample classes. The testing samples total 8604, of which 3591 are rice and 5013 are non-rice. This design should be sufficient for the sampling analysis.

Methods
The research method of this study is divided into four major parts: (a) the ancillary information for the spatial database; (b) the introduction of the multiple classifiers used for our case; (c) the information processing architecture of the multiple classifiers; and (d) the patches of high uncertainty generated by the multiple classifiers. The content of the interpretation module is as follows.

Ancillary Information for Spatial Database
This study provides the characteristic information of paddy area patches for multi-phase optical remote sensing image data, which includes the original bands of the image, vegetation information, and texture information. For the paddy growth period, the study considers different conditions and uses images of different scales to extract the characteristic information of farmland growth, in order to establish a multi-scale image knowledge base of farmland paddy area features. Table 1 presents the spectral characteristic indices used to improve the classification analysis: the Ratio Vegetation Index (RVI), the Normalized Difference Vegetation Index (NDVI), the Perpendicular Vegetation Index (PVI), the Soil-adjusted Vegetation Index (SAVI), the Transformed Soil-adjusted Vegetation Index (TSAVI), the Crop Management Factor Index (CMFI), the Greenness Index (GI), the Infrared Percentage Vegetation Index (IPVI), the Modified Soil-adjusted Vegetation Index (MSAVI), the Optimized Soil-adjusted Vegetation Index (OSAVI), and the Generalized Soil-adjusted Vegetation Index (GSVI). In addition, the Gray Level Co-Occurrence Matrix (GLCM) method is used to extract image texture. Texture images contain information on spatial distribution and can therefore supply distinct information to the image classification. In some cases, the appropriate selection of texture images can increase the classification accuracy. The following texture measures are computed for each band to produce the texture images: (1) Homogeneity, (2) Contrast, (3) Dissimilarity, (4) Entropy, (5) Variance, (6) Mean, and (7) Second Moment. All the vegetation indicators and texture information are shown in Table 1.
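As a rough illustration of these features, the sketch below computes four of the listed vegetation indices from near-infrared and red reflectance, and five of the listed texture measures from a small GLCM. It assumes the commonly used textbook definitions; the exact formulas in Table 1 may differ. The function names, the soil factor of 0.5, and the symmetric one-pixel horizontal offset are illustrative assumptions rather than settings reported by the study.

```python
import math


def vegetation_indices(nir, red, soil_factor=0.5):
    """Common index definitions from NIR and red reflectance (floats in 0..1)."""
    return {
        "RVI": nir / red,
        "NDVI": (nir - red) / (nir + red),
        "IPVI": nir / (nir + red),
        # soil_factor (often written L) is an assumed value, not from the study.
        "SAVI": (1 + soil_factor) * (nir - red) / (nir + red + soil_factor),
    }


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized gray level co-occurrence matrix.

    image is a list of rows of integer gray levels in range(levels);
    (dx, dy) is a non-negative pixel offset.
    """
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows - dy):
        for c in range(cols - dx):
            a, b = image[r][c], image[r + dy][c + dx]
            counts[a][b] += 1  # count the pair in both directions
            counts[b][a] += 1
    total = sum(sum(row) for row in counts)
    return [[value / total for value in row] for row in counts]


def texture_features(p):
    """Texture measures from a normalized GLCM p."""
    cells = [(i, j, p[i][j]) for i in range(len(p)) for j in range(len(p))]
    return {
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "second_moment": sum(v * v for _, _, v in cells),
    }


indices = vegetation_indices(nir=0.5, red=0.1)
features = texture_features(glcm([[0, 0], [1, 1]], levels=2))
```

In practice each index and texture measure would be evaluated per pixel or per patch and per band; the scalar version here only shows the arithmetic.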

Logistic Regression
As part of this study, Logistic Regression was used as one of the classifiers. Logistic Regression is used extensively in the sciences and in applications such as the prediction of floods, debris flows, and landslides [33][34][35]. It predicts the probability of occurrence of an event by fitting data to a logistic curve, whereas linear regression can produce predicted values outside the range from 0 to 1, which are theoretically inadmissible as probabilities. In general, Logistic Regression predicts a discrete outcome, so the decision results are reported as integers. The outcome is dichotomous, such as success/failure or occurrence/non-occurrence [36]. In statistics, a binary logistic model has a dependent variable with two values, such as pass or fail, represented by an indicator variable. In this study, the two values labeled "0" and "1" represent non-paddy and paddy fields, respectively. The independent variables can each be a binary variable (two classes, coded by an indicator variable) or a continuous variable (any real value). The corresponding probability varies between 0 and 1. A sigmoid function, $P = 1/(1 + e^{-z})$, where $z$ is a linear combination of the independent variables, is adopted in this study. The Multinomial-polynomial model was employed to construct our classifier, with rice in the training sample adopted as the base category for the classification. In addition, the Pearson chi-square statistic is used to test the independence between variables. The maximum number of iterations is set to 200 to facilitate the rapid convergence of the problem and achieve our classification purpose.
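To make the mapping from features to a category and a probability concrete, here is a minimal sketch of a logistic prediction. It uses the coding of the training samples (1 for rice, 0 for non-rice) and assumes a decision threshold of 0.5; the weights, intercept, and feature values are hypothetical and do not come from the fitted model in the study.

```python
import math


def sigmoid(z):
    """Logistic curve: maps any real z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_paddy(features, weights, intercept, threshold=0.5):
    """Return (category, probability) for one patch.

    The category is 1 (paddy) when the probability reaches the threshold,
    otherwise 0 (non-paddy). The threshold of 0.5 is an assumption.
    """
    z = intercept + sum(w * x for w, x in zip(weights, features))
    probability = sigmoid(z)
    category = 1 if probability >= threshold else 0
    return category, probability


# Hypothetical coefficients for two features (for example an index and a texture value).
category, probability = predict_paddy(
    features=[0.7, 0.2], weights=[4.0, -1.0], intercept=-2.0
)
```

The pair returned here has the same shape as the (category, probability) pairs that the later fusion step works with.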

Support Vector Machine
The Support Vector Machine (SVM) is a popular, well-accepted supervised learning method that offers solutions for both classification and regression problems. Whereas the SVM classifier supports binary and multiclass classification, the structured SVM allows a classifier to be trained for general structured output labels. Many hyperplanes may be able to separate the data. A rational choice for the best hyperplane is the one that provides the largest separation, or margin, between the two classes, that is, the one whose distance to the nearest data point on each side is maximized. A special feature of these classifiers is that they minimize the empirical classification error and maximize the geometric margin simultaneously [37]. The core of the support vector machine is the selection of an appropriate kernel function, which takes the data as input and converts them into the required form. This is needed because some data cannot be separated linearly in the original space; after a nonlinear projection, they can be easier to separate in a higher-dimensional space. Common kernels are the linear, polynomial, radial basis function (RBF), and sigmoid functions. This study employed the polynomial kernel to produce the classification results, with the bias value set to 0.
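The idea that a kernel stands in for a projection into a higher-dimensional space can be checked numerically. The sketch below assumes a polynomial kernel of degree 2 with the bias (constant term) set to 0; the degree is an assumption, since the study does not report it, and the two sample vectors are arbitrary. For two-dimensional inputs this kernel equals an ordinary dot product after mapping each vector to three quadratic features.

```python
import math


def polynomial_kernel(x, y, degree=2, coef0=0.0):
    """Polynomial kernel (x . y + coef0) ** degree; coef0 = 0 mirrors the zero bias."""
    return (sum(a * b for a, b in zip(x, y)) + coef0) ** degree


def degree2_feature_map(x):
    """Explicit feature map whose dot product equals the degree-2 kernel with coef0 = 0."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


u, v = (1.0, 2.0), (3.0, -1.0)
kernel_value = polynomial_kernel(u, v)
explicit_value = sum(
    a * b for a, b in zip(degree2_feature_map(u), degree2_feature_map(v))
)
```

Both values agree, which is why the classifier never needs to build the higher-dimensional features explicitly.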

Artificial Intelligence
Neural networks are an information processing method inspired by the way biological neural systems process data. Neural Networks (NN) were first proposed in the early 1940s as an attempt to simulate the cognitive learning processes of the human brain [38]. Their primary function is to develop models of problems through trial and error or learning procedures. Over the past twenty years, the Back Propagation Neural Network has been applied extensively in many fields. Based on the neuron concept, the relationship between massive data and a certain phenomenon is obtained through a learning system instead of explicit calculation. Engineers and researchers have found that describing the variables for classifying remote sensing imagery is a tough task. If a paddy area spatial database is well developed to describe the input variables and output categories rationally, a Back Propagation Neural Network may be a more suitable learning machine [39]. In essence, a neural network is composed of many nodes arranged in three types of layers: input layers, hidden layers, and output layers. In this study, we designed and implemented a total of 22 different neurons: 2-, 4-, and 6-input biased and non-biased neurons, each with three different activation functions. Moreover, the tool offers six training methods: quick, dynamic, multiple, prune, radial basis function network (RBFN), and exhaustive prune. The study uses the quick method to construct the model from the sample data; this method uses rules of thumb and data characteristics to select an appropriate network topology. Finally, the activation function we selected for the output module is the Logarithmic-Sigmoid (LogSig).
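A forward pass through such a layered network can be sketched in a few lines. This is only an illustration of the layer structure with a LogSig output: the single hidden layer, its LogSig activation, and every weight below are assumptions, not the topology selected by the modeling tool.

```python
import math


def logsig(x):
    """Logarithmic-sigmoid activation, with output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One hidden layer followed by a single LogSig output neuron."""
    hidden = [
        logsig(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return logsig(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical network with 2 inputs and 2 hidden neurons.
paddy_probability = forward(
    inputs=[0.6, 0.3],
    hidden_weights=[[1.5, -0.5], [-1.0, 2.0]],
    hidden_biases=[0.1, -0.2],
    output_weights=[2.0, -1.0],
    output_bias=0.0,
)
```

Because the output passes through LogSig, it can be read as a probability-like score between 0 and 1, which is the form the later fusion step expects.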

Information Processing Architecture of Multiple Classifiers
This study has two types of data sources and three classification methods. The attributes of the data include (a) the R, G, B, and NIR bands, (b) vegetation indicators, and (c) texture information from satellite imagery and aerial photos. There will therefore be six outcomes per patch. The two types of data enter the multi-classification procedure (see Figure 4), and the interpretation results are placed on the same scale basis in the GIS database, giving a preliminary determination of the patch category (Figure 4). Every patch thus carries two kinds of information: the "probability information" given by the classifier and the final generalized "category information". Past research has often used the category information as the classification result but has usually ignored the probability information given by the classifier, which contains some uncertainties. Because the large number of patches must be screened automatically, we first used the category information for problem-checking. If, for the same patch, all data sources and classifiers yield the same category, in this case paddy area (result 1) or non-paddy area (result 0), we consider the patch to be of high information certainty. By contrast, the most difficult patches for decision-making are those for which the classification procedures contradict one another; such a patch is regarded as being of high information uncertainty. The undetermined category requires further processing, which is discussed in the next part. The analytic tool in this study is IBM SPSS Modeler, which can produce the different classification outputs; its graphical user interface (GUI) has been applied widely to different problems.
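The category-based screening described above amounts to checking whether the six outcomes of a patch agree. A minimal sketch follows; the function name, the return labels, and the ordering of the six outcomes are illustrative assumptions.

```python
def screen_patch(categories):
    """Screen one patch from its six category outcomes (0 or 1).

    Assumed order: LR, SVM, ANN on the satellite image, then LR, SVM, ANN
    on the aerial photo. Unanimous patches are treated as certain; any
    disagreement marks the patch as uncertain and leaves it for the
    decision-making module.
    """
    if all(c == 1 for c in categories):
        return "paddy, high certainty"
    if all(c == 0 for c in categories):
        return "non-paddy, high certainty"
    return "high uncertainty"


# Hypothetical patches for illustration.
unanimous_paddy = screen_patch([1, 1, 1, 1, 1, 1])
unanimous_non_paddy = screen_patch([0, 0, 0, 0, 0, 0])
conflicting = screen_patch([0, 0, 0, 1, 1, 0])
```

Only the patches in the last group need the probability information, which keeps the more expensive fusion step limited to a small share of the data.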

Decision-Making Interpretation Module for Uncertain Patches
This study has developed a decision-making integrator for the high-uncertainty data generated by the above analysis. It is based on the Dempster-Shafer (DS) Evidence Theory, which was first proposed by Dempster in 1967 and further developed by his student Shafer in 1976 into a theory of inexact reasoning. DS Evidence Theory belongs to the category of artificial intelligence and was first applied to expert systems, with the ability to deal with uncertain information [32]. As an uncertain reasoning method, its main characteristic is that it requires weaker conditions than Bayesian probability theory. Many techniques have been refined and developed to implement the DS theory, one of which is the Evidential Reasoning (ER) algorithm. Because DS theory combines multiple pieces of uncertain evidence, the set of hypotheses is gradually narrowed down as the evidence accumulates, which leads to more precise reasoning results. The accumulation of evidence requires a method or rule to calculate the degree of influence of multiple pieces of evidence on the hypotheses for a specific problem. If the pieces of information or evidence are incomplete, a degree of trust (belief) can be calculated as a joint function for statistical analysis.
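For readers unfamiliar with evidence combination, the sketch below shows the classic Dempster rule of combination for two sources over the frame {paddy, non-paddy}. It is offered as background only: the study implements its fusion through the ER algorithm and the fuzzy extension, which this sketch does not reproduce, and the mass values are hypothetical.

```python
from itertools import product

PADDY = frozenset({"paddy"})
NON_PADDY = frozenset({"non-paddy"})
EITHER = PADDY | NON_PADDY  # mass placed here expresses "not known"


def combine(m1, m2):
    """Dempster's rule: multiply masses, keep non-empty intersections, renormalize."""
    combined = {}
    conflict = 0.0
    for (set_a, mass_a), (set_b, mass_b) in product(m1.items(), m2.items()):
        intersection = set_a & set_b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b  # evidence that contradicts itself
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {subset: mass / (1.0 - conflict) for subset, mass in combined.items()}


# Hypothetical mass assignments from two sources.
source_1 = {PADDY: 0.6, NON_PADDY: 0.1, EITHER: 0.3}
source_2 = {PADDY: 0.5, NON_PADDY: 0.2, EITHER: 0.3}
fused = combine(source_1, source_2)
```

Unlike a Bayesian prior, the mass on EITHER lets a source state ignorance directly, which is the weaker condition the text refers to.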

The study first extracts the classification value of each algorithm (Table 2). In Table 2, the solid-line frame marks the "category information", and the second value is the "probability information" used to distinguish the problem. For instance, patch 1 has the value (0, 0.00) from the Logistic method based on the satellite image: 0 is the category information and 0.00 is the probability information. The total number of patches in Table 2 is 26,815. Because there are too many items to list, we divided them into three categories for illustration; the patch numbers within these categories are not consecutive. These three categories, from top to bottom, are the so-called patches with high or low certainty. The patches classified as non-paddy with high certainty number 18,731; their results are all marked as (0, 0, 0, 0, 0, 0). The patches classified as paddy with high certainty number 5596; they are all marked as (1, 1, 1, 1, 1, 1). The remaining category in the table consists of uncertain classifications, which number 2488 patches. To explain the patches with high uncertainty, we take patch number 42,643 as an example (Figure 5). For this patch, the satellite image classification results of LR, SVM, and ANN are (0, 0.132), (0, 0.102), and (0, 0.072), respectively, while the aerial imagery classification results of LR, SVM, and ANN are (1, 0.594), (1, 0.548), and (0, 0.329), respectively. Such differences may stem from the different image resolutions or from changes caused by complex farming behavior, or they may reflect classification errors. In fact, it is very difficult to analyze the causes of such differences one by one. However, our developed FDS system can integrate them rationally.
For instance, for patch 42,643, this study applied a Gaussian blur function to the probability values in the dataset (Figure 5a); that is, the probability value produced by each algorithm is blurred (fuzzified). The fuzzified probability values generated by the various methods were then integrated into a joint probability for de-fuzzification (Figure 5b), in which the y-axis is the normalized fuzzy rate value and the x-axis is the information intensity from evidence theory. The x-axis ranges from 0 to 1, where 0 is the strongest negative intensity, 1 is the strongest positive intensity, and 0.5 indicates the highest degree of information uncertainty. The de-fuzzification of the fuzzy values of the different methods across the different scales is shown in Figure 5c. For each point in the dataset, we can determine an appropriate alpha-cut value (the red dotted line on the y-axis in Figure 5c) as the basis for de-fuzzification. De-fuzzification yields the minimum, average, and maximum informative levels, which give the information intensity in the theory of evidence. These are then combined across the different algorithms, and the most probable category is determined by the voting results. In this study, the alpha-cut (α) value was set to 0.9 as the threshold for decision categories.
The reason for using the DS theory is to combine the fuzzy-based probabilities of the characteristic values and summarize them into the final classification attribute. Because the pattern is already a Gaussian function, it has a standardized shape, and the minimum (X1), mean (X3), and maximum (X5) values are taken where the pattern crosses the 0.9 threshold on the y-axis. One can thus measure where this fuzzy integrated information lies and judge which class it should be allocated to. Taking 0.5 as the centerline, if the overall information is shifted to the left, the patch is judged to be a non-paddy field; if it is shifted to the right, the patch is judged to be a rice paddy. Very few (almost no) examples fall exactly at 0.5. With the 0.5 position at the center of the graph (green dotted line), a category can be determined: the value 0.5 is the decision threshold. A patch that falls on the left side takes the value 0, and a patch that falls on the right side takes the value 1. In this way, all the information can be synthesized into an integrated result. The decision rule given by Equations (1) to (3) [40] is:

$$\text{class} = \begin{cases} 0 \ (\text{non-paddy area}), & \text{if } Y_1 > Y_2 \\ 1 \ (\text{paddy area}), & \text{if } Y_1 < Y_2 \end{cases}$$
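For readers unfamiliar with evidence theory, the generic Dempster rule of combination on a two-class frame (paddy, non-paddy) can be sketched as below. This is the textbook rule, not a reconstruction of the paper's Equations (1) to (3). The mapping from a classifier probability to a mass function, the `reliability` discount, and all function names are assumptions introduced purely for illustration.

```python
from functools import reduce

def to_mass(p_paddy, reliability=0.8):
    """Hypothetical mapping from a classifier probability to a mass function.
    The 'unknown' mass represents ignorance (the whole frame)."""
    return {
        "paddy": reliability * p_paddy,
        "non_paddy": reliability * (1.0 - p_paddy),
        "unknown": 1.0 - reliability,
    }

def dempster_combine(m1, m2):
    """Dempster's rule of combination for the frame {paddy, non_paddy}."""
    conflict = m1["paddy"] * m2["non_paddy"] + m1["non_paddy"] * m2["paddy"]
    if conflict >= 1.0:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    norm = 1.0 - conflict
    return {
        "paddy": (m1["paddy"] * m2["paddy"]
                  + m1["paddy"] * m2["unknown"]
                  + m1["unknown"] * m2["paddy"]) / norm,
        "non_paddy": (m1["non_paddy"] * m2["non_paddy"]
                      + m1["non_paddy"] * m2["unknown"]
                      + m1["unknown"] * m2["non_paddy"]) / norm,
        "unknown": (m1["unknown"] * m2["unknown"]) / norm,
    }

def fuse(probabilities, reliability=0.8):
    """Combine several classifier probabilities into one mass function."""
    masses = [to_mass(p, reliability) for p in probabilities]
    return reduce(dempster_combine, masses)

def decide(mass):
    """Class 0 (non-paddy) if the non-paddy mass dominates, else class 1 (paddy)."""
    return 0 if mass["non_paddy"] > mass["paddy"] else 1

fused = fuse([0.9, 0.8, 0.7])
print(fused, decide(fused))
```

Treating the fused non-paddy and paddy masses as the two quantities compared in the decision rule is itself an assumption; the surviving text does not define them.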

Results and Discussion
The results of this study consist of three major parts: (1) analysis of the spectral characteristics extracted for paddy and non-paddy fields; (2) integration of the spectral characteristics of satellite imagery and numerical aerial images for the knowledge-based unit of the paddy area; and (3) multi-scale image classification analysis. A decision-making interpretation module is then established to handle the highly uncertain information in the classification of paddy field images. The results are summarized as follows:

The Image Feature Analysis
This study uses the ancillary information of multi-period images, which contains the original bands of the image, vegetation information, and texture information. Images of different scales are selected to extract the growth characteristics of the paddy area under different conditions. The Regional Object Classification (ROC) technique [41] transfers the image information from the pixel scale to the regional-scale operating unit and establishes the paddy area information. A purpose-built program displays the information required for each patch.

Integration of the Spectral Characteristics of Satellite Imageries and Aerial Photos
The material includes: (a) the R, G, B and NIR bands; (b) vegetation indicators; and (c) texture information from satellite imagery and aerial photos. The study adopted the mean value of these attribute data. Paddy areas are classified by patch detection. An image segmentation technique separates the image into different regions (patches), and the attribute values within each region are summed and averaged to form the material for patch detection. A knowledge database of paddy area patches is established and organized so that the FDS module can reduce misjudgments. The inconsistencies between the two image types and the three classifiers are analyzed by the probability process.

Table 3 lists the classification results of the Logistic Regression, Support Vector Machine, and Neural Network models for the training and verification sample areas. For satellite imagery (original band + vegetation index + texture information), the overall accuracy of the training classification is between 94-96% with a Kappa of 0.86-0.91, and the overall accuracy of the verification classification is between 94-95% with a Kappa of 0.84-0.87. For aerial imagery (original band + vegetation index + texture information), the overall accuracy of the training classification is between 95-96% with a Kappa of 0.89-0.91, and the overall accuracy of the verification classification is between 95-96% with a Kappa of 0.88-0.90. Overall, the spatial resolution of the satellite images is coarser than that of the aerial photos, yet both give good results. However, these classification outcomes fall into two states: results with a high degree of certainty and results with uncertainty. We do not further process the high-certainty patches, although they may include some negligible errors.
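The per-patch averaging described above can be illustrated with a short sketch. It assumes a segmentation has already produced an integer patch label for every pixel; the array layout and the function name `patch_means` are hypothetical and not taken from the paper.

```python
import numpy as np

def patch_means(features, patch_ids):
    """Average each feature channel over the pixels of every patch.

    features:  (H, W, C) array, e.g. bands, vegetation index, texture.
    patch_ids: (H, W) integer array of patch labels in 0..n_patches-1.
    Returns an (n_patches, C) array of per-patch mean values.
    """
    flat_ids = patch_ids.ravel()
    n_patches = int(flat_ids.max()) + 1
    flat = features.reshape(-1, features.shape[-1]).astype(float)

    # Pixel count per patch, then per-channel sums via weighted bincount.
    counts = np.bincount(flat_ids, minlength=n_patches)
    sums = np.stack(
        [np.bincount(flat_ids, weights=flat[:, c], minlength=n_patches)
         for c in range(flat.shape[1])],
        axis=1,
    )
    # Guard against empty labels to avoid division by zero.
    return sums / np.maximum(counts, 1)[:, None]

# Tiny example: a 2x2 image with two channels and two patches.
features = np.array([[[1.0, 0.0], [3.0, 2.0]],
                     [[5.0, 4.0], [7.0, 4.0]]])
patch_ids = np.array([[0, 0],
                      [1, 1]])
means = patch_means(features, patch_ids)
print(means)
```

Each row of the result is the attribute vector of one patch, which is the unit passed to the classifiers instead of individual pixels.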
The rest of the samples are handled through the Gaussian blur and the probability value, using the concept of a fuzzy set to obtain the intersection of the fuzzy probability values. Figure 6a shows the patches classified inconsistently from the satellite image, of which there are 1107. Figure 6b shows the 683 patches classified inconsistently from the aerial image. Figure 6c combines the six classification results (satellite image and aerial image): 2488 patches have six classification results that are not consistent. They comprise (1) patches whose satellite image classification results were consistent but whose aerial image classification results were inconsistent (575 patches); (2) patches whose aerial image classification results were consistent but whose satellite image classification results were inconsistent (999 patches); and (3) patches for which both the satellite image and the aerial image classification results were inconsistent (914 patches). These figures show that images of different resolutions differ in their ability to interpret paddy areas (Figure 6a,b). The classification uncertainty of satellite imagery is clearly higher than that of aerial photos. In addition to the lower resolution, interference from clouds and fog may affect the classification process. Although they carry fewer uncertainties, aerial photos are not perfect either. This is largely because the photos capture a single moment (only one realization) in the different periods of paddy rice growth. This is the first piece of evidence. The second piece of evidence is that the regions of error occur randomly; no single model accounts for the majority of the errors, as a comparison of Figure 6a,b shows. Under the same conditions, different classification methods yield different classification results.
Although past studies have tended to ignore the impact of these problems, it is important to effectively address these problems.
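The grouping of patches by agreement among the six results can be sketched as a small helper. The function name and category strings are hypothetical. The extra "sources disagree" case (each image type internally consistent, but the two disagreeing with each other) is an assumption added for completeness; the text does not say how such patches are counted.

```python
def agreement_category(satellite, aerial):
    """Categorize a patch by the agreement of its six class labels.

    satellite, aerial: sequences of three 0/1 labels (LR, SVM, NN).
    """
    satellite_consistent = len(set(satellite)) == 1
    aerial_consistent = len(set(aerial)) == 1

    if satellite_consistent and aerial_consistent:
        if satellite[0] == aerial[0]:
            return "high certainty"
        return "sources disagree"  # assumed extra case, see lead-in
    if satellite_consistent:
        return "aerial inconsistent"
    if aerial_consistent:
        return "satellite inconsistent"
    return "both inconsistent"

# Patch 42,643: satellite labels (0, 0, 0), aerial labels (1, 1, 0).
print(agreement_category((0, 0, 0), (1, 1, 0)))
```

Only patches outside the "high certainty" group would be passed on to the fuzzy and evidence-theory steps.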

Establishment of the Results of the Decision-Making Interpretation Module
This study develops a decision-making mechanism for the uncertain paddy area data generated by the above analysis, and extracts the classification rate value of each algorithm that produces high information uncertainty. Based on this probability value, a Gaussian blur is generated for the ratio of each algorithm. The maximum-minimum multiplier operation is then performed to obtain the preliminary fuzzy outputs. The study also integrates the classification signals between the scales, using the probability distribution value of the evidence theory (DS) as a reference. Finally, this decision-making process determines the categories of unknown patches, which can effectively enhance the overall accuracy.
The study also compared the above classification results with the ground truth over their overlapping range in the study area (see Table 2). There are six classification results (satellite image and aerial image). The consistent patches comprise 5596 patches classified as paddy area and 18,731 patches classified as non-paddy area, 90.72% of the total (Figure 7a and Table 4). Among these, 169 paddy area samples were still misjudged as non-paddy area and 57 non-paddy area samples were misjudged as paddy area, which accounts for 0.84% of the total samples. The patches for which the six classification results (satellite imagery and aerial imagery) were inconsistent were processed by the FDS model. It amended 884 paddy area patches and 1138 non-paddy area patches, which account for 7.39% of the total samples (Figure 7b and Table 4). However, FDS misjudged 232 rice paddy patches and 274 non-paddy area patches, which account for 1.89% of the total samples (Figure 7c and Table 4).

Table 5 gives the overall accuracy and classification results of the Fuzzy-DS. Of the patches whose ground truth is paddy area, 6383 were determined to be paddy area and 401 were determined to be non-paddy area. Of the patches whose ground truth is non-paddy area, 330 were determined to be paddy area and 19,701 were determined to be non-paddy area. The user accuracy of the paddy area was 94.09% and the user accuracy of the non-paddy area was 98.35%. The producer accuracy of the paddy area was 95.08% and the producer accuracy of the non-paddy area was 98.01%. The overall accuracy is 97.27%, and the kappa value is enhanced to 0.93. The classification results are shown in Figure 8.

Furthermore, the advantages of the FDS method are presented in Figure 9. Figure 9a shows a comparison of three classifiers using two data sources in the empirical area.
From Figure 9a, it is clear that the same method can produce different outcomes with different data sources. Likewise, when the same data source is classified using different classification methods, the outcomes will not be the same. This is because each classifier has its own characteristics, and these characteristics determine how well it copes with complex classification types. No classifier performs perfectly in image classification. We further discuss how the FDS method improves the highly uncertain patches. Because the full set of patches is too large to show, we extract only a small range of data from Figure 9a (black frame), which contains a total of 1378 patches. We randomly selected five patches with inconsistent classifications (blue frame), numbered 13,227, 13,240, 13,281, 13,336, and 13,795. For instance, for sample 13,227 the satellite image classification is rice, while the aerial image classification is not rice. The ground truth is not rice, and the final classification through FDS is not rice, so the evidence is clear. In another example, for sample 13,795 the satellite image classification is not rice, but the aerial image classification is rice. The ground truth is rice, and the final classification through FDS is rice, so again the evidence is clear. However, this tool is not perfect: omission errors and commission errors still occur. Most of them occur when the interpretation probabilities of rice and non-rice on the two image data are very close.
This situation happens occasionally. We summarize several possible reasons, such as the interference of clouds or fog with image quality. It may also be produced by the complexity of ground crops, or by multiple crops growing in a single patch. In addition, the diverse characteristics of the classifiers are another important factor that affects the results. These reasons have been discussed above and may be addressed in the future. Finally, we have compiled these errors in Table 6. In total, the misjudgments and missed judgments amount to 731 patches, about 3% of the entire data set. This level of error is acceptable in practice and is adequate for plotting a thematic map. This research has therefore confirmed the concept. To sum up, these uncertainties have rarely been discussed in past research, but the FDS system presented here can integrate the uncertainties of these factors and at the same time obtain higher-precision classification results. The contribution of this research is to integrate data resources and various classifiers and combine them in a heterogeneous FDS system. The advantage of FDS is that commission errors and omission errors can be fixed through the probability values of the judgments. However, a sample may have low uncertainty in the initial stage and still be wrong: for example, all the classifiers and data sources identify a sample as a rice patch, but the ground truth shows that it is not, or vice versa. Such a wrong judgment cannot be fixed by the FDS approach. These are the cons of the FDS model.
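The accuracy measures quoted from Table 5 above can be recomputed from the four reported counts with a short sketch. The matrix orientation (rows as ground truth, columns as the FDS result) and the variable names are assumptions. The row-wise and column-wise rates are labelled neutrally here, because which of them is called user's or producer's accuracy depends on how the table is oriented.

```python
import numpy as np

# Counts reported from Table 5; rows = ground truth, columns = FDS result
# (assumed orientation). Order of classes: paddy, non-paddy.
confusion = np.array([[6383, 401],
                      [330, 19701]], dtype=float)

total = confusion.sum()
correct = np.diag(confusion)

# Share of each ground-truth class that was classified correctly.
rate_by_truth = correct / confusion.sum(axis=1)
# Share of each predicted class that is actually correct.
rate_by_prediction = correct / confusion.sum(axis=0)

overall_accuracy = correct.sum() / total

# Cohen's kappa: agreement beyond what the marginals would give by chance.
expected = (confusion.sum(axis=1) * confusion.sum(axis=0)).sum() / total**2
kappa = (overall_accuracy - expected) / (1.0 - expected)

print(np.round(rate_by_truth * 100, 2))
print(np.round(rate_by_prediction * 100, 2))
print(round(overall_accuracy * 100, 2), round(kappa, 2))
```

With these counts the overall accuracy and kappa agree with the reported 97.27% and 0.93.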

Summary and Conclusions
Data fusion technology generally employs artificial intelligence and machine learning with a spatial network to develop upgraded simulation models. Our study builds a probability-based system that employs LR, SVM, and NN to resolve a complicated paddy rice determination problem. Past studies have rarely tackled classification uncertainties. This study offers an effective way to handle the uncertainty in patch detection. Drawing on the concept of Esteban et al. [14], we integrated supervised learning to propose a new concept for the paddy area thematic map. The greatest feature of this study is not image fusion but information integration. Approaches from statistical analysis, machine learning, and artificial intelligence can be rationally combined to produce better classification outcomes. Therefore, the study integrates the concept of ensemble learning to obtain a thematic map of the patches from image data of multiple resources and scales. The Fuzzy-DS incorporates the different evidence from different data resources, with an adjustable α-cut value, to obtain better classification outcomes.
In our classification outcomes, the accuracy of a single classifier on the satellite image was between 94-95% (Kappa: 0.85-0.89), and the accuracy of aerial image classification was between 95-96% (Kappa: 0.89-0.90). Through the FDS process, an integrated classification result was obtained in which the overall classification accuracy increased to 97.27% and the kappa value to 0.93. Of the 2488 inconsistent patches, 79.88% were amended to the correct category. A further 226 patches could not be amended, and 506 patches retained their original errors. These 732 patches (226 + 506) are the commission and omission errors, which make up only 2.73% of the entire dataset.