Suitability Calculation for Red Spicy Pepper Cultivation ( Capsicum annum L.) Using Hybrid GIS-Based Multicriteria Analysis

: Red spicy pepper is traditionally considered as the fundamental ingredient for multiple authentic products of Eastern Croatia. The objectives of this study were to: (1) evaluate the optimal interpolation method necessary for modeling of criteria layers; (2) calculate the sustainability and vulnerability of red spicy pepper cultivation using hybrid Geographic Information System (GIS)-based multicriteria analysis with the analytical hierarchy process (AHP) method; (3) determine the suitability classes for red spicy pepper cultivation using K-means unsupervised classification. The inverse distance weighted interpolation method was selected as optimal as it produced higher accuracies than ordinary kriging and natural neighbour. Sustainability and vulnerability represented the positive and negative influences on red spicy pepper production. These values served as the input in the K-means unsupervised classification of four classes. Classes were ranked by the average of mean class sustainability and vulnerability values. Top two ranked classes, highest suitability and moderate-high suitability, produced suitability values of 3.618 and 3.477 out of a possible 4.000, respectively. These classes were considered as the most suitable for red spicy pepper cultivation, covering an area of 2167.5 ha (6.9% of the total study area). A suitability map for red spicy pepper cultivation was created as a basis for the establishment of red spicy pepper plantations.


Introduction
Land use planning and ecological land evaluation are considered the most important tools and factors of sustainable agricultural production [1]. Investments in agriculture imply knowledge of the production potential of a particular site, with regards to crop types and optimal agroecological conditions [2]. Multicriteria analysis provides information on the optimal utilization of available resources, allowing sustainable crop management [3]. In addition to crop requirements, it is necessary to have spatial information of the farmland soil, climate and land cover properties [4,5]. The upgrade of such analysis with a spatial component of GIS allows effective planning and its realization in the field [6,7]. The GIS-based multicriteria analysis was applied successfully for crop management down to the county or municipality area [8,9]. Among the multicriteria analysis methods, AHP has become increasingly popular since it allows the integration of a large quantity of heterogeneous data, making the process of criteria weight determination straightforward, even for a large number of criteria [10]. AHP was widely used for suitability analyses in agriculture, including land-use suitability [11][12][13], irrigation suitability [14,15], resource evaluation [16] and organic agriculture efficiency [17].

Hybrid GIS-Based Multicriteria Analysis
The proposed methodology for red spicy pepper suitability calculation contains four major steps: pre-processing, sustainability calculation, vulnerability calculation and suitability calculation ( Figure 2). Implementation of the methodology was conducted using open-source software SAGA GIS v7. 3.0 (Hamburg, Germany) for data processing, interpolation and calculation, alongside QGIS v3.8.3 (Grüt, Switzerland) for data visualization. The proposed methodology integrates raster-based multicriteria analysis, so all input vector layers were rasterized. Coordinate reference system (CRS) for analysis was the Croatian Terrestrial Reference System (HTRS96/TM, EPSG: 3765). Layers which initially had different CRS were reprojected to HTRS96/TM and resampled to 50 m spatial resolution.

Input data
Input criteria layers for red spicy pepper cultivation were collected and classified into sustainability factors, vulnerability factors and constraints. Sustainability and vulnerability factors continuously indicated an impact on sustainability and vulnerability, respectively [30]. Constraints contained Boolean data which indicated a strict possibility or inability of red spicy pepper cultivation [31]. Selected criteria for red spicy pepper cultivation are shown in Table 1. A total of 17 criteria were selected, with seven sustainability factors, five vulnerability factors and five constraints.
Sustainability factors primarily indicate soil potential for sustainable red spicy pepper cultivation. Fertility represented overall production potential based on soil type, derived from a modified Basic Soil Map of Croatia [32]. Proper soil drainage prevents precipitation stagnation near stalks, which leads to deterioration of the roots [33]. Soil organic content influences soil structure and diversity of soil organisms [34]. Two of the most important soil nutrients for red spicy pepper are potassium oxide (K2O) and phosphorus pentoxide (P2O5). A sufficient amount of P2O5 enables the development of the roots and flowering. Plant growth is conditioned by the K2O amount, as well as the formation of the optimal size of fruits. Neutral and mildly acidic soils allow optimal absorption of soil nutrients for red spicy pepper. Terrain slope was selected as the last sustainability factor.
Vulnerability factors consolidated diverse factors with potentially unfavourable effects on red spicy pepper production. The unused construction area is considered vulnerable due to the potential construction of residential and industrial structures. Forests under private ownership are classified into two classes, first having high wood potential and second with dominant natural vegetation. Forests with high wood potential are considered more vulnerable than forests with dominant natural vegetation since they represent a raw material for carpentry. Lower categories of water wells protection denote only organic farming and restricted irrigation. Agricultural plots on archaeological localities can potentially be repurposed to amenities in the interest of cultural tourism.
Constraints represent areas unavailable for red spicy pepper cultivation (built-up area, water bodies and wetland area) and areas prohibited by the government (forests and first category water well protection area). The most significant constrained area is Kopački rit, a wetland area designated as a nature park. The selected criteria are dominantly quantitative, except for the qualitative fertility and drainage criteria. The interpolation of soil features was conducted as a mandatory operation in the pre-processing of the four criteria layers that are sampled in the field: humus, K2O, P2O5 and pH. Due to the multiple possible options for interpolation regarding the selection of the interpolation method and its parameters, an evaluation of the interpolation methods was conducted. The selection of the optimal interpolation method was proven to have an impact on the suitability result, as the performance of interpolation methods depends on the characteristics of the input samples [35,36]. Tested interpolation methods were ordinary kriging (OK), inverse distance weighted (IDW) and natural neighbour (NN). Selected methods are mutually complementary considering the input samples' properties. OK belongs to geostatistical methods and performs best when samples possess normality, stationarity and regular sample distribution [37,38]. IDW and NN are deterministic methods that are considered more practical when samples do not possess a normal distribution and are irregularly spaced [39,40].
The interpolation accuracy test was calculated with the random split-sample method in five independent repetitions. For each repetition, the input sample consisting of 171 samples was split randomly into training (70%) and test data (30%), as used in [41]. Using the described method, we divided the input samples to 120 training and 51 test samples, which remained consistent for all five iterations. Coefficients of determination were selected as accuracy assessment values. Inner accuracy (R 2 inner) shows the relation of training data and their corresponding interpolated values. Inner accuracy represents how much an interpolation method preserves field sample values in the interpolation result, as those values are regarded as absolutely accurate. Outer accuracy (R 2 outer) shows the relation of test data and their corresponding interpolated values. Outer accuracy measures the accuracy of prediction for tested interpolation methods. Both R 2 inner and R 2 outer were calculated for each of five datasets and four soil features, resulting in a total of 20 tested interpolation variations. Descriptive statistics consisting of mean, coefficient of variation (CV), skewness (SK) and kurtosis (KT) were calculated for the determination of data normality and stationarity. These values served as a basis for the selection of interpolation parameters. This particularly refers to ordinary kriging, where normality and stationarity are prerequisites for its calculation. The interpolation of soil features was conducted using the most accurate interpolation method.

Sustainability and vulnerability calculation method
Sustainability and vulnerability factors were separately used for the determination of the criteria impact for sustainability or vulnerability [42,43]. The identical procedure of multicriteria analysis was applied for the calculation of sustainability and vulnerability. The same constraints were used for both calculations. The conducted multicriteria analysis is based on AHP [44]. This procedure integrated the standardization of factor values, factor weights determination and the aggregation of selected factors and constraints.
Standardization was performed by stepwise standardization methods in four classes in 1-4 number intervals, where 1 denotes the lowest and 4 denotes the highest impact on the result. This approach allowed the integration of quantitative and qualitative factor types [45], which are common in the multicriteria analysis of crops [46]. Weight determination of the individual factors was performed using pairwise comparison matrices, as part of the AHP method [47]. Pairwise comparison incorporates a scale of preference between every combination of two factors, with preference factors ranging from 1 (equal preference) to 9 (extreme preference). A detailed procedure of pairwise comparison is described in [48]. Its consistency was calculated by the consistency index (CI) and consistency ratio (CR), as shown in Equations (1) and (2). The random consistency index (RI) represents an average CI from a random matrix, λ is the average value of consistency vector and n is the number of parameters [49]: An acceptable value of inconsistency is CR ≤ 0.10, while larger values indicate a necessity for modification of pairwise comparisons [50,51]. Aggregation of factors and constraints was conducted using the weighted linear combination. Such a procedure is commonly used in multicriteria analysis for criteria aggregation [52]. A weighted linear combination (score) for the calculation of sustainability and vulnerability was calculated using the Equation (3) [53]: where wi is the weight of the factor i, Xi is standardized values of the factor i and C is a Boolean layer of constraints. Sustainability and vulnerability results range from 1 to 4. The ascending nature of these values represents the increase of sustainability and a decrease of vulnerability.

Suitability calculation method
The hybridization of the multicriteria analysis procedure was performed by unsupervised classification for suitability calculation. The selected classification algorithm was K-means, which was, in multiple studies, applied in multicriteria analysis [54,55] and various branches of agriculture [56][57][58]. Classification inputs were sustainability and vulnerability rasters, with equal impact on classification result. Unconstrained pixels were classified into four suitability classes: highest suitability, moderate-high suitability, moderate-low suitability and lowest suitability. Such classification meets the standards set by FAO, in which the land suitability evaluation is generally represented by suitable and unsuitable regions and mainly categorized in the five suitability classes [23]. In this research, four suitable classes were calculated during the unsupervised classification and one permanently unsuitable class is represented by constraints, completing the five class representation. Ranking of the suitable classes was performed based on the suitability value, calculated as an average of mean sustainability and vulnerability class values. These two values are considered as equally impactful on the suitability result, which is the reason that the regular average was selected for the integration of sustainability and vulnerability values.

Evaluation of Tested Interpolation Methods
The study area was expanded with a 1 km buffer area for soil sampling to avoid extrapolation on sparsely sampled areas. All of the collected 171 samples in the field were analysed in the laboratory for humus, K2O, P2O5 and pH. Descriptive statistics for all five random training sets are calculated as shown in Table 2  Since datasets do not possess a normal distribution, logarithmic transformation was performed as a preliminary step to OK. The variogram model and fitting range were selected based on the internal goodness-of-fit values. Combinations with the highest coefficients of determination between the variogram model and empirical variogram were selected. Tested mathematical models were linear, logarithmic, Gaussian and spherical. The Gaussian model was selected as optimal for the interpolation of humus, K2O and pH. The spherical model produced the highest goodness-of-fit values for low stationarity P2O5 samples. IDW was conducted with inverse distance to a power function, with a weighting coefficient of 3. The Sibson method was used for interpolation using NN. Table 3 contains R 2 inner and R 2 outer values for each tested interpolation combination.  IDW constantly produced the highest average R 2 inner for all soil features, closely followed by NN (Table 4). OK had an average of 0.206 lower R 2 inner value than IDW, being the least favourable option based on inner accuracy test. IDW performed best with both R 2 inner and R 2 outer on samples with lowest stationarity (K2O) data and highly skewed samples (P2O5). OK resulted with higher R 2 outer the more samples were closer to having a normal distribution and high stationarity. An unevenness of these properties caused a major variability of average R 2 outer values for OK, ranging from 0.278 for P2O5 to 0.742 for humus. NN remained consistent with the second-highest average accuracy values for all tested soil features. Considering the interpolation accuracies of all four soil features, IDW was selected as an optimal interpolation method.

Sustainability and Vulnerability Calculation
Standardization of sustainability factors was conducted as shown in Table 5. Qualitative factors were standardized by assigning a standardized value to each subclass. Standardization of quantitative factors was performed by the reclassification of the selected value interval to new standardized values. Qualitative vulnerability factors were standardized by the binary reclassification of vulnerable classes to 1 and non-vulnerable classes to 4.
Pairwise comparison matrices were created by vegetable cultivation expert for sustainability ( Table 6) and vulnerability factors (Table 7). Consistency values show that both pairwise comparison matrices resulted in acceptable consistency. Fertility and dainage factors were considered most influential in a first pairwise comparison, with combined 56.9% of the total impact on sustainability. Private forests with high wood potential resulted as the most vulnerable factor, comprising almost 45% of the total vulnerability impact.    Individual Boolean constraint criteria layers were multiplied mutually to form a constraints layer. A total of 85.1% of the study area was constrained during criteria aggregation, which resulted in the area of 47 km 2 available for red spicy pepper cultivation. Sustainability values ranged from 1.77 to 3.63, indicating moderate sustainability with no extreme values. Vulnerability resulted in the value range of 1.50-4.00, indicating a high variability of vulnerability on the calculated area. A total of 35.6% of the area resulted as non-vulnerable with the most favourable value (4.00).

Suitability Calculation
K-means classification resulted in four classes, with their mean sustainability and vulnerability shown in Table 8. The classification was performed with 20 maximum iterations. The mean suitability values of each class were ranked and the description to each class was added. The highest suitability class was dominantly located in the northwest of the municipality and covered over one-third of the unconstrained area. The top two classes are considered as suitable for red spicy pepper cultivation based on the relative suitability values between classes, covering the area of 2167.5 ha. The bottom two classes had significant limitations in the form of high vulnerability (moderate-low suitability) or low sustainability (lowest suitability). The final red spicy pepper suitability map was shown in Figure  9, alongside sustainability and vulnerability maps.

Discussion
The versatility of applied methods in the process of suitability calculation ensures the repeatability of the process, not related to the study area size or its location in the world. Existing studies on the application of multicriteria analysis on crop suitability are developed in similar areas, like the study area in this research, and imply the applicability of this methodology on larger areas [15,35,[59][60][61]. The same studies noted the flexibility of the used methods regarding the input data selection. Thus, the addition or deduction of new individual criteria or the criteria group is supported by the applied methods. Consequentially, the applicability of the suitability calculation proposed in this study is possible on almost any crop type, with the prerequisite of the input criteria adjustment to the selected crop type and the specificities of the study area. Evaluation of interpolation methods produced results that are contrary to multiple studies in which kriging methods resulted in better interpolation accuracy than deterministic methods [62,63]. The primary cause of the better accuracy of IDW than OK in this study is the low normality and moderately low data stationarity. Considering that OK produced higher accuracy, the closer the sample data are to a normal distribution, it is assumed that OK would be optimal interpolation method for the majority of samples in practice. Nevertheless, this case demonstrated that the evaluation of the interpolation method was an important step in the multicriteria analysis and that kriging is not always superior to deterministic methods. Evaluation of interpolation methods can be expanded using additional statistical indicators like root mean square error and mean relative error [64,65].
The climate criteria group, consisting of air temperature, precipitation and solar radiation, were not applied to this analysis due to homogenous climate conditions in the study area. However, in the case of more diverse conditions, these data have a major impact on the suitability results [66]. Similarly, the addition of the soil nitrogen content was strongly considered as a sustainability factor, but was dismissed due to its high homogeneity in the study area. The potential importance of soil nitrogen content as a fundamental macronutrient for red spicy pepper suitability calculation increases with its variability in the study area, presenting a base of future research. Stepwise standardization allowed better control and more expert influence on standardization in contrast to common linear scaling. This is primarily obvious in the case of a few extremely high or low pixel values. Fuzzy membership methods for standardization offer a potential upgrade to standardization [67] and will be a subject of future research. One of the main disadvantages of AHP is an expert's subjectivity in the creation of pairwise comparison matrices, which could be a source of inaccuracies in the suitability result [68]. These inaccuracies can be reduced in the form of fuzzy AHP [69] or by aggregating estimations of multiple experts [14].
The separation of input criteria to sustainability and vulnerability factors enabled the calculation of positive and negative impacts of red spicy pepper production. The number of input criteria for red spicy pepper cultivation can easily grow and cause cluttering since it includes soil, climate, infrastructure and economic criteria groups. Such a procedure reduces the overemphasis of positive or negative influences and prevents redundancy of factors in the individual analysis [70]. Separation of sustainability and vulnerability allows the limiting of the factors to seven per analysis, which is the maximum recommended number of input criteria in AHP [71]. It is also considered that reducing of the pairwise matrices' complexity leads to a decrease in possible inaccuracies during weight determination. During that procedure, the expert's subjectivity is also restricted, which reduces one of AHP's main drawbacks. The application of unsupervised classification for suitability calculation allowed objective and rule-based calculation, further reducing the probability of error due to expert's subjectivity [72].
The difference in suitability values of the highest and lowest-ranked class is relatively small, 0.527 out of a possible 3.000. Classes were ranked relatively with regard to the mutual relationship of class suitability values. Absolute suitability scales like FAO land-crop suitability scale [73] were considered inappropriate since the distinctively designate top and bottom classes as completely suitable or non-suitable. Further research is planned on suitability result validation with groundtruth data and the measurement of the calculated suitability impact on potential economic profits. Agricultural plots with similar cultivation methods are required for planned research, alongside yield data for each observed plot.

Conclusions
The proposed methodology was based on the hybridization of multicriteria analysis and unsupervised classification for the calculation of red spicy pepper cultivation suitability. One of the main advantages of the proposed methods is the interpolation methods evaluation, as four soil feature criteria required interpolation as part of the pre-processing. It was observed that the IDW method performed best out of the three tested interpolation methods, contrary to most cases where OK had an advantage over deterministic methods. Another advantage of methodology was the separation of sustainability and vulnerability multicriteria analysis. These results were used for the calculation of suitability, but can be used independently if necessary. The criteria separation procedure prevented criteria redundancy by limiting the number of criteria for each analysis to seven. It also reduced inaccuracies in calculation due to expert subjectivity. The final advantage represents an aggregation of sustainability and vulnerability results in suitability classes using unsupervised classification. That procedure further reduced errors caused by an expert's subjectivity. It was also used to create ranked classes that are easily applicable in the planning of new red spicy pepper plantations. With the use of unsupervised classification, 2167.5 ha (6.9% of the total study area) was determined as suitable for red spicy pepper cultivation. Further research is planned on the validation of the calculated results using ground-truth data and the integration of fuzzy methods with the multicriteria analysis procedure.