GIS-Based Cropland Suitability Prediction Using Machine Learning: A Novel Approach to Sustainable Agricultural Production

The increasing global demand for food has forced farmers to produce higher crop yields in order to keep up with population growth, while maintaining environmentally sustainable production. As knowledge about natural cropland suitability is mandatory to achieve this, the aim of this paper is to provide a review of methods for suitability prediction according to abiotic environmental criteria. The conventional method for calculating cropland suitability in previous studies was a geographic information system (GIS)-based multicriteria analysis, predominantly in combination with the analytic hierarchy process (AHP). Although this is a flexible and widely accepted method, it has significant fundamental drawbacks, such as a lack of accuracy assessment, high subjectivity, computational inefficiency, and an unsystematic approach to selecting environmental criteria. To address these drawbacks, methods for determining cropland suitability based on machine learning have been developed in recent studies. These novel methods contribute to an important paradigm shift in determining cropland suitability, being objective, automated, computationally efficient, and viable for widespread global use due to the availability of open data sources on a global scale. Nevertheless, both approaches produce invaluable complementary benefits to cropland management planning, with novel methods being more appropriate for major crops and conventional methods more appropriate for less frequent crops.


Introduction
Accelerated global population growth necessitates the production of more and more food [1]. Conventional intensive agricultural production can meet short-term food demands, but it frequently comes at the price of long-term sustainability and land degradation [2]. An additional challenge is posed by climate change and pollution, which undermine the effectiveness of conventional cropland management [3]. The two most popular methods used in conventional agricultural production systems to improve crop yields are (1) transforming land cover to create new cropland and (2) modifying agrotechnical techniques, such as increasing the use of fertilizers and pesticides, so as to boost yields on existing cropland. Land use conversion has a greater potential to increase overall yields than improving agricultural practices on existing cropland [4]. However, new cropland converted from forest and wetland areas results in the destruction of natural habitats and poses a threat to biodiversity. Habitat destruction is the most common cause of flora endangerment, with a negative outlook for their recovery potential [5]. The use of fertilizers and pesticides is necessary for the continuous production of high and stable yields in conventional intensive agricultural production [6]. Fertilizer and pesticide applications continue to increase when the crop rotation system is not maintained, agrotechnical measures are inadequate, and certain agricultural crops are grown in an inherently unsuitable location [4,7]. This practice leads to environmental pollution through heavy metals, mainly copper, persistent organic pollutants, and excessive nitrogen and phosphorus polluting waterways [8,9]. Such pollution has a direct impact on human health, flora and fauna, spreading through surface and groundwater, and accumulating in living organisms [10].
Alternative methods for increasing crop yields have been developed in response to the requirement for the long-term sustainability of rising agricultural production. On existing croplands, it is feasible to increase yields while using fewer fertilizers and pesticides by cultivating crops in naturally suitable locations [11]. Crop rotation and agricultural management strategies must be modified to foster conditions for ecologically responsible and sustainable agriculture, because important abiotic factors are either impossible or extremely difficult to change [12]. Because of the temporal variability of abiotic criteria, primarily caused by climate change, regular monitoring and analysis of changes in suitability levels are required [13]. Agrotechnical practice can directly benefit from an inventory of existing suitability levels, as it can be used to propose changes to agrotechnical measures and the installation of irrigation systems in sensitive regions [14]. Crop cultivation in naturally suited sites is one part of regionalized agricultural production that aims to achieve sustainable output (Figure 1).
Agronomy 2022, 12, x FOR PEER REVIEW 2 of 15
The analytic hierarchy process (AHP) in conjunction with GIS-based multicriteria analysis is currently regarded as the standard for assessing the suitability of crops. However, this method is sensitive to subjectivity and cannot effectively incorporate large quantities of data [15]. The use of machine learning techniques has improved the conventional methodology by addressing these flaws in the prediction of biological, chemical, and physical soil characteristics. Hengl et al. [16] integrated soil samples and publicly accessible spatial data reflecting abiotic criteria into a novel machine-learning-based prediction as part of the SoilGrids project. This marks the start of a paradigm change in the prediction of spatial environmental variables, opening the possibility for applications in fields other than soil science. Among these fields, cropland suitability prediction is one of the most convenient for the application of a similar machine-learning-based approach, as it depends on the same environmental criteria groups [15]. While the machine learning approach in cropland suitability prediction has fundamental similarities with SoilGrids, the most notable difference is present in training data selection (Figure 2). Cropland suitability has been defined in previous studies as a slightly abstract term, being quantified by in situ crop yield data [11] or data derived from remote sensing satellite missions, such as vegetation indices [17] or biophysical variables [15]. Although the extraordinary potential of machine learning and remote sensing data in suitability determination has been noted, fully comprehensive and straightforward solutions are still being researched. The aim of this study is to review present upgrades to the conventional GIS-based multicriteria analysis with the machine learning approach and to propose the most promising directions for cropland suitability calculation in future studies.

Advancements of the Conventional GIS-Based Multicriteria Analysis for Cropland Suitability Prediction
Previous studies predominantly considered GIS-based multicriteria analysis as the current standard for quantifying cropland suitability [11,18,19]. Because of its adaptability and universal application, it has become an essential approach in suitability research across a variety of scientific fields [20–22]. However, this approach's basic flaws make it less successful when dependability and universal application become more important. The standard procedure of GIS-based multicriteria analysis includes several steps (Figure 3), from defining the study aim, through the selection, spatial modelling, standardization, and weighting of abiotic criteria discussed in the following paragraphs, to the calculation of suitability and interpretation of the results.
Previous research used the GIS-based multicriteria analysis to assess the suitability of agricultural crops in a specific geographic location. An important component of the studies analyzed is the timing of the analysis, which is conditioned by the availability of data from one or more consecutive seasons [23]. Knowing the timing of the analysis is also necessary because of the temporal variability of the basic abiotic criteria, especially the climate criteria [24]. The novel methods using machine learning and open-data satellite imagery are based on the same assumptions. This feature allows for multitemporal comparison of suitability results from the two approaches, enabling the evaluation of historical accuracy and updating of the cropland management plans. The ability to adjust cropland management to the classification of abiotic parameters (climate, soil, and terrain) has increased with the emergence of remote sensing satellite missions with open data access [25]. The availability of these data is expected to increase in the future, due to the lifespan of present missions and their continuous upgrade [26]. The fluctuation of temperature and soil properties, caused by climate change and inadequate agricultural management, requires constant updating of cropland suitability results for individual crops [27]. Yields targeted by farmers, based on relative comparisons with neighboring agricultural parcels, are an additional factor that may lead to increased use of fertilizers and pesticides in inherently unsuitable locations [28]. While these goals can be attained using the conventional GIS-based multicriteria analysis, more computationally efficient approaches for cropland suitability prediction using machine learning could further improve their applicability.
Previous research relied on the analysis of prior studies and expert opinion to choose the abiotic criteria for cropland suitability modelling using GIS-based multicriteria analysis [11,29]. This procedure was based on the assumption that each microsite is agroecologically specific for crop cultivation [30]. The selection of the type and number of abiotic criteria for a particular site and crop type was based on the knowledge of one or more agronomic experts. The three factors that have the greatest effects on cropland suitability (climate, soil, and topography) were consistently distinguished among the abiotic criteria [15,31]. Although most of these studies focused on these criteria groups, the number of criteria varied significantly by crop type and geographic location (Table 1). The exceptions to the use of climate, soil, and topography criteria were constraints representing appropriate land cover classes and categories of irrigation systems. The selection of climate criteria depends directly on the extent of the study area, while for smaller areas (such as municipalities), climate homogeneity is assumed [19]. The variability in the number of abiotic suitability criteria used and their distribution among the criteria groups indicate a high influence of human subjectivity in their selection. Although this method enables accurate suitability modelling based on professional expertise, it may be incorrect and biased, as it requires diverse and numerous environmental criteria in order to include all major aspects of suitability [15]. The deviation from the ideal number of seven criteria in the AHP, which according to Saaty and Ozdemir [49] should range from five to nine, reflects the computational inefficiency of this technique. According to these suggestions, using fewer than four criteria only permits a bare minimum depiction of cropland suitability. When the relative relevance of the criteria is assessed subjectively, ten or more criteria reflect a wider variety of abiotic criteria, but they also raise the possibility of inaccuracy and the complexity of computations.
Spatial modelling of selected abiotic criteria in a GIS environment is usually performed in a raster data model [50]. The input data are usually distributed in numerous combinations of data types obtained from various institutional and scientific sources. To harmonize point vector input data into raster form, it is necessary to predict soil values at unsampled locations by spatial interpolation [51]. The selection of the optimal method and parameters of spatial interpolation is a necessity for reliable modelling of the input criteria, as reliability decreases significantly if the method and parameters are not adjusted to the characteristics of the input values [52]. The relative complexity of such modelling, as well as subjectivity in the selection of the spatial interpolation method, parameters, and classification standards, suggest a potential reduction in human error by automating the process [53]. Automation also increases time efficiency by not requiring individual tool editing in GIS and facilitates data distribution using a globally accepted standard. Following the same principle, the developed processing framework can be easily adapted to different abiotic criteria in cropland suitability studies. Commonly used criteria in cropland suitability studies have been partially represented by global standards with subjective modifications [27] or the application of different standards with the same objective [6,54]. As one of the most common approaches to criteria selection is the analysis of previous studies, such cases lead to potential inaccuracy in the selection of value ranges in further standardization and weighting procedures.
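As an illustration of the interpolation step described above, the following is a minimal sketch of inverse distance weighting (IDW) in plain Python. IDW is only one of many interpolation methods and is chosen here for brevity, not because the reviewed studies prescribe it; the sample coordinates, the pH values, and the exponent `power=2.0` are hypothetical assumptions.

```python
import math


def idw_predict(samples, x, y, power=2.0):
    """Inverse distance weighting estimate at location (x, y).

    samples: list of (x, y, value) tuples for the sampled locations.
    power: distance exponent (an assumed, tunable parameter).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value  # exactly on a sample: return the measured value
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def idw_grid(samples, x_coords, y_coords, power=2.0):
    """Interpolate onto a regular grid; rows follow y_coords, columns x_coords."""
    return [[idw_predict(samples, x, y, power) for x in x_coords]
            for y in y_coords]


# Hypothetical soil pH samples: (easting, northing, pH)
ph_samples = [(0.0, 0.0, 5.5), (10.0, 0.0, 6.5),
              (0.0, 10.0, 7.0), (10.0, 10.0, 6.0)]
ph_raster = idw_grid(ph_samples,
                     x_coords=[0.0, 5.0, 10.0],
                     y_coords=[0.0, 5.0, 10.0])
```

The exponent and the choice of method itself are exactly the kind of user decisions that the text identifies as a source of subjectivity, which is why they appear here as explicit parameters.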
Diverse input value ranges of modeled abiotic criteria are converted into a consistent numerical normalization interval during the standardization procedure [17]. Typically, numerical intervals such as 0-1 or, more commonly, 1-5 are used, which allow for a simple representation of suitability using the five classes defined by the Food and Agriculture Organization of the United Nations (FAO). In addition to combining values expressed in different units of measurement, quantitative and qualitative data are also integrated, which is often required to determine cropland suitability [55]. Three basic standardization methods have been used in previous studies: linear stretching, stepwise standardization, and fuzzy standardization. In linear stretching, the minimum and maximum input values correspond to the limit values of the defined standardization interval. Although the linear stretching method is very simple and completely objective, it leads to unreliable standardization when the input data contain extreme values, which is often the case in suitability studies. In contrast, the stepwise standardization method is a completely subjective method in which discrete ranges of input values are each assigned a single standardized value. Thus, generalized and approximate numerical values that typically have an identical range are used to quantify the suitability level [56]. Because of the simplicity and flexibility of the method, it has found the most frequent application in previous cropland suitability studies [17]. Standardization using the fuzzy method combines the advantages of the previous two methods: continuous standardization and relative objectivity are provided by mathematical models, while standardization thresholds are set using a subjective approach [57]. The alternatives in the choice of fuzzy logic mathematical models (linear, S-shaped, J-shaped, and G-shaped) allow for additional flexibility in standardization. Nevertheless, fuzzy logic methods are used much less frequently in suitability studies compared with stepwise standardization. There is currently no extensive research that outlines the precise impact of standardization methods on the accuracy of suitability results; therefore, users choose a standardization method based only on their subjective preferences. The comparative evaluation of these standardization methods has shown that the variety of available methods in complex GIS-based multicriteria analysis should be evaluated more thoroughly in future studies.
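The three standardization methods described above can be contrasted in a short sketch. The function names, the precipitation values, the class breaks, and the fuzzy thresholds `a` and `b` are all hypothetical assumptions chosen for illustration, and only the linear fuzzy membership model is shown.

```python
def linear_stretch(value, vmin, vmax, lo=0.0, hi=1.0):
    """Linear stretching: map [vmin, vmax] linearly onto [lo, hi]."""
    return lo + (value - vmin) / (vmax - vmin) * (hi - lo)


def stepwise(value, breaks, scores):
    """Stepwise standardization: one score per discrete value range.

    breaks: ascending upper bounds of the classes (one fewer than scores).
    """
    for upper, score in zip(breaks, scores):
        if value <= upper:
            return score
    return scores[-1]


def fuzzy_linear(value, a, b):
    """Increasing linear fuzzy membership: 0 at or below a, 1 at or above b."""
    if value <= a:
        return 0.0
    if value >= b:
        return 1.0
    return (value - a) / (b - a)


# Hypothetical annual precipitation in mm; the last value is an outlier.
precip = [420.0, 560.0, 610.0, 700.0, 1500.0]

stretched = [linear_stretch(p, min(precip), max(precip)) for p in precip]
stepped = [stepwise(p, breaks=[450, 550, 650, 750], scores=[1, 2, 3, 4, 5])
           for p in precip]
fuzzy = [fuzzy_linear(p, a=400.0, b=800.0) for p in precip]
```

With the outlier present, linear stretching compresses the four ordinary values into roughly the lowest quarter of the 0-1 interval, whereas the stepwise and fuzzy results depend on the user-chosen breaks and thresholds instead, which mirrors the trade-off between objectivity and robustness discussed above.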
Criteria weights measure the relative importance of all of the chosen suitability criteria, as opposed to standardization, which assesses the suitability of a single criterion according to its range of values [58]. Input criteria are weighted to proportionally indicate their effects, as they have various degrees of influence on agricultural suitability. Sensitivity analyses in previous studies found that the weights assigned to the criteria had the largest effect on the cropland suitability results in the GIS-based multicriteria analysis [59]. There are several different approaches to weighting abiotic criteria, from straightforward estimation techniques to sophisticated ones such as AHP, TOPSIS, ELECTRE, and PROMETHEE [60,61]. A common feature of all weighting methods is that the sum of all weights equals 1, which represents 100% of the influence of the selected abiotic input criteria on suitability. Previous studies have highlighted the advantages of the AHP in terms of flexibility and simplicity [62–64], leading to its preferred use in cropland suitability studies (Table 2). The Web of Science Core Collection search was performed for the articles matching the topic of "land suitability" AND ("crop" OR "agriculture" OR "farming") AND the method name, as stated in Table 2. The principle of AHP is based on a relative pairwise comparison of all combinations of input criteria, where a relatively more influential abiotic factor is denoted by an integer in the interval of one to nine. Even though a consistency index checks the pairwise comparison results, using more than nine criteria makes the weighting procedure overly complicated and prone to errors. Previous suitability studies have highlighted the difficulties of comprehensive suitability modelling with the recommended number of abiotic criteria [4,65]. Currently, the two main drawbacks of using AHP in GIS-based multicriteria analysis are the sensitivity of weighting to subjective human judgements with a high number of pairwise comparisons and the inability to choose an arbitrary number of criteria.
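To make the AHP weighting step concrete, the sketch below derives weights from a pairwise comparison matrix as its principal eigenvector (approximated by power iteration) and computes Saaty's consistency ratio. The three-criterion matrix is a hypothetical example, not taken from any reviewed study; the random index values are Saaty's commonly tabulated ones, and a consistency ratio below 0.1 is the usual acceptance threshold.

```python
# Saaty's random consistency index for matrices of size 3 to 9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix, iterations=100):
    """Return (weights, consistency_ratio) for a reciprocal pairwise matrix."""
    n = len(matrix)
    weights = [1.0 / n] * n
    # Power iteration converges to the principal eigenvector.
    for _ in range(iterations):
        product = [sum(matrix[i][j] * weights[j] for j in range(n))
                   for i in range(n)]
        total = sum(product)
        weights = [value / total for value in product]
    # Estimate the principal eigenvalue from A*w = lambda*w.
    product = [sum(matrix[i][j] * weights[j] for j in range(n))
               for i in range(n)]
    lambda_max = sum(product[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return weights, consistency_index / RANDOM_INDEX[n]


def pairwise_count(n):
    """Number of pairwise judgements an expert must supply for n criteria."""
    return n * (n - 1) // 2


# Hypothetical judgements: climate vs. soil = 3, climate vs. terrain = 5,
# soil vs. terrain = 3 (reciprocals fill the lower triangle).
pairwise = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 3.0],
    [1.0 / 5.0, 1.0 / 3.0, 1.0],
]
weights, consistency_ratio = ahp_weights(pairwise)
```

Because `pairwise_count` grows quadratically (nine criteria already require 36 judgements), the sketch also illustrates why the number of criteria is kept small in AHP-based studies.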
Suitability calculation based on standardized values of abiotic input criteria and their weights is the simplest and most consistent step in GIS-based multicriteria analysis [66]. The weighted linear combination is a conventional choice for calculating suitability, where the standardized values are multiplied by their respective weights and the products are summed. The range of suitability values corresponds to an arbitrarily chosen numerical normalization interval. Numerous studies regard the FAO's categorization of cropland suitability into five classes as the standard for suitability calculation [11,39,67,68]. Its application facilitates the comparison of suitability values between crops and for different locations for the same crop.
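The weighted linear combination can be written as S = w_1 x_1 + ... + w_n x_n for each raster cell, where x_i are standardized criterion values and w_i their weights. The sketch below applies it to small nested lists standing in for rasters. The layer values, the weights, and in particular the thresholds used to map scores on the 1-5 interval to the FAO class labels (S1, S2, S3, N1, N2) are hypothetical assumptions.

```python
def weighted_linear_combination(layers, weights):
    """Cell-wise weighted sum of standardized criterion layers.

    layers: dict mapping criterion name -> 2D list of standardized values.
    weights: dict mapping criterion name -> weight; weights must sum to 1.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("criteria weights must sum to 1")
    names = list(layers)
    rows = len(layers[names[0]])
    cols = len(layers[names[0]][0])
    return [[sum(weights[name] * layers[name][r][c] for name in names)
             for c in range(cols)]
            for r in range(rows)]


def fao_class(score):
    """Map a 1-5 score to an FAO class label (thresholds are assumed)."""
    if score >= 4.5:
        return "S1"
    if score >= 3.5:
        return "S2"
    if score >= 2.5:
        return "S3"
    if score >= 1.5:
        return "N1"
    return "N2"


# Hypothetical standardized layers on the 1-5 interval (2 x 2 cells).
layers = {
    "climate": [[5, 4], [3, 1]],
    "soil": [[4, 4], [2, 1]],
    "terrain": [[5, 3], [3, 2]],
}
weights = {"climate": 0.5, "soil": 0.3, "terrain": 0.2}

suitability = weighted_linear_combination(layers, weights)
classes = [[fao_class(score) for score in row] for row in suitability]
```

In practice the same arithmetic would usually be applied to raster arrays in a GIS or array library; plain lists are used here only to keep the sketch self-contained.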
The evaluation of accuracy is frequently omitted in agricultural suitability studies based on the GIS-based multicriteria analysis, despite the fact that it is a crucial step in all closely related types of spatial analysis. The complex concept of agricultural suitability can only be validated with a very narrow set of data sources [54]. Crop yield data have been utilized as an accurate indicator of suitability in the majority of earlier research where accuracy evaluations were performed [11,29]. However, crop yield is also influenced by components that cannot be modeled in a GIS environment, such as the implementation of agrotechnical measures at the micro level, making it an incomplete indicator of suitability. The use of conventional cropland suitability studies is severely constrained because official databases of yield data for specific agricultural plots are extremely scarce, making it impossible to conduct an external, impartial examination of the suitability.

Recent Developments in Machine-Learning-Based Cropland Suitability Prediction
In response to the disadvantages of the conventional GIS-based multicriteria analysis with AHP, machine learning methods have already enabled researchers to provide more computationally efficient, objective, and reliable cropland suitability prediction (Figure 4). Machine learning has been efficiently used to address both the subjectivity and the difficulty of including environmental data in the GIS context. It has facilitated the integration and processing of the many forms of big data, and it models intricate nonlinear relationships between training data and independent predictors (covariates) [16]. Machine learning allows for a fully automated and objective determination of feature importance, as opposed to the manual and subjective computation of weights of specific abiotic criteria in the suitability result [69].
Two general approaches were mainly improved using machine learning in cropland suitability prediction studies:
1. Computationally efficient suitability assessment methods using global satellite missions with a high (e.g., Sentinel-2, Landsat 8) and medium spatial resolution (e.g., Sentinel-3, PROBA-V). This approach ensures the applicability of the accuracy assessment for predicted cropland suitability, otherwise commonly omitted from the conventional approach. The excessive subjectivity of the GIS-based multicriteria analysis with AHP has been independently evaluated using these globally available remote sensing open data. These methods provide a scientific contribution to the training/test data component of the suitability prediction.
2. Suitability prediction methods based on machine learning algorithms and globally available spatial data that provide high prediction reliability with lower user subjectivity compared with the GIS-based multicriteria analysis. Aside from enabling the inclusion of significantly more environmental covariates in the suitability prediction without impairing computational efficiency, exact and specific abiotic criteria become accessible. In contrast with the generalized and vague criteria (e.g., "precipitation", "temperature", or "soil texture"), these methods included specific relevant environmental abiotic criteria, such as the mean air temperature in individual months or soil clay, silt, and sand contents in narrow soil depth layers.
Although previous machine-learning-based suitability studies have shown a better performance for suitability calculation than the GIS-based multicriteria analysis, the fundamental shortcomings of the current development have only been partially addressed. This primarily relates to the lack of reference parameters for validating suitability results based on the same assumptions as the conventional approach. According to Frampton et al. [70], multispectral satellite data may be used to quantify several biophysical variables, including the leaf area index (LAI), fraction of absorbed photosynthetically active radiation (FAPAR), and canopy chlorophyll concentration. The strong correlation between these data and crop yields at each growth stage of individual crops is a crucial aspect that makes it possible to utilize them to develop and test suitability models [71]. United Nations guidelines underline the significance of LAI and FAPAR, suggesting a high potential for cropland suitability studies [70]. The successful launch of the multispectral Sentinel-2 and Sentinel-3 satellite missions has significantly improved these capabilities by enabling high- and medium-resolution modelling of biophysical vegetation characteristics. Additionally, the use of remote sensing data eliminates the costly and time-consuming gathering of data using terrestrial techniques, particularly for extensive areas with poorly developed transportation infrastructure, which characterizes the majority of agricultural parcels [25,72].
Taghizadeh-Mehrjardi et al. [54] contrasted standard parametric approaches with machine learning to assess the accuracy of suitability prediction for wheat and barley. For wheat and barley, the overall accuracy of the machine learning approach was 26% and 29% higher, respectively, than that of the conventional method. Additionally, this method measured the relative weights of each abiotic input element, offering an objective alternative to the weighting procedure of the AHP method. Møller et al. [6] pointed out the capability of precisely defining each component of suitability, highlighting the socioeconomic and environmental components, and acknowledged the similar potential of machine learning in suitability prediction. The traditional GIS-based multicriteria analysis was primarily improved by developing and evaluating a technique for assessing suitability accuracy using the NDVI vegetation index from multispectral Sentinel-2 images. To improve the subjective weight determination of AHP, Singh et al. [73] applied a Random Forest machine learning algorithm to derive criteria weights based on the relative variable importance. Radočaj et al. [17] proposed the novel peak NDVI method to identify the vegetation potential of soybean during the full maturity (R6) development stage, representing a measurable and exact approach for high repeatability in future seasons and other locations in the world. The strongest association between NDVI and soybean grain production was discovered at this development stage, which has great promise as an efficient and accessible replacement for the traditional validation method [74]. This method may be used on any crop where there is a strong relationship between soil-related yield components and a vegetation index derived from satellite images at the individual growth stages. Because of the open data availability of Sentinel-2 and similar missions, the ability to evaluate the accuracy of cropping events has become accessible for most future studies.
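As a rough illustration of how an NDVI-based reference value could be derived, the sketch below computes NDVI = (NIR - Red) / (NIR + Red) for a series of observations of one parcel and selects the maximum. It is a simplified stand-in and not the exact procedure of the peak NDVI method in [17]; the dates and reflectance values are hypothetical. For Sentinel-2, the near-infrared and red inputs are commonly taken from bands B8 and B4.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero for empty or masked pixels
    return (nir - red) / total


def peak_ndvi(time_series):
    """Return (date, ndvi) of the observation with the highest NDVI.

    time_series: list of (date_label, nir, red) tuples for one parcel.
    """
    scored = [(date, ndvi(nir, red)) for date, nir, red in time_series]
    return max(scored, key=lambda item: item[1])


# Hypothetical mean parcel reflectance on four acquisition dates.
observations = [
    ("2021-06-15", 0.30, 0.10),
    ("2021-07-15", 0.45, 0.05),
    ("2021-08-14", 0.50, 0.04),
    ("2021-09-13", 0.35, 0.12),
]
peak_date, peak_value = peak_ndvi(observations)
```

Comparing `peak_value` across parcels with the suitability classes predicted for them is one way such a reference could support the accuracy assessment that yield data rarely allow.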
Besides novel methods that improve the conventional GIS-based multicriteria analysis, a major scientific contribution was made by developing a fully objective method based on machine learning for suitability prediction [15]. Owing to their strong association with biomass and crop output, LAI and FAPAR from the PROBA-V satellite mission were chosen as the reference data for this novel machine learning technique for predicting cropland suitability. Using unsupervised K-Means classification, the suitability classes of the reference data were established based on their values on historical crop parcels. These data were divided at random to create training and test samples, which also satisfied the need for validating the suitability results. The suitability level for the entire agricultural area within the study area was calculated using the Random Forest and Support Vector Machine algorithms, based on the training data and on variables that represented climate, soil, and topographic criteria. After classification, the machine learning procedures evaluated the relative importance of the abiotic input criteria, which is an objectively determined counterpart of the AHP weights from the conventional GIS-based multicriteria analysis. The accuracy of the individual annual suitability results was evaluated using the figure of merit and the overall accuracy, and the grids obtained by the optimal machine learning method were selected for further processing. Their values were aggregated by unsupervised K-Means classification, resulting in the final suitability classes (Figure 5).
The majority of problems with GIS-based multicriteria analysis were solved by adopting the machine learning approach. It provides unbiased results, enables the integration of large and complex geospatial data, and accurately determines the suitability of crops using open-source satellite data. Because it requires training and test data, this approach is suitable for all major crop types, but it may fall short for less frequent crops, which are planted on fewer and smaller agricultural plots. Therefore, at the present point of development, a machine learning-based suitability prediction approach should be utilized jointly with the conventional approach. While the proposed methods were developed in principle for application to all major crops, future studies should evaluate them for other major crops, as well as their accuracy under different agricultural management systems.
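The workflow described above (reference classes from K-Means, a random training/test split, Random Forest and Support Vector Machine prediction, accuracy assessment, and variable importance) can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the implementation from [15]: the data are synthetic, the criteria names, the number of classes (four), and all hyperparameters are hypothetical, and only overall accuracy is computed (the figure of merit and the final aggregation of annual results are omitted).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(42)
n_samples = 600

# Hypothetical, already standardized abiotic criteria (climate, soil, topography).
criteria_names = ["temperature", "precipitation", "soil_ph", "slope"]
X = rng.normal(size=(n_samples, len(criteria_names)))

# Synthetic stand-ins for LAI and FAPAR on historical crop parcels (assumption):
# both respond to the same environmental signal plus noise.
signal = 1.5 * X[:, 0] + 1.0 * X[:, 1] - 0.5 * X[:, 3]
lai = signal + rng.normal(scale=0.3, size=n_samples)
fapar = 0.8 * signal + rng.normal(scale=0.3, size=n_samples)
reference = np.column_stack([lai, fapar])

# Step 1: unsupervised K-Means turns the reference data into suitability classes.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
raw_labels = kmeans.fit_predict(reference)

# K-Means labels are arbitrary, so rank clusters by mean LAI (0 = least suitable).
order = np.argsort(kmeans.cluster_centers_[:, 0])
rank = np.empty_like(order)
rank[order] = np.arange(len(order))
classes = rank[raw_labels]

# Step 2: random split into training and test samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, classes, test_size=0.25, random_state=0
)

# Step 3: fit both classifiers and compare their overall accuracy on the test set.
models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "svm": SVC(kernel="rbf", C=1.0),
}
accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = accuracy_score(y_test, model.predict(X_test))
best_model = max(accuracy, key=accuracy.get)

# Step 4: Random Forest variable importance as a data-driven counterpart of AHP weights.
importance = models["random_forest"].feature_importances_

print("Overall accuracy:", accuracy)
print("Selected model:", best_model)
for name, weight in zip(criteria_names, importance):
    print(f"{name}: {weight:.3f}")
```

In scikit-learn, the impurity-based `feature_importances_` are normalized to sum to one, which makes them directly comparable to a set of AHP weights. Permutation importance could be substituted where correlated criteria are a concern.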

Conclusions and Future Outlooks
The implementation of machine learning in cropland suitability prediction models has provided several advantages over the conventional GIS-based multicriteria analysis. These include objective, robust, and computationally efficient prediction using a variety of specific abiotic environmental criteria. As these criteria, as well as the training/test data in recent studies, were derived from open data remote sensing satellite missions, there is immense potential for widespread global application for many crops. As was recently the case with soil mapping, machine learning induced a paradigm shift from the conventional approach to cropland suitability prediction. The application of global remote sensing data also enabled the development of globally applicable accuracy assessment methods for cropland suitability using vegetation indices and biophysical variables. This might be the most impactful advance in the domain of cropland suitability prediction in recent years, allowing for the independent assessment of the subjective conventional GIS-based multicriteria analysis with AHP.
Nevertheless, novel machine learning-based methods for cropland suitability prediction are still under research, and there is no globally standardized and straightforward procedure. Training/test data derived from low and medium spatial resolution remote sensing satellite missions presently require substantial coverage of individual crops in the study area. This includes both the overall cultivated area and the presence of relatively large agricultural parcels to avoid spectral noise from other crops and land cover classes. These conditions are very often met for major crops (maize, wheat, rice, soybean, sunflower, etc.) in most locations where they are cultivated, but a variety of less common, yet important, crops do not presently support the application of novel methods. Therefore, even with the expected further development of machine learning-based methods for cropland suitability prediction, conventional GIS-based multicriteria analysis is likely to remain in use. As the mentioned restrictions on remote sensing data and crop coverage are unlikely to be resolved in the near future, novel and conventional suitability prediction methods should coexist. With machine learning-based methods used for major crops and conventional approaches for less frequently cultivated ones, both will provide invaluable complementary benefits to sustainable cropland management planning in the future.

Figure 1. The concept of regionalized agricultural production according to cropland suitability. The conceptualization of agricultural land management without determined suitability (left), predicted GIS-based cropland suitability (center), and regionalized agricultural production with land management plans optimized according to predicted cropland suitability (right).

Figure 2. Conceptual comparison of the machine learning prediction approach implemented in SoilGrids [16] and for cropland suitability calculation.

1. Defining the study aim
2. Selecting relevant environmental criteria
3. Standardizing criteria values
4. Weighting (pondering) of criteria
5. Calculation of suitability and interpretation of the results

Figure 3. The conventional procedure of GIS-based multicriteria analysis for cropland suitability prediction.

Figure 4. The comparative generalized workflows of the conventional and novel machine learning approaches for cropland suitability prediction.

Figure 5. The cropland suitability prediction method fully based on machine learning, as proposed by [15].

Table 1. Distribution of the three main criteria groups in previous studies for determining cropland suitability using the conventional GIS-based multicriteria analysis.

Table 2. Application of criteria weighting methods in cropland suitability studies indexed in the Web of Science Core Collection during the period of 2000-2020.