Review

Spatial Decision Support Systems with Automated Machine Learning: A Review

Department of Civil Engineering, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada
* Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2023, 12(1), 12; https://doi.org/10.3390/ijgi12010012
Submission received: 28 September 2022 / Revised: 6 December 2022 / Accepted: 23 December 2022 / Published: 30 December 2022
(This article belongs to the Special Issue GIS Software and Engineering for Big Data)

Abstract

Many spatial decision support systems suffer from user adoption issues in practice due to lack of trust, technical expertise, and resources. Automated machine learning has recently allowed non-experts to explore and apply machine-learning models in the industry without requiring abundant expert knowledge and resources. This paper reviews recent literature from 136 papers, and proposes a general framework for integrating spatial decision support systems with automated machine learning as an opportunity to lower major user adoption barriers. Challenges of data quality, model interpretability, and practical usefulness are discussed as general considerations for system implementation. Research opportunities related to spatially explicit models in AutoML, and resource-aware, collaborative/connected, and human-centered systems are also discussed to address these challenges. This paper argues that integrating automated machine learning into spatial decision support systems can not only potentially encourage user adoption, but also mutually benefit research in both fields—bridging human-related and technical advancements for fostering future developments in spatial decision support systems and automated machine learning.

1. Introduction

Advances in crowdsourcing [1], open data initiatives [2], and open-source standards [3] have made spatial data more publicly accessible. Spatial decision support systems (SDSS) store, manage, and process spatial and non-spatial data for important decisions, such as selecting business locations, placing traffic infrastructure, and implementing public-health policies [4]. However, many SDSS are not adopted by decision makers due to lack of trust, technical expertise, and resources [5,6]. Recently, automated machine learning (AutoML) has received attention from the research community and media. AutoML integrates automation and machine learning (ML) by generating models, with little human assistance, which perform well under certain requirements and computational budgets [7]. This reduces the effort and technical expertise required to process and model data, which accounts for a majority of the time spent on data analysis [8]. As leading technology companies released AutoML products in 2017 to 2018 [9,10,11], ML models became more widely used by non-experts and less expensive to implement. With the recent increase in accessibility to AutoML, resources for implementing and maintaining SDSS can be reduced, improving SDSS adoption by decision makers.
This paper provides a systematic review of AutoML and SDSS integration, which seeks to answer three research questions: (R1) What problems can both SDSS and AutoML solve according to recent research? (R2) How can AutoML be integrated into SDSS? and (R3) What are the challenges and opportunities of SDSS with AutoML to improve user adoption? Although review papers exist on AutoML and SDSS separately [4,12,13], an initial search found no review papers focused on the integration of the two. This paper makes three research contributions to answer these questions: (C1) a systematic review that investigates recent methods, results, applications, and potential problems in SDSS and AutoML; (C2) a framework, based on recent literature, for implementing AutoML in SDSS; and (C3) a summary of key research opportunities and challenges of SDSS with AutoML in relation to user adoption.
Section 2 details the literature selection and review process. Section 3 briefly summarizes and analyzes the selected literature from Section 2, to provide an overview of important articles, topics, and trends. Section 4 reviews the selected literature for more in-depth background and theory of past and recent AutoML and SDSS theory, problems, and applications to answer research question 1. A framework of SDSS with AutoML is discussed in Section 5 to answer research question 2, along with key considerations, implementation challenges, and research opportunities to answer research question 3 as they relate to user adoption. Finally, Section 6 concludes the paper with a summary of the previous sections and future implications of research on SDSS with AutoML.

2. Methods

This paper used a two-step process to answer the three research questions described in the introduction (Figure 1). This section details the process of gathering relevant AutoML and SDSS research literature, and then summarizing, analyzing, and discussing the gathered literature to answer the three research questions. An overview of the literature found by applying this process is provided in Section 3.

2.1. Step One: Literature Search

The first step involved a search of recent AutoML and SDSS literature. Peer-reviewed journal articles were keyword searched (titles only) in 382 research databases (e.g., Scopus, arXiv, Web of Science) using Summon 2.0, covering 1 January 2019 to 24 September 2022 [14]. These articles were then manually filtered by inspecting the titles and abstracts for AutoML/SDSS-relevant review papers only. Since many recent literature review articles covered developments in AutoML [12,13,15,16,17] and SDSS [4,18,19,20,21] within the past 5 years, articles from the past three years were used to avoid outdated information and to focus on the most recent research. Thus, AutoML and SDSS articles that were not literature reviews, or that were published earlier than 2019, were excluded. The manually filtered articles (17 AutoML, 18 SDSS) from the past three years were also used to discover 101 additional supplementary references for both AutoML and SDSS topics using a snowball search strategy, which involved reviewing the manually filtered articles for major subtopics (e.g., feature selection, spatial clustering, machine-learning pipelines) to identify important supplementary references for AutoML and SDSS [22]. The supplementary articles were identified to cover major SDSS and AutoML topics published earlier than 2019 and were therefore not limited to the past three years.

2.2. Step Two: Review and Discussion

The second step involved answering the research questions using the articles and relevant references gathered from the first step. Similar problems solvable by AutoML and SDSS were identified based on study objectives (e.g., predicting landslide risk, simulating land use patterns) and summarized to answer research question 1: (R1) What problems can both SDSS and AutoML solve according to recent research? A general framework for SDSS with AutoML was developed to answer research question 2: (R2) How can AutoML be integrated into SDSS? Similarities in the methodology sections of the review papers from step one, and relevant references, were examined to identify AutoML and SDSS components. The AutoML and SDSS components were then connected based on general AutoML/SDSS approaches from the review papers and relevant references, and on the spatial problems identified when answering research question 1. Finally, opportunities and challenges from the review papers and relevant references in step one were identified and discussed to answer research question 3: (R3) What are the challenges and opportunities of SDSS with AutoML to improve user adoption? Challenges and opportunities were found by comparing similarities and differences among the results, discussion, limitations, and other related sections/references.

3. Search Results

A total of 136 SDSS and AutoML articles were found for review. A total of 18 SDSS and 17 AutoML articles were used as primary sources to review recent important advancements, while 63 SDSS, 21 AutoML, and 17 AutoML/SDSS articles (articles having both AutoML and SDSS as topics) were found using these primary sources to serve as supplementary sources for reviewing foundational literature earlier than 2022. Primary sources (between 2019 and 2022) were important as they were made up of the most recent literature reviews, providing a detailed overview of significant research progress to date, while supplementary sources (earlier than 2022) were identified from primary sources to narrow down crucial sub-topics for AutoML and SDSS. The number of articles per year was steady between 1990 and 2018 (supplementary selection), before sharply rising after 2018 (primary selection for more recent articles) (Figure 2). Articles with both AutoML and SDSS topics were quite recent and appeared only after 2020. Main keywords were focused on the topics of data, spatial/planning systems, machine learning, and models/analysis (Figure 3). Notable articles based on the number of citations are shown in Figure 4 and Figure 5, where citation data were retrieved from the OpenCitations Corpus on 25 September 2022 [23]. These articles, grouped by topic and source, had a much larger number of citations than other articles in the same group. Notable primary SDSS articles had over 15 citations [4,19,24,25]. Notable primary AutoML articles had over 25 citations [12,26,27]. Notable supplementary SDSS articles had over 500 citations [28,29]. Notable supplementary AutoML articles had over 1000 citations [30,31,32]. Notable supplementary articles with both AutoML and SDSS topics had far fewer citations due to recency and only had over 5 citations [33,34,35].

4. Review Results

Recall research question 1: (R1) What problems can both SDSS and AutoML solve according to recent research? This section addresses research question 1 by reviewing and summarizing the selected literature from Section 3 through a detailed overview of SDSS, AutoML, and their associated problems, methods, applications, and approaches in recent research.

4.1. Spatial Decision Support Systems (SDSS)

The term SDSS has been used since 1985 to describe software designed to support decision making by enabling users to analyze structured or semi-structured spatial problems for potential solutions [65]. Modern SDSS shifted from solution-centric software to human-centric frameworks which incorporated features and ideas from Geographic Information Systems (GIS) [66] and Planning Support Systems (PSS) [67]. SDSS are frameworks that incorporate a collection of tools designed to inform decision making involving spatially related problems, generally comprising three components [4,18,19,24,36]: (1) spatial data, (2) spatial information, and (3) spatial knowledge.
The spatial-data component manages and processes data as input for the spatial-information component, which transforms the data into information (e.g., modelling, visualization) by organizing and presenting it for decision-making needs [21,68,69]. Approaches used in the spatial-information component include multiple criteria decision analysis (MCDA) [46], hot spot analysis [70], spatial regression [71], cellular automata (CA) [72], agent-based modelling (ABM) [47], and particle swarm optimization (PSO) [48]. This information acts as input for the knowledge component, which enables users to interact with and explore the information to produce knowledge [20,37,38,39,73]. Knowledge is used to support decisions or to help improve the data or information components [4,24]. Approaches for knowledge generation include participatory planning [19], citizen science [73], and geocollaboration [74]. Although the spatial data and information components focus on spatial data, they may also include supplementary non-spatial data. Each SDSS component contains several subcomponents representing more specific component features and functionality. General SDSS components, subcomponents, and their interactions are seen in Figure 6. The spatial-data component contains data-related sub-components (e.g., data collection, storage, access, documentation, processing, etc.) which serve as input for the spatial-information component. The spatial-information component contains several sub-components (e.g., monitoring, modelling, visualization, reporting, etc.) which transform spatial data into information (e.g., meaningful metrics during monitoring, plots/maps for visualization, automated textual summaries for reporting, etc.) which can be analyzed, interpreted, and, finally, transformed further into input for the knowledge component. The knowledge component contains action-related sub-components (e.g., communication, collaboration, exploration, etc.) which result in an action to change either the spatial data or spatial information components, and/or an action to implement or support a decision. These three components work together to produce actionable insights from the combination of data, information, and knowledge provided by SDSS.
Despite many studies in SDSS [4], challenges exist involving low user adoption [5,39,40] (e.g., awareness across disciplines, lack of practitioner acceptance/trust, expensive resources/training), evidence of usefulness [18,38] (e.g., proving practical utility/success for practitioners, added value vs. resources needed), adaptation [21] (e.g., balance between domain specificity and generalizability, application to similar contexts/problems), collaboration [36] (e.g., communication between non-technical and technical actors, translation of decision-making needs to models/tools), and interpretability [6] (e.g., excessive/complex information for non-technical users, transparency of processes/inputs/outputs). Discussion around the gaps between the research and practice of SDSS has remained an important topic, with many studies encouraging early involvement of, and collaboration between, decision makers, stakeholders, and the community [5,36,41]. Over time, SDSS research, largely focused on applications, case studies, and reviews, started to increase after 2004 and remained steady between 2010 and 2020, with a recent growth in studies related to urban science/analytics, smart cities/urban planning, and digital twins [75].

4.2. Automated Machine Learning (AutoML)

AutoML can be described as the automation of machine learning, a combination of two terms: (1) automation (Auto), to independently act, function, or operate without human intervention [76] and (2) machine learning (ML), a field of artificial intelligence (AI) focused on computer algorithms that can improve through experience [77]. Current AutoML approaches commonly involve optimizing components in the ML process (e.g., extraction/creation of features, tuning/creation of models) given constraints (e.g., reaching desired performances or time limits) [13]. These optimization approaches use metrics (e.g., accuracy, error) which determine the quality of the output components (e.g., features, models). Models include linear/logistic regression [78,79], naïve Bayes (NB) [80], decision trees (DT) [49], random forests [30], k-means clustering [32], support vector machines (SVM) [31], neural networks (NN) [81], and genetic algorithms (GA) [50]. A generic AutoML approach is seen in Figure 7, where two major processes (feature optimization and model optimization) are common amongst different AutoML approaches seen in the literature. The feature-optimization process converts data into a set of informative features which can be used to model a particular outcome, where these features are optimal based on a metric which measures the quality of each feature (e.g., how informative or varying each feature is). The model-optimization process then takes the optimized features and produces high-performing models based on the available models and associated model/optimization algorithm parameters provided. Model performance is measured by a metric of model quality which guides the model-optimization algorithm (e.g., accuracy or error between training and testing data).
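To make the two generic processes concrete, the following minimal Python sketch chains a feature-optimization step with a model-optimization step using scikit-learn, guided by an accuracy metric under a simple search budget. The synthetic data, candidate model, parameter ranges, and budget are illustrative assumptions, not a prescribed AutoML pipeline from the reviewed studies.

```python
# Minimal sketch of the generic AutoML loop: feature optimization (SelectKBest)
# followed by model optimization (randomized hyperparameter search), both
# guided by a quality metric (accuracy) under a simple budget (n_iter).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipeline = Pipeline([
    ("features", SelectKBest(score_func=f_classif)),    # feature optimization
    ("model", RandomForestClassifier(random_state=0)),  # model to be tuned
])

search_space = {                                         # illustrative search space
    "features__k": [5, 10, 15, 20],
    "model__n_estimators": [50, 100, 200],
    "model__max_depth": [None, 5, 10],
}

search = RandomizedSearchCV(pipeline, search_space, n_iter=20,
                            scoring="accuracy", cv=5, random_state=0)
search.fit(X_train, y_train)
print("Best configuration:", search.best_params_)
print("Held-out accuracy:", search.best_estimator_.score(X_test, y_test))
```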
Although AutoML has made ML more accessible to non-experts [7,82], there are issues in data dependency [12,26] (e.g., low data quality, unavailable data, data misuse), time and efficiency [17,83] (e.g., performance vs. acceptable running times, dataset sizes, search-space comprehensiveness), updates/reusability [13,42,43,84] (e.g., updating existing models with new data, performance consistency, reproducible solutions), and interpretability [12,85,86] (e.g., why models perform better/worse or take certain actions). Closing the gap between domain experts/practitioners and ML specialists has recently been a topic of interest, as ML processes are increasingly automated and applied to solve practical problems in industry [44]. Much of AutoML research has focused on supervised learning, but recent research has diverged to tackle a larger range of ML problems, such as unsupervised learning, time-series forecasting, and anomaly detection [87,88].

4.3. Spatial Problems in SDSS and ML

Spatial problems, commonly studied in SDSS, were not prevalent in AutoML research before 2020; most AutoML studies focused on general problems, such as prediction and optimization, without considering spatial effects. However, ML applications to spatial problems are more common [89,90], and have recently been integrated into SDSS [21,91]. This section identifies spatial problems that have been studied with either SDSS or ML approaches to supplement the much smaller number of studies focused on both SDSS and AutoML. Since AutoML automates ML processes, it is relevant to review studies that use ML to solve spatial problems. A summary of spatial problems in SDSS and the reviewed applications and approaches is shown in Table 1, and examined in more detail in Section 4.3.1 to Section 4.3.5.

4.3.1. Spatial Estimation

Spatial estimation problems involve the calculation of unknown values at different locations, which encompasses spatial interpolation [92], prediction [93], and overlay [94]. These problems are solved to create surfaces from samples (e.g., kriging [29]), predict future values at different locations, and calculate values based on location. Recent examples include disease risk calculation [95], disaster risk prediction [96], and land use indicator creation [97] using MCDA approaches, spatial regression (e.g., geographically weighted regression (GWR) [98]), ML (e.g., SVM, RF, NN), and geographically weighted ML (GW-ML) (e.g., geographically weighted NN and RF) [99,100]. Spatial estimation solutions are often evaluated with error (e.g., root mean square error (RMSE), true negatives, sum of square error (SSE)) and accuracy (e.g., F1 score, area under the ROC curve (AUC), sensitivity, specificity) metrics which compare estimated values to true values from real-world data [51]. Challenges in spatial estimation problems involve the need for larger amounts of ground-truth data, and consideration of additional factors/variables/scales [93].
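The sketch below illustrates a basic spatial estimation workflow of the kind described above: inverse-distance-weighted interpolation of sampled values, evaluated against held-out ground truth with RMSE. The synthetic surface, sample counts, and power parameter are assumptions for illustration only.

```python
# Minimal sketch: inverse-distance-weighted (IDW) spatial interpolation
# evaluated with RMSE against held-out ground-truth samples.
import numpy as np

rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(200, 2))                       # sample locations (x, y)
values = np.sin(coords[:, 0] / 15) + 0.1 * rng.normal(size=200)   # synthetic measured surface

train_xy, test_xy = coords[:150], coords[150:]
z_train, z_test = values[:150], values[150:]

def idw_predict(known_xy, known_z, query_xy, power=2.0):
    """Estimate values at query locations as distance-weighted averages of known samples."""
    d = np.linalg.norm(query_xy[:, None, :] - known_xy[None, :, :], axis=2)
    w = 1.0 / np.maximum(d, 1e-9) ** power
    return (w * known_z).sum(axis=1) / w.sum(axis=1)

z_hat = idw_predict(train_xy, z_train, test_xy)
rmse = np.sqrt(np.mean((z_hat - z_test) ** 2))
print(f"RMSE at held-out locations: {rmse:.3f}")
```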

4.3.2. Spatial Optimization

Spatial optimization involves spatial placement [101] and routing [102] of entities. Solving spatial optimization problems helps to determine the more optimal placement of important facilities and infrastructure (e.g., site-selection or facility-location problems [52]), and efficient transportation paths (e.g., vehicle routing [103] and travelling salesman problems [104]). Recent examples include hospital facility selection [105], energy infrastructure placement [106], delivery routing [107], and automatic traffic control [108] using GA, MCDA, and PSO. In addition to error and accuracy metrics, spatial optimization solutions are evaluated with multicriterion [109] (e.g., weighted sums, sensitivity analysis) and multi-objective [53] (e.g., spread, hypervolume, convergence) metrics which consider important indicators from data, expert input, and search-space comprehensiveness. Challenges in spatial optimization problems involve temporal effects, multiple objectives, and computing efficiency [52,110].
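As a small illustration of the facility-location problems mentioned above, the following sketch uses a greedy heuristic to choose k candidate sites that minimize total demand-weighted distance (a p-median-style objective). The demand points, candidate sites, and greedy strategy are illustrative assumptions rather than a recommended solver such as GA or PSO.

```python
# Minimal sketch: greedy heuristic for a p-median-style site-selection problem,
# minimizing total demand-weighted distance to the nearest chosen facility.
import numpy as np

rng = np.random.default_rng(1)
demand_xy = rng.uniform(0, 50, size=(300, 2))   # demand point locations
demand_w = rng.uniform(1, 10, size=300)         # demand weights
candidates = rng.uniform(0, 50, size=(25, 2))   # candidate facility sites
k = 3                                           # number of facilities to place

dist = np.linalg.norm(demand_xy[:, None, :] - candidates[None, :, :], axis=2)

chosen = []
best_d = np.full(len(demand_xy), np.inf)        # distance to nearest chosen site
for _ in range(k):
    # Pick the unchosen candidate that most reduces total weighted distance.
    costs = np.array([(demand_w * np.minimum(best_d, dist[:, j])).sum()
                      for j in range(len(candidates))])
    costs[chosen] = np.inf                      # do not reselect a site
    j_best = int(np.argmin(costs))
    chosen.append(j_best)
    best_d = np.minimum(best_d, dist[:, j_best])

print("Selected candidate sites:", chosen)
print("Total weighted distance:", round(float((demand_w * best_d).sum()), 1))
```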

4.3.3. Spatial Clustering

Spatial clustering involves grouping entities in space (e.g., local indicators of spatial association (LISA) [111], hot-spot analysis, density-based spatial clustering of applications with noise (DBSCAN) [112]) and time (e.g., spatiotemporal clustering [113], SaTScan [114], K-means [32]), where entities inside groups are similar and entities outside different groups are dissimilar [115]. These groups are used to identify interesting areas (e.g., zones with high crime [116] or natural resource potential [117]) and timespans (e.g., areas of disease transmission during specific times [118]). More recent applications involve clustering high-volume and -velocity spatiotemporal data (e.g., social media, crowdsourcing) [119,120]. Spatial clustering solutions are evaluated with statistical [121] (e.g., likelihood ratio, autocorrelation, significance) and similarity [54] (e.g., Euclidean distance, Pearson correlation) metrics. Spatial-clustering challenges include irregularly shaped clusters, high dimensional data, spatial relations/weights selection, resolution, object interactions, and visual vs. quantitative evaluation [113,122,123].
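The sketch below shows density-based spatial clustering on point coordinates with scikit-learn's DBSCAN, of the kind used to detect hot spots from incident locations. The synthetic hotspot points and the eps/min_samples settings are assumptions for illustration.

```python
# Minimal sketch: DBSCAN on synthetic point events (e.g., incident locations)
# to find dense spatial clusters; points labelled -1 are treated as noise.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(2)
hotspot_a = rng.normal(loc=[10, 10], scale=0.5, size=(60, 2))
hotspot_b = rng.normal(loc=[20, 15], scale=0.5, size=(40, 2))
background = rng.uniform(0, 30, size=(50, 2))
points = np.vstack([hotspot_a, hotspot_b, background])

labels = DBSCAN(eps=1.0, min_samples=10).fit_predict(points)

n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("Clusters found:", n_clusters)
print("Noise points:", int((labels == -1).sum()))
```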

4.3.4. Spatial Simulation

Spatial simulation refers to imitations of real-world or hypothetical phenomena in space and time [124]. Spatial simulation enables analyses in cases where data is difficult to obtain (e.g., finer spatial/temporal resolutions, hypothetical/future phenomena) [47]. Approaches involve domain-specific models (e.g., crop-yield models [125], hydrodynamic-fluid models [126]), CA, and ABM. Recent examples include reinforcement learning (RL), ABM, and CA for simulating traffic-light control [127], wildfire spread [128], and sustainable urban growth [129]. Spatial simulation solutions are evaluated with domain-specific (e.g., total traffic delay [127], total crop yield [125], landscape composition/patch sizes [130]) metrics which are used to guide empirical observations. Challenges in spatial simulation include model validation, excessive complexity, disorganization, reproducibility, computing resources, and lack of theoretical basis [131].
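To give a feel for the CA approaches mentioned above, the following sketch runs a toy cellular automaton for hypothetical urban growth: a cell becomes urban, with some probability, when enough of its neighbours are already urban. The grid size, neighbourhood rule, and growth probability are illustrative assumptions, not a calibrated urban-growth model.

```python
# Minimal sketch: a cellular automaton for hypothetical urban growth.
# A non-urban cell becomes urban with probability 0.3 when at least 3 of its
# 8 Moore neighbours are urban (an illustrative rule only).
import numpy as np

rng = np.random.default_rng(3)
grid = np.zeros((50, 50), dtype=int)
grid[24:26, 24:26] = 1                      # seed urban core

def count_urban_neighbours(g):
    """Count urban cells in the 8-cell Moore neighbourhood of every cell."""
    n = np.zeros_like(g)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            n += np.roll(np.roll(g, dy, axis=0), dx, axis=1)
    return n

for step in range(20):
    neighbours = count_urban_neighbours(grid)
    can_grow = (grid == 0) & (neighbours >= 3)
    grow = can_grow & (rng.random(grid.shape) < 0.3)   # stochastic growth
    grid = grid | grow.astype(int)

print("Urban cells after 20 steps:", int(grid.sum()))
```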

4.3.5. Spatial Insight

Spatial-insight problems focus on interpreting and visualizing spatial data and model outputs, typically involving spatial regression [132], interactive interfaces [133], and maps [134]. Incorporating spatial-insight features in SDSS helps data and models produce knowledge useful for decision making (e.g., graphical interfaces to interactively view and manipulate spatial data and models [135], web maps displaying spatial data or model results [136], coefficients representing variable effects in models [55]). Recent examples include spatial regression for identifying factors for reducing pollution [137], web GIS to interactively generate watershed models [138], and interactive visualization tools for exploring and analyzing AutoML pipelines [139]. Spatial insight solutions are evaluated with variable-based (e.g., feature importance [56], coefficients [28]), interpretability [57] (e.g., cognitive indicators, explanation indicators), and empirical approaches [140] (e.g., usability testing, controlled user surveys, user insight observation) to examine the effectiveness of interactive/visualization tools for producing useful insight. Spatial-insight challenges involve measuring usefulness, selecting/justifying appropriate presentation methods, handling big data/complexity, and personalization vs. generalizability [58,140,141].
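As one small example of the variable-based evaluation mentioned above, the sketch below computes permutation importance for a fitted model with scikit-learn, ranking which predictors most affect held-out performance. The synthetic regression data and model choice are assumptions for illustration.

```python
# Minimal sketch: variable-based insight via permutation importance, ranking
# which predictors most affect a fitted model's held-out performance.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=400, n_features=6, n_informative=3, random_state=4)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=4)

model = RandomForestRegressor(random_state=4).fit(X_train, y_train)
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=4)

for i in np.argsort(result.importances_mean)[::-1]:
    print(f"feature_{i}: importance = {result.importances_mean[i]:.3f}")
```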

4.4. SDSS with AutoML

Research related to SDSS with AutoML began to emerge only recently, after 2020. AutoML methods have been applied to a variety of applications in the areas of agriculture (e.g., crop prediction [59,142,143], crop classification [60]), environmental science (e.g., environmental-impact assessment [144], waterlogging-risk estimation [145], water-storage estimation [33], water-potential mapping [146], meteorological forecasting [147], ocean behaviour prediction [148]), geology (e.g., oil-well placement [149], soil-roughness estimation [35], soil-moisture estimation [34], landslide risk estimation [61,150]), transportation (e.g., road health inspection [151]), and public health (e.g., violence rate prediction [62]). The majority of studies fuse multiple geospatial data sources, with satellite imagery, sensors, and surveys being the most common and sociodemographic data the least common. A summary of SDSS with AutoML approaches and applications is shown in Table 2.
The reviewed AutoML methods were grouped into four generalized approaches: (1) ensembling, (2) Bayesian, (3) neural nets, and (4) evolutionary. The most prominent AutoML approach was ensembling, which involves the combination of multiple algorithms to achieve better performance than individual algorithms [152]. This is followed by Bayesian approaches, which are based on the Bayes theorem and utilize past observations to guide future predictions [83,153]. Neural-net approaches involve the optimization of neural network architecture to build deep-learning models which achieve high performance [154]. Lastly, evolutionary approaches use algorithms which mimic natural-selection-based techniques on a population of models, such as mutation, reproduction, and selection, to find optimal evolved models that achieve better performance [155].
Reviewed AutoML methods and software include Auto-Sklearn, Tree-based Pipeline Optimization Tool (TPOT), H2O, AutoGluon, Neural Architecture Search (NAS), and AlphaD3M. Auto-Sklearn uses a combination of Bayesian optimization, meta-learning, and ensemble construction to perform hyperparameter tuning and algorithm selection [156]. TPOT uses genetic programming to optimize and generate tree-based ML pipelines [157]. H2O uses random search and stacked ensembles over a diversity of candidate models to produce a final model, which, in some situations, is better than Bayesian optimization or genetic-algorithm-based approaches [158]. AutoGluon distills ensembled models into individual models, using a data augmentation strategy based on Gibbs sampling, to produce final models which are faster and, in some cases, more accurate than training individual models by themselves or ensembled models [159]. NAS automates the construction of neural network architectures to build deep-learning models with different search strategies, search spaces, and performance-estimation techniques, which can be applied to Bayesian Nets and long short-term memory (LSTM) networks [154]. AlphaD3M uses meta reinforcement learning with self-play by modelling meta-data, tasks, and ML pipelines as states in a deep-learning model, which allows AlphaD3M to be faster than AutoML approaches such as Auto-Sklearn and TPOT, while remaining explainable through transparent ML-pipeline edit operations [160].
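For reference, the snippet below sketches how one of the reviewed tools, TPOT, is typically invoked on a generic classification dataset. It follows TPOT's publicly documented interface (the tpot package must be installed), and the generation/population settings and synthetic data are illustrative assumptions rather than configurations taken from the reviewed studies.

```python
# Minimal sketch of invoking TPOT (an evolutionary AutoML tool) on a generic
# classification task; all settings are illustrative only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

X, y = make_classification(n_samples=500, n_features=15, random_state=5)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=5)

automl = TPOTClassifier(generations=5, population_size=20,
                        random_state=5, verbosity=2)
automl.fit(X_train, y_train)                  # evolutionary pipeline search
print("Held-out score:", automl.score(X_test, y_test))
automl.export("best_pipeline.py")             # export the winning pipeline as code
```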

5. Discussion

Recall the two research questions: (R2) How can AutoML be integrated into SDSS? and (R3) What are the challenges and opportunities of SDSS with AutoML to improve user adoption? Section 5.1 answers research question 2 by applying concepts in the reviewed literature from Section 4 to form a framework for SDSS with AutoML, and discussing key considerations when applying the framework. Section 5.2 and Section 5.3 answer research question 3 by identifying and discussing implementation challenges and research opportunities in SDSS with AutoML related to SDSS user adoption.

5.1. SDSS with AutoML Framework

Recent research in SDSS has incorporated ML algorithms and other models, which can be automated by AutoML, to solve spatial problems for transforming spatial data into spatial information. Applying the generic AutoML concept in Figure 7 to the SDSS components in Figure 6, AutoML can be integrated into SDSS by framing these spatial problems as optimization problems. Referring to the SDSS components in Figure 6, AutoML automatically processes spatial data into spatial information, having a role between these two SDSS components, but not the spatial knowledge component, as the latter depends on human-driven processes (e.g., communication, exploration, collaboration). Given a spatial problem x, potential solutions S, and a metric Q (measuring how well solutions solve the spatial problem), AutoML can automatically approximate a near-optimal solution S̃ for x based on metric Q within pre-defined constraints (e.g., time limits, desired performance). For example, a spatial estimation problem x could be to classify whether pixels at different coordinates are urban or rural land use. Metric Q can be a measure of how many pixels are predicted correctly based on ground-truth samples of urban/rural land-use pixels, while the potential solutions S can be a set of appropriate models (e.g., kriging, DT, NN) that can predict urban/rural land-use pixels. The near-optimal solution S̃ is the most accurate model from S based on Q, found using an optimization algorithm (e.g., GA, PSO) under constraints (e.g., maximum iterations/runtime). In MCDA, the potential solutions S can also be different weighting schemes. Referring to the AutoML components in Figure 7, a framework to integrate AutoML into SDSS is developed as seen in Figure 8, where AutoML automatically processes spatial data into spatial information by solving various spatial problems. In general, this framework requires three key considerations, which may also be applicable to geographic problem solving on a broader scale (a minimal code sketch illustrating this framing follows the list below):
  • Spatial problems: what are the spatial problem(s) to solve given the context and actors in decision making?
  • Metrics: what metrics are appropriate for evaluating and measuring the defined spatial problem(s)?
  • Potential solutions: with the given spatial problem(s) and metric(s), what are the potential solutions to solve the spatial problem(s)?
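The following minimal Python sketch illustrates the framing described above for a spatial estimation problem x (classifying pixels as urban or rural): the potential solutions S are a small candidate set of models, the metric Q is cross-validated accuracy against ground-truth labels, and the approximate solution S̃ is the best-scoring candidate under a simple evaluation budget. The synthetic stand-in for pixel features and the particular candidate models are assumptions for illustration.

```python
# Minimal sketch of the framework framing: spatial problem x (classify pixels
# as urban/rural), potential solutions S (candidate models), metric Q
# (accuracy vs. ground truth), and approximate solution S_tilde.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for pixel features (e.g., spectral bands) and
# ground-truth urban (1) / rural (0) labels.
X, y = make_classification(n_samples=600, n_features=8, random_state=6)

S = {  # potential solutions
    "decision_tree": DecisionTreeClassifier(random_state=6),
    "k_nearest_neighbours": KNeighborsClassifier(n_neighbors=5),
    "neural_network": MLPClassifier(max_iter=500, random_state=6),
}

def Q(model):
    """Metric: mean cross-validated accuracy against the ground-truth labels."""
    return cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()

scores = {name: Q(model) for name, model in S.items()}   # constrained search over S
S_tilde = max(scores, key=scores.get)                    # near-optimal solution
print("Scores:", scores)
print("Selected solution S_tilde:", S_tilde)
```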

5.1.1. Key Consideration 1: Spatial Problems

Given the context and actors in decision making, spatial problems need to be defined to reflect decisions being evaluated [161]. Initially, determining the types of decisions being evaluated may help in defining the behaviour of decisions. The main types of decisions according to [162] are:
  • Independent: decisions made by a decision maker with full responsibility and authority;
  • Sequential interdependent: decisions made partially by a decision maker and partially by another party;
  • Pooled interdependent: decisions made from negotiation and interaction among decision makers.
Then, three main steps related to the process of decision making as described by [163] can further aid in defining spatial problems:
  • Intelligence: examination of spatial data to identify spatial problems that require decisions and have the opportunity for change;
  • Design: determining possible and alternative decisions and developing approaches to evaluate and understand the decisions;
  • Choice: selecting from the range of possible and alternative decisions after evaluating and understanding each decision.
After considering the type of decisions, the possible decisions, and approaches to selecting/evaluating the possible decisions, the spatial problem may be better defined as one or more (but not limited to) of the following general spatial problems as reviewed in Section 4.3:
  • Spatial estimation: calculation of unknown values in space (e.g., prediction, overlay);
  • Spatial optimization: optimization of entities in space (e.g., placement, routing);
  • Spatial clustering: organization of entities in space (e.g., grouping, categorization, zoning);
  • Spatial simulation: simulation of phenomena in space (e.g., physics, theoretical simulations);
  • Spatial insight: interpretation and exploration of phenomena and entities in space (e.g., interactive maps, visualizations, plots).
Decisions can be simple or complex depending on the change desired from the decision and the evaluation approach designed. A simple decision may only require defining a single spatial problem. For example, the identification of crime hotspots for police patrols can be defined as a spatial-clustering problem. A more complex single decision may require defining one or a combination of different spatial problems. For example, analyzing the effects of health interventions may require the identification of intervention areas (a spatial-clustering problem) and simulating the effects of each alternative intervention on the intervention areas (a spatial-simulation problem). The considerations in this section are meant to be a starting point to help structure decisions as spatial problems, not a definitive guide, since every context, actor, and decision-making process can vary drastically and no single solution fits every situation [161].

5.1.2. Key Consideration 2: Metrics

After defining the spatial problem to solve, the metrics used to measure and evaluate potential solutions to each spatial problem need to be determined. It is important to consider the task needed to solve the defined spatial problems. For ML and statistics, the following tasks are common amongst studies [51,164,165]:
  • Regression: estimation or prediction of continuous target values given other factors (e.g., calculating landslide risk, predicting number of traffic collisions);
  • Classification: identification of discrete target values (groups or categories) given other factors (e.g., predicting landuse types, identifying building types);
  • Clustering: organization of entities into groups or categories based on characteristics (e.g., identifying crime zones, disease areas).
Considering the tasks for solving spatial problems aids in identifying the metrics required to evaluate each spatial problem. Each metric has a particular purpose and is suited to measuring the performance of particular tasks. For example, RMSE and correlation-coefficient metrics are used to evaluate regression tasks, while accuracy and F1 scores are used to evaluate classification tasks. As an example, metrics for the tasks mentioned above are provided in Table 3. Ref. [51] presents a more comprehensive collection of regression and classification metrics, while [54,166,167] provide more detailed overviews of different clustering metrics.
The choice of metrics should fit not only the defined spatial problem, but also the behaviour and characteristics of the spatial data used. It is important to note that each metric has its own advantages and caveats [51,53,166]. For example, accuracy metrics are biased when the data contain unbalanced classes (e.g., 90% of the data is class A and only 10% is class B): a classification model can then appear high-performing simply by predicting the dominant class for most of its outputs. In this case, the F1 score is a more appropriate metric to account for class imbalances.
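The short sketch below demonstrates the class-imbalance caveat described above: a trivial model that always predicts the majority class scores high accuracy but a zero F1 score on a 90/10 label split. The synthetic labels are an assumption for illustration.

```python
# Minimal sketch: why accuracy misleads on imbalanced classes while F1 does not.
# A trivial "always predict the majority class" model on a 90/10 label split.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(7)
y_true = (rng.random(1000) < 0.10).astype(int)   # ~10% positive (class B)
y_pred = np.zeros_like(y_true)                   # always predicts class A

print("Accuracy:", accuracy_score(y_true, y_pred))                        # ~0.90, looks strong
print("F1 (positive class):", f1_score(y_true, y_pred, zero_division=0))  # 0.0, reveals the failure
```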

5.1.3. Key Consideration 3: Potential Solutions

When the spatial problems and associated metrics are defined, potential solutions to match these problems and metrics can be determined. The spatial problems and metrics create a structured goal for AutoML approaches to reach, where the potential solutions are often possible models or algorithms for AutoML methods to select from. The potential solutions need to accept input spatial data relevant to the defined spatial problems, while allowing the chosen metrics to measure outputs in a manner which is comparable across different potential solutions. Thus, a few considerations for potential solutions include [121,168,169]:
  • Data size: how large or small the data are;
  • Interpretability: whether the potential solutions need to be interpreted or simply produce outputs to be used (e.g., identifying important variables vs. prediction performance);
  • Resource constraints: time, computation, and expertise constraints (e.g., runtime of models, training to interpret results);
  • Update frequency: how often potential solutions need to be re-evaluated (e.g., new input data, new model/algorithm adjustments).
Similar to choosing metrics, the potential solutions generally have their own caveats and advantages [170]. For example, the performance of neural networks is reliant on the design of the neural architecture [154], while unsupervised models tend to perform better on larger datasets [169]. However, the difference between determining potential solutions and metrics is that, given the appropriate metric and adequate computing power, a comprehensive search space can be defined and the choice of potential solutions becomes more flexible [171]. In more complex cases, potential solutions may also be combined to form new potential solutions [172].

5.2. Implementation Challenges

This section discusses the implementation challenges of SDSS with AutoML as they relate to user adoption and its barriers. Section 5.2.1 discusses the issue of data-quality dependency, which creates a technical barrier to user adoption due to the need for expert model and data-management knowledge. Section 5.2.2 discusses model interpretability, where model outputs need to be explained to non-technical users (e.g., decision makers, the public), as this transparency largely influences SDSS user trust, a major user-adoption barrier. Most importantly, the usefulness of SDSS, and the evidence needed to prove it, is a major challenge common to all SDSS and may strongly affect a user's need for an SDSS. This is discussed in Section 5.2.3.

5.2.1. Data Quality

SDSS and AutoML rely on data for producing models to generate useful information, which creates a technical barrier for users due to the need for domain expertise in data management and appropriate modelling. Data quality is an important factor in modelling, as it determines whether the data are appropriate for SDSS purposes or AutoML modelling. Data contains noise [12] (e.g., errors, incompleteness), varying levels of detail [173] (e.g., aggregate, scale), and risks [174] (e.g., misuse, ethics). As about 60% of the time is spent on data preparation [8], the challenge is to distribute resources to ensure that the data used is diverse (e.g., inclusive, transparent), representative (e.g., adequate coverage and detail), and reliable (e.g., minimal errors) for the intended purposes over time [26].

5.2.2. Model Interpretability

Predictions or actions from models are eventually explained to non-technical users in decision making (e.g., clients, society), which improves trust, transparency, and fairness [86]. Many reviewed AutoML studies measure model performance, but often do not focus on interpretability, that is, why models perform better/worse or take certain actions [12,26]. Models in reviewed SDSS studies consider interpretability, but are often too complex for decision makers to use or communicate to stakeholders [24]. Improving interpretability leads to higher user adoption, as better communication builds trust and produces useful knowledge. A further challenge is to balance available resources, model performance, and model interpretability for particular spatial problems.

5.2.3. Evidence of Usefulness

Evidence of usefulness, which influences user adoption, is often not a focus of the reviewed SDSS and AutoML studies. Without measures/examinations of usefulness, it is difficult to prove the added value of SDSS or AutoML in practice. This leads to issues such as difficulty differentiating SDSS implementations [6], low SDSS user adoption [18], inconsistent AutoML performance [13], and non-reusable AutoML models [84]. SDSS are also used in different application fields and domains (e.g., agriculture, forestry, environmental management), which have a variety of requirements and users, and thus need adjustments and special attention beyond general SDSS [4]. An important challenge is to design methods that measure practical success by examining the utility of SDSS/AutoML implementations for real-world decisions across different domains and application fields, evaluating not only performance, but also how SDSS with AutoML directly affects decision making.

5.3. Research Opportunities

This section discusses research opportunities to address the challenges in Section 5.2, where each opportunity covers one or more of the challenges (Figure 9). Section 5.3.1 discusses spatial AutoML, which was often not considered in the reviewed AutoML literature, but can reduce technical barriers to spatial modelling for SDSS users. The need for resource-aware approaches is identified in Section 5.3.2 to widen the variety of users in different resource-constrained environments. Section 5.3.3 addresses the problem of reusable and comparable systems, while balancing generalizability and specificity, to allow users to easily adopt different SDSS based on standards and collaborative communities across different domains and fields of practice. Lastly, a major barrier to user adoption involves the translation of complex inputs/outputs in SDSS, which can be improved with research in human-centered system design, which is discussed in Section 5.3.4.

5.3.1. Spatial AutoML

Many common AutoML approaches in the reviewed studies do not consider spatial data patterns when dealing with spatial problems. Although AutoML can automate non-spatial models, a majority of SDSS use spatial models and these present barriers in technical knowledge for users. If spatial patterns exist in the data (e.g., clustering and dispersion in space), then the assumption that observations are independent of each other is violated and the data has spatial dependency [175]. Spatial strategies (e.g., spatial sampling [63], localized models [176]) can be integrated into AutoML approaches for potential performance or efficiency gains, while spatial parameter problems (e.g., selecting neighbours or distance bands [64]) can fit in many AutoML optimization methods. One research opportunity involves incorporating spatially explicit approaches in AutoML for SDSS to improve modelling performance/efficiency and reduce arbitrary parameter selection without strongly affecting interpretability [25].
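One such spatially explicit strategy is sketched below: spatially blocked cross-validation, where coordinates are clustered into regions and whole regions are held out so that spatial dependency does not leak between training and test folds. The clustering of coordinates into blocks, the fold counts, and the synthetic spatially structured target are illustrative assumptions, not a method prescribed by the cited studies.

```python
# Minimal sketch: spatially blocked cross-validation. Coordinates are grouped
# into spatial blocks (here via k-means on x/y), and whole blocks are held out
# so spatially dependent neighbours do not leak across the train/test split.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(8)
coords = rng.uniform(0, 100, size=(500, 2))                     # x, y locations
X = np.hstack([coords, rng.normal(size=(500, 3))])              # coordinates + covariates
y = np.sin(coords[:, 0] / 10) + 0.1 * rng.normal(size=500)      # spatially structured target

blocks = KMeans(n_clusters=5, random_state=8, n_init=10).fit_predict(coords)

model = RandomForestRegressor(random_state=8)
scores = cross_val_score(model, X, y, cv=GroupKFold(n_splits=5),
                         groups=blocks, scoring="neg_root_mean_squared_error")
print("Blocked-CV RMSE per fold:", -scores)
```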

5.3.2. Resource-Aware Approaches

As SDSS and AutoML use various modelling and visualization approaches, the resources available (e.g., amount/quality of data, server processing power, data/domain experts) determine whether an implementation is feasible and practical for its intended purposes. The variety of SDSS users means that their environments differ, and resource constraints (e.g., computing power, cloud infrastructure, modelling knowledge) are a major restriction on adopting SDSS. In AutoML, the design of the search space (e.g., range of models/parameters) and optimization stopping criteria (e.g., iteration limits, reaching desired performance) are dependent on time tolerance, data available, and computing power [17]. Similarly, SDSS are dependent on hardware and software resources, but also involve human resources (e.g., software developers, decision makers, consultants) who communicate and collaborate to design system features and purposes [36]. Another opportunity is to develop resource-aware implementation approaches for SDSS with AutoML by balancing the available resources and the desired results (e.g., data quality, model interpretability/performance).

5.3.3. Collaborative and Connected Systems

Many SDSS are difficult to reuse due to being developed for specific domain purposes [18], while AutoML models are difficult to reproduce due to varying search spaces and optimization approaches [12]. Reusability/reproducibility problems hinder collaboration and communication, as SDSS implementations and AutoML models do not follow standards for comparability (e.g., benchmarks, performance, features) and transferability (e.g., reuse for similar problems/different geographic area) [6,27,43,45]. Standardization of SDSS implementations and AutoML models enable SDSS with AutoML to be more easily compared/applied across various spatial problems [4], and shared across studies and organizations [36]. These standards enable different SDSS with AutoML implementations to be connected, which improves data and research transparency, availability, and interoperability (e.g., web platforms [177], programming interfaces [178], open data [69]). A third opportunity involves developing interoperable standards to connect SDSS with AutoML implementations in a network for sharing data/information/knowledge to reduce redundancy and repetition, while improving usability and collaboration between stakeholders, decision makers, and other actors.

5.3.4. Human-Centered System Design

Usability and interpretability are often overlooked in SDSS and AutoML research focusing on performance. When inputs and outputs are complex, barriers to translating information for useful decision-making knowledge hinder non-technical users (e.g., decision makers, stakeholders, policy makers) and lower user adoption [179]. However, a tradeoff between simplicity and accuracy exists, where improving the ease of use and transparency while minimizing complexity is desired [36,180]. SDSS with AutoML also need to balance domain specificity (e.g., custom solutions tailored to a particular spatial problem) and adaptability (e.g., flexible solutions applicable to a variety of spatial problems) [18]. A final opportunity is to strive towards human-centered system design principles and co-design for SDSS with AutoML, where careful considerations are made regarding user and spatial-problem characteristics, complexities, and interactions to enhance user adoption, experience, and practical usefulness.

6. Conclusions

This paper examined recent research for the integration of SDSS and AutoML to answer three questions: (R1) What problems can both SDSS and AutoML solve according to recent research? (R2) How can AutoML be integrated into SDSS? and (R3) What are the challenges and opportunities of SDSS with AutoML to improve user adoption? To answer question (R1), SDSS- and AutoML-related problems from recent literature were organized into five spatial problem categories (estimation, optimization, clustering, simulation, and insight) and summarized according to research application and spatial methods. A general framework for SDSS with AutoML was proposed to answer question (R2) by identifying and connecting general SDSS and AutoML components from selected research papers, where SDSS automatically processes data into information by solving spatial problems. To answer question (R3), challenges were discussed regarding data quality, model interpretability, and evidence of usefulness, while research opportunities were also discussed to address these challenges in relation to the user-adoption issue in SDSS. When implementing SDSS with AutoML, available resources must be distributed to maintain data of adequate quality and quantity for decision-making purposes, while ensuring models and systems are interpretable, comparable, and perform well in practice. One opportunity involves incorporating spatially explicit models, commonly used by SDSS, in AutoML research to help optimize, standardize, and compare models used in SDSS. Other opportunities involve developing standards, approaches, and principles for resource-aware, collaborative/connected, and human-centered systems. These developments support the goal of SDSS with AutoML, which is to aid decision making involving collaboration among various actors and different resource settings. As human-related (e.g., interpretability, usability, usefulness) and technical (e.g., reproducibility, reusability, comparability) issues arise in recent SDSS and AutoML research, integrating SDSS with AutoML incorporates technical aspects of AutoML (e.g., standardized pipelines/metrics) in SDSS research, while also incorporating human-related considerations of SDSS (e.g., solution complexity, scenario evaluation) in AutoML research. SDSS with AutoML not only helps improve SDSS user adoption, but also mutually benefits SDSS and AutoML research by fostering approaches that consider both human-related and technical issues. The requirements, implementation, and user adoption of SDSS differ among fields of study (e.g., environmental science, public health), and future work exploring SDSS with AutoML for each field of study may provide further value for both SDSS and AutoML research.

Author Contributions

Conceptualization, Richard Wen; methodology, Richard Wen, Songnian Li; Investigation, data curation, software, writing—original draft preparation, Richard Wen; supervision, funding acquisition, resources, project administration, Songnian Li; writing—review and editing, Richard Wen, Songnian Li. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the Natural Sciences and Engineering Research Council of Canada, grant number RGPIN-2017-05950.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ABM: Agent-based modelling
AI: Artificial intelligence
AUC: Area under the ROC curve
AutoML: Automated machine learning
CA: Cellular automata
DBSCAN: Density-based spatial clustering of applications with noise
DT: Decision trees
GA: Genetic algorithms
GIS: Geographic information systems
GW-ML: Geographically weighted machine learning
GWR: Geographically weighted regression
LISA: Local indicators of spatial association
LSTM: Long short-term memory
MAE: Mean absolute error
MCDA: Multiple criteria decision analysis
ML: Machine learning
MSE: Mean squared error
NAS: Neural architecture search
NB: Naïve Bayes
NN: Neural networks
PSO: Particle swarm optimization
PSS: Planning support systems
RL: Reinforcement learning
RMSE: Root mean square error
SDSS: Spatial decision support systems
SSE: Sum of square error
SVM: Support vector machines
TPOT: Tree-based pipeline optimization tool

References

  1. Niu, H.; Silva, E.A. Crowdsourced Data Mining for Urban Activity: Review of Data Sources, Applications, and Methods. J. Urban Plan. Dev. 2020, 146, 04020007.
  2. Ruijer, E.; Meijer, A. Open Government Data as an Innovation Process: Lessons from a Living Lab Experiment. Public Perform. Manag. Rev. 2020, 43, 613–635.
  3. Riehle, D. The Innovations of Open Source. Computer 2019, 52, 59–63.
  4. Keenan, P.B.; Jankowski, P. Spatial Decision Support Systems: Three Decades On. Decis. Support Syst. 2019, 116, 64–76.
  5. Geertman, S. PSS: Beyond the Implementation Gap. Transp. Res. Part A Policy Pract. 2017, 104, 70–76.
  6. Jiang, H.; Geertman, S.; Witte, P. Avoiding the Planning Support System Pitfalls? What Smart Governance Can Learn from the Planning Support System Implementation Gap. Environ. Plan. B Urban Anal. City Sci. 2020, 47, 1343–1360.
  7. Yao, Q.; Wang, M.; Chen, Y.; Dai, W.; Li, Y.F.; Tu, W.W.; Yang, Q.; Yu, Y. Taking Human out of Learning Applications: A Survey on Automated Machine Learning. arXiv 2019, arXiv:1810.13306.
  8. Munson, M.A. A Study on the Importance of and Time Spent on Different Modeling Steps. ACM SIGKDD Explor. Newsl. 2012, 13, 65–71.
  9. Google LLC. Cloud AutoML—Custom Machine Learning Models. 2020. Available online: https://cloud.google.com/automl (accessed on 20 September 2020).
  10. Microsoft Corporation. Automated Machine Learning | Microsoft Azure. 2020. Available online: https://azure.microsoft.com/en-ca/services/machine-learning/automatedml/ (accessed on 20 September 2020).
  11. Amazon.com, Inc. Amazon SageMaker. 2020. Available online: https://aws.amazon.com/sagemaker/ (accessed on 20 September 2020).
  12. He, X.; Zhao, K.; Chu, X. AutoML: A Survey of the State-of-the-Art. Knowl. Based Syst. 2021, 212, 106622.
  13. Escalante, H.J. Automated Machine Learning—A Brief Review at the End of the Early Years. In Automated Design of Machine Learning and Search Algorithms; Pillay, N., Qu, R., Eds.; Natural Computing Series; Springer International Publishing: Cham, Switzerland, 2021; pp. 11–28.
  14. ProQuest LLC. ProQuest Summon 2.0 Customer Resources. 2022. Available online: https://support.proquest.com/s/article/ProQuest-Summon-2-0-Customer-Resources?language=en_US (accessed on 21 November 2022).
  15. Budjač, R.; Nikmon, M.; Schreiber, P.; Zahradníková, B.; Janáčová, D. Automated Machine Learning Overview. Ved. Práce Mater. Fak. Slov. Tech. Univ. 2019, 27, 107–112.
  16. Weng, Z. From Conventional Machine Learning to AutoML. J. Physi. Conf. Ser. 2019, 1207, 012015.
  17. Chen, Y.W.; Song, Q.; Hu, X. Techniques for Automated Machine Learning. ACM SIGKDD Explor. Newsl. 2021, 22, 35–50.
  18. Geertman, S.; Stillwell, J. Planning Support Science: Developments and Challenges. Environ. Plan. B Urban Anal. City Sci. 2020, 47, 1326–1342.
  19. Flacke, J.; Shrestha, R.; Aguilar, R. Strengthening Participation Using Interactive Planning Support Systems: A Systematic Review. ISPRS Int. J. Geo-Inf. 2020, 9, 49.
  20. Pan, H.; Geertman, S.; Deal, B. What Does Urban Informatics Add to Planning Support Technology? Environ. Plan. B Urban Anal. City Sci. 2020, 47, 1317–1325.
  21. Lock, O.; Bain, M.; Pettit, C. Towards the Collaborative Development of Machine Learning Techniques in Planning Support Systems—A Sydney Example. Environ. Plan. B Urban Anal. City Sci. 2020, 48, 484–502.
  22. Niazi, M. Do Systematic Literature Reviews Outperform Informal Literature Reviews in the Software Engineering Domain? An Initial Case Study. Arab. J. Sci. Eng. 2015, 40, 845–855.
  23. Peroni, S.; Shotton, D. OpenCitations, an Infrastructure Organization for Open Scholarship. Quant. Sci. Stud. 2020, 1, 428–444.
  24. Pan, H.; Deal, B. Reporting on the Performance and Usability of Planning Support Systems—Towards a Common Understanding. Appl. Spat. Anal. Policy 2020, 13, 137–159.
  25. Du, P.; Bai, X.; Tan, K.; Xue, Z.; Samat, A.; Xia, J.; Li, E.; Su, H.; Liu, W. Advances of Four Machine Learning Methods for Spatial Data Handling: A Review. J. Geovisualization Spat. Anal. 2020, 4, 13.
  26. Waring, J.; Lindvall, C.; Umeton, R. Automated Machine Learning: Review of the State-of-the-Art and Opportunities for Healthcare. Artif. Intell. Med. 2020, 104, 101822.
  27. Zöller, M.A.; Huber, M.F. Benchmark and Survey of Automated Machine Learning Frameworks. J. Artif. Intell. Res. 2021, 70, 409–472.
  28. Taylor, R. Interpretation of the Correlation Coefficient: A Basic Review. J. Diagn. Med. Sonogr. 1990, 6, 35–39.
  29. Oliver, M.A.; Webster, R. Kriging: A Method of Interpolation for Geographical Information Systems. Int. J. Geogr. Inf. Syst. 1990, 4, 313–332.
  30. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
  31. Noble, W.S. What Is a Support Vector Machine? Nat. Biotechnol. 2006, 24, 1565–1567.
  32. Steinley, D. K-Means Clustering: A Half-Century Synthesis. Br. J. Math. Stat. Psychol. 2006, 59, 1–34.
  33. Sun, A.Y.; Scanlon, B.R.; Save, H.; Rateb, A. Reconstruction of GRACE Total Water Storage Through Automated Machine Learning. Water Resour. Res. 2021, 57, e2020WR028666.
  34. Babaeian, E.; Paheding, S.; Siddique, N.; Devabhaktuni, V.K.; Tuller, M. Estimation of Root Zone Soil Moisture from Ground and Remotely Sensed Soil Information with Multisensor Data Fusion and Automated Machine Learning. Remote Sens. Environ. 2021, 260, 112434.
  35. Singh, A.; Kumar, G.; Rai, A.K.; Beg, Z. Machine Learning to Estimate Surface Roughness from Satellite Images. Remote Sens. 2021, 13, 3794.
  36. Schindler, M.; Dionisio, R.; Kingham, S. Challenges of Spatial Decision-Support Tools in Urban Planning: Lessons from New Zealand’s Cities. J. Urban Plan. Dev. 2020, 146, 04020012.
  37. Mutuku, B.; Boerboom, L.; Madureira, A.M. The Role of Planning Support Systems in National Policy Transfer and Policy Translation in Secondary Cities. Int. Plan. Stud. 2019, 24, 293–307.
  38. Erskine, M.A.; Gregg, D.G.; Karimi, J.; Scott, J.E. Individual Decision-Performance Using Spatial Decision Support Systems: A Geospatial Reasoning Ability and Perceived Task-Technology Fit Perspective. Inf. Syst. Front. 2019, 21, 1369–1384.
  39. Punt, E.P.; Geertman, S.C.M.; Afrooz, A.E.; Witte, P.A.; Pettit, C.J. Life Is a Scene and We Are the Actors: Assessing the Usefulness of Planning Support Theatres for Smart City Planning. Comput. Environ. Urban Syst. 2020, 82, 101485.
  40. Page, J.; Mörtberg, U.; Destouni, G.; Ferreira, C.; Näsström, H.; Kalantari, Z. Open-Source Planning Support System for Sustainable Regional Planning: A Case Study of Stockholm County, Sweden. Environ. Plan. B Urban Anal. City Sci. 2020, 47, 1508–1523.
  41. Hooper, P.; Boulange, C.; Arciniegas, G.; Foster, S.; Bolleter, J.; Pettit, C. Exploring the Potential for Planning Support Systems to Bridge the Research-Translation Gap between Public Health and Urban Planning. Int. J. Health Geogr. 2021, 20, 36.
  42. Escalante, H.J.; Tu, W.W.; Guyon, I.; Silver, D.L.; Viegas, E.; Chen, Y.; Dai, W.; Yang, Q. AutoML @ NeurIPS 2018 Challenge: Design and Results. In The NeurIPS ’18 Competition; Escalera, S., Herbrich, R., Eds.; The Springer Series on Challenges in Machine Learning; Springer International Publishing: Cham, Switzerland, 2020; pp. 209–229.
  43. Halvari, T.; Nurminen, J.K.; Mikkonen, T. Testing the Robustness of AutoML Systems. Electron. Proc. Theor. Comput. Sci. 2020, 319, 103–116.
  44. Karmaker, S.K.; Hassan, M.M.; Smith, M.J.; Xu, L.; Zhai, C.; Veeramachaneni, K. AutoML to Date and Beyond: Challenges and Opportunities. ACM Comput. Surv. 2021, 54, 175.
  45. Hanussek, M.; Blohm, M.; Kintz, M. Can AutoML Outperform Humans? An Evaluation on Popular OpenML Datasets Using AutoML Benchmark. In Proceedings of the 2020 2nd International Conference on Artificial Intelligence, Robotics and Control, Cairo, Egypt, 12–14 December 2020; pp. 29–32. [Google Scholar] [CrossRef]
  46. Greene, R.; Devillers, R.; Luther, J.E.; Eddy, B.G. GIS-Based Multiple-Criteria Decision Analysis. Geogr. Compass 2011, 5, 412–432. [Google Scholar] [CrossRef]
  47. Crooks, A.; Castle, C.; Batty, M. Key Challenges in Agent-Based Modelling for Geo-Spatial Simulation. Comput. Environ. Urban Syst. 2008, 32, 417–430. [Google Scholar] [CrossRef] [Green Version]
  48. Wahab, M.N.A.; Nefti-Meziani, S.; Atyabi, A. A Comprehensive Review of Swarm Optimization Algorithms. PLoS ONE 2015, 10, e0122827. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  49. Quinlan, J.R. Decision Trees and Decision-Making. IEEE Trans. Syst. Man Cybern. 1990, 20, 339–346. [Google Scholar] [CrossRef]
  50. Jaramillo, J.H.; Bhadury, J.; Batta, R. On the Use of Genetic Algorithms to Solve Location Problems. Comput. Oper. Res. 2002, 29, 761–779. [Google Scholar] [CrossRef]
  51. Naser, M.Z.; Alavi, A.H. Error Metrics and Performance Fitness Indicators for Artificial Intelligence and Machine Learning in Engineering and Sciences. Archit. Struct. Constr. 2021. [Google Scholar] [CrossRef]
  52. Farahani, R.Z.; Asgari, N.; Heidari, N.; Hosseininia, M.; Goh, M. Covering Problems in Facility Location: A Review. Comput. Ind. Eng. 2012, 62, 368–407. [Google Scholar] [CrossRef]
  53. Riquelme, N.; Von Lücken, C.; Baran, B. Performance Metrics in Multi-Objective Optimization. In Proceedings of the 2015 Latin American Computing Conference (CLEI), Arequipa, Peru, 19–23 October 2015; pp. 1–11. [Google Scholar]
  54. Grabusts, P. The Choice of Metrics for Clustering Algorithms. Environ. Technol. Resour. Proc. Int. Sci. Pract. Conf. 2011, 2, 70–76. [Google Scholar] [CrossRef]
  55. Chi, G.; Zhu, J. Spatial Regression Models for Demographic Analysis. Popul. Res. Policy Rev. 2008, 27, 17–42. [Google Scholar] [CrossRef]
  56. Wei, P.; Lu, Z.; Song, J. Variable Importance Analysis: A Comprehensive Review. Reliab. Eng. Syst. Saf. 2015, 142, 399–432. [Google Scholar] [CrossRef]
  57. Carvalho, D.V.; Pereira, E.M.; Cardoso, J.S. Machine Learning Interpretability: A Survey on Methods and Metrics. Electronics 2019, 8, 832. [Google Scholar] [CrossRef] [Green Version]
  58. Andrienko, N.; Andrienko, G.; Gatalsky, P. Exploratory Spatio-Temporal Visualization: An Analytical Review. J. Vis. Lang. Comput. 2003, 14, 503–541. [Google Scholar] [CrossRef]
  59. Kasimati, A.; Espejo-García, B.; Darra, N.; Fountas, S. Predicting Grape Sugar Content under Quality Attributes Using Normalized Difference Vegetation Index Data and Automated Machine Learning. Sensors 2022, 22, 3249. [Google Scholar] [CrossRef]
  60. Kai-Yun, L.; Burnside, N.G.; Sampaio de Lima, R.; Peciña, M.V.; Sepp, K.; Cabral Pinheiro, V.H.; de Lima, B.R.C.A. An Automated Machine Learning Framework in Unmanned Aircraft Systems: New Insights into Agricultural Management Practices Recognition Approaches. Remote Sens. 2021, 13, 3190. [Google Scholar] [CrossRef]
  61. Bruzón, A.G.; Arrogante-Funes, P.; Arrogante-Funes, F.; Martín-González, F.; Novillo, C.J.; Fernández, R.R.; Vázquez-Jiménez, R.; Alarcón-Paredes, A.; Alonso-Silverio, G.A.; Cantu-Ramirez, C.A.; et al. Landslide Susceptibility Assessment Using an AutoML Framework. Int. J. Environ. Res. Public Health 2021, 18, 10971. [Google Scholar] [CrossRef]
  62. D’Orazio, V.; Lin, Y. Forecasting Conflict in Africa with Automated Machine Learning Systems. Int. Interact. 2022, 48, 714–738. [Google Scholar] [CrossRef]
  63. Wang, J.F.; Stein, A.; Gao, B.B.; Ge, Y. A Review of Spatial Sampling. Spat. Stat. 2012, 2, 1–14. [Google Scholar] [CrossRef]
  64. Getis, A.; Aldstadt, J. Constructing the Spatial Weights Matrix Using a Local Statistic. Geogr. Anal. 2004, 36, 90–104. [Google Scholar] [CrossRef]
  65. Hopkins, L.D.; Armstrong, M.P. Analytic and Cartographic Data Storage: A Two-Tiered Approach to Spatial Decision Support Systems. In Proceedings of the Seventh International Symposium on Computer-Assisted Cartography, Washington, DC, USA, 11–14 March 1985. [Google Scholar]
  66. Longley, P.A.; Goodchild, M.F.; Maguire, D.J.; Rhind, D.W. Geographic Information Systems and Science; John Wiley & Sons: Hoboken, NJ, USA, 2005. [Google Scholar]
  67. Geertman, S.; Stillwell, J. Planning Support Systems in Practice; Springer Science & Business Media: Cham, Switzerland, 2012. [Google Scholar]
  68. Alva, P.; Janssen, P.; Stouffs, R. Geospatial Tool-Chains: Planning Support Systems for Organisational Teams. Int. J. Archit. Comput. 2019, 17, 336–356. [Google Scholar] [CrossRef]
  69. Zhang, G.; Zhang, W.; Guhathakurta, S.; Botchwey, N. Development of a Flow-Based Planning Support System Based on Open Data for the City of Atlanta. Environ. Plan. B Urban Anal. City Sci. 2019, 46, 207–224. [Google Scholar] [CrossRef]
  70. Getis, A.; Ord, J.K. The Analysis of Spatial Association by Use of Distance Statistics. In Perspectives on Spatial Data Analysis; Anselin, L., Rey, S.J., Eds.; Advances in Spatial Science; Springer: Berlin/Heidelberg, Germany, 2010; pp. 127–145. [Google Scholar] [CrossRef]
  71. Ward, M.D.; Gleditsch, K.S. Spatial Regression Models; SAGE Publications: London, UK, 2018. [Google Scholar]
  72. Itami, R.M. Simulating Spatial Dynamics: Cellular Automata Theory. Landsc. Urban Plan. 1994, 30, 27–47. [Google Scholar] [CrossRef]
  73. Shrestha, R.; Flacke, J. Leveraging Citizen Science to Advance Interactive Spatial Decision Support Technology: A Swot Analysis. In The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences; Copernicus GmbH: Gottingen, Germany, 2019; Volume XLII. [Google Scholar] [CrossRef] [Green Version]
  74. Maceachren, A.M.; Brewer, I. Developing a Conceptual Framework for Visually-Enabled Geocollaboration. Int. J. Geogr. Inf. Sci. 2004, 18, 1–34. [Google Scholar] [CrossRef]
  75. Daniel, C.; Pettit, C. Charting the Past and Possible Futures of Planning Support Systems: Results of a Citation Network Analysis. Environ. Plan. B Urban Anal. City Sci. 2022, 49, 1875–1892. [Google Scholar] [CrossRef]
  76. Golnaraghi, F.; Kuo, B.C. Automatic Control Systems, 9th ed.; Wiley: Hoboken, NJ, USA, 2009. [Google Scholar]
  77. Samuel, A.L. Some Studies in Machine Learning Using the Game of Checkers. IBM J. Res. Dev. 1959, 3, 210–229. [Google Scholar] [CrossRef]
  78. Seber, G.A.F.; Lee, A.J. Linear Regression Analysis; John Wiley & Sons: Hoboken, NJ, USA, 2012. [Google Scholar]
  79. Kleinbaum, D.G.; Klein, M. Introduction to Logistic Regression. In Logistic Regression: A Self-Learning Text; Kleinbaum, D.G., Klein, M., Eds.; Statistics for Biology and Health; Springer: New York, NY, USA, 2010; pp. 1–39. [Google Scholar] [CrossRef]
  80. Rish, I. An Empirical Study of the Naive Bayes Classifier. In Proceedings of the IJCAI 2001 Workshop on Empirical Methods in Artificial Intelligence, Seattle, WA, USA, 4–6 August 2001. [Google Scholar]
  81. Hinton, G.E. How Neural Networks Learn from Experience. Sci. Am. 1992, 267, 144–151. [Google Scholar] [CrossRef]
  82. Santu, S.K.K.; Hassan, M.M.; Smith, M.J.; Xu, L.; Zhai, C.; Veeramachaneni, K. A Level-wise Taxonomic Perspective on Automated Machine Learning to Date and Beyond: Challenges and Opportunities. arXiv 2020, arXiv:2010.10777. [Google Scholar]
  83. Feurer, M.; Klein, A.; Eggensperger, K.; Springenberg, J.T.; Blum, M.; Hutter, F. Auto-Sklearn: Efficient and Robust Automated Machine Learning. In Automated Machine Learning; Hutter, F., Kotthoff, L., Vanschoren, J., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 113–134. [Google Scholar] [CrossRef] [Green Version]
  84. Madrid, J.G.; Jair Escalante, H.; Morales, E.F.; Tu, W.W.; Yu, Y.; Sun-Hosoya, L.; Guyon, I.; Sebag, M. Towards AutoML in the Presence of Drift: First Results. arXiv 2018, arXiv:1907.10772. [Google Scholar]
  85. Molnar, C. Interpretable Machine Learning. 2020. Available online: https://christophm.github.io/interpretable-ml-book/ (accessed on 20 September 2020).
  86. Pfisterer, F.; Thomas, J.; Bischl, B. Towards Human Centered AutoML. arXiv 2019, arXiv:1911.02391. [Google Scholar]
  87. Bahri, M.; Salutari, F.; Putina, A.; Sozio, M. AutoML: State of the Art with a Focus on Anomaly Detection, Challenges, and Research Directions. Int. J. Data Sci. Anal. 2022, 14, 113–126. [Google Scholar] [CrossRef]
  88. Alsharef, A.; Aggarwal, K.; Sonia; Kumar, M.; Mishra, A. Review of ML and AutoML Solutions to Forecast Time-Series Data. Arch. Comput. Methods Eng. 2022, 29, 5297–5311. [Google Scholar] [CrossRef]
  89. Li, W. GeoAI: Where Machine Learning and Big Data Converge in GIScience. J. Spat. Inf. Sci. 2020, 20, 71–77. [Google Scholar] [CrossRef]
  90. Janowicz, K.; Gao, S.; McKenzie, G.; Hu, Y.; Bhaduri, B. GeoAI: Spatially Explicit Artificial Intelligence Techniques for Geographic Knowledge Discovery and Beyond. Int. J. Geogr. Inf. Sci. 2020, 34, 625–636. [Google Scholar] [CrossRef]
  91. Fang, Z.; Jin, Y.; Yang, T. Incorporating Planning Intelligence into Deep Learning: A Planning Support Tool for Street Network Design. J. Urban Technol. 2022, 29, 99–114. [Google Scholar] [CrossRef]
  92. Myers, D.E. Spatial Interpolation: An Overview. Geoderma 1994, 62, 17–28. [Google Scholar] [CrossRef]
  93. Jiang, Z. A Survey on Spatial Prediction Methods. IEEE Trans. Knowl. Data Eng. 2019, 31, 1645–1664. [Google Scholar] [CrossRef]
  94. Unwin, D. Integration through Overlay Analysis. In Spatial Analytical Perspectives on GIS; Routledge: London, UK, 1996. [Google Scholar]
  95. Das, S.; Li, J.J.; Allston, A.; Kharfen, M. Planning Area-Specific Prevention and Intervention Programs for HIV Using Spatial Regression Analysis. Public Health 2019, 169, 41–49. [Google Scholar] [CrossRef]
  96. Costache, R.; Popa, M.C.; Tien Bui, D.; Diaconu, D.C.; Ciubotaru, N.; Minea, G.; Pham, Q.B. Spatial Predicting of Flood Potential Areas Using Novel Hybridizations of Fuzzy Decision-Making, Bivariate Statistics, and Machine Learning. J. Hydrol. 2020, 585, 124808. [Google Scholar] [CrossRef]
  97. Warth, G.; Braun, A.; Assmann, O.; Fleckenstein, K.; Hochschild, V. Prediction of Socio-Economic Indicators for Urban Planning Using VHR Satellite Imagery and Spatial Analysis. Remote Sens. 2020, 12, 1730. [Google Scholar] [CrossRef]
  98. Brunsdon, C.; Fotheringham, S.; Charlton, M. Geographically Weighted Regression. J. R. Stat. Soc. Ser. D 1998, 47, 431–443. [Google Scholar] [CrossRef]
  99. Khan, S.N.; Li, D.; Maimaitijiang, M. A Geographically Weighted Random Forest Approach to Predict Corn Yield in the US Corn Belt. Remote Sens. 2022, 14, 2843. [Google Scholar] [CrossRef]
  100. Feng, L.; Wang, Y.; Zhang, Z.; Du, Q. Geographically and Temporally Weighted Neural Network for Winter Wheat Yield Prediction. Remote Sens. Environ. 2021, 262, 112514. [Google Scholar] [CrossRef]
  101. Bruno, G.; Giannikos, I. Location and GIS. In Location Science; Laporte, G., Nickel, S., Saldanha da Gama, F., Eds.; Springer International Publishing: Cham, Switzerland, 2015; pp. 509–536. [Google Scholar] [CrossRef]
  102. Keenan, P.B. Spatial Decision Support Systems for Vehicle Routing. Decis. Support Syst. 1998, 22, 65–71. [Google Scholar] [CrossRef]
  103. Keenan, P. Modelling Vehicle Routing in GIS. Oper. Res. 2008, 8, 201. [Google Scholar] [CrossRef]
  104. Laporte, G.; Martello, S. The Selective Travelling Salesman Problem. Discret. Appl. Math. 1990, 26, 193–207. [Google Scholar] [CrossRef] [Green Version]
  105. Kaveh, M.; Kaveh, M.; Mesgari, M.S.; Paland, R.S. Multiple Criteria Decision-Making for Hospital Location-Allocation Based on Improved Genetic Algorithm. Appl. Geomat. 2020, 12, 291–306. [Google Scholar] [CrossRef]
  106. Diemuodeke, E.O.; Addo, A.; Oko, C.O.C.; Mulugetta, Y.; Ojapah, M.M. Optimal Mapping of Hybrid Renewable Energy Systems for Locations Using Multi-Criteria Decision-Making Algorithm. Renew. Energy 2019, 134, 461–477. [Google Scholar] [CrossRef]
  107. Musolino, G.; Rindone, C.; Polimeni, A.; Vitetta, A. Planning Urban Distribution Center Location with Variable Restocking Demand Scenarios: General Methodology and Testing in a Medium-Size Town. Transp. Policy 2019, 80, 157–166. [Google Scholar] [CrossRef]
  108. Wang, Z.; Li, M.; Tang, L.; Huang, J.S. Research and Application of Intersection Traffic Signal Control Algorithm Based on Vehicle Location. Int. J. Commun. Networks Distrib. Syst. 2020, 24, 249–261. [Google Scholar] [CrossRef]
  109. De Montis, A.; De Toro, P.; Droste-Franke, B.; Omann, I.; Stagl, S. Assessing the Quality of Different MCDA Methods. In Alternatives for Environmental Valuation; Routledge: London, UK, 2004. [Google Scholar]
  110. Li, X.; Yeh, A.G.O. Integration of Genetic Algorithms and GIS for Optimal Location Search. Int. J. Geogr. Inf. Sci. 2005, 19, 581–601. [Google Scholar] [CrossRef]
  111. Anselin, L. Local Indicators of Spatial Association—LISA. Geogr. Anal. 1995, 27, 93–115. [Google Scholar] [CrossRef]
  112. Khan, K.; Rehman, S.U.; Aziz, K.; Fong, S.; Sarasvady, S. DBSCAN: Past, Present and Future. In Proceedings of the The Fifth International Conference on the Applications of Digital Information and Web Technologies (ICADIWT 2014), Bangalore, India, 17–19 February 2014; pp. 232–238. [Google Scholar] [CrossRef]
  113. Ansari, M.Y.; Ahmad, A.; Khan, S.S.; Bhushan, G.; Mainuddin. Spatiotemporal Clustering: A Review. Artif. Intell. Rev. 2020, 53, 2381–2423. [Google Scholar] [CrossRef]
  114. Kulldorff, M. A Spatial Scan Statistic. Commun. Stat. Theory Methods 1997, 26, 1481–1496. [Google Scholar] [CrossRef]
  115. Aldstadt, J. Spatial Clustering. In Handbook of Applied Spatial Analysis: Software Tools, Methods and Applications; Fischer, M.M., Getis, A., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 279–300. [Google Scholar] [CrossRef]
  116. Irandegani, Z.; Mohammadi, R.; Taleai, M. Investigating Temporal and Spatial Effects of Urban Planning Variables on Crime Rate: A Gwr and Ols Based Approach. In The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences; Copernicus GmbH: Gottingen, Germany, 2019; Volume XLII-4/W18, pp. 559–564. [Google Scholar] [CrossRef] [Green Version]
  117. Ohana-Levi, N.; Ben-Gal, A.; Peeters, A.; Termin, D.; Linker, R.; Baram, S.; Raveh, E.; Paz-Kagan, T. A Comparison between Spatial Clustering Models for Determining N-fertilization Management Zones in Orchards. Precis. Agric. 2020, 22, 99–123. [Google Scholar] [CrossRef]
  118. Fitzmaurice, A.G.; Linley, L.; Zhang, C.; Watson, M.; France, A.M.; Oster, A.M. Novel Method for Rapid Detection of Spatiotemporal HIV Clusters Potentially Warranting Intervention. Emerg. Infect. Dis. 2019, 25, 988–991. [Google Scholar] [CrossRef] [PubMed]
  119. Li, M.; Croitoru, A.; Yue, S. GeoDenStream: An Improved DenStream Clustering Method for Managing Entity Data within Geographical Data Streams. Comput. Geosci. 2020, 144, 104563. [Google Scholar] [CrossRef]
  120. Peterson, B.A.; Brownlee, M.T.J.; Hallo, J.C.; Beeco, J.A.; White, D.L.; Sharp, R.L.; Cribbs, T.W. Spatiotemporal Variables to Understand Visitor Travel Patterns: A Management-Centric Approach. J. Outdoor Recreat. Tour. 2020, 31, 100316. [Google Scholar] [CrossRef]
  121. Grubesic, T.H.; Wei, R.; Murray, A.T. Spatial Clustering Overview and Comparison: Accuracy, Sensitivity, and Computational Expense. Ann. Assoc. Am. Geogr. 2014, 104, 1134–1156. [Google Scholar] [CrossRef]
  122. Surendran, M.S. Review of Spatial Clustering Methods. Int. J. Inf. Technol. Infrastruct. 2012, 2, 15–24. [Google Scholar]
  123. Fritz, C.E.; Schuurman, N.; Robertson, C.; Lear, S. A Scoping Review of Spatial Cluster Analysis Techniques for Point-Event Data. Geospat. Health 2013, 7, 183–198. [Google Scholar] [CrossRef] [Green Version]
  124. O’Sullivan, D.; Perry, G.L.W. Spatial Simulation: Exploring Pattern and Process; John Wiley & Sons: Hoboken, NJ, USA, 2013. [Google Scholar]
  125. Sinclair, T.R.; Soltani, A.; Marrou, H.; Ghanem, M.; Vadez, V. Geospatial Assessment for Crop Physiological and Management Improvements with Examples Using the Simple Simulation Model. Crop Sci. 2020, 60, 700–708. [Google Scholar] [CrossRef]
  126. Chen, L.; Zhang, P.; Lv, G.P.; Shen, Z.Y. Spatial–Temporal Distribution and Limiting Factor Variation of Algal Growth: Three-Dimensional Simulation to Enhance Drinking Water Reservoir Management. Int. J. Environ. Sci. Technol. 2019, 16, 7417–7432. [Google Scholar] [CrossRef]
  127. Wang, Y.; Xu, T.; Niu, X.; Tan, C.; Chen, E.; Xiong, H. STMARL: A Spatio-Temporal Multi-Agent Reinforcement Learning Approach for Cooperative Traffic Light Control. IEEE Trans. Mob. Comput. 2022, 21, 2228–2242. [Google Scholar] [CrossRef]
  128. Hesam, S.; Valizadeh Kamran, K. Intelligent Management Occurrence and Spread of Front Fire in GIS by Using Cellular Automata. Case Study: Golestan Forest. ISPRS Int. Arch. Photogramm. Remote. Sens. Spat. Inf. Sci. 2019, XLII-4/W18, 475–481. [Google Scholar] [CrossRef] [Green Version]
  129. Yu, D.; Yanxu, L.; Bojie, F. Urban Growth Simulation Guided by Ecological Constraints in Beijing City: Methods and Implications for Spatial Planning. J. Environ. Manag. 2019, 243, 402–410. [Google Scholar] [CrossRef]
  130. Parker, D.C.; Meretsky, V. Measuring Pattern Outcomes in an Agent-Based Model of Edge-Effect Externalities Using Spatial Metrics. Agric. Ecosyst. Environ. 2004, 101, 233–250. [Google Scholar] [CrossRef]
  131. Wallentin, G. Spatial Simulation: A Spatial Perspective on Individual-Based Ecology—A Review. Ecol. Model. 2017, 350, 30–41. [Google Scholar] [CrossRef]
  132. Anselin, L. Under the Hood Issues in the Specification and Interpretation of Spatial Regression Models. Agric. Econ. 2002, 27, 247–267. [Google Scholar] [CrossRef]
  133. Bailey, T.C. GIS and Simple Systems for Visual, Interactive, Spatial Analysis. Cartogr. J. 1990, 27, 79–84. [Google Scholar] [CrossRef]
  134. Tyner, J.A. Principles of Map Design; Guilford Publications: New York, NY, USA, 2014. [Google Scholar]
  135. Rivest, S. Toward Better Support for Spatial Decision Making: Defining the Characteristics of Spatial on-Line Analytical Processing (Solap). Geomatica 2001, 55, 539–555. [Google Scholar] [CrossRef]
  136. Kraak, J.M.; Brown, A. Web Cartography; CRC Press: Boca Raton, FL, USA, 2003. [Google Scholar]
  137. Wu, Z.; Chen, Y.; Han, Y.; Ke, T.; Liu, Y. Identifying the Influencing Factors Controlling the Spatial Variation of Heavy Metals in Suburban Soil Using Spatial Regression Models. Sci. Total Environ. 2020, 717, 137212. [Google Scholar] [CrossRef]
  138. Feng, Q.; Flanagan, D.C.; Engel, B.A.; Yang, L.; Chen, L. GeoAPEXOL, a Web GIS Interface for the Agricultural Policy Environmental eXtender (APEX) Model Enabling Both Field and Small Watershed Simulation. Environ. Model. Softw. 2020, 123, 104569. [Google Scholar] [CrossRef]
  139. Ono, J.P.; Castelo, S.; Lopez, R.; Bertini, E.; Freire, J.; Silva, C. PipelineProfiler: A Visual Analytics Tool for the Exploration of AutoML Pipelines. IEEE Trans. Vis. Comput. Graph. 2020, 27, 390–400. [Google Scholar] [CrossRef]
  140. North, C. Toward Measuring Visualization Insight. IEEE Comput. Graph. Appl. 2006, 26, 6–9. [Google Scholar] [CrossRef] [PubMed]
  141. Hallisey, E.J. Cartographic Visualization: An Assessment and Epistemological Review. Prof. Geogr. 2005, 57, 350–364. [Google Scholar] [CrossRef]
  142. Kai-Yun, L.; Sampaio de Lima, R.; Burnside, N.G.; Vahtmäe, E.; Kutser, T.; Sepp, K. Toward Automated Machine Learning-Based Hyperspectral Image Analysis in Crop Yield and Biomass Estimation. Remote Sens. 2022, 14, 1114. [Google Scholar] [CrossRef]
  143. Dilmurat, K.; Sagan, V.; Moose, S. AI-Driven Maize Yield Forecasting Using Unmanned Aerial Vehicle-Based Hyperspectral and Lidar Data Fusion. ISPRS Ann. Photogramm. Remote. Sens. Spat. Inf. Sci. 2022, V-3-2022, 193–199. [Google Scholar] [CrossRef]
  144. Gerassis, S.; Giráldez, E.; Pazo-Rodríguez, M.; Saavedra, Á.; Taboada, J. AI Approaches to Environmental Impact Assessments (EIAs) in the Mining and Metals Sector Using AutoML and Bayesian Modeling. Appl. Sci. 2021, 11, 7914. [Google Scholar] [CrossRef]
  145. Guo, Y.; Quan, L.; Song, L.; Liang, H. Construction of Rapid Early Warning and Comprehensive Analysis Models for Urban Waterlogging Based on AutoML and Comparison of the Other Three Machine Learning Algorithms. J. Hydrol. 2022, 605, 127367. [Google Scholar] [CrossRef]
  146. Bai, Z.; Liu, Q.; Liu, Y. Groundwater Potential Mapping in Hubei Region of China Using Machine Learning, Ensemble Learning, Deep Learning and AutoML Methods. Nat. Resour. Res. 2022, 31, 2549–2569. [Google Scholar] [CrossRef]
  147. Zhang, X.; Jin, Q.; Yu, T.; Xiang, S.; Kuang, Q.; Prinet, V.; Pan, C. Multi-Modal Spatio-Temporal Meteorological Forecasting with Deep Neural Network. ISPRS J. Photogramm. Remote Sens. 2022, 188, 380–393. [Google Scholar] [CrossRef]
  148. O’Donncha, F.; Hu, Y.; Palmes, P.; Burke, M.; Filgueira, R.; Grant, J. A Spatio-Temporal LSTM Model to Forecast across Multiple Temporal and Spatial Scales. Ecol. Informatics 2022, 69, 101687. [Google Scholar] [CrossRef]
  149. Nikitin, N.O.; Revin, I.; Hvatov, A.; Vychuzhanin, P.; Kalyuzhnaya, A.V. Hybrid and Automated Machine Learning Approaches for Oil Fields Development: The Case Study of Volve Field, North Sea. Comput. Geosci. 2022, 161, 105061. [Google Scholar] [CrossRef]
  150. Arrogante-Funes, P.; Bruzón, A.G.; Arrogante-Funes, F.; Ramos-Bernal, R.N.; Vázquez-Jiménez, R. Integration of Vulnerability and Hazard Factors for Landslide Risk Assessment. Int. J. Environ. Res. Public Health 2021, 18, 11987. [Google Scholar] [CrossRef]
  151. Siriborvornratanakul, T. Human Behavior in Image-Based Road Health Inspection Systems despite the Emerging AutoML. J. Big Data 2022, 9, 96. [Google Scholar] [CrossRef]
  152. Sagi, O.; Rokach, L. Ensemble Learning: A Survey. Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2018, 8, e1249. [Google Scholar] [CrossRef]
  153. Snoek, J.; Larochelle, H.; Adams, R.P. Practical Bayesian Optimization of Machine Learning Algorithms. Adv. Neural Inf. Process. Syst. 2012, 25, 1–9. [Google Scholar]
  154. Elsken, T.; Metzen, J.H.; Hutter, F. Neural Architecture Search: A Survey. J. Mach. Learn. Res. 2019, 20, 1997–2017. [Google Scholar]
  155. Al-Sahaf, H.; Bi, Y.; Chen, Q.; Lensen, A.; Mei, Y.; Sun, Y.; Tran, B.; Xue, B.; Zhang, M. A Survey on Evolutionary Machine Learning. J. R. Soc. N. Z. 2019, 49, 205–228. [Google Scholar] [CrossRef]
  156. Feurer, M.; Klein, A.; Eggensperger, K.; Springenberg, J.; Blum, M.; Hutter, F. Efficient and Robust Automated Machine Learning. Adv. Neural Inf. Process. Syst. 2015, 28, 1–9. [Google Scholar]
  157. Olson, R.S.; Bartley, N.; Urbanowicz, R.J.; Moore, J.H. Evaluation of a Tree-Based Pipeline Optimization Tool for Automating Data Science. In Proceedings of the Genetic and Evolutionary Computation Conference, Denver, CO, USA, 20–24 July 2016; pp. 485–492. [Google Scholar]
  158. LeDell, E.; Poirier, S. H2O AutoML: Scalable Automatic Machine Learning. In Proceedings of the AutoML Workshop at ICML, Vienna, Austria, 17–18 July 2020; Volume 2020. [Google Scholar]
  159. Fakoor, R.; Mueller, J.W.; Erickson, N.; Chaudhari, P.; Smola, A.J. Fast, Accurate, and Simple Models for Tabular Data via Augmented Distillation. Adv. Neural Inf. Process. Syst. 2020, 33, 8671–8681. [Google Scholar]
  160. Drori, I.; Krishnamurthy, Y.; Rampin, R.; Lourenco, R.d.P.; Ono, J.P.; Cho, K.; Silva, C.; Freire, J. AlphaD3M: Machine Learning Pipeline Synthesis. arXiv 2021, arXiv:2111.02508. [Google Scholar]
  161. Sprague, R.H. A Framework for the Development of Decision Support Systems. MIS Q. 1980, 4, 1–26. [Google Scholar] [CrossRef]
  162. Keen, P.G.; Hackathorn, R.D. Decision Support Systems and Personal Computing; MIT: Cambridge, MA, USA, 1979. [Google Scholar]
  163. Simon, H.A. The New Science of Management Decision; Harper & Brothers: New York, NY, USA, 1960; p. 50. [Google Scholar]
  164. Omran, M.G.; Engelbrecht, A.P.; Salman, A. An Overview of Clustering Methods. Intell. Data Anal. 2007, 11, 583–605. [Google Scholar] [CrossRef]
  165. Harkanth, S.; Phulpagar, B.D. A Survey on Clustering Methods and Algorithms. Int. J. Comput. Sci. Inf. Technol. 2013, 4, 687–691. [Google Scholar]
  166. Amigó, E.; Gonzalo, J.; Artiles, J.; Verdejo, F. A Comparison of Extrinsic Clustering Evaluation Metrics Based on Formal Constraints. Inf. Retr. 2009, 12, 461–486. [Google Scholar] [CrossRef] [Green Version]
  167. Maulik, U.; Bandyopadhyay, S. Performance Evaluation of Some Clustering Algorithms and Validity Indices. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 1650–1654. [Google Scholar] [CrossRef] [Green Version]
  168. Vazquez, M.Y.L.; Peñafiel, L.A.B.; Muñoz, S.X.S.; Martinez, M.A.Q. A Framework for Selecting Machine Learning Models Using TOPSIS. In Advances in Artificial Intelligence, Software and Systems Engineering; Advances in Intelligent Systems and Computing; Ahram, T., Ed.; Springer International Publishing: Cham, Switzerland, 2021; pp. 119–126. [Google Scholar]
  169. Mahesh, B. Machine Learning Algorithms—A Review. Int. J. Sci. Res. 2020, 9, 381–386. [Google Scholar]
  170. Mayfield, H.J.; Smith, C.; Gallagher, M.; Hockings, M. Considerations for Selecting a Machine Learning Technique for Predicting Deforestation. Environ. Model. Softw. 2020, 131, 104741. [Google Scholar] [CrossRef]
  171. Sparks, E.R.; Talwalkar, A.; Haas, D.; Franklin, M.J.; Jordan, M.I.; Kraska, T. Automating Model Search for Large Scale Machine Learning. In Proceedings of the Sixth ACM Symposium on Cloud Computing, Kohala Coast, HI, USA, 27–29 August 2015; pp. 368–380. [Google Scholar] [CrossRef]
  172. Real, E.; Liang, C.; So, D.; Le, Q. AutoML-Zero: Evolving Machine Learning Algorithms from Scratch. In Proceedings of the 37th International Conference on Machine Learning—PMLR, Virtual, 13–18 July 2020; pp. 8007–8019. [Google Scholar]
  173. Biljecki, F.; Heuvelink, G.B.M.; Ledoux, H.; Stoter, J. The Effect of Acquisition Error and Level of Detail on the Accuracy of Spatial Analyses. Cartogr. Geogr. Inf. Sci. 2018, 45, 156–176. [Google Scholar] [CrossRef] [Green Version]
  174. Devillers, R.; Bédard, Y.; Jeansoulin, R.; Moulin, B. Towards Spatial Data Quality Information Analysis Tools for Experts Assessing the Fitness for Use of Spatial Data. Int. J. Geogr. Inf. Sci. 2007, 21, 261–282. [Google Scholar] [CrossRef]
  175. Getis, A. Spatial Autocorrelation. In Handbook of Applied Spatial Analysis: Software Tools, Methods and Applications; Fischer, M.M., Getis, A., Eds.; Springer: Berlin/Heidelberg, Germany, 2010; pp. 255–278. [Google Scholar] [CrossRef]
  176. Gilardi, N.; Bengio, S. Local Machine Learning Models for Spatial Data Analysis. J. Geogr. Inf. Decis. Anal. 2000, 4, 11–28. [Google Scholar]
  177. Rouse, L.J.; Bergeron, S.J.; Harris, T.M. Participating in the Geospatial Web: Collaborative Mapping, Social Networks and Participatory GIS. In The Geospatial Web: How Geobrowsers, Social Software and the Web 2.0 Are Shaping the Network Society; Scharl, A., Tochtermann, K., Eds.; Advanced Information and Knowledge Processing; Springer: London, UK, 2007; pp. 153–158. [Google Scholar] [CrossRef]
  178. Zambelli, P.; Gebbert, S.; Ciolli, M. Pygrass: An Object Oriented Python Application Programming Interface (API) for Geographic Resources Analysis Support System (GRASS) Geographic Information System (GIS). ISPRS Int. J. Geo-Inf. 2013, 2, 201–219. [Google Scholar] [CrossRef] [Green Version]
  179. Vonk, G.; Geertman, S. Improving the Adoption and Use of Planning Support Systems in Practice. Appl. Spat. Anal. Policy 2008, 1, 153–173. [Google Scholar] [CrossRef]
  180. Hong, S.R.; Castelo, S.; D’Orazio, V.; Benthune, C.; Santos, A.; Langevin, S.; Jonker, D.; Bertini, E.; Freire, J. Towards Evaluating Exploratory Model Building Process with AutoML Systems. arXiv 2020, arXiv:2009.00449. [Google Scholar]
Figure 1. Two-step process to answer research questions.
Figure 2. Final SDSS and AutoML articles per year (n = 136).
Figure 3. Word cloud of the top 100 words in the 136 SDSS and AutoML article abstracts and titles.
Figure 4. Primary AutoML (n = 17) and SDSS (n = 18) article citation counts [4,6,12,16,17,18,19,21,24,25,26,27,36,37,38,39,40,41,42,43,44,45].
Figure 5. Top 10 most cited supplementary AutoML (n = 21), SDSS (n = 63), and AutoML/SDSS (n = 17) articles [28,29,30,31,32,33,34,35,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64].
Figure 6. Spatial decision support system (SDSS) components.
Figure 7. Generic automated machine-learning (AutoML) approach.
Figure 8. General SDSS with AutoML framework.
Figure 9. SDSS with AutoML research opportunities and implementation challenges.
Table 1. Reviewed spatial problems and approaches in SDSS.

Spatial Problem | Applications | Spatial Methods | ML Methods
Estimation | Land use classification; disease risk calculation; disaster risk prediction | MCDA; spatial regression; GW-ML | SVM; RF; NN
Optimization | Facility selection; delivery routing; infrastructure placement; traffic control | MCDA | PSO; GA
Clustering | Crime hotspots; agricultural/disease zoning; social-media analysis; travel analysis | LISA; hotspot analysis; SaTScan | K-means; DBSCAN
Simulation | Traffic control simulation; wildfire simulation; land use simulation | Cellular automata; ABM; custom models | RL
Insight | Risk factor identification; interactive exploration; data/model interpretation | Spatial regression; web mapping; interactive models | Feature selection; feature importance; pipeline exploration
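As a concrete illustration of the Clustering row in Table 1, the following Python sketch applies DBSCAN (one of the ML methods listed) to synthetic point events resembling a hotspot-detection task. The coordinates, eps radius, and min_samples threshold are illustrative assumptions and are not drawn from any of the reviewed studies.

```python
# Minimal sketch (not from any reviewed SDSS): density-based clustering of
# synthetic point events, analogous to the crime-hotspot use case in Table 1.
import numpy as np
from sklearn.cluster import DBSCAN

rng = np.random.default_rng(42)

# Two synthetic "hotspots" plus background noise, in projected metres (assumed CRS).
hotspot_a = rng.normal(loc=(1000, 1000), scale=50, size=(100, 2))
hotspot_b = rng.normal(loc=(3000, 2500), scale=80, size=(80, 2))
noise = rng.uniform(low=0, high=4000, size=(60, 2))
points = np.vstack([hotspot_a, hotspot_b, noise])

# eps (search radius in metres) and min_samples are illustrative choices.
labels = DBSCAN(eps=150, min_samples=10).fit_predict(points)

n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print(f"Detected {n_clusters} clusters; {np.sum(labels == -1)} points labelled as noise")
```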
Table 2. Reviewed SDSS with AutoML approaches and applications (n = 17).

AutoML Approach | AutoML Method/Software | Data | SDSS Applications
Ensembling (n = 6) | H2O; extra trees classifier | Satellite imagery; UAV imagery; sensors; surveys; sociodemographic; simulations | Crop prediction; violence rate prediction; total water storage estimation; landslide risk estimation; soil estimation
Bayesian (n = 4) | Bayesian optimization; Auto-Sklearn; Bayesian networks; MATLAB fitrauto | Satellite imagery; UAV imagery; sensors; surveys | Crop prediction/classification; soil estimation; environmental impact assessment
Neural Nets (n = 3) | NAS; deep learning; LSTM | Satellite imagery; vehicle imagery | Meteorological forecasting; road health inspection
Evolutionary (n = 2) | Fedot; TPOT | Sensors; surveys | Oil-well placement; waterlogging risk estimation
Other (n = 2) | AutoGluon; AutonML; AlphaD3M | Sociodemographic; satellite imagery; surveys | Violence rate prediction; water potential mapping
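To make the table above more tangible, the sketch below shows how one of the listed evolutionary AutoML packages, TPOT, might be applied to a tabular dataset that includes spatial attributes. The dataset (California housing, which carries latitude/longitude columns), train/test split, and small search budget are illustrative assumptions rather than a reproduction of any reviewed study.

```python
# Minimal sketch (illustrative assumptions only): evolutionary AutoML with TPOT
# on a tabular dataset that includes spatial attributes (latitude/longitude).
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from tpot import TPOTRegressor

X, y = fetch_california_housing(return_X_y=True)  # features include Latitude/Longitude
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Small search budget for demonstration; real studies would use far larger budgets.
automl = TPOTRegressor(generations=5, population_size=20, random_state=0, verbosity=2)
automl.fit(X_train, y_train)

# Hold-out score under TPOT's default regression metric.
print("Hold-out score:", automl.score(X_test, y_test))
automl.export("best_pipeline.py")  # writes the winning scikit-learn pipeline as code
```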
Table 3. Example metrics for spatial problems and tasks.

Task | Metrics
Regression | Error, correlation coefficient, MSE, MAE, RMSE
Classification | Accuracy, precision, recall, sensitivity, specificity, F1 score, AUC, ROC
Clustering | Euclidean distance, Rand index, entropy, purity, silhouette coefficient, Dunn’s index, Calinski–Harabasz index, homogeneity
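Most of the metrics in Table 3 are available off the shelf in common ML libraries. The short sketch below computes a few of them with scikit-learn; the toy arrays are invented purely for demonstration.

```python
# Minimal sketch: computing a few of the Table 3 metrics with scikit-learn.
# The toy arrays below are invented solely for demonstration.
import numpy as np
from sklearn.metrics import (
    mean_absolute_error, mean_squared_error,                   # regression
    accuracy_score, precision_score, recall_score, f1_score,   # classification
    silhouette_score,                                           # clustering
)

# Regression: observed vs. predicted values.
y_true_reg = np.array([2.0, 3.5, 4.0, 5.5])
y_pred_reg = np.array([2.2, 3.1, 4.4, 5.0])
print("MAE :", mean_absolute_error(y_true_reg, y_pred_reg))
print("RMSE:", np.sqrt(mean_squared_error(y_true_reg, y_pred_reg)))

# Classification: binary labels.
y_true_cls = [0, 1, 1, 0, 1, 1]
y_pred_cls = [0, 1, 0, 0, 1, 1]
print("Accuracy :", accuracy_score(y_true_cls, y_pred_cls))
print("Precision:", precision_score(y_true_cls, y_pred_cls))
print("Recall   :", recall_score(y_true_cls, y_pred_cls))
print("F1 score :", f1_score(y_true_cls, y_pred_cls))

# Clustering: silhouette coefficient on 2-D points with assigned labels.
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
labels = [0, 0, 0, 1, 1, 1]
print("Silhouette:", silhouette_score(points, labels))
```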