Article

GIS-Integrated Data Analytics for Optimal Location-and-Routing Problems: The GD-ARISE Pipeline

1
Department of Applied Statistics, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si 13120, Gyeonggi-do, Republic of Korea
2
Department of Next Generation Smart Energy System Convergence, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si 13120, Gyeonggi-do, Republic of Korea
*
Author to whom correspondence should be addressed.
Mathematics 2025, 13(21), 3465; https://doi.org/10.3390/math13213465
Submission received: 30 September 2025 / Revised: 21 October 2025 / Accepted: 28 October 2025 / Published: 30 October 2025
(This article belongs to the Special Issue Theoretical and Applied Mathematics in Supply Chain Management)

Abstract

Optimizing the siting and servicing of urban facilities is a core operations research problem that must reconcile heterogeneous demand, spatial constraints, and network-realistic travel. We present GD-ARISE, a GIS-integrated and data analytics pipeline that maintains a pedestrian–road network metric from demand inference through siting to routing. The workflow has three modules: (i) GIS integration that unifies spatial layers on one network and distance metric; (ii) data analytics that builds multi-criteria suitability via the Analytic Hierarchy Process (AHP) and maps scores to adaptive service radii; (iii) optimal location-and-routing that selects nonoverlapping sites with a transparent greedy rule (SCASS) and computes depot-to-depot routes via simulated annealing on the same metric. A case study in Seoul’s Gangnam District yields a high-coverage portfolio and feasible collection routes. We add a theoretical framework that casts SCASS as a conflict-graph problem, document the AHP elicitation with consistency checks, and report robustness analyses including sensitivity to AHP weights and to radius bounds. Results indicate that core hotspots remain stable to weighting, whereas mid-range corridors shift as criteria priorities or spatial parameters change.

1. Introduction

1.1. Motivation

Urban waste disposal and storage can be framed as a rigorously defined operations research location-and-routing problem that is driven by demand data analytics and is GIS-native [1]. In megacities, rapid urbanization and rising population density intensify pressure on municipal services; public waste management often struggles to keep pace. A core cause is geometric mismatch: bins are not where people generate waste, and service routes ignore how people and vehicles actually move [1,2]. The result is a network-based operations research problem in which demand must be inferred, sites must be chosen, and service must be routed using a coherent metric and data stack. This is now feasible at the city scale because municipalities publish high-frequency geospatial datasets (e.g., floating populations, transit nodes, and points of interest), and open platforms such as OpenStreetMap, with OSMnx, enable the computation of network-consistent distances [3]. Planners increasingly require transparent analytics that map directly to policy levers—how many bins to deploy, where to place them, and how to service them given fleet and budget constraints—so there is a timely opportunity to connect demand estimation, siting, and routing under one geometry and one metric [2,4]. Despite practical need, algorithms for urban amenity siting that integrate heterogeneous demand data with geometry-consistent spatial and operational constraints have received limited attention. Few studies, to our knowledge, claim an end-to-end algorithm that explicitly integrates GIS-based geometry, demand-driven multi-criteria scoring that feeds back into spatial parameters, nonoverlap enforced with adaptive radii on a network metric, and routing on the same metric. Most prior workflows stop at ranking without altering geometry, mix Euclidean screening with network routing, or impose fixed buffers that ignore local demand. 
As a result, geometry-consistent siting-and-routing pipelines remain rare in the literature, leaving a gap for methods that are both operationally credible and analytically transparent.

1.2. Related Works

The literature on multi-criteria decision-making (MCDM), the maximal covering location problem (MCLP), and routing has been extensively developed for facility location in operations research. GIS strengthens these models by linking geospatial data and network analysis [3,5], enabling city-scale location allocation, coverage, and access studies [6]. For multi-criteria synthesis, the Analytic Hierarchy Process (AHP) [7] and related MCDM methods are widely used to build suitability surfaces for siting and infrastructure planning [6]. This approach typically involves weighting heterogeneous GIS layers to derive a composite suitability index. It has been widely applied across various domains, including landfill siting [8,9], healthcare facility planning [10], and renewable energy projects [11,12]. These studies formalize how multi-criteria weighting translates spatially into suitability gradients, underscoring MCDM’s capacity to operationalize complex environmental and social criteria.
Church and ReVelle introduced MCLP, which chooses sites to cover as much demand as possible within a set distance [13]. Many studies have extended this idea to include budgets, fairness goals, and changes over time or uncertainty [14,15,16,17]. Ref. [18] investigated model placement as a single goal of maximizing coverage. In solid-waste systems, recent reviews summarize how covering models are used and point out that results can change a lot depending on the distance measure and the service radius assumed [19,20]. A primary critique of the classical MCLP concerns its rigid, binary definition of coverage. Berman et al. (2003) [21] addressed the model’s unrealistic “all-or-nothing” assumption by introducing a gradual decay function, where service quality diminishes with distance rather than ceasing abruptly. Complementing this, Karasakal and Karasakal (2004) [22] highlighted that strict binary formulations can yield unjustified solutions where partial service is plausible, thereby motivating variants that permit partial coverage. More recent extensions have broadened the MCLP’s focus from pure efficiency to include equity. Blanco and Gázquez (2023) [23] formalized the integration of fairness, employing concepts such as α -fairness and ordered-weighted objectives. Their work underscores that distributional equity is not inherent in the basic model and must be incorporated as an explicit objective. Despite these theoretical advancements, practical siting decisions increasingly rely on spatial data and multi-criteria integration. However, using such derived scores to dynamically adjust geometric or operational parameters, such as service radii or coverage decay, remains limited [24,25,26]. Spatial feasibility often requires nonoverlap [27,28], which connects to disk packing [29] and independent set formulations [30]; in practice, spacing is frequently enforced with ad hoc buffers rather than with network metric optimization. 
In algorithms, greedy heuristics are common for NP-hard location models because they are fast and transparent [31], though not globally optimal [14,15,32].
For routing, simulated annealing [33,34] and other local search methods perform well on the traveling salesman problem (TSP) [35] and on modern VRP variants [36]. Recent applied work also moves toward joint decisions on where to locate facilities and how to route the service.
Rahmanifar et al. (2024) [37] present a non-linear multiobjective model that integrates warehouse location with vehicle routing in cold-chain logistics, and Hashemi-Amiri et al. (2023) [38] propose a tri-objective mixed-integer linear program (MILP) that unifies facility location, crew scheduling, and routing for municipal solid waste. These studies advance integrated planning, but they do not maintain a single, GIS-consistent distance metric from multi-criteria demand aggregation through siting constraints to downstream routing.

1.3. Contributions

This paper introduces GD-ARISE (GIS-integrated and Data analytic Adaptive Radius Integrated Siting and rEservicing), an end-to-end, GIS-integrated pipeline that maintains geometric consistency from demand analytics to operations. First, all spatial layers—administrative boundaries, pedestrian–road networks, floating populations, transit nodes, and waste-related points of interest—are reconciled onto a single pedestrian–road network and one distance metric, ensuring that measurements and decisions are expressed in the same geometry. Second, multi-criteria demand is constructed via AHP into a composite suitability score, and then mapped to geometry as an adaptive service radius, so that high-suitability areas receive smaller radii while lower-suitability areas receive larger radii to preserve access. Third, spatially constrained selection is formulated with site-specific radii on the network metric and implemented via a transparent greedy rule (SCASS) to produce a maximal nonoverlapping portfolio; the associated conflict structure admits an interpretation as a graph independent set problem. Fourth, depot-to-depot service routes are generated using simulated annealing on the identical network metric used upstream, closing the loop without changing geometry. Finally, the pipeline is demonstrated on waste bin siting and servicing in Seoul’s Gangnam District, where the walkable network is sampled at fine resolution, composite scores and adaptive radii are computed, a nonoverlapping portfolio of sites is selected, and operational routes are produced, including a single depot case in Samsung 1–dong. By maintaining one geometry and one metric across all stages, GD-ARISE turns GIS analytics into operations research decisions about counts, placement, spacing, and service effort under realistic constraints.

2. An Integrated Planning Algorithm: GD-ARISE

We present GD-ARISE, a unified, GIS-integrated workflow for optimal location-and-routing. Figure 1 shows the workflow of the GD-ARISE algorithm, which has three modules: GIS integration, data analytics, and optimal location-and-routing. All steps use one network distance metric on the pedestrian–road network, which keeps geometry and units consistent. The GIS integration module loads the administrative boundaries, the pedestrian–road network, floating population layers, transit nodes, and waste-related points of interest, then harmonizes coordinate reference systems, cleans fields, and clips layers to the study region. The data analytics module generates dense candidate sites along walkable streets and, for each candidate site, computes criterion scores from the GIS layers, such as population, transit access, and proximity to points of interest. All scores are normalized to $[0, 1]$, and AHP sets the weights that combine the criteria into a composite suitability. The optimal location-and-routing module first selects sites with SCASS: candidates are sorted by suitability, and a site is added if its radius does not overlap any already selected radius. It then routes service to build depot-to-depot tours. The workflow also reports the coverage attained, the number of selected sites, the route length, and the route time. Because all steps share one metric and one GIS substrate, results are easy to audit, map, and reproduce.
We work on a single geographic region $L \subset \mathbb{R}^2$ with a pedestrian–road network-induced distance $d : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}_{\ge 0}$ (shortest path on the network). The region is discretized into candidate sites $J = \{j_1, \dots, j_N\}$ sampled from the network at resolution $\delta > 0$, so adjacent samples are at most $\delta$ apart under $d$. A set of criteria $C = \{C_1, \dots, C_M\}$ evaluates each site. We assign to every $j \in J$ a composite suitability $S_j^* \in [0, 1]$ and an adaptive service radius $r_j \in [R_{\min}, R_{\max}]$ with $0 < R_{\min} < R_{\max} < \infty$. Then, we select a subset $J_{\mathrm{sel}} \subseteq J$ of size $P_{\mathrm{target}} \le N$ while enforcing nonoverlap based on $d$ and $\{r_j\}$. Finally, we route the service from a designated depot $D_0 \in L$. When routing is required, each served site carries demand $q_j \ge 0$, the fleet has $m \in \mathbb{N}$ vehicles, and each vehicle has capacity $Q > 0$. All distances and constraints use the same metric $d$.
Multi-criteria demand assessment is performed by mapping heterogeneous raw measurements into a composite demand suitability score $S_j^*$ and a site-specific spatial footprint $r_j$. The inputs are the raw criterion values or proximity-based syntheses for each $C_k \in C$, together with a pairwise comparison matrix that encodes decision priorities. Each criterion is normalized to $[0, 1]$ via a monotone transformation so that larger values consistently mean greater desirability, and AHP weights are extracted from the principal eigenvector of the comparison matrix, subject to a consistency check. Aggregation by a convex combination yields $S_j^* = \sum_k \alpha_k S_{j,k}$, where $\alpha_k$ are the AHP weights. This first stage resolves two design tensions in a data-driven yet interpretable way: it fuses many incommensurate predictors of demand into a single, dimensionless score, and it ties spatial influence to local desirability so that high-value sites are modeled with tighter catchments while low-value sites expand to preserve access.
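The weight-extraction step can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: it approximates the principal eigenvector by power iteration and computes the consistency ratio with Saaty's standard random-index table. The function name `ahp_weights` and the 3 × 3 comparison matrix are hypothetical; the matrix is not the one elicited in the paper.

```python
# Saaty's random consistency index by matrix order (standard published values).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(A, iters=500, tol=1e-12):
    """Return (weights, lambda_max, consistency_ratio) for a positive
    reciprocal pairwise comparison matrix A (list of lists, order <= 9)."""
    n = len(A)
    w = [1.0 / n] * n
    for _ in range(iters):
        # One power-iteration step, renormalized so the weights sum to one.
        Aw = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(Aw)
        new_w = [v / total for v in Aw]
        converged = max(abs(a - b) for a, b in zip(new_w, w)) < tol
        w = new_w
        if converged:
            break
    # Estimate the principal eigenvalue from the ratios (A w)_i / w_i.
    Aw = [sum(A[i][j] * w[j] for j in range(n)) for i in range(n)]
    lam = sum(Aw[i] / w[i] for i in range(n)) / n
    ci = (lam - n) / (n - 1) if n > 1 else 0.0   # consistency index
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0              # consistency ratio
    return w, lam, cr


# Hypothetical judgments for three criteria (population, transit, POIs).
# Illustrative only; NOT the comparison matrix used in the case study.
A = [[1.0, 2.0, 4.0],
     [0.5, 1.0, 2.0],
     [0.25, 0.5, 1.0]]
weights, lam_max, cr = ahp_weights(A)
```

A consistency ratio below 0.10 is the conventional acceptance threshold; the toy matrix above is perfectly consistent, so its ratio is zero up to rounding.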
For every criterion $C_k \in C$ and site $j \in J$, the non-negative value $V_{j,k} \in \mathbb{R}_{\ge 0}$ quantifies the magnitude of $C_k$ at $j$. Two constructions cover the settings of interest. When $C_k$ is directly observed at the site (such as floating population, footfall, or local demand intensity), we set $V_{j,k} = P_j$, where $P_j$ denotes the measured quantity at $j$. When $C_k$ reflects the influence of external features distributed in space, we let $F_k = \{f_s\}_{s=1}^{S_k} \subset L$ denote the relevant feature set and assign non-negative weights $\{\omega_{k,s}\}_{s=1}^{S_k}$ that sum to one and encode the relative importance of individual features or subtypes. An influence kernel $K : \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$ then maps distances to contributions so that
$$V_{j,k} = \sum_{s=1}^{S_k} \omega_{k,s} \, K\big(d(j, f_s)\big).$$
In the case study, we adopt $K(\rho) = \rho$ with $d(j, f_s)$ measured as Euclidean distance.
Because the criteria are measured in heterogeneous units, each raw value is transformed to a standardized unit-interval score via a criterion-specific monotone standardization map $f_k : \mathbb{R}_{\ge 0} \to [0, 1]$ defined by $S_{j,k} = f_k(V_{j,k})$. When $C_k$ is a benefit-type attribute for which larger values are more desirable, a min–max mapping places all sites on a common scale,
$$S_{j,k} = \begin{cases} \dfrac{V_{j,k} - \min_{J} V_{\cdot,k}}{\max_{J} V_{\cdot,k} - \min_{J} V_{\cdot,k}}, & \text{if } \max_{J} V_{\cdot,k} > \min_{J} V_{\cdot,k}, \\ 0, & \text{otherwise}, \end{cases}$$
sending the empirical minimum to 0 and the empirical maximum to 1. To translate composite suitability into a spatial service footprint, two design constants $0 < R_{\min} < R_{\max}$ specify the admissible range of coverage radii. The adaptive coverage radius at site $j$ is then defined by the affine mapping:
$$r_j = R_{\max} - (R_{\max} - R_{\min}) \, S_j^*.$$
Because $S_j^* \in [0, 1]$, this definition guarantees $r_j \in [R_{\min}, R_{\max}]$. Differentiating shows that $\partial r_j / \partial S_j^* = -(R_{\max} - R_{\min}) < 0$, so sites with higher suitability are assigned proportionally smaller catchments, concentrating service in areas of strong demand while allowing sites in weaker areas to expand their reach to preserve access. The use of a network-based metric $d$ is essential in urban contexts, where it captures the true impedance of travel.
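The normalization and radius mapping above translate directly into code. The sketch below is illustrative: the helper names `min_max` and `adaptive_radius` are hypothetical, and the bounds of 50 m and 150 m are placeholder values, not the radius bounds used in the case study.

```python
def min_max(values):
    """Benefit-type min-max normalization onto [0, 1].
    Returns all zeros when every value is identical (degenerate range)."""
    lo, hi = min(values), max(values)
    if hi <= lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def adaptive_radius(score, r_min, r_max):
    """Affine map r = R_max - (R_max - R_min) * S: higher suitability
    yields a smaller service radius."""
    return r_max - (r_max - r_min) * score


# Hypothetical raw criterion values for three candidate sites.
raw = [120.0, 480.0, 840.0]
scores = min_max(raw)                       # [0.0, 0.5, 1.0]

# Placeholder radius bounds in metres (assumed for illustration only).
R_MIN, R_MAX = 50.0, 150.0
radii = [adaptive_radius(s, R_MIN, R_MAX) for s in scores]   # [150.0, 100.0, 50.0]
```

The least suitable site receives the widest catchment and the most suitable site the tightest, which is the behavior implied by the negative derivative.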
Now, we design an algorithm for selecting locations via adaptive coverage and greedy nonoverlap, called SCASS, in the spirit of MCLP. The aim is to choose $P_{\mathrm{target}}$ target sites that maximize the amount of demand covered within the heterogeneous radii $\{r_j\}_{j \in J}$.
Proposition 1 (SCASS). Let $L \subset \mathbb{R}^2$ be a geographic region endowed with a network metric $d$, let $J = \{j_1, \dots, j_N\} \subset L$ be candidate sites with adaptive radii $\{r_j\}_{j \in J}$ determined from composite scores $\{S_j^*\}_{j \in J}$, and let $U \subset L$ be finite demand nodes with weights $\{w_u\}_{u \in U}$. For each $j \in J$ define the coverage set and the service disk:
$$C(j) = \{u \in U : d(u, j) \le r_j\}, \qquad D_j = \{x \in L : d(x, j) \le r_j\}.$$
Construct the conflict graph $G_c = (J, E)$ with an edge $\{i, j\} \in E$ iff $d(i, j) < r_i + r_j$.
(i) Nonoverlap ⟺ Independence. A subset $J_{\mathrm{sel}} \subseteq J$ satisfies the SCASS nonoverlap requirement $D_i \cap D_j = \emptyset$ for all distinct $i, j \in J_{\mathrm{sel}}$, if and only if $J_{\mathrm{sel}}$ is an independent set in $G_c$.
(ii) Adaptive radius MCLP with exclusions. For a given cardinality $P_{\mathrm{target}} \in \mathbb{N}$, the problem of choosing $J_{\mathrm{sel}} \subseteq J$ with $|J_{\mathrm{sel}}| = P_{\mathrm{target}}$ to maximize covered demand,
$$\sum_{u \in U} w_u \, \mathbf{1}\Big\{ u \in \bigcup_{j \in J_{\mathrm{sel}}} C(j) \Big\},$$
subject to SCASS nonoverlap is equivalent to the maximal covering location problem on $U$ with site-specific radii $\{r_j\}$ and the additional constraint that $J_{\mathrm{sel}}$ is an independent set of $G_c$.
(iii) Maximum-weight independent set (MWIS). If the siting objective is to maximize $\sum_{j \in J_{\mathrm{sel}}} S_j^*$ subject to nonoverlap and $|J_{\mathrm{sel}}| = P_{\mathrm{target}}$, then the problem is a cardinality-constrained MWIS on $G_c$:
$$\max_{J_{\mathrm{sel}} \subseteq J} \; \sum_{j \in J_{\mathrm{sel}}} S_j^* \quad \text{s.t.} \quad |J_{\mathrm{sel}}| = P_{\mathrm{target}}, \quad J_{\mathrm{sel}} \text{ independent in } G_c.$$
Equivalently, with binaries $x_j \in \{0, 1\}$,
$$\max_{x \in \{0,1\}^N} \; \sum_{j=1}^{N} S_j^* x_j \quad \text{s.t.} \quad x_i + x_j \le 1 \;\; \forall \{i, j\} \in E, \quad \sum_{j=1}^{N} x_j = P_{\mathrm{target}}.$$
Proof. For (i), if $D_i \cap D_j \neq \emptyset$, then there exists $x$ with $d(i, j) \le d(i, x) + d(x, j) \le r_i + r_j$ by the triangle inequality; apart from the boundary case $d(i, j) = r_i + r_j$, in which the disks can meet only on their boundaries and which the strict inequality defining $E$ treats as nonoverlapping, this gives $\{i, j\} \in E$, so the pair cannot be jointly selected. Conversely, if $d(i, j) \ge r_i + r_j$, then no $x$ can lie strictly inside both disks, so $D_i \cap D_j = \emptyset$ under the same convention. This establishes the equivalence between nonoverlap and independence in $G_c$. Statement (ii) inserts this feasibility into the adaptive radius MCLP coverage objective, so a feasible solution is exactly an independent set of prescribed size. Statement (iii) is a direct translation of the selection objective into a cardinality-constrained MWIS, yielding the stated 0–1 formulation. □
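The greedy rule described for SCASS (sort by suitability, accept a candidate only if its disk does not conflict with any accepted disk) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `scass_greedy`, the dictionary layout, and the toy coordinates are hypothetical, and Euclidean distance stands in for the network shortest-path metric $d$ that the pipeline actually uses.

```python
import math


def scass_greedy(sites, dist, p_target=None):
    """Greedy independent-set selection on the conflict graph.

    sites    : list of dicts with keys "id", "score", "radius"
    dist     : callable dist(id_a, id_b) -> distance in the chosen metric
    p_target : optional cap on the number of selected sites
    """
    order = sorted(sites, key=lambda s: s["score"], reverse=True)
    selected = []
    for cand in order:
        if p_target is not None and len(selected) >= p_target:
            break
        # Accept only if d(i, j) >= r_i + r_j against every accepted site,
        # i.e. no edge of the conflict graph joins cand to the selection.
        if all(dist(cand["id"], s["id"]) >= cand["radius"] + s["radius"]
               for s in selected):
            selected.append(cand)
    return selected


# Toy instance on a line (coordinates in metres, assumed for illustration).
coords = {"a": (0.0, 0.0), "b": (60.0, 0.0), "c": (200.0, 0.0), "d": (320.0, 0.0)}


def euclid(i, j):
    # Stand-in metric; the paper uses network shortest-path distance instead.
    return math.dist(coords[i], coords[j])


sites = [
    {"id": "a", "score": 0.9, "radius": 60.0},
    {"id": "b", "score": 0.8, "radius": 70.0},
    {"id": "c", "score": 0.5, "radius": 100.0},
    {"id": "d", "score": 0.4, "radius": 110.0},
]
chosen = [s["id"] for s in scass_greedy(sites, euclid)]
```

Here `b` is rejected because it lies 60 m from `a` while the radii sum to 130 m, and `d` is rejected because it conflicts with `c`. The greedy output is a maximal independent set but, as noted in Section 1.2 for greedy heuristics in general, it is not guaranteed to be a maximum-weight one.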
Finally, we optimize the service route problem over the selected sites and depot using a capacitated vehicle-routing model. The node set is $V = \{0, 1, \dots, K\}$ with $K = |J_{\mathrm{sel}}|$. Node 0 is the depot $D_0$ (an operations center or garage), and node $i$ corresponds to site $j_i \in J_{\mathrm{sel}}$. Edge costs are the network distances $d_{ij} = d(j_i, j_j)$ with $j_0 \equiv D_0$. Demands $\{q_{j_i}\}_{i=1}^{K}$ and a common capacity $Q$ define feasibility, and the goal is to partition $J_{\mathrm{sel}}$ into at most $m$ depot-to-depot tours minimizing total travel cost while serving each site exactly once and respecting capacity. Because the problem is NP-hard and exact mixed-integer formulations do not scale to realistic instances, we adopt a simulated annealing search over permutations augmented with depot separators. A candidate solution is encoded as a sequence that starts and ends at 0 and contains each customer once, together with $m - 1$ additional depot symbols; cutting at the depot symbols yields the $m$ routes. The total cost is the sum of $d_{ij}$ along the sequence, and capacity feasibility amounts to verifying that the sum of demands on each between-depot segment does not exceed $Q$. This stage converts a strategic siting outcome into operationally feasible tours computed on the same network metric $d$ used in the first two stages.
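The separator encoding lends itself to a compact simulated annealing loop. The sketch below is one possible realization, not the authors' code: the move set (swap and segment reversal), the geometric cooling schedule, the penalty weight for capacity violations, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def split_routes(seq):
    """Cut a depot-delimited sequence into customer routes (empty ones dropped)."""
    routes, cur = [], []
    for node in seq[1:]:
        if node == 0:
            if cur:
                routes.append(cur)
            cur = []
        else:
            cur.append(node)
    return routes


def cost(seq, d, q, Q, penalty=1e6):
    """Travel distance along the sequence plus a penalty per unit of overload."""
    travel = sum(d[a][b] for a, b in zip(seq, seq[1:]))
    overload = sum(max(0.0, sum(q[i] for i in r) - Q) for r in split_routes(seq))
    return travel + penalty * overload


def anneal(d, q, Q, m, iters=5000, t0=100.0, alpha=0.999, seed=0):
    """Simulated annealing over customer permutations with m - 1 depot separators."""
    rng = random.Random(seed)
    K = len(d) - 1
    inner = list(range(1, K + 1)) + [0] * (m - 1)
    rng.shuffle(inner)
    cur = [0] + inner + [0]
    cur_cost = cost(cur, d, q, Q)
    best, best_cost, t = cur[:], cur_cost, t0
    for _ in range(iters):
        i, j = sorted(rng.sample(range(1, len(cur) - 1), 2))
        cand = cur[:]
        if rng.random() < 0.5:
            cand[i], cand[j] = cand[j], cand[i]          # swap two symbols
        else:
            cand[i:j + 1] = reversed(cand[i:j + 1])      # reverse a segment
        c = cost(cand, d, q, Q)
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if c < cur_cost or rng.random() < math.exp((cur_cost - c) / t):
            cur, cur_cost = cand, c
            if c < best_cost:
                best, best_cost = cand[:], c
        t *= alpha                                       # geometric cooling
    return best, best_cost


# Toy instance: depot at index 0 plus four sites; Euclidean costs stand in
# for the network distances d_ij used in the paper.
pts = [(0, 0), (0, 10), (10, 10), (10, 0), (5, -8)]
d = [[math.dist(a, b) for b in pts] for a in pts]
q = [0, 1, 1, 1, 1]
best, best_cost = anneal(d, q, Q=2, m=2)
```

Penalizing overload instead of rejecting infeasible sequences keeps the neighborhood connected, so the search can pass through infeasible states on its way between feasible ones. The fixed seed only makes the sketch reproducible.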
Routing operates on these selected sites and the designated depot $D_0 \in L$. To harmonize indexing, fix an arbitrary bijection between $\{1, \dots, K\}$ and $J_{\mathrm{sel}}$ and write $j_i$ for the site associated with index $i$. Define the node set $V = \{0, 1, \dots, K\}$, where node 0 corresponds to the depot $D_0$ and node $i \in \{1, \dots, K\}$ corresponds to site $j_i$. For any pair $(i, j) \in V \times V$, define $d_{ij} = d(j_i, j_j)$ with the convention $j_0 \equiv D_0$ and exclude self-loops via $x_{ii} = 0$. Each customer node $i \in \{1, \dots, K\}$ is assigned a non-negative service demand $q_i \ge 0$ (e.g., expected daily pickups or deliveries), and vehicles have a common capacity $Q > 0$ in the same units as the $q_i$. Let $m \in \mathbb{N}$ be the available fleet size. A vehicle route is a directed cycle that starts and ends at the depot, visits a subset of customers exactly once, and respects capacity. We encode routing decisions with binary arc variables $x_{ij} \in \{0, 1\}$ that indicate whether a vehicle travels directly from node $i$ to node $j$. To enforce capacity and eliminate subtours, we introduce continuous load–flow variables using the classical single-commodity formulation: let $y_i$ denote the cumulative load delivered up to and including customer $i$ on the route that visits $i$, measured from zero at the depot. The capacitated vehicle routing problem on $(V, d_{ij})$ is then
$$\min_{x, y} \; \sum_{i=0}^{K} \sum_{j=0}^{K} d_{ij} x_{ij}$$
subject to the depot degree constraints $\sum_{j=1}^{K} x_{0j} = m$ and $\sum_{i=1}^{K} x_{i0} = m$, the customer in- and out-degree constraints $\sum_{i=0}^{K} x_{ih} = 1$ and $\sum_{j=0}^{K} x_{hj} = 1$ for every $h \in \{1, \dots, K\}$, and the load–flow constraints
$$y_j \ge y_i + q_j - Q (1 - x_{ij}) \quad \text{for all } i \in V, \; j \in \{1, \dots, K\},$$
together with the bounds $y_0 = 0$ and $q_i \le y_i \le Q$ for all $i \in \{1, \dots, K\}$, and the binary and non-negativity restrictions $x_{ij} \in \{0, 1\}$ and $y_i \ge 0$. The degree constraints ensure that every customer is entered and left exactly once and that exactly $m$ tours depart from and return to the depot. The load–flow inequalities propagate cumulative load along used arcs: if $x_{ij} = 1$, then $y_j \ge y_i + q_j$, so when a vehicle traverses $(i, j)$, it must have delivered an additional $q_j$ units by the time it leaves $j$; if $x_{ij} = 0$, the constraint is relaxed by $Q$ and is already implied by the bounds on $y$. The bounds $y_0 = 0$ and $y_i \le Q$ enforce vehicle capacity and, together with flow propagation, preclude subtours disconnected from the depot, since any positive delivery in a closed customer-only cycle would force $y$ to grow without the possibility of resetting to 0. This mixed-integer program is NP-hard; indeed, when $m = 1$ and $Q \ge \sum_{i=1}^{K} q_i$, the problem reduces to the classical traveling salesman problem on $\{0, 1, \dots, K\}$.
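To make the role of each constraint family concrete, the following sketch builds the arc variables $x_{ij}$ and cumulative loads $y_i$ implied by a given set of routes and checks them against the formulation. It is a verification aid under stated assumptions, not a solver: the function name `check_cvrp` and the toy data are hypothetical.

```python
def check_cvrp(routes, q, Q, m, eps=1e-9):
    """Verify a route set against the degree, load-flow, and bound constraints.

    routes : list of customer-index lists, each implicitly depot -> ... -> depot
    q      : demands with q[0] = 0 for the depot
    Q      : common vehicle capacity
    m      : required number of tours
    """
    K = len(q) - 1
    x = [[0] * (K + 1) for _ in range(K + 1)]
    y = [0.0] * (K + 1)
    for route in routes:
        prev, load = 0, 0.0
        for node in route:
            x[prev][node] += 1
            load += q[node]
            y[node] = load          # cumulative load up to and including node
            prev = node
        x[prev][0] += 1

    customers = range(1, K + 1)
    depot_ok = (sum(x[0][j] for j in customers) == m
                and sum(x[i][0] for i in customers) == m)
    degree_ok = all(sum(x[i][h] for i in range(K + 1)) == 1
                    and sum(x[h][j] for j in range(K + 1)) == 1
                    for h in customers)
    flow_ok = all(y[j] >= y[i] + q[j] - Q * (1 - x[i][j]) - eps
                  for i in range(K + 1) for j in customers)
    bounds_ok = all(q[i] - eps <= y[i] <= Q + eps for i in customers)
    return depot_ok and degree_ok and flow_ok and bounds_ok


# Toy data: four unit-demand customers, capacity 2, two vehicles.
q = [0, 1, 1, 1, 1]
feasible = check_cvrp([[1, 2], [3, 4]], q, Q=2, m=2)       # True
overloaded = check_cvrp([[1, 2, 3], [4]], q, Q=2, m=2)     # False: y_3 = 3 > Q
```

In the overloaded case every degree constraint still holds; it is the bound $y_i \le Q$ that rejects the solution, which shows how capacity enters the model through the load variables alone.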

3. Application

Typical applications of the proposed algorithm include siting clinics, fire stations, micro-mobility docks, or public waste bins, each requiring a coherent pipeline. We illustrate the approach on public waste bin siting and servicing in the Gangnam District of Seoul, a dense mixed-use environment in which pedestrian flows, transit access, and commercial intensity co-produce spatially and temporally concentrated litter generation. Figure 2 presents the detailed computational workflow to apply the proposed GD-ARISE with the case study on public waste-bin siting and servicing in Seoul’s Gangnam District.

3.1. GIS Analytics

The application domain is the Gangnam District of Seoul. We set the study region of the framework to $L = G \subset \mathbb{R}^2$, and we use a single distance metric $d : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}_{\ge 0}$ throughout all stages, instantiated in practice as the shortest-path metric induced by the pedestrian–road network $\mathcal{G}$ so that distances reflect walk times and barriers rather than straight lines. Gangnam’s land use incorporates corridors along Teheran–ro, high-street retail near Gangnam Boulevard, entertainment clusters in Apgujeong and Cheongdam, and multiple subway interchanges, which together induce marked spatiotemporal variation in footfall. We represent this with a non-negative pedestrian density field $P : G \times [0, T] \to \mathbb{R}_{\ge 0}$ over a representative horizon $[0, T]$ (e.g., one day). Administrative boundaries for $G$ were obtained as polygonal census layers and ingested into a GeoPandas workflow. To ensure geometric consistency across heterogeneous sources, all layers were reprojected to the common geographic CRS EPSG:4326 (WGS84) for integration with web data, and then to EPSG:5186 (Korea 2000 / Central Belt 2010) for all computations that require metric accuracy. All geoprocessing and network analyses were performed in Python (v3.11.11) using GeoPandas (v1.0.1) for spatial data handling, OSMnx (v2.0.2) for extracting the pedestrian network from OpenStreetMap, Shapely (v2.1.0) for geometric operations (e.g., buffering and distance), and Folium (v0.19.5) for interactive map visualization. Table 1 summarizes the GIS integration for the Gangnam case. It shows each spatial layer, how we prepare it, and how it is used in the model.
Candidate facility sites are drawn from the walkable subgraph of OpenStreetMap within $G$. We extracted footways, pedestrian paths, sidewalks, and low-speed residential links using OSMnx, simplified the network to retain unique traversable edges, and then sampled points along these edges at a fixed network spacing $\delta = 10$ m. The result is a finite candidate set $J = \{j_1, \dots, j_N\} \subset G$ with $N = 32{,}890$ points for feasible bin locations. Each $j \in J$ inherits attributes from intersecting administrative polygons via spatial join (e.g., sub-district codes) and is snapped to the nearest network node to avoid topological artifacts when computing $d(\cdot, \cdot)$. As shown in Figure 3, a random sample of candidate waste bin sites illustrates the spatial spread of the full feasible set across the walkable network. Blue points depict a random subset of 1000 candidates drawn from the full set of 32,890 network-anchored sites (black lines: pedestrian–road network; grey polygons: output-area boundaries). The sample visualizes coverage of feasible placements prior to scoring and selection.
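Fixed-spacing sampling along an edge can be illustrated without any GIS library. The sketch below is a simplified stand-in for the sampling step, assuming each edge is available as a polyline in a projected, metre-based coordinate system such as EPSG:5186; the function name `sample_along` is hypothetical, and the actual pipeline operates on OSMnx edge geometries.

```python
import math


def sample_along(polyline, spacing):
    """Return points every `spacing` units of arc length along a polyline,
    starting at its first vertex. Coordinates are assumed to be projected
    (metres), so Euclidean segment length equals ground distance."""
    points = [polyline[0]]
    carry = 0.0  # arc length travelled since the last emitted sample
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        if seg == 0.0:
            continue  # skip duplicate vertices
        pos = spacing - carry  # offset of the next sample inside this segment
        while pos <= seg + 1e-12:
            t = pos / seg
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            pos += spacing
        carry = seg - (pos - spacing)
    return points


# An L-shaped edge of total length 50 m, sampled at the paper's 10 m spacing.
edge = [(0.0, 0.0), (25.0, 0.0), (25.0, 25.0)]
samples = sample_along(edge, 10.0)
```

Carrying the leftover length across vertices keeps the spacing uniform in arc length, so the sample after the corner sits 5 m up the second segment instead of restarting at the vertex.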

3.2. Data Analytics: Demand Criteria and Feature Construction

Waste bin demand is driven by cumulative exposure to pedestrians and by proximity to attractors such as transit nodes or retail frontages. To support coverage modeling and validation, we also assemble a demand representation that is consistent with the framework’s notation. Let $U \subset G$ denote a set of demand nodes at which pedestrian exposure and ancillary variables are tabulated; in practice, $U$ may consist of vertices of the network $\mathcal{G}$ inside $G$ or centroids of census micro-polygons. All exogenous point datasets (e.g., transit stops and points of interest) are cleaned, deduplicated within a tolerance in EPSG:5186, and converted to GeoDataFrames for nearest-neighbor and kernel computations. Where a dataset is temporally indexed, we aggregate to representative daily means so that demand scores represent a typical day on the planning horizon. The dense sampling of the pedestrian network at $\delta = 10$ m yields $|J| = 32{,}890$ feasible bin locations, allowing the adaptive radii $\{r_j\}$ to respond to fine-scale variation in the built environment. Distances used in all subsequent computations, namely the spatial-exclusion checks $d(i, j) \ge r_i + r_j$ and the routing costs $d_{ij} = d(j_i, j_j)$, are evaluated with the same network metric $d$ on $\mathcal{G}$. Table 2 summarizes the data analytics used in the Gangnam district case, listing each variable category, how the data are sourced and preprocessed, and how they enter the model as criteria scores $S_{j,k}$ or routing inputs.
Table 3 summarizes the demographic datasets, acquired from the Seoul Open Data platform, that are integrated into the GD-ARISE pipeline for the Gangnam district case study. The datasets include Seoul’s living-population estimates for domestic and foreign residents. Each living-population dataset represents the estimated number of people present in a specific location and time, derived by combining administrative records (resident registry, transport, business, and building databases) with KT (Korea Telecom) big data. The census-block boundary layer (EPSG:5186) provides the spatial framework for aggregating and visualizing these population estimates. Together, these layers constitute the demographic foundation for quantifying spatial patterns of pedestrian exposure and population density within the GD-ARISE framework.
Table 4 summarizes the transit-related GIS layers integrated into the GD-ARISE pipeline to represent multimodal accessibility across Gangnam District. The datasets include bus stop locations and subway entrance coordinates, each obtained from verified public sources and re-projected to a unified coordinate reference system (EPSG:5186) for geometric consistency. The bus stop layer provides detailed node attributes, such as stop IDs, names, and coordinates, while the subway entrance layer contains manually extracted latitude–longitude pairs for all access points within the study area. Together, these layers constitute the transit node component of criterion $C_2$, serving as the spatial foundation for the proximity-based accessibility analysis in subsequent stages of the GD-ARISE framework. In addition, Table 5 details the source datasets and preprocessing for the waste-related POIs criterion $C_3$.
Now, we discretize $G$ by sampling the network $\mathcal{G}$ at the spatial resolution $\delta > 0$ to obtain a candidate set of feasible bin sites $J = \{j_1, \dots, j_N\} \subset G$ and, independently, a demand lattice $U = \{u_1, \dots, u_M\} \subset G$. The criterion collection $C = \{C_1, \dots, C_M\}$ is expressed for waste bin siting, with observable correlates of litter pressure and disposal opportunity, in a manner that matches the demand formulation. Floating population is treated as a direct, site-specific magnitude: block-level counts from the observation period are averaged to obtain a mean daily exposure and then spatially joined to both $J$ and $U$. When a candidate point or demand node falls within overlapping administrative polygons, the exposure is taken as the mean of the overlapping values, which yields well-defined raw measurements $V_{j,k}$ for the population criterion and preserves mass under areal interpolation. Public transit proximity is encoded via proximity to bus stops and subway exits. After loading both datasets and projecting to EPSG:5186, we compute for each $j \in J$ the shortest path distance along $\mathcal{G}$ to the nearest stop and to the nearest exit. Local waste generation potential is represented by proximity to POIs that tend to generate street litter, such as convenience stores, cafés, food trucks, and public parks. Each POI set is harmonized into a single layer with subtype weights $\omega_{k,s} \ge 0$ that sum to one within the criterion.
As shown in Figure 4, population scores are highly right-skewed: most candidates lie on low-exposure segments, and a small fraction of hotspots created by concentrated corridors dominates the long upper tail. Floating population $C_1$ is a direct site-specific magnitude derived from block-level daily counts aggregated over the observation period. Let $P_j$ denote the mean daily floating population assigned to site $j$ via areal interpolation from its containing (or overlapping) census blocks. To place $P_j$ on a unit scale while preserving ranks, we apply benefit-type min–max normalization over all candidates,
$$S_{j,\mathrm{pop}} = \frac{P_j - \min_{i \in J} P_i}{\max_{i \in J} P_i - \min_{i \in J} P_i},$$
which yields $S_{j,\mathrm{pop}} \in [0, 1]$ and encodes relative pedestrian exposure.
As shown in Figure 5, disposal scores are mostly moderate with a broad mode around the mid-range, plus a spike at zero for candidates far from transit. The distribution is broadly spread with a mid-range mode and a mass at zero reflecting locations beyond the 300 m influence of both bus stops and subway exits. Transit proximate disposal opportunity C 2 is modeled as proximity to the nearest bus stop and the nearest subway exit, recognizing that on–off flows around stations correlate with both waste generation and appropriate placement of receptacles. Let d j , bus and d j , sub denote the network distances from j to the nearest bus stop and the nearest subway exit, respectively. A convex combination of linear distance decay kernels with a common maximum influence range D max , disp = 300 m produces a unit-interval score
S j , disp = ω bus max 0 , 1 d j , bus 300 + ω sub max 0 , 1 d j , sub 300 , where ω bus = 0.75 and ω sub = 0.25 .
which reflects the stronger baseline frequency of bus stops relative to subway exits, while allowing both to contribute when they are nearby.
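The disposal score can be sketched in a few lines of Python. The weights and the 300 m range are taken from the text; the function names `linear_decay` and `disposal_score` are hypothetical, and the distances passed in are assumed to be network distances computed elsewhere.

```python
D_MAX = 300.0              # common influence range in metres (from the text)
W_BUS, W_SUB = 0.75, 0.25  # transit weights (from the text)


def linear_decay(distance_m, d_max=D_MAX):
    """Linear kernel: 1 at distance 0, tapering to 0 at d_max and beyond."""
    return max(0.0, 1.0 - distance_m / d_max)


def disposal_score(d_bus, d_sub):
    """Convex combination of the bus-stop and subway-exit kernels."""
    return W_BUS * linear_decay(d_bus) + W_SUB * linear_decay(d_sub)


# A site 150 m from a bus stop and 300 m from a subway exit:
# only the bus kernel contributes, 0.75 * 0.5.
example = disposal_score(150.0, 300.0)
```

A site farther than 300 m from both facility types scores exactly zero, which corresponds to the mass at zero described for Figure 5.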
As shown in Figure 6, shop proximity scores exhibit two masses: a spike at zero for sites in POI-sparse areas with no nearby POIs, and a broad peak around 0.5–0.6 for neighborhoods where several POIs lie within the decay radius. Waste source proximity $C_3$ aggregates the influence of POIs associated with street litter. Let $K = \{\mathrm{conv}, \mathrm{cafe}, \mathrm{truck}, \mathrm{park}\}$ index convenience stores, cafés, food trucks, and parks, and let $d_{j,k}$ denote the network distance from $j$ to the nearest POI of subtype $k$. Subtype weights $\{w_k\}_{k \in K}$ encode relative propensities for waste exposure and satisfy $\sum_{k \in K} w_k = 1$. Using the same 300 m influence range, we set
$$S_{j,\mathrm{shop}} = \sum_{k \in K} w_k \max\left\{0,\, 1 - \frac{d_{j,k}}{300}\right\}, \quad \text{where } (w_{\mathrm{conv}}, w_{\mathrm{cafe}}, w_{\mathrm{truck}}, w_{\mathrm{park}}) = (0.35, 0.35, 0.20, 0.10),$$
so that co-location near multiple waste-related attractors increases the score while contributions taper linearly to zero at 300 m.
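As a worked illustration, consider a hypothetical site whose nearest convenience store, café, food truck, and park lie at network distances of 60 m, 150 m, 400 m, and 240 m (illustrative values, not taken from the Gangnam data). The food truck lies beyond the 300 m range and contributes nothing:

```latex
\begin{aligned}
S_{j,\mathrm{shop}}
 &= 0.35\left(1 - \tfrac{60}{300}\right) + 0.35\left(1 - \tfrac{150}{300}\right)
  + 0.20\,\max\left\{0,\, 1 - \tfrac{400}{300}\right\} + 0.10\left(1 - \tfrac{240}{300}\right) \\
 &= 0.280 + 0.175 + 0 + 0.020 = 0.475 .
\end{aligned}
```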
The empirical distributions of the component scores and the composite reveal substantial spatial heterogeneity across Gangnam’s pedestrian network and inform the subsequent radius mapping. The floating population score $S_{j,\mathrm{pop}}$ spans $[0, 1]$ and is right-skewed, with a mean of 0.1133, a standard deviation of 0.1380, and a 75th percentile of 0.1339, reflecting concentrated corridors of high exposure. The transit disposal score $S_{j,\mathrm{disp}}$ ranges over $[0, 0.9780]$ and is more uniform, with a mean of 0.4170, a standard deviation of 0.2257, and a median of 0.4450, consistent with the dense but heterogeneous distribution of bus stops and subway exits. The POI proximity score $S_{j,\mathrm{shop}}$ is generally small, with a maximum of 0.0987, a mean of 0.0275, a standard deviation of 0.0288, and a 75th percentile of 0.0515, indicating that only a minority of network points lie within short walks of multiple waste-generating attractors.
We specify three criteria $C = \{C_1, C_2, C_3\}$ that capture complementary drivers of litter pressure and disposal opportunity: $C_1$ encodes floating population intensity, $C_2$ captures transit-proximate disposal likelihood, and $C_3$ reflects proximity to waste-generating points of interest (POIs). For each $j \in J$ and criterion $C_k$, a raw, non-negative measurement $V_{j,k}$ is constructed and mapped to a unit-interval score $S_{j,k} \in [0, 1]$ via a monotone transformation $f_k$ so that larger values consistently indicate greater suitability for siting a bin at $j$. With the criterion scores in hand, the relative importance of population exposure, transit opportunity, and POI proximity is computed via the AHP. To clarify the AHP elicitation, we provide the pairwise comparison scale and the exact questions used for expert judgments.
AHP pairwise comparison scale used for expert elicitation.
| Scale | Definition |
|---|---|
| 1 | Equally important |
| 3 | Slightly more important |
| 5 | Moderately more important |
| 7 | Strongly more important |
| 9 | Absolutely more important |
| 2, 4, 6, 8 | Intermediate between adjacent judgments |
| Reciprocals | Inverse when criterion B is preferred over A |
Pairwise comparison questions used for expert elicitation were as follows:
Q1. How much more important is distance to waste-source facilities (cafés, convenience stores) than floating population? Answer: $1/5$ (population is moderately more important than shop).
Q2. How much more important is disposal likelihood near transit stops (bus, subway) than floating population? Answer: $1/5$ (population is moderately more important than disposal).
Q3. How much more important is disposal likelihood near transit stops than waste-source proximity? Answer: 3 (disposal is slightly more important than shop).
Two municipal officers completed the elicitation independently. Their responses were aggregated into the following pairwise comparison matrix:
$$A = \begin{pmatrix} 1 & 5 & 5 \\ \tfrac{1}{5} & 1 & \tfrac{1}{3} \\ \tfrac{1}{5} & 3 & 1 \end{pmatrix}.$$
With rows and columns ordered as population, shop proximity, and disposal likelihood (the order implied by Q1–Q3), this matrix indicates that population was judged moderately more important than both disposal likelihood and shop proximity, and that disposal likelihood was slightly more important than shop proximity. Using the normalized average method on $A$, we obtain the AHP weights,
$$\alpha_{\mathrm{pop}} = 0.6864, \quad \alpha_{\mathrm{disp}} = 0.2114, \quad \alpha_{\mathrm{shop}} = 0.1022,$$
reported to four decimal places. The composite suitability at site j is the convex combination given by
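$$S_j^{*} = \alpha_{\mathrm{pop}} S_{j,\mathrm{pop}} + \alpha_{\mathrm{disp}} S_{j,\mathrm{disp}} + \alpha_{\mathrm{shop}} S_{j,\mathrm{shop}} = 0.6864\, S_{j,\mathrm{pop}} + 0.2114\, S_{j,\mathrm{disp}} + 0.1022\, S_{j,\mathrm{shop}},$$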
which, by construction, lies in $[0, 1]$ and is strictly increasing in each constituent score. This $S_j^{*}$ serves as the demand score output used downstream to assign adaptive radii and to prioritize candidates during siting. Aggregation by the AHP weights produces a composite $S_j^{*}$ that remains right-skewed, with a mean of 0.2083, a standard deviation of 0.1199, a median of 0.1883, a maximum of 0.9301, and a 75th percentile of 0.2551. Under the adaptive radius map $r_j = R_{\max} - (R_{\max} - R_{\min})\, S_j^{*}$, these statistics translate into smaller service radii in the highest-scoring corridors and larger radii in peripheral or residential areas, ensuring fine spatial granularity where litter pressure is greatest while preserving baseline access elsewhere. All distances entering the kernels are evaluated with the same network metric $d$ as used in the siting and routing stages, maintaining geometric consistency across the full GD-ARISE pipeline.
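The weight computation can be checked with a short Python sketch of the normalized average method (normalize each column of the pairwise matrix to sum to one, then average across each row). The matrix ordering (population, shop, disposal) is the one implied by questions Q1–Q3; the names `ahp_weights` and `composite` are hypothetical. With this ordering the sketch reproduces the reported weights to four decimal places.

```python
# Pairwise comparison matrix, rows/columns ordered (population, shop, disposal).
A = [
    [1.0, 5.0, 5.0],
    [1 / 5, 1.0, 1 / 3],
    [1 / 5, 3.0, 1.0],
]


def ahp_weights(matrix):
    """Normalized average method: column-normalize, then take row means."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [
        sum(matrix[i][j] / col_sums[j] for j in range(n)) / n
        for i in range(n)
    ]


w_pop, w_shop, w_disp = ahp_weights(A)


def composite(s_pop, s_disp, s_shop):
    """Composite suitability as the weighted sum of the three unit scores."""
    return w_pop * s_pop + w_disp * s_disp + w_shop * s_shop
```

Since the weights are non-negative and sum to one, `composite` stays in the unit interval whenever its three inputs do.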
Figure 7 shows a heatmap of the composite demand surface, which is highly clustered: the highest percentiles concentrate in pronounced bands along the northern–central corridors and taper toward the southeast and the periphery. Colors show percentile ranks from low (light) to high (dark); road segments and administrative polygons are overlaid for reference.

3.3. SCASS

This stage specializes the SCASS formulation to the Gangnam candidate set $J = \{j_1, \ldots, j_N\}$ with $N = 32{,}890$ points sampled every $\delta = 10$ m along the pedestrian–road network $G$, using the same network-based distance metric $d$ as in the scoring stage. The input at each site $j \in J$ is the composite suitability $S_j^{*} \in [0, 1]$ obtained from the AHP-based demand assessment. Two preprocessing choices are made to align data coverage with plausible service contexts while preserving the mathematical structure of SCASS. First, a proximity sanity check is applied to screen out isolated candidates: for each $j$ we test whether there exists at least one demand facility (bus stop, subway exit, or waste-related POI) within 1 km under $d$. In the Gangnam dataset, every candidate satisfies this condition, so the working set remains $J$. Second, each candidate is assigned an adaptive coverage radius $r_j$ via the affine map
$$r_j = R_{\max} - (R_{\max} - R_{\min})\, S_j^{*},$$
so that higher suitability implies a smaller catchment, consistent with dense placement in demand hotspots, while lower suitability enlarges catchments to preserve baseline access elsewhere. Applied to Gangnam with $(R_{\min}, R_{\max}) = (30\ \text{m}, 150\ \text{m})$ and $P_{\text{target}} = 500$, the greedy procedure selects $|J_{\text{sel}}| = 347$ sites before all remaining candidates conflict with at least one already-selected site; the nonoverlap constraint, rather than the numerical target, is therefore binding and determines the achieved count. The realized $J_{\text{sel}}$ concentrates in high-scoring corridors around major commercial and transit axes, while respecting minimum separations implied by the adaptive radii and thereby avoiding redundant service areas.
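The selection logic can be sketched as follows. This is an illustrative reading rather than the authors' implementation: it assumes that candidates are visited in descending order of composite score and that two sites conflict when their network distance is smaller than the sum of their radii. The function names, the toy positions on a line, and the toy scores are all hypothetical.

```python
def adaptive_radius(score, r_min=30.0, r_max=150.0):
    """Affine map: score 0 gives r_max, score 1 gives r_min."""
    return r_max - (r_max - r_min) * score


def greedy_select(scores, dist, p_target, r_min=30.0, r_max=150.0):
    """Greedy nonoverlap selection (assumed rule: highest score first)."""
    radii = [adaptive_radius(s, r_min, r_max) for s in scores]
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    selected = []
    for j in order:
        if len(selected) >= p_target:
            break
        # Assumed conflict test: keep j only if it is at least
        # r_j + r_i away from every already-selected site i.
        if all(dist(j, i) >= radii[j] + radii[i] for i in selected):
            selected.append(j)
    return selected, radii


# Toy instance: five candidates on a line (positions in metres).
positions = [0.0, 50.0, 100.0, 200.0, 400.0]
scores = [0.9, 0.8, 0.2, 0.5, 0.1]
chosen, radii = greedy_select(
    scores, lambda a, b: abs(positions[a] - positions[b]), p_target=10
)
```

In the toy instance only candidates 0 and 3 are kept although the target is 10, mirroring the situation in the text where nonoverlap, not the target, limits the portfolio. For the full candidate set, the pairwise checks would in practice be restricted to nearby sites, for example via bounded shortest-path searches.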
Figure 8 displays the final set of selected sites. Network edges are shown in black; selections were obtained by the SCASS greedy nonoverlap procedure on the pedestrian–road network. Table 6 summarizes key parameters and resulting outcomes for the SCASS stage: the candidate set size and sampling, the proximity filter, the adaptive radius bounds and mapping, the target portfolio size, the number of selected sites, and the shared network distance metric.

3.4. Collection Route Optimization: Samsung 1–dong

To instantiate routing optimization on a concrete sub-area, we focus on Samsung 1–dong within Gangnam and route a single collection vehicle to visit the $K = 37$ waste-bin sites that were previously selected there. The depot is fixed at $D_0 =$ (37.50867° N, 127.086466° E), and we retain the notation of the main framework by labeling the selected bins $\{j_1, \ldots, j_{37}\}$ and defining the node set $V = \{0, 1, \ldots, 37\}$ with $j_0 \equiv D_0$. Consistent with the end-to-end pipeline, inter-node travel costs are evaluated using the same distance metric $d$ induced by the pedestrian–road network $G$ restricted to the Samsung 1–dong extent. Because all bins must be serviced exactly once and, for this small cluster, a single vehicle is sufficient, we solve the capacitated vehicle routing model in the special case $m = 1$ and $Q \ge \sum_{i=1}^{37} q_i$, which reduces to a traveling salesman problem on the node set $V$. A route is represented by a closed walk $(\pi_0, \pi_1, \ldots, \pi_{37}, \pi_{38})$ with $\pi_0 = \pi_{38} = 0$ and $\{\pi_1, \ldots, \pi_{37}\} = \{1, \ldots, 37\}$. Its network cost is
$$C(\pi) = \sum_{t=0}^{37} d(\pi_t, \pi_{t+1}),$$
and the optimization objective is to find $\pi$ that minimizes $C(\pi)$.
As shown in Figure 9, simulated annealing produces a single depot-to-depot tour that visits each of the 37 selected Samsung 1–dong sites exactly once and returns to the depot, evaluated on the same network metric used for siting. The green marker denotes the depot; blue points mark the 37 selected sites labeled in visit order; orange lines depict the annealed tour; the black polygon outlines the Samsung 1–dong boundary. Because the encoding, acceptance logic, and termination criteria conform exactly to the routing specification, this Samsung 1–dong case demonstrates that the GD-ARISE pipeline maintains geometric consistency from demand scoring through siting to routing, and that the final routing layer can be solved to high quality with modest computation even when distances respect real network impedances rather than idealized straight lines.
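A compact simulated annealing sketch for the depot-to-depot tour is given below. It is not the authors' implementation: the segment-reversal (2-opt style) move, the Metropolis acceptance rule, and the geometric cooling schedule are standard choices assumed here, the schedule parameters are deliberately lighter than those reported in the paper, and the toy distance matrix uses Manhattan distances as a stand-in for network distances. Node 0 plays the role of the depot.

```python
import math
import random


def tour_cost(order, dist):
    """Cost of the closed walk depot(0) -> order -> depot(0)."""
    path = [0] + list(order) + [0]
    return sum(dist[a][b] for a, b in zip(path, path[1:]))


def anneal(dist, t0=1000.0, t_stop=1e-3, alpha=0.95, moves_per_stage=200, seed=0):
    """Simulated annealing over visit orders of nodes 1..n-1."""
    rng = random.Random(seed)
    order = list(range(1, len(dist)))
    rng.shuffle(order)
    cost = tour_cost(order, dist)
    best, best_cost = order[:], cost
    t = t0
    while t > t_stop:
        for _ in range(moves_per_stage):
            # Propose reversing a random segment of the visit order.
            i, j = sorted(rng.sample(range(len(order)), 2))
            cand = order[:i] + order[i:j + 1][::-1] + order[j + 1:]
            delta = tour_cost(cand, dist) - cost
            # Metropolis rule: always accept improvements, sometimes accept
            # worse tours with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                order, cost = cand, cost + delta
                if cost < best_cost:
                    best, best_cost = order[:], cost
        t *= alpha  # geometric cooling
    return best, best_cost


# Toy instance: depot at index 0 plus five bins (hypothetical coordinates).
pts = [(0, 0), (2, 1), (5, 0), (6, 4), (3, 5), (1, 3)]
dist = [[abs(ax - bx) + abs(ay - by) for (bx, by) in pts] for (ax, ay) in pts]
best, best_cost = anneal(dist)
```

In a pipeline like the one described, `dist` would be filled with shortest-path lengths on the pedestrian–road network so that routing uses the same metric as siting.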
For the simulated annealing (SA) computation, we consider the number of nodes in the route ($n$), the number of iterations per temperature stage ($M$), the initial temperature ($T_0$), the stopping temperature ($T_{\text{stop}}$), and the cooling rate ($\alpha$, $0 < \alpha < 1$). The total number of temperature stages ($L_a$) and the time complexity ($L_b$) are then computed as $L_a = \ln(T_{\text{stop}}/T_0) / \ln(\alpha)$ and $L_b = O\left(L_a \cdot (M \cdot n + n^2)\right)$, respectively. For the single-vehicle case, the parameters were set to $M = 1000$, $n = 39$, $T_0 = 1000$, $T_{\text{stop}} = 10^{-8}$, and $\alpha = 0.995$, yielding $L_a = 5054$ and $L_b = 204{,}793{,}134$. The optimized SA route had a total travel distance of approximately 11.43 km, with a mean leg distance of about 301 m. The route had no self-intersections, and its tortuosity was 2.44 times the straight round-trip length from the depot. Figure 10 shows that the total route length decreased sharply within about the first 1000 steps, from approximately 19,500 m to 11,430 m, and then stabilized.
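The reported stage and operation counts can be reproduced with a few lines of Python. Two points are assumptions of this sketch: the stage count is rounded up to the next integer, which matches the reported value of 5054, and the big-O expression is evaluated with unit constants, which matches the reported operation count. The function names are hypothetical.

```python
import math


def sa_stages(t0, t_stop, alpha):
    """Number of geometric cooling stages from t0 down to t_stop (rounded up)."""
    return math.ceil(math.log(t_stop / t0) / math.log(alpha))


def sa_operations(stages, moves_per_stage, n):
    """Operation estimate stages * (M * n + n^2), with unit constants."""
    return stages * (moves_per_stage * n + n * n)


stages = sa_stages(t0=1000.0, t_stop=1e-8, alpha=0.995)          # 5054
ops_single = sa_operations(stages, moves_per_stage=1000, n=39)   # 204,793,134
```

The same function gives 145,474,336 for $n = 28$ and 66,556,126 for $n = 13$, the operation counts reported for the two-vehicle case.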
For the two-vehicle case, instead of one long tour (TSP), the route is first divided into two subroutes, each optimized independently by SA. Among the many available clustering methods, K-means was used to partition the sites, with the aim of minimizing overlap between vehicle coverages and balancing total travel distances. As shown in Figure 11, K-means clustering partitioned the candidate sites between the two vehicles, and inspection of the routes confirms that the two vehicles divided the four census blocks in Samsung 1–dong, with each vehicle traversing two blocks. This process results in two depot-to-depot tours, one per vehicle, each visiting its assigned trash bins in sequence, as in the single-vehicle case. The green marker denotes the depot; purple points mark the 37 selected sites; the blue lines depict the annealed tour of the 26 sites visited by Vehicle A, labeled in visit order; the orange lines depict the annealed tour of the 11 sites visited by Vehicle B, labeled in visit order; the black polygon outlines the Samsung 1–dong boundary.
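The partition step can be illustrated with a minimal two-cluster K-means (Lloyd's algorithm) in plain Python. This is a sketch under assumptions: it clusters projected planar coordinates with Euclidean distance, uses a deterministic initialization from the lexicographically smallest and largest points, and runs on hypothetical toy coordinates. The text does not state which implementation or initialization was used.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two planar points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def kmeans_two(points, iters=50):
    """Lloyd's algorithm for k = 2; returns labels (0 or 1) and centers."""
    # Assumed initialization: lexicographically smallest and largest points.
    centers = [min(points), max(points)]
    labels = None
    for _ in range(iters):
        new_labels = [
            0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous center if a cluster is empty
                centers[c] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return labels, centers


# Toy bin coordinates in metres (hypothetical): two visibly separate groups.
bins = [(0, 0), (1, 0), (0, 1), (1, 1), (10, 10), (11, 10), (10, 11)]
labels, centers = kmeans_two(bins)
```

Each label group, together with the depot, then forms its own tour to be annealed separately. K-means balances neither bin counts nor route lengths by construction, which is consistent with the uneven 26/11 split reported above.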
After clustering, Vehicle A was assigned $n_A = 28$ nodes and Vehicle B $n_B = 13$ nodes; as with $n = 39$ in the single-vehicle case, these counts correspond to the 26 and 11 assigned sites plus the depot at the start and end of each tour. All other SA parameters were kept identical to the single-vehicle case. The estimated number of operations was 145,474,336 for Vehicle A and 66,556,126 for Vehicle B. When the two vehicles divided the candidate sites for collection, their combined travel distance was approximately 15.81 km, which is longer than the single-vehicle distance of about 11.43 km. However, the individual travel burdens were reduced, with Vehicle A and Vehicle B traveling approximately 8.77 km and 7.04 km, respectively. The routes of Vehicle A and Vehicle B had mean leg distances of approximately 324.8 m and 587.0 m, tortuosity values of 1.83 and 1.84, and zero self-intersections.
Figure 12 shows that the total route lengths of both vehicles decreased rapidly during the early temperature steps, with Vehicle A dropping from approximately 14,000 m to 8700 m and Vehicle B from about 7600 m to 7040 m within the first 1000 steps, after which both stabilized. The number of crossings followed a similar trend: for Vehicle A it fell from roughly 30 at the beginning to nearly zero within 1000 steps, while for Vehicle B it was roughly zero from the beginning.

3.5. Sensitivity Analysis

We analyze the impact of parameter settings on GD-ARISE’s results. Two sensitivity experiments were conducted, focusing on (i) the AHP weighting in the multi-criteria scoring stage and (ii) the adaptive coverage parameters in the SCASS site-selection stage. These experiments indicate how variations in subjective or geometric settings propagate through the pipeline and influence the resulting suitability distribution and selected facility portfolio.

3.5.1. AHP Sensitivity

To gauge the effect of AHP weighting, we repeated the analysis with near-uniform weights across the three criteria. Under uniform weights, the composite suitability distribution broadened and shifted upward (Figure 13), implying that more candidates attained moderate-to-high suitability. The feasible nonoverlapping selection also expanded and reconfigured, with changes concentrated in mid-range corridors while core hotspots remained stable (Figure 14). Overall, equal weighting increases admissible placements and shifts marginal sites rather than overturning the highest-demand areas.

3.5.2. SCASS Sensitivity

To assess sensitivity to spatial parameters in SCASS, we increased the adaptive-radius bounds from $(R_{\min}, R_{\max}) = (30\ \text{m}, 150\ \text{m})$ to $(100\ \text{m}, 200\ \text{m})$. Larger radii produced fewer but broader catchments: the feasible nonoverlapping set shrank (from 347 to 180 sites) while coverage areas grew across the entire distribution (range, mean, and quartiles all increased). Spatially, intensified exclusions reduced site density and shifted feasible locations, with removals and additions concentrated where spacing became binding (Figure 15). Overall, expanding $(R_{\min}, R_{\max})$ trades off count for reach, yielding sparser deployments with larger service footprints.
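To make the size of this change concrete, the radius map can be evaluated at the reported mean composite score of 0.2083 under both settings. This is an illustrative calculation that assumes the composite scores themselves are unchanged between the two runs; because the map is affine, the radius at the mean score equals the mean radius over all candidates.

```latex
\begin{aligned}
(R_{\min}, R_{\max}) = (30, 150):\quad & r = 150 - 120 \times 0.2083 \approx 125.0\ \text{m}, \\
(R_{\min}, R_{\max}) = (100, 200):\quad & r = 200 - 100 \times 0.2083 \approx 179.2\ \text{m}.
\end{aligned}
```

Under this assumption the average radius grows by roughly 43%, and the smallest possible radius rises from 30 m to 100 m, so spacing tightens most in the high-scoring corridors where bins were previously packed most densely.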

3.6. Managerial Implications

The Gangnam application converts analytics into direct choices about how many bins to deploy, where to place them, and how to service them under real network constraints. The realized portfolio $J_{\text{sel}}$ provides a defensible deployment count given nonoverlap and adaptive radii; managers can tune $P_{\text{target}}$ and $(R_{\min}, R_{\max})$ to meet coverage or budget targets. Adaptive radii yield a clear siting rule (smaller in high-pressure corridors, larger in quieter areas), so spacing reflects actual walkability rather than Euclidean distance, and nearby nonconflicting alternatives can be identified when frontage or regulations prevent a proposed point.
Two indicators support oversight: the share of demand nodes $U$ covered by $\bigcup_{j \in J_{\text{sel}}} D_j$, and the distribution of separations relative to $r_i + r_j$, revealing where spacing is tight or slack. The routing layer turns network distance into service hours with average speed and per-stop time, enabling staffing and shift design and making comparisons to incumbent routes straightforward. Sensitivity to AHP weights highlights “swing” sites and shows whether small preference shifts change coverage or route time; robust outcomes ease stakeholder agreement, while sensitive zones can be prioritized for pilots or field checks. Because all stages share one metric and dataset, the pipeline can be re-run seasonally as demand or networks change, with re-routing for minor updates and re-siting reserved for larger deviations. The same geometry-consistent workflow transfers to curb-level assets, such as micro-mobility docks or hydration stations, by swapping criteria and data sources while retaining the sensitivity and monitoring protocol.
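The coverage indicator and the distance-to-hours conversion can be sketched as below. Several elements are assumptions: a demand node is treated as covered when it lies within the adaptive radius of at least one selected site, the average speed of 15 km/h and the 2 minutes per stop are purely illustrative values, and all names and toy positions are hypothetical. Only the 11.43 km route length and the 37 stops come from the case study.

```python
def coverage_share(demand_nodes, selected, radii, dist):
    """Fraction of demand nodes within the radius of at least one selected site."""
    covered = sum(
        1 for u in demand_nodes
        if any(dist(u, j) <= radii[j] for j in selected)
    )
    return covered / len(demand_nodes)


def service_hours(route_km, n_stops, speed_kmh, minutes_per_stop):
    """Travel time plus handling time, in hours."""
    return route_km / speed_kmh + n_stops * minutes_per_stop / 60.0


# Toy coverage check on a line (positions and radii in metres, hypothetical).
site_pos = {0: 0.0, 1: 250.0}
site_radii = {0: 42.0, 1: 90.0}
demand = [0.0, 40.0, 100.0, 300.0]
share = coverage_share(
    demand, [0, 1], site_radii, lambda u, j: abs(u - site_pos[j])
)

# Single-vehicle route from the case study with assumed speed and stop time.
hours = service_hours(11.43, 37, speed_kmh=15.0, minutes_per_stop=2.0)
```

With these assumed values the single-vehicle tour takes roughly two hours, most of it handling time, which shows why per-stop time can matter as much as route length when designing shifts.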

4. Conclusions

This paper framed public waste bin placement and servicing as a GIS-grounded operations research problem and developed a GD-ARISE data analytic pipeline that unifies multi-criteria demand assessment, spatially constrained site selection, and operational routing on a shared network metric. The Gangnam district case demonstrated that, with readily available administrative layers, pedestrian network data, floating population estimates, transit nodes, and points of interest, the pipeline can sample the walkable network at fine resolution, construct composite suitability scores, translate them into adaptive service radii, select a maximal nonoverlapping portfolio of facility sites, and generate depot-to-depot routes that are feasible on the actual network.
Methodologically, four elements define our contribution. First, GIS integration unifies all spatial layers on one pedestrian–road network and one distance metric. Second, demand data analytics converts heterogeneous inputs into transparent multi-criteria scores. Third, site selection is cast as coverage with nonoverlap on the same network. A fast greedy procedure enforces spacing with a degree-based performance guarantee and scales on GIS graphs. Fourth, routing closes the loop on the identical metric used upstream. Together, these steps form a reproducible, end-to-end pipeline for location-and-routing that is defensible, scalable, and operationally relevant for municipal planning. The Gangnam application highlights several substantive findings. Composite demand is highly skewed and spatially clustered along commercial corridors and transit interchanges; adaptive radii concentrate bins where pressure is greatest while preserving access in quieter areas; nonoverlap constraints prevent redundant service footprints and enforce minimum spacing that respects sidewalk conditions; and simulated annealing delivers shorter, operationally coherent tours once distances reflect real network impedances. These outcomes suggest that a GIS-integrated operations research approach can simultaneously improve perceived cleanliness, reduce reactive cleanup, and control service effort by aligning siting and routing with observed urban activity. While our case study centers on waste bins, the framework generalizes to other street-level amenities—micro-mobility docks, kiosks, sensors, hydration stations—where demand is heterogeneous, space is scarce, and operations matter.
The work also has limitations that point to avenues for extension. Criteria weights were elicited via AHP and calibrated to available data; while consistency checks were enforced, robustness to alternative weight sets and to additional criteria (e.g., complaints, street-furniture density, event schedules) merits further study. The adaptive radius mapping is intentionally simple and interpretable; in settings with strong regulatory or equity requirements, radius rules could be learned from outcomes, made time-dependent, or augmented with minimum-access guarantees. Finally, SCASS employs a greedy selection to ensure transparency and scalability, but we did not benchmark its potential suboptimality against exact or metaheuristic methods; future work should quantify optimality gaps (e.g., via MILP on subregions or lightweight local improvements) to bound performance under the stated constraints.

Author Contributions

Conceptualization, H.-T.H.; Methodology, J.-J.W., J.-S.L., and H.-T.H.; Software, J.-J.W. and J.-S.L.; Formal analysis, J.-J.W.; Data curation, J.-J.W. and J.-S.L.; Writing—original draft, J.-S.L. and H.-T.H.; Writing—review and editing, H.-T.H.; Visualization, J.-S.L.; Supervision, H.-T.H.; Funding acquisition, H.-T.H. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the MSIT (Ministry of Science and ICT), Republic of Korea, under the ITRC (Information Technology Research Center) support program (RS-2025-00259004) supervised by the IITP (Institute for Information and Communications Technology Planning and Evaluation) and by the Korea Institute of Energy Technology Evaluation and Planning (KETEP) grant funded by the Korea government (MOTIE) (20214000000060, Department of Next Generation Energy System Convergence based-on Techno-Economics-STEP).

Data Availability Statement

The data presented in this study are openly available from the Seoul Open Data Portal (public repository) at the following URLs (accessed on 2 July 2025): https://data.seoul.go.kr/dataList/OA-14979/F/1/datasetView.do; https://data.seoul.go.kr/dataList/OA-14980/F/1/datasetView.do; https://data.seoul.go.kr/dataList/OA-14978/F/1/datasetView.do; https://data.seoul.go.kr/dataVisual/seoul/seoulLivingPopulation.do; https://data.seoul.go.kr/dataList/OA-15067/S/1/datasetView.do; https://data.seoul.go.kr/dataList/OA-18699/S/1/datasetView.do; https://data.seoul.go.kr/dataList/OA-15004/F/1/datasetView.do. In addition, one auxiliary dataset (subway entrance coordinates used for accessibility analysis) was derived from a public-domain web map and is available at: http://map.esran.com/ (accessed on 2 July 2025) [Seoul Open Data Portal].

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript (Figure 1):
CRS: Coordinate Reference System
MCDM: Multi-Criteria Decision-Making
AHP: Analytic Hierarchy Process
CSA: Coverage Suitability Analysis
MCLP: Maximal Covering Location Problem
SCASS: Spatially-Constrained Adaptive Site Selection
RO: Route Optimization
GD-ARISE: GIS-integrated and Data analytic Adaptive Radius Integrated Siting and rEservicing
OR: Operations Research
POI: Point of Interest
PRN: Pedestrian–Road Network

References

  1. Hess, C.; Dragomir, A.G.; Doerner, K.F.; Vigo, D. Waste collection routing: A survey on problems and methods. Cent. Eur. J. Oper. Res. 2024, 32, 399–434.
  2. Han, J.; Zhang, J.; Guo, H.; Zhang, N. Optimizing location-routing and demand allocation in the household waste collection system using a branch-and-price algorithm. Eur. J. Oper. Res. 2024, 316, 958–975.
  3. Boeing, G. OSMnx: New methods for acquiring, modeling, analyzing, and visualizing complex street networks. Comput. Environ. Urban Syst. 2017, 65, 126–139.
  4. Mahéo, A.; Rossit, D.G.; Kilby, P. Solving the integrated bin allocation and collection routing problem for municipal solid waste: A Benders decomposition approach. Ann. Oper. Res. 2023, 322, 441–465.
  5. Haklay, M.; Weber, P. OpenStreetMap: User-generated street maps. IEEE Pervasive Comput. 2008, 7, 12–18.
  6. Nyimbili, P.H.; Erden, T. A hybrid approach integrating entropy–AHP and GIS for suitability assessment of urban emergency facilities. ISPRS Int. J. Geo-Inf. 2020, 9, 419.
  7. Saaty, T.L. The Analytic Hierarchy Process; McGraw–Hill: New York, NY, USA, 1980.
  8. Şener, B.; Süzen, M.L.; Doyuran, V. Landfill site selection by using geographic information systems. Environ. Geol. 2006, 49, 376–388.
  9. Rahmat, Z.G.; Niri, M.V.; Alavi, N.; Goudarzi, G.; Babaei, A.A.; Baboli, Z.; Hosseinzadeh, M. Landfill site selection using GIS and AHP: A case study: Behbahan, Iran. KSCE J. Civ. Eng. 2017, 21, 111–118.
  10. Ahmed, A.; Kheraj, T.; Mohammadi, A.; Bergquist, R. Hybrid GIS-MCDM approach for Hospital Site Selection Suitability Analysis in Poonch District, Jammu and Kashmir, India. GeoJournal 2024, 89, 186.
  11. Mostafaeipour, A.; Hosseini Dehshiri, S.S.; Hosseini Dehshiri, S.J.; Almutairi, K.; Taher, R.; Issakhov, A.; Techato, K. A thorough analysis of renewable hydrogen projects development in Uzbekistan using MCDM methods. Int. J. Hydrogen Energy 2021, 46, 31174–31190.
  12. Janmontree, J.; Zadek, H.; Ransikarbum, K. Analyzing solar location for green hydrogen using multi-criteria decision analysis. Renew. Sustain. Energy Rev. 2025, 209, 115102.
  13. Church, R.L.; ReVelle, C.S. The maximal covering location problem. Pap. Reg. Sci. 1974, 32, 101–118.
  14. Daskin, M.S. Network and Discrete Location: Models, Algorithms, and Applications; Wiley: Hoboken, NJ, USA, 1995.
  15. ReVelle, C.; Eiselt, H.A. Location analysis: A synthesis and survey. Eur. J. Oper. Res. 2005, 165, 1–19.
  16. Marsh, M.T.; Schilling, D.A. Equity measurement in facility location analysis: A review and framework. Eur. J. Oper. Res. 1994, 74, 1–17.
  17. Daskin, M.S. A maximum expected covering location model: Formulation, properties and heuristic solution. Transp. Sci. 1983, 17, 48–70.
  18. Bonnet, B.; Dessavre, D.G.; Kraus, K.; Ramirez-Marquez, J.E. Optimal placement of public-access AEDs in urban environments. Comput. Ind. Eng. 2015, 90, 269–280.
  19. Farahani, R.Z.; Asgari, N.; Heidari, N.; Hosseininia, M.; Goh, M. Covering problems in facility location: A review. Comput. Ind. Eng. 2012, 62, 368–407.
  20. Adeleke, O.J.; Olukanni, D.O. Facility location problems: Models, techniques, and applications in waste management. Recycling 2020, 5, 10.
  21. Berman, O.; Krass, D.; Drezner, Z. The gradual covering decay location problem on a network. Eur. J. Oper. Res. 2003, 151, 474–480.
  22. Karasakal, O.; Karasakal, E.K. A maximal covering location model in the presence of partial coverage. Comput. Oper. Res. 2004, 31, 1515–1526.
  23. Blanco, V.; Gázquez, R. Fairness in maximal covering location problems. Comput. Oper. Res. 2023, 157, 106287.
  24. Berman, O.; Drezner, Z.; Krass, D.; Wesolowsky, G.O. The variable radius covering problem. Eur. J. Oper. Res. 2009, 196, 516–525.
  25. Özkan, B.; Özceylan, E.; Sarıçiçek, İ. GIS-based MCDM modeling for landfill site suitability analysis: A comprehensive review of the literature. Environ. Sci. Pollut. Res. 2019, 26, 30711–30730.
  26. Araújo, E.J.; Chaves, A.A.; Lorena, L.A.N. A mathematical model for the coverage location problem with overlap control. Comput. Ind. Eng. 2020, 146, 106548.
  27. Cherri, L.H.; Carravilla, M.A.; Ribeiro, C.; Toledo, F.M.B. Optimality in nesting problems: New constraint programming models and a new global constraint for non-overlap. Oper. Res. Perspect. 2019, 6, 100125.
  28. Gola, A.; Kłosowski, G.; Świć, A. Facility Layout Problem with Alternative Facility Variants. Appl. Sci. 2023, 13, 5032.
  29. Hifi, M.; M’Hallah, R. A literature review on circle and sphere packing problems: Models and methodologies. Adv. Oper. Res. 2009, 2009, 150624.
  30. Erlebach, T.; Jansen, K.; Seidel, E. Polynomial-time approximation schemes for geometric intersection graphs. SIAM J. Comput. 2005, 34, 1302–1323.
  31. Osman, I.H.; Laporte, G. Metaheuristics: A bibliography. Ann. Oper. Res. 1996, 63, 513–623.
  32. ReVelle, C.; Schlossberg, M.; Williams, J.C. Solving the maximal covering location problem with heuristic concentration. Comput. Oper. Res. 2008, 35, 427–435.
  33. Kirkpatrick, S.; Gelatt, C.D.; Vecchi, M.P. Optimization by simulated annealing. Science 1983, 220, 671–680.
  34. Aarts, E.; Korst, J. Simulated Annealing and Boltzmann Machines: A Stochastic Approach to Combinatorial Optimization and Neural Computing; Wiley: Hoboken, NJ, USA, 1989.
  35. Lin, S.; Kernighan, B.W. An effective heuristic algorithm for the traveling-salesman problem. Oper. Res. 1973, 21, 498–516.
  36. Yu, V.F.; Lin, C.-H.; Maglasang, R.S.; Lin, S.-W.; Chen, K.-F. An efficient simulated annealing algorithm for the vehicle routing problem in omnichannel distribution. Mathematics 2024, 12, 3664.
  37. Rahmanifar, G.; Mohammadi, M.; Golabian, M.; Sherafat, A.; Hajiaghaei-Keshteli, M.; Fusco, G.; Colombaroni, C. Integrated location and routing for cold chain logistics networks with heterogeneous customer demand. J. Ind. Inf. Integr. 2024, 38, 100573.
  38. Hashemi-Amiri, O.; Ji, R.; Tian, K. An integrated location–scheduling–routing framework for a smart municipal solid waste system. Sustainability 2023, 15, 7774.
Figure 1. Workflow of the GD-ARISE.
Figure 2. Computational workflow of the GD-ARISE for Gangnam District case.
Figure 3. 1000 sample candidates of potential bin locations across Gangnam district.
Figure 4. Histogram of the population score (floating population, min–max normalized).
Figure 5. Histogram of the disposal score (transit proximity).
Figure 6. Histogram of the shop score (proximity to convenience stores, cafés, food trucks, and parks).
Figure 7. Composite demand heatmap across Gangnam district.
Figure 8. Final selected waste bin sites (red) and their adaptive coverage areas (green).
Figure 9. Optimized collection route in Samsung 1–dong.
Figure 10. Variation of total route length and crossings with temperature (iteration) steps.
Figure 11. Optimized collection routes for two vehicles in Samsung 1–dong.
Figure 12. Variation of total route length and crossings with temperature steps for the two-vehicle case.
Figure 13. Histograms of composite suitability scores $S_j^*$ of candidate sites under baseline AHP weights (left) and changed weights for sensitivity analysis (right).
Figure 14. Spatial comparison between the baseline and AHP-sensitivity selections.
Figure 15. Spatial comparison between the baseline and SCASS-sensitivity selections.
Table 1. GIS integration workflow for the Gangnam District case.

| Variable Category | Source and Preprocessing | Role in Model |
|---|---|---|
| Administrative boundaries | Census polygons; reproject WGS84 → EPSG:5186; spatial joins to points | Study region G; attribution of J and U |
| Pedestrian network $G$ | OSM walkable edges via OSMnx; simplification; snapping tolerance | Distance metric d; candidate sampling; routing graph |
| Candidate locations J | Points every $\delta = 10$ m along the network $G$; $N = 32{,}890$ | Feasible siting set |
| Demand nodes U | Network vertices or grid centroids within the study region G | Coverage evaluation, calibration, validation |
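The candidate-sampling row of Table 1 (points every δ = 10 m along the network) can be illustrated with a minimal sketch. It assumes a single edge given as a polyline in projected metre coordinates; the function name `sample_along_polyline` and the example edge are hypothetical and are not taken from the authors' code, which works on the OSMnx graph.

```python
import math

def sample_along_polyline(coords, spacing=10.0):
    """Return points every `spacing` metres along a polyline of (x, y) metres.

    The start vertex is always included; the end vertex is included only
    when it falls exactly on a multiple of `spacing`.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    points = [coords[0]]
    next_at = spacing   # distance along the line at which the next point is due
    walked = 0.0        # total length of the segments already traversed
    for (x0, y0), (x1, y1) in zip(coords, coords[1:]):
        seg = math.hypot(x1 - x0, y1 - y0)
        # Emit every due point that lies on the current segment.
        while seg > 0 and next_at <= walked + seg + 1e-9:
            t = (next_at - walked) / seg
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            next_at += spacing
        walked += seg
    return points

# Hypothetical 35 m edge with one bend (coordinates in metres).
edge = [(0.0, 0.0), (25.0, 0.0), (25.0, 10.0)]
candidates = sample_along_polyline(edge, spacing=10.0)
```

For the hypothetical edge above, the sketch places points at 0, 10, 20, and 30 m of path length, so spacing is measured along the edge rather than in a straight line.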
Table 2. Data analytics workflow for the Gangnam District case.

| Variable Category | Source and Preprocessing | Role in Model |
|---|---|---|
| Floating population | Daily counts by block; temporal mean; areal join to J and U | Criterion $C_1$: raw $V_{j,1} \to S_{j,1}$ |
| Transit nodes | Bus stops and subway exits; nearest-neighbor on $G$ | Proximity criterion $C_2$: raw via kernel $\to S_{j,2}$ |
| Waste-related POIs | Convenience stores, cafés, food trucks, parks; subtype weights $\omega$ | Proximity criterion $C_3$: raw via kernel $\to S_{j,3}$ |
| Depot $D_0$, fleet and service params | Operations center location; $(m, Q, v_{\text{speed}}, \tau)$ | Routing inputs and constraints |
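As a rough illustration of how the three criterion scores could be combined, the sketch below min–max normalizes a raw population value (as in Figure 4) and forms a weighted sum of the criterion scores. The weighted-sum form, the example weights, and all input values are assumptions made for illustration; they are not the AHP weights or data reported in the article, and the kernel used for the proximity criteria is not reproduced here.

```python
def min_max(values):
    """Min-max normalize a list of raw values to [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # degenerate case: all values identical
    return [(v - lo) / (hi - lo) for v in values]

def composite_scores(criteria, weights):
    """Weighted sum of per-criterion scores (assumed aggregation rule).

    criteria: list of score lists, one per criterion, all of equal length.
    weights:  one weight per criterion, summing to 1.
    """
    if len(criteria) != len(weights):
        raise ValueError("one weight per criterion is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n = len(criteria[0])
    return [sum(w * c[j] for w, c in zip(weights, criteria)) for j in range(n)]

# Hypothetical inputs for three candidate sites.
raw_population = [120.0, 480.0, 300.0]   # raw V_{j,1}
s1 = min_max(raw_population)             # population score S_{j,1}
s2 = [0.2, 0.9, 0.4]                     # transit proximity score S_{j,2} (made up)
s3 = [0.1, 0.8, 0.6]                     # POI proximity score S_{j,3} (made up)
weights = [0.5, 0.3, 0.2]                # illustrative weights, not the paper's
scores = composite_scores([s1, s2, s3], weights)
```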
Table 3. Population and boundary datasets used for demographic analysis ($C_1$).

| Dataset | Format/Unit | Description | Source |
|---|---|---|---|
| Seoul Living Population (Domestic Residents) | CSV (by census block) | Daily population counts (15–21 May 2025); variables: DateID, TimeType, DistrictCode, CensusBlockCode, TotalPopulation; estimated using public data (resident registry, transport, business, and building DBs) combined with KT telecom big data | https://data.seoul.go.kr/dataList/OA-14979/F/1/datasetView.do (accessed on 2 July 2025) (Seoul Open Data) |
| Seoul Living Population (Short-term Foreign Residents) | CSV (by census block) | Estimated short-term foreign resident population by census block; method identical to domestic dataset; includes mobility-based estimation using transport and telecom data | https://data.seoul.go.kr/dataList/OA-14980/F/1/datasetView.do (accessed on 2 July 2025) (Seoul Open Data) |
| Seoul Living Population (Long-term Foreign Residents) | CSV (by census block) | Long-term foreign residents measured via public and telecom data integration; same variable structure as domestic dataset | https://data.seoul.go.kr/dataList/OA-14978/F/1/datasetView.do (accessed on 2 July 2025) (Seoul Open Data) |
| Census Block Boundary | SHP/SBN (EPSG:5186) | Geospatial boundaries of census blocks where living population is estimated; spatial unit for integrating demographic data | https://data.seoul.go.kr/dataVisual/seoul/seoulLivingPopulation.do (accessed on 2 July 2025) (Seoul Open Data) |
Table 4. Transit-related datasets used for the transit-proximate disposal criterion $C_2$.

| Dataset | Format/Unit | Description | Source |
|---|---|---|---|
| Seoul Bus Stop Locations | CSV (EPSG:5186) | NodeID, StopID; StopName; coordinates (X, Y); StopType | https://data.seoul.go.kr/dataList/OA-15067/S/1/datasetView.do (accessed on 2 July 2025) (Seoul Open Data) |
| Subway Entrance Coordinates | CSV (manual extraction) | Latitude–longitude pairs (manual extraction); subway entrance locations for accessibility criterion $C_2$ | http://map.esran.com/ (accessed on 2 July 2025) (Seoul Open Data) |
Table 5. POI datasets used for the waste-source facilities criterion $C_3$.

| Dataset | Format/Unit | Description | Source |
|---|---|---|---|
| Gangnam Food-Service Licensing (cafe_conv_stfood) | CSV (EPSG:5186) | Fields: business name, type, address, coordinates (X, Y); filtered to cafés, convenience stores, food trucks; active only; updated: 26 May 2025 | https://data.seoul.go.kr/dataList/OA-18699/S/1/datasetView.do (accessed on 2 July 2025) (Seoul Open Data) |
| Gangnam Urban Parks | CSV (EPSG:5186) | Fields: park name, type, latitude–longitude; subtype of $C_3$ (parks; weight = 0.10); updated: 1 July 2024 | https://data.seoul.go.kr/dataList/OA-15004/F/1/datasetView.do (accessed on 2 July 2025) (Seoul Open Data) |
Table 6. Parameters and outcomes for adaptive site selection.

| Quantity | Value | Definition/Role |
|---|---|---|
| Candidate set size $N$ | $32{,}890$ | Points sampled every $\delta = 10$ m along the network $G$ within the study region G |
| Proximity sanity radius | 1000 m | Filter to exclude candidates with no nearby facilities under d |
| Adaptive radius bounds | $R_{\min} = 30$ m, $R_{\max} = 150$ m | Affine map $r_j = R_{\max} - (R_{\max} - R_{\min})\, S_j^*$ |
| Target portfolio $P_{\text{target}}$ | 500 | Desired number of nonoverlapping bins |
| Selected sites $\lvert J_{\text{sel}} \rvert$ | 347 | Maximal nonoverlapping set obtained by greedy scan under d |
| Distance metric d | Network shortest path | Used for radii, nonoverlap checks, coverage, and routing |
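The adaptive radius and greedy scan summarized in Table 6 can be sketched as follows. The sketch reads the affine map as r_j = R_max − (R_max − R_min) S_j*, so that a score of 1 gives the 30 m radius and a score of 0 gives the 150 m radius. Several details are assumptions rather than the authors' stated rule: candidates are scanned in descending score order, two sites overlap when their distance is less than the sum of their radii, and Euclidean distance stands in for the network shortest-path metric d. The site data are hypothetical.

```python
import math

R_MIN, R_MAX = 30.0, 150.0  # metres, from Table 6

def adaptive_radius(score, r_min=R_MIN, r_max=R_MAX):
    """Affine map from a suitability score in [0, 1] to a service radius."""
    return r_max - (r_max - r_min) * score

def greedy_nonoverlap(sites, target, dist):
    """Greedy scan: keep a site if its disc overlaps no already chosen disc.

    sites:  list of (site_id, (x, y), score)
    target: stop once this many sites are chosen
    dist:   distance function on coordinates (stand-in for network distance)
    """
    chosen = []  # entries are (site_id, (x, y), radius)
    # Assumed ordering: highest suitability first.
    for sid, xy, score in sorted(sites, key=lambda s: s[2], reverse=True):
        if len(chosen) >= target:
            break
        r = adaptive_radius(score)
        # Assumed nonoverlap test: centres at least r + r' apart.
        if all(dist(xy, cxy) >= r + cr for _, cxy, cr in chosen):
            chosen.append((sid, xy, r))
    return chosen

def euclid(a, b):
    """Euclidean distance, used here only as a placeholder for d."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Hypothetical candidates: (id, coordinates in metres, composite score).
sites = [
    ("a", (0.0, 0.0), 1.0),
    ("b", (50.0, 0.0), 0.9),
    ("c", (200.0, 0.0), 0.5),
    ("d", (260.0, 0.0), 0.0),
]
selected = greedy_nonoverlap(sites, target=500, dist=euclid)
```

In this toy input the scan stops before the target of 500 because no further candidate fits without overlap, which mirrors how Table 6 reports 347 selected sites against a target of 500.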
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Won, J.-J.; Lee, J.-S.; Ha, H.-T. GIS-Integrated Data Analytics for Optimal Location-and-Routing Problems: The GD-ARISE Pipeline. Mathematics 2025, 13, 3465. https://doi.org/10.3390/math13213465
