A Spatial Alignment Problem

Mikler, Armin R.; Tiwari, Chetan; Patterson, Murray

doi:10.3390/a19060475

Open Access Article

by Armin R. Mikler ¹, Chetan Tiwari ¹,² and Murray Patterson ¹,*

¹ Department of Computer Science, Georgia State University, Atlanta, GA 30302, USA

² Department of Geosciences, Georgia State University, Atlanta, GA 30302, USA

* Author to whom correspondence should be addressed.

Algorithms 2026, 19(6), 475; https://doi.org/10.3390/a19060475

Submission received: 21 April 2026 / Revised: 29 May 2026 / Accepted: 5 June 2026 / Published: 11 June 2026

(This article belongs to the Special Issue Advances in Parameterized Complexity: Theory, Algorithms, and Applications)

Abstract

This work concerns the harmonization of geospatial data to improve linkages between place-based characteristics and health outcomes. Such data are typically available as geographic layers, each representing a distinct attribute (e.g., income or distance to a clinic). Since layers are typically constructed independently, their boundaries tend to be spatially incongruent, which can create inconsistencies and introduce bias. This motivates developing algorithmic approaches for aligning such layers while aiming to preserve spatial integrity. This paper formalizes the problem of aligning k collections of m spatial supports over n spatial units in a d-dimensional Euclidean space such that the maximum distortion to any collection is minimized. In this setting, k is the number of layers; n is the number of indivisible population units (e.g., census tracts); m is the number of supports, which are larger regions aggregating a set of contiguous units in order to capture broader regional patterns or enhance statistical stability; and $d = 2$. It is shown that: (1) the one-dimensional case is solvable in time polynomial in k, m, and n; (2) the two-dimensional case is NP-hard for two collections of two supports each; and (3) a heuristic can be provided for aligning a set of collections in the two-dimensional case, which is of practical importance.

Keywords:

resource allocation; disease maps; spatial alignment; NP-hardness; approximation algorithms; heuristics

1. Introduction

Public health decision-making often requires analyzing spatial data to understand how to best allocate resources or to determine possible associations between regional characteristics and health outcomes [1,2,3]. Regional characteristics can be measured at the level of a fundamental spatial unit such as a census tract, or as an aggregation of these spatial units based on a shared common property [4,5,6,7]. For example, regions may be delineated based on travel time to healthcare facilities, the spatial distribution of vulnerable populations, or patterns of social and economic disparities. However, these independently derived regions representing different attributes will almost always have misaligned boundaries, creating challenges for data harmonization.

In major public health efforts such as disease control and prevention, effective intervention hinges upon combining diverse data sources such as socioeconomic factors, environmental exposure, or healthcare access. In this context, spatial discordance necessitates the development of robust methodologies for data harmonization [8]. Such methodologies are crucial for enabling the precise integration of information across heterogeneous regions, thereby facilitating a more nuanced understanding of complex spatial processes and ultimately informing the design of targeted evidence-based public health strategies.

Figure 1 showcases an example where hospital service regions (second panel) are misaligned with the populations that they serve (third panel). Such misalignment may create inefficiencies in the allocation of limited resources such as language translators (fourth panel). The leftmost map illustrates a base set of census tracts (first panel) which are aggregated into larger regions using various criteria and methodologies, as shown in subsequent maps (second, third, and fourth panels). For instance, regions can be defined based on service areas determined by driving time distance (second panel), equal populations of females (third panel), or the distribution of language vulnerable subpopulations (fourth panel). Each of these regionalizations serves a specific purpose and is generated using a different analytical approach (UPAS [9], AZTool [10,11], ArcGIS Pro 3.5). However, the resulting regions are spatially incongruent and have boundaries that rarely coincide, posing a significant obstacle to the data harmonization required for optimal allocation of resources such as language translators.

Figure 2 gives a high-level outline of the proposed approach for reconciling misaligned boundaries using a simplified example with two sets of regions constructed from a base set of units (represented by rectangles). The first set consists of hospital service areas based on driving time distances (S1, S2, S3), while the second is a single cluster of a language-vulnerable population (hash-shaded rectangles). Existing approaches may rely on GIS-based intersection operations that simply allocate each hashed block to its corresponding hospital service area based on overlap [12]. This strategy results in the allocation of a translator to each of the three hospitals even though there is only one cluster of a language-vulnerable population.

The left panel of Figure 2 illustrates the pitfalls of employing a naive GIS approach to analyze the intersection of misaligned spatial datasets. This straightforward approach has a fundamental limitation: it requires prioritizing one set of regional boundaries, leading to the other set being fragmented, distorted, or even completely disregarded. In this instance, the hospital service areas (S1, S2, S3) are prioritized, while the vulnerable population region is subjected to a basic GIS operation, such as “intersection”, to assess its relationship with the driving time service regions. This operation, commonly found in most GIS software packages, can be executed using various rules to determine how spatial units from the vulnerable population zone are assigned to the hospital service areas. These rules might include centroid-based assignment, where the unit is allocated to the region containing its centroid; predominant area overlap, where the unit is assigned to the region with the largest intersecting area; or even weighted assignment based on the proportion of overlap with different regions. However, despite offering some flexibility in how individual units are assigned, these rules all fundamentally prioritize the predefined hospital service areas and can lead to undesirable consequences, such as the fragmentation of the vulnerable population zone. Observe how two units containing the vulnerable population are assigned to S3, while the remaining two units are allocated to S1 and S2, respectively. This fragmentation disrupts the inherent spatial cohesion of the vulnerable population, leading to the suboptimal allocation of resources. In essence, this naive approach prioritizes the predefined hospital service areas, imposing their boundaries onto the vulnerable population zone with no attempt to reconcile or compromise between the two.

In contrast, the right panel of Figure 2 showcases an illustrative example of our proposed algorithm for reconciling misaligned spatial boundaries. Instead of imposing one set of boundaries upon another, our algorithm facilitates a dynamic interplay in which both the hospital service areas (S1, S2, S3) and the vulnerable population region (collection of hashed units) can cede and acquire base spatial units to achieve a more harmonious alignment that represents a compromise between the two layers. In the reconfigured hospital service areas (S1, S2, and S3), the constituent units of S3 now incorporate a majority of the vulnerable population region.

Simultaneously, to maintain the integrity of hospital service areas, a base spatial unit containing a vulnerable population has been assigned to S1. While this specific strategy might not be ideal, it highlights the algorithm’s ability to negotiate and compromise between competing spatial objectives, ensuring that no single region is disproportionately affected by the reconciliation process. These targeted adjustments are achieved through the strategic reassignment of base units, highlighting the algorithm’s capacity to fine-tune boundaries with minimal alteration of the overall structure of the predefined regions. This approach involves an optimization framework that balances competing objectives, such as minimizing changes in the assignment of base units while preserving the overall shape and contiguity of the regions.

The challenges posed by misaligned spatial datasets extend beyond the specific example illustrated in Figure 2. Another critical application of boundary reconciliation lies in the analysis of geographic patterns derived from disease maps, which are essential tools in spatial epidemiology and public health [13,14,15]. Disease maps often face the issue of “small numbers”, particularly in regions with sparse populations or infrequent disease cases, where rate calculations can become unstable and yield misleading spatial patterns [16]. To address this, various smoothing techniques are applied to stabilize estimates and provide a clearer depiction of disease distribution. Methods such as empirical Bayes smoothing [17,18], kernel density smoothing [19,20], headbanging [21,22], and spatial interpolation are commonly used. Each creates smoothed regions that aggregate smaller spatial units into larger regions with sufficient populations in order to support reliable rate estimates [16,19,23]. However, these smoothed regions may not align with boundaries of other relevant datasets such as demographic, environmental, or socioeconomic variables, which may be needed for a deeper understanding of disease dynamics. This misalignment can complicate efforts to examine spatial relationships between disease patterns and potential risk factors. Thus, boundary reconciliation becomes essential for integrating these disease maps with other data layers, ensuring that associations between disease rates and covariates are examined using the same underlying geographic regions.

The rest of the paper is structured as follows. In Section 2, we formally introduce the problem that we set out to solve. This problem (Problem 1) is to align k collections of m spatial supports over a universal set U of n spatial units in a d-dimensional Euclidean space such that the maximum distortion to any collection is minimized. In Section 3, we show that if the units of U lie in a one-dimensional Euclidean region (i.e., if $d = 1$), then this problem is solvable in time polynomial in k, m, and n. In Section 4, we show that if the geospatial region is two-dimensional, which is the typical case for maps (e.g., Figure 1 and Figure 2), then the problem is NP-hard, even in the case of two collections with two supports each. Finally, in Section 5 we outline a heuristic for this problem in the general two-dimensional case. Section 6 concludes the paper and outlines future work.

2. Problem Formulation

We now formally introduce the problem of aligning collections of spatial supports which share a common set of spatial units. For example, Figure 3a,b depicts two collections of four supports each (green, yellow, orange, and blue) which share a common set of 16 spatial units (rectangular blocks). The goal is to swap units from one support to another within each collection (i.e., change the colors of blocks) until the collections are identical, i.e., are aligned, as depicted in Figure 3c. Note that there are many different ways to align the supports, i.e., the alignment depicted in Figure 3c is not unique. With this in mind, it would be preferable to align such collections using the minimum number of (possibly weighted) swaps. This optimization problem is easy in some cases and NP-hard in others.

In general, the alignment problem is defined on a set U of n spatial units, where each unit $u \in U$ has a population count $p(u)$ and occupies a spatial boundary that is disjoint from those of the other units. These spatial units can represent census tracts or ZIP code tabulation areas. Constructing maps (e.g., choropleth maps, which reflect certain rates within a population, such as cancer incidence) provides an intuitive way to portray the geospatial patterns of such rates. This can provide decision support in public health surveillance, which in turn can help officials form appropriate policies. Building such a map at the level of an individual unit can produce misleading results due to small populations in some units, resulting in statistically unstable rates. To remedy this issue, sets $s \subseteq U$ of contiguous units are aggregated to create larger spatial supports with adequate population counts in order to ensure stable rate calculation, as depicted in Figure 3a.

Suppose a certain rate, e.g., prostate cancer incidence, can be mostly explained by a factor such as age; in this case, we want to create several maps, one for each age stratum, in order to more clearly portray age as a factor in determining the rate. For the sake of illustration, suppose that $\mathcal{S}$ and $\mathcal{T}$, depicted in Figure 3a,b, are two such maps, represented as collections of supports over U. In $\mathcal{S}$, the populations $p_{\mathcal{S}}(u)$ of each unit u in $s_1, s_2, s_3, s_4$ are 20, 20, 10, and 15, respectively, e.g., $p_{\mathcal{S}}(u_{1,1}) = 20$ (since $u_{1,1} \in s_1$) and $p_{\mathcal{S}}(u_{4,4}) = 10$ (since $u_{4,4} \in s_3$), while the populations $p_{\mathcal{T}}(u)$ of each unit u in $t_1, t_2, t_3, t_4$ are 15, 15, 12, and 20, respectively. In this way, the total population in any support (of $\mathcal{S}$ or $\mathcal{T}$) is 60. However, we want to consolidate the information across these maps onto a single map, which requires us to align their collections of supports. To align collections of supports is to modify the supports of all collections in terms of the units they contain, such that the resulting supports remain contiguous and the resulting collections are identical. This can be viewed as “swapping” units between neighboring supports until the desired alignment is reached. For example, Figure 3c depicts an alignment of collections $\mathcal{S}$ and $\mathcal{T}$. Such an alignment is obtained from $\mathcal{S}$ by swapping $u_{1,2}$ and $u_{2,2}$ from $s_4$ (blue) to $s_1$ (green), $u_{2,3}$ from $s_3$ (orange) to $s_1$ (green), and $u_{2,4}$ from $s_3$ (orange) to $s_4$ (blue). The alignment is obtained from $\mathcal{T}$ by swapping $u_{2,1}$ and $u_{3,1}$ from $t_2$ (yellow) to $t_1$ (green) and swapping $u_{3,2}$ from $t_3$ (orange) to $t_2$ (yellow).

Since any collection of contiguous supports is an alignment, including supports that may not be currently present, e.g., a hypothetical $s_5$, it is desirable to produce an alignment that minimizes the maximum number of changes in any one collection. Since $\mathcal{S}$ and $\mathcal{T}$ disagree on seven units $u_{2,1}, u_{3,1}, u_{1,2}, u_{2,2}, u_{3,2}, u_{2,3}, u_{2,4}$ (annotated with the red dots in Figure 3c), and each of these units must change in at least one of the two collections, one collection must have at least four changes (the other collection having three changes); hence, the alignment depicted in Figure 3c satisfies this criterion. This need for adjustment leads to the notion of a distance $d(\mathcal{S}, \mathcal{T})$ between a pair $\mathcal{S}$ and $\mathcal{T}$ of collections of spatial supports, that is, the number of swaps needed to transform $\mathcal{S}$ into $\mathcal{T}$; this is simply the number of units on which the pair of collections disagree. Here, $d(\mathcal{S}, \mathcal{T}) = 7$; since an alignment is just another collection, if we denote the alignment of Figure 3c as collection $\mathcal{A}$ of supports, then $d(\mathcal{S}, \mathcal{A}) = 4$ and $d(\mathcal{T}, \mathcal{A}) = 3$. Note that this distance is symmetric.
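The unweighted distance can be sketched in a few lines of Python. The sketch assumes a representation chosen here purely for illustration: each collection is a dictionary from unit name to support index. It is restricted to the seven disagreeing units of Figure 3, with support memberships inferred from the swaps described in the text rather than read from the figure itself.

```python
def distance(c1, c2):
    """Number of units that c1 and c2 assign to different supports."""
    return sum(1 for u in c1 if c1[u] != c2[u])

# Support index of each disagreeing unit (1 = green, 2 = yellow,
# 3 = orange, 4 = blue). These memberships are inferred from the swaps
# described in the text and are an assumption of this sketch.
S = {"u21": 1, "u31": 1, "u12": 4, "u22": 4, "u32": 2, "u23": 3, "u24": 3}
T = {"u21": 2, "u31": 2, "u12": 1, "u22": 1, "u32": 3, "u23": 1, "u24": 4}
A = {"u21": 1, "u31": 1, "u12": 1, "u22": 1, "u32": 2, "u23": 1, "u24": 4}

print(distance(S, T), distance(S, A), distance(T, A))  # 7 4 3
```

Because the count only compares labels unit by unit, swapping the arguments gives the same value, which is the symmetry noted in the text.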

The units of the different collections being swapped contain populations; hence, each swap has an associated cost, namely, the population $p_C(u)$ of the unit u in the collection $C$ being swapped. For example, in $\mathcal{S}$, swapping $u_{1,2}$ from $s_4$ (blue) to $s_1$ (green) costs $p_{\mathcal{S}}(u_{1,2}) = 15$. This idea leads to the notion of a weighted distance $d_w(\mathcal{S}, \mathcal{A})$ between a pair $\mathcal{S}$ and $\mathcal{A}$ of collections of spatial supports, i.e., the overall cost of the swaps needed to transform $\mathcal{S}$ into $\mathcal{A}$. Here, $d_w(\mathcal{S}, \mathcal{A}) = p_{\mathcal{S}}(u_{1,2}) + p_{\mathcal{S}}(u_{2,2}) + p_{\mathcal{S}}(u_{2,3}) + p_{\mathcal{S}}(u_{2,4}) = 15 + 15 + 10 + 10 = 50$, while $d_w(\mathcal{T}, \mathcal{A}) = p_{\mathcal{T}}(u_{2,1}) + p_{\mathcal{T}}(u_{3,1}) + p_{\mathcal{T}}(u_{3,2}) = 15 + 15 + 12 = 42$. Note that this weighted distance is not symmetric, i.e., $d_w(\mathcal{S}, \mathcal{T}) \neq d_w(\mathcal{T}, \mathcal{S})$ in general, because the two collections may assign different populations to the same unit. More precisely, it is desirable to produce an alignment $\mathcal{A}$ that minimizes the maximum weighted distance $d_w(C, \mathcal{A})$ between any collection $C$ of spatial supports and $\mathcal{A}$. After careful inspection, no alignment can achieve a maximum weighted distance of less than 50; hence, the alignment depicted in Figure 3c satisfies this weighted criterion as well. We formalize the alignment problem as follows.
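As a rough illustration of the weighted distance, the following Python sketch recomputes the two values quoted above. It assumes the same illustrative dictionary representation (unit name to support index), restricted to the seven disagreeing units of Figure 3; the memberships are inferred from the swaps described in the text, and the function name is hypothetical.

```python
def weighted_distance(src, dst, pop):
    """Population (counted under src's own figures) that src must reassign to match dst."""
    return sum(pop[u] for u in src if src[u] != dst[u])

# Memberships (1 = green, 2 = yellow, 3 = orange, 4 = blue) are inferred
# from the swaps described in the text; they are an assumption of this sketch.
S = {"u21": 1, "u31": 1, "u12": 4, "u22": 4, "u32": 2, "u23": 3, "u24": 3}
T = {"u21": 2, "u31": 2, "u12": 1, "u22": 1, "u32": 3, "u23": 1, "u24": 4}
A = {"u21": 1, "u31": 1, "u12": 1, "u22": 1, "u32": 2, "u23": 1, "u24": 4}
p_S = {"u21": 20, "u31": 20, "u12": 15, "u22": 15, "u32": 20, "u23": 10, "u24": 10}
p_T = {"u21": 15, "u31": 15, "u12": 15, "u22": 15, "u32": 12, "u23": 15, "u24": 20}

d_SA = weighted_distance(S, A, p_S)   # 15 + 15 + 10 + 10
d_TA = weighted_distance(T, A, p_T)   # 15 + 15 + 12
print(d_SA, d_TA, max(d_SA, d_TA))    # 50 42 50

# The asymmetry shows up as soon as the two population functions differ.
print(weighted_distance(S, T, p_S), weighted_distance(T, S, p_T))  # 110 107
```

The last line makes the asymmetry concrete: the same seven units are reassigned in both directions, but they are weighted by different population functions.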

Problem 1 (The Alignment Minimization Problem). Input: A base set $U = \{u_1, \dots, u_n\}$ of units over some d-dimensional Euclidean geospatial region and a set $𝒞 = \{C_1, \dots, C_k\}$ of collections of spatial supports. Each unit $u \in U$ has a population $p_C(u)$, given by a function $p_C : U \to \mathbb{N}$ specific to each collection $C \in 𝒞$. Each $C \in 𝒞$ is a collection $\{s_C^1, \dots, s_C^m\}$ of contiguous supports such that (a) $s \subseteq U$ for each $s \in C$, (b) $s \cap t = \emptyset$ for any pair $s, t \in C$ of distinct supports, and (c) $\bigcup_{s \in C} s = U$. Output: A collection $\mathcal{A}$ of contiguous supports which satisfies properties (a–c) above such that $\max \{d_w(C, \mathcal{A}) \mid C \in 𝒞\}$ is minimized.

Note that properties (a–c) ensure that the set of supports partitions the base set U, that is, (a) the supports contain sets of contiguous units, (b) pairs of distinct supports are disjoint, and (c) the supports cover the base set U.
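A minimal Python sketch of a feasibility check for properties (a–c) plus contiguity is given below. It assumes that contiguity is supplied as an adjacency map between units; the 2 × 2 grid with edge-sharing (rook) adjacency and all function names are hypothetical choices made for this example, since the formulation above does not fix a particular adjacency rule.

```python
from collections import deque

def is_contiguous(support, adjacent):
    """True if the units of `support` form a single connected block."""
    if not support:
        return True
    start = next(iter(support))
    reached, queue = {start}, deque([start])
    while queue:                      # breadth-first search inside the support
        u = queue.popleft()
        for v in adjacent[u] & support:
            if v not in reached:
                reached.add(v)
                queue.append(v)
    return reached == support

def is_valid_collection(supports, units, adjacent):
    """Check properties (a)-(c) and the contiguity of every support."""
    covered = set()
    for s in supports:
        if not s <= units:            # (a) each support is a subset of U
            return False
        if covered & s:               # (b) distinct supports are disjoint
            return False
        if not is_contiguous(s, adjacent):
            return False
        covered |= s
    return covered == units           # (c) the supports cover U

# Hypothetical 2 x 2 grid of units with edge-sharing adjacency.
units = {(r, c) for r in range(2) for c in range(2)}
steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
adjacent = {(r, c): {(r + dr, c + dc) for dr, dc in steps} & units
            for (r, c) in units}

rows = [{(0, 0), (0, 1)}, {(1, 0), (1, 1)}]
diagonals = [{(0, 0), (1, 1)}, {(0, 1), (1, 0)}]
print(is_valid_collection(rows, units, adjacent))       # True
print(is_valid_collection(diagonals, units, adjacent))  # False
```

The diagonal split is a partition of the units but fails the contiguity requirement, which is why the check treats contiguity separately from properties (a–c).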

In Section 3, we show that if the units of U lie in a one-dimensional Euclidean region, i.e., if $d = 1$, then Problem 1 is solvable in time polynomial in k, m, and n. In Section 4, we show that if the geospatial region is two-dimensional, i.e., $d = 2$, which is the typical case in this context of constructing age-adjusted maps, then the same problem is NP-hard even in the case of two collections ($k = 2$), each with two supports. Finally, in Section 5 we outline a heuristic for the problem in the two-dimensional case, i.e., when $d = 2$. Section 6 concludes the paper and outlines future work.

3. Tractability Results

In this section, we show that if the units of U lie in a one-dimensional Euclidean region, then Problem 1 is solvable in time polynomial in k, m, and n.

Consider the set of four collections of three spatial supports over the common set of nine spatial units depicted in Figure 4. Because the one-dimensional case is so restrictive, each support (e.g., the orange support) is adjacent to at most two other supports, one to its left (e.g., the green support) and one to its right (e.g., the blue support). Hence, the set of supports in each collection can be enumerated from left to right (from 1 to m), and the i-th supports ($i \in [1, m]$) of each collection must align. Disagreements between the i-th and $(i + 1)$-th supports are then contained in a window of width at most n (the number $|U|$ of units). Aligning these two supports involves scanning this window to find the separator (between a pair of units) with the minimum cost. For example, in Figure 4, the disagreements between the first (green) and second (orange) supports are contained in the transparent window on the left, with two choices of separator. Suppose that each unit (of each collection) of U has a population size of 1. Placing the separator between $u_2$ and $u_3$ implies coloring $u_3$ of each collection orange, which has a cost of 3. Placing the separator between $u_3$ and $u_4$ instead implies coloring $u_3$ of each collection green, which has a lower cost of 1. The disagreements between the second (orange) and third (blue) supports in Figure 4 are contained in the transparent window on the right, with three choices of separator. Of these three choices, placing the separator between $u_6$ and $u_7$ costs 2, while the other two choices cost 4 each.

The window between a pair of neighboring supports has width at most n (the number $|U|$ of units) and height k (the number of collections). There are at most $n + 1$ separators, and each step of a left-to-right scan within the window involves updating k units, for at most $k(n + 1)$ operations per window. There are $m - 1$ such windows (one between each pair of neighboring supports of a set of m supports); hence, the overall number of operations is at most $k(n + 1)(m - 1) \in O(kmn)$. Note that windows may overlap, but since the supports are nonempty and are enumerated from left to right, no window will be contained in another. This implies that the best separator of a window that begins to the left of another window will also be to the left of the best separator of the other window. If they are at the same position, this simply means that the corresponding support (say i) is empty; in this case, the optimal alignment of m supports comprises $m - 1$ supports.
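The separator scan can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' implementation: each collection is stored as a list of support indices (0 to m - 1, nondecreasing from left to right), the cost of a separator is the recoloring cost summed over all collections (as in the worked example above), and the scan runs over all n + 1 positions rather than only the window. The instance is hypothetical; it was constructed to reproduce the costs quoted in the text and is not a transcription of Figure 4.

```python
def separator_costs(collections, populations, i):
    """Cost of each separator position for the boundary between supports i and i + 1.

    Position p places units 0..p-1 to the left of the separator. The cost sums,
    over all collections, the population of units that land on the wrong side.
    """
    n = len(collections[0])
    # Separator at p = 0: every unit is on the right, so every unit that
    # belongs to supports 0..i is on the wrong side.
    cost = sum(pop[u]
               for labels, pop in zip(collections, populations)
               for u in range(n) if labels[u] <= i)
    costs = [cost]
    for p in range(n):                       # move the separator past unit p
        for labels, pop in zip(collections, populations):   # k updates per step
            cost += -pop[p] if labels[p] <= i else pop[p]
        costs.append(cost)
    return costs

def best_separators(collections, populations, m):
    """Cheapest separator position for each of the m - 1 boundaries."""
    result = []
    for i in range(m - 1):
        costs = separator_costs(collections, populations, i)
        result.append(min(range(len(costs)), key=costs.__getitem__))
    return result

# Hypothetical instance: 4 collections, 9 units, 3 supports, unit populations.
collections = [
    [0, 0, 0, 1, 1, 2, 2, 2, 2],
    [0, 0, 0, 1, 1, 1, 2, 2, 2],
    [0, 0, 0, 1, 1, 1, 2, 2, 2],
    [0, 0, 1, 1, 1, 1, 1, 2, 2],
]
populations = [[1] * 9 for _ in collections]

left = separator_costs(collections, populations, 0)
right = separator_costs(collections, populations, 1)
print(left[2], left[3])               # 3 1
print(right[5], right[6], right[7])   # 4 2 4
print(best_separators(collections, populations, 3))  # [3, 6]
```

Each boundary costs k updates per separator position, so the sketch performs on the order of k(n + 1)(m - 1) updates overall, in line with the operation count above.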

4. Hardness Results

In this section, we show that if the units of U lie in a two-dimensional Euclidean region, then Problem 1 is NP-hard. The construction involves two collections, each with two supports. This implies that the alignment minimization problem in d dimensions is NP-hard for $k, m, d \geq 2$.

Theorem 1. Problem 1 in d dimensions is NP-hard for $k, m, d \geq 2$.

Proof. We first consider the following decision version of Problem 1.

Problem 2 (The Alignment Decision Problem). Input: A base set $U = \{u_1, \dots, u_n\}$ of units over some d-dimensional Euclidean geospatial region, a set $𝒞 = \{C_1, \dots, C_k\}$ of collections of spatial supports, and a nonnegative integer $D$. Each unit $u \in U$ has a population $p_C(u)$, given by a function $p_C : U \to \mathbb{N}$ specific to each collection $C \in 𝒞$. Each $C \in 𝒞$ is a collection $\{s_C^1, \dots, s_C^m\}$ of contiguous supports such that (a) $s \subseteq U$ for each $s \in C$, (b) $s \cap t = \emptyset$ for any pair $s, t \in C$ of distinct supports, and (c) $\bigcup_{s \in C} s = U$. Decision: Does there exist a collection $\mathcal{A}$ of contiguous supports which satisfies properties (a–c) above such that $\max \{d_w(C, \mathcal{A}) \mid C \in 𝒞\} = D$?

Clearly, if the Alignment Decision Problem (Problem 2) is NP-hard, then so is its optimization version, i.e., the Alignment Minimization Problem (Problem 1). We now construct a polynomial-time (Karp) reduction from the following Partitioning Problem to the Alignment Decision Problem (Problem 2).

Problem 3 (The Partitioning Problem). Input: A multiset $X = \{x_1, \dots, x_n\}$ of positive integers and a nonnegative integer $\Delta$. Decision: Does there exist a partition of X into two disjoint ($X_1 \cap X_2 = \emptyset$) subsets $X_1 \subseteq X$ and $X_2 \subseteq X$ such that the difference between the sum $\sum_{x \in X_1} x$ of elements in $X_1$ and the sum $\sum_{x \in X_2} x$ of elements in $X_2$ is $\Delta$?

The Partitioning Problem (Problem 3) is NP-hard, as demonstrated in [24]. We now build a reduction from Problem 3 to Problem 2 in order to demonstrate that the latter problem is NP-hard.
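For intuition about Problem 3, the following Python sketch decides small instances with a standard subset-sum table. It is not part of the reduction; it only illustrates that a partition with difference Δ exists exactly when some subset sums to (S + Δ)/2, where S is the total of X. The function name is hypothetical, and the running time is pseudo-polynomial (it grows with the total S), so it does not conflict with NP-hardness.

```python
def has_partition_with_difference(xs, delta):
    """Decide whether xs splits into two parts whose sums differ by delta."""
    total = sum(xs)
    if delta < 0 or delta > total or (total + delta) % 2:
        return False
    target = (total + delta) // 2       # required sum of the larger part
    reachable = {0}                     # subset sums reachable so far
    for x in xs:
        reachable |= {r + x for r in reachable if r + x <= target}
    return target in reachable

print(has_partition_with_difference([3, 1, 2], 0))  # True:  {3} and {1, 2}
print(has_partition_with_difference([3, 1, 2], 1))  # False: total + delta is odd
print(has_partition_with_difference([5, 1, 1], 3))  # True:  {5} and {1, 1}
```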

Given an instance $X = \{x_1, \dots, x_n\}$ of the Partitioning Problem (Problem 3), we construct the base set $U \cup \{a, b\}$ of spatial units, where $U = \{u_1, \dots, u_n\}$, as depicted in Figure 5. We then introduce the two collections $\mathcal{S} = \{s_1, s_2\}$ and $\mathcal{T} = \{t_1, t_2\}$ of spatial supports, where $s_1 = \{a\} \cup U$, $s_2 = \{b\}$, $t_1 = \{a\}$, and $t_2 = U \cup \{b\}$. For each collection $C \in \{\mathcal{S}, \mathcal{T}\}$, $p_C(u_i) = x_i$ for each $i \in \{1, \dots, n\}$ and $p_C(a) = p_C(b) = S + 1$, where $S = \sum_{x \in X} x$. The idea is that units a and b have sufficiently large populations that they remain in different supports in any alignment of $\mathcal{S}$ and $\mathcal{T}$ within a given distance threshold. In this case, the alignment is obtained by swapping only the elements of U in either $\mathcal{S}$ or $\mathcal{T}$, which corresponds to a partition of U. We prove the following claim to complete the proof.

Claim. There exists a partition of X where the difference between the sums of the two parts is $\Delta$ if and only if there exists a collection $\mathcal{A}$ of contiguous supports which satisfies properties (a–c) of Problem 2 such that $\max \{d_w(\mathcal{S}, \mathcal{A}), d_w(\mathcal{T}, \mathcal{A})\} = \frac{S + \Delta}{2}$.

(⇒) Suppose that there exists a partition $(X_1, X_2)$ of X such that the difference between $S_1 = \sum_{x \in X_1} x$ and $S_2 = \sum_{x \in X_2} x$ is $\Delta$. Then, consider collection $\mathcal{A}$ with supports $\{a\} \cup U_1$ and $U_2 \cup \{b\}$, where $U_1$ (resp. $U_2$) is the set of units corresponding to $X_1$ (resp. $X_2$). By inspecting Figure 5, it is clear that $\mathcal{A}$ satisfies properties (a–c). It follows that $d_w(\mathcal{S}, \mathcal{A}) = S_2$, since the units of $U_2$ need to be swapped from $s_1$ to $s_2$ in order to transform collection $\mathcal{S}$ into $\mathcal{A}$. Similarly, $d_w(\mathcal{T}, \mathcal{A}) = S_1$, since the units of $U_1$ need to be swapped from $t_2$ to $t_1$ in order to transform collection $\mathcal{T}$ into $\mathcal{A}$. Suppose, without loss of generality, that $S_1$ is the larger sum, i.e., $S_1 \geq S_2$; hence, $S_1 - S_2 = \Delta$. Since $S_1 + S_2 = S$, it follows that $S_1 - (S - S_1) = \Delta$; then, $2 S_1 = S + \Delta$ and $S_1 = \frac{S + \Delta}{2}$. Since $S_1$ is the larger sum, it follows that $\max \{d_w(\mathcal{S}, \mathcal{A}), d_w(\mathcal{T}, \mathcal{A})\} = S_1 = \frac{S + \Delta}{2}$.

(⇐) Suppose that there exists a collection $\mathcal{A}$ of contiguous supports which satisfies properties (a–c) of Problem 2 such that $\max \{d_w(\mathcal{S}, \mathcal{A}), d_w(\mathcal{T}, \mathcal{A})\} = \frac{S + \Delta}{2}$. Since $\Delta \leq S$, it follows that $\frac{S + \Delta}{2} \leq S$; hence, both $d_w(\mathcal{S}, \mathcal{A}) \leq S$ and $d_w(\mathcal{T}, \mathcal{A}) \leq S$. Therefore, it must be the case that units a and b are in two different supports of $\mathcal{A}$; otherwise, a swap of weight at least $S + 1$ would be needed to transform $\mathcal{S}$ or $\mathcal{T}$ into $\mathcal{A}$, contradicting the assumption that $d_w(\mathcal{S}, \mathcal{A}) \leq S$ and $d_w(\mathcal{T}, \mathcal{A}) \leq S$. Suppose, without loss of generality, that $d_w(\mathcal{S}, \mathcal{A}) = \frac{S + \Delta}{2}$. Then, the support of $\mathcal{A}$ that contains unit a must agree with $s_1$ on a subset $U_1 \subseteq U$ of units with $\sum_{u \in U_1} p_{\mathcal{S}}(u) = \frac{S - \Delta}{2}$. Since $\sum_{u \in U} p_{\mathcal{S}}(u) = S$, it follows that $U_2 = U \setminus U_1$ has $\sum_{u \in U_2} p_{\mathcal{S}}(u) = S - \frac{S - \Delta}{2} = \frac{S + \Delta}{2}$. Since $\frac{S + \Delta}{2} - \frac{S - \Delta}{2} = \Delta$, it follows that $(U_1, U_2)$ is a partition of U such that the difference between the sums of the (populations of the) two parts is $\Delta$; hence, there exists such a partition of X. □
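The reduction can be checked on small inputs with the Python sketch below. It builds the populations and the two collections of the construction and enumerates, by brute force, the two-support alignments that keep a and b apart (the proof shows that only these can stay within a threshold of at most S). Two assumptions of the sketch: every unit of U is taken to be adjacent to both a and b, so that any split of U yields contiguous supports, and all function names are hypothetical.

```python
from itertools import combinations

def build_instance(xs):
    """Populations and the two collections of the reduction for the multiset xs."""
    total = sum(xs)
    units = [f"u{i + 1}" for i in range(len(xs))]
    pop = dict(zip(units, xs))
    pop["a"] = pop["b"] = total + 1          # heavy units that must stay apart
    S = {u: 1 for u in units}
    S.update(a=1, b=2)                       # s1 = {a} ∪ U, s2 = {b}
    T = {u: 2 for u in units}
    T.update(a=1, b=2)                       # t1 = {a},     t2 = U ∪ {b}
    return units, pop, S, T

def weighted_distance(src, dst, pop):
    return sum(pop[u] for u in src if src[u] != dst[u])

def best_alignment_value(xs):
    """Minimum of max(d_w(S, A), d_w(T, A)) over alignments separating a and b."""
    units, pop, S, T = build_instance(xs)
    best = None
    for r in range(len(units) + 1):
        for U1 in combinations(units, r):    # U1 joins a; the rest joins b
            A = {u: (1 if u in U1 else 2) for u in units}
            A.update(a=1, b=2)
            value = max(weighted_distance(S, A, pop),
                        weighted_distance(T, A, pop))
            best = value if best is None else min(best, value)
    return best

print(best_alignment_value([3, 1, 2]))  # 3 = (6 + 0) / 2
print(best_alignment_value([5, 1, 1]))  # 5 = (7 + 3) / 2
```

In both examples the best value equals (S + Δ)/2 for the smallest achievable difference Δ, as the claim predicts.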

5. A Heuristic for the Two-Dimensional Case

In this section, we give a polynomial-time heuristic for the Alignment Minimization Problem (Problem 1) when the geospatial region is two-dimensional, i.e., $d = 2$. We first give a heuristic for a pair of collections ($k = 2$) in two dimensions (Section 5.1), and then show how this can be extended to a general set of collections ($k \geq 2$) in two dimensions (Section 5.2), i.e., to the Alignment Minimization Problem when $d = 2$.

5.1. Aligning a Pair of Collections in Two Dimensions

Figure 3 depicts an example of this special case of aligning only two collections of supports, namely those in panels (a) and (b). It is thus a special case of the Alignment Minimization Problem (Problem 1) with $k = 2$ and $d = 2$. Since there are only two collections, the idea is that if each support in one collection is matched with a support in the other collection, then these pairs of supports are aligned with each other. For example, in the instance depicted in Figure 3, $s_i \in \mathcal{S}$ is matched with $t_i \in \mathcal{T}$ for each $i \in \{1, 2, 3, 4\}$. We first need the following definition of a shared units graph.

Definition 1 (Shared Units Graph $G_x$). The shared units graph for a pair $\mathcal{S}, \mathcal{T}$ of collections of spatial supports is the weighted graph $G_x = (\mathcal{S}, \mathcal{T}, E = \mathcal{S} \times \mathcal{T}, w : E \to \mathbb{R})$, where each edge $e = (s, t) \in E$ has weight $w(e) = \sum_{u \in s \cap t} \left( p_{\mathcal{S}}(u) + p_{\mathcal{T}}(u) \right)$.

The shared units graph $G_x$ gives a measure of the weighted overlap of the supports between the two collections. This information is used to match the supports between the two collections. More precisely, supports are paired according to a maximum-weight perfect matching in $G_x$. For example, Figure 6 depicts the shared units graph $G_x$ of the instance depicted in Figure 3, where, e.g., $w(s_1, t_1) = 35$ because $s_1 \cap t_1 = \{u_{1,1}\}$ and $p_{\mathcal{S}}(u_{1,1}) + p_{\mathcal{T}}(u_{1,1}) = 20 + 15 = 35$. After careful inspection, the maximum-weight perfect matching of the graph depicted in Figure 6 is $\{(s_1, t_1), (s_2, t_2), (s_3, t_3), (s_4, t_4)\}$, of weight 35 + 70 + 88 + 70 = 263. Thus, the supports for this instance of Figure 3 are paired accordingly, as represented by the matching colors.
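The edge weights and the matching step can be sketched in Python as follows. The sketch assumes supports are stored as sets of unit names, and it finds the matching by brute force over permutations, which is only sensible for a handful of supports and stands in for the polynomial-time matching algorithm discussed next. The four-unit instance and the function names are hypothetical; the final check uses the memberships of $s_1$ and $t_1$ as inferred from the swaps described in Section 2.

```python
from itertools import permutations

def shared_weight(s, t, p_S, p_T):
    """w(s, t): combined population of the units shared by supports s and t."""
    return sum(p_S[u] + p_T[u] for u in s & t)

def max_weight_perfect_matching(coll_S, coll_T, p_S, p_T):
    """Brute-force matching for equally sized collections.

    Returns (weight, pairs), where pairs lists (index in coll_S, index in coll_T).
    """
    best_weight, best_pairs = -1, None
    for perm in permutations(range(len(coll_T))):
        weight = sum(shared_weight(coll_S[i], coll_T[j], p_S, p_T)
                     for i, j in enumerate(perm))
        if weight > best_weight:
            best_weight, best_pairs = weight, list(enumerate(perm))
    return best_weight, best_pairs

# Hypothetical instance: four units in a row, two supports per collection.
coll_S = [{"u1"}, {"u2", "u3", "u4"}]
coll_T = [{"u1", "u2"}, {"u3", "u4"}]
p_S = {u: 10 for u in ("u1", "u2", "u3", "u4")}
p_T = {u: 5 for u in ("u1", "u2", "u3", "u4")}
print(max_weight_perfect_matching(coll_S, coll_T, p_S, p_T))  # (45, [(0, 0), (1, 1)])

# Check against the text: s1 and t1 share only u_{1,1}, so w(s1, t1) = 20 + 15.
s1 = {"u11", "u21", "u31"}            # membership inferred, an assumption
t1 = {"u11", "u12", "u22", "u23"}     # membership inferred, an assumption
print(shared_weight(s1, t1, {"u11": 20}, {"u11": 15}))  # 35
```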

In general, a maximum-weight perfect matching M in a weighted graph $G = (V, E, w)$ can be found in time $O(|V|^2 \log |V| + |V| \cdot |E|)$ using a Fibonacci heap [25]. Note that, in general, the number of supports of one collection may differ from that of the other. Suppose, without loss of generality, that $|\mathcal{S}| > |\mathcal{T}|$ for some pair $\mathcal{S}, \mathcal{T}$ of collections of supports. In this case, a maximum-weight matching M covering every support of $\mathcal{T}$ is found in the shared units graph $G_x$ for $\mathcal{S}, \mathcal{T}$, and each remaining unmatched support in $\mathcal{S}$ is associated with one of the supports in $\mathcal{T}$. After this process, each support $t \in \mathcal{T}$ is matched with a support $s \in \mathcal{S}$, and possibly with another subset $S \subseteq \mathcal{S}$ of supports. The idea is that $\{s\} \cup S$ should be a contiguous set of supports in $\mathcal{S}$; hence, the criterion for associating each unmatched support in $\mathcal{S}$ with some support $t \in \mathcal{T}$ is that the resulting subset $S \subseteq \mathcal{S}$ associated with t be such that $\{s\} \cup S$ is contiguous, while maximizing $w(s', t)$ for each $s' \in S$. Since we would expect $|\mathcal{S}| - |\mathcal{T}|$ to typically be a constant, all combinations could be tried to achieve this; hence, the overall procedure of pairing each support (or set of supports) of $\mathcal{S}$ with a support in $\mathcal{T}$ takes polynomial time. In cases where this does not hold ($|\mathcal{S}| - |\mathcal{T}|$ is not a constant), a more efficient algorithm for determining (or approximating) this association may be possible, which is the subject of future work.

The purpose of pairing each support s (or set S of supports) in one collection $\mathcal{S}$ with another support t in the other collection $\mathcal{T}$ is to determine how the trading procedure for aligning the pair $\mathcal{S}, \mathcal{T}$ of collections operates. In particular, each support $s \in \mathcal{S}$ (resp., set $S \subseteq \mathcal{S}$) and its counterpart $t \in \mathcal{T}$ swap units with their respective neighbors until they are aligned (i.e., are on the same set of units). Figure 3c depicts an alignment of collections $\mathcal{S}$ (of Figure 3a) and $\mathcal{T}$ (of Figure 3b) according to the maximum-weight matching $\{(s_1, t_1), (s_2, t_2), (s_3, t_3), (s_4, t_4)\}$ in the shared units graph $G_x$ of $\mathcal{S}, \mathcal{T}$ depicted in Figure 6. While Figure 3c depicts an optimal alignment of these collections $\mathcal{S}$ and $\mathcal{T}$, we outline a polynomial-time heuristic for the general case, since the problem is NP-hard (see Section 4).

Aligning a pair of collections of supports in two dimensions is a partitioning problem (the NP-hardness proof of this case based on a reduction from the Partitioning Problem in 3). Hence, we apply a straightforward greedy partitioning heuristic to the problem, which is slightly more general than the longest processing time first (LPT) scheduling heuristic [26,27]. In LPT scheduling, we are given a set of numbers and a positive integer m, and the goal is to partition this set into m subsets such that the largest sum of any subset (in terms of the values of its elements) is minimized. This problem is NP-hard because its decision version (the Partitioning Problem in 3) is NP-hard. The LPT scheduling heuristic is to order the elements of the set from largest to smallest and to iteratively place each element from this sorted list in the subset (of m subsets) with the smallest sum so far, until all elements are placed. In our variation we have two sets, one for each of the pair of collections; that is, given the set

U^{'} \subseteq U

of units on which the pair, say

S, T

, of collections disagree (based on the pairing of supports between

S

and

T

), we first sort

U^{'}

in descending order of population according to both

p_{S}

and

p_{T}

separately. We then partition

U^{'}

into two parts S and T, representing

S

and

T

, respectively. This is an iterative process which considers the part with the currently lower population (breaking ties arbitrarily) and adds the next element to this part according to its ordering. For example, if part S has the currently lower population, then

\sum_{u \in S} p_{S} (u) < \sum_{u \in T} p_{T} (u)

, and we would add the next largest element of

U^{'}

(according to

S

) to S. The iteration terminates when all elements of

U^{'}

have been assigned to either S or T.
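
For reference, the plain LPT rule described above can be sketched as follows. This is a minimal illustration rather than the implementation used in this work; the function name `lpt_partition` and the input numbers are assumptions made for the example.

```python
import heapq


def lpt_partition(values, m):
    """Partition `values` into m subsets using the LPT rule (illustrative sketch)."""
    # Min-heap of (current subset sum, subset index): the subset with the
    # smallest sum so far is always popped first.
    heap = [(0, i) for i in range(m)]
    subsets = [[] for _ in range(m)]
    for v in sorted(values, reverse=True):  # largest element first
        total, i = heapq.heappop(heap)
        subsets[i].append(v)
        heapq.heappush(heap, (total + v, i))
    return subsets


# Hypothetical input: six numbers split into m = 2 subsets.
parts = lpt_partition([7, 5, 4, 3, 3, 2], m=2)
largest = max(sum(p) for p in parts)
print(parts, largest)  # [[7, 3, 2], [5, 4, 3]] 12
```

The variation described above differs in that each of the two parts keeps its own ordering of the same units, one by each population function.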

For example, consider the instance

S, T

depicted in Figure 3, where the populations

p_{S} (u)

of each unit u in

s_{1}, s_{2}, s_{3}, s_{4}

of

S

are 20, 20, 10, and 15, respectively, while the populations

p_{T} (u)

of each unit u in

t_{1}, t_{2}, t_{3}, t_{4}

of

T

are 15, 15, 12, and 20, respectively. Here, the set

U^{'}

of units on which

S, T

disagree is

U^{'} = {u_{2, 1}, u_{3, 1}, u_{1, 2}, u_{2, 2}, u_{3, 2}, u_{2, 3}, u_{2, 4}}

, annotated with the red dots in Figure 3c. Note that

U^{'}

is currently ordered in reverse lexicographic order, starting from the lower left corner (

u_{1, 1}

) and moving upward to the right row-by-row. By sorting this order in a stable way (the order of identical elements is not disturbed) according to

p_{S}

, it becomes

u_{2, 1} (20), u_{3, 1} (20), u_{3, 2} (20), u_{1, 2} (15), u_{2, 2} (15), u_{2, 3} (10), u_{2, 4} (10)

. By sorting this order in a stable way according to

p_{T}

, it becomes

u_{2, 4} (20), u_{2, 1} (15), u_{3, 1} (15), u_{1, 2} (15), u_{2, 2} (15), u_{2, 3} (15), u_{3, 2} (12)

. The iteration then takes the steps indicated by Table 1, starting with empty parts S and T. After this process completes, the resulting parts S and T join (take their current color in) the corresponding supports

S

and

T

, respectively, producing the alignment. For example, the partitioning outlined in Table 1 produces the alignment depicted in Figure 3c, which is optimal.
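
The iteration of Table 1 can be reproduced with a short sketch. The population values are those of the example above; the unit labels (`"u21"` for u_{2,1}, and so on), the function name `greedy_two_part`, and the rule that ties go to part S (the choice that Table 1 happens to make) are assumptions for illustration.

```python
def greedy_two_part(units, p_s, p_t):
    """Split the disagreement units into parts S and T (ties go to S)."""
    # Python's sort is stable, so units of equal population keep their input order.
    order = {
        "S": sorted(units, key=lambda u: -p_s[u]),
        "T": sorted(units, key=lambda u: -p_t[u]),
    }
    pop = {"S": p_s, "T": p_t}
    parts = {"S": [], "T": []}
    totals = {"S": 0, "T": 0}
    assigned = set()
    while len(assigned) < len(units):
        # Consider the part with the currently lower population.
        side = "S" if totals["S"] <= totals["T"] else "T"
        # Take the next unit in that part's ordering that is still unplaced.
        u = next(x for x in order[side] if x not in assigned)
        parts[side].append(u)
        totals[side] += pop[side][u]
        assigned.add(u)
    return parts, totals


# Units of U' in the order listed in the text, with p_S and p_T from the example.
units = ["u21", "u31", "u12", "u22", "u32", "u23", "u24"]
p_s = dict(zip(units, [20, 20, 15, 15, 20, 10, 10]))
p_t = dict(zip(units, [15, 15, 15, 15, 12, 15, 20]))

parts, totals = greedy_two_part(units, p_s, p_t)
print(parts["S"], totals["S"])  # ['u21', 'u31', 'u32'] 60
print(parts["T"], totals["T"])  # ['u24', 'u12', 'u22', 'u23'] 65
```

This sketch ignores the contiguity requirement on the resulting supports, which is a separate concern.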

In general, our greedy approach does not produce an optimal solution to Problem 1; however, an upper bound on the quality of the solution

\max {d_{w} (S, A), d_{w} (T, A)}

can be obtained based on known approximation factors for LPT scheduling [26,27]. Given some instance

S, T

of the Alignment Minimization Problem, let

σ (u) = {p_{S} (u), p_{T} (u)}

. For example, from the instance mentioned above,

σ (u_{2, 1}) = {p_{S} (u_{2, 1}), p_{T} (u_{2, 1})} = {20, 15}

. For some set

U^{'}

of units, let

M (U^{'}) = {\max (σ (u)) | u \in U^{'}}

, the maximum values of the pairs

σ (u)

of populations represented by each u. Note that our greedy approach obtains a partitioning by effectively applying LPT scheduling to

M (U^{'})

, where

U^{'}

is the set of units on which a pair

S, T

of collections disagree (see Table 1); hence, we can bound its quality based on known bounds for LPT scheduling. It is known that applying LPT scheduling to a set guarantees a solution that is within a factor of

\frac{4 m - 1}{3 m}

times the optimal (minimum) largest sum of any of the m subsets [26,27]. Supposing that we partition

M (U^{'})

into a pair (

m = 2

) of parts using LPT scheduling, let A be the part with the larger sum and

A^{*}

the part with the larger sum in the optimal partitioning of

M (U^{'})

into two parts. It then follows that

A \leq \frac{4 (2) - 1}{3 (2)} \cdot A^{*} = \frac{7}{6} \cdot A^{*} .

Because the elements we are partitioning are indivisible, we know that

A^{*} \leq \frac{\sum (M (U^{'}))}{2} + \max (M (U^{'})),

where

\sum (X) = \sum_{x \in X} x

, a short form for the sum of all values in a set X of values. It follows that

A \leq \frac{7}{6} [\frac{\sum (M (U^{'}))}{2} + \max (M (U^{'}))] .

(1)
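
As a numerical sanity check of Equation (1), the following sketch evaluates each quantity on the example instance above. The brute-force search for the optimal partition is only feasible for such a tiny instance, and the function names are assumptions made for illustration.

```python
from itertools import product

# p_S and p_T of the seven disagreement units, in the order listed in the text.
p_s = [20, 20, 15, 15, 20, 10, 10]
p_t = [15, 15, 15, 15, 12, 15, 20]
M = [max(a, b) for a, b in zip(p_s, p_t)]  # the values M(U')


def lpt_larger_sum(values):
    """Larger of the two part sums produced by LPT with m = 2."""
    a = b = 0
    for v in sorted(values, reverse=True):
        if a <= b:
            a += v
        else:
            b += v
    return max(a, b)


def optimal_larger_sum(values):
    """Larger part sum of an optimal two-way partition (exhaustive search)."""
    total = sum(values)
    best = total
    for mask in product([0, 1], repeat=len(values)):
        s = sum(v for v, bit in zip(values, mask) if bit)
        best = min(best, max(s, total - s))
    return best


A = lpt_larger_sum(M)
A_star = optimal_larger_sum(M)
bound = 7 / 6 * (sum(M) / 2 + max(M))  # right-hand side of Equation (1)
print(A, A_star, bound)  # 70 65 96.25
```

On this instance the LPT value of 70 is within 7/6 of the optimum of 65, and both lie well below the right-hand side of Equation (1).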

Let part S be the set of units represented by part A. The units of S were chosen based on the largest values from

M (U^{'})

at the time, as represented by A. The units of S are used to transform collection

S

of supports into another collection

A

of supports, while the remaining units of

U^{'} ∖ S

are used to transform collection

T

into

A

. It follows that

d_{w} (S, A) = m (S)

, where

m (S) = {\min (σ (u)) | u \in S}

, the minimum values of the pairs

σ (u)

of populations represented by each u. Since

m (S) \leq A

by design and since A is the larger part, i.e.,

m (U^{'} ∖ S) \leq A

as well, it follows from Equation (1) that

\max {d_{w} (S, A), d_{w} (T, A)} \leq \frac{7}{6} [\frac{\sum (M (U^{'}))}{2} + \max (M (U^{'}))] .

(2)

Since a typical instance

S, T

will contain many units u which do not differ much in

p_{S} (u)

and

p_{T} (u)

, nor are

\max (M (U^{'}))

and

\min (m (U^{'}))

expected to differ by much, each collection

S

and

T

will typically contribute close to half of its weight to the alignment

A

.

There remain some small and final details to address in this heuristic. One detail is that the units of

U^{'}

on which

S

and

T

disagree cannot be placed into parts arbitrarily; rather, the parts must be such that swapping their units results in an alignment

A

with supports that are contiguous (see the Alignment Minimization Problem in 1). The example outlined in Table 1 happens to create a contiguous set of supports, as depicted in Figure 3c. However, if units

u_{1, 2}

and

u_{2, 2}

were assigned to part S, for example, instead of to part T, then the green support would not be contiguous. In a general instance, such a constraint only needs to be minded for each contiguous set

U^{'}

of units on which

S

and

T

disagree. For each such contiguous set, some small local shuffles could be applied to each ordering of

U^{'}

according to

p_{S}

and

p_{T}

, respectively. Another solution could be to apply the iteration to S and T as-is while skipping any greedy choice that violates contiguity. In any case, the iteration will be no worse than (unordered) list scheduling [26]. In this case, it is known that applying list scheduling to a set guarantees a solution within a factor of

2 - \frac{1}{m}

times the optimal (minimum) largest sum of any of the m subsets. Since

m = 2

in this case, it follows that this factor is

\frac{3}{2}

, and the same analysis as above can be applied. Since there will be few such constraints in the typical instance and since they only apply to contiguous sets of units of

U^{'}

, which will be typically small, the solution is expected to be much closer to

\frac{7}{6}

(see Equation (2)) than

\frac{3}{2}

in practice. The other detail is the unmatched supports (in, e.g.,

S

) associated with some support (e.g.,

t \in T

). In this case, the support in the final alignment

A

that represents these will present another alignment subproblem within that support, where this support could be split into several parts. Since such supports are expected to be small in general, all ways to align this support could be tried. Nonetheless, a more systematic procedure for minding such constraints is the subject of future work, along with a more definite approximation factor.
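
The option of skipping greedy choices that violate contiguity requires a contiguity test for each candidate trade. A minimal sketch follows, assuming units are cells of a grid identified by (column, row) pairs and that two units are adjacent when they share an edge; the actual adjacency relation depends on the geography, and both function names are hypothetical.

```python
from collections import deque


def is_contiguous(cells):
    """True if the set of (col, row) cells is connected under edge adjacency."""
    cells = set(cells)
    if not cells:
        return True  # an empty support is treated as trivially contiguous
    start = next(iter(cells))
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search restricted to the given cells
        x, y = queue.popleft()
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nb in cells and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == cells


def trade_keeps_contiguity(donor, receiver, unit):
    """Check one trade: `unit` leaves support `donor` and joins `receiver`."""
    return (is_contiguous(set(donor) - {unit})
            and is_contiguous(set(receiver) | {unit}))


row = {(1, 1), (2, 1), (3, 1)}
print(trade_keeps_contiguity(row, {(2, 2)}, (2, 1)))  # False: the row is split
print(trade_keeps_contiguity(row, {(3, 2)}, (3, 1)))  # True
```

A greedy step would then be accepted only if this test returns true for the supports involved.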

5.2. The Alignment Minimization Problem in Two Dimensions

We now outline how to extend the techniques used in the heuristic of Section 5.1 to a set

𝒞 = {C_{1}, \dots, C_{k}}

of (more than two) collections in two dimensions, i.e., to the Alignment Minimization Problem when

d = 2

. We first need to match supports across all collections

𝒞

in order to align them. This amounts to finding a maximum-weight perfect matching in a complete k-uniform hypergraph across all (k) collections of supports. We need the following definition of a shared units hypergraph, analogous to the shared units graph of Definition 1.

Definition 2

(Shared Units Hypergraph

H_{x}

). The shared units hypergraph for a set

𝒞 = {C_{1}, \dots, C_{k}}

of collections of spatial supports is the weighted hypergraph

H_{x} = (C_{1}, \dots, C_{k}, E = C_{1} \times \dots \times C_{k}, w (e) : E \to R)

, where each hyperedge

e \in E

has weight

w (e) = \sum_{(s, t) \in e^{2}} \sum_{u \in s \cap t} [p_{C (s)} (u) + p_{C (t)} (u)]

, with

C (s)

as the collection that support s belongs to.
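
To make Definition 2 concrete, the sketch below computes hyperedge weights and finds a maximum-weight perfect matching by exhaustive search on a toy instance. Several things here are assumptions for illustration: the pairs (s, t) are read as unordered pairs of distinct supports in e, the three collections and the unit populations are invented, and the exhaustive search is only viable for very small inputs.

```python
from itertools import combinations, permutations, product


def hyperedge_weight(edge, pop):
    """edge: list of (collection index, set of units), one support per collection.
    pop[c][u]: population of unit u according to collection c."""
    w = 0
    for (c1, s), (c2, t) in combinations(edge, 2):
        for u in s & t:  # units shared by the two supports
            w += pop[c1][u] + pop[c2][u]
    return w


def best_perfect_matching(collections, pop):
    """Exhaustive search; assumes every collection has the same number of supports."""
    k, m = len(collections), len(collections[0])
    best_w, best = -1, None
    # Keep collection 0 in order and permute the supports of the other k - 1.
    for perms in product(permutations(range(m)), repeat=k - 1):
        rows = [(r,) + tuple(p[r] for p in perms) for r in range(m)]
        w = sum(
            hyperedge_weight([(c, collections[c][i]) for c, i in enumerate(row)], pop)
            for row in rows
        )
        if w > best_w:
            best_w, best = w, rows
    return best_w, best


# Hypothetical instance: three collections of two supports over units 0..3.
collections = [
    [{0, 1}, {2, 3}],
    [{0}, {1, 2, 3}],
    [{2, 3}, {0, 1}],
]
pop = [{u: 1 for u in range(4)} for _ in collections]  # unit populations of 1

best_w, best = best_perfect_matching(collections, pop)
print(best_w, best)  # 20 [(0, 0, 1), (1, 1, 0)]
```

Each tuple in the result lists, per collection, the index of the support placed in that hyperedge.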

The weight of a hyperedge e of

H_{x}

is effectively the weighted overlap of the set of k supports, one from each collection

C_{1}, \dots, C_{k}

, represented by e in terms of the weighted overlap between each pair

s, t

of supports from e. Finding a maximum-weight perfect matching in

H_{x}

is NP-hard [28,29]. This problem is a special case of the k-set packing problem, which can be approximated within a factor of

\frac{k + 1 + ε}{3}

times the optimal packing [30,31]. Since k is typically a small constant (less than 10, for example), this bound is acceptable in practice. When the number of supports in the collections differ, a perfect matching M (of size

\min_{C \in 𝒞} | C |

) is found in the shared units hypergraph

H_{x}

, and each remaining unmatched support s in any collection

C \in 𝒞

is associated with one of the hyperedges in M of maximum overlap with s. Similarly to the case with a pair of collections, the hyperedge that s joins should maintain a contiguous set of supports in

C (s)

, the collection that support s belongs to. Again, since

\max_{C \in 𝒞} | C | - \min_{C \in 𝒞} | C |

should typically be a constant, all combinations of hyperedges for s to join could be tried to achieve contiguity; however, a more efficient algorithm for determining these choices is the subject of future work.

Analogously to the case with a pair of collections (of Section 5.1), to match sets of supports across collections is to determine how the trading procedure for aligning collections

𝒞

operates. In particular, each set of supports from matching M in

H_{x}

(with the extra unmatched supports joined later) swaps units with its respective neighbors until they are aligned. Similar to the case with pairs, the matching M gives rise to a set

U^{'} \subseteq U

of units on which some pair

C, C^{'} \in 𝒞

of collections disagree. Each such unit must be assigned to some support (in M) in a way that minimizes overall cost. Aligning the units of

U^{'}

in this way is again a type of partitioning problem, which could also be approximated using LPT scheduling; however, a slightly more general partitioning problem is more appropriate in this case. In particular, this case is more closely related to the problem of fair item allocation [32] with additive preferences [33] and positively valued goods. Note that there also exist versions with negatively weighted goods or chores [34].

The input to this problem is a set N of

| N | = n

agents and a set M of

| M | = m

items. We use the elements

i \in N

of a set N and its corresponding indices

i \in {1, \dots, n}

interchangeably when the context is clear. Each agent

i \in N

attaches a value

v_{i} (j)

to item

j \in M

, where

v_{i} (j) \in Z^{+} \forall i \in N \forall j \in M

. We also overload the meaning of v for subsets

S \subseteq M

, where

v_{i} (S) = \sum_{j \in S} v_{i} (j)

, since the values are additive. Let

Π_{n} (M)

be the collection of all partitionings of set M into n parts. The goal is to find a partitioning in

Π_{n} (M)

that gives each agent their fairest share of value from the items. A common formalization for this is the maximin share [35] of agent

i \in N

from a set M of items, which is

μ_{i}^{n} (M) = \max_{(M_{1}, M_{2}, \dots, M_{n}) \in Π_{n} (M)} \min_{k \in N} v_{i} (M_{k}) .

(3)

The idea is that if agent

i \in N

divides items M into n parts and the other agents then choose how these n parts are distributed among the n agents, then agent i would partition the items such that the value

v_{i}

of the smallest part

M_{k}

is maximized. In fair item allocation, the goal is to partition the items such that each agent

i \in N

has a value that is as close to their maximin share

μ_{i}

as possible. An important approximation result is that a partitioning

(M_{1}, \dots, M_{n}) \in Π_{n} (M)

which satisfies

v_{i} (M_{i}) \geq \frac{2}{3} μ_{i}^{n} (M) \forall i \in N

(4)

can be found in polynomial time [35].
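
For intuition, the maximin share of Equation (3) can be computed by exhaustive search on a tiny instance. The sketch below does this for a single agent; the function name `maximin_share` is an assumption, the valuations are borrowed from the populations of the example in Section 5.1, and the search takes time exponential in the number of items.

```python
from itertools import product


def maximin_share(values, n):
    """Maximin share of one agent: values[j] is the agent's value for item j."""
    best = 0
    # Enumerate every assignment of items to n (possibly empty) parts.
    for assignment in product(range(n), repeat=len(values)):
        sums = [0] * n
        for j, part in enumerate(assignment):
            sums[part] += values[j]
        best = max(best, min(sums))  # value of the worst part, maximized
    return best


# Valuations borrowed from p_S and p_T in the example of Section 5.1.
p_s = [20, 20, 15, 15, 20, 10, 10]
p_t = [15, 15, 15, 15, 12, 15, 20]
print(maximin_share(p_s, 2))  # 55
print(maximin_share(p_t, 2))  # 50
```

With two agents, the first can split its total of 110 evenly, while the second can guarantee at most 50 out of 107.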

Our problem of aligning each unit of

U^{'}

is closely related to this problem, in that each collection

C \in 𝒞

is an agent and each unit

u \in U^{'}

is an item that gets assigned to some collection when aligned, where

v_{C} (u) = p_{C} (u)

. The only difference is that the collection

C

to which u is assigned avoids the cost

p_{C} (u)

, while every other collection

C^{'} \in 𝒞 ∖ {C}

incurs (at most) its corresponding cost

p_{C^{'}} (u)

. Since we want to minimize the maximum cost to any collection (see Problem 1), for each collection

i \in N = 𝒞

, given set

M = U^{'}

of units, we aim to minimize

γ_{i}^{n} (M) = \min_{(M_{1}, M_{2}, \dots, M_{n}) \in Π_{n} (M)} \max_{k \in N} \sum_{j \in N ∖ {k}} v_{i} (M_{j}) .

(5)

Note that this is equivalent to

γ_{i}^{n} (M) = \min_{(M_{1}, M_{2}, \dots, M_{n}) \in Π_{n} (M)} \max_{k \in N} C_{i} (M) - v_{i} (M_{k}),

where

C_{i} (M) = \sum_{j \in M} v_{i} (j)

. Since

C_{i} (M)

does not depend on the partition chosen from

Π_{n} (M)

, it follows that

γ_{i}^{n} (M) = C_{i} (M) + \min_{(M_{1}, M_{2}, \dots, M_{n}) \in Π_{n} (M)} \max_{k \in N} - v_{i} (M_{k}) .

In pulling the minus sign through to the front, it follows that

γ_{i}^{n} (M) = C_{i} (M) - \max_{(M_{1}, M_{2}, \dots, M_{n}) \in Π_{n} (M)} \min_{k \in N} v_{i} (M_{k}) .

Since the last term is exactly the right-hand side of Equation (3), we can replace it with the left-hand side to obtain

γ_{i}^{n} (M) = C_{i} (M) - μ_{i}^{n} (M) .

(6)

Then, based on the result of Equation (4), it follows that a partitioning

(M_{1}, \dots, M_{n}) \in Π_{n} (M)

which satisfies

\sum_{j \in N ∖ {i}} v_{i} (M_{j}) \leq C_{i} (M) - \frac{2}{3} μ_{i}^{n} (M) \forall i \in N

(7)

can be found in polynomial time.
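
The identity of Equation (6) can be checked numerically on a small instance by evaluating Equations (3) and (5) directly. The sketch below does so by exhaustive search; the function names and the choice of instance (the values p_S from Section 5.1 with n = 3) are assumptions made for illustration.

```python
from itertools import product


def part_sums(values, assignment, n):
    """Total value of each of the n parts under the given item assignment."""
    sums = [0] * n
    for value, part in zip(values, assignment):
        sums[part] += value
    return sums


def maximin_share(values, n):
    """Equation (3): maximize the value of the smallest part."""
    return max(
        min(part_sums(values, a, n))
        for a in product(range(n), repeat=len(values))
    )


def minimax_cost(values, n):
    """Equation (5): minimize, over partitions, the largest value lying outside one part."""
    best = None
    for a in product(range(n), repeat=len(values)):
        sums = part_sums(values, a, n)
        worst = max(
            sum(s for j, s in enumerate(sums) if j != k) for k in range(n)
        )
        best = worst if best is None else min(best, worst)
    return best


values = [20, 20, 15, 15, 20, 10, 10]
mu = maximin_share(values, 3)
gamma = minimax_cost(values, 3)
print(mu, gamma, sum(values) - mu)  # 35 75 75
```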

Placing this result in the notation of our problem, where

N = 𝒞

, and

M = U^{'}

, it follows that

\sum_{j \in N ∖ {i}} v_{i} (M_{j}) = \sum_{C^{'} \in 𝒞 ∖ {C}} p_{C} ({U^{'}}_{C^{'}})

, where

{U^{'}}_{C^{'}}

are the units from

U^{'}

assigned to collection

C^{'}

in the alignment

A

represented by partitioning

(U_{1}, \dots, U_{n}) \in Π_{n} (U^{'})

and the meaning of

p_{C}

has been overloaded for sets, where

p_{C} (U^{'}) = \sum_{u \in U^{'}} p_{C} (u)

. Observe that

d_{w} (C, A) = \sum_{C^{'} \in 𝒞 ∖ {C}} p_{C} ({U^{'}}_{C^{'}})

. Then, it follows from Equation (7) that

d_{w} (C, A) \leq \sum_{u \in U^{'}} p_{C} (u) - \frac{2}{3} μ_{C}^{n} (U^{'}) \forall C \in 𝒞,

(8)

where

U^{'}

is the set of units on which some pair of collections of

𝒞

disagree and

μ_{C}^{n} (U^{'})

is the maximin share of collection

C

from set

U^{'}

of units, where the value of a unit is

p_{C} (u)

. This guarantees a bound on

\max {d_{w} (C, A) | C \in 𝒞}

(see Problem 1) which can be obtained in polynomial time. The approximation result of Equation (4) from [35] relies on a complex preprocessing step from [36] to guarantee this theoretical bound. However, it is practical to use a more straightforward approach based on the envy graph procedure [34,35,37], which is a common approach used for fair item allocation. Such an approach iterates through the items, assigning them to agents. If an envy cycle (a directed cycle on a set of agents where each agent places more value on the intermediate set of items of its neighbor) ever arises during this process, then this cycle is broken by shifting this cycle one step in the opposite direction. This process continues until all items are assigned to some agent. While there are many theoretical results in this area of fair item allocation, there also exist some practical results (such as Spliddit [38], based on theoretical results in [39]). As in the case of pairs in two dimensions (Section 5.1), the details of maintaining contiguity and of managing the unmatched supports that are associated after the matching is computed remain to be addressed. We plan to use or follow these ideas in devising a practical algorithm for our problem. An efficient implementation addressing all of these details is the subject of future work.
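
A minimal sketch of the envy graph procedure is given below, in the common form where each item goes to an agent that nobody currently envies and cycles are resolved by passing bundles along the cycle. It is not the algorithm proposed in this work: it treats units purely as positively valued goods, ignores contiguity and unmatched supports, and the function name and the two-agent input (borrowed from p_S and p_T of Section 5.1) are assumptions.

```python
def envy_cycle_elimination(values):
    """values[i][j]: additive value agent i places on item j.
    Returns one bundle (list of item indices) per agent."""
    n, m = len(values), len(values[0])
    bundles = [[] for _ in range(n)]

    def worth(i, bundle):
        return sum(values[i][j] for j in bundle)

    def envies(i, k):
        return worth(i, bundles[k]) > worth(i, bundles[i])

    def unenvied_agent():
        for k in range(n):
            if not any(envies(i, k) for i in range(n) if i != k):
                return k
        return None

    def find_cycle():
        # Only called when every agent is envied, so stepping from an agent
        # to someone who envies it must eventually revisit an agent.
        position, path, a = {}, [], 0
        while a not in position:
            position[a] = len(path)
            path.append(a)
            a = next(i for i in range(n) if i != a and envies(i, a))
        return path[position[a]:]

    for item in range(m):
        k = unenvied_agent()
        while k is None:
            cycle = find_cycle()
            old = [bundles[c] for c in cycle]
            for idx, c in enumerate(cycle):
                bundles[c] = old[idx - 1]  # c takes the bundle it envied
            k = unenvied_agent()
        bundles[k].append(item)
    return bundles


values = [
    [20, 20, 15, 15, 20, 10, 10],  # agent 0, values taken from p_S
    [15, 15, 15, 15, 12, 15, 20],  # agent 1, values taken from p_T
]
bundles = envy_cycle_elimination(values)
print(bundles)  # [[0, 2, 4, 6], [1, 3, 5]]
```

In the setting of this section, agents would correspond to collections and items to the units of U', with v_C(u) = p_C(u).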

6. Conclusions

In this paper, we introduce an alignment problem for reconciling misaligned boundaries of regions comprising spatial units. While the rather trivial case in one dimension is tractable, the general problem is combinatorially (NP-) hard. We devise some heuristics for the case in two dimensions, since it has applications to geospatial problems such as resource allocation or building disease maps.

Future work entails further investigation into small details such as efficient approaches for maintaining contiguity in the trading procedure, as well as how to systematically handle collections with different numbers of supports. Developing an implementation which works efficiently in practice is also the subject of future work, and would allow for application to real geospatial problems such as automated map building and resource allocation.

Author Contributions

Conceptualization, all authors; methodology, A.R.M. and M.P.; validation, M.P.; formal analysis, M.P.; investigation, M.P.; resources, all authors; data curation, C.T.; writing—original draft preparation, C.T. and M.P.; writing—review and editing, all authors; visualization, C.T.; supervision, C.T. and M.P.; project administration, C.T.; funding acquisition, A.R.M. and C.T. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

This article does not have any associated data.

Acknowledgments

The authors would like to thank Emma L. McDaniel for advice on the problem formulation, as well as Alexander Zelikovsky for some helpful discussions on interpreting the approximation results.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

NP	Non-deterministic polynomial time
LPT	Longest processing time first

References

1. Boscoe, F.P.; Ward, M.H.; Reynolds, P. Current practices in spatial analysis of cancer data: Data characteristics and data sources for geographic studies of cancer. Int. J. Health Geogr. 2004, 3, 28.
2. Tatalovich, Z.; Chtourou, A.; Zhu, L.; Dellavalle, C.; Hanson, H.A.; Henry, K.A.; Penberthy, L. Landscape analysis of environmental data sources for linkage with SEER cancer patients database. JNCI Monogr. 2024, 2024, 132–144.
3. Jarup, L. Health and environment information systems for exposure and disease mapping, and risk assessment. Environ. Health Perspect. 2004, 112, 995–997.
4. Tatalovich, Z.; Stinchcomb, D.G.; Ng, D.; Yu, M.; Lewis, D.R.; Zhu, L.; Feuer, E.J. Developing geographic areas for cancer reporting using automated zone design. Am. J. Epidemiol. 2022, 191, 2109–2119.
5. Sahar, L.; Foster, S.L.; Sherman, R.L.; Henry, K.A.; Goldberg, D.W.; Stinchcomb, D.G.; Bauer, J.E. GIScience and cancer: State of the art and trends for cancer surveillance and epidemiology. Cancer 2019, 125, 2544–2560.
6. Roquette, R.; Nunes, B.; Painho, M. The relevance of spatial aggregation level and of applied methods in the analysis of geographical distribution of cancer mortality in mainland Portugal (2009–2013). Popul. Health Metr. 2018, 16, 6.
7. Maantay, J. Mapping environmental injustices: Pitfalls and potential of geographic information systems in assessing environmental health and equity. Environ. Health Perspect. 2002, 110, 161–171.
8. VoPham, T.; White, A.J.; Jones, R.R. Geospatial Science for the Environmental Epidemiology of Cancer in the Exposome Era. Cancer Epidemiol. Biomark. Prev. 2024, 33, 451–460.
9. Jimenez, T.; Mikler, A.R.; Tiwari, C. A Novel Space Partitioning Algorithm to Improve Current Practices in Facility Placement. IEEE Trans. Syst. Man Cybern. 2012, 42, 1194–1205.
10. Martin, D.; Nolan, A.; Tranmer, M. The application of zone-design methodology in the 2001 UK Census. Environ. Plan. A 2001, 33, 1949–1962.
11. Martin, D. Extending the automated zoning procedure to reconcile incompatible zoning systems. Int. J. Geogr. Inf. Sci. 2003, 17, 181–196.
12. Herring, J.R. (Ed.) OpenGIS Implementation Standard for Geographic Information–Simple Feature Access–Part 1: Common Architecture; Version 1.2.1, OGC 06-103r4; Open Geospatial Consortium: Arlington, VA, USA, 2011.
13. Bithell, J.F. A classification of disease mapping methods. Stat. Med. 2000, 19, 2203–2215.
14. Rushton, G. Public Health, GIS, and Spatial Analytic Tools. Annu. Rev. Public Health 2003, 24, 43–56.
15. Beyer, K.M.M.; Tiwari, C.; Rushton, G. Five essential properties of disease maps. Ann. Assoc. Am. Geogr. 2012, 102, 1067–1075.
16. Rushton, G.; Lolonis, P. Exploratory spatial analysis of birth defect rates in an urban population. Stat. Med. 1996, 15, 717–726.
17. Clayton, D.; Kaldor, J. Empirical Bayes estimates of age-standardized relative risks for use in disease mapping. Biometrics 1987, 43, 671–681.
18. MacNab, Y.C. Bayesian disease mapping: Past, present, and future. Spat. Stat. 2022, 50, 100593.
19. Tiwari, C.; Rushton, G. Using spatially adaptive filters to map late stage colorectal cancer incidence in Iowa. In Developments in Spatial Data Handling: 11th International Symposium on Spatial Data Handling; Springer: Berlin/Heidelberg, Germany, 2005; pp. 665–676.
20. Shi, X. Selection of bandwidth type and adjustment side in kernel density estimation over inhomogeneous backgrounds. Int. J. Geogr. Inf. Sci. 2010, 24, 643–660.
21. Hansen, K.M. Head-banging: Robust smoothing in the plane. IEEE Trans. Geosci. Remote Sens. 1991, 29, 369–378.
22. Mungiole, M.; Pickle, L.W.; Simonson, K.H. Application of a weighted head-banging algorithm to mortality data maps. Stat. Med. 1999, 18, 3201–3209.
23. Talbot, T.O.; Kulldorff, M.; Forand, S.P.; Haley, V.B. Evaluation of spatial filters to create smoothed maps of health data. Stat. Med. 2000, 19, 2399–2408.
24. Korf, R.E. Multi-way number partitioning. In Proceedings of the 21st International Joint Conferences on Artificial Intelligence (IJCAI), Pasadena, CA, USA, 11–17 July 2009; pp. 538–543.
25. Fredman, M.; Tarjan, R. Fibonacci Heaps and Their Uses in Improved Network Optimization Algorithms. J. ACM 1987, 34, 596–615.
26. Graham, R.L. Bounds on Multiprocessing Timing Anomalies. SIAM J. Appl. Math. 1969, 17, 416–429.
27. Coffman, E.G.; Sethi, R. A generalized bound on LPT sequencing. In Proceedings of the 1976 ACM SIGMETRICS Conference on Computer Performance Modeling Measurement and Evaluation, Cambridge, MA, USA, 29–31 March 1976; pp. 306–310.
28. Karp, R.M. Reducibility among combinatorial problems. In Complexity of Computer Computations; Springer: Berlin/Heidelberg, Germany, 1972; pp. 85–103.
29. Garey, M.R.; Johnson, D.S. Computers and Intractability: A Guide to the Theory of NP-Completeness; W.H. Freeman and Company: New York, NY, USA, 1979.
30. Cygan, M. Improved Approximation for 3-Dimensional Matching via Bounded Pathwidth Local Search. In Proceedings of the 2013 IEEE 54th Annual Symposium on Foundations of Computer Science, Berkeley, CA, USA, 26–29 October 2013; pp. 509–518.
31. Fürer, M.; Yu, H. Approximating the k-Set Packing Problem by Local Improvements. In International Symposium on Combinatorial Optimization; Springer: Berlin/Heidelberg, Germany, 2014; pp. 408–420.
32. Demko, S.; Hill, T.P. Equitable distribution of indivisible objects. Math. Soc. Sci. 1988, 16, 145–158.
33. Bouveret, S.; Endriss, U.; Lang, J. Fair Division under Ordinal Preferences: Computing Envy-Free Allocations of Indivisible Goods. In Proceedings of the 2010 Conference on ECAI 2010: 19th European Conference on Artificial Intelligence, Lisbon, Portugal, 16–20 August 2010; pp. 387–392.
34. Aziz, H.; Rauchecker, G.; Schryen, G.; Walsh, T. Algorithms for Max-Min Share Fair Allocation of Indivisible Chores. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence; AAAI Press: Washington, DC, USA, 2017; Volume 31.
35. Barman, S.; Krishnamurthy, S.K. Approximation Algorithms for Maximin Fair Division. ACM Trans. Econ. Comput. 2020, 8, 5.
36. Bouveret, S.; Lemaître, M. Characterizing conflicts in fair division of indivisible goods using a scale of criteria. Proc. Auton. Agents Multi-Agent Syst. 2016, 30, 259–290.
37. Lipton, R.J.; Markakis, E.; Mossel, E.; Saberi, A. On approximately fair allocations of indivisible goods. In Proceedings of the 5th ACM Conference on Electronic Commerce, New York, NY, USA, 17–20 May 2004; pp. 125–131.
38. Goldman, J.; Procaccia, A.D. Spliddit: Unleashing fair division algorithms. SIGecom Exch. 2015, 13, 41–46.
39. Procaccia, A.D.; Wang, J. Fair enough: Guaranteeing approximate maximin shares. In Proceedings of the Fifteenth ACM Conference on Economics and Computation, Palo Alto, CA, USA, 8–12 June 2014; pp. 675–692.

Figure 1. Regions formed by aggregating spatial units such as census tracts.

Figure 2. Addressing spatial misalignment between regions.

Figure 3. Collections (a) $S = \{s_{1}, s_{2}, s_{3}, s_{4}\}$ and (b) $T = \{t_{1}, t_{2}, t_{3}, t_{4}\}$ of spatial supports over the set $U = \{u_{1, 1}, u_{1, 2}, \dots, u_{4, 4}\}$ of 16 spatial units, and an alignment (c) of $S$ and $T$. The red dots mark the units ($u_{2, 1}, u_{3, 1}, u_{1, 2}, u_{2, 2}, u_{3, 2}, u_{2, 3}, u_{2, 4}$) on which $S$ and $T$ disagree.

Figure 4. Four collections of three spatial supports (green, orange, and blue) over the set $U = \{u_{1}, u_{2}, \dots, u_{9}\}$ of nine spatial units. All disagreements between (the supports of) any pair of collections are contained in the two transparent windows.

Figure 5. Base set $U \cup \{a, b\} = \{a, u_{1}, \dots, u_{n}, b\}$ of spatial units.

Figure 6. The shared units graph $G_{x}$ (Definition 1) of the instance $S, T$ depicted in Figure 3, where the populations $p_{S} (u)$ of each unit u in $s_{1}, s_{2}, s_{3}, s_{4}$ of $S$ are 20, 20, 10, and 15, respectively, while the populations $p_{T} (u)$ of each unit u in $t_{1}, t_{2}, t_{3}, t_{4}$ of $T$ are 15, 15, 12, and 20, respectively. Edges with weights of zero are not shown for easier readability.

Table 1. Steps taken by the greedy approach to create parts S and T with the respective total populations in parentheses.

Step	Action	Part S	Part T
0	initialize S & T	∅	∅
1	$S \leftarrow u_{2, 1} (20)$	${u_{2, 1}}$ (20)	∅
2	$T \leftarrow u_{2, 4} (20)$	${u_{2, 1}}$ (20)	${u_{2, 4}}$ (20)
3	$S \leftarrow u_{3, 1} (20)$	${u_{2, 1}, u_{3, 1}}$ (40)	${u_{2, 4}}$ (20)
4	$T \leftarrow u_{1, 2} (15)$	${u_{2, 1}, u_{3, 1}}$ (40)	${u_{2, 4}, u_{1, 2}}$ (35)
5	$T \leftarrow u_{2, 2} (15)$	${u_{2, 1}, u_{3, 1}}$ (40)	${u_{2, 4}, u_{1, 2}, u_{2, 2}}$ (50)
6	$S \leftarrow u_{3, 2} (20)$	${u_{2, 1}, u_{3, 1}, u_{3, 2}}$ (60)	${u_{2, 4}, u_{1, 2}, u_{2, 2}}$ (50)
7	$T \leftarrow u_{2, 3} (15)$	${u_{2, 1}, u_{3, 1}, u_{3, 2}}$ (60)	${u_{2, 4}, u_{1, 2}, u_{2, 2}, u_{2, 3}}$ (65)
