5.1. Aligning a Pair of Collections in Two Dimensions
Figure 3 depicts an example of this special case of aligning only two collections of supports, in this case (a) and (b). Thus, it is a special case of the Alignment Minimization Problem in 1 when
and
. Since there are only two collections, the idea is that if each support in one collection is matched up with another support in the other collection, then these pairs of supports are aligned with each other. For example, in the instance depicted in
Figure 3,
is matched with
for each
. We first need the following definition of a shared units graph.
Definition 1 (Shared Units Graph ). The shared units graph for a pair of collections of spatial supports is the weighted graph , where each edge has weight .
The shared units graph
gives a measure of the weighted overlap of the supports between each collection. This information is used to match the supports between each collection. More precisely, supports between each collection are paired according to a maximum-weight perfect matching in
. For example,
Figure 6 depicts the shared units graph
of the instance depicted in
Figure 3, where, e.g.,
because
and
. After careful inspection, the maximum-weight perfect matching of the graph depicted in
Figure 6 is
of weight 35 + 70 + 88 + 70 = 263. Thus, the supports for this instance of
Figure 3 are paired accordingly, as represented by the matching colors.
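The pairing step can be illustrated with a minimal sketch. It assumes the shared units graph is given as a square weight matrix, where `weights[i][j]` is the weighted overlap between support i of one collection and support j of the other. The function name and the off-diagonal weights are hypothetical; only the matched weights 35, 70, 88, and 70 come from the text. The brute-force search is for illustration on small instances only, not the Fibonacci-heap method cited here.

```python
from itertools import permutations

def max_weight_perfect_matching(weights):
    """Brute-force maximum-weight perfect matching for a small bipartite graph.

    weights[i][j] is the (assumed) overlap weight between support i of the
    first collection and support j of the second collection.
    """
    n = len(weights)
    best_perm, best_total = None, float("-inf")
    # Each permutation pairs support i with support perm[i].
    for perm in permutations(range(n)):
        total = sum(weights[i][perm[i]] for i in range(n))
        if total > best_total:
            best_perm, best_total = perm, total
    return list(enumerate(best_perm)), best_total

# Hypothetical weights: only the diagonal entries are taken from the text.
example_weights = [
    [35, 10, 0, 0],
    [5, 70, 12, 0],
    [0, 8, 88, 20],
    [0, 0, 15, 70],
]
matching, total = max_weight_perfect_matching(example_weights)
print(matching, total)  # [(0, 0), (1, 1), (2, 2), (3, 3)] 263
```

For larger instances, a polynomial-time assignment algorithm would replace the exhaustive search.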
In general, a maximum-weight perfect matching
M in a weighted graph
can be found in time
using a Fibonacci heap [
25]. Note that, in general, the number of supports in one collection may differ from the number in the other. Suppose, without loss of generality, that
for some pair
of collections of supports. In this case, a perfect matching
M is found in the shared units graph
for
, and each remaining unmatched support in
is associated with one of the supports in
. After this process, each support in
is matched with a support
, and possibly with another subset
of supports. The idea is that
should be a contiguous set of supports in
; hence, the criterion for associating each unmatched support in
with some support
is that the resulting subset
associated with
t be such that
while maximizing
for each
. Since we would expect
to typically be a constant, all combinations could be tried to achieve this; hence, the overall procedure of pairing each support (or set of supports) of
with a support in
takes polynomial time. In cases where this does not hold (
is not a constant), a more efficient algorithm for determining (or approximating) this would be needed, which is the subject of future work.
The purpose of pairing each support
s (or set
S of supports) in one collection
with another support
t in the other collection
is to determine how the trading procedure for aligning pair
of collections operates. In particular, each support
(resp., set
) and its counterpart
swap units with their respective neighbors until they are aligned (are on the same set of units).
Figure 3c depicts an alignment of collections
(of
Figure 3a) and
(of
Figure 3b) according to the maximum-weight matching
in the shared units graph
of
depicted in
Figure 6. While
Figure 3c depicts an optimal alignment of these collections
and
, we outline a polynomial-time heuristic for the general case, since it is NP-hard (see
Section 4).
Aligning a pair of collections of supports in two dimensions is a partitioning problem (the NP-hardness proof of this case is based on a reduction from the Partitioning Problem in 3). Hence, we apply a straightforward greedy partitioning heuristic to the problem, which is slightly more general than the longest processing time first (LPT) scheduling heuristic [
26,
27]. In LPT scheduling, we are given a set of numbers and a positive integer
m, and the goal is to partition this set into
m subsets such that the largest sum of any subset (in terms of the values of its elements) is minimized. This problem is NP-hard because its decision version (the Partitioning Problem in 3) is NP-hard. The LPT scheduling heuristic is to order the elements of the set from largest to smallest and to iteratively place each element from this sorted list in the subset (of
m subsets) with the smallest sum so far, until all elements are placed. In our variation we have two sets, one for each of the pair of collections; that is, given the set
of units on which the pair, say
, of collections disagree (based on the pairing of supports between
and
), we first sort
in descending order of population according to both
and
separately. We then partition
into two parts
S and
T, representing
and
, respectively. This is an iterative process which considers the part with the currently lower population (breaking ties arbitrarily) and adds the next element to this part according to its ordering. For example, if part
S has the currently lower population, then
, and we would add the next largest element of
(according to
) to
S. The iteration terminates when all elements of
have been assigned to either
S or
T.
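For reference, the classical LPT scheduling heuristic described above can be sketched as follows. This covers only the standard single-ordering heuristic, not the two-ordering variation; the function name `lpt_partition` and the input values are illustrative assumptions.

```python
import heapq

def lpt_partition(values, m):
    """Classical LPT: place each value, largest first, into the part
    with the smallest sum so far."""
    parts = [[] for _ in range(m)]
    # Heap of (current sum, part index); the smallest sum is popped first.
    heap = [(0, idx) for idx in range(m)]
    for v in sorted(values, reverse=True):
        total, idx = heapq.heappop(heap)
        parts[idx].append(v)
        heapq.heappush(heap, (total + v, idx))
    return parts

lpt_parts = lpt_partition([20, 20, 10, 15], 2)
print(lpt_parts)  # [[20, 15], [20, 10]]
```

Here the largest part sums to 35, which is also the best possible for these four values, since the total of 65 cannot be split evenly.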
For example, consider the instance
depicted in
Figure 3, where the populations
of each unit
u in
of
are 20, 20, 10, and 15, respectively, while the populations
of each unit
u in
of
are 15, 15, 12, and 20, respectively. Here, the set
of units on which
disagree is
, annotated with the red dots in
Figure 3c. Note that
is currently ordered in reverse lexicographic order, starting from the lower left corner (
) and moving upward to the right row-by-row. By sorting this order in a stable way (the order of identical elements is not disturbed) according to
, it becomes
. By sorting this order in a stable way according to
, it becomes
. The iteration then takes the steps indicated by
Table 1, starting with empty parts
S and
T. After this process completes, the resulting parts
S and
T join (take their current color in) the corresponding supports
and
, respectively, producing the alignment. For example, the partitioning outlined in
Table 1 produces the alignment depicted in
Figure 3c, which is optimal.
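The two-ordering iteration can be sketched as below, under several assumptions that the text does not confirm: the unit labels `u1` to `u4` are hypothetical stand-ins for the units of the example, the population of part S is measured by the first collection's populations and that of part T by the second's, and ties go to S. The population values are those listed in the example, but the sketch is not guaranteed to reproduce the steps of Table 1.

```python
def greedy_two_part(units, p1, p2):
    """Greedy variation of LPT with one ordering per collection.

    p1[u] and p2[u] are the populations of unit u under the two collections.
    """
    # Python's sort is stable, so equal populations keep their input order.
    order_s = sorted(units, key=lambda u: -p1[u])
    order_t = sorted(units, key=lambda u: -p2[u])
    S, T = [], []
    pop_s = pop_t = 0
    assigned = set()
    while len(assigned) < len(units):
        if pop_s <= pop_t:  # assumption: ties are broken in favor of S
            u = next(x for x in order_s if x not in assigned)
            S.append(u)
            pop_s += p1[u]
        else:
            u = next(x for x in order_t if x not in assigned)
            T.append(u)
            pop_t += p2[u]
        assigned.add(u)
    return S, T

units = ["u1", "u2", "u3", "u4"]  # hypothetical labels
p1 = {"u1": 20, "u2": 20, "u3": 10, "u4": 15}
p2 = {"u1": 15, "u2": 15, "u3": 12, "u4": 20}
S, T = greedy_two_part(units, p1, p2)
print(S, T)  # ['u1', 'u2'] ['u4', 'u3']
```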
In general, our greedy approach does not produce an optimal solution to Problem 1; however, an upper bound on the quality of the solution
can be obtained based on known approximation factors for LPT scheduling [
26,
27]. Given some instance
of the Alignment Minimization Problem, let
. For example, from the instance mentioned above,
. For some set
of units, let
, the maximum values of the pairs
of populations represented by each
u. Note that our greedy approach obtains a partitioning by effectively applying LPT scheduling to
, where
is the set of units on which a pair
of collections disagree (see
Table 1); hence, we can bound its quality based on known bounds for LPT scheduling. It is known that applying LPT scheduling to a set guarantees a solution that is within a factor of
4/3 - 1/(3m) times the optimal (minimum) largest sum of any of the
m subsets [
26,
27]. Supposing that we partition
into a pair (
) of parts using LPT scheduling, let
A be the part with the larger sum and
the part with the larger sum in the optimal partitioning of
into two parts. It then follows that
Because the elements we are partitioning are indivisible, we know that
where
, a short form for the sum of all values in a set
X of values. It follows that
Let part
S be the set of units represented by part
A. The units of
S were chosen based on the largest values from
at the time, as represented by
A. The units of
S are used to transform collection
of supports into another collection
of supports, while the remaining units of
are used to transform collection
into
. It follows that
, where
, the minimum values of the pairs
of populations represented by each
u. Since
by design and since
A is the larger part, i.e.,
as well, it follows from Equation (
1) that
Since a typical instance
will contain many units
u which do not differ much in
and
, nor are
and
expected to differ by much, each collection
and
will typically contribute close to half of its weight to the alignment
.
There remain some small and final details to address in this heuristic. One detail is that the units of
on which
and
disagree cannot be placed into parts arbitrarily; rather, the parts must be such that swapping their units results in an alignment
with supports that are contiguous (see the Alignment Minimization Problem in 1). The example outlined in
Table 1 happens to create a contiguous set of supports, as depicted in
Figure 3c. However, if units
and
were assigned to part
S, for example, instead of to part
T, then the green support would not be contiguous. In a general instance, such a constraint only needs to be minded for each contiguous set
of units on which
and
disagree. For each such contiguous set, some small local shuffles could be applied to each ordering of
according to
and
, respectively. Another solution could be to apply the iteration to
S and
T as-is while skipping any greedy choice that violates contiguity. In any case, the iteration will be no worse than (unordered) list scheduling [
26]. In this case, it is known that applying list scheduling to a set guarantees a solution within a factor of
2 - 1/m times the optimal (minimum) largest sum of any of the
m subsets. Since m = 2
in this case, it follows that this factor is 3/2
, and the same analysis as above can be applied. Since there will be few such constraints in the typical instance and since they only apply to contiguous sets of units of
, which will be typically small, the solution is expected to be much closer to
(see Equation (
2)) than
in practice. The other detail is the unmatched supports (in, e.g.,
) associated with some support (e.g.,
). In this case, the support in the final alignment
that represents these will present another alignment subproblem within that support, where this support could be split into several parts. Since such supports are expected to be small in general, all ways to align this support could be tried. Nonetheless, a more systematic procedure for minding such constraints is the subject of future work, along with a more definite approximation factor.
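Skipping greedy choices that violate contiguity requires a contiguity test. A minimal sketch follows, assuming that units are cells of a grid identified by (row, column) pairs and that two units are adjacent when they share an edge. Both the representation and the name `is_contiguous` are assumptions made for illustration.

```python
from collections import deque

def is_contiguous(cells):
    """Return True if the grid cells form one edge-connected region."""
    cells = set(cells)
    if not cells:
        return True
    start = next(iter(cells))
    seen = {start}
    queue = deque([start])
    # Breadth-first search over edge-adjacent cells.
    while queue:
        r, c = queue.popleft()
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nb in cells and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == cells

print(is_contiguous({(0, 0), (0, 1), (1, 1)}))  # True
print(is_contiguous({(0, 0), (1, 1)}))          # False
```

A greedy step could call such a test on the support that would lose a unit and on the support that would gain it, and skip the choice if either test fails.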
5.2. The Alignment Minimization Problem in Two Dimensions
We now outline how to extend the techniques used in the heuristic of
Section 5.1 to a set
of (more than two) collections in two dimensions, i.e., to the Alignment Minimization Problem when
. We first need to match supports across all collections
in order to align them. This amounts to finding a maximum-weight perfect matching in a complete
k-uniform hypergraph across all (
k) collections of supports. We need the following definition of a shared units hypergraph, analogous to the shared units graph of Definition 1.
Definition 2 (Shared Units Hypergraph ). The shared units hypergraph for a set of collections of spatial supports is the weighted hypergraph , where each hyperedge has weight , with as the collection that support s belongs to.
The weight of a hyperedge
e of
is effectively the weighted overlap of the set of
k supports, one from each collection
, represented by
e in terms of the weighted overlap between each pair
of supports from
e. Finding a perfect matching in
is NP-hard [
28,
29]. This problem is a special case of the
k-set packing problem, which can be approximated within a factor of
times the optimal packing [
30,
31]. Since
k is typically a small constant (less than 10, for example), this bound is acceptable in practice. When the numbers of supports in the collections differ, a perfect matching
M (of size
) is found in the shared units hypergraph
, and each remaining unmatched support
s in any collection
is associated with one of the hyperedges in
M of maximum overlap with
s. Similarly to the case with a pair of collections, the hyperedge that
s joins should maintain a contiguous set of supports in
, the collection that support
s belongs to. Again, since
should typically be a constant, all combinations of hyperedges for
s to join could be tried to achieve contiguity; however, a more efficient algorithm for determining these choices is the subject of future work.
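For very small instances, the matching across k collections can be found by exhaustive search, as in the sketch below. It assumes that every collection has the same number n of supports and that the weight of a hyperedge is the sum of the pairwise overlap weights of its supports. The latter is one reading of Definition 2, not a confirmed one. The function name and the example weights are hypothetical.

```python
from itertools import combinations, permutations, product

def best_hypergraph_matching(pair_w, k, n):
    """Exhaustive maximum-weight perfect matching over k collections.

    pair_w[(a, b)][i][j] is the overlap weight between support i of
    collection a and support j of collection b, for a < b.
    """
    def edge_weight(edge):
        # Assumed hyperedge weight: sum over all pairs of its supports.
        return sum(pair_w[(a, b)][edge[a]][edge[b]]
                   for a, b in combinations(range(k), 2))

    best_edges, best_total = None, float("-inf")
    # Fix the order of collection 0 and permute every other collection.
    for perms in product(permutations(range(n)), repeat=k - 1):
        edges = [(i,) + tuple(p[i] for p in perms) for i in range(n)]
        total = sum(edge_weight(e) for e in edges)
        if total > best_total:
            best_edges, best_total = edges, total
    return best_edges, best_total

pair_w = {
    (0, 1): [[5, 1], [1, 5]],
    (0, 2): [[1, 4], [4, 1]],
    (1, 2): [[1, 4], [4, 1]],
}
hyper_edges, hyper_total = best_hypergraph_matching(pair_w, k=3, n=2)
print(hyper_edges, hyper_total)  # [(0, 0, 1), (1, 1, 0)] 26
```

The search space grows as (n!)^(k-1), so this is only a way to check small cases against an approximation algorithm.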
Analogously to the case with a pair of collections (of
Section 5.1), to match sets of supports across collections is to determine how the trading procedure for aligning collections
operates. In particular, each set of supports from matching
M in
(with the extra unmatched supports joined later) swaps units with its respective neighbors until they are aligned. As in the case with pairs, the matching
M gives rise to a set
of units on which some pair
of collections disagree. Each such unit must be assigned to some support (in
M) in a way that minimizes overall cost. Aligning the units of
in this way is again a type of partitioning problem, which could also be approximated using LPT scheduling; however, a slightly more general partitioning problem is more appropriate in this case. In particular, this case is more closely related to the problem of fair item allocation [
32] with additive preferences [
33] and positively valued goods. Note that there also exist versions with negatively weighted goods or chores [
34].
The input to this problem is a set
N of
agents and a set
M of
items. We use the elements
of a set
N and its corresponding indices
interchangeably when the context is clear. Each agent
attaches a value
to item
, where
. We also overload the meaning of
v for subsets
, where
, since the values are additive. Let
be the collection of all partitionings of set
M into
n parts. The goal is to find a partitioning in
that gives each agent their fairest share of value from the items. A common formalization for this is the maximin share [
35] of agent
from a set
M of items, which is
The idea is that if agent
divides items
M into
n parts, and the other agents then choose how these
n parts are distributed among the
n agents, then agent
i would partition the items such that the value
of the smallest part
is maximized. In fair item allocation, the goal is to partition the items such that each agent
has a value that is as close to their maximin share
as possible. An important approximation result is that a partitioning
which satisfies
can be found in polynomial time [
35].
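The maximin share of a single agent can be computed by brute force on small inputs. The sketch below uses the standard definition: the maximum, over all partitionings of the items into n parts, of the value of the least valuable part. The function name is an assumption, item values are assumed to be non-negative, and the running time is exponential in the number of items.

```python
from itertools import product

def maximin_share(values, n):
    """Brute-force maximin share of one agent with additive values.

    values[j] is the agent's (non-negative) value for item j.
    """
    best = 0
    # Try every way of assigning each item to one of the n parts.
    for assignment in product(range(n), repeat=len(values)):
        sums = [0] * n
        for v, part in zip(values, assignment):
            sums[part] += v
        best = max(best, min(sums))
    return best

mms = maximin_share([20, 20, 10, 15], 2)
print(mms)  # 30
```

With these values, the best partitioning is {20, 10} and {20, 15}, whose smaller part is worth 30.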
Our problem of aligning each unit of
is closely related to this problem, in that each collection
is an agent and each unit
is an item that gets assigned to some collection when aligned, where
. The only difference is that the collection
to which
u is assigned avoids the cost
, while every other collection
incurs (at most) its corresponding cost
. Since we want to minimize the maximum cost to any collection (see Problem 1), for each collection
, given set
of units, we aim to minimize
Note that this is equivalent to
where
. Since
does not depend on the partition chosen from
, it follows that
Pulling the minus sign through to the front, it follows that
We can then substitute the left-hand side of Equation (
3) with the right-hand side to obtain
Then, based on the result of Equation (
4), it follows that a partitioning
which satisfies
can be found in polynomial time.
Placing this result in the notation of our problem, where
, and
, it follows that
, where
are the units from
assigned to collection
in the alignment
represented by partitioning
and the meaning of
has been overloaded for sets, where
. Observe that
. Then, it follows from Equation (
7) that
where
is the set of units on which some pair of collections of
disagree and
is the maximin share of collection
from set
of units, where the value of a unit is
. This guarantees a bound on
(see Problem 1) which can be obtained in polynomial time. The approximation result of Equation (
4) from [
35] relies on a complex preprocessing step from [
36] to guarantee this theoretical bound. However, it is practical to use a more straightforward approach based on the envy graph procedure [
34,
35,
37], which is a common approach used for fair item allocation. Such an approach iterates through the items, assigning them to agents. If an envy cycle (a directed cycle on a set of agents where each agent places more value on the intermediate set of items of its neighbor) ever arises during this process, then this cycle is broken by shifting this cycle one step in the opposite direction. This process continues until all items are assigned to some agent. While there are many theoretical results in this area of fair item allocation, there exist some practical results (such as Spliddit [
38], based on theoretical results in [
39]). As in the case of pairs in two dimensions (
Section 5.1), some details remain, namely maintaining contiguity and managing the unmatched supports that are associated after the matching is computed. We plan to use or follow these ideas in devising a practical algorithm for our problem. An efficient implementation addressing all of these details is the subject of future work.
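A minimal sketch of the envy graph procedure for additive, positively valued goods follows. It is the generic textbook procedure, not an implementation of the alignment heuristic: it ignores contiguity and unmatched supports, and the names `envy_cycle_allocation` and `values` are assumptions. Each item goes to an agent that no other agent envies. When no such agent exists, an envy cycle is located and each agent on it takes the bundle of the agent it envies.

```python
def envy_cycle_allocation(values):
    """values[i][j] is the value that agent i places on item j."""
    n, m = len(values), len(values[0])
    bundles = [[] for _ in range(n)]

    def val(i, bundle):
        return sum(values[i][j] for j in bundle)

    def envies(i, k):
        return val(i, bundles[k]) > val(i, bundles[i])

    def unenvied_agents():
        return [k for k in range(n)
                if not any(envies(i, k) for i in range(n) if i != k)]

    def find_cycle():
        # Every agent is envied here, so walking backwards along envy
        # edges must eventually revisit an agent, closing a cycle.
        path = [0]
        while True:
            cur = path[-1]
            p = next(i for i in range(n) if i != cur and envies(i, cur))
            if p in path:
                return path[path.index(p):]
            path.append(p)

    for item in range(m):
        free = unenvied_agents()
        while not free:
            cyc = find_cycle()  # cyc[t + 1] envies cyc[t]; cyc[0] envies cyc[-1]
            old = {a: bundles[a] for a in cyc}
            for t, a in enumerate(cyc):
                bundles[a] = old[cyc[t - 1]]  # take the envied agent's bundle
            free = unenvied_agents()
        bundles[free[0]].append(item)
    return bundles

# Two agents (collections) and four items (units), with illustrative values.
allocation = envy_cycle_allocation([[20, 20, 10, 15], [15, 15, 12, 20]])
print(allocation)  # [[0, 2], [1, 3]]
```

The resulting allocation is envy-free up to one item, which is the standard guarantee of this procedure for additive valuations.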