3.1.1. Parameterized Intractability of Biclique and Applications to Parameterized Inapproximability
In this subsubsection, we will discuss the parameterized inapproximability of the one-sided biclique problem, and show how both that result and its proof technique lead to further inapproximability results.
We begin our discussion by formally stating the k-Biclique problem: we are given as input a graph G and an integer k, and the goal is to determine whether G contains a complete bipartite subgraph with k vertices on each side. The complexity of k-Biclique was a long-standing open problem and was resolved only recently by Lin [23], who showed that it is W[1]-hard. In fact, he showed a much stronger result, and this shall be the focus of attention in this subsubsection.
Theorem 1 ([23]). Given a bipartite graph G = (L ∪ R, E) and an integer s as input, it is W[1]-hard to distinguish between the following two cases: in the completeness case, some s vertices in L have many common neighbors in R, while in the soundness case, every s vertices in L have few common neighbors in R (we omit the precise thresholds here).
We shall refer to the gap problem in the above theorem as the One-Sided k-Biclique problem. To prove the above result, Lin introduced a technique which we shall refer to as Gadget Composition. The gadget composition technique has found more applications since [23]. We provide below a failed approach (given in [23]) to prove the above theorem; nonetheless, it gives us good insight into how the gadget composition technique works.
Suppose we can construct a family T = {T_1, …, T_n} of subsets of a ground set, for some integers k, n, s, and ℓ, such that: (Property 1) the intersection of any k+1 distinct sets in the family has size at most ℓ, and (Property 2) the intersection of any k distinct sets in the family is much larger than ℓ.
Then we can combine T with an instance of k-Clique to obtain a gap instance of One-Sided k-Biclique as follows. Given a graph G and parameter k, with s = k(k−1)/2, we construct our instance of One-Sided k-Biclique, say H = (L ∪ R, E'), by setting L = E(G) and R to be the ground set of T, where for any e = {u, v} ∈ L and a ∈ R, we have that (e, a) ∈ E' if and only if a ∈ T_u ∩ T_v. It is easy to check that if G has a k-vertex clique, say {v_1, …, v_k}, then Property 2 implies that T_{v_1} ∩ ⋯ ∩ T_{v_k} is large. It follows that the set of s vertices in L given by the edges of the clique are neighbors of every vertex in T_{v_1} ∩ ⋯ ∩ T_{v_k}. On the other hand, if G contains no k-vertex clique, then any s distinct vertices in L (i.e., s edges in G) must have at least k+1 vertices in G as their end points. Say S was the set of all vertices contained in the s edges. By Property 1 (applied to any k+1 vertices of S), the sets {T_u : u ∈ S} have at most ℓ elements in common, and thus any s distinct vertices in L have at most ℓ common neighbors in R.
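To make the composition concrete, here is a small Python sketch of the reduction skeleton. The set family T is left abstract in the discussion above; the toy family used in the usage example below is purely illustrative and does not have the required extremal properties.

```python
def build_one_sided_biclique_instance(edges, T):
    """Gadget composition (sketch): left vertices are the edges of G,
    right vertices are the ground set of the family T = {T_v : v in V(G)},
    and (e, a) is an edge iff a lies in T_u ∩ T_v for e = {u, v}."""
    L = [frozenset(e) for e in edges]
    R = sorted(set().union(*T.values()))
    E = {(e, a) for e in L for a in R if all(a in T[v] for v in e)}
    return L, R, E

def common_neighbors(L_subset, R, E):
    """Common neighborhood in R of a set of left vertices; for the edges
    of a k-clique this is the intersection of the k sets T_v."""
    return [a for a in R if all((e, a) in E for e in L_subset)]
```

For the edges of a clique {v_1, …, v_k}, the common neighborhood computed by `common_neighbors` is exactly T_{v_1} ∩ ⋯ ∩ T_{v_k}, which is where Properties 1 and 2 of the family enter the argument.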
It is indeed very surprising that this technique can yield non-trivial inapproximability results, as the gap is essentially produced from the gadget and is oblivious to the input! This also stands in stark contrast to the PCP theorem and hardness of approximation results in NP, where all known results were obtained by global transformations of the input. The key difference between the parameterized and NP worlds is the notion of locality. For example, consider the k-Clique problem: if a graph does not have a clique of size k, then given any k vertices, a random vertex pair among these k vertices fails to be an edge with probability at least 1/(k(k−1)/2). It is in principle possible to compose the input graph with a simple error correcting code to amplify this probability to a constant, as we are allowed to blow up the input size by any function of k. In contrast, in the NP world, where k is not fixed and can be of the same magnitude as the input size, we are only allowed to blow up the input size by a polynomial factor. Nonetheless, we have to point out that the gadgets typically needed to make the gadget composition technique work must be extremely rich in combinatorial structure (they are typically constructed from random objects or algebraic objects), and were previously studied extensively in the area of extremal combinatorics.
Returning to the reduction above from k-Clique to One-Sided k-Biclique, it turns out that we do not know how to construct the set system T with the stated properties, and hence the reduction does not pan out. Nonetheless, Lin constructed a variant of T, in which Property 2 was more refined, and the reduction from k-Clique to One-Sided k-Biclique went through with slightly more effort.
Before we move on to discussing some applications of Theorem 1 and the gadget composition technique, we remark on the known stronger time lower bound for One-Sided k-Biclique under stronger running time hypotheses. Lin [23] showed a lower bound of n^{Ω(√k)} for One-Sided k-Biclique assuming ETH. We wonder if this can be further improved.
Open Question 1 (Lower bound of One-Sided k-Biclique under ETH and SETH). Can the running time lower bound on One-Sided k-Biclique be improved to n^{Ω(k)} under ETH? Can it be improved to n^{k−o(1)} under SETH?
We remark that a direction to address the above question was detailed in [24]. While on the topic of the k-Biclique problem, it is worth noting that the lower bound of n^{Ω(√k)} for One-Sided k-Biclique assuming ETH yields a running time lower bound of n^{Ω(√k)} for the k-Biclique problem (due to the soundness parameters in Theorem 1). However, assuming randomized ETH, the running time lower bound for the k-Biclique problem can be improved to n^{Ω(k)}. Can this improved running time lower bound be obtained just under (deterministic) ETH? Finally, we remark that we shall discuss the hardness of approximation of the k-Biclique problem in Section 3.2.3.
Inapproximability of k-Dominating Set via Gadget Composition.
We shall discuss the inapproximability of k-Dominating Set in detail in the next subsubsection. Here, we would simply like to highlight how the above framework was used by Chen and Lin [25] and Lin [26] to obtain inapproximability results for k-Dominating Set. In [25], the authors, starting from Theorem 1, obtain the W[1]-hardness of approximating k-Dominating Set to a factor of almost two. Then they amplify the gap to any constant by using a specialized graph product.
We now turn our attention to a recent result of Lin [26], who provided a strong inapproximability result for k-SetCover (we refer the reader to Section 3.1.2 for the context of this result). Lin's proof of the inapproximability of k-SetCover is a one-step reduction from an instance of k-SetCover on a universe of size O(log n) (where n is the number of subsets given in the collection) to an instance of k-SetCover on a much larger universe, now with a gap. Lin then uses this gap-producing self-reduction to provide running time lower bounds (under different time hypotheses) for approximating k-SetCover. Recall that k-SetCover is essentially equivalent [27] to k-Dominating Set.
Elaborating, Lin designs a gadget by combining the hypercube partition gadget of Feige [29] with a derandomizing combinatorial object called a universal set to obtain a gap gadget, and then combines the gap gadget with the input k-SetCover instance (on a small universe but with no gap) to obtain a gap k-SetCover instance. This is another success story of the gadget composition technique.
Finally, we remark that Lai [30] recently extended Lin's inapproximability results for dominating set (using the same proof framework) to rule out even constant-depth circuits of size f(k)·n^{o(k)}, for any computable function f.
Even Set. A recent success story of Theorem 1 is its application to resolve a long-standing open problem called the k-Minimum Distance Problem (also referred to as k-Even Set), where we are given as input a generator matrix A of a binary linear code and an integer k, and the goal is to determine whether the code has distance at most k. Recall that the distance of the code generated by A is min_{x ≠ 0} ‖Ax‖_0, where ‖·‖_0 denotes the 0-norm (aka the Hamming norm).
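For intuition, the distance of a small code can be computed by brute force directly from this definition. The helper below is an illustrative sketch (exponential in the message length, so usable only on toy codes):

```python
from itertools import product

def code_distance(A, q=2):
    """Brute-force distance of the linear code generated by A over F_q:
    the minimum Hamming weight of A·x over nonzero messages x.
    Exponential in the message length; for illustration only."""
    m = len(A[0])  # message length (number of columns of A)
    best = None
    for x in product(range(q), repeat=m):
        if any(x):  # skip the zero message
            # Hamming weight of A·x mod q
            w = sum(sum(r * xi for r, xi in zip(row, x)) % q != 0 for row in A)
            if best is None or w < best:
                best = w
    return best
```

For example, the length-3 repetition code has distance 3, while a [3,2] code with a single parity-check bit has distance 2.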
In [31], the authors showed that k-Even Set is W[1]-hard under randomized reductions. The result was obtained by starting from the inapproximability result stated in Theorem 1, followed by a series of intricate reductions. In fact, they proved the following stronger inapproximability result.
Theorem 2 ([31]). For any γ ≥ 1, given input (A, k), it is W[1]-hard (under randomized reductions) to distinguish between
Completeness: the distance of the code generated by A is at most k, and,
Soundness: the distance of the code generated by A is more than γ·k.
We emphasize that even to obtain the W-hardness of k-Even Set (with no gap), they needed to start from the gap problem given in Theorem 1.
The proof of the above theorem proceeds by first showing FPT hardness of approximation of the non-homogeneous variant of the k-Minimum Distance Problem, called the k-Nearest Codeword Problem. In the k-Nearest Codeword Problem, we are given a target vector y in addition to (A, k), and the goal is to determine whether there is any x such that the Hamming norm of Ax − y is at most k. As an intermediate step of the proof of Theorem 2, they showed that the k-Nearest Codeword Problem is W[1]-hard to approximate to any constant factor.
An important intermediate problem studied in [31] to prove the inapproximability of the k-Nearest Codeword Problem was the k-Linear Dependent Set problem, where given a set A of n vectors over a finite field F and an integer k, the goal is to decide if there are k vectors in A that are linearly dependent. They ruled out constant factor approximation algorithms for this problem running in FPT time. Summarizing, the high-level proof overview of Theorem 2 follows by reducing One-Sided k-Biclique to k-Linear Dependent Set, which is then reduced to the k-Nearest Codeword Problem, followed by a final randomized reduction to the k-Minimum Distance Problem.
Finally, we note that there is no reason to define the k-Minimum Distance Problem only for binary codes; it can instead be defined over larger fields as well. It turns out that the techniques of [31] cannot rule out FPT algorithms for the k-Minimum Distance Problem over F_p when p is fixed and is not part of the input. Thus we have the following open problem.
Open Question 2. Is it W[1]-hard to decide the k-Minimum Distance Problem over F_p with p > 2, when p is fixed and is not part of the input?
Shortest Vector Problem.
Theorem 1 (or, more precisely, the constant inapproximability of k-Linear Dependent Set stated above) was also used to resolve the complexity of the parameterized k-Shortest Vector Problem in lattices, where the input (in the ℓ_p norm) is an integer k and a matrix A representing the basis of a lattice, and we want to determine whether the shortest (non-zero) vector in the lattice has length at most k, i.e., whether min_{x ≠ 0} ‖Ax‖_p ≤ k. Again, k is the parameter of the problem. It should also be noted here that (as in [32]) we require the basis of the lattice to be integer valued, which is sometimes not enforced in the literature (e.g., [33]). This is because, if A is allowed to be any real matrix, then the parameterization is meaningless: we can simply scale A down by a large multiplicative factor.
In [32], the authors showed that the k-Shortest Vector Problem (in the ℓ_p norm for any p > 1) is W[1]-hard under randomized reductions. In fact, they proved the following stronger inapproximability result.
Theorem 3 ([32]). For any p > 1, there exists a constant γ_p > 1 such that given input (A, k), it is W[1]-hard (under randomized reductions) to distinguish between
Completeness: the ℓ_p norm of the shortest vector of the lattice generated by A is at most k, and,
Soundness: the ℓ_p norm of the shortest vector of the lattice generated by A is more than γ_p·k.
Notice that Theorem 2 rules out FPT approximation algorithms with any constant approximation ratio for k-Even Set. In contrast, the above result only proves FPT inapproximability with some constant ratio for the k-Shortest Vector Problem in the ℓ_p norm for p > 1. As with k-Even Set, even to prove the W[1]-hardness of the k-Shortest Vector Problem (with no gap), they needed to start from the gap problem given in Theorem 1.
The proof of the above theorem proceeds by first showing FPT hardness of approximation of the non-homogeneous variant of the k-Shortest Vector Problem, called the k-Nearest Vector Problem. In the k-Nearest Vector Problem, we are given a target vector y in addition to (A, k), and the goal is to determine whether there is any x such that the ℓ_p norm of Ax − y is at most k. As an intermediate step of the proof of Theorem 3, they showed that the k-Nearest Vector Problem is W[1]-hard to approximate to any constant factor. Summarizing, the high-level proof overview of Theorem 3 follows by reducing One-Sided k-Biclique to k-Linear Dependent Set, which is then reduced to the k-Nearest Vector Problem, followed by a final randomized reduction to the k-Shortest Vector Problem.
An immediate question left open by their work is whether Theorem 3 can be extended to the k-Shortest Vector Problem in the ℓ_1 norm. In other words,
Open Question 3 (Approximation of k-Shortest Vector Problem in the ℓ_1 norm). Is the k-Shortest Vector Problem in the ℓ_1 norm in FPT?
3.1.2. Parameterized Inapproximability of Dominating Set
In the k-Dominating Set problem, we are given an integer k and a graph G on n vertices as input, and the goal is to determine if there is a dominating set of size at most k. It was a long-standing open question whether there is an algorithm which runs in T(k)·poly(n) time (i.e., FPT-time) that finds a dominating set of size at most F(k)·k whenever the graph G has a dominating set of size k, for any computable functions T and F.
The first non-trivial progress on this problem was by Chen and Lin [25], who ruled out the existence of such algorithms (under W[1] ≠ FPT) for all constant functions F (that is, F(k) = c, where c is any universal constant). We discussed their proof technique in the previous subsubsection. A couple of years later, Karthik C. S. et al. [35] completely settled the question, by ruling out the existence of such an algorithm (under W[1] ≠ FPT) for any computable function F. Thus, k-Dominating Set was shown to be totally inapproximable. We elaborate on their proof below.
Theorem 4 ([35]). Let F be any computable function. Given an instance (G, k) of k-Dominating Set as input, it is W[1]-hard to distinguish between the following two cases:
Completeness: G has a dominating set of size at most k, and,
Soundness: every dominating set of G has size more than F(k)·k.
The overall proof follows by reducing k-Multicolor Clique to the gap k-Dominating Set with parameters as given in the theorem statement. In the k-Multicolor Clique problem, we are given an integer k and a graph G on vertex set V_1 ∪ ⋯ ∪ V_k as input, where each V_i is an independent set of cardinality n, and the goal is to determine if there is a clique of size k in G. Following a straightforward reduction from the k-Clique problem, it is fairly easy to see that k-Multicolor Clique is W[1]-hard.
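The straightforward reduction just mentioned can be sketched in a few lines of Python (adjacency is given as a dict from vertex to neighbor set; the helper name is ours):

```python
def clique_to_multicolor(adj, k):
    """k-Clique -> k-Multicolor Clique: take k copies of V(G), one per
    color i; join (i, u) and (j, v) for i != j iff uv is an edge of G.
    Each color class is an independent set, and the new graph has a
    multicolored k-clique iff G has a k-clique."""
    return {(i, u): {(j, v) for j in range(k) if j != i for v in adj[u]}
            for i in range(k) for u in adj}
```

A multicolored clique here picks vertices u_1, …, u_k that are pairwise adjacent in G (a vertex cannot repeat across colors, since G has no self-loops), i.e., a k-clique in G.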
The reduction from k-Multicolor Clique to the gap k-Dominating Set proceeds in two steps. In the first step, we reduce k-Multicolor Clique to k-Gap CSP; this is the step where we generate the gap. In the second step, we reduce k-Gap CSP to gap k-Dominating Set. This step is fairly standard and mimics ideas from Feige's proof of the NP-hardness of approximating the Max Coverage problem [29].
Before we proceed with the details of the above two steps, let us introduce a small technical tool from coding theory that we will need. We need codes known in the literature as good codes; these are binary error correcting codes whose rate and relative distance are both constants bounded away from 0 (see [36] (Appendix E.1.2.5) for definitions). An encoding of {0,1}^b is an injective function C : {0,1}^b → {0,1}^ℓ; the encoding is said to be efficient if C(x) can be computed in poly(b) time for any x. The reader may think of good codes as follows: for every b ∈ ℕ, we say that C : {0,1}^b → {0,1}^ℓ is a good code if (i) ℓ ≤ c·b, for some universal constant c, and (ii) for any distinct x, y ∈ {0,1}^b, we have that C(x) and C(y) have different values on at least a δ fraction of coordinates, for some universal constant δ > 0. Let us fix F as in the theorem statement; the remaining parameters below are chosen in terms of F.
From k-Multicolor Clique to k-Gap CSP. Starting from an instance of k-Multicolor Clique, say G on vertex set V_1 ∪ ⋯ ∪ V_k, we write down a set of constraints Φ on a variable set X = {x_{i,j} : i, j ∈ [k], i ≠ j} as follows. For every i, j ∈ [k] such that i ≠ j, define E_{i,j} to be the set of all edges in G whose end points are in V_i and V_j. An assignment to the variable x_{i,j} is an element of E_{i,j}, i.e., a pair of vertices, one from V_i and the other from V_j. Suppose that x_{i,j} was assigned the edge {u, v}, where u ∈ V_i and v ∈ V_j. Then we define the assignment of x_{i,j} projected to i to be u, and the assignment of x_{i,j} projected to j to be v. We define Φ = {φ_i : i ∈ [k]}, where the constraint φ_i is defined to be satisfied if the assignments of all of {x_{i,j} : j ≠ i} projected to i are the same. We refer to the problem of determining if there is an assignment to the variables in X such that all the constraints are satisfied as the k-CSP problem. Notice that while this is a natural way to write k-Multicolor Clique as a CSP, where we have tried to check that all variables having a vertex in common agree on its assignment, there is no gap yet in the k-CSP problem. In particular, if there is a clique of size k in G, then there is an assignment to the variables of X (by assigning the edges of the clique in G to the corresponding variables in X) such that all the constraints in Φ are satisfied; however, if every clique in G has size less than k, then an assignment to the variables of X may violate only one constraint in Φ (and not more).
In order to amplify the gap, we rewrite the set of constraints Φ in a different way to obtain the set of constraints Φ', on the same variable set X, as follows. Recall that |V_i| = n, and therefore we can label all vertices in V_i by vectors in {0,1}^{log n}. Suppose that x_{i,j} was assigned the edge {u, v}, where u ∈ V_i and v ∈ V_j; then for r ∈ [log n], we define the assignment of x_{i,j} projected to (i, r) to be the r-th coordinate of u. We define Φ' = {φ_{i,r} : i ∈ [k], r ∈ [log n]}, where the constraint φ_{i,r} is defined to be satisfied if and only if the assignments of all of {x_{i,j} : j ≠ i} projected to (i, r) are the same. Again notice that there is an assignment to the variables of X such that all the constraints in Φ' are satisfied if and only if the same assignment also satisfies all the constraints in Φ.
However, rewriting Φ as Φ' allows us to simply apply the error correcting code to the constraints in Φ' to obtain a gap! In particular, we choose a good code C : {0,1}^{log n} → {0,1}^ℓ with relative distance δ, where ℓ = O(log n). Consider a new set of constraints Φ'', on the same variable set X, as follows. For any vertex u and r ∈ [ℓ], we denote by C(u)_r the r-th coordinate of C(u). We define Φ'' = {φ'_{i,r} : i ∈ [k], r ∈ [ℓ]}, where the constraint φ'_{i,r} is defined to be satisfied if and only if the following holds: the values C(u)_r, where u ranges over the V_i end points of the edges assigned to the variables {x_{i,j} : j ≠ i}, are all the same.
Notice, as before, that there is an assignment to the variables of X such that all the constraints in Φ'' are satisfied if and only if the same assignment also satisfies all the constraints in Φ. However, for every assignment to X that violates at least one constraint in Φ, we have that the same assignment violates at least a δ/k fraction of the constraints in Φ''. To see this, consider an assignment that violates the constraint φ_i in Φ. This implies that there are some j, j' ≠ i such that the assignments of x_{i,j} and x_{i,j'} projected to i are not the same; say they are u and u', respectively, with u ≠ u', where we think of u and u' as bit vectors in {0,1}^{log n}. Let S ⊆ [ℓ] be such that r ∈ S if and only if C(u)_r ≠ C(u')_r. By the distance of the code, we have that |S| ≥ δ·ℓ. Finally, notice that for all r ∈ S, the assignment does not satisfy the constraint φ'_{i,r} in Φ''. We refer to the problem of distinguishing whether there is an assignment to X such that all the constraints are satisfied, or every assignment to X does not satisfy a constant fraction of the constraints, as the k-Gap CSP problem.
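The role of the code in this step can be seen in a few lines of Python. The good code above is left abstract; as a concrete stand-in with guaranteed relative distance 1/2 (though exponentially poor rate, so this is only an illustration, not a good code) we can use a Hadamard encoding:

```python
from itertools import product

def hadamard_encode(x):
    """Hadamard encoding of a bit-vector x: the list of all parities
    <x, y> mod 2 over y in {0,1}^len(x). Distinct inputs disagree on
    exactly half of the coordinates, i.e., relative distance 1/2."""
    return [sum(a * b for a, b in zip(x, y)) % 2
            for y in product(range(2), repeat=len(x))]

def violated_fraction(label_u, label_v):
    """Fraction of per-coordinate equality constraints violated when two
    variables that should agree get labels label_u and label_v."""
    cu, cv = hadamard_encode(label_u), hadamard_encode(label_v)
    return sum(a != b for a, b in zip(cu, cv)) / len(cu)
```

Equal labels violate no coordinate checks, while any two distinct labels violate half of them, no matter in how few bits the labels differ; this is exactly how the encoding turns "one violated constraint" into "a constant fraction of violated constraints".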
In order to rule out F(k)-approximation FPT algorithms for k-Dominating Set, we will need that every assignment to X that violates at least one constraint in Φ violates at least an ε fraction of the constraints of the final constraint system (instead of just δ/k; note that ε needs to be very close to 1, whereas δ can be at most half). To boost the gap [37], we apply a simple repetition/direct-product trick to our constraint system. Starting from Φ'', we construct a new set of constraints Φ''', on the same variable set X, as follows. Let Φ''' = {φ''_T : T ∈ ([k] × [ℓ])^t}. For every T = ((i_1, r_1), …, (i_t, r_t)), we define φ''_T to be satisfied if and only if for all a ∈ [t], the constraint φ'_{i_a, r_a} is satisfied.
It is easy to see that Φ'' and Φ''' have the same set of completely satisfying assignments. However, for every assignment to X that violates an ε fraction of the constraints in Φ'', we have that the same assignment violates at least a 1 − (1 − ε)^t fraction of the constraints in Φ'''. To see this, consider an assignment that violates an ε fraction of the constraints in Φ''; say it violates all the constraints φ'_{i,r} with (i, r) ∈ B, for some B ⊆ [k] × [ℓ] of size ε·k·ℓ. This implies that the assignment satisfies the constraint φ''_T if and only if no coordinate of T lies in B. This implies that the fraction of constraints in Φ''' that the assignment can satisfy is upper bounded by (1 − ε)^t.
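The effect of the repetition trick is easy to verify numerically. The sketch below (an illustrative helper of ours) takes a list of booleans recording which base constraints are satisfied and measures the satisfied fraction over all t-tuples:

```python
from itertools import product

def boosted_satisfied_fraction(sat, t):
    """Direct-product/repetition trick: a t-tuple constraint is satisfied
    iff every constituent base constraint is. If an eps fraction of the
    base constraints is violated, the satisfied fraction of the tuple
    constraints is exactly (1 - eps)**t."""
    m = len(sat)
    good = sum(all(sat[i] for i in tup)
               for tup in product(range(m), repeat=t))
    return good / m ** t
```

For instance, with a 1/4 fraction of base constraints violated and t = 3, only (3/4)^3 ≈ 0.42 of the tuple constraints remain satisfied.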
From k-Gap CSP to gap k-Dominating Set.
In the second part, starting from the aforementioned instance of k-Gap CSP (after boosting the gap), we construct an instance H of gap k-Dominating Set. The construction is due to Feige [29] and it proceeds as follows. Let F be a carefully chosen family of functions (we omit its precise definition here). The graph H is on vertex set B ∪ F, where B is the set of all possible assignments to single variables, i.e., B is simply the edge set of G. We introduce an edge between all pairs of vertices in B, so that B forms a clique. We introduce an edge between a vertex of B and a function in F if and only if a certain consistency condition, which we omit here, holds.
Notice that the number of vertices in H is g(k)·poly(n), for some computable function g. It is not hard to check that the following hold:
(Completeness) If there is an assignment to X that satisfies all constraints in Φ''', then the corresponding vertices in B dominate all vertices in the graph H.
(Soundness) If each assignment can only satisfy a (1 − ε)^t fraction of the constraints in Φ''', then any dominating set of H has size more than F(k)·k.
We skip presenting the details of this part of the proof here; they have been derived many times in the literature, and if needed, the reader may refer to Appendix A of [35]. This completes our sketch of the proof of Theorem 4.
A few remarks are in order. First, the k-Gap CSP problem described in the proof above is formalized as the k-MaxCover problem in [35] (and was originally introduced in [39]). In particular, the formalism of k-MaxCover (which may be thought of as the parameterized label cover problem) is generic enough to be used as an intermediate gap problem to reduce to both k-Dominating Set (as in [35]) and k-Clique (as in [39]). Moreover, it was robust enough to capture stronger running time lower bounds (under stronger hypotheses); this will be elaborated below. However, in order to keep the above proof succinct, we skipped introducing the k-MaxCover problem and worked with k-Gap CSP, which was sufficient for the above proof.
Second, Karthik C. S. et al. [35] additionally showed the following, for every computable functions T, F and every constant ε > 0:
Assuming the Exponential Time Hypothesis (ETH), there is no F(k)-approximation algorithm for k-Dominating Set that runs in T(k)·n^{o(k)} time.
Assuming the Strong Exponential Time Hypothesis (SETH), for every integer k ≥ 2, there is no F(k)-approximation algorithm for k-Dominating Set that runs in T(k)·n^{k−ε} time.
In order to establish Theorem 4 and the above two results, Karthik C. S. et al. [35] introduced a framework to prove parameterized hardness of approximation results. In this framework, the objective is to start from either the W[1] ≠ FPT hypothesis, ETH, or SETH, and end up with the gap k-Dominating Set problem; i.e., they design reductions from instances of k-Clique, 3-CNF-SAT, and ℓ-CNF-SAT, respectively, to an instance of gap k-Dominating Set. A prototype reduction in this framework has two modular parts. In the first part, which is specific to the problem they start from, they generate a gap and obtain hardness of gap k-MaxCover. In the second part, they show a gap-preserving reduction from gap k-MaxCover to gap k-Dominating Set, which is essentially the same as the reduction from k-Gap CSP to gap k-Dominating Set in the proof of Theorem 4.
The first part of a prototype reduction, from the computational problem underlying a hypothesis of interest to gap k-MaxCover, follows by the design of an appropriate communication protocol. In particular, the computational problem is first reduced to a constraint satisfaction problem (CSP) over k (or some function of k) variables over an alphabet of size n. The predicate of this CSP depends on the computational problem underlying the hypothesis from which we started. Generalizing ideas from [40], they then show how a protocol computing this predicate in the multiparty (number of players equals the number of variables of the CSP) communication model can be combined with the CSP to obtain an instance of gap k-MaxCover. For example, for the W[1] ≠ FPT hypothesis and ETH, the predicate is a variant of the equality function, and for SETH, the predicate is the well-studied disjointness function. The completeness and soundness of the protocols computing these functions translate directly to the completeness and soundness of the k-MaxCover instance.
Third, we recall that Lin [26] recently provided alternate proofs of Theorem 4 and the above-mentioned stronger running time lower bounds. While we discussed his proof technique in Section 3.1.1, we would like to discuss his result here. With the right setting of parameters in the proof of Theorem 4, we can obtain that approximating k-Dominating Set to a factor of the form (log n)^{1/h(k)} is W[1]-hard, for some large computable function h. Lin improved the exponent of log n in this approximation factor. Can this inapproximability be further improved? On the other hand, can we do better than the simple polynomial time greedy algorithm, which provides a (1 + ln n) factor approximation? This leads us to the following question:
Open Question 4 (Tight inapproximability of k-Dominating Set). Is there an o(log n) factor approximation algorithm for k-Dominating Set running in FPT time?
We conclude the discussion on k-Dominating Set with an open question on W[2]-hardness of approximation. As noted earlier, k-Dominating Set is a W[2]-complete problem, and Theorem 4 shows that the problem is W[1]-hard to approximate to any factor. However, is there some computable function F for which approximating k-Dominating Set to a factor of F(k) is W[2]-hard? In other words, we have:
Open Question 5 (W[2]-completeness of approximating k-Dominating Set). Can we base the total inapproximability of k-Dominating Set on W[2] ≠ FPT?
3.1.3. Parameterized Inapproximability of Steiner Orientation by Gap Amplification
Gap amplification is a widely used technique in the classic literature on (NP-)hardness of approximation (e.g., [41]). In fact, the arguably simplest proof of the PCP theorem, due to Dinur [43], is indeed via repeated gap amplification. The overall idea here is simple: we start with a hardness of approximation for a problem with a small factor (e.g., only slightly larger than one). At each step, we perform an operation that transforms an instance of our problem into another instance, in such a way that the gap becomes bigger; usually, this new instance will also be bigger than our instance. By repeatedly applying this operation, one can finally arrive at a constant, or even super-constant, factor hardness of approximation.
There are two main parameters that determine the success/failure of such an approach: how large the new instance is compared to the old instance (i.e., the size blow-up) and how large the new gap is compared to the old gap, in each operation. To see how these two come into the picture, let us first consider a case study where a (straightforward) gap amplification procedure does not work: k-Clique. The standard way to amplify the gap for k-Clique is through graph product. Recall that the graph product of a graph G with itself, denoted by G², is the graph whose vertex set is V(G) × V(G), and there is an edge between (u_1, u_2) and (v_1, v_2) if and only if u_1 and v_1 are equal or adjacent in G, and u_2 and v_2 are equal or adjacent in G. It is not hard to check that, if we can find a clique of size t in G², then we can find one of size at least √t in G (and vice versa). This implies that, if we have an instance of Clique that is hard to approximate to within a factor of γ, then we may take the graph product with itself, which yields an instance of Clique that is hard to approximate to within a factor of γ².
Now, let us imagine that we start with the hard instance of an exact version of k-Clique. We may think of this as being hard to approximate to within a factor of k/(k−1). Hence, we may apply the above gap amplification procedure O(log k) times, resulting in an instance of Clique that is hard to approximate to within a factor that is a constant bounded away from one. The bad news here is that the number of vertices of the final graph is n^{Ω(k)}, where n is the number of vertices of the initial graph. This does not give any lower bound, because we can solve k-Clique in the original graph in n^{O(k)} time trivially! In the next subsection, we will see a simple way to prove hardness of approximating k-Clique under stronger assumptions. However, it remains an interesting and important open question how to prove such hardness from a non-gap assumption:
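The failed amplification above is easy to play with on toy instances. Below is a sketch in Python, using the "equal-or-adjacent in each coordinate" product, for which the clique number squares; the brute-force clique search is exponential and for illustration only.

```python
from itertools import combinations

def clique_square(adj):
    """Graph product used for clique amplification: vertices are pairs;
    two distinct pairs are adjacent iff, in each coordinate, the two
    vertices are equal or adjacent. Then omega(G^2) = omega(G)**2."""
    V = list(adj)
    ok = lambda a, b: a == b or b in adj[a]
    return {(u1, u2): {(v1, v2) for v1 in V for v2 in V
                       if (v1, v2) != (u1, u2) and ok(u1, v1) and ok(u2, v2)}
            for u1 in V for u2 in V}

def max_clique_size(adj):
    """Brute-force maximum clique size (exponential; illustration only)."""
    V = list(adj)
    for size in range(len(V), 0, -1):
        for S in combinations(V, size):
            if all(v in adj[u] for u, v in combinations(S, 2)):
                return size
    return 0
```

For example, a path on three vertices has clique number 2, and its square has clique number 4 — but also 9 vertices instead of 3, which is exactly the blow-up that kills the approach when iterated.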
Open Question 6. Is it W[1]-hard or ETH-hard to approximate k-Clique to within a constant factor in FPT time?
Having seen a failed attempt, we may now move on to a success story. Remarkably, Wlodarczyk [44] recently managed to use gap amplification to prove hardness of approximation for connectivity problems, including the k-Steiner Orientation problem. Here we are given a mixed graph G, whose edges are either directed or undirected, and a set of k terminal pairs (s_1, t_1), …, (s_k, t_k). The goal is to orient all the undirected edges in such a way that maximizes the number of terminals t_i that can be reached from the corresponding s_i. The problem is known to be in XP [45] but is W[1]-hard even when all terminal pairs can be connected [46]. Starting from this W[1]-hardness, Wlodarczyk [44] devises a gap amplification step that implies a hardness of approximation with a super-constant factor for the problem. Due to the technicality of the gap amplification step, we will not go into the specifics in this survey. However, let us point out the differences between this gap amplification and the (failed) one for Clique above. The key point here is that each application of Wlodarczyk's gap amplification step blows up the instance size only by a factor depending on k, not quadratically in the instance size as in the graph product. This means that, even if we apply Wlodarczyk's gap amplification step O(log k) times, or, more generally, g(k) times, it only results in an instance of size f(k)·n for some computable function f, which is still FPT! Since the technique is still quite new, it is an exciting frontier to examine whether other parameterized problems allow similar gap amplification steps.