Article

Red–Blue k-Center Clustering with Distance Constraints

by Marzieh Eskandari 1,2,*, Bhavika B. Khare 3, Nirman Kumar 3 and Bahram Sadeghi Bigham 2,4

1 Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, USA
2 Department of Computer Science, Faculty of Mathematical Sciences, Alzahra University, Tehran 19938 93973, Iran
3 Department of Computer Science, University of Memphis, Memphis, TN 38152, USA
4 Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan 66731 45137, Iran
* Author to whom correspondence should be addressed.
Mathematics 2023, 11(3), 748; https://doi.org/10.3390/math11030748
Submission received: 31 December 2022 / Revised: 24 January 2023 / Accepted: 30 January 2023 / Published: 2 February 2023

Abstract

We consider a variant of the $k$-center clustering problem in $\mathbb{R}^d$, where the centers are divided into two subsets, the red centers of size $p$ and the blue centers of size $q$, such that $p + q = k$, and each red center and each blue center must be at least some given distance $\alpha \geq 0$ apart. The aim is to minimize the covering radius. We provide a bi-criteria approximation algorithm for the problem and a polynomial time algorithm for the constrained problem where all centers must lie on a given line $\ell$. Additionally, we present a polynomial time algorithm for the case where only the orientation of the line is fixed in the plane ($d = 2$), although the algorithm works even in $\mathbb{R}^d$ by constraining the line to lie in a plane and to have a fixed orientation.
MSC:
11Y16; 68Q25; 68W25; 65D18; 91C20

1. Introduction

The classic $k$-center problem is to find $k$ balls of minimum radius whose union covers a set $P$ of $n$ points in a metric space, for a given positive integer $k$. This problem provides a simple geometric model for the following facility location problem. We want to place $k$ facilities (such as supermarkets) to serve the customers in a city. It is natural to assume that each customer goes to the facility closest to their home, so we want to locate the $k$ facilities such that the maximum distance between a customer's home and the nearest facility is minimized. The $k$-center problem is known to be NP-hard in Euclidean spaces [1]. We consider a variant of the $k$-center problem where the centers are divided into two subsets, the red centers of size $p$ and the blue centers of size $q$, where $p + q = k$, such that each red center and each blue center must be at least some given distance $\alpha \geq 0$ apart, with the aim of minimizing the covering radius. For $\alpha = 0$, we recover the $k$-center problem. As a motivating example, suppose we want to open two types of facilities offering the same service (e.g., 'Shell' and 'BP' gas stations). Each client must be within the minimum possible distance of one of these facilities, but facilities of different types should be separated from each other to avoid the disadvantages of being near competitors (such as crowding, spying, and over-sharing).
Sylvester [2] posed the 1-center problem in 1857, and Megiddo [3] gave a linear time algorithm for solving this problem in 1983, using linear programming. Hwang et al. [4] showed that in the plane, the $k$-center problem can be solved in $n^{O(\sqrt{k})}$ time. Agarwal and Procopiuc [5] presented an $n^{O(k^{1-1/d})}$-time algorithm for solving the $k$-center problem in $\mathbb{R}^d$ and a $(1+\epsilon)$-approximation algorithm with running time $O(n \log k) + (k/\epsilon)^{O(k^{1-1/d})}$. Gonzalez [6] studied approximating the discrete version (where the centers must belong to the point set). A faster algorithm for the problem in small dimensions was introduced later by Sayan Bandyapadhyay et al. [7]. Jianguang Lu et al. [8] defined the uncertain constrained $k$-means problem and proposed a $(1+\epsilon)$-approximation algorithm for it.
Lukas Drexler et al. [9] introduced a new version of the problem, called the connected $k$-center problem or the connected $k$-diameter problem, in which every cluster induces a connected subgraph. Some researchers studied constrained versions of the $k$-center problem in which the centers are constrained to a line. Brass et al. [10] proposed an $O(n \log^2 n)$-time algorithm when the line is fixed in advance. In addition, they solved the general case where the line has an arbitrary orientation in $O(n^4 \log^2 n)$ expected time. Other variations have also been considered [11,12,13] for $k = 1$. For $k \geq 2$, variants have been studied because of their applications to the placement of base stations in wireless sensor networks [14,15,16,17] and to privacy preservation in social networks [18]. Another variant is the $\alpha$-connected two-center problem, where the goal is to find two balls of minimum radius $r$ whose union covers the points, such that the distance between the two centers is at most $2(1-\alpha)r$, for $0 \leq \alpha \leq 1$. Hwang et al. [4] presented an $O(n^2 \log^2 n)$ expected-time algorithm for it.
Kavand et al. studied another variant of the 2-center problem, which they termed the $(n, 1, 1, \alpha)$-center problem [19]. The aim is to find two balls, each of which covers the entire point set, such that the distance between the two centers is at least $\alpha$ and the radius of the larger ball is minimized. They presented an $O(n \log n)$-time algorithm for this problem, and a linear-time algorithm for its constrained version using the farthest-point Voronoi diagram. Hee-Kap Ahn et al. [20] gave exact and approximation algorithms for two-center problems when the input is a set of disks in the plane. A different version of red–blue $k$-center clustering is introduced in [8], in which there are $n$ points, each colored red or blue, along with an integer $k$ and a coverage requirement for each color. The goal is to find the smallest radius $\rho$ such that there exist balls of radius $\rho$ around $k$ of the points that meet the coverage requirements.
Recently, Eskandari et al. generalized this problem to the case of two types of centers, red and blue, where every pair of red and blue centers is separated by at least $\alpha$, the balls around the red centers cover the point set, and the balls around the blue centers also cover the point set [21]. Subsequently, Yuan Sha [22] improved their algorithm and gave improved approximation algorithms for the line-constrained version of the problem.
This paper considers a similar but different generalization of the $k$-center problem with the $\alpha$-separability assumption, which we denote as the $(n, p \vee q, \alpha)$ problem. Given a set $P$ of $n$ points in a metric space and integers $p, q \geq 1$, we want to find $p + q$ balls of two different types, called red and blue, with the minimum radius such that $P$ is covered by these $p + q$ balls, and the center of each red ball is at distance at least $\alpha$ from the center of each blue ball. We present an $O(1)$-factor approximation algorithm that guarantees $3\alpha/4$ separability for the problem and a polynomial time algorithm for the constrained problem where all centers must lie on a given line $\ell$. Additionally, we present a polynomial time algorithm for the case where only the orientation of the line is fixed in the plane ($d = 2$).
Paper organization. In Section 2, we provide definitions and notations. In Section 3, we present an $O(1)$-factor approximation algorithm that guarantees $3\alpha/4$ separability. In Section 4, we present a polynomial time algorithm for the constrained problem and related variants.

2. Problem and Definitions

Let $\mathrm{dist}(x, y)$ denote the distance between points $x, y$ in the metric space $\mathcal{M}$. For a point $x \in \mathcal{M}$ and a number $r \geq 0$, the ball $B(x, r) = \{ y \in \mathcal{M} \mid \mathrm{dist}(x, y) \leq r \}$ is the closed ball of radius $r$ with center $x$, i.e., the set of points at distance at most $r$ from $x$.
In the $\alpha$-separated red–blue $(p+q)$-center clustering problem, we are given a set $P$ of $n$ points in $\mathcal{M}$, integers $p > 0$, $q > 0$, and a real number $\alpha \geq 0$. For a given number $r \geq 0$, $p$ points $c_1, \ldots, c_p$ in $\mathcal{M}$ (with possibly repeating points), called the red centers, and $q$ points $d_1, \ldots, d_q$ in $\mathcal{M}$ (with possibly repeating points), called the blue centers, are said to be a feasible solution for the problem with covering radius $r$ if they satisfy the following:
  • Covering constraint:
    $P \subseteq \bigcup_{i=1}^{p} B(c_i, r) \cup \bigcup_{j=1}^{q} B(d_j, r)$.
  • Separation constraint: For each $1 \leq i \leq p$, $1 \leq j \leq q$, we have $\mathrm{dist}(c_i, d_j) \geq \alpha$, i.e., the red and blue centers are separated by a distance of at least $\alpha$.
The balls $B(c_i, r)$, $1 \leq i \leq p$, are the red balls, and the balls $B(d_j, r)$, $1 \leq j \leq q$, are the blue balls. If there exists a feasible solution for a certain value of $r$, such a value is said to be feasible for the problem. The goal of the problem is to find the minimum feasible value of $r$.
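The two constraints above translate directly into a feasibility check for a candidate solution. The following is a minimal sketch for points in $\mathbb{R}^d$ with the Euclidean distance; the function name `is_feasible` and the tolerance `eps` are illustrative assumptions, not part of the paper.

```python
import math

def is_feasible(points, red, blue, r, alpha, eps=1e-9):
    """Check the covering and separation constraints for a candidate solution.

    points, red, blue are sequences of coordinate tuples; eps is an assumed
    tolerance for floating-point comparisons.
    """
    centers = list(red) + list(blue)
    # Covering constraint: every point lies in some red or blue ball of radius r.
    covered = all(
        any(math.dist(x, c) <= r + eps for c in centers) for x in points
    )
    # Separation constraint: every red-blue pair of centers is at least alpha apart.
    separated = all(
        math.dist(c, d) >= alpha - eps for c in red for d in blue
    )
    return covered and separated
```

The check takes $O(n(p+q) + pq)$ distance evaluations, which is enough for validating the output of any of the algorithms discussed later.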
We denote this problem as the $(n, p \vee q, \alpha)$ problem. The $\vee$ in the notation emphasizes the fact that the union of the red balls and the blue balls covers $P$. This is to be contrasted with the authors' recent work [21], which considers the problem defined by Kavand et al. [19], where the red balls cover $P$ and the blue balls also cover $P$. That problem is denoted as the $(n, p \wedge q, \alpha)$ problem.
Let $r_{p \vee q, \alpha}(P)$ denote the optimal radius for this problem. When $P, p, q, \alpha$ are clear from context, we will also denote this by $r^*$ for brevity. For all $k \geq 1$, let $r_k(P)$ denote the optimal $k$-center clustering radius. Notice that the centers in the $k$-center clustering problem can be any points in $\mathcal{M}$, not necessarily belonging to $P$. If the centers are required to belong to $P$, the problem is the discrete $k$-center clustering problem.
We will always be concerned with $\mathcal{M} = \mathbb{R}^d$. We let $P = \{p_1, \ldots, p_n\}$, where $p_i = (p_{i1}, p_{i2}, \ldots, p_{id})$. We also consider the constrained $\alpha$-separated red–blue $(p+q)$-center clustering problem (when $\mathcal{M} = \mathbb{R}^d$).
Here, we are given a line $\ell$, and all the centers are constrained to lie on $\ell$. Without loss of generality, we assume that $\ell$ is the $x$-axis. This can be achieved by an appropriate transformation of space. Moreover, we will use the same notation for the optimal radii and centers. For the constrained problem, we need some additional definitions and notations. For each point $p_i \in P$, we consider the set of points on the line $\ell$ (the $x$-axis) such that the ball of radius $r$ centered at one of those points covers $p_i$. This is the intersection of $B(p_i, r)$ with the $x$-axis. Assuming this intersection is not empty, let the interval be $I_i(r) = [a_i(r), b_i(r)]$. Denote the set of all intervals by $\mathcal{I}(r) = \{I_1(r), \ldots, I_n(r)\}$, where we assume that the intervals are numbered in sorted order: those with earlier left endpoints come first, and among intervals with the same left endpoint, the one with the earlier right endpoint comes first. Notice that the feasibility of a radius $r$ means that there exists a hitting set for the set of intervals $\mathcal{I}(r)$ that can be partitioned into red centers and blue centers satisfying the separation constraint.
The interval endpoints $a_i(r), b_i(r)$ can be computed by solving the equation $(x - p_{i1})^2 + p_{i2}^2 + \cdots + p_{id}^2 = r^2$ for $x$. Thus, they are given by $a_i(r) = p_{i1} - \sqrt{r^2 - \sum_{j=2}^{d} p_{ij}^2}$ and $b_i(r) = p_{i1} + \sqrt{r^2 - \sum_{j=2}^{d} p_{ij}^2}$. It is easy to see that for the range of $r$ where the intersection is non-empty, $a_i(r)$ is a strictly decreasing function of $r$ and $b_i(r)$ is a strictly increasing function of $r$.
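As a small illustration of the endpoint formulas, the sketch below computes $I_i(r)$ for each point and returns the intervals in the sorted order described above. The helper names `interval_on_axis` and `sorted_intervals` are hypothetical, and the line is assumed to already be the $x$-axis.

```python
import math

def interval_on_axis(point, r):
    """Return (a, b), the intersection of B(point, r) with the x-axis, or None if empty."""
    # Squared distance from the point to the x-axis: p_{i2}^2 + ... + p_{id}^2.
    h2 = sum(c * c for c in point[1:])
    slack = r * r - h2
    if slack < 0:
        return None  # the ball of radius r around the point misses the axis
    half = math.sqrt(slack)
    return (point[0] - half, point[0] + half)

def sorted_intervals(points, r):
    """Intervals I_i(r) sorted by left endpoint, ties broken by right endpoint.

    Returns None if some point cannot be covered from the axis with radius r.
    """
    intervals = [interval_on_axis(p, r) for p in points]
    if any(iv is None for iv in intervals):
        return None
    return sorted(intervals)  # tuple order matches the ordering in the text
```

For example, the point $(3, 4)$ with $r = 5$ gives the interval $[0, 6]$, while $r = 3$ gives an empty intersection.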
Model of computation. We remark that our model of computation is the Real RAM model, where the usual arithmetic operations are assumed to take O ( 1 ) time.
Table of symbols and notations. To aid in the understanding and clarity of the presented material, a comprehensive table of symbols is provided below, which details the meanings and definitions of all symbols used throughout the text.
| Symbol | Definition |
| --- | --- |
| $p$ | number of red centers (balls) |
| $q$ | number of blue centers (balls) |
| $\mathrm{dist}(x, y)$ | distance between points $x, y$ in the metric space $\mathcal{M}$ |
| $B(x, r)$ | the set of points with distance at most $r$ from $x$ |
| $p_i$ | a point of $P$ that should be covered |
| $c_i$ | center of a red ball |
| $d_i$ | center of a blue ball |
| $r_{p \vee q, \alpha}(P)$ | optimal radius for this problem |
| $I_i(r) = [a_i(r), b_i(r)]$ | the interval that is the intersection of $B(p_i, r)$ with the $x$-axis |
| $\mathcal{I}(r)$ | set of all intervals, $\{I_1(r), \ldots, I_n(r)\}$ |

3. Approximation Algorithms

Our goal here is to show that there is a polynomial time algorithm that computes a bi-criteria approximation to the $(n, p \vee q, \alpha)$ problem for a given point set $P$. The algorithm finds $p$ red and $q$ blue centers such that (i) the distance between each red and each blue center is at least $3\alpha/4$, and (ii) the covering radius is at most $8r^*$, where $r^*$ is the optimal covering radius. The algorithm works differently when $r^* < \alpha/8$ (Section 3.2) than when $r^* \geq \alpha/8$ (Section 3.1). Since we do not know in advance which case holds, we explain how to combine the two in Section 3.3. We first discuss the case when $r^*$ is large (i.e., $r^* \geq \alpha/8$), and then proceed to the more complex case when $r^*$ is small ($r^* < \alpha/8$). An easy observation is that $r^* = r_{p \vee q, \alpha}(P) \geq r_{p+q}(P)$. To see this, notice that in both the $(p+q)$-center problem and the $(n, p \vee q, \alpha)$ problem, we cover $P$ by balls centered at $p + q$ points. However, the $(n, p \vee q, \alpha)$ problem has additional constraints; thus, its covering radius can only be larger.

3.1. The Case When $r^* \geq \alpha/8$

The algorithm first computes a 2-approximation to the $(p+q)$-center clustering problem, using the same idea as Gonzalez's algorithm [6], which we briefly review. We start with any point of $P$ as the first center. Then, until $p + q$ centers have been found, we select as the next center the point of $P$ that is furthest from the current centers. While the analysis of Gonzalez [6] was for the discrete $k$-center problem, it works without any changes for the $k$-center problem where the centers are not restricted to be points of $P$, and we obtain a 2-approximation to $r_k(P)$.
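The farthest-first selection just reviewed can be sketched as follows. This is a minimal illustration assuming Euclidean points given as tuples; the function name `farthest_first` is hypothetical, and the first center is simply taken to be the first input point.

```python
import math

def farthest_first(points, k):
    """Greedy farthest-first traversal; returns (centers, covering_radius)."""
    centers = [points[0]]  # any point of P may serve as the first center
    # nearest[i] is the distance from points[i] to its closest chosen center.
    nearest = [math.dist(x, centers[0]) for x in points]
    while len(centers) < k:
        # Pick the point that is currently furthest from all chosen centers.
        i = max(range(len(points)), key=nearest.__getitem__)
        centers.append(points[i])
        nearest = [min(d, math.dist(x, points[i])) for d, x in zip(nearest, points)]
    return centers, max(nearest)
```

Each round scans all points once, so the sketch runs in $O(nk)$ time, matching the $O(n(p+q))$ bound used in the running-time analysis.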
At the end of this step, we have found points $x_1, x_2, \ldots, x_{p+q} \in P$ such that the covering radius of $P$ using these points is at most $2r_{p+q}(P)$. Then, we select a maximal subset of these points that are pairwise at least $3\alpha/4$ apart. Suppose that, possibly after renaming, these points are $x_1, x_2, \ldots, x_t$, where $1 \leq t \leq p + q$. We place the $p + q$ red and blue centers among these points. In doing so, it does not matter which points receive which color, except that there are at most $p$ red points and at most $q$ blue points. If there is at least one red and one blue point, then we can increase the number of red (resp. blue) centers to $p$ (resp. $q$) by co-locating some centers. However, if $t = 1$, assume $x_1$ is red; then, one can place all the $p$ red centers at $x_1$ and all the $q$ blue centers on the surface of the ball $B(x_1, 3\alpha/4)$. The separation claim, that red and blue centers are at least $3\alpha/4$ apart, is clear by construction. The following lemma, which is not too hard to prove, is the claim about the covering radius. A complete proof can be found in Appendix A.
Lemma 1.
The covering radius using the centers constructed is at most $8r^*$.
Notice that the algorithm itself makes no use of the knowledge of $r^*$. The claim about the covering radius is, however, only valid when $r^* \geq \alpha/8$. Notice also that the covering radius actually achieved can be determined by assigning each point to its nearest center. We summarize the result in the following lemma.
Lemma 2.
There is a polynomial time algorithm that returns a set of $p$ red centers $c_1, \ldots, c_p$, $q$ blue centers $d_1, \ldots, d_q$, and a number $r$, such that
$$P \subseteq \bigcup_{i=1}^{p} B(c_i, r) \cup \bigcup_{j=1}^{q} B(d_j, r),$$
and $\mathrm{dist}(c_i, d_j) \geq 3\alpha/4$ for $1 \leq i \leq p$, $1 \leq j \leq q$. If $r^* \geq \alpha/8$, then $r \leq 8r^*$.

3.2. The Case When $r^* < \alpha/8$

We first present a decision procedure that takes a number $R \geq 0$ as input and returns True/False. If it returns True, it also finds a feasible solution with a bi-criteria guarantee: a covering radius of at most $2R$ and separability of at least $3\alpha/4$. Then, we show that when $r^* < \alpha/8$, our decision procedure will definitely return True for some $R \leq 2r^*$.
The decision procedure. We first construct a neighborhood graph $G$ with vertex set $P$ by connecting two points $x, y \in P$, $x \neq y$, by an edge if their distance is at most $3\alpha/4$. Then, we compute the connected components of $G$. Let the connected components be $C_1, \ldots, C_m$. Now, for each connected component $C_i$, we run the following scooping algorithm. Start with an arbitrary point $x_{i1} \in C_i$. Suppose that points $x_{i1}, \ldots, x_{ij}$ have already been selected in $C_i$. If $C_i \subseteq \bigcup_{k=1}^{j} B(x_{ik}, 2R)$, we stop; otherwise, we select the next point $x_{i,j+1}$ as any point in $C_i \setminus \bigcup_{k=1}^{j} B(x_{ik}, 2R)$. Let $n_i$ be the number of centers thus found. We now have positive integers $n_1, \ldots, n_m$, and we want to solve the following problem: Does there exist a partition $A \cup B = \{1, \ldots, m\}$ such that $\sum_{j \in A} n_j \leq p$ and $\sum_{j \in B} n_j \leq q$? Notice that this problem is NP-hard when the input is the set of integers $n_1, \ldots, n_m, p, q$. However, since our input size is at least $n$ (it includes the $n$ points of $P$), this problem can be solved in polynomial time by a dynamic programming algorithm specified by the following recursion. Let $T[k, a, b]$ represent a table of True/False values, where $0 \leq k \leq m$, $0 \leq a \leq p$, $0 \leq b \leq q$, and the table entry is True iff the numbers $n_1, \ldots, n_k$ can be partitioned into two parts, one of which sums to at most $a$ and the other to at most $b$. The recursive definition is as follows:
$$T[k, a, b] = \begin{cases} \text{False} & \text{if } \sum_{1 \leq i \leq k} n_i > a + b, \\ \text{True} & \text{if } k = 0, \\ B_1 \vee B_2 & \text{otherwise}, \end{cases}$$
where the Boolean $B_1$ is defined as
$$B_1 \equiv (n_k \leq a) \wedge T[k-1, a - n_k, b],$$
and similarly $B_2$ is defined as
$$B_2 \equiv (n_k \leq b) \wedge T[k-1, a, b - n_k].$$
If the answer to the above partitioning problem is True, i.e., $T[m, p, q]$ is True, the algorithm returns True; otherwise, it returns False. The partition itself can also be recovered from the dynamic programming table. Thus, when it returns True, the algorithm also finds a partition of the connected components into two classes: one whose associated integers $n_i$ add up to at most $p$, and the other whose associated integers add up to at most $q$. We color the centers associated with the components in the former class red and those in the latter class blue. Clearly, we output at most $p$ red and at most $q$ blue centers. By possibly co-locating some centers, we can return exactly $p$ red and $q$ blue centers. The proof of the following lemma appears in Appendix B.
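The table $T$ and the recovery of the partition can be sketched as follows. This is an illustrative bottom-up implementation, not the authors' code; the function name `partition_components` is hypothetical, and the explicit "False if the sum exceeds $a + b$" case is omitted because the recursion already evaluates to False there (it only serves as pruning).

```python
def partition_components(counts, p, q):
    """Split the indices of counts into (A, B) with sum over A <= p and sum over B <= q.

    Returns None when no such partition exists.
    """
    m = len(counts)
    # T[k][a][b] is True iff counts[:k] can be split with budgets a and b.
    T = [[[False] * (q + 1) for _ in range(p + 1)] for _ in range(m + 1)]
    for a in range(p + 1):
        for b in range(q + 1):
            T[0][a][b] = True  # the empty prefix is always partitionable
    for k in range(1, m + 1):
        n_k = counts[k - 1]
        for a in range(p + 1):
            for b in range(q + 1):
                b1 = n_k <= a and T[k - 1][a - n_k][b]  # n_k goes to the red side
                b2 = n_k <= b and T[k - 1][a][b - n_k]  # n_k goes to the blue side
                T[k][a][b] = b1 or b2
    if not T[m][p][q]:
        return None
    # Walk back through the table to recover one valid partition.
    A, B, a, b = [], [], p, q
    for k in range(m, 0, -1):
        n_k = counts[k - 1]
        if n_k <= a and T[k - 1][a - n_k][b]:
            A.append(k - 1)
            a -= n_k
        else:
            B.append(k - 1)
            b -= n_k
    return A[::-1], B[::-1]
```

The table has $(m+1)(p+1)(q+1)$ entries, each filled in constant time, so the sketch runs in $O(mpq)$ time with $m \leq n$.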
Lemma 3.
If the procedure above returns True, it finds $p$ red and $q$ blue centers such that the distance between each red and each blue center is larger than $3\alpha/4$, and they cover $P$ using a radius of at most $2R$.
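The first half of the decision procedure (neighborhood graph, connected components, and scooping) can be sketched as below. This is a simple quadratic-time illustration under the Euclidean distance; the names `components`, `scoop`, and `scoop_all` are hypothetical, and points are referred to by their indices.

```python
import math

def components(points, threshold):
    """Connected components of the graph joining points at distance <= threshold."""
    n = len(points)
    label = [-1] * n
    comps = []
    for s in range(n):
        if label[s] != -1:
            continue
        label[s] = len(comps)
        stack, comp = [s], []
        while stack:  # depth-first search from s
            u = stack.pop()
            comp.append(u)
            for v in range(n):
                if label[v] == -1 and math.dist(points[u], points[v]) <= threshold:
                    label[v] = label[s]
                    stack.append(v)
        comps.append(comp)
    return comps

def scoop(points, comp, radius):
    """Select centers inside comp until every point of comp is within radius of one."""
    centers = []
    uncovered = list(comp)
    while uncovered:
        c = uncovered[0]  # any uncovered point may be chosen
        centers.append(c)
        uncovered = [i for i in uncovered if math.dist(points[i], points[c]) > radius]
    return centers

def scoop_all(points, alpha, R):
    """Centers (as point indices) found per component, using balls of radius 2R."""
    return [scoop(points, comp, 2 * R) for comp in components(points, 0.75 * alpha)]
```

The lengths of the lists returned by `scoop_all` are the integers $n_1, \ldots, n_m$ that feed the partitioning step.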
Our next lemma, and its proof, are the crux of the algorithm. First, a definition. We say a number $D \geq 0$ is an interpoint distance if there exist points $p_i, p_j \in P$ such that $D = \mathrm{dist}(p_i, p_j)$. Notice that 0 is an interpoint distance.
Lemma 4.
If $r^* < \alpha/8$, then there exists an interpoint distance $D$ such that $D \leq 2r^*$, and if $D \leq R$, the algorithm returns True.
Proof. 
Assume that $r^* < \alpha/8$. Consider an optimal solution: $p$ red centers $c_1, \ldots, c_p$ and $q$ blue centers $d_1, \ldots, d_q$ with covering radius $r^*$. For a point $x \in P$, if $x$ is covered by a red ball, we say it is red. If it is covered by a blue ball, we say it is blue. (A priori, a point could have both colors.) We claim that if $x, y \in P$ are any two points with $\mathrm{dist}(x, y) \leq 3\alpha/4$, then $x, y$ cannot be of different colors. To see this, we proceed by contradiction. Suppose without loss of generality that $x$ is red and $y$ is blue. Let $c_i$ be a red center covering $x$ and $d_j$ a blue center covering $y$. By the separation constraint, $\mathrm{dist}(c_i, d_j) \geq \alpha$. On the other hand, $\mathrm{dist}(c_i, x) \leq r^* < \alpha/8$ and, similarly, $\mathrm{dist}(d_j, y) < \alpha/8$. Then, by the triangle inequality,
$$\alpha \leq \mathrm{dist}(c_i, d_j) \leq \mathrm{dist}(c_i, x) + \mathrm{dist}(x, y) + \mathrm{dist}(y, d_j) < \alpha/4 + \mathrm{dist}(x, y),$$
and this leads to $\mathrm{dist}(x, y) > 3\alpha/4$, a contradiction. Taking $x = y$, the above implies that each point has a unique color. Next, we claim that for each component $C_i$, either all points are red or all are blue. To see this, suppose that the claim is false. Then, there exist points $x, y \in C_i$ such that $x$ is red and $y$ is blue. Since $C_i$ is connected, there is a path from $x$ to $y$ in $C_i$, and since the endpoints of this path have different colors, there must be an edge on this path whose endpoints $x', y'$ have different colors, say $x'$ is red and $y'$ is blue. However, this contradicts the previous claim, since $\mathrm{dist}(x', y') \leq 3\alpha/4$. The next claim is easy to see, since its proof closely follows the proof of our first claim above: no ball covers points in two different components. Due to the claims above, we may assume that we can partition the set of $p + q$ centers into $m$ parts, where each part is the set of centers covering a specific component $C_i$. Let the number of centers whose balls cover $C_i$ be $s_i$, for $i = 1, \ldots, m$. Then, the numbers $s_i$ that correspond to the red centers sum up to $p$, and those that correspond to the blue centers sum up to $q$. Now, consider the $p + q$ balls of radius $r^*$ covering $P$. Each such ball $B$ covers a set of points $X \subseteq P$. Consider two points $x, y \in X$ that form a furthest pair in $X$. Clearly, $X$ is also covered by the ball $B(x, \mathrm{dist}(x, y))$. Out of the $p + q$ balls, we consider the ball that gives us the furthest apart diametral pair, and we call the corresponding interpoint distance $D$. We claim that $D \leq 2r^*$ and that if $D \leq R$, the algorithm must return True. The first claim is easy to see, since the diametral pair is inside one of the balls $B$ of radius $r^*$. To see the second claim, consider a component $C_i$ and one of the balls $B$ out of the $s_i$ balls that cover $C_i$. Clearly, the ball of radius $D$ centered at any point of $P \cap B$ also covers $P \cap B$.
Then, assuming $D \leq R$, if we follow the scooping procedure of our decision algorithm, by Gonzalez's guarantee [6], the number $n_i$ of balls generated is at most $s_i$. As such, there is a partition of $\{1, \ldots, m\}$ such that the $n_i$ for the indices corresponding to red centers sum up to at most $p$ and the remaining ones sum up to at most $q$. Thus, our partitioning algorithm, and hence the decision algorithm, returns True. □
The algorithm for $r^* < \alpha/8$. The algorithm is simple. We enumerate and try all the $O(n^2)$ interpoint distances. Either the decision algorithm returns False for all of them, or we return the set of centers found for the smallest interpoint distance $D$ for which it returns True. The following lemma summarizes the result. See Appendix C for a proof.
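The enumeration over interpoint distances can be sketched as a small driver. Here `decide` is a hypothetical callback standing in for the decision procedure: it is assumed to take $R$ and return a solution object when the procedure answers True, and `None` otherwise.

```python
import math
from itertools import combinations

def interpoint_distances(points):
    """All distinct interpoint distances in increasing order (0 is included)."""
    ds = {0.0}
    ds.update(math.dist(x, y) for x, y in combinations(points, 2))
    return sorted(ds)

def smallest_accepted_distance(points, decide):
    """Return (D, solution) for the smallest interpoint distance D accepted by decide.

    Returns None if decide rejects every interpoint distance.
    """
    for D in interpoint_distances(points):
        solution = decide(D)
        if solution is not None:
            return D, solution
    return None
```

The sketch scans the distances linearly, as in the text, rather than relying on any monotonicity of the decision procedure.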
Lemma 5.
There is a polynomial time algorithm that either terminates without returning anything, or returns a set of $p$ red centers $c_1, \ldots, c_p$ and $q$ blue centers $d_1, \ldots, d_q$, all in $P$, and a number $r$, such that
$$P \subseteq \bigcup_{i=1}^{p} B(c_i, r) \cup \bigcup_{j=1}^{q} B(d_j, r),$$
and $\mathrm{dist}(c_i, d_j) > 3\alpha/4$ for $1 \leq i \leq p$, $1 \leq j \leq q$. If $r^* < \alpha/8$, then the algorithm does return the centers mentioned, and moreover, $r \leq 4r^*$.

3.3. Putting Things Together

This section consolidates the results of Sections 3.1 and 3.2. The combined algorithm runs the algorithms of both sections, i.e., the ones detailed in Lemmas 2 and 5. Either both return centers, or only one of them (that of Lemma 2) does. In both cases, to determine the covering radius, we apply the usual nearest-neighbor clustering, i.e., each point is assigned to its closest center. We then choose the solution with the smaller covering radius. Since one algorithm is guaranteed to work when $r^* \geq \alpha/8$ and the other in the complementary case $r^* < \alpha/8$, the returned covering radius is at most $8r^*$ in either case, while the separation is at least $3\alpha/4$. For the sake of completeness, we also provide an analysis of the running time.
Lemma 6.
The approximation algorithm detailed above can be implemented to run in $O(n^3 pq)$ time.
Proof. 
The algorithm for large $r^*$, i.e., $r^* \geq \alpha/8$, detailed in Lemma 2, first needs a run of the Gonzalez algorithm [6] for $p + q$ centers, which can be completed in $O(n(p+q))$ time. Then, it chooses a maximal subset of those centers that are at least $3\alpha/4$ apart, and this can also be implemented by running the Gonzalez algorithm (choosing the furthest point from the current set) for as many iterations as necessary, i.e., until the largest distance from the current set falls below $3\alpha/4$. This needs $O((p+q)^2)$ time. The $p$ red and $q$ blue centers can then be chosen in $O(p+q)$ time. Overall, the algorithm takes $O(n(p+q))$ time.
The second algorithm, for small $r^*$, i.e., $r^* < \alpha/8$, detailed in Lemma 5, is a search over $O(n^2)$ interpoint distances; after computing all such distances, we run the decision algorithm of Lemma 3 for each of them. The decision algorithm requires the neighborhood graph, but this can be precomputed and stored. It then fills a dynamic programming table, and it is not too hard to see that a careful implementation of this can be completed in $O(npq)$ time. Thus, overall, the algorithm takes $O(n^3 pq)$ time.
The combined algorithm runs both of the above algorithms and therefore takes $O(n^3 pq)$ time overall. □
We obtain the following result.
Theorem 1.
There is a polynomial time algorithm that, given an instance of the $(n, p \vee q, \alpha)$ problem over a set of points $P$ with $|P| = n$, returns a set of $p$ red centers $c_1, \ldots, c_p$, $q$ blue centers $d_1, \ldots, d_q$, and a number $r$ such that
$$P \subseteq \bigcup_{i=1}^{p} B(c_i, r) \cup \bigcup_{j=1}^{q} B(d_j, r),$$
and $\mathrm{dist}(c_i, d_j) \geq 3\alpha/4$ for $1 \leq i \leq p$, $1 \leq j \leq q$. Moreover, $r \leq 8r^*$.
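The final selection step behind this result, computing each candidate's covering radius by nearest-center assignment and keeping the better one, can be sketched as follows. The names `covering_radius` and `pick_better` are hypothetical, and each candidate is assumed to be either `None` (nothing returned) or a pair `(red, blue)` of center lists.

```python
import math

def covering_radius(points, centers):
    """Largest distance from a point to its nearest center."""
    return max(min(math.dist(x, c) for c in centers) for x in points)

def pick_better(points, candidates):
    """Return (radius, red, blue) for the candidate with the smallest covering radius.

    Candidates equal to None are skipped; returns None if all are None.
    """
    best = None
    for cand in candidates:
        if cand is None:
            continue
        red, blue = cand
        r = covering_radius(points, list(red) + list(blue))
        if best is None or r < best[0]:
            best = (r, red, blue)
    return best
```

Evaluating one candidate costs $O(n(p+q))$ distance computations, which is dominated by the running time of the two algorithms being combined.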

4. The Constrained Problem

In this section, we consider the constrained $(n, p \vee q, \alpha)$ problem, in which the centers are restricted to lie on a line $\ell$. In Section 4.1, we first assume that the line $\ell$ is fixed. In Section 4.2, we consider the case $d = 2$ where only the orientation of $\ell$ is fixed.

4.1. Fixed Line

We first consider the constrained $(n, p \vee q, \alpha)$ problem in which the centers are restricted to lie on a given line $\ell$. Without loss of generality, we assume that $\ell$ is the $x$-axis. First, we consider the decision problem and present a dynamic programming algorithm to decide if the constrained $(n, p \vee q, \alpha)$ problem has a feasible solution with a given covering radius $r$. Next, we find a discrete set of candidates for possible values of $r^*$. By performing a binary search over this set of candidates and checking their feasibility via the dynamic programming algorithm, we can find the optimal solution. In this section, for points that lie on the $x$-axis, by a slight abuse of notation, we may identify a point with its $x$-coordinate.

4.1.1. Decision Problem

For a given radius $r$, we want to decide if the constrained $(n, p \vee q, \alpha)$ problem has a feasible solution with covering radius $r$. We find a finite set of candidates for possible locations of centers on $\ell$ and use it to design a dynamic programming algorithm that finds a feasible solution if one exists.
Consider the endpoints of the intervals in $\mathcal{I}(r)$, i.e., $a_i(r)$ and $b_i(r)$, for $1 \leq i \leq n$. Let $x_1(r), x_2(r), \ldots, x_{2n}(r)$ be the sorted order of these endpoints on $\ell$, and let $C(r) = \{ x_i(r),\ x_i(r) + \alpha \mid 1 \leq i \leq 2n,\ x_1(r) \leq x_i(r) + \alpha \leq x_{2n}(r) \}$. We show that if $r$ is feasible, there exists a feasible solution which is a subset of $C(r)$. Thus, we can search over $C(r)$ for a solution.
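The candidate set $C(r)$ can be built directly from the interval endpoints. The sketch below assumes the intervals are given as `(a_i, b_i)` pairs and reads the range condition in the definition as applying to the shifted points $x_i(r) + \alpha$, with all endpoints themselves always included; the function name `candidate_positions` is hypothetical.

```python
def candidate_positions(intervals, alpha):
    """Sorted candidate set C(r) built from intervals given as (a_i, b_i) pairs."""
    endpoints = sorted(e for iv in intervals for e in iv)
    lo, hi = endpoints[0], endpoints[-1]  # x_1(r) and x_2n(r)
    cands = set(endpoints)
    # Add each endpoint shifted right by alpha, provided it stays within [x_1, x_2n].
    cands.update(x + alpha for x in endpoints if lo <= x + alpha <= hi)
    return sorted(cands)
```

For instance, the intervals $[0, 2]$ and $[1, 5]$ with $\alpha = 1.5$ yield the candidates $0, 1, 1.5, 2, 2.5, 3.5, 5$.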
First, we need a definition. Let $U = \{u_1, u_2, \ldots, u_{p+q}\}$ be a feasible solution with covering radius $r$. It is called standard if it has all four of the following properties: (1) $U \subseteq C(r)$. (2) $u_1$ and $u_{p+q}$ are on endpoints. (3) Any two consecutive centers of the same color are on endpoints. (4) If $u_i$, $2 \leq i \leq p + q - 1$, is not on an endpoint, its adjacent centers ($u_{i-1}$ and $u_{i+1}$) are on endpoints and $u_i - u_{i-1} = \alpha$. We need only the first property of a standard solution for our dynamic programming algorithm. The other properties will be used in the next section to find a finite set of candidates for the covering radius. A feasible solution can be converted to standard form by the following approach.
Converting a given solution to standard form: Let $U = \{u_1, u_2, \ldots, u_{p+q}\}$ (sorted from left to right on $\ell$) be a feasible solution with covering radius $r$. If $U \subseteq \{x_1(r), x_2(r), \ldots, x_{2n}(r)\}$, we are finished. Now, assume that there is at least one center in $U$ that is not on an endpoint. The feasibility problem is equivalent to finding a hitting set for the intervals in $\mathcal{I}(r)$ that can be divided into two subsets satisfying the separation constraint. As is standard in such hitting set problems, we look at the faces of the arrangement of the intervals. Some of these faces might be open intervals, half-open intervals, or even singleton points. All faces are disjoint by definition. If $F$ is such a face, then a point in the closure $\bar{F}$ of $F$ hits at least the same intervals as the points in $F$ hit. To compute all the face closures in the arrangement of $\mathcal{I}(r)$, we retain all the consecutive intervals $[x_i(r), x_{i+1}(r)]$ that lie inside at least one of the intervals $I_j(r)$. This can be completed by a simple line sweep algorithm. Notice that there are only $O(n)$ such face closures. In addition, we may assume each such face closure contains at most one center, since otherwise, it is easy to see that the centers in it can be "fused" together and colored in a way that still preserves the separation constraint with the other centers. The face closure that contains $u_i$ is denoted by $[x_{i,1}(r), x_{i,2}(r)]$, where $x_{i,1}(r)$ and $x_{i,2}(r)$ are endpoints of some intervals (i.e., $a_j(r)$ or $b_j(r)$). To convert $U$ to standard form, we construct a set of $p + q$ points belonging to $C(r)$ that is a feasible solution with covering radius $r$ and has the desired properties. This is completed in two phases. In the first phase, we shift as many centers as we can to the endpoints of the intervals (i.e., $a_j(r)$, $b_j(r)$). Clearly, only some of the centers are allowed to shift because of the covering and separation constraints.
In the second phase, the centers that were not shifted to endpoints in the first phase are shifted to points of $C(r)$ such that the solution remains feasible. Moreover, in the second phase, we construct a solution in which, if a center is not on an endpoint, its adjacent centers are on endpoints.
Phase 1: First of all, if $u_1$ (resp. $u_{p+q}$) does not lie on an endpoint, let $u_1 = x_{1,1}(r)$ (resp. $u_{p+q} = x_{p+q,2}(r)$). For each $u_i$, $2 \leq i \leq p + q - 1$, that does not lie on an endpoint, if $u_i$ and $u_{i-1}$ are the same color, let $u_i = x_{i,1}(r)$. Otherwise, consider $u_{i+1}$. If $u_i$ and $u_{i+1}$ are the same color, let $u_i = x_{i,2}(r)$. Since each face closure contains at most one center, we never cross other centers while moving centers.
After Phase 1, a sequence of consecutive centers with the same color lies on the endpoints.
Phase 2: Let $u_i$, $2 \leq i \leq p + q - 1$, be the leftmost center that is not on an endpoint after Phase 1, if there exists at least one such center; otherwise, we are finished with this phase. Perform the following process for $u_i$. First, note that $u_{i-1}$ is on an endpoint and $u_i$ must have a different color from both its neighbors, $u_{i-1}$ and $u_{i+1}$ (otherwise, it would have been shifted to an endpoint in Phase 1). So, $u_{i-1}$ and $u_{i+1}$ have the same color. We shift $u_i$ to the left until it hits either $x_{i,1}(r)$ or $u_{i-1} + \alpha$. In other words, if $x_{i,1}(r) - u_{i-1} \geq \alpha$, let $u_i = x_{i,1}(r)$; otherwise, let $u_i = u_{i-1} + \alpha$. Next, to satisfy property 4, if $u_i$ is still not on an endpoint, we need to consider $u_{i+1}$ and move it onto an endpoint.
First, if $u_{i+1}$ is already on an endpoint (including the case $i + 1 = p + q$), we are finished with $u_i$. Otherwise, $i + 1 < p + q$, and the centers $u_{i+1}$ and $u_{i+2}$ are not the same color (due to Phase 1). Consider the four centers $u_{i-1}$, $u_i$, $u_{i+1}$, and $u_{i+2}$. Clearly, $u_{i-1}$ and $u_{i+1}$ are the same color, and $u_i$ and $u_{i+2}$ are the same color. We invert the colors of $u_i$ and $u_{i+1}$. This does not change the number of red and blue balls. Moreover, the solution remains feasible after inverting the colors. In this new order of centers, $u_{i-1}$ and $u_i$ are the same color, and $u_{i+1}$ and $u_{i+2}$ are the other color. Next, let $u_i = x_{i,1}(r)$ and $u_{i+1} = x_{i+1,2}(r)$. After this displacement, the new solution satisfies the separation constraint, since the distance between $u_i$ and $u_{i+1}$ is greater than their previous distance, which was also at least $\alpha$ because of the feasibility of $U$. Now, $u_{i+1}$ is also on an endpoint. Repeat Phase 2 until finished.
In the new feasible solution obtained from the above approach, if a center, say $u_i$, does not lie on an endpoint, it must have a color different from its two neighbors, $u_{i-1}$ and $u_{i+1}$, which therefore have the same color. In addition, in the sequence of three consecutive centers $u_{i-1}, u_i, u_{i+1}$ with alternating colors, $u_{i-1}$ and $u_{i+1}$ must be on endpoints, and $u_i$ is at a distance of $\alpha$ from $u_{i-1}$. Thus, all centers belong to $C(r)$, and we have proved the following lemma:
Lemma 7.
If $r$ is a feasible radius for the constrained $(n, p \wedge q, \alpha)$ problem, there exists a feasible solution with covering radius $r$ such that the centers belong to $C(r)$.
Now, we present the dynamic programming algorithm for the decision problem.
For a given $r$, we sort $C(r)$ from left to right to obtain $c_1, c_2, \ldots, c_m$, where $m \le 4n - 1$. The index of the first point after $c_j$ in $C(r)$ at a distance of at least $\alpha$ from $c_j$ is denoted by $NEXT(j)$. If there is no such point, let $NEXT(j) = m + 1$. Note that the intervals $I_i(r)$ are sorted in non-decreasing order of their left endpoints, $a_i(r)$, which induces an order on the points of $P$. Let $p_1, \ldots, p_n$ be the points in the order inherited from the ordering of the $I_i(r)$. For $0 \le i \le n$, $0 \le p' \le p$, $0 \le q' \le q$ and $1 \le j \le m + 1$, let $T_R$ (resp. $T_B$) be a four-dimensional Boolean array such that $T_R[i, p', q', j] = \text{True}$ (resp. $T_B[i, p', q', j] = \text{True}$) if we can cover the last $i$ points, $p_{n-i+1}, p_{n-i+2}, \ldots, p_n$, with $p'$ red balls and $q'$ blue balls when the leftmost center is a red (resp. blue) center at $c_j$, and such that the distance between every blue center and every red center is at least $\alpha$. Otherwise, $T_R[i, p', q', j] = \text{False}$ (resp. $T_B[i, p', q', j] = \text{False}$).
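The successor indices can be computed with one binary search per candidate. The sketch below is a 0-based analogue of $NEXT(j)$, so the sentinel value is `len(c)` rather than $m + 1$; the function name `next_indices` is an assumption made for illustration.

```python
from bisect import bisect_left


def next_indices(c, alpha):
    """For a sorted list c, return for each j the first index k > j with c[k] - c[j] >= alpha.

    If no such point exists, the entry is len(c) (0-based counterpart of m + 1).
    """
    # max(j + 1, ...) keeps the successor strictly after c[j], even when alpha == 0.
    return [max(j + 1, bisect_left(c, c[j] + alpha)) for j in range(len(c))]


print(next_indices([0, 1, 2.5, 4], 2))
```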
Given the definition of the tables $T_R$ and $T_B$ for a given radius $r$, the value of the following Boolean expression determines whether there is a solution:
$$\bigvee_{j=1}^{m} T_B[n, p, q, j] \;\vee\; \bigvee_{j=1}^{m} T_R[n, p, q, j]$$
This expression is true if, for some $j$, either $T_R[n, p, q, j]$ or $T_B[n, p, q, j]$ is true. We know that $T_R[n, p, q, j] = \text{True}$ (resp. $T_B[n, p, q, j] = \text{True}$) if we can cover all points $p_1, p_2, \ldots, p_n$ with $p$ red balls and $q$ blue balls when the leftmost center is red (resp. blue) and located at $c_j$, such that the separation constraints hold. In other words, we should hit all the intervals in $\mathcal{I}$ by at least one red or blue center, trying all possible starting locations and colors for the leftmost center.
We use a recursive approach to compute the table entries. First, we set the initial values. If all intervals have already been hit, i.e., $i = 0$, we return True. So, for all $1 \le j \le m + 1$, $0 \le p' \le p$ and $0 \le q' \le q$, we have $T_R[0, p', q', j] = T_B[0, p', q', j] = \text{True}$. If there are no red or blue centers left to place while some unhit intervals remain, we return False. Thus, for all $1 \le j \le m$ and $i > 0$, $T_R[i, 0, 0, j] = T_B[i, 0, 0, j] = \text{False}$. If we have already passed over all candidate positions but some unhit intervals remain, we return False. So, for all $i > 0$ and $p', q' \ge 0$, $T_R[i, p', q', m + 1] = T_B[i, p', q', m + 1] = \text{False}$. If no red (resp. blue) center is left, we can no longer place a red (resp. blue) center at $c_j$, so for all $1 \le j \le m + 1$, $i > 0$ and $p', q' \ge 0$, $T_R[i, 0, q', j] = T_B[i, p', 0, j] = \text{False}$.
For given $p'$ and $q'$, $T_R[i, p', q', j]$ (resp. $T_B[i, p', q', j]$) can be viewed as the $(i, j)$-th entry of an $(n + 1) \times (m + 1)$ matrix. We start by initializing the entries of $T_R[i, 0, 0, j]$, $T_B[i, 0, 0, j]$, $T_R[i, 0, 1, j]$, and $T_B[i, 1, 0, j]$. The matrices $T_R[i, 1, 0, j]$ and $T_B[i, 0, 1, j]$ can be easily computed by placing a center at $c_j$ and deciding whether $B(c_j, r)$ covers all points $p_{n-i+1}, p_{n-i+2}, \ldots, p_n$, because we have just one center.
Now, for each pair $(p', q')$ (starting from $(1, 1)$), suppose that we have already computed all entries of the matrices $T_R[i, p' - 1, q', j]$ and $T_B[i, p' - 1, q', j]$ (resp. $T_R[i, p', q' - 1, j]$ and $T_B[i, p', q' - 1, j]$). To compute $T_R[i, p', q', j]$ (resp. $T_B[i, p', q', j]$), first let $p_s, p_{s+1}, \ldots, p_{s'}$ be the points that are covered by the ball $B(c_j, r)$. If $s > n - i + 1$, then $T_R[i, p', q', j] = T_B[i, p', q', j] = \text{False}$. This is because if the points $p_{n-i+1}, \ldots, p_{s-1}$ cannot be covered by $B(c_j, r)$, then none of the later balls can cover them either: for all $k > j$, we have $c_k > c_j$, and if $B(c_j, r)$ cannot cover those points, neither can $B(c_k, r)$. If $s \le n - i + 1$, $T_R[i, p', q', j]$ and $T_B[i, p', q', j]$ can be computed by the recursive formulae:
$$T_R[i, p', q', j] = \bigvee_{k=j+1}^{m} T_R[i', p' - 1, q', k] \;\vee\; \bigvee_{k=NEXT(j)}^{m} T_B[i', p' - 1, q', k]$$
$$T_B[i, p', q', j] = \bigvee_{k=j+1}^{m} T_B[i', p', q' - 1, k] \;\vee\; \bigvee_{k=NEXT(j)}^{m} T_R[i', p', q' - 1, k]$$
where $i' = |\{p_{n-i+1}, p_{n-i+2}, \ldots, p_n\} \setminus \{p_s, p_{s+1}, \ldots, p_{s'}\}|$. (Since $s \le n - i + 1$, removing the points $p_s, p_{s+1}, \ldots, p_{s'}$ from the set of uncovered points, $\{p_{n-i+1}, p_{n-i+2}, \ldots, p_n\}$, leaves $i'$ consecutive points $\{p_{n-i'+1}, p_{n-i'+2}, \ldots, p_n\}$.)
As per the definition of the table $T_R[i, p', q', j]$ (resp. $T_B[i, p', q', j]$), the next center is red (resp. blue) at $c_j$ and may cover some uncovered points. We remove those points from the remaining points (i.e., $\{p_{n-i+1}, p_{n-i+2}, \ldots, p_n\}$). Then, we try all possible locations and colors for the next unplaced center. Finally, once all entries of $T_R$ and $T_B$ have been computed, we can decide whether the problem has a solution.
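To make the decision procedure concrete, here is a simplified left-to-right dynamic program in Python. It is a sketch, not the exact table layout above: it memoizes on the last placed candidate, its color, and the numbers of red and blue centers still available, and it treats $p$ and $q$ as upper bounds. It relies on two observations that hold on a line: a set of centers hits every interval exactly when no interval lies strictly between two consecutive centers (or before the first or after the last), and the separation constraint only needs to be checked between consecutive centers of different colors. The names `feasible`, `gap_clear` and `solve` are hypothetical.

```python
from bisect import bisect_left
from functools import lru_cache

INF = float("inf")


def feasible(intervals, candidates, p, q, alpha):
    """Can at most p red and q blue centers from `candidates` hit all intervals,
    with every red-blue pair at distance >= alpha?"""
    c = sorted(set(candidates))
    m = len(c)
    # nxt[j]: first index k > j with c[k] - c[j] >= alpha (m if none).
    nxt = [max(j + 1, bisect_left(c, c[j] + alpha)) for j in range(m)]

    def gap_clear(lo, hi):
        # True if no interval lies strictly between positions lo and hi.
        return not any(lo < a and b < hi for a, b in intervals)

    @lru_cache(maxsize=None)
    def solve(j, is_red, reds, blues):
        # The last placed center is at c[j]; reds/blues centers are still unused.
        if gap_clear(c[j], INF):
            return True
        for k in range(j + 1, m):
            if not gap_clear(c[j], c[k]):
                break  # an interval was skipped; larger k would skip it too
            for next_red in (True, False):
                if next_red != is_red and k < nxt[j]:
                    continue  # a color change needs distance >= alpha
                reds_left = reds - 1 if next_red else reds
                blues_left = blues if next_red else blues - 1
                if reds_left >= 0 and blues_left >= 0 and solve(k, next_red, reds_left, blues_left):
                    return True
        return False

    if not intervals:
        return True
    for j in range(m):
        if not gap_clear(-INF, c[j]):
            break  # some interval lies entirely to the left of c[j]
        if (p >= 1 and solve(j, True, p - 1, q)) or (q >= 1 and solve(j, False, p, q - 1)):
            return True
    return False


print(feasible([(0, 1), (2, 3)], [0, 1, 2, 3], 1, 1, 1.5))
```

Each state scans $O(n)$ successors and each gap test costs $O(n)$ here, so this sketch is slower than the bound derived in the analysis; precomputing the gap tests would remove that overhead.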
Analysis: Note that $|C(r)| = O(n)$ and that the candidate positions for the centers can be computed in $O(n)$ time. Moreover, the successor indices $NEXT(j)$ can be computed in $O(n \log n)$ total time by first sorting $C(r)$ and then searching for $c_j + \alpha$ in the sorted list. There are $O(n^2 p q)$ entries to fill in total, since $m = |C(r)| = O(n)$. The points that remain uncovered after placing a center at $c_j$ can be found in $O(n)$ time. Moreover, each entry is a disjunction of $O(n)$ Boolean terms, each of which is a table lookup, so it can be evaluated in $O(n)$ time. Overall, the algorithm takes $O(n^3 p q)$ time. Therefore, we have the following theorem:
Theorem 2.
For the constrained $(n, p \wedge q, \alpha)$ problem, it can be decided whether a given $r$ is feasible in time $T_{DP}(n, p, q) = O(n^3 p q)$. Moreover, if $r$ is feasible, a feasible solution with covering radius $r$ can be computed within the same time bound.

4.1.2. Candidate Values for r

To find a discrete set of candidates for the optimal radius, we prove a property of the optimal solution. For this purpose, we need some notation and definitions.
Let $x_s(r)$ be any endpoint of the interval $I_j(r)$ and $x_t(r)$ be an endpoint of $I_k(r)$. The distance between $x_s(r)$ and $x_t(r)$ is called exceptional if it satisfies the following three conditions:
  1. $p_j$ and $p_k$ are equidistant from $\ell$.
  2. $p_{j1} - p_{k1} = \alpha$ or $2\alpha$.
  3. Either $x_s(r) = a_j(r)$, $x_t(r) = a_k(r)$ or $x_s(r) = b_j(r)$, $x_t(r) = b_k(r)$.
Otherwise, it is non-exceptional. If $p_j$ and $p_k$ are equidistant from $\ell$, we have $a_j(r) - a_k(r) = b_j(r) - b_k(r) = p_{j1} - p_{k1}$ for all values of $r$. Thus, exceptional distances do not change when the radius $r$ changes.
Lemma 8.
Let $r$ be a feasible radius for the constrained $(n, p \wedge q, \alpha)$ problem. If no non-exceptional distance between two endpoints is equal to $0$, $\alpha$ or $2\alpha$, then the constrained $(n, p \wedge q, \alpha)$ problem has a feasible solution with radius less than $r$.
Proof. 
We show that there is a real number $0 < \epsilon < r$ such that the constrained problem has a feasible solution with covering radius $r - \epsilon$. To this end, we obtain a set of centers, $\bar{U}$, from a given standard solution $U$ and show that the set of balls of radius $r - \epsilon$ centered at the points of $\bar{U}$ is a feasible solution for the problem. First, we modify $U$ into a feasible solution with the property that any two consecutive blue and red centers at a non-exceptional distance are strictly more than $\alpha$ apart. Then, we use it to find a solution, $\bar{U}$, with covering radius $r - \epsilon$ ($\epsilon$ will be fixed later).
Let $U$ be a standard solution with covering radius $r$. For any two consecutive red and blue centers $u_i$ and $u_{i+1}$ that are both on endpoints and define a non-exceptional distance, we have $u_{i+1} - u_i > \alpha$, since $u_{i+1} - u_i \ge \alpha$ by feasibility and $u_{i+1} - u_i \ne \alpha$ by the lemma's assumption.
On the other hand, if $u_i$ is not on an endpoint, then $u_{i-1}$ and $u_{i+1}$ are on endpoints, say $u_{i-1} = x_s(r)$ and $u_{i+1} = x_t(r)$. In addition, $u_i$ has a color different from both $u_{i-1}$ and $u_{i+1}$. Let $A = \{u_i : u_i \text{ is not on an endpoint}, \; 2 \le i \le p + q - 1\}$. Note that for all $u_i \in A$, we have $u_i - u_{i-1} = \alpha$ (as arranged in Phase 2) and $u_{i+1} - u_i \ge \alpha$ (due to the separation constraint). By the lemma's assumption, if the distance between the endpoints $x_s(r)$ and $x_t(r)$ is non-exceptional, then $u_{i+1} - u_{i-1} \ne 2\alpha$, i.e., $u_{i+1} - u_{i-1} > 2\alpha$. This implies $u_{i+1} - u_i > \alpha$. Since $u_i$ is not on an endpoint, we can shift $u_i$ toward $u_{i+1}$ infinitesimally such that $u_i - u_{i-1} > \alpha$ and we still have $u_{i+1} - u_i > \alpha$. Similarly, we perturb all such $u_i \in A$ to obtain a new solution in which every non-exceptional distance defined by a red and a blue center is strictly greater than $\alpha$.
Now, we use this property to find a solution $\bar{U}$ with a covering radius less than $r$. We show that $r$ can be decreased without violating the feasibility constraints, i.e., none of the faces change, centers remain on their own faces (covering constraint) and the distance between any two consecutive red and blue centers is at least $\alpha$.
To compute $\bar{U}$, let $0 < \epsilon < r$ be a number to be fixed later. There are two cases for each $u_i \in U$:
Case 1: If $u_i$ is on an endpoint, say $x_{i,1}(r)$ (resp. $x_{i,2}(r)$), let $\bar{u}_i = x_{i,1}(r - \epsilon)$ (resp. $\bar{u}_i = x_{i,2}(r - \epsilon)$), i.e., it moves with the endpoint.
Case 2: If $u_i$ is not on an endpoint, then $u_{i-1}$ and $u_{i+1}$ are on endpoints, say $u_{i-1} = x_s(r)$ and $u_{i+1} = x_t(r)$, and also $u_i = u_{i-1} + \alpha$. If the distance between $x_s(r)$ and $x_t(r)$ is exceptional, let $\bar{u}_i = x_s(r - \epsilon) + \alpha$. If the distance between $x_s(r)$ and $x_t(r)$ is non-exceptional, let $\bar{u}_i = u_i$.
We want to find an $\epsilon > 0$ such that $\bar{U} = \{\bar{u}_1, \bar{u}_2, \ldots, \bar{u}_{p+q}\}$ is a feasible solution with covering radius $r - \epsilon$. To this end, we bound the displacements of the centers in $\bar{U}$ relative to their initial locations in $U$, such that $\bar{U}$ with covering radius $r - \epsilon$ is feasible.
Firstly, after decreasing $r$ to $r - \epsilon$, the relative order of the endpoints of the intervals should not change, i.e., the displacement of an endpoint of a face $F_i$ should be less than $\|F_i\|/2$, where $\|F_i\|$ is the distance between the endpoints of face $F_i$, which is nonzero by the lemma's assumption. This can be guaranteed by shifting by less than $\delta_1/2$, where $0 < \delta_1 < \min_{1 \le i \le 2n-1} \{\|F_i\|\}$.
Secondly, to satisfy the covering constraint, the centers belonging to $A$ should remain in their faces, i.e., for $u_i \in A$, $x_{i,1}(r - \epsilon) < \bar{u}_i < x_{i,2}(r - \epsilon)$. Regarding Case 1, the displacement of the point $x_{i,1}$ (resp. $x_{i,2}$) should be less than $(u_i - x_{i,1}(r))/2$ (resp. $(x_{i,2}(r) - u_i)/2$). Let $0 < \delta_2 < \min_{u_i \in A} \{u_i - x_{i,1}(r), \; x_{i,2}(r) - u_i\}$. The covering constraint can then be guaranteed by shifting by less than $\delta_2/2$.
Finally, to satisfy the separation constraint, for each successive pair of blue and red centers, $\bar{u}_{i+1}$ and $\bar{u}_i$, we should have $\bar{u}_{i+1} - \bar{u}_i \ge \alpha$. If both of these points are on endpoints, i.e., $u_i = x_s(r)$ and $u_{i+1} = x_t(r)$, we need $|x_s(r - \epsilon) - x_t(r - \epsilon)| \ge \alpha$. If the distance between $x_s(r)$ and $x_t(r)$ is exceptional with value $\alpha$, the equality $|x_s(r) - x_t(r)| = \alpha$ holds for all values of $r$. Otherwise (the distance is non-exceptional), the displacement of these endpoints should be less than $(|x_s(r) - x_t(r)| - \alpha)/2$. To satisfy the separation constraint, the displacement should be restricted to $\delta_3$, where $0 < \delta_3 < \min_{x_s(r), x_t(r)} \{|x_t(r) - x_s(r)| - \alpha\}$ and the minimum is taken over pairs $u_i = x_s(r)$ and $u_{i+1} = x_t(r)$ that have different colors and whose distance is non-exceptional.
If one of $u_i$ or $u_{i+1}$ is not on an endpoint, say $u_i$, then $u_{i-1}$ and $u_{i+1}$ are on endpoints, i.e., $x_s(r) = u_{i-1}$ and $x_t(r) = u_{i+1}$. If the distance between $x_s(r)$ and $x_t(r)$ is exceptional with a value of $2\alpha$, then $|x_s(r) - x_t(r)| = 2\alpha$ for all values of $r$. By Case 2, $|\bar{u}_i - x_s(r - \epsilon)| = |x_t(r - \epsilon) - \bar{u}_i| = \alpha$. Otherwise (the distance between $x_s(r)$ and $x_t(r)$ is non-exceptional), we have $\bar{u}_i = u_i$. We should have $|x_t(r - \epsilon) - \bar{u}_i| \ge \alpha$ and $|\bar{u}_i - x_s(r - \epsilon)| \ge \alpha$, i.e., the displacements of $x_t(r)$ and $x_s(r)$ should be less than $|x_t(r) - u_i| - \alpha$ and $|u_i - x_s(r)| - \alpha$, respectively. So, let $0 < \delta_4 < \min_{u_i \in A} \{|u_{i+1} - u_i| - \alpha, \; |u_i - u_{i-1}| - \alpha\}$ and restrict the displacement to $\delta_4$ to satisfy the separation constraint.
Therefore, by choosing numbers $\delta_1$, $\delta_2$, $\delta_3$, and $\delta_4$ as above, and $0 < \delta < \min\{\delta_1, \delta_2, \delta_3, \delta_4\}/2$, the continuity of the movement of the endpoints on the line $\ell$ yields a positive value of $\epsilon$ such that the displacement of every endpoint is at most $\delta$ when the radius decreases to $r - \epsilon$. Consequently, there exists an $\epsilon > 0$ such that the balls of radius $r - \epsilon$ centered at the points of $\bar{U}$ form a feasible solution. □
By Lemma 8, an optimal solution has at least one pair of endpoints at a non-exceptional distance of $0$, $\alpha$ or $2\alpha$. Since the interval endpoints $a_i(r)$ and $b_i(r)$ are given by $a_i(r) = p_{i1} - \sqrt{r^2 - \sum_{j=2}^{d} p_{ij}^2}$ and $b_i(r) = p_{i1} + \sqrt{r^2 - \sum_{j=2}^{d} p_{ij}^2}$, a candidate set for the optimal radius can be computed by solving the following equations for $r$, over all $1 \le i, k \le n$:
$$p_{i1} \pm \sqrt{r^2 - \sum_{j=2}^{d} p_{ij}^2} - p_{k1} \pm \sqrt{r^2 - \sum_{j=2}^{d} p_{kj}^2} = 0 \text{ or } \alpha \text{ or } 2\alpha,$$
where the distance between $p_i$ and $p_k$ is non-exceptional. Note that at least one of these equalities holds at the optimal radius, and these equations have a finite number of solutions. Solving them yields $O(n^2)$ candidates for the optimal radius. Thus, we have
Lemma 9.
There is a set of $O(n^2)$ candidates for the optimal radius $r^*$, and this set can be constructed in $O(n^2)$ time.
We compute the candidates for $r^*$ and perform a binary search over them using the feasibility testing algorithm to obtain the optimal radius. We have the following theorem.
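The candidate equations for a single pair of points can be solved in closed form by squaring once and then checking each root against the unsquared equation. The Python sketch below does this for one pair, assuming the line $\ell$ is the first coordinate axis as in the endpoint formulas above. The function name `candidate_radii`, the tolerance `tol`, and the decision to skip the degenerate case in which the required offset is zero are assumptions of this illustration; it also does not filter out exceptional distances.

```python
import math


def candidate_radii(pi, pk, gammas, tol=1e-9):
    """Radii r at which some endpoint of I_i(r) and some endpoint of I_k(r)
    are at distance gamma (for gamma in gammas), in either order."""
    hi = sum(x * x for x in pi[1:])  # squared distance of p_i from the line
    hk = sum(x * x for x in pk[1:])  # squared distance of p_k from the line
    delta = pk[0] - pi[0]
    found = set()
    for si in (-1, 1):          # -1: left endpoint a_i, +1: right endpoint b_i
        for sk in (-1, 1):      # same convention for p_k
            for gamma in gammas:
                for sign in (1, -1):
                    # With u = sqrt(r^2 - hi), v = sqrt(r^2 - hk):
                    # endpoint_k - endpoint_i = sign * gamma  <=>  sk*v - si*u = g.
                    g = sign * gamma - delta
                    if abs(g) < tol:
                        continue  # degenerate case, skipped in this sketch
                    # Squaring sk*v = g + si*u and using v^2 - u^2 = hi - hk:
                    u = (hi - hk - g * g) / (2 * g * si)
                    if u < 0:
                        continue
                    r2 = u * u + hi
                    if r2 <= 0 or r2 - hk < -tol:
                        continue
                    v = math.sqrt(max(r2 - hk, 0.0))
                    # Reject spurious roots introduced by squaring.
                    if abs(sk * v - si * u - g) < tol:
                        found.add(round(math.sqrt(r2), 9))
    return sorted(found)


print(candidate_radii((0, 3), (10, 4), [0, 3, 6]))
```

For the points $(0, 3)$ and $(10, 4)$ with $\alpha = 3$, the radius $5$ is among the candidates: the right endpoint of the first interval is then at $4$ and the left endpoint of the second at $7$.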
Theorem 3.
The constrained $(n, p \wedge q, \alpha)$ problem can be solved exactly in $O(n^2 + T_{DP}(n, p, q) \log n) = O(n^3 p q \log n)$ time.
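The search over candidate radii can be sketched as a standard binary search for the smallest feasible value. The sketch assumes that feasibility is monotone in $r$ (a larger radius can only help covering) and takes the feasibility test as a callable; the names `smallest_feasible` and `is_feasible` are hypothetical.

```python
def smallest_feasible(candidates, is_feasible):
    """Return the smallest candidate radius accepted by is_feasible, or None."""
    cand = sorted(candidates)
    lo, hi = 0, len(cand) - 1
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        if is_feasible(cand[mid]):
            best = cand[mid]   # feasible: try to find a smaller one
            hi = mid - 1
        else:
            lo = mid + 1       # infeasible: the optimum is larger
    return best


print(smallest_feasible([7, 1, 5, 3, 9], lambda r: r >= 4))
```

Only $O(\log n)$ calls to the feasibility test are made for $O(n^2)$ candidates.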

4.2. Line with Fixed Orientation in $\mathbb{R}^2$

In this section, we assume that only the orientation of the line $\ell$ is given in the plane. The algorithm can be easily extended to the general case of $\mathbb{R}^d$, even with $d > 2$, if the line is constrained to lie in a hyperplane parallel to the fixed orientation. To simplify notation and analysis, we assume $d = 2$. WLOG, we assume that $\ell$ is horizontal. Let $\ell_c$ be the horizontal line with $y$-intercept $c$. Let $\ell_{c^*}$ be the optimal horizontal line, with optimal radius $r^*_{c^*}$. Since $r^*_{c^*}$ is optimal, moving the line to a new location (say $c^* \pm \epsilon$) gives an optimal radius at the new location ($r^*_c$, where $c = c^* \pm \epsilon$) that is at least $r^*_{c^*}$. Since the interval endpoints $a_i(r)$ and $b_i(r)$ on the line $\ell_c$ are given by $a_i(r) = p_{i1} - \sqrt{r^2 - (c - p_{i2})^2}$ and $b_i(r) = p_{i1} + \sqrt{r^2 - (c - p_{i2})^2}$, a candidate set for the optimal radius and the horizontal line can be computed by solving a system of two equations of the following form for $r$ and $c$, over all $1 \le i, j \le n$:
$$p_{i1} \pm \sqrt{r^2 - (c - p_{i2})^2} - p_{j1} \pm \sqrt{r^2 - (c - p_{j2})^2} = 0 \text{ or } \alpha \text{ or } 2\alpha,$$
where the distance between $p_i$ and $p_j$ is non-exceptional. This idea can be used to show the following lemma, whose proof can be found in Appendix D.
Lemma 10.
There is a set of $O(n^4)$ candidates for the optimal radius and an optimal horizontal line, and this set can be constructed in $O(n^4)$ time.
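The interval endpoints on a horizontal line used in this section follow directly from the formulas for $a_i(r)$ and $b_i(r)$. A minimal Python sketch, with the hypothetical name `interval_on_line` and points given as `(x, y)` pairs:

```python
import math


def interval_on_line(point, r, c):
    """Endpoints (a, b) of the set of centers on the line y = c that cover `point`
    with radius r, or None if the ball of radius r cannot reach the point."""
    x, y = point
    h2 = r * r - (c - y) ** 2
    if h2 < 0:
        return None
    half = math.sqrt(h2)
    return (x - half, x + half)


print(interval_on_line((0, 3), 5, 0))
```

Moving the line closer to the point (for example, $c = 3$ for the point $(0, 3)$) widens the interval, which is why both $r$ and $c$ appear as unknowns in the candidate equations.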
The following is the main result; see Appendix E for a proof.
Theorem 4.
The constrained $(n, p \wedge q, \alpha)$ problem on a line with fixed orientation in the plane $\mathbb{R}^2$ can be solved exactly in $O(n^7 p q)$ time.

5. Conclusions

Through this research, we have made progress on the $(n, p \wedge q, \alpha)$ problem, which is relevant to fields and applications where clustering is needed. As mentioned before, in an $(n, p \wedge q, \alpha)$ problem, there are two types of facilities to locate in a city. These facilities may be two different brands of the same kind of service facility with comparable quality (such as Costco and Sam's Club). We need to cover each client with at least one facility within the smallest possible distance, but businesses need to be careful not to share their trade secrets. Moreover, depending on the type of business, being near competitors may offer customers so much choice that it becomes overwhelming, and sales may not happen. So, centers of different types should be at an admissible distance from each other.
This paper has considered a variant of the $k$-center clustering problem in $\mathbb{R}^d$, where the centers can be divided into two subsets, red and blue centers, such that each red center and each blue center must be a distance of at least some given $\alpha \ge 0$ apart. We have provided a bi-criteria approximation algorithm for the problem and a polynomial time algorithm for the constrained problem where all centers must lie on a given line. Additionally, we have presented a polynomial time algorithm for the case where only the orientation of the line is fixed in the plane ($d = 2$).
Future work includes removing the bi-criteria relaxation, i.e., finding an approximation for $r^*$ without relaxing the separation constraint or proving that no approximation is possible if it is not relaxed; improving the complexity of the constrained problem; and extending the approach to more than two colors to explore its potential applications.

Author Contributions

Conceptualization, M.E.; Methodology, M.E., B.B.K., N.K. and B.S.B.; Validation, N.K.; Writing—original draft, M.E. and N.K.; Writing—review & editing, B.B.K. and B.S.B.; Funding acquisition, M.E. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by Bahram Sadeghi Bigham.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proof of Lemma 1

Proof. 
By the guarantee of Gonzalez’s algorithm, we have that $P \subseteq \bigcup_{i=1}^{p+q} B(x_i, 2 r_{p+q}(P))$. As such, for the points $x_1, \ldots, x_t$, we can say $P \subseteq \bigcup_{i=1}^{t} B(x_i, 2 r_{p+q}(P) + 3\alpha/4)$, by the triangle inequality. (Adding more centers for the $t = 1$ case can only decrease the covering radius.) The covering radius is thus bounded by $2 r_{p+q}(P) + 3\alpha/4 \le 2r^* + 6r^* = 8r^*$, where the first inequality follows from our observation above, and the second follows from the assumption that $r^* \ge \alpha/8$. The claim is proved. □
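Gonzalez’s algorithm, invoked in the proof above, is the greedy farthest-first traversal. The sketch below is a generic Python version for points given as coordinate tuples; the function name `gonzalez` and the choice of the first input point as the initial center are illustrative assumptions.

```python
import math


def gonzalez(points, k):
    """Farthest-first traversal: return up to k centers and their covering radius."""
    centers = [points[0]]
    # dist[i]: distance from points[i] to its nearest chosen center.
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        far = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[far])
        dist = [min(d, math.dist(p, points[far])) for d, p in zip(dist, points)]
    return centers, max(dist)


print(gonzalez([(0, 0), (1, 0), (10, 0), (11, 0)], 2))
```

The returned radius is at most twice the optimal $k$-center radius, which is the guarantee used in the proof.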

Appendix B. Proof of Lemma 3

Proof. 
If the procedure returns True, it has clearly found a partition of the components $C_1, \ldots, C_m$ into two parts such that the components in the part marked with red centers can be covered by balls of radius $2R$ around the at most $p$ such centers. Similarly, the blue centers cover all the components marked with blue centers, with balls of radius $2R$. Each point of $P$ is in one of these components, so all of $P$ is covered. For the separation guarantee, notice that our algorithm returns centers that are points of the components themselves. However, any two points in different components are clearly separated by a distance of more than $3\alpha/4$; otherwise, those points would be in the same component of $G$. So, each red center and each blue center (or even two similarly colored centers in different components) are separated by more than $3\alpha/4$. □
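The separation argument relies on the connected components of the graph $G$. Assuming, as the proof suggests, that $G$ joins two points whenever their distance is at most $3\alpha/4$, the components can be computed with a small union-find structure. This is a quadratic-time Python sketch; the name `components` and the `threshold` parameter (to be set to $3\alpha/4$) are hypothetical.

```python
import math


def components(points, threshold):
    """Group points into connected components of the graph that joins
    two points whenever their distance is at most `threshold`."""
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, pt in enumerate(points):
        groups.setdefault(find(i), []).append(pt)
    return list(groups.values())


alpha = 2.0
print(components([(0, 0), (1, 0), (5, 0)], 3 * alpha / 4))
```

Any two points in different groups are more than `threshold` apart, which is exactly the property used for the separation guarantee.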

Appendix C. Proof of Lemma 5

Proof. 
The claim about the “return type” of the algorithm follows from Lemma 3 and the structure of the algorithm. If $r^* < \alpha/8$, then by Lemma 4, there exists an interpoint distance $D$ such that $D \le 2r^*$ and, for $D \le R$, the decision algorithm returns True along with centers satisfying the claimed separation guarantee. Since we check all interpoint distances as values for $R$ and return the centers for the smallest one that returns True, the given claim is true. Clearly, the covering balls have radius at most $2D \le 4r^*$ in this case. □

Appendix D. Proof of Lemma 10

Notice that Lemma 8 gives us a condition involving non-exceptional distances that must hold for the optimal radius $r_c^*$ for any given position of the line $\ell_c$. In our case, however, the optimal $c$ is not known; thus, we have two unknowns, $c$ and $r_c^*$. Roughly speaking, we show in the proof below that either two non-exceptional distances satisfy conditions similar to that of Lemma 8, or there is one such condition and the partial derivatives of a certain function do not exist. Thus, we obtain a set of two equations to determine $c$ and $r_c^*$. We show that these equations can be solved, which determines the candidate set of pairs $(r, c)$. The proof is somewhat tedious and technical, but it ultimately consists of simple algebraic geometry observations.
Proof. 
Assume that $(r, c)$ is a possible pair for the optimal position of the line $\ell_c$ and the radius $r$. First, we need to weed out positions of the line where there are no non-exceptional distances, since in this case, Lemma 8 is vacuous. In this extreme scenario, all pairs of points give rise to exceptional distances. By the definition of exceptional distances (see Section 4.1.2), this can only happen when all points are equidistant from the line, i.e., they lie on a horizontal line. Moreover, by the second condition of the definition, there can be at most three points in the set. Thus, in this case, we can solve the problem in $O(1)$ time and generate just one optimal pair to satisfy the conditions of the lemma. Thus, in the sequel, we assume that there are some non-exceptional distances, so Lemma 8 implies that non-trivial conditions must hold for some of them.
Suppose that some non-exceptional distance condition holds involving endpoints of $p_i$ and $p_j$. To express this condition, we need to fix (i) which of them ($p_i$ or $p_j$) has the smaller endpoint and which has the larger one, (ii) which of the endpoints (left or right) of $p_i$, $p_j$ are involved, and (iii) whether the distance is $0$, $\alpha$ or $2\alpha$. Let the function $F_{i,j,e,\gamma}(r, c)$ denote the condition, where the subscripts indicate that the endpoint of $p_i$ (resp. $p_j$) is the smaller (resp. larger) one, $e \in \{1, 2, 3, 4\}$ denotes the choice of endpoints (e.g., $e = 1$ could indicate that the left endpoints of both $p_i$ and $p_j$ are involved), and $\gamma \in \{0, \alpha, 2\alpha\}$ indicates the value of the non-exceptional distance. For example, one such condition could be:
$$F_{i,j,e,\gamma}(r, c) = p_{j1} - \sqrt{r^2 - (c - p_{j2})^2} - p_{i1} - \sqrt{r^2 - (c - p_{i2})^2} - \gamma = 0.$$
As Lemma 8 shows, at least one such condition, involving a non-exceptional distance, must hold true when $r = r_c^*$ for any $c$. We consider the set of all possible such conditions (even possibly involving exceptional distances) and observe that at least one such condition holds true. There are only $O(n^2)$ such conditions. As we show below, if $(r, c)$ is an optimal pair, either two of these equations are satisfied, or else one such condition, say $F_{i,j,e,\gamma}(r, c) = 0$, holds and both $\partial F_{i,j,e,\gamma}(r, c)/\partial r$ and $\partial F_{i,j,e,\gamma}(r, c)/\partial c$ do not exist (they encounter division by 0). In both cases, we will show that this leads either to impossible conditions (no candidates) or to two equations that can be solved to obtain candidates.
Notice that each of the involved functions $F_{i,j,e,\gamma}(r, c)$ is continuous. Suppose the point $(r, c)$ satisfies at most one such condition, say $F_{i,j,e,\gamma}(r, c) = 0$. For brevity, let us denote this by $F(r, c) = 0$. It turns out, as we will see below, that geometrically $F(r, c) = 0$ describes a certain hyperbola in the $(c, r)$ plane (think of $c$ on the $x$-axis and $r$ on the $y$-axis) restricted to the region $r > 0$. Now, there are two possible cases: (a) the partial derivatives $\partial F/\partial r$ or $\partial F/\partial c$ do not exist, or (b) they do exist. Suppose case (b) happens and, moreover, that $\partial F/\partial r \ne 0$ and $\partial F/\partial c \ne 0$. Imagine changing $c$ infinitesimally to $c \pm \epsilon$. Since the point $(r, c)$ for an optimal pair candidate must always satisfy at least one such condition, and $(r, c)$ satisfies $F(r, c) = 0$ while all the other functions are nonzero at $(r, c)$, none of the other conditions can be satisfied after an infinitesimal shift. Thus, $F(r, c) = 0$ continues to be satisfied under infinitesimal shifts. However, since $F(r, c) = 0$ then holds identically as $c$ changes position, we have that
$$dF = 0 \implies \frac{\partial F}{\partial r}\,dr + \frac{\partial F}{\partial c}\,dc = 0 \implies \frac{dr}{dc} = -\frac{\partial F/\partial c}{\partial F/\partial r}$$
since $\partial F/\partial r \ne 0$. Now, since $\partial F/\partial c \ne 0$, this means $dr/dc \ne 0$. As such, either shifting $c$ to $c + \epsilon$ or shifting $c$ to $c - \epsilon$, where $\epsilon > 0$ is infinitesimal, causes a decrease in $r$, contradicting optimality. Therefore, either $\partial F/\partial r = 0$ or else $\partial F/\partial c = 0$ in this case.
Alternatively, two such conditions, say $F_{i,j,e,\gamma}(r, c) = 0$ (denoted $F = 0$) and $F_{i',j',e',\gamma'}(r, c) = 0$ (denoted $F' = 0$), could be satisfied. However, again reasoning geometrically, if their partial derivatives exist and none is equal to 0, it is also necessary that their tangents differ in slope, with one of positive slope and the other of negative slope (otherwise, an argument similar to the one above would refute optimality). Since we know that the “curves” described by $F = 0$ and $F' = 0$ are really branches of hyperbolas, their tangents differing implies that they are linearly independent directions; moreover, the equations describing the hyperbolas, if written as $r^2 = \psi(c)$ and $r^2 = \psi'(c)$ for $F$ and $F'$, respectively, are different, i.e., $\psi(c)$ is a different quadratic polynomial from $\psi'(c)$.
Let us now see how we can simplify the condition $F = 0$ and the cases where the partial derivatives do not exist or where $\partial F/\partial r = 0$ or $\partial F/\partial c = 0$. Then, we will consider how to solve the equations resulting from our considerations above. The condition $F = 0$ can always be simplified to a condition of the form
$$r^2 = \psi(c),$$
where $\psi(c)$ is a quadratic function of $c$ with a positive coefficient for $c^2$. (This curve is a hyperbola.) There are several cases, but we consider only one of them, as the proofs for the others are similar. Suppose, for example, that
$$F(r, c) = p_{j1} - \sqrt{r^2 - (c - p_{j2})^2} - p_{i1} - \sqrt{r^2 - (c - p_{i2})^2} - \alpha = 0.$$
Then, simple manipulations (keep one of the square roots on the left-hand side, square, simplify, and square again to remove the remaining square root) lead to the claimed form. A condition such as $\partial F/\partial r = 0$ leads to a condition such as
$$\frac{r}{\sqrt{r^2 - (c - p_{i2})^2}} - \frac{r}{\sqrt{r^2 - (c - p_{j2})^2}} = 0.$$
Actually, in some cases (for different possible $F$) where the square roots both occur with a $+$ sign or both with a $-$ sign, it simply cannot be 0. Assuming that the square roots occur with different signs and the above condition holds, it means $c = (p_{i2} + p_{j2})/2$, provided $p_{i2} \ne p_{j2}$. Alternatively, $p_{i2} = p_{j2}$ is possible. In both cases, it can be verified that either we have an impossible condition $F(r, c) = 0$ (for example, where $p_{j1} - p_{i1} - \alpha \ne 0$ along with $p_{i2} = p_{j2}$), or else the pair of points $p_i$, $p_j$ defines an exceptional distance, and thus, such pairs can be discarded. The condition $\partial F/\partial c = 0$ leads to a condition such as
$$\frac{c - p_{i2}}{\sqrt{r^2 - (c - p_{i2})^2}} - \frac{c - p_{j2}}{\sqrt{r^2 - (c - p_{j2})^2}} = 0,$$
and it can be verified that this case can also be discarded safely after simplifying, by an analysis similar to the case $\partial F/\partial r = 0$. We conclude that the partials can never be zero at a candidate optimal point. Indeed, what we have shown is that $\partial F/\partial r = 0$ or $\partial F/\partial c = 0$ leads to certain conditions for exceptional distances. The above expressions for the partials also tell us that if the partials do not exist, then at least one condition such as $r^2 = (c - p_{i2})^2$ has to be satisfied. In fact, only one of the two can be satisfied: if both are satisfied, this again leads to $c = (p_{i2} + p_{j2})/2$ and thus to exceptional pairs or impossible conditions.
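The squaring step mentioned above can be spelled out for the example condition in which both square roots are subtracted. This is a sketch under that sign convention; the abbreviations $S_i$, $S_j$, $D$ and $L(c)$ are introduced here for illustration only.

$$
\begin{aligned}
S_i &= \sqrt{r^2 - (c - p_{i2})^2}, \qquad S_j = \sqrt{r^2 - (c - p_{j2})^2}, \qquad D = p_{j1} - p_{i1} - \alpha,\\
F(r, c) = 0 &\iff S_j = D - S_i,\\
S_j^2 - S_i^2 &= (c - p_{i2})^2 - (c - p_{j2})^2 = (p_{j2} - p_{i2})(2c - p_{i2} - p_{j2}) =: L(c),\\
S_j^2 = D^2 - 2 D S_i + S_i^2 &\implies S_i = \frac{D^2 - L(c)}{2D} \quad (D \ne 0),\\
r^2 &= (c - p_{i2})^2 + \frac{\left(D^2 - L(c)\right)^2}{4 D^2} = \psi(c).
\end{aligned}
$$

Since $L(c)$ is linear in $c$, $\psi(c)$ is quadratic in $c$ with leading coefficient $1 + (p_{j2} - p_{i2})^2 / D^2 > 0$, in line with the claimed form. Squaring can introduce spurious roots, so a solution of $r^2 = \psi(c)$ still has to be checked against $F(r, c) = 0$.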
Finally, let us discuss how the resulting equations can be solved to obtain candidate values for r , c . We discuss the two possible cases that remain following our analysis:
  • (A) $F(r, c) = 0$ and the partials $\partial F/\partial r$ or $\partial F/\partial c$ do not exist. From the above, this leads to a condition such as $r^2 = (c - p_{i2})^2$, but in fact, here, $r^2 \ne (c - p_{j2})^2$. It is easy to see that we can solve this equation together with $F(r, c) = 0$ to obtain $c$ and then $r$. One of the square roots in $F(r, c) = 0$ vanishes, and it can be verified that we can simplify the remaining equation to conclude either that there are no solutions or that an independent condition such as $r^2 = (c - p_{j2})^2 + D^2$ holds, where $D = p_{j1} - p_{i1} - \alpha$. Since $p_{j2} \ne p_{i2}$, this can be solved, along with the equation $r^2 = (c - p_{i2})^2$, for a value of $c$ (it leads to a linear equation for $c$).
  • (B) $F(r, c) = 0$ and $F'(r, c) = 0$. Again, this leads to equations of the form $r^2 = \psi(c)$ and $r^2 = \psi'(c)$, and these can be solved to obtain candidate values for $r$, $c$. The only bothersome issue is whether two different curves can lead to the same equation. Indeed, we cannot rule that out, but it turns out that in this case, we can ignore such a pair. This is because, from our previous comment about their tangents differing, they must lead to independent equations $r^2 = \psi(c)$ and $r^2 = \psi'(c)$ (otherwise, either the positive $r$ branches do not intersect at all, or the intersecting branches of the hyperbolas completely coincide and so would their tangent lines). Therefore, the involved equations use different functions $\psi(c)$, $\psi'(c)$, and thus, they lead to finitely many solutions for $c$.
Overall, from cases (A) and (B) above, we obtain only $O(n^4)$ possible candidate pairs: case (A) gives $O(n^2)$, while case (B) gives $O(n^4)$ possible candidate pairs. □

Appendix E. Proof of Theorem 4

Proof. 
By Lemma 10, we can obtain a set of $O(n^4)$ possible pairs $(r, c)$, where $c$ determines the line and $r$ is the candidate optimal radius for $\ell_c$. Thus, it remains to test feasibility for each pair and choose the smallest $r$ that works. This can be done for each pair in $O(n^3 p q)$ time by Theorem 2. Thus, the total time taken is $O(n^7 p q)$. □
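The selection step in this proof amounts to filtering the candidate pairs by feasibility and keeping the one with the smallest radius. A minimal Python sketch, where `best_pair` and the feasibility callable `is_feasible(r, c)` are hypothetical names standing in for the decision procedure of Theorem 2:

```python
def best_pair(candidate_pairs, is_feasible):
    """Return the feasible (r, c) pair with the smallest r, or None if none is feasible."""
    feasible_pairs = [(r, c) for r, c in candidate_pairs if is_feasible(r, c)]
    # Tuples compare by r first, so min picks the smallest feasible radius.
    return min(feasible_pairs, default=None)


print(best_pair([(5, 0), (3, 1), (4, 2)], lambda r, c: r >= 4))
```

With $O(n^4)$ pairs and one $O(n^3 p q)$ feasibility test per pair, this loop accounts for the $O(n^7 p q)$ bound.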

References

  1. Megiddo, N.; Supowit, K.J. On the complexity of some common geometric location problems. SIAM J. Comput. 1984, 13, 182–196.
  2. Sylvester, J.J. A question in the geometry of situation. Quart. J. Math. 1857, 322, 79.
  3. Megiddo, N. Linear time algorithms for linear programming in $\mathbb{R}^3$ and related problems. SIAM J. Comput. 1983, 12, 759–776.
  4. Hwang, R.Z.; Lee, R.C.T.; Chang, R.C. The slab dividing approach to solve the Euclidean p-center problem. Algorithmica 1993, 9, 1–22.
  5. Agarwal, P.K.; Procopiuc, C.M. Exact and Approximation Algorithms for Clustering. Algorithmica 2002, 33, 201–226.
  6. Gonzalez, T.F. Clustering to minimize the maximum intercluster distance. Theor. Comput. Sci. 1985, 38, 293–306.
  7. Bandyapadhyay, S.; Friggstad, Z.; Mousavi, R. Parameterized approximation algorithms for k-center clustering and variants. In Proceedings of the AAAI Conference on Artificial Intelligence, Online, 22 February–1 March 2022; Volume 36, pp. 3895–3903.
  8. Jia, X.; Sheth, K.; Svensson, O. Fair colorful k-center clustering. Math. Program. 2022, 192, 339–360.
  9. Drexler, L.; Eube, J.; Luo, K.; Röglin, H.; Schmidt, M.; Wargalla, J. Connected k-Center and k-Diameter Clustering. arXiv 2022, arXiv:2211.02176.
  10. Brass, P.; Knauer, C.; Na, H.S.; Shin, C.S.; Vigneron, A. The Aligned k-center Problem. Int. J. Comp. Geom. Appl. 2011, 21, 157–178.
  11. Hurtado, F.; Sacristán, V.; Toussaint, G. Constrained Facility Location. Stud. Locat. Anal. Spec. Issue Comput. Geom. 2000, 15, 17–35.
  12. Bose, P.; Langerman, S.; Roy, S. Smallest enclosing circle centered on a query line segment. In Proceedings of the 20th Canadian Conference on Computational Geometry, Montreal, QC, Canada, 13–15 August 2008.
  13. Bose, P.; Toussaint, G. Computing the Constrained Euclidean Geodesic and Link Center of a Simple Polygon with Applications. In Proceedings of the Computer Graphics International, Pohang, Republic of Korea, 24–28 June 1996; pp. 102–112.
  14. Das, G.; Roy, S.; Das, S.; Nandy, S. Variations of Base-Station Placement Problem on the Boundary of a Convex Region. Int. J. Found. Comput. Sci. 2008, 19, 405–427.
  15. Roy, S.; Bardhan, D.; Das, S. Efficient Algorithm for Placing Base Stations by Avoiding Forbidden Zone. In Proceedings of the International Conference on Distributed Computing and Intelligent Technology, Bhubaneswar, India, 22–24 December 2005; pp. 105–116.
  16. Shin, C.S.; Kim, J.H.; Kim, S.K.; Chwa, K.Y. Two-Center Problems for a Convex Polygon. In Proceedings of the 6th Annual European Symposium, Venice, Italy, 24–26 August 1998; pp. 199–210.
  17. Shokouhifar, M.; Jalali, A. Optimized sugeno fuzzy clustering algorithm for wireless sensor networks. Eng. Appl. Artif. Intell. 2017, 60, 16–25.
  18. Langari, R.K.; Sardar, S.; Mousavi, S.A.A.; Radfar, R. Combined fuzzy clustering and firefly algorithm for privacy preserving in social networks. Expert Syst. Appl. 2020, 141, 112968.
  19. Kavand, P.; Mohades, A.; Eskandari, M. (n,1,1,α)-center problem. AUT J. Model. Simul. 2014, 46, 57–64.
  20. Ahn, H.K.; Kim, S.S.; Knauer, C.; Schlipf, L.; Shin, C.S.; Vigneron, A. Covering and piercing disks with two centers. Comput. Geom. 2013, 46, 253–262.
  21. Eskandari, M.; Khare, B.; Kumar, N. Separated Red Blue Center Clustering. In Proceedings of the 32nd International Symposium on Algorithms and Computation (ISAAC 2021), Fukuoka, Japan, 6–8 December 2021; pp. 41:1–41:13.
  22. Sha, Y. Improved Separated Red-Blue Center Clustering. In Proceedings of the 28th International Computing and Combinatorics Conference, Shenzhen, China, 22–24 October 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 561–572.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
