A New Optimization Model for the Sustainable Development: Quadratic Knapsack Problem with Conflict Graphs

New information technology constantly improves the efficiency of social networks. Using optimization and decision models in the context of large data sets attracts extensive attention. This paper investigates a novel mathematical model for designing and optimizing environmental economic policies in a protection zone. The proposed model is referred to as the quadratic knapsack problem with conflict graphs, which is a new variant of the knapsack problem family. Because the problem has a highly complex structure, we develop a metaheuristic based on large neighborhood search to solve it efficiently. The proposed method embeds a construction procedure into a sophisticated neighborhood search. More precisely, the construction procedure finds a starting solution, while the neighborhood search generates and explores the solution space around that starting solution. To assess the model, we evaluate it on a set of complex benchmark data sets. The results show that the investigated algorithm is competitive and efficient compared to legacy algorithms.


Introduction
Nowadays, the evolution of information technology gives us the opportunity to design reasonable policies through the use of large data sets (see, e.g., Lee et al. [1]). Such technologies constantly improve the efficiency of social networks. Using optimization and decision models in social planning has become a major challenge (see, e.g., Lee, Kim and Kim [2], Li and Lin [3], Shim and Park [4]). This paper presents a computational model for designing environmental economic policies for sustainable development. As global ecological and environmental problems become increasingly severe, people's lives and social development face serious constraints and impacts. Governments have proposed policies for protection zones, such as China's Ecological Red-line policies, to protect regional economic and social development as well as regional ecological safety. We hypothesize that a new model can optimize such policies through the use of large data sets drawn from stakeholders' services and public crowdsourcing. Such a model can be considered a new member of the knapsack problem family: the Quadratic Knapsack Problem with Conflict Graphs (QKPCG).
An instance of QKPCG is composed of a knapsack of capacity $c$, a set $I$ of $n$ items, and a set $E$ of incompatible couples of items (i.e., $E \subseteq \{(i, j) \in I \times I,\ i < j\}$). Each item $i \in I$ is associated with a positive weight $w_i$ and a positive profit $p_i$. Each compatible couple of items $(i, j)$, where $(i, j) \notin E$ and $i < j$, is associated with a positive profit $p_{ij}$, which means that an additional profit is earned when both items $i$ and $j$ are placed into the knapsack. The objective of QKPCG is to maximize the total profit of the items placed into the knapsack under the capacity constraint, where all selected items must be mutually compatible. The quadratic program related to QKPCG can be defined by Equations (1)-(3):
$$
\begin{aligned}
(QP_{QKPCG})\quad \max\ & \sum_{i \in I} p_i x_i + \sum_{i \in I} \sum_{j \in I,\, j > i} p_{ij} x_i x_j && (1)\\
\text{s.t. } & \sum_{i \in I} w_i x_i \le c && (2)\\
& x_i + x_j \le 1 \quad \forall (i, j) \in E && (3)\\
& x_i \in \{0, 1\} \quad \forall i \in I.
\end{aligned}
$$
The decision variable $x_i$, $\forall i \in I$, equals 1 if the $i$-th item is selected and 0 otherwise. Inequality (2) represents the capacity constraint and Inequalities (3) denote the disjunctive constraints, which ensure the compatibility of all items placed in the knapsack.
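To make the model concrete, the following minimal Python sketch evaluates objective (1) for a candidate selection and checks the capacity and disjunctive constraints. The function name `qkpcg_value`, the dictionary used to store the pairwise profits, and the toy numbers are illustrative assumptions, not part of the original formulation.

```python
from itertools import combinations


def qkpcg_value(selected, p, q, w, c, conflicts):
    """Return the objective (1) of a selection, or None if it is infeasible.

    selected  : iterable of item indices placed in the knapsack
    p[i]      : linear profit of item i
    q[(i, j)] : pairwise profit for i < j (missing pairs count as 0)
    w[i]      : weight of item i
    c         : knapsack capacity
    conflicts : set of incompatible pairs (i, j) with i < j (the set E)
    """
    items = sorted(set(selected))
    # Capacity constraint (2).
    if sum(w[i] for i in items) > c:
        return None
    # Disjunctive constraints (3): no two selected items may be in conflict.
    if any((i, j) in conflicts for i, j in combinations(items, 2)):
        return None
    linear = sum(p[i] for i in items)
    quadratic = sum(q.get((i, j), 0) for i, j in combinations(items, 2))
    return linear + quadratic


# Toy instance (made-up numbers): 4 items, capacity 10, items 0 and 1 conflict.
p = [10, 8, 6, 4]
w = [5, 4, 3, 2]
q = {(0, 2): 3, (0, 3): 1, (1, 2): 2, (1, 3): 2, (2, 3): 5}
E = {(0, 1)}

print(qkpcg_value([0, 2, 3], p, q, w, 10, E))  # 10 + 6 + 4 + 3 + 1 + 5 = 29
print(qkpcg_value([0, 1], p, q, w, 10, E))     # None: items 0 and 1 conflict
```

Returning `None` for infeasible selections keeps the feasibility check and the objective in one place, which is convenient when a search procedure has to compare many candidate selections.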
Against the background of sustainable development, the program of QKPCG ($QP_{QKPCG}$) can be used to establish a reasonable economic development strategy in an environmental protection zone. In $QP_{QKPCG}$, the objective is to optimize the gain of the development plan under limited natural resources without undermining the stability of natural biotic systems. More precisely, assume that a development plan is composed of $n$ different components, where each item of QKPCG corresponds to a component. The related decision variable $x_i = 1$ if the $i$-th component is applied; otherwise $x_i = 0$. A component of the plan can be defined as follows: a natural resource is used in an investment. Therefore, the objective function (cf. Formula (1)) measures the total reward of all performed investments, while the capacity constraint (cf. Inequality (2)) ensures that the effect of exploiting natural resources cannot exceed the fixed ecological boundary. Furthermore, the quadratic term in the objective function (i.e., $p_{ij} x_i x_j$) represents the additional profit obtained by combining investments $i$ and $j$. To make the model more realistic, we introduce the disjunctive constraints (cf. Inequalities (3)). A disjunctive constraint models the fact that two different investments cannot share the same natural resource.
QKPCG is an NP-hard problem (see, e.g., Garey and Johnson [5]). It reduces to the Quadratic Knapsack Problem (QKP) (see, e.g., [6,7]) when the constraints (3) are dropped and to the Knapsack Problem with Conflict Graphs (KPCG) (see, e.g., Pferschy and Schauer [8]) when $p_{ij} = 0$ for all $i \neq j$, $i, j = 1, \ldots, n$. Note that, in the literature, the KPCG is also referred to by Yamada, Kataoka and Watanabe [9] as the Disjunctively Constrained Knapsack Problem (DCKP). Due to the complexity of the QKP and the KPCG, few studies on exact methods for these two problems have appeared in the literature. Most recent results on these topics are based on metaheuristics, which focus on providing high-quality solutions for large, complex instances (see, e.g., [10-12]).
The rest of the paper is organized as follows. Section 2 discusses the starting solution procedure, which provides an initial solution for $QP_{QKPCG}$. Section 3 introduces an efficient metaheuristic for improving the solution at hand. Section 4 evaluates the performance of the proposed approach on a group of benchmark data sets. Finally, Section 5 summarizes the paper.

A Starting Solution Procedure for QKPCG
This section describes a starting solution procedure, denoted 2PH, which is based on solving $QP_{QKPCG}$ in successive stages. The procedure is composed of two phases: the first phase determines a feasible solution, and the second phase tries to improve that solution by exploring its neighborhood. Such a procedure has also been used in [11,12] for approximately solving the KPCG. Unless noted otherwise, we assume that all items are sorted in decreasing order of their profit-per-weight ratio (i.e., $p_i / w_i$, $\forall i = 1, \ldots, n$).

An Integer Linear Program for the QKPCG
To reduce the computational effort caused by the quadratic term, we apply a classic linearization of the QKPCG proposed by Glover and Woolsey [13]. The resulting integer linear program associated with the QKPCG, denoted $ILP_{QKPCG}$, can be defined as follows.
$$
\begin{aligned}
(ILP_{QKPCG})\quad \max\ & \sum_{i \in I} p_i x_i + \sum_{i \in I,\, j \in I,\, j > i} p_{ij} y_{ij}\\
\text{s.t. } & \sum_{i \in I} w_i x_i \le c && (4)
\end{aligned}
$$
In view of the complexity of $ILP_{QKPCG}$, 2PH produces solutions efficiently by determining a feasible solution of the QKPCG while considering the quadratic term and the constraints (4)-(7) successively.
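For orientation, a linearization in the style of Glover and Woolsey replaces each product $x_i x_j$ with a binary variable $y_{ij}$ that is bounded above by both $x_i$ and $x_j$. The program below is a sketch of the general shape such a linearized model would take for the QKPCG; it is an assumption based on the standard technique, and the order and numbering of its constraints may differ from constraints (4)-(7) of $ILP_{QKPCG}$.

```latex
\begin{align*}
\max\ & \sum_{i \in I} p_i x_i + \sum_{i \in I} \sum_{j \in I,\, j > i} p_{ij}\, y_{ij} \\
\text{s.t. } & \sum_{i \in I} w_i x_i \le c, \\
& x_i + x_j \le 1 && \forall (i, j) \in E, \\
& y_{ij} \le x_i, \quad y_{ij} \le x_j && \forall i, j \in I,\ j > i, \\
& x_i \in \{0, 1\},\ y_{ij} \in \{0, 1\} && \forall i, j \in I,\ j > i.
\end{align*}
```

Because every $p_{ij}$ is positive and the objective is maximized, an optimal solution sets $y_{ij} = 1$ whenever $x_i = x_j = 1$, so the two upper bounds are enough to make $y_{ij}$ behave like the product $x_i x_j$.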

The First Phase
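At the first phase, an instance of QKPCG is reduced to an instance of KPCG, in which the quadratic term is not taken into account. The corresponding integer linear program of KPCG can be formally written as follows:
$$
\begin{aligned}
(ILP_{KPCG})\quad \max\ & \sum_{i \in I} p_i x_i\\
\text{s.t. } & \sum_{i \in I} w_i x_i \le c && (8)\\
& x_i + x_j \le 1 \quad \forall (i, j) \in E && (9)\\
& x_i \in \{0, 1\} \quad \forall i \in I.
\end{aligned}
$$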
From $QP_{QKPCG}$, we can observe that a feasible solution of KPCG is also feasible for QKPCG. This is because $ILP_{KPCG}$ ignores only the additional profit earned when two items are chosen together, but still respects the capacity constraint (8) and the disjunctive constraints (9). Therefore, a feasible solution of QKPCG can be computed by successively solving two optimization problems: a weighted independent set problem ($ILP_{WIS}$), extracted from $ILP_{KPCG}$ by eliminating the capacity constraint (8), and a binary knapsack problem ($ILP_K$) restricted to the independent set obtained from $ILP_{WIS}$. First, $ILP_{WIS}$ is solved to provide an independent set. Second, $ILP_K$ is solved on that independent set to compute a feasible solution of $ILP_{KPCG}$, which is also feasible for $QP_{QKPCG}$. Let $IS$ ($IS \subseteq I$) be a feasible solution of $ILP_{WIS}$. Then $ILP_{WIS}$ maximizes $\sum_{i \in I} p_i x_i$ subject only to the disjunctive constraints (9), while $ILP_K$ maximizes $\sum_{i \in IS} p_i x_i$ subject only to the capacity constraint $\sum_{i \in IS} w_i x_i \le c$.
Algorithm 1 displays the general idea for computing a feasible solution of $ILP_{KPCG}$. At Step 1, $IS$ is initialized as the empty set, which is a trivial solution of $ILP_{WIS}$. The loop from Step 2 to Step 6 builds $IS$ iteratively from the unselected items of $I$. At each iteration, the item with the highest profit-per-weight ratio is added to $IS$, while the selected item and its incompatible partners are immediately dropped from $I$. The loop stops when no more items can be added to $IS$. Steps 7-12 compute a feasible solution for $ILP_{KPCG}$. One first checks whether the current solution $IS$ respects the capacity constraint (8) of $ILP_{KPCG}$. If so, $IS$ is feasible for KPCG and hence for QKPCG. Otherwise, $ILP_K$ restricted to $IS$ is solved to produce a solution satisfying the capacity constraint (8). To achieve the best performance, $ILP_K$ is solved with the exact algorithm of Martello, Pisinger and Toth [14].
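Algorithm 1 Determine a feasible solution for $ILP_{KPCG}$
Input: An instance $I$ of $ILP_{KPCG}$.
Output: A feasible solution $S_{KPCG}$ for $ILP_{KPCG}$.
1: Initialization: set $IS = \emptyset$ and $I = \{1, \ldots, n\}$;
2: while $I$ is not empty do
3:     Let $i$ be the item of $I$ with the highest ratio of profit per weight;
4:     Add $i$ into $IS$: $IS = IS \cup \{i\}$;
5:     Eliminate $i$ and all its incompatible items $j$ such that $(i, j) \in E$ from $I$;
6: end while
7: Generate $ILP_K$ defined on the items of $IS$;
8: if $IS$ respects the capacity constraint (8) of $ILP_{KPCG}$ then
9:     Set $S_{KPCG}$ as $IS$;
10: else
11:     Set $S_{KPCG}$ as the optimal solution of $ILP_K$;
12: end if
13: return $S_{KPCG}$.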
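The steps of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions: weights and capacity are integers, the knapsack subproblem is solved with a textbook dynamic program rather than the exact algorithm of Martello, Pisinger and Toth [14], ties between equal ratios are broken by the smallest index, and the names `first_phase` and `knapsack_dp` are hypothetical.

```python
def first_phase(p, w, c, conflicts):
    """Greedy independent set, then a 0-1 knapsack restricted to that set."""
    n = len(p)
    neighbours = {i: set() for i in range(n)}
    for i, j in conflicts:
        neighbours[i].add(j)
        neighbours[j].add(i)

    # Steps 1-6: repeatedly take the remaining item with the best p_i / w_i
    # ratio and discard it together with its incompatible items.
    remaining = set(range(n))
    independent = []
    while remaining:
        i = max(remaining, key=lambda k: (p[k] / w[k], -k))
        independent.append(i)
        remaining -= neighbours[i] | {i}

    # Steps 7-12: keep IS if it fits, otherwise solve the knapsack on IS.
    if sum(w[i] for i in independent) <= c:
        return sorted(independent)
    return knapsack_dp(independent, p, w, c)


def knapsack_dp(items, p, w, c):
    """Exact 0-1 knapsack by dynamic programming (integer weights assumed)."""
    # best[cap] = (profit, chosen items) using total weight at most cap.
    best = [(0, [])] * (c + 1)
    for i in items:
        # Descending capacities ensure each item is used at most once.
        for cap in range(c, w[i] - 1, -1):
            cand = best[cap - w[i]][0] + p[i]
            if cand > best[cap][0]:
                best[cap] = (cand, best[cap - w[i]][1] + [i])
    return sorted(best[c][1])


# Toy instance (made-up numbers): items 0 and 1 are incompatible.
p = [12, 8, 6, 3]
w = [5, 4, 3, 2]
E = {(0, 1)}
print(first_phase(p, w, 10, E))  # [0, 2, 3]: the independent set fits
print(first_phase(p, w, 8, E))   # [0, 2]: knapsack restricted to {0, 2, 3}
```

In the second call the independent set {0, 2, 3} weighs 10 and exceeds the capacity of 8, so the knapsack step keeps the most profitable subset that fits.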

The Second Phase
Recall that, at the first phase, an instance of QKPCG is reduced to an instance of KPCG, in which the quadratic term is temporarily ignored. The resulting solution may therefore be poor for QKPCG, because the objective function of $ILP_{QKPCG}$ is not optimized. The aim of the second phase is to improve the quality of the solution obtained at the first phase (i.e., the solution $S_{KPCG}$ yielded by Algorithm 1) by means of an iterative local search. The local search improves a given solution by alternating a building procedure and an exploring procedure. The building procedure generates a $k$-neighborhood of $S_{KPCG}$ by freeing the values of $k$ fixed variables of the solution vector related to $S_{KPCG}$. The exploring procedure explores the generated neighborhood, which can be viewed as a reduction of the original solution space, to determine a local optimum. The second phase stops when no further improvement occurs.
Algorithm 2 shows how the proposed local search operates. Let $S_{QKPCG}$ be the solution obtained by applying Algorithm 1 and let $\alpha$ be a constant. The loop from Step 2 to Step 6 alternately generates and explores a series of neighborhoods in order to improve the starting solution. At Step 3, the current best solution, namely $S^{\star}_{QKPCG}$, is replaced by $S_{QKPCG}$ if the latter has a better value. Step 4 frees the values of $\alpha$ variables in $S_{QKPCG}$, where the items with the highest degree are favored. The degree of an item is the number of items incompatible with it. Let $i$ be the item of highest degree with $x_i = 1$ in $S_{QKPCG}$. At each iteration, $x_i$ and its incompatible partners are set free. The incompatible items related to $x_i$ are the variables $x_j$ such that $(i, j) \in E$ and $(j, k) \notin E$, where $k$ ($k \neq i$) denotes the index of a variable fixed to 1 in $S_{QKPCG}$. At Step 5, $S_{QKPCG}$ is updated with the local optimum found by exploring the current neighborhood, where the neighborhood corresponds to a subproblem of the QKPCG. In other words, $ILP_{QKPCG}$ is solved on a reduced instance of QKPCG. Finally, Algorithm 2 stops when no better solution can be achieved and returns the best solution $S^{\star}_{QKPCG}$ derived from the starting solution.

Algorithm 2 A local search for improving a given solution
Input: $S_{QKPCG}$, a feasible solution of the QKPCG.
Output: $S^{\star}_{QKPCG}$, a local optimum solution of the QKPCG.
1: Initialize $S^{\star}_{QKPCG}$ as a trivial solution, where all variables equal 0;
2: while $S_{QKPCG}$ is better than $S^{\star}_{QKPCG}$ do
3:     Set $S^{\star}_{QKPCG}$ as $S_{QKPCG}$;
4:     Generate a neighborhood of $S_{QKPCG}$ by freeing $\alpha$ fixed variables;
5:     Explore the current neighborhood to compute a local optimum solution: $S_{QKPCG}$;
6: end while
7: return $S^{\star}_{QKPCG}$.
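A minimal Python sketch of this local search is given below. It is not the authors' implementation: the exploration of each neighborhood is done by brute-force enumeration of the freed items, as a stand-in for solving $ILP_{QKPCG}$ on the reduced instance, so it is only practical for very small neighborhoods. The name `local_search`, the dictionary of pairwise profits `q`, and the toy numbers are assumptions made for illustration.

```python
from itertools import combinations


def local_search(start, p, q, w, c, conflicts, alpha=1):
    """Improve `start` by repeatedly freeing and re-optimizing a few items."""
    nb = {i: set() for i in range(len(p))}
    for i, j in conflicts:
        nb[i].add(j)
        nb[j].add(i)

    def objective(items):
        """Value of a selection, or None if capacity or a conflict is violated."""
        items = sorted(items)
        if sum(w[i] for i in items) > c:
            return None
        pairs = list(combinations(items, 2))
        if any(j in nb[i] for i, j in pairs):
            return None
        return sum(p[i] for i in items) + sum(q.get(pr, 0) for pr in pairs)

    best, best_val = frozenset(), 0                  # Step 1: all variables at 0
    current = frozenset(start)
    current_val = objective(current)
    if current_val is None:
        raise ValueError("the starting solution must be feasible")
    while current_val > best_val:                    # Step 2
        best, best_val = current, current_val        # Step 3
        # Step 4: free the alpha selected items of highest degree, plus those
        # of their incompatible items that do not conflict with the fixed ones.
        freed = sorted(best, key=lambda i: (-len(nb[i]), i))[:alpha]
        fixed = best - set(freed)
        free = set(freed)
        for i in freed:
            free |= {j for j in nb[i] if not (nb[j] & fixed)}
        # Step 5: explore the neighborhood (here: enumerate every subset).
        for size in range(len(free) + 1):
            for extra in combinations(sorted(free), size):
                cand = fixed | set(extra)
                val = objective(cand)
                if val is not None and val > current_val:
                    current, current_val = frozenset(cand), val
    return sorted(best), best_val


# Toy instance (made-up numbers): pairing items 1 and 2 is very profitable.
p = [12, 8, 6, 3]
w = [5, 4, 3, 2]
q = {(0, 2): 1, (0, 3): 1, (1, 2): 10, (1, 3): 1, (2, 3): 1}
E = {(0, 1)}
print(local_search([0, 2], p, q, w, 8, E))  # ([1, 2], 24)
```

In this toy run the starting selection {0, 2} has value 19. Freeing item 0 also frees its incompatible partner 1, and swapping 0 for 1 captures the pairwise profit of items 1 and 2, which a first phase that ignores the quadratic term cannot see.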

A Neighborhood Search-Based Metaheuristic
In operations research, neighborhood search is a common technique for improving a given feasible solution or repairing a given infeasible one. A special case of neighborhood search, named Large Neighborhood Search (LNS), was first introduced by Shaw [15] for solving large-scale instances of the vehicle routing problem. Like local search, LNS is based on the concept of building and exploring neighborhoods. The difference between the two approaches is that a descent method may cause the solution procedure to stagnate in certain local optima, whereas LNS increases the probability of finding better solutions by exploring a series of more promising solution spaces. This is because, in the local search described in Section 2.3 (cf. Algorithm 2), both building and exploring adopt a single criterion for removing or inserting items. To increase the chance of escaping from the current local optimum, a random building strategy, based on removing items according to their profit-per-weight values, is applied. Algorithm 3 determines a random neighborhood of a given solution, and Algorithm 4 summarizes the global solution procedure for computing a high-quality solution of $ILP_{QKPCG}$.

Algorithm 3 Remove $\beta|I|$ variables of $S_{QKPCG}$
Input: $S_{QKPCG}$, a starting solution of $ILP_{QKPCG}$.
Output: An independent set $IS$ and a reduced instance $I_r$ of $ILP_{QKPCG}$.
1: Set $cnt = 0$, $I_r = \emptyset$ and $IS$ to the independent set related to $S_{QKPCG}$;
2: Sort the items of $IS$ in increasing order of their values of profit per weight;
3: while $cnt < \beta|I|$ do
4:     Let $r$ be a random real number in $[0, 1]$ and $i = |IS| \times r^{\gamma}$;
5:     Remove the $i$-th item from $IS$, add $i$ to $I_r$ and set $cnt = cnt + 1$;
6:     for all $j$ such that $(i, j) \in E$ do
7:         if item $j$ is compatible with all items of $IS$ then
8:             Add $j$ to $I_r$ and set $cnt = cnt + 1$;
9:         end if
10:     end for
11: end while
12: return $IS$ and $I_r$.
The building procedure described in Algorithm 3 randomly removes at least $\beta|I|$ fixed variables from the current solution. Consequently, Algorithm 3 builds a reduced problem of an instance of $ILP_{QKPCG}$, defined on the set of removed variables. In contrast to the deterministic building procedure applied in the local search (see Algorithm 2), the random building procedure (cf. Algorithm 3) explores the solution space of $ILP_{QKPCG}$ at random. The three parameters $(\beta, r, \gamma)$ control the variety of the sub solution spaces generated by Algorithm 3.
For example, if the values of $\beta$ and $r^{\gamma}$ are both small (resp. large), there is little (resp. great) chance of visiting a diversified solution space.
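The random removal step of Algorithm 3 can be sketched in Python as follows. Several details are assumptions made for illustration: the position $|IS| \times r^{\gamma}$ is rounded down to obtain a valid 0-based list index, the loop also stops when the independent set becomes empty, an item is never counted twice, and the name `destroy` and its argument list are hypothetical.

```python
import random


def destroy(solution, p, w, neighbours, n_items, beta, gamma, rng):
    """Randomly free items of `solution`; return (kept items, freed items)."""
    # Step 2: sort by increasing profit-per-weight ratio.
    kept = sorted(solution, key=lambda i: p[i] / w[i])
    freed = set()
    cnt = 0
    # Step 3 (the `kept` test is an added guard against an empty set).
    while cnt < beta * n_items and kept:
        # Step 4: r lies in [0, 1); for gamma > 1, r ** gamma is skewed
        # towards 0, so items at the front of the list (lowest ratios)
        # are drawn more often.
        r = rng.random()
        idx = min(int(len(kept) * r ** gamma), len(kept) - 1)
        i = kept.pop(idx)                      # Step 5
        freed.add(i)
        cnt += 1
        for j in neighbours[i]:                # Steps 6-10
            if j not in freed and not (neighbours[j] & set(kept)):
                freed.add(j)
                cnt += 1
    return kept, freed


# Toy instance (made-up numbers): items 0 and 1 are incompatible.
neighbours = {0: {1}, 1: {0}, 2: set(), 3: set()}
kept, freed = destroy([0, 2, 3], [12, 8, 6, 3], [5, 4, 3, 2], neighbours,
                      n_items=4, beta=0.5, gamma=3, rng=random.Random(0))
print(kept, freed)
```

Passing an explicit `random.Random` instance makes a run reproducible for a fixed seed, which helps when comparing parameter settings for $\beta$ and $\gamma$.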

Algorithm 4 A neighborhood search-based metaheuristic
Input: An instance $I$ of $ILP_{QKPCG}$.
Output: $S^{\star}_{QKPCG}$, a local optimum of $ILP_{QKPCG}$.
1: Apply successively Algorithms 1 and 2 to compute a feasible solution of QKPCG, denoted $S_{QKPCG}$;
2: Set $S^{\star}_{QKPCG}$ as a trivial solution, where all variables equal 0;
3: while the runtime limit $NSBM_{time}$ is not reached do
4:     Apply Algorithm 3 to determine $IS$ and $I_r$ associated with $S_{QKPCG}$;
5:     Apply Algorithm 1 with $I_r$ to compute a new solution $S_{QKPCG}$;
6:     Apply Algorithm 2 to determine $S_{QKPCG}$;
7:     Update $S^{\star}_{QKPCG}$ if $S_{QKPCG}$ is better;
8: end while
9: return $S^{\star}_{QKPCG}$.
The complete version of the investigated neighborhood search-based metaheuristic is described in Algorithm 4. It first applies Algorithms 1 and 2 in succession to determine an initial solution for $ILP_{QKPCG}$. It then iteratively applies Algorithms 3, 1 and 2 to explore the solution space of $ILP_{QKPCG}$ at random. Finally, when the runtime limit is reached, Algorithm 4 returns the best solution at hand.
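The control flow of Algorithm 4 can be sketched as a generic, time-limited loop in Python. The sketch is deliberately abstract: `construct`, `improve`, `destroy` and `value` are hypothetical callables standing in for Algorithm 1, Algorithm 2, Algorithm 3 and the objective function, and applying the removal step to the current solution rather than to the best one is an assumption.

```python
import time


def nsbm(construct, improve, destroy, value, time_limit):
    """Time-limited destroy-and-rebuild loop in the spirit of Algorithm 4.

    construct(partial) : builds a solution; partial is None on the first call
    improve(solution)  : local search applied to a solution
    destroy(solution)  : returns the partial state handed back to construct
    value(solution)    : objective value used to compare solutions
    time_limit         : runtime limit in seconds
    """
    current = improve(construct(None))         # Step 1: Algorithms 1 and 2
    # Step 2 is folded in: the first solution serves as the initial incumbent.
    best = current
    deadline = time.monotonic() + time_limit
    while time.monotonic() < deadline:         # Step 3
        partial = destroy(current)             # Step 4: Algorithm 3
        current = improve(construct(partial))  # Steps 5-6: Algorithms 1 and 2
        if value(current) > value(best):       # Step 7
            best = current
    return best
```

Keeping the four procedures as parameters separates the search strategy from the problem-specific code, so the same loop can be reused with different building or exploring procedures.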

Simulation Results
For sustainable development in an environmental protection zone, the design and optimization of economic policies and key restraining factors have traditionally been analyzed with multi-index comprehensive evaluation. The index system is composed of several aspects, e.g., ecological economy, eco-environment, eco-living, ecological culture and ecological institution. Drawing on index systems built by different researchers, government planning and reports, and crowdsourcing focuses, more factors can be listed to indicate the level of each aspect. The problem of computing the index systems can be addressed with the investigated Neighborhood Search-Based Metaheuristic (NSBM). To simulate the computation of large index-system data, this section examines the performance of the investigated NSBM on a group of benchmark data sets, denoted 1qkpcg-9qkpcg. These instances are generated following the structure used by Billionnet and Soutif [16] for the quadratic terms and the structure used by Yamada, Kataoka and Watanabe [9] for the disjunctive constraints. NSBM is coded in C++ and all tests are performed on an Intel Core i5 at 3.1 GHz.
The proposed data set contains 45 instances with different numbers of items and densities. The number of items varies in {100, 150, 200}, while the density of the conflict graph varies in {2%, 4%, 8%}. The structure of the newly generated benchmark problems is described in Table 1.
Table 2 displays the parameter values chosen in our experiment, which yielded the best performance for NSBM on the adopted benchmark problems. Table 3 shows the objective values achieved by NSBM and GLPK (version 4.60). Column 1 of Table 3 displays the names of the data sets. Column 2 reports the objective values of the best solutions provided by GLPK in 3600 s, denoted $V_{GLPK}$. Columns 3-11 report the solution information for the different runtime limits (i.e., 50, 100 and 200 s), where Column $V_{Mean}$ (resp. $V_{Best}$) denotes the mean (resp. maximum) objective value achieved by NSBM over five trials within the corresponding time limit and Column $T_{Best}$ displays the runtime needed to reach the best objective value. Table 4 displays the percentage improvement in the objective value.
From Tables 3 and 4, we can observe that NSBM outperforms the GLPK solver, which is based on one of the best exact algorithms: the branch-and-cut algorithm (see [17]). GLPK attains an average objective value of 12,012.38 in 3600 s, whereas NSBM reaches higher values within much shorter runtimes. For the three time limits considered, the average of the mean objective values ($V_{Mean}$) reached by NSBM over five trials is better than the average of the best objective values computed by GLPK: 14,924.63 for the time limit of 50 s, 15,050.58 for 100 s and 15,194.50 for 200 s. With the first runtime limit (i.e., 50 s), if we consider only the best solutions returned by NSBM over five independent trials, NSBM improves on the best solutions computed by GLPK in 44 cases and fails for only one instance. When the time limit is extended to 100 s (resp. 200 s), the best solutions computed by NSBM over five trials become more robust. For these last two time limits, NSBM dominates GLPK on all instances. Furthermore, the computational effort required by NSBM is much lower than that consumed by the GLPK solver.
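As a rough check on the averages quoted above, the short Python sketch below computes the relative gap between the average $V_{Mean}$ of NSBM and the average value of GLPK. The formula $100 \times (V_{NSBM} - V_{GLPK}) / V_{GLPK}$ is an assumption about how a percentage improvement is usually defined; the gap between averages generally differs from the average of per-instance gaps, so these figures need not coincide with the entries of Table 4.

```python
def pct_improvement(v_new, v_ref):
    """Relative improvement of v_new over v_ref, in percent."""
    return 100.0 * (v_new - v_ref) / v_ref


v_glpk = 12012.38  # average objective value of GLPK quoted in the text
nsbm_means = [(50, 14924.63), (100, 15050.58), (200, 15194.50)]
for limit, v_mean in nsbm_means:
    print(f"{limit:>3} s: {pct_improvement(v_mean, v_glpk):.2f}%")
```

With the averages quoted in the text, the gaps come out at about 24.2%, 25.3% and 26.5% for the 50 s, 100 s and 200 s limits.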

Conclusions
This paper investigated a new mathematical model for designing and optimizing economic policies in an environmental protection zone. To solve the problem efficiently, we proposed a metaheuristic that generates and explores a series of reduced solution spaces. The proposed method embeds an efficient starting solution procedure into a sophisticated neighborhood search. The starting solution procedure solves the original problem by successively ignoring the constraints and the quadratic term. The neighborhood search builds and explores a series of neighborhoods from the starting solution. The performance of the proposed approach was evaluated on a group of benchmark data sets and compared with an integer programming solver, GLPK. The results show that the method provides high-quality solutions within reasonable runtime.

Table 1. Description of the benchmark instances.

Table 2. Parameter settings used to run NSBM.

Table 3. Performance of NSBM vs. GLPK on the benchmark instances.

Table 4. Percentage improvement of NSBM vs. GLPK on the benchmark instances.