Autonomous Parameter Balance in Population-Based Approaches: A Self-Adaptive Learning-Based Strategy

Population-based metaheuristics can be seen as a set of agents that smartly explore the space of solutions of a given optimization problem. These agents are commonly governed by movement operators that decide how the exploration is driven. Although metaheuristics have been used successfully for more than 20 years, performing rapid and high-quality parameter control is still a main concern. For instance, deciding on a population size that yields a good balance between quality of results and computing time is consistently a hard task, even more so in the presence of an unexplored optimization problem. In this paper, we propose a self-adaptive strategy based on on-line population balance, which aims to improve the performance and search process of population-based algorithms. The design behind the proposed approach relies on three different components. Firstly, an optimization-based component defines all metaheuristic tasks related to solving the optimization problems. Secondly, a learning-based component focuses on transforming dynamic data into knowledge in order to influence the search in the solution space. Thirdly, a probabilistic-based selector component is designed to dynamically adjust the population. We illustrate an extensive experimental process on large instance sets from three well-known discrete optimization problems: the Manufacturing Cell Design Problem, the Set Covering Problem, and the Multidimensional Knapsack Problem. The proposed approach is able to compete against classic, autonomous, and IRace-tuned metaheuristics, yielding interesting results and potential future work regarding dynamically adjusting the number of solutions interacting at different times within the search process.


Introduction
Metaheuristics (MH) correspond to a heterogeneous family of algorithms, and multiple classifications have been proposed, such as single-solution, population-based, and nature-inspired [1]. In addition, it is well known that the tuning of their key components, such as movement operators, stochastic elements, and parameters, can be the objective of multiple improvements in order to achieve better performance. In this regard, dynamically adjusting parameters such as the population size is an important topic for the scientific community, which focuses its research on the use of population-based approaches to solve hard optimization problems. This parameter can be considered one of the most transversal issues to be defined in population-based algorithms. Moreover, it has been argued that it can be the most difficult parameter to settle in MH [2].
Moreover, the real impact of the number of agents has rarely been addressed and has been shown to depend on several scenarios [3]: for instance, variants designed to perform in a particular study case, approaches designed to perform on a specific application, and approaches designed to tackle high-dimensional problems. Thus, in order to ease the arduous task of controlling this parameter, we propose a novel self-adaptive strategy, which aims to dynamically balance the number of agents by analyzing the dynamic data generated at run-time while solving different discrete optimization problems. In the literature, this kind of strategy has gained a solid foothold in the optimization field, and in particular in evolutionary algorithms, where the general ideas include convergence optimization, global search improvements, and high affinity with parallelism, among others [4][5][6][7]. For instance, it is well known that the harmony search algorithm has drawbacks such as falling into local optima and premature convergence. However, these issues have been tackled by improvements to its internal components, data management, parameter tuning, and search process [8][9][10]. The well-known issue that persists to this day concerns the proposal of tailored solutions designed to perform under certain conditions, constrained to a defined environment, tackling a clear objective and a specific problem, or even specific instances of a problem [3]. Moreover, a large number of MH have been proposed in the literature. However, a recurrent scenario is that most advances and improvements proposed in the state of the art focus on well-known algorithms, such as the works regarding the population size and other parameters in Particle Swarm Optimization (PSO) [11][12][13]. In this context, the proposed approach, named Learning-based Linear Population (LBLP), aims to improve the performance achieved by a pure population-based algorithm through the incorporation of a learning-based component that balances the agents at run-time. In addition, the interaction between MH and Machine Learning (ML) has attracted massive attention from the scientific community given the great results yielded in their respective fields [14][15][16][17][18].
The design proposed in this work includes the definition of three components, which are mainly based on ideas and techniques from the optimization and ML fields [19]. The first component focuses on the management of major issues concerning population-based tasks, such as generation of the initial population, intensification, diversification, and binarization. In this first attempt, we employ the Spotted Hyena Optimizer (SHO) algorithm [20], which has proved to be a good option for solving optimization problems [21][22][23][24][25][26]. Regarding the second component, the main objective is the management of the dynamic data generated. This component includes two major tasks: the management of the data-structures behind LBLP and the management of the learning-based method. The learning process is carried out by a statistical modeling method based on multiple linear regression. In this context, the process in control of the population size is influenced by the knowledge generated through this process. The third proposed component concerns the management of the parameters and agents used by LBLP while carrying out the solving process. In this regard, three major tasks are performed through the search process: the selection mechanism, the control of probabilities, and the increase/decrease of solutions within the population. The objective of the selection mechanism is the proper choice of a population size to perform for a certain number of iterations at run-time. The design behind the mechanism follows a Monte Carlo simulation strategy. The second task concerns the control of parameters such as the probabilities employed. Finally, the third task carries out the generation and the removal of solutions.
In order to test the performance and prove the viability of our proposed hybrid approach, we solve three well-known optimization problems, namely the Manufacturing Cell Design Problem (MCDP) [27], the Set Covering Problem (SCP) [28], and the Multidimensional Knapsack Problem (MKP) [29]. The comparison is carried out in a three-step experimentation phase. Firstly, we carry out a performance comparison against reported results yielded by competitive state-of-the-art algorithms. Secondly, we compare against a pure implementation of SHO assisted by IRace, which is a well-known parameter tuning method [30]. Thirdly, we compare the results obtained by the pure implementation of SHO vs. our proposed hybrid. Finally, we present the experimental results and discussion, where the proposed LBLP achieves good performance, proving to be a good and competitive option to tackle hard optimization problems. The main contributions and strong points of the proposal can be described as follows.

• Robust self-adaptive hybrid approach capable of tackling hard optimization problems;
• Online-tuning/Control of a key issue in population-based approaches: Adapting population size on run-time;
• The hybrid approach successfully solved multiple hard optimization problems: In the experimentation phase, great results were achieved solving the MCDP, SCP, and MKP by employing a unique set of configuration values;
• Scalability in the first component designed: This work proved great adaptability given to the employed population-based algorithm. This allows the incorporation of several movement operators from different population-based algorithms in order to be instantiated by the approach to perform (parallel approach);
• Scalability in the third designed component: This work demonstrated significant benefits derived from the dynamic data generated through the search. The proposed design allows for the incorporation of different techniques, such as multiple supervised and deep learning methods.
The rest of this paper is organized as follows. The related work is introduced in Section 2. In Section 3 we illustrate the proper background in order to fully understand the proposed work and optimization problems solved. The proposed hybrid approach is explained in Section 4. Section 5 illustrates the experimental results. Finally, we conclude and suggest some lines of future research in Section 6.

Related Work
The proposed self-adaptive strategy has been designed through the interaction of multiple components from the optimization and machine learning fields. In the literature, proposals of this kind are known as hybrid approaches, which aim to incorporate knowledge from data and experience into the search process while solving a given problem. This line of investigation has received noteworthy attention from the scientific community, and multiple taxonomies have been reported [19,31].
Preliminary works placing machine learning at the service of optimization methods have been a trendy approach in recent years. In Ref. [32], a hybrid approach composed of TS and a Support Vector Machine (SVM) was proposed. The objective was to design an approach capable of tackling hard combinatorial optimization problems, such as the Knapsack Problem (KP), the Set Covering Problem (SCP), and the Traveling Salesman Problem (TSP). The proposed hybrid defined decision rules from a corpus of solutions generated in a random fashion, which were used to predict high-quality solutions for a given instance and lead the search. However, the complexity behind the designed approach is a key factor, and the authors highlight the arduous and time-consuming tasks involved, such as acquiring the knowledge needed to build the corpus and extracting the classification rules. In addition, more recent hybrids that integrate self-adaptive strategies in their process have been receiving significant attention given the achieved results. In Ref. [9], an ensemble learning model focused on detecting fake news was proposed. This hybrid includes an off-line process and an online process. The authors proposed the incorporation of a self-adaptive harmony search in the off-line process in order to modify the weights of four defined training models based on different CNN versions. However, issues such as computational complexity, resource usage, and solutions being tailored to a specific objective persist.
The objective behind the proposed approach concerns the improvement in the performance of a pure population-based algorithm based on the proper control of parameters [33]. This work places emphasis on the population size value, which is well known for being a key parameter defined by all swarm-based approaches. In this regard, similar objectives can be observed in Refs. [22,34], where the authors proposed two hybrids that follow the same objective. The approaches employ Autonomous Search (AS) to assist the MH Human Behavior-Based Optimizer (HBBO) and SHO, respectively. AS is described as a reactive process that lets the solvers automatically reconfigure their parameters in order to improve when poor performance is detected. Nevertheless, data-driven hybrids that follow the same objective are scarce. Currently, the body of research related to this work focuses on classification, clustering, and data mining techniques. In this context, the authors in Ref. [35] proposed a hybrid framework based on the Co-evolutionary Genetic Algorithm (CGA) supported by machine learning. They employed inductive learning methods on a group of elite populations in order to influence the population with lower fitness. Their objective was to achieve a constant evolution of agents with low performance through the search. The learning process was carried out by the C4.5 and CN2 algorithms in order to perform the classification. Regarding data mining-based approaches, in Ref. [36], a hybrid version of ant colony optimization that incorporates a data mining module to guide the search process was proposed. Regarding the usage of clustering-based methods, Streichert et al. [37] proposed the clustering-based niching method. The main objective behind the proposed approach was to identify multiple global and local optima in a multimodal search space. The model is employed over well-known evolutionary algorithms, and its aim was the preservation of diversity using a multi-population approach.
The proposed hybrid draws inspiration from multiple ideas, described as follows. Firstly, we propose a hybrid approach that is capable of solving different optimization problems. In addition, the main objective concerns the design of a self-adaptive strategy in order to dynamically adjust and control a key parameter of population-based approaches, the population size. In this regard, we found only a small number of published works that focus their efforts on this issue. In the literature, the objective of most proposed works concerns the tuning of parameters, where the values are adjusted before the execution of the algorithm, usually through a number of previous runs. In contrast, the proposed work adjusts the parameter values on-the-fly. In this context, a control method chooses a set of values for the optimization algorithm to perform with for a given amount of time. The performance achieved is measured, so the control method is able to know how good that choice was. These steps are repeated, and the aim is to maximize the chances of success by making the best decisions in the optimization process. On the other hand, although there is a clear presence of well-known learning-based methods such as clustering and classification, regression analysis is hardly employed, leaving out highly promising models that can tackle the presented issue. Lastly, we highlight the promising results obtained by solving three different hard optimization problems, which are illustrated in Section 5. In this regard, most proposed approaches in the literature are problem-oriented. Nevertheless, one of our objectives is for the proposed approach to be nature-friendly for the given problem. Thus, the presented results illustrate a promising contribution to the field.

Background
In this section, we review essential topics needed in order to fully understand the proposed hybrid. Firstly, the main features of population-based methods are presented, followed by the description of the employed SHO algorithm. Secondly, a detailed description of the problems solved in this work is given.

Metaheuristics
MHs can be described as general-purpose methods that have great capabilities to tackle optimization problems [38]. This heterogeneous family of algorithms has been the focus of several works as a consequence of their attractive features, such as the capability to tackle hard optimization problems in a finite computational time, achieving close-to-optimal solutions [39]. In the literature, subgroups of this family have been identified according to different criteria based on the features of the proposed algorithms. Firstly, single-solution algorithms were designed to carry out the transformation of a single solution during the search. Well-known examples are local search [40], simulated annealing [41], etc. On the other hand, population-based algorithms focus on the transformation of multiple solutions during the search. In this context, all the agents/solutions in the population interact with each other and evolve. Well-known algorithms are the shuffled frog leaping algorithm [42], ant colony optimization [43], gray wolf optimizer [44], and so on. Another big family of proposed algorithms consists of nature-inspired approaches. They are born as metaphors that define their behaviors on the basis of nature. Examples include the genetic algorithm [45], memetic algorithm [46], and differential evolution [47]. Additionally, an inverse phenomenon can be described for non-natural algorithms, such as the imperialist competitive algorithm [48], and several subgroups of algorithms designed from multiple fields, such as music, physics, and so on. However, all the proposed algorithms in this heterogeneous family share common concepts in their design, such as ideas, components, parameters, and so on.

Spotted Hyena Optimizer
In this work, we employ the SHO algorithm, which is a population-based MH that follows clustering ideas in its operation and has proved to be a good option for solving optimization problems. The main concept behind this algorithm is the social relationship between spotted hyenas and their collaborative behavior, and it was originally designed to optimize constrained and unconstrained design problems. Regarding the description and equations of the movement operators, at the beginning, encircling prey is applied. The objective is to update the position of each agent towards the current best candidate solution in the population. In order to carry out the perturbation on each agent, we employ Equations (1) and (2). In (1), $D_h$ is the distance between the current agent ($P$) and the current best agent in the population ($P_p$). In addition, in Equation (2), we compute the update of the current agent. In both equations, $B$ and $E$ correspond to coefficient vectors; they are computed as illustrated in Equations (3) and (4), where $rd_1$ and $rd_2$ are random vectors in $[0, 1]$.
The second movement employed is named hunting. The main objective is to influence the decision regarding the next position of each agent, and the main idea is to compose a cluster towards the current best agent. In order to carry out this movement, we employ Equations (6)-(8). In (6) and (7), $D_h$ represents the distance, $P_h$ represents the current best agent in the population, and $P_k$ the current agent being updated. Equation (8) illustrates the data-structure that contains the clustered population, where $N$ indicates the number of agents.

$$C_h = P_k + P_{k+1} + \dots + P_{k+N} \quad (8)$$

Attacking the prey is the third movement employed. This operator concerns the exploitation of the search space. In (9), each agent belonging to the cluster $C_h$, generated in (8), will be updated.
The fourth movement concerns the performance of a passive exploration. The proposed SHO uses $B$ and $E$ as coefficients with random values to force the agents to move far away from the current best agent in the population. This mechanism improves the global search of the approach. Additionally, SHO was initially designed to work on a continuous space. In order to tackle the MCDP, SCP, and MKP, a transformation of domain is needed, and this process is illustrated in the next subsection.
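The movement operators described above can be sketched in a few lines. The equations themselves are not reproduced in this text, so the sketch below assumes the commonly cited SHO definitions: $D_h = |B \cdot P_p - P|$, $P_{new} = P_p - E \cdot D_h$, $B = 2 \cdot rd_1$, $E = 2h \cdot rd_2 - h$, with $h$ decreasing linearly from 5 to 0, and an attack step that averages the cluster. The function names are illustrative and not taken from the paper.

```python
import random

def coefficient_vectors(iteration, max_iter, dim, rng):
    """Return (B, E, h) using the assumed SHO coefficient definitions."""
    h = 5.0 - iteration * (5.0 / max_iter)                 # assumed linear decrease 5 -> 0
    B = [2.0 * rng.random() for _ in range(dim)]           # B = 2 * rd1
    E = [2.0 * h * rng.random() - h for _ in range(dim)]   # E = 2h * rd2 - h
    return B, E, h

def encircle(agent, best, B, E):
    """Encircling prey: D = |B * best - agent|, new position = best - E * D."""
    return [p_best - e * abs(b * p_best - p)
            for p, p_best, b, e in zip(agent, best, B, E)]

def attack(cluster):
    """Attacking prey: average the clustered positions, i.e. C_h / N."""
    n = len(cluster)
    return [sum(column) / n for column in zip(*cluster)]

rng = random.Random(0)
best = [1.0, -2.0, 0.5]
agent = [0.0, 0.0, 0.0]
B, E, h = coefficient_vectors(iteration=10, max_iter=100, dim=3, rng=rng)
moved = encircle(agent, best, B, E)
centre = attack([moved, best])
```

When the magnitude of $E$ is large the agent is pushed away from the best solution (exploration), while small magnitudes pull it closer (exploitation), which matches the passive exploration described above.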

Domain Transfer
In the literature, continuous population-based MH have proved to be very effective in tackling several highly complex optimization problems [49]. Currently, the increase in complexity of modern binary industrial problems has posed new challenges to the scientific community, which has ended up proposing continuous methods as potential options to tackle this domain. Examples include the Binary Bat Algorithm [50], PSO [51], Binary Salp Swarm Algorithm [52], Binary Dragonfly [53], and Binary Magnetic Optimization Algorithm [54], among others [55][56][57]. In order to carry out the transformation, binarization strategies have been proposed [51]. In this regard, a well-known strategy is the two-step binarization scheme, which, as the name implies, is composed of a two-step process where transformation and binarization are performed. Firstly, transfer functions were introduced to the field in Ref. [58] with the aim of giving a probability between 0 and 1 while employing low computational resources. Thus, transfer functions, illustrated in Table 1, are applied to the values generated by the movement operator of the continuous MH. This process maps these values into the range between 0 and 1. Secondly, binarization is carried out, which discretizes the output values from the first step. This process decides which binary value (0 or 1) is selected. In this regard, classic methods have been described as follows:

1. Standard: If the condition is satisfied, the standard method returns 1, otherwise it returns 0.
2. Complement: If the condition is satisfied, the complement method returns the complement value.
3. Static probability: A probability is generated and evaluated with a transfer function.
4. Elitist Discretization: The Elitist Roulette method, also known as Monte Carlo, consists of selecting randomly among the best individuals of the population, with a probability proportional to their fitness.
In this work, the two-step strategy employed consists of the transfer function V 4 and the elitist discretization.
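The two-step scheme can be illustrated with a minimal sketch. Table 1 is not reproduced in this text, so the form of $V_4$ below, $|\tfrac{2}{\pi}\arctan(\tfrac{\pi}{2}x)|$, is the commonly used V-shaped definition and should be read as an assumption. The elitist rule is also simplified here: it copies the bit of a single best agent instead of running a fitness-proportional roulette over several elite individuals.

```python
import math
import random

def v4(x):
    """V-shaped transfer function (assumed form of V4): |2/pi * atan(pi/2 * x)|."""
    return abs((2.0 / math.pi) * math.atan((math.pi / 2.0) * x))

def standard(prob, rng):
    """Standard discretization: 1 if a uniform draw falls below the probability."""
    return 1 if rng.random() < prob else 0

def elitist(prob, best_bit, rng):
    """Simplified elitist rule (assumption): copy the best agent's bit on success, else 0."""
    return best_bit if rng.random() < prob else 0

def binarize(position, best_binary, rng):
    """Step 1: transfer each continuous value; step 2: discretize it."""
    return [elitist(v4(x), b, rng) for x, b in zip(position, best_binary)]

rng = random.Random(1)
bits = binarize([0.0, 3.2, -7.5, 0.4], [1, 1, 0, 1], rng)
```

Because $V_4$ is symmetric around zero, large moves in either direction yield a high probability of adopting the best agent's bit, while values near zero leave the bit at 0 under this simplified rule.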

Optimization Problems
In this subsection we illustrate a detailed explanation of the three optimization problems tackled by our proposed LBLP.

Manufacturing Cell Design Problem
The Manufacturing Cell Design Problem (MCDP) [59] is a classical optimization problem that finds application in manufacturing lines. In this regard, the MCDP consists of organizing a manufacturing plant or facility into a set of cells, each of them made up of different machines meant to process different parts of a product that have similar characteristics. The main objective is to minimize the movement and exchange of material between cells in order to reduce production costs and increase productivity. The optimization model is stated as follows. Let $M_{max}$ be the maximum number of machines per cell. As the objective function, we selected the minimization of the number of times that a given part must be processed by a machine that does not belong to the cell to which the part has been assigned. Let:

The problem is represented by the following mathematical model:

In this work, we solved a set of 35 instances from different authors. Each instance has its own configuration: the number of machines ranges from 5 to 40, the number of parts from 7 to 100, and so on. For this experiment, each tested instance was executed 30 times.
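The objective described above can be evaluated directly from a machine-part incidence matrix. The following sketch assumes a representation that is not given in this text: `incidence[i][j]` is 1 when part `j` requires machine `i`, and `machine_cell` and `part_cell` hold the cell index assigned to each machine and part. All names are hypothetical.

```python
def mcdp_cost(incidence, machine_cell, part_cell):
    """Count operations where a part needs a machine located in a different cell."""
    moves = 0
    for i, row in enumerate(incidence):        # i indexes machines
        for j, needed in enumerate(row):       # j indexes parts
            if needed and machine_cell[i] != part_cell[j]:
                moves += 1
    return moves

def respects_capacity(machine_cell, m_max):
    """Check that no cell holds more than m_max machines."""
    counts = {}
    for cell in machine_cell:
        counts[cell] = counts.get(cell, 0) + 1
    return all(count <= m_max for count in counts.values())

# Toy instance: 3 machines, 4 parts, 2 cells.
incidence = [
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
machine_cell = [0, 0, 1]
part_cell = [0, 0, 1, 1]
cost = mcdp_cost(incidence, machine_cell, part_cell)
feasible = respects_capacity(machine_cell, m_max=2)
```

In this toy instance only part 3 visits a machine outside its cell (machine 1), so a single inter-cell movement is counted.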

Set Covering Problem
The set covering problem (SCP) is one of Karp's well-known 21 NP-complete problems, where the goal is to find a subset of columns in a 0-1 matrix so that they cover all the rows of the matrix at minimum cost. Several applications of the SCP can be seen in the real world, for instance, bus crew scheduling [60], location of emergency facilities [61], and vehicle routing [62]. The formal definition is presented as follows. Let $A = (a_{ij})$ be an $m \times n$ binary matrix and $C = (c_j)$ a positive $n$-dimensional vector, where each element $c_j$ of $C$ gives the cost of selecting column $j$ of matrix $A$. If $a_{ij}$ is equal to 1, then row $i$ is covered by column $j$; otherwise it is not. The goal of the SCP is to find a minimum-cost set of columns in $A$ such that each row in $A$ is covered by at least one column. A mathematical definition of the SCP can be expressed as follows:

$$\min \sum_{j=1}^{n} c_j x_j$$

$$\text{subject to} \quad \sum_{j=1}^{n} a_{ij} x_j \geq 1, \quad \forall i \in \{1, \dots, m\}$$

$$x_j \in \{0, 1\}, \quad \forall j \in \{1, \dots, n\}$$

where $x_j$ is 1 if column $j$ is in the solution, otherwise it is 0. The constraint ensures that each row $i$ is covered by at least one column. In this work, we solved 65 different instances, which have been organized into 11 sets extracted from Beasley's OR-Library. The employed instances were pre-processed in order to reduce their size and complexity. In this context, multiple pre-processing methods have been proposed in the literature for the SCP [63]. In this work, we used two of them, which have proved to be the most effective: Column Domination (CD) and Column Inclusion (CI). Firstly, CD concerns the case in which the set of rows $L_j$ covered by column $j$ is also covered by another column $j'$ with $c_{j'} < c_j$. We then say that column $j$ is dominated by column $j'$, and column $j$ is removed from the solution. Secondly, CI applies when a row is covered by only one column after CD is applied. This means that there is no better column to cover that row, and therefore this column must be included in the optimal solution. For this experiment, each test instance was executed 30 times.
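As a small illustration of the definitions above, the sketch below evaluates the cost and feasibility of a candidate SCP solution and applies Column Domination in the single-column form described in the text. The matrix layout (a list of rows) and the function names are assumptions made for this example.

```python
def scp_cost(costs, x):
    """Total cost of the selected columns."""
    return sum(c for c, xj in zip(costs, x) if xj)

def is_cover(a, x):
    """True when every row has at least one selected column covering it."""
    return all(any(aij and xj for aij, xj in zip(row, x)) for row in a)

def dominated_columns(a, costs):
    """Columns whose rows are all covered by one strictly cheaper column."""
    n = len(costs)
    rows_of = [{i for i, row in enumerate(a) if row[j]} for j in range(n)]
    dominated = set()
    for j in range(n):
        for k in range(n):
            if k != j and costs[k] < costs[j] and rows_of[j] <= rows_of[k]:
                dominated.add(j)
                break
    return dominated

a = [
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
]
costs = [2, 3, 4, 5]
x = [1, 1, 0, 0]
total = scp_cost(costs, x)
covered = is_cover(a, x)
removable = dominated_columns(a, costs)
```

Here column 3 covers only row 0, which the cheaper column 0 also covers, so column 3 can be discarded before the search starts.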

Multidimensional Knapsack Problem
Multidimensional Knapsack Problem (MKP) is NP-hard and can be considered as the generalized form of the classic Knapsack Problem (KP).The goal of MKP is to search for a subset of given objects that maximize the total profit while satisfying all constraints on resources.In addition, the KP is a widely-used problem with real-world applications in diverse fields including cryptography, allocation problems, scheduling, and production [64,65].The model can be stated as follows.

$$\text{Maximize} \quad \sum_{j=1}^{n} c_j x_j$$

$$\text{subject to} \quad \sum_{j=1}^{n} a_{ij} x_j \leq b_i, \quad i = 1, \dots, m$$

$$x_j \in \{0, 1\}, \quad j = 1, \dots, n$$

where $n$ is the number of items and $m$ is the number of knapsack constraints with capacities $b_i$. Each item $j$ requires $a_{ij}$ units of resource consumption in the $i$th knapsack and yields $c_j$ units of profit upon inclusion. The goal is to find a subset of items that yields maximum profit without exceeding the resource capacities. In this work, we solved 6 different instance sets from Beasley's OR-Library. The details concerning the solved benchmark are illustrated in Table 2.
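The MKP definition translates into a short evaluation routine. The sketch below uses the symbols from the text ($c_j$ as `profits`, $a_{ij}$ as `weights`, $b_i$ as `capacities`); the toy data is made up for illustration and is not one of the benchmark instances.

```python
def mkp_profit(profits, x):
    """Total profit of the selected items."""
    return sum(c for c, xj in zip(profits, x) if xj)

def mkp_feasible(weights, capacities, x):
    """True when the selection respects every knapsack capacity."""
    return all(
        sum(a * xj for a, xj in zip(row, x)) <= b
        for row, b in zip(weights, capacities)
    )

profits = [10, 7, 4]
weights = [
    [3, 4, 2],   # resource consumption in knapsack 1
    [5, 1, 3],   # resource consumption in knapsack 2
]
capacities = [6, 6]

candidate = [0, 1, 1]
profit = mkp_profit(profits, candidate)
ok = mkp_feasible(weights, capacities, candidate)
```

A binarized agent produced by the driver would be checked with `mkp_feasible` before its profit is used as fitness; how infeasible agents are repaired or penalized is not specified in this section.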

Proposed Hybrid Approach
In this section we describe the details concerning the proposed hybrid: the main ideas, motivations, and design. Firstly, a general description of the process carried out is presented. In Section 4.2 we describe the methodology behind LBLP in more detail. Section 4.3 describes the main ideas, objectives, and techniques employed in the design of the proposed components. Finally, Section 4.4 illustrates the proposed algorithms.

General Description
The proposed LBLP follows a population-based solving strategy, in which multiple agents evolve in the solution space, intensification and diversification are performed, and the process terminates when a threshold defined as a number of iterations is met. Dynamically adjusting the parameters, especially the population size, is an important topic that continues to be of growing interest to the natural computation community. In Ref. [3], the authors carried out a complete analysis of different implementations of PSO in order to define the ideal number of agents to employ. However, they highlighted that the same configuration will not necessarily fit each optimization problem or even each instance of the same problem. In this proposal we employ a population-based algorithm and improve its performance by modifying the population size at run-time. This modification is driven by a learning component based on regression, which processes the results yielded by the different population sizes employed during solving time. Thus, the modifications are managed based on the best performance that is likely to be achieved by employing a certain population size. In this context, the whole process is governed by two parameters that are used as thresholds in order to trigger different tasks in LBLP: (1) the moment at which a new population size is selected and (2) the moment at which the knowledge needs to be generated. The first threshold is named α, and it decides when the selection process is carried out. This process selects a suitable population size to perform. The second parameter is named β, and it manages when the regression analysis needs to be performed. The steps comprising the proposed LBLP are described as follows:

Step 1: Set the initial parameters for the population-based algorithm and the regression analysis.

Step 2: Select the initial population size to perform.

Step 4: While the termination criterion is not met: Update the population-based algorithm's parameters.

Methodology
The proposed LBLP defines four different population sizes as schemes that can be selected during the search. Each scheme is given the same initial probability of being selected. For instance, if we configure four different size values, their initial probability of being picked is 0.25. Thus, at iteration 1, the selection mechanism (given by the Roulette selector component) chooses a scheme, and this selected value is the one used for the next α iterations. In addition, in each iteration, the component managing the movement operators (given by the Driver component) sequentially carries out diversification and intensification of the agents in the search space. This process generates dynamic data on each iteration that is sorted and stored, and the collected data is processed when the threshold β is met, at which point regression is applied and knowledge is generated. This learning process concerns the results yielded by the regression and their interpretation, where the scheme with the best forecast fitness value is selected and rewarded as the winner. In this regard, the selection probability of the winning scheme is boosted by the model.
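The interplay of the two thresholds can be made concrete with a small scheduling sketch. It assumes that α and β are both measured in iterations, that a scheme is selected at iteration 1 and every α iterations afterwards, and that the regression runs every β iterations. These triggering rules are an interpretation of the description above, not a specification from the paper.

```python
def schedule(max_iter, alpha, beta):
    """List the (iteration, event) pairs triggered by the two thresholds."""
    events = []
    for iteration in range(1, max_iter + 1):
        if (iteration - 1) % alpha == 0:
            events.append((iteration, "select"))   # roulette picks a population size
        if iteration % beta == 0:
            events.append((iteration, "learn"))    # regression updates the probabilities
    return events

plan = schedule(max_iter=10, alpha=3, beta=5)
```

With these toy values a scheme is selected at iterations 1, 4, 7, and 10, and the learning step runs at iterations 5 and 10, so each regression sees data produced under more than one population size.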

LBLP Components
In this subsection, we present a detailed explanation and definition of each component proposed in our first attempt designing LBLP.

Component 1: The Driver
The solving strategy employed by the proposed hybrid follows a population-based design. This component draws inspiration from the optimization field in order to search the solution space of a given problem. The objective behind this component includes the generation of the initial/new population (solutions), intensification, diversification, and binarization. In this first attempt at proposing LBLP, we employ SHO mostly because it can be identified as a modern MH, outside of the well-known PSO, differential evolution, and genetic algorithms. In addition, the selection was based on the expertise of the research team. Nevertheless, in future upgrades, the incorporation of several algorithms smartly selected to perform at run-time will be considered. Regarding the domain transfer process, the driver carries out the two-step strategy over the solutions generated. The strategy employed was the transfer function $V_4$ and the elitist discretization, a combination that has already been shown to perform well.

Component 2: Regression Model
This component is the key factor in LBLP. It concerns the analysis, storage, and decision making over the dynamic data generated. In this regard, while a scheme is performing, the regression model stores and indexes the fitness values it achieves. Concerning the data-structures employed, in this work they were designed as vectors, but a more general description is as follows. $DSprob_i^d$ is the data-structure with the probabilities for each scheme to be selected. $DSsol_i^d$ is the data-structure that stores the corresponding solutions for each regression analysis carried out. $DSrank_i^d$ holds the ranking of each scheme regarding the best values reached. In addition, $d$ represents the number of schemes designed to be employed by LBLP. Regarding the regression analysis, it is carried out by means of linear regression, in which a linear function is fitted with $y$ as the dependent variable, which is the fitness and the value to predict, and $x$ as the independent variable, which corresponds to the scheme performed. This simple linear regression model captures the close relationship between performance and the population size employed through the search. Regarding our proposed learning model, we define four fitted functions, one for each scheme defined in this work.
In order to solve these functions, we employ the least squares method, which is a well-known approach used in the regression field. The outputs of the mentioned analysis go to $DSsol^d_i$. In order to select the winning scheme, the model takes a decision in which the probabilities concerning each scheme, stored in $DSprob^d_i$, are updated taking into consideration Equation (15) and $DSrank^d_i$. Thus, this process is addressed by selecting the best fitness prediction given by the four linear models, and the best result is given "priority". A practical example can be described as follows. At the beginning, in each iteration, the approach selects a scheme using a probabilistic roulette. For a four-way scheme, the initial probabilities for each scheme to be selected are in a 25%-25%-25%-25% ratio. Additionally, the regression model is always storing and sorting the fitness values and agents on run-time. When the threshold is met, Equation (15) analyses the prediction achieved and gives the winning scheme a higher probability to be chosen. For instance, we designed a ratio of 55%-15%-15%-15%.
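As a rough illustration of the update described above, the following Python sketch fits one least-squares line per scheme and then gives the scheme with the best predicted fitness the 55% share, splitting the rest evenly. It is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a minimization problem, uses the observation index as a stand-in regressor, and the names `fit_line`, `update_probabilities`, and `history` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return mean_y, 0.0  # a single x value: fall back to a flat line
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def update_probabilities(history, winner_prob=0.55, minimize=True):
    """history maps a scheme (population size) to the fitness values it produced.

    Assumption: the regressor is the observation index, and the winner is the
    scheme whose fitted line predicts the best next fitness value.
    """
    predictions = {}
    for scheme, fitness in history.items():
        xs = list(range(len(fitness)))
        a, b = fit_line(xs, fitness)
        predictions[scheme] = a + b * len(fitness)  # forecast for the next step
    pick = min if minimize else max
    winner = pick(predictions, key=predictions.get)
    rest = (1.0 - winner_prob) / (len(history) - 1)
    return {s: (winner_prob if s == winner else rest) for s in history}


# Toy data (invented for illustration): scheme 30 improves fastest.
history = {
    20: [60.0, 58.0, 57.0],
    30: [60.0, 55.0, 50.0],
    40: [61.0, 60.0, 60.0],
    50: [59.0, 59.0, 58.0],
}
probs = update_probabilities(history)
```

With four schemes, the remaining 45% is split into three equal shares of 15%, which matches the ratio quoted in the text.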

Component 3: Roulette Selector
The idea behind this component corresponds to a roulette system, whose main objective is the probabilistic selection of the number of agents performing on run-time. In this work, a 4-way scheme defining 4 different population sizes (20, 30, 40, and 50 agents) is employed. In the literature, the ideal number of agents to be employed has been a long-standing discussion within the scientific community. In this context, in Ref. [3], the choice is tied to the complexity of the problem, such as high-dimensional or unimodal problems, to the designed topology of the proposed approach, and to approaches tackling very large tasks. Thus, the definition of this parameter value involves an adjusting-testing process. In this work, we follow the first standard recommendation given by the authors, which is between 20 and 50 agents. In future upgrades to be proposed for LBLP, new configurations will be employed.
Regarding the selection mechanism, the schemes are placed and selected by their assigned probabilities. The initial probability of each scheme to be selected is defined as $1/N$, where $N$ corresponds to the number of schemes designed for the approach. Thus, in a 4-way scheme, each scheme starts with a probability of $1/4$. The probabilities for each scheme will be modified by the regression model after the corresponding analysis of the dynamic data generated is carried out on run-time.
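The selection step can be sketched as a standard roulette wheel. The Python snippet below is a minimal illustration, assuming equal initial probabilities over the four population sizes used in this work; the function names `initial_probabilities` and `roulette_select` are hypothetical and not taken from the original implementation.

```python
import random


def initial_probabilities(schemes):
    """Every scheme starts with the same probability, 1/N."""
    n = len(schemes)
    return {s: 1.0 / n for s in schemes}


def roulette_select(probabilities, rng=random):
    """Pick a scheme with probability equal to its slice of the wheel."""
    r = rng.random()  # uniform in [0, 1)
    cumulative = 0.0
    for scheme, p in probabilities.items():
        cumulative += p
        if r < cumulative:
            return scheme
    return scheme  # guard against floating-point round-off in the last slice


schemes = [20, 30, 40, 50]  # population sizes of the 4-way scheme
probs = initial_probabilities(schemes)

# Draw many times to check that the selection frequencies follow the wheel.
rng = random.Random(0)
counts = {s: 0 for s in schemes}
for _ in range(10000):
    counts[roulette_select(probs, rng)] += 1
```

Once the regression model updates the probabilities, the same `roulette_select` call can be reused with the new dictionary, so the selector itself does not need to know how the probabilities were produced.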

Proposed Algorithm
In this section, we illustrate a detailed description of the proposed Algorithm 1.

Algorithm 1. Proposed LBLP
    ...
    The driver: perform intensification family of operators
    The driver: perform diversification family of operators
    ...
    Roulette selector: select scheme to perform
    if check number of agents by the scheme selected then
        Roulette selector: balance the population
    ...
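The balancing step at the end of the algorithm can be sketched as follows. The text does not specify how agents are added or removed, so this Python snippet makes explicit assumptions: surplus agents are removed starting from the worst ones, and missing agents are created by a user-supplied generator. The names `balance_population`, `evaluate`, and `new_agent` are hypothetical.

```python
import random


def balance_population(population, evaluate, target_size, new_agent, minimize=True):
    """Return a population with exactly target_size agents.

    Assumption: shrinking keeps the best agents; growing appends fresh ones.
    """
    ranked = sorted(population, key=evaluate, reverse=not minimize)  # best first
    if len(ranked) > target_size:
        return ranked[:target_size]  # drop the worst agents
    while len(ranked) < target_size:
        ranked.append(new_agent())   # top up with newly generated agents
    return ranked


rng = random.Random(1)


def new_agent():
    """Toy binary solution with 8 decision variables."""
    return [rng.randint(0, 1) for _ in range(8)]


evaluate = sum  # toy objective to be minimized: number of ones

population = [new_agent() for _ in range(30)]
shrunk = balance_population(population, evaluate, 20, new_agent)
grown = balance_population(population, evaluate, 50, new_agent)
```

Other policies, such as removing random agents or cloning the best ones, would fit the same interface; the choice here is only meant to make the step concrete.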

Experimental Results
In this section, we describe the experimentation process carried out to evaluate the performance of our proposed LBLP. In this context, a two-step experimentation phase was designed in order to test its competitiveness. Firstly, we compare against state-of-the-art algorithms solving the MCDP, SCP, and MKP. In the second step, we compare the results obtained by our LBLP against implementations based on SHO + IRace and classic SHO. Additionally, the results are evaluated using the relative percentage deviation (RPD). The RPD quantifies the deviation of the best value obtained by the approach from $S_{opt}$ for each instance. The configuration employed is illustrated in Table 3, and we highlight the good performance achieved.
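For reference, the RPD is commonly computed as the absolute gap to the best-known value, relative to that value and expressed as a percentage. The short Python sketch below assumes this common definition, since the exact expression is not reproduced in this section; the function name `rpd` and the sample values are illustrative only.

```python
def rpd(best, s_opt):
    """Relative percentage deviation of `best` from the best-known value s_opt.

    Assumption: the absolute gap is used, so the same function serves both
    minimization (best >= s_opt) and maximization (best <= s_opt) problems.
    """
    return abs(best - s_opt) / abs(s_opt) * 100.0


# Illustrative values only, not taken from the tables of this work.
gap_min = rpd(best=52.0, s_opt=50.0)    # minimization instance
gap_max = rpd(best=990.0, s_opt=1000.0)  # maximization instance
gap_hit = rpd(best=50.0, s_opt=50.0)    # optimum reached
```

An RPD of zero therefore means that the best-known value was reached for that instance.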

First Experimentation Phase
As mentioned before, this subsection illustrates a detailed comparison and discussion of the performance achieved by LBLP against reported data from state-of-the-art algorithms for each problem.

Manufacturing Cell Design Problem
In this comparison, we employ the reported results illustrated by Binary Cat Swarm Optimization (BCSO) [66], Egyptian Vulture Optimization Algorithm (EVOA), and the Modified Binary Firefly Algorithm (MBFA) [67]. In order to have a deeper sample of algorithms related to the proposed work, we also include a Human Behavior-Based Algorithm supported by an Autonomous Search approach (HBBO-AS) [34], which focuses on the control of the population size on run-time. In addition, to compare and understand the results, we highlighted in bold the best result for each instance when the optimum is met.
Table 4 illustrates the comparison of the reported results, and the description is as follows. The first column, ID, represents the identifier assigned to each instance. $S_{opt}$ depicts the global optimum or best value known for the given instance. Columns Best, Mean, and RPD are the best value reached, the mean value of 30 executions, and the relative percentage deviation, respectively, for each approach. Regarding the performance comparison, the lead is clearly held by BCSO and the proposed LBLP. Analyzing the values reported in column Best, BCSO gets all 35 best values known, in comparison to 25 values for LBLP. In addition, concerning the median values for column Best over all the instances, LBLP is placed second with 35.14, and the algorithm with the best performance reported is BCSO, which has a median value of 34.51. Far behind follow MBFA, EVOA, and HBBO-AS, which computed 42.54, 50.83, and 55.03, respectively. On the other hand, concerning the median value for column Mean, the proposed LBLP gets first place with 36.00 against 36.61 reached by BCSO. This can be interpreted as LBLP being more robust and consistent in the reported performance. Moreover, we highlight that in several results, the proposed LBLP remains close to the best values known for those instances, which gives room for future improvements. Thus, the overall observations can be described as follows. LBLP does not fall behind state-of-the-art algorithms specially designed to tackle the MCDP. In addition, the proposed approach achieved better results than HBBO-AS, which can be interpreted as a population-based approach profiting greatly from the adaptability given by statistical modeling methods.

Set Covering Problem
In this comparison, we made use of the reported results illustrated by binary cat swarm optimization (BCSO) [68], binary firefly optimization (BFO) [69], binary shuffled frog leaping algorithm (BSFLA) [70], binary artificial bee colony (BABC) [71], and binary electromagnetism-like algorithm (BELA) [72]. In addition, we highlighted in bold the best result for each instance when the optimum is met.
Table 5 illustrates the comparison of results achieved by LBLP against the state-of-the-art algorithms specially designed to tackle the SCP; the description is as follows. The column ID represents the identifier assigned to each instance. $S_{opt}$ depicts the global optimum or best value known for the given instance. Columns Best, Mean, and RPD are the best value reached, the mean value of 30 executions, and the relative percentage deviation, respectively, for each approach. Regarding the performance comparison, among the six approaches the lead is held by LBLP. In this regard, a closer look at the median values can be interpreted as follows. The proposed approach achieved the smallest values for columns Best and Mean, with 197.31 and 199.75, respectively. Moreover, the overall performance in the hardest sets of instances, such as groups F, G, and H, is good, as shown by the RPD values in comparison to BCSO and BELA. We highlight that in several results the proposed LBLP remains close to the best values known for those instances, encouraging us to continue working and further improve our method.

Multidimensional Knapsack Problem
Regarding the MKP, the state-of-the-art algorithms employed include the filter-and-fan heuristic (F&F) [73], a binary version of the PSO algorithm (3R-BPSO) [74], and a hybrid quantum particle swarm optimization (QPSO) algorithm [75]. These approaches were defined in the literature as methods specifically designed to effectively tackle the MKP, and a certain degree of adaptability was designed into their search process on run-time. For instance, the 3R-BPSO algorithm employs three repair operators in order to fix infeasible solutions generated on run-time. In addition, if the results of an algorithm for a set of benchmark instances are not available, the algorithm is ignored in the comparative study for that set, for instance, 3R-BPSO in mknapcb2 and mknapcb5. In order to compare and understand the results, we highlighted in bold the best result for each instance when the optimum is met.
Table 6 illustrates the reported performance of the state-of-the-art approaches vs. LBLP. The column ID represents the identifier assigned to each instance. $S_{opt}$ depicts the global optimum or best value known for the given instance. Columns Best, Mean, and RPD are the best value reached, the mean value of 30 executions, and the relative percentage deviation, respectively, for each approach. Regarding the performance comparison, QPSO and LBLP lead the ranking by the reported performance. The QPSO approach reported a total of 21 best known values and LBLP reached 20 optimum values out of 30. However, observing the median values for column Best, the proposed LBLP falls behind even F&F, with 67,179.10 vs. 67,438.93. This issue can be clearly observed in the RPD values: instances mknapcb2 and mknapcb5 computed 1.47% and 3.13%, respectively, for tests 5.250.04 and 10.250.04. Thus, there exists a considerable gap in performance for those instances where the best known value is not reached by LBLP. Nevertheless, in this first attempt, LBLP proved to be a competitive approach capable of tackling multiple optimization problems. In addition, this issue encouraged us to further improve and take profit from multiple mechanisms and heuristics to be employed in the design.

In this first experimentation phase, LBLP was compared against state-of-the-art approaches specially designed to tackle the MCDP, SCP, and MKP. In this context, the proposed hybrid demonstrated a competitive performance for the three different problems tested. Regarding the MCDP, LBLP achieved 25 best values known out of a total of 35, achieving second place overall; see Figure 1. Regarding the SCP, LBLP achieved 39 best values known out of 65, achieving first place; see Figure 2.
In addition, it is well known that optimization methods such as MH are designed to perform in certain environments. Thus, there exists a certain degree of uncertainty when employing such methods to tackle different types of optimization problems. For instance, we can observe the polarized performance reported by BCSO solving the MCDP and SCP. This is one of the strong points of our proposal, as the optimization problem to be tackled is not an issue given the adaptability of our proposed LBLP. Regarding the MKP, the proposed LBLP reached second place with 20 best values known out of 30; see Figure 3. Regarding the overall performance, LBLP proved to be competitive. However, we observed an inconsistent performance solving the MKP when the optimum value was not reached. This issue can be interpreted as a consequence of LBLP not taking enough profit from the relationship between population size and diversification/intensification, and from the frequency with which knowledge is generated. In this first attempt at designing LBLP, the approach works with static values for α and β. In this context, a first improvement can be described as the incorporation of a new learning-based component managing the values for α and β on demand. The objective will be to achieve higher adaptability by letting the approach auto-assign the thresholds that trigger the scheme-selection mechanism and the regression analysis.

Second Experimentation Phase
In this subsection, we take a closer look at the performance achieved by classic and hybrid approaches. We compare and discuss implementations based on the classic SHO, classic SHO assisted by IRace, and the proposed LBLP. In addition, in order to further demonstrate the improvement given by hybrids in optimization tools, the Wilcoxon-Mann-Whitney test (Mann and Whitney 1947) is carried out. We highlight the improvements, shortcomings, complexity, and robustness observed through the comparison.

Manufacturing Cell Design Problem
In this experimentation, Table 7 illustrates the comparison of results obtained by Classic SHO, Classic SHO assisted by IRace, and LBLP. In addition, in Table 8, a comparison against a hybrid approach is presented. This hybrid was proposed by Soto et al. in Ref. [34] and is based on the interaction between a population-based human behavior-based algorithm and autonomous search (HBBO-AS), which focuses on the modification of the population. The table description is as follows: column ID represents the identifier for each instance; $S_{opt}$ depicts the global optimum or best value known for the given instance; columns Best, Worst, Mean, and RPD are the best value reached, the worst value reached, the mean value of 30 executions, and the relative percentage deviation, respectively, for each approach. In order to compare and understand the results, we highlighted in bold the best result for each instance when the optimum is met.
Regarding the overall performance of approaches related to SHO, LBLP takes the lead and Classic SHO comes last. If we observe the median values for column Best, LBLP obtained 35.14 in comparison to 36.09 achieved by Classic SHO and 35.43 by Classic SHO + IRace. However, Classic SHO + IRace seems to be more consistent when we observe columns Worst and Mean, where 36.37 and 35.90 were the median values achieved against 37.54 and 36.00 reached by LBLP. Nevertheless, the achieved performance shows hybrid approaches being more competitive than their respective classic algorithms. LBLP proved to be a good option for tackling the MCDP, and room for improvement was observed. In order to further demonstrate the performance of hybrids solving the MCDP, a statistical analysis is carried out. Table 9 illustrates a matrix that comprises the resulting p-values after applying the well-known Average Wilcoxon-Mann-Whitney test for all the instances corresponding to the MCDP. Thus, a p-value less than 0.05 means that the difference is statistically significant, so the comparison of their averages is valid, such as LBLP vs. SHO. Concerning the comparison between LBLP and HBBO-AS, the approach led by autonomous search falls clearly behind on all the columns presented. However, new ideas and future interaction between optimization tools are highlighted. For instance, regarding performance metrics, the main job carried out by AS was to detect low performance or repetitive values/patterns in the solutions. In this context, new components based on deep learning could be effective at tackling this task.
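To make the statistical comparison concrete, the following Python sketch computes a two-sided Mann-Whitney U test for two samples of run results. It is a simplified illustration under stated assumptions: it uses the normal approximation without tie or continuity corrections, the function name `mann_whitney_u` is hypothetical, and the sample data are invented. It is not the procedure used to produce Table 9.

```python
import math


def mann_whitney_u(a, b):
    """Two-sided Mann-Whitney U test (normal approximation, no tie correction).

    Returns (U statistic of sample a, approximate p-value).
    """
    combined = sorted((v, g) for g, sample in enumerate((a, b)) for v in sample)
    rank_sum_a = 0.0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j for tied values
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    n1, n2 = len(a), len(b)
    u1 = rank_sum_a - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided tail of the normal law
    return u1, p


# Invented data: 30 runs per approach, with approach A clearly better (lower).
runs_a = [35.0 + 0.01 * k for k in range(30)]
runs_b = [36.0 + 0.01 * k for k in range(30)]
u_stat, p_value = mann_whitney_u(runs_a, runs_b)
significant = p_value < 0.05
```

For small samples or heavily tied data, an exact test or a tie-corrected variance would be more appropriate than this approximation.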

Set Covering Problem
In this subsection, the results obtained by the three implementations are illustrated in Table 10. The description of the table is as follows: column ID represents the identifier assigned to each instance; $S_{opt}$ depicts the global optimum or best value known for the given instance; columns Best, Mean, and RPD are the best value reached, the mean value of 30 executions, and the relative percentage deviation, respectively, for each approach. In order to compare and understand the results, we highlighted in bold the best result for each instance when the optimum is met.
Regarding the best values achieved, LBLP leads with 39, followed by Classic SHO + IRace with 23 and Classic SHO with 18. This is confirmed by the median values in columns Best and RPD, where LBLP achieved 197.31 and 1.09, Classic SHO computed 199.89 and 2.24, and Classic SHO + IRace obtained 197.95 and 1.42. Moreover, we highlight that even in the instances where the best values are not met, LBLP stays close to the reported values, which is corroborated by the small RPD values computed. On the other hand, two interesting phenomena can be observed in this test. Firstly, the hybridized implementation outperforms the classic approach. Secondly, the LBLP median values for column Mean can be interpreted as a degree of deficit in robustness. In order to tackle this issue, new improvements will be made to the regression model and configuration parameters. Nevertheless, LBLP reached most of the best values known and proved to be a competitive option for tackling the SCP, with multiple opportunities to evolve and improve in future works. In order to further demonstrate the performance of hybrids solving the SCP, a statistical analysis is carried out. Table 11 illustrates a matrix that comprises the resulting p-values after applying the well-known Average Wilcoxon-Mann-Whitney test for all the instances corresponding to the SCP. Thus, a p-value less than 0.05 means that the difference is statistically significant, so the comparison of their averages is valid, such as LBLP vs. SHO.

Multidimensional Knapsack Problem
In this subsection, the results obtained tackling the MKP are compared and discussed. In Table 12, we illustrate the comparison of results obtained by the three implementations. In addition, Table 13 illustrates a comparison between the proposed LBLP and LMBP, which is a hybrid architecture based on a population algorithm assisted by multiple regression models [76]. The table description is as follows: column ID represents the identifier assigned to each instance; $S_{opt}$ depicts the global optimum or best value known for the given instance; columns Best, Worst, Mean, and RPD are the best value reached, the worst value reached, the mean value of 30 executions, and the relative percentage deviation, respectively, for each approach. In order to compare and understand the results, we highlighted in bold the best result for each instance when the optimum is met.
Regarding the best values in Table 12, the implementation employing IRace leads the overall performance with a median value for column Best of 67,268, followed by LBLP with 67,179 and classic SHO with 66,730. In addition, this is corroborated by the median values computed for column RPD, where IRace achieved 0.27 against 0.35 for the proposed hybrid. However, the phenomenon observed in this test differs completely from the ones reported in the previous subsections. The median values reported for column Mean illustrate good robustness in the overall performance of LBLP. On the other hand, the poor results illustrated by Classic SHO + IRace can be interpreted as an inconsistency in the performance and as being trapped in local optima in multiple MKP instances. In order to further demonstrate the performance of hybrids solving the MKP, a statistical analysis is carried out. Table 14 illustrates a matrix that comprises the resulting p-values after applying the well-known Average Wilcoxon-Mann-Whitney test for all the instances corresponding to the MKP. Thus, a p-value less than 0.05 means that the difference is statistically significant, so the comparison of their averages is valid, such as LBLP vs. SHO. Regarding the results in Table 13, an even competition is observed, and LMBP is capable of achieving better values on instances where LBLP falls behind. Nevertheless, it is interesting to consider designing a more complete or complex learning-based component. We observed that there is no certainty of achieving a good performance with a sole technique solving all the instances. Thus, a proper answer to this issue could be the design of different learning techniques.

Overall Performance in This Phase
In this second experimentation phase, the proposed LBLP was compared against Classic SHO and Classic SHO + IRace solving the MCDP, SCP, and MKP. The objective was to verify the improvements obtained by blending a learning-based method into the search process of a population-based strategy. In this regard, the proposed LBLP achieved good results solving the optimization problems. Figures 4-6 illustrate a performance overview, which ended up corroborating the idea of profiting from the dynamic data generated. On the other hand, the good performance demonstrated by Classic SHO + IRace is to be expected. IRace is an off-line method that specializes in the tuning of parameters. Regarding the complexity, users need a certain degree of expertise in R, as the script configuration process can be an arduous task, and the implemented optimization tool can be enhanced by IRace. Regarding the proposed approach, LBLP requires the configuration of a scheme, α, and β. In addition, the implementation comprises a population-based algorithm and well-known statistical modeling methods.
Regarding the observed phenomena, while solving the MCDP and SCP, LBLP achieved a considerable number of best values for column Best but fell behind for column Mean. Nonetheless, this situation completely changed while solving the MKP. The interpretation can be described in two ways: LBLP requires a faster response in the learning process and a more detailed configuration of the parameters. Firstly, the parameters proposed in this work are static through the search. This issue was already addressed in Section 5.1.4. Nevertheless, multiple and unexpected events may present themselves while the search is being carried out. Thus, in order for LBLP to respond properly, the first improvement needs to be made to α and β, which control the scheme selection and the generation of knowledge. On the other hand, concerning the proposed scheme, 4 different values were employed that completely differ from the static values employed by Classic SHO + IRace, such as 41 and 33. In this regard, more options as schemes will be added and tested, for instance, 20 different schemes from 20 to 40 agents. Lastly, a more complex scenario can be designed for a more detailed mechanism to be employed by the learning model. The objective is for each defined scheme to implement different α and β values in order to increase the adaptiveness of LBLP.

Conclusions
In this work, a hybrid self-adaptive approach has been proposed in order to solve hard discrete optimization problems. The main objective was to improve the performance by transforming a general component that exists in all population-based algorithms: the population size. The proposed strategy focuses on the dynamic update of this parameter in order to give high adaptive capacities to the agents, and it is governed by a learning-based component that takes profit from the dynamic data generated on run-time. Interesting facts concerning the design are described as follows. Since the statistical modeling methods employed are well known, the complexity of the learning component is not high; the main point is the novelty of the designed mechanism that takes profit of the technique. In this context, movement operators from SHO and linear regression are classic means in their respective fields to solve multiple problems. On the other hand, general complications and drawbacks can be described as follows: an increase in computational time, an increase in complexity due to the scalability of the designed architecture, and an increase in complexity due to the wide spectrum of optimization problems to be tackled.
Regarding the experimentation carried out, LBLP proved to be a good option in comparison to state-of-the-art methods. We solved three different well-known hard optimization problems, the MCDP, SCP, and MKP, employing a single configuration set of parameter values for LBLP. In this context, the first phase helped us to measure to what extent LBLP was a viable optimization tool in comparison to already reported approaches. The second phase was meant to highlight the improvements achieved in performance between a pure population-based algorithm and the incorporation of a low-level learning-based component into the design. In addition, the competitiveness against reported successful hybrids and parameter-tuned versions of the employed algorithms was highlighted. This is an interesting observation, mainly because of the limitations behind the proposed approach, which concern the algorithms selected. For instance, if we observe the linear regression, the main drawback is the inclusion of a single performance metric as an independent variable in the model. In this regard, several metrics exist in the literature and can have different weights during the search, such as bad solutions, percentage of infeasible solutions, diversity, and number of feasible solutions generated, among others. Nevertheless, the overall good performance and the room for improvement motivate us to further exploit this research field. In addition, this work contributed scientific evidence of hybridized strategies outperforming their classic algorithms, proving to be profitable approaches for solving hard optimization problems.
Regarding the phenomena described in the experimentation phases, future considerations and improvements were discussed. In this regard, two improvements are under consideration: (1) dynamically adjusting the values for α and β and (2) multiple and larger ranges for population size values. On the other hand, the well-known drawback generally associated with on-line data-driven methods is the amount, profit, and quality of the data given to the model so that it can learn properly and in time on run-time. Thus, as there is no guarantee for the performance achieved by different learning techniques, it is a major task to carry out an extensive experimental process employing state-of-the-art regression-based methods. However, this can end up in a considerable increase in solving time in comparison to the ones reported in this work. Thus, the incorporation of an optimizer regarding the computational resources employed on run-time will be a key factor for future proposals.

Figure 1. Performance comparison between state-of-the-art approaches vs. LBLP tackling the MCDP.

Table 2. Configuration details from MKP instances employed in this work.

with $i = \{1, 2, \ldots, n\}$, where $DSfit^d_i$ stores the fitness values reached by the agents of each scheme performed.

Table 3. The second experimentation phase's configuration parameters for LBLP, Classic SHO, and Classic SHO + IRace.

Table 4. Computational results achieved by LBLP and state-of-the-art approaches solving the MCDP.

Table 5. Computational results achieved by LBLP and state-of-the-art approaches solving the SCP.

Table 8. Computational results achieved by LBLP and HBBO-AS solving the MCDP.