Transfer Learning Based on Clustering Difference for Dynamic Multi-Objective Optimization

Abstract: Dynamic multi-objective optimization problems (DMOPs) have become a research hotspot in engineering optimization because their objective functions, constraints, or parameters may change over time, while the changing Pareto optimal set (POS) must be tracked quickly and accurately during optimization. Solving dynamic multi-objective optimization problems therefore presents great challenges. In recent years, transfer learning has proved to be one of the effective means of solving them. This paper proposes a new transfer learning method based on clustering difference to solve DMOPs (TCD-DMOEA). Different from existing methods, it uses a clustering difference strategy to improve population quality and reduce the data difference between the target domain and the source domain. On this basis, transfer learning is used to accelerate the construction of the initial population. The advantage of the TCD-DMOEA method is that it reduces the possibility of negative transfer and improves the performance of the algorithm by increasing the similarity between the source domain and the target domain. Experimental results on different benchmark problems show that, compared with several advanced dynamic multi-objective optimization algorithms, the proposed TCD-DMOEA method can significantly improve the quality of the solutions and accelerate convergence.


Introduction
Dynamic multi-objective optimization problems (DMOPs) [1,2] are optimization problems in which the objective function and decision variables are related to time (environment). Their optimal solution is a set of Pareto optimal solutions that change dynamically with time (environment). Different from solving static multi-objective optimization problems, handling this kind of problem requires not only optimizing several conflicting objectives, but also dealing with changes in the objective functions and constraints at the same time. DMOPs are widely used to model many real-world problems; common application domains include scheduling [3,4], control [5], chemistry [6], industry [7] and energy design [8].
An ideal dynamic multi-objective optimization algorithm should contain three necessary parts: change detection, change response and a multi-objective evolutionary algorithm (MOEA) [9][10][11][12]. When the time variable changes, the algorithm needs to detect the change of the objective function in time, respond to it according to the type of change, handle the optimization problem after the change, and use the MOEA to iterate the population so as to quickly find the dynamic Pareto-optimal front (DPF) and dynamic Pareto-optimal set (DPS) at the current moment. In fact, a DMOP can be regarded as a set of static multi-objective optimization problems under discrete time variables, which converts complex dynamic characteristics into static ones that are more convenient to handle, but the disadvantages of this view are also obvious: processing is slow, and timeliness cannot be guaranteed. In an environment with a high frequency of change, such an approach often fails to track the Pareto front quickly, resulting in poor algorithm performance. Therefore, for an ideal DMOEA, a suitable change response mechanism is the top priority in dealing with DMOPs. Existing environmental response strategies can be divided into four categories: diversity maintenance, memory strategies, prediction mechanisms and transfer learning [13,14].
Diversity maintenance improves algorithm performance mainly by maintaining or improving population diversity. The specific process is shown in Figure 1a. Xu et al. [15] used a perturbation method to divide the decision variables into time-dependent and time-independent according to their dependence on the time parameters, and co-evolved two subpopulations to obtain the optimal solutions of the corresponding decision variables. In addition, Zhang et al. [16] maintained population diversity by simulating magnetic particles, and then quickly converged to the Pareto front in the current environment. Generally speaking, the diversity maintenance strategy directly adopts the Pareto optimal solution set of the previous environment as the initial population in the current environment. Liu et al. [17] used an additional auxiliary strategy that maintains two archives focusing on convergence and diversity, respectively. In addition, for some problems with biased characteristics, an interval mapping strategy is designed to give their solutions good diversity. Based on this, Liang et al. [18] divided the decision variables into three parts and adopted diversity maintenance, prediction and diversity introduction, respectively, to generate high-quality offspring and speed up population convergence. Diversity preservation methods perform well for DMOPs with weaker changes, but when the optimal solution of the historical environment deviates far from the real Pareto front in the current environment, tracking performance suffers.
Methods based on memory strategies are a common approach to addressing DMOPs; the process is shown in Figure 1b. Chen et al. [19] proposed an evolutionary algorithm for dealing with time-varying constraints and objective functions. The algorithm employs new mating selection and environment selection operators that allow the population to contain both feasible and infeasible solutions and to reuse previous solutions based on information obtained from new environments. Yu et al. [20] used polynomial regression predictors to extract linear or non-linear relationships in historical changes to generate good initial populations for new environments. Zou et al.
[21] developed a reinforcement learning method to respond to environmental changes according to the severity of the change, which is classified into three states (mild, moderate and severe). The method adopts three actions of knee-based prediction, center-based prediction and local search, and selects a series of actions according to a given state to reorient the population to the new Pareto front (PF).
Based on the evolution information of the optimization problem in historical environments, the prediction strategy predicts the fitness landscape or the dominant evolution direction of the current environment and provides useful guidance for the evolutionary optimization process, thereby improving algorithm performance. The process of the forecasting model is shown in Figure 1c. Geng et al. [22] designed a group prediction strategy that converts the individual positions at different moments along the same convergence direction in the objective space into time series, predicts the position at the next moment, improves the diversity and effectiveness of the predicted population, and effectively reduces the convergence time of the algorithm after the problem changes. Wang et al. [23] proposed a prediction strategy based on ensemble learning; the strategy has three forecasting models, including linear and non-linear ones. Given this, Rong et al. [24] constructed a multi-model prediction method for the characteristic change types of translation, rotation and composite problems. Compared with three existing prediction methods, the proposed prediction mechanism has significant advantages in solving most DMOPs. Zheng et al. [25] used different prediction strategies for different decision variables to generate a new initial population according to the different effects of decision variables on convergence and distribution. Ma et al. [26] recently proposed a feature information prediction algorithm for DMOPs, in which a joint distribution adaptation model is used to identify the distribution of solutions after environmental changes and create a population in the new environment on this basis.
Although the prediction model is widely applicable, it inevitably has prediction errors, which affect the accurate guidance of the optimization process. From a statistical point of view, the solution set used to construct the prediction model and the solution set predicted by it must obey the independent and identically distributed (i.i.d.) assumption, thus ignoring that the data may not actually be i.i.d. In view of this, Jiang et al. [27] introduced the transfer learning strategy as an environment response strategy.
Transfer learning makes full use of problem information with similar characteristics to guide the prediction or classification of current problems and improve recognition accuracy. Jiang et al. [28] proposed a dynamic multi-objective learning method based on evolutionary learning. Using transfer component analysis, the Pareto optimal solution set in the adjacent historical environment is learned, and a set of initial populations is generated through the transfer model in the current environment. Experimental results show that this method can accelerate population convergence and accurately track the optimal solution in the new environment. Jiang et al. [29] proposed a fast dynamic multi-objective evolutionary algorithm based on manifold learning, which combines a memory mechanism with manifold transfer learning to predict the optimal individuals in new instances during evolution. The process is shown in Figure 1d. Liu et al. [30] recently proposed combining the PPS method with transfer learning to improve population prediction. Jiang et al. [31] proposed a method based on individual transfer learning to solve DMOPs. Unlike existing methods, it uses a pre-search strategy to filter out high-quality individuals with good diversity, avoiding negative transfer caused by individual agglomeration. Fan et al. [32] applied transfer learning to computationally expensive DMOPs, using surrogate-assisted evolutionary algorithms, especially MOEA/D-EGO, as the baseline to evolve and optimize under a limited number of function evaluations. In addition, transfer learning is used to map previously archived training data to the current environment to warm-start the surrogate model building process, so that the dynamic multi-objective evolutionary algorithm can better adapt to the new environment.
Considering the above strategies, transfer learning performs well on dynamic multi-objective optimization test problems that contain periodic changes and have a large degree of change. However, the application of transfer learning to solving DMOPs is still in its infancy. Multiple studies have shown that hybrid change response methods generally perform better than single methods because they can handle more diverse dynamic features, as demonstrated by the increasing use of mixed strategies in recent work [33].
However, existing transfer learning methods often need a long training time, which is the main obstacle for some DMOPs. One reason for the slow running speed is that existing transfer learning methods often realize knowledge reuse by searching a latent space, which requires more parameter settings and consumes more computing resources. As a result, a large amount of computation is wasted on searching low-quality individuals, which greatly increases the possibility of negative transfer [34,35]. If the knowledge possessed by high-quality individuals can instead be transferred (from the perspectives of convergence and diversity), then more effective and accurate prediction models can be built for applying DMOPs in various real, complex environments.
At the same time, existing dynamic multi-objective optimization algorithms based on forecasting usually use a single prediction strategy. On the one hand, a single prediction strategy cannot respond quickly and effectively to complex environmental changes; on the other hand, the population diversity generated by a single prediction strategy is poor, so the algorithm cannot track the Pareto front quickly and effectively or converge rapidly. Based on the above analysis, in order to reduce the occurrence of negative transfer and improve the running speed, this paper proposes a new environment response mechanism that combines the clustering difference strategy and the transfer learning strategy.
In this paper, the similarity between the source domain and the target domain is improved by adding a clustering difference strategy to predict individuals before transfer learning, thereby reducing the possibility of negative transfer. Then, the transfer learning method TradaBoost [36] is used to build the prediction model. A higher-quality initial population is generated through this model, followed by the subsequent multi-objective optimization. This method is therefore suitable for any population-based multi-objective optimization algorithm and can achieve a large performance improvement. Experiments on different test functions show that the proposed strategy is highly competitive in dealing with problems with different dynamic characteristics and achieves better convergence and distribution.
The contributions of this paper are summarized as follows: (1) The Pareto solution set at the next moment is predicted by the clustering difference strategy, so as to narrow the difference between the source domain and the target domain of transfer learning, thereby reducing the possibility of negative transfer. This preprocessing of the target domain is therefore necessary and makes the subsequent transfer learning more efficient. (2) After the target domain is preprocessed, a sample classifier based on the TradaBoost algorithm is used to extract high-quality populations, which can effectively improve the running speed of the algorithm, avoiding additional parameter settings and excessive consumption of computing resources.
The rest of this paper is organized as follows. Section 2 introduces some basic concepts of dynamic multi-objective optimization problems and TradaBoost. Section 3 presents the proposed clustering difference strategy and its combination with transfer learning for solving dynamic multi-objective evolutionary optimization problems. Section 4 presents the experimental setup and results and discusses the comparison with five other typical dynamic multi-objective optimization algorithms. Section 5 concludes this paper and presents an outlook on future research directions.

Dynamic Multi-Objective Optimization Problems
The mathematical form of a DMOP is as follows [37]:

min F(x, t) = ( f_1(x, t), f_2(x, t), . . . , f_m(x, t) )^T
s.t. g_i(x, t) ≤ 0, i = 1, 2, . . . , p,
     h_j(x, t) = 0, j = 1, 2, . . . , q,
     L_i ≤ x_i ≤ U_i, i = 1, 2, . . . , n,

where x = (x_1, x_2, . . . , x_n) is the decision vector and t is a time or environment variable. L_i and U_i are the lower and upper bounds of the i-th decision variable, respectively; g_i(x, t) ≤ 0, i = 1, 2, . . . , p is the i-th inequality constraint, and h_j(x, t) = 0, j = 1, 2, . . . , q is the j-th equality constraint. The purpose of solving a DMOP is to find a set of solutions at different times or in different environments such that all objectives are as small as possible.
Definition 1 (Dynamic Decision Vector Domination [38]). At time t, a decision vector x_1 Pareto dominates another vector x_2, expressed as x_1 ≻ x_2, if and only if

f_i(x_1, t) ≤ f_i(x_2, t) for all i = 1, 2, . . . , m, and
f_j(x_1, t) < f_j(x_2, t) for at least one j ∈ {1, 2, . . . , m}.

Definition 2 (Dynamic Pareto-Optimal Set (DPS) [39]). For a fixed time window t, a decision vector x* ∈ Ω is said to be non-dominated if no other decision vector x ∈ Ω dominates x* at time t. The dynamic Pareto-optimal set (DPS) is the set of all non-dominated solutions in the decision space.

Definition 3 (Dynamic Pareto-Optimal Front (DPF) [39]). The DPF is the set of objective vectors corresponding to the DPS, i.e., DPF(t) = { F(x*, t) | x* ∈ DPS(t) }.

Algorithm 1 is the main framework of a DMOEA. After initializing the current population, the algorithm employs several strategies to respond to the environment when it changes. The population is updated with the chosen strategy, and the time window t is incremented by 1 to represent the next environmental change. Next, the multi-objective problem at the current time is optimized for one generation using a static multi-objective evolutionary algorithm (SMOEA), which takes the updated population as its initial population. Finally, the process repeats until the stopping condition is met.

Algorithm 1:
The main framework of DMOEA.

Input:
The number of generations: g; the time window: t;
Output: Optimal solution x* at every time step;
Initialize population POP_0;
while stop criterion is not met do
    if change is detected then
        Update the population using some strategies: reuse memory, tune parameters, or predict solutions;
        t = t + 1;
    end if
    Optimize the population with an MOEA for one generation and obtain the optimal solution x*;
    g = g + 1;
end while
return x*
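The loop above can be sketched in Python. This is a minimal sketch, not the paper's implementation: `problem`, `smoea_step`, and `respond_to_change` are hypothetical interfaces standing in for change detection, the change response strategy, and one generation of a static MOEA.

```python
import random

def dmoea_main_loop(problem, smoea_step, respond_to_change, pop_size=200, max_gens=100):
    # Hypothetical interfaces (not prescribed by the paper):
    #   problem.dim, problem.change_detected(t, g)
    #   respond_to_change(pop, t) -> new population (memory / prediction / transfer)
    #   smoea_step(pop, t) -> (pop, best) after one generation of a static MOEA
    t = 0
    pop = [[random.random() for _ in range(problem.dim)] for _ in range(pop_size)]
    best = None
    for g in range(max_gens):
        if problem.change_detected(t, g):    # change detection
            pop = respond_to_change(pop, t)  # change response
            t += 1                           # advance the time window
        pop, best = smoea_step(pop, t)       # optimize for one generation
    return best
```

Any population-based static optimizer can be dropped in as `smoea_step`, which is why the framework is optimizer-agnostic.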

TradaBoost
This paper adopts a method called TradaBoost to meet the requirements of DMOPs. TradaBoost evolved from the AdaBoost algorithm; however, AdaBoost, like most traditional machine learning algorithms [40], assumes that the training set and the test set follow the same distribution. For transfer learning, this assumption does not hold, and the part of the training data whose distribution differs from the test data directly degrades prediction performance. The TradaBoost algorithm assigns a weight to each training sample and uses these weights to weaken the samples with differing distributions, thereby improving the model. In each training iteration, if the model misclassifies a source domain sample, this sample probably differs substantially from the target domain samples, so its weight needs to be reduced. The sample is multiplied by a weight between 0 and 1, so that in the next iteration its influence on the classification model is reduced. After a series of iterations, the weights of source domain samples that are similar to the target domain or helpful for classifying it increase, while the weights of the other source domain samples decrease.
If there are similarities between multiple source domain datasets and the target dataset, one can use multiple source domains to help learn the target task. More data can be obtained this way, making the relationship between the source data and the target data closer, the transfer process easier, and the classification more accurate.
Multi-source TradaBoost assumes that the source training data come from different source domains.In each iteration, the source domain most relevant to the target domain is selected to train a weak classifier, and finally, a strong classifier is obtained.This method can ensure that the transferred knowledge is most relevant to the target task, and through continuous learning, the TradaBoost algorithm can obtain a more accurate classifier for the target domain samples.
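The per-iteration source selection step of multi-source TradaBoost can be sketched as follows. This is a hedged illustration, not the paper's exact procedure: `fit` and `score` are hypothetical helpers (a weak-learner trainer and a target-relevance score), and target accuracy is used as a simple relevance proxy.

```python
import numpy as np

def pick_most_relevant_source(sources, target_X, target_y, fit, score):
    # `sources` is a list of (X, y) source domain datasets (assumed format).
    # `fit(X, y)` trains a weak classifier; `score(clf, X, y)` measures how
    # well it does on the target data, serving as a relevance proxy.
    best_idx, best_score = -1, -np.inf
    for i, (Xs, ys) in enumerate(sources):
        clf = fit(Xs, ys)                    # weak classifier on this source
        s = score(clf, target_X, target_y)   # accuracy on the target domain
        if s > best_score:
            best_idx, best_score = i, s
    return best_idx
```

The source whose weak classifier transfers best to the target is kept for that iteration, which matches the intent described above: only the most relevant source knowledge is transferred.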

Proposed TCD-DMOEA
This section details the transfer learning based on clustering difference for dynamic multi-objective optimization algorithm (TCD-DMOEA). Figure 2 describes the process of TCD-DMOEA. First, the framework of the algorithm is outlined. Then, the specific process of the clustering difference strategy is described. Finally, we analyze the computational complexity of the strategy.

Overall Framework
The main framework of TCD-DMOEA is introduced in Algorithm 2. When an environmental change is detected, during the first two changes, evolution is performed using SMOEA. In subsequent changes, the clustering difference strategy is used to process the target domain (see Algorithm 3), and then the transfer learning prediction model (see Algorithm 4) is used to produce high-quality individuals as the initial population for subsequent iterations.

Input:
The dynamic optimization problem F_t(x); a static multi-objective optimization algorithm SMOEA;
Output: The POS of F_t(x) at the different moments;
Initialization: POS_t = SMOEA(F_t(x)); generate randomly dominated solutions P_t;
while the environment has changed do
    t = t + 1;
    for i = 1 to N do
        Set P_t according to (13);
        Call Learner, providing it the combined training set D with the distribution over D, and get back a hypothesis h_t : X → Y;
        Calculate ε_t according to (9);
        Set β_t and update the weight vector ω_i^{t+1} according to (10);
    end for
    Get h_f(x) according to (11);
    Sample solutions x_test at the current environment;
end while

Processing of Target Domain
The output of the target domain processing stage is the predicted population. The purpose of generating the predicted population is to reduce the possibility of negative transfer in subsequent transfer learning. According to the characteristics of transfer learning, negative transfer can be mitigated by increasing the amount of effective source domain knowledge or by reducing the data distribution difference between the two domains. Through the clustering difference strategy, the characteristics of the source and target domain data are first extracted to reduce the difference between the data, and then knowledge transfer is performed. At the same time, the clustering difference strategy improves population quality, increasing the amount of effective source domain knowledge.
In a dynamic environment, the objective function changes with time, but there is a certain relationship between the objectives before and after the change. Therefore, the optimal solution information before the change can be used to predict the distribution of the next solution set. First, the population is divided into five categories using the k-means algorithm, and the centroids of these five categories are calculated separately. Then, a first-order difference is used to predict the corresponding centroids at the next moment, and these centroids form a predicted population. Figure 3 shows the prediction process of the clustering difference strategy.

Algorithm 3 gives the pseudocode of the target domain processing. The basic principle of the K-means algorithm is as follows: given a data sample X containing n objects, X = {X_1, X_2, X_3, . . . , X_n}, each object has m-dimensional attributes. The goal of the K-means algorithm is to cluster the n objects into a specified k class clusters based on the similarity between objects, with each object belonging only to the cluster whose center is nearest to it. For K-means, k cluster centers {C_1, C_2, C_3, . . . , C_k}, 1 < k ≤ n, need to be initialized first; the Euclidean distance from each object to each cluster center is then calculated as shown in Formula (5):

d(X_i, C_j) = ( Σ_{t=1}^{m} (X_it − C_jt)^2 )^{1/2},   (5)

where X_i represents the i-th object, 1 ≤ i ≤ n, C_j represents the center of the j-th cluster, 1 ≤ j ≤ k, X_it represents the t-th attribute of the i-th object, 1 ≤ t ≤ m, and C_jt represents the t-th attribute of the j-th cluster center.
The distance from each object to each cluster center is compared sequentially, and the objects are assigned to the cluster of the nearest cluster center, resulting in k class clusters { S 1 , S 2 , S 3 , . . . ,S k }.
The K-means algorithm defines the prototype of a class cluster by its center, which is the average of all objects in the cluster in each dimension; its calculation is shown in Formula (6):

C_l = (1/|S_l|) Σ_{X_i ∈ S_l} X_i,   (6)

where C_l represents the center of the l-th cluster, 1 ≤ l ≤ k, |S_l| represents the number of objects in the l-th class cluster, and X_i represents the i-th object in the l-th class cluster, 1 ≤ i ≤ |S_l|. The population is divided into five categories according to the K-means principle above, and the centroid C_i^T (i = 1, 2, . . . , 5) of each cluster is calculated after clustering.
A first-order difference is then used to derive the centroid C_i^{T+1} (i = 1, 2, . . . , 5) of each cluster at the next moment. Let P_T be the DPS obtained in time window T; the difference ΔC_i^T can be calculated by Formula (7):

ΔC_i^T = C_i^T − C_i^{T−1},   (7)

and C_i^{T+1}, the centroid of each cluster in the next time window T + 1, is obtained by Formula (8):

C_i^{T+1} = C_i^T + ΔC_i^T.   (8)

The centroids C_i^{T+1} constitute the predicted population.
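The clustering difference strategy can be sketched in plain NumPy. This is a minimal sketch, not the paper's code: a basic Lloyd's k-means stands in for Formulas (5)-(6), and the first-order difference of Formulas (7)-(8) is applied per centroid, assuming clusters have already been matched across the two time windows.

```python
import numpy as np

def cluster_centroids(pop, k=5, iters=20, seed=0):
    # Lloyd's k-means on an (n, d) population array; returns k centroids.
    rng = np.random.default_rng(seed)
    centers = pop[rng.choice(len(pop), size=k, replace=False)].copy()
    for _ in range(iters):
        # Formula (5): Euclidean distance from every object to every center
        d = np.linalg.norm(pop[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # Formula (6): each center becomes the mean of its assigned cluster
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pop[labels == j].mean(axis=0)
    return centers

def predict_centroids(c_prev, c_curr):
    # Formulas (7)-(8): C^{T+1} = C^T + (C^T - C^{T-1}), per matched cluster.
    return c_curr + (c_curr - c_prev)
```

Applying `cluster_centroids` to the DPS of windows T-1 and T and feeding the results to `predict_centroids` yields the predicted population used as the transfer learning target domain.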

Transfer Learning
After the predicted population is generated, the source and target domains for transfer learning are specified. Figure 4 shows the transfer learning process.
Given the samples with an initial weight vector ω^1 = (ω_1^1, . . . , ω_{n+m}^1)^T, where ω_i^1 is obtained from Formula (12). Here, the TradaBoost algorithm is mainly used to realize the transfer: each training sample is assigned a weight, and the weights are used to weaken the samples whose distribution differs from the target data, thereby improving the model. In each training iteration, if the model misclassifies a source domain sample, that sample may differ substantially from the target domain samples, so its weight needs to be reduced. The sample is multiplied by a weight between 0 and 1, so that in the next iteration its effect on the classification model is reduced. After a series of iterations, the weights of source domain samples that are similar to the target domain or helpful for its classification increase, while the weights of the others decrease. When training is complete, the classifier can recognize randomly generated solutions in the current environment, and those individuals identified as "good" by the classifier are selected as the initial population. The pseudocode is described in detail in Algorithm 4.
Algorithm 4 shows the main program of the proposed TCD-DMOEA method. During the transfer process, the target domain D_T is the predicted population, and the source domain D_S is the set of solutions obtained in past environments. These solutions are then labeled with c(x) : x ∈ D_T ∪ D_S → y, y ∈ {+1, −1}. For each domain, non-dominated solutions are labeled +1 and dominated solutions are labeled −1.
Call Learner with the combined training data D, the weight distribution P_t on D, and the unlabeled data S to obtain a classifier h_t : X → Y. The error rate of h_t on the target domain D_T is computed by Formula (9):

ε_t = Σ_{x_i ∈ D_T} ω_i^t |h_t(x_i) − c(x_i)| / Σ_{x_i ∈ D_T} ω_i^t.   (9)

If ε_t > 0.5, the algorithm terminates; otherwise, set β_t = ε_t/(1 − ε_t) and β = 1/(1 + √(2 ln n/N)), and set the new weight vector by Formula (10):

ω_i^{t+1} = ω_i^t β^{|h_t(x_i) − c(x_i)|} for source domain samples x_i ∈ D_S,
ω_i^{t+1} = ω_i^t β_t^{−|h_t(x_i) − c(x_i)|} for target domain samples x_i ∈ D_T.   (10)

Finally, the final classifier h_f(x) is output by Formula (11), a weighted vote of the hypotheses from the last ⌈N/2⌉ iterations, each weighted by ln(1/β_t). During the pre-build phase of the source domain, the output is the predicted population. In this approach, the main purpose of transfer learning after preprocessing is to reduce the possibility of negative transfer.
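The weight-update loop described above can be sketched as follows. This is a hedged sketch of the TradaBoost updates (Formulas (9)-(12)), not the paper's code: `fit_predict` is a hypothetical weak learner returning labels in {+1, -1}, and uniform initial weights are assumed.

```python
import numpy as np

def tradaboost_weights(Xs, ys, Xt, yt, fit_predict, N=10):
    # Xs/ys: source domain samples and labels; Xt/yt: target domain.
    # fit_predict(X, y, p, Xq): weak learner trained on (X, y) with weight
    # distribution p, returning predicted labels for query points Xq.
    n, m = len(Xs), len(Xt)
    X = np.vstack([Xs, Xt])
    y = np.concatenate([ys, yt])
    w = np.ones(n + m)                                 # initial weights (assumed uniform)
    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n) / N))  # fixed source down-weighting rate
    betas = []
    for _ in range(N):
        p = w / w.sum()                                # distribution over combined data D
        h = fit_predict(X, y, p, X)                    # weak hypothesis h_t
        miss_s = (h[:n] != y[:n])
        miss_t = (h[n:] != y[n:])
        eps = (p[n:] * miss_t).sum() / p[n:].sum()     # error on the target domain
        if eps > 0.5:                                  # termination condition
            break
        eps = max(eps, 1e-10)
        beta_t = eps / (1.0 - eps)
        w[:n] *= beta ** miss_s                        # shrink misclassified source weights
        w[n:] *= beta_t ** -miss_t.astype(float)       # grow misclassified target weights
        betas.append(beta_t)
    return w, betas
```

After training, the weighted hypotheses can be combined into the final classifier that filters randomly sampled solutions, keeping those labeled +1 as the initial population.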

Computational Complexity Analysis
This section analyzes the computational complexity of TCD-DMOEA for one iteration. According to Algorithm 2, the main cost of TCD-DMOEA comes from the following aspects. (1) The complexity of the target domain preprocessing stage mainly lies in using K-means clustering and the first-order difference to predict the non-dominated solutions at the next moment. K-means clustering requires O(I n k m) computation, where m is the number of attribute dimensions, n is the amount of data, I is the number of iterations, and k is the number of clusters. Since I, k, and m can generally be regarded as constants, this simplifies to O(n). The first-order difference requires O(n) computation, so the complexity of the target domain preprocessing stage is O(n). (2) The transfer learning stage uses an SVM as the base classifier; obtaining a strong classifier costs O(N S^2 d), where S is the population size, N is the number of iterations, and d is the dimension of the decision variable. In summary, the computational complexity of TCD-DMOEA in this work is O(N S^2 d).

Experiments

Test Problems and Performance Indicators
Our experiment was divided into two parts: the first part demonstrated the convergence and distribution uniformity of TCD-DMOEA by comparing it with several popular dynamic multi-objective algorithms. In the second part, a comparison with Tr-DMOEA [28] showed a reduction in running time, effectively reducing the possibility of negative transfer. The entire experiment was implemented in MATLAB R2019b, running on Windows 10 Pro.
The strategy proposed here was mainly used during the initialization stage of the algorithm. A suite of preprocessing and transfer learning techniques allowed us to obtain a high-quality population adapted to the current environment, which then evolved to the optimal solution after each change. Theoretically, the target domain generated by the clustering difference strategy is more similar to the source domain, and the initial population improved by the transfer learning method is more adaptable to the changing environment, so a solution closer to the actual DPF can be obtained. To verify the effectiveness of the method, this section used 20 test functions and two related metrics to measure the convergence and uniformity of the algorithm, while the running time was used to evaluate the algorithm's likelihood of negative transfer.
The 20 test functions used in this section were from CEC 2018: (1) the DF functions [41], and (2) the F functions [42]. The DF benchmark suite contains 14 problems (DF1–DF14) and the F benchmark suite contains six problems (F5–F10). The DF functions form a diverse and unbiased benchmark set covering attributes that represent various real scenarios, such as time-dependent PF/PS geometry, irregular PF shapes, disconnection, knees, etc. F5–F8 of the F functions have nonlinear correlations among decision variables. The PSs of F5–F7 are 1-D curves, and the PS of F8 is a 2-D surface. In F9, the environment changes smoothly in most cases, but sometimes the Pareto set jumps from one region to another. In F10, the geometries of two consecutive PFs are completely different.
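The dynamics of these benchmarks are driven by the time parameter t introduced with the dynamic settings below; a minimal sketch of its computation, assuming the standard CEC 2018 form with a floor (the function name is illustrative):

```python
import math

def time_parameter(tau, n_t, tau_t):
    """t = (1/n_t) * floor(tau / tau_t), the standard CEC 2018 form:
    tau is the generation counter, n_t the severity of change
    (larger n_t -> milder changes), and tau_t the frequency of change
    (the environment stays fixed for tau_t generations)."""
    return (1.0 / n_t) * math.floor(tau / tau_t)

# With n_t = 10 and tau_t = 5, t is 0 for generations 0-4,
# then steps to 0.1, 0.2, ... every 5 generations.
print([time_parameter(g, 10, 5) for g in (0, 4, 5, 10)])  # [0.0, 0.0, 0.1, 0.2]
```

The staircase shape of t is what makes the problems piecewise static: the optimizer has tau_t generations to converge before the next change.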
In the 20 test benchmarks, the time parameter t was used and defined as t = (1/n_t)⌊τ/τ_t⌋, where τ is the iteration (generation) counter, n_t is the severity of change, and τ_t is the frequency of change, as described in Table 1. Different dynamic parameters were set for the experiment. We set different change severities, change frequencies, and numbers of iterations so that each function could iterate 20 times, and the population size was set to 200, meaning that 200 solutions could be generated during evolution. In addition, the k value for K-means was set to 5. This experiment chose RM-MEDA [43] as the SMOEA optimizer of TCD-DMOEA. The following metrics were used to evaluate the performance of the different algorithms. (1) Inverted generational distance (IGD [44–46]): IGD evaluates the convergence and diversity of an algorithm by measuring the proximity between the real Pareto frontier and the Pareto frontier obtained by the algorithm. The IGD indicator is defined by Formula (14), IGD(PF*_t, PF_t) = (1/|PF*_t|) Σ_{v∈PF*_t} d(v, PF_t), where d is calculated by Formula (15), d(v, PF_t) = min_{u∈PF_t} ‖v − u‖.
Here, PF*_t is the standard POF at time t, PF_t is the POF obtained by the algorithm at time t, and d is the Euclidean distance between an individual v on PF*_t and the individual closest to v in PF_t. IGD is therefore evaluated for each individual in the standard POF: for each such individual, the closest point in the algorithm-obtained POF PF_t is found, the Euclidean distance between them is computed, and all these distances are summed and averaged. Thus IGD evaluates not only the proximity between PF*_t and PF_t, but also the distribution of individuals in PF_t; the smaller the IGD value, the better the convergence of the POF obtained by the algorithm, and the more uniform its distribution.
The MIGD [42,47] indicator is a variant of IGD, defined as the average of the IGD values over certain time steps of a run. The MIGD value is calculated by Formula (16), MIGD = (1/|T|) Σ_{t∈T} IGD(t), where IGD(t) is the IGD value at time t, T is a set of discrete time points in a run, and |T| is the cardinality of T.
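Formulas (14)–(16) translate directly into code; a minimal sketch with illustrative names:

```python
import math

def igd(pf_true, pf_approx):
    """Formula (14): average, over each reference point v in PF*_t, of the
    Euclidean distance (Formula (15)) from v to its nearest neighbour in
    the obtained front PF_t. Smaller values indicate better convergence
    and a more uniform distribution."""
    def dist(v, front):
        return min(math.dist(v, u) for u in front)
    return sum(dist(v, pf_approx) for v in pf_true) / len(pf_true)

def migd(igd_values):
    """Formula (16): MIGD is the mean of IGD(t) over the recorded time steps T."""
    return sum(igd_values) / len(igd_values)

ref = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]
print(igd(ref, ref))                     # 0.0: the fronts coincide
print(igd([[0.0, 0.0]], [[3.0, 4.0]]))   # 5.0: a single 3-4-5 distance
```

Note that IGD iterates over the reference front, so a sparse approximation that misses whole regions of PF*_t is penalized even if its points are individually close to the true front.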
(2) Maximum coverage (MS [31,48]): MS measures the extent to which the Pareto front obtained by the algorithm covers the standard Pareto front. The larger the MS value, the better the performance of the algorithm. The MS value is calculated by Formula (17), MS = sqrt((1/m) Σ_{i=1}^{m} [(min(PF_ti^max, PF*_ti^max) − max(PF_ti^min, PF*_ti^min)) / (PF*_ti^max − PF*_ti^min)]²), where PF*_t is the standard Pareto frontier at time t, and PF_t is the POF obtained by the algorithm at time t. PF_ti^max and PF_ti^min are the maximum and minimum values of the i-th objective of the POF obtained by the algorithm at time t, respectively, and PF*_ti^max and PF*_ti^min are the maximum and minimum values of the i-th objective of the real Pareto frontier at time t.
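Since Formula (17) is only given in prose, the sketch below reconstructs the usual form of this indicator (often called maximum spread): per objective it compares the extent of the obtained front against that of the reference front, then averages the squared overlap ratios.

```python
import math

def maximum_spread(pf_approx, pf_true):
    """Sketch of the MS indicator: for each objective i, the overlap of the
    obtained front's range [min, max] with the reference front's range,
    normalized by the reference range; MS = 1 means full coverage."""
    m = len(pf_true[0])
    total = 0.0
    for i in range(m):
        a_max = max(p[i] for p in pf_approx)
        a_min = min(p[i] for p in pf_approx)
        t_max = max(p[i] for p in pf_true)
        t_min = min(p[i] for p in pf_true)
        total += ((min(a_max, t_max) - max(a_min, t_min)) / (t_max - t_min)) ** 2
    return math.sqrt(total / m)

ref = [[0.0, 1.0], [1.0, 0.0]]
print(maximum_spread(ref, ref))                            # 1.0: full coverage
print(maximum_spread([[0.25, 0.75], [0.75, 0.25]], ref))   # 0.5: half the extent
```
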

Performance Comparison with Other Algorithms
In this section, performance comparison experiments are performed using the MIGD and MS indicators described above, comparing several algorithms with the algorithm proposed in this paper. First, the MIGD values of six algorithms are compared, proving the proposed algorithm's effectiveness. These six algorithms are: the dynamic multi-objective optimization algorithm combining clustering difference and transfer learning (TCD-DMOEA), a dynamic multi-objective optimization algorithm combining only transfer learning methods (Tr-DMOEA) [28], two dynamic NSGA-II algorithms (DNSGA-II-A, DNSGA-II-B) [49], PPS [42], and an MOEA based on Kalman filter predictions (KF-DMOEA) [50]. A comparison of TCD-DMOEA and Tr-DMOEA was conducted to prove the performance of our proposed strategy. From the differences in performance indicators and running time, it could be determined that the clustering difference strategy proposed in this paper not only maintains good convergence and diversity, but also effectively reduces the possibility of negative transfer.
Figure 5 plots the IGD values obtained by the different algorithms after each change. These curves show that, across the 20 test functions, the curve obtained by the proposed method was generally at the bottom and fluctuated less, meaning that the method is not only better performing, but also more stable.
The statistical results of the MIGD and MS values over 20 runs are shown in Tables 2 and 3, respectively. Table 2 shows the MIGD values of the six algorithms under three different configurations. Bold entries in the table indicate that the algorithm had the best performance on that benchmark, and the last column gives the "winner" of the comparison. As can be seen from Table 2, TCD-DMOEA obtained 44 of the 60 best results, accounting for 73.3%. KF-DMOEA achieved five best results, PPS achieved three, and DNSGA-II-A and DNSGA-II-B achieved six and two best results, respectively. Specifically, TCD-DMOEA performed well on most test functions under all dynamic test setups, DNSGA-II-A achieved better convergence than TCD-DMOEA on DF9 and DF12, and PPS was clearly superior to the other algorithms on DF11.
Table 3 shows the MS values of the six comparison algorithms. It is clear that TCD-DMOEA obtained 38 of the 60 best results, KF-DMOEA obtained only one best result, PPS obtained five, and Tr-DMOEA obtained nine. DNSGA-II-A and DNSGA-II-B achieved four and three best results, respectively. Specifically, TCD-DMOEA performed well on most test functions under all dynamic test setups, with Tr-DMOEA performing better on F8 and DNSGA-II-B performing better on DF10.
The above experimental results show that TCD-DMOEA could obtain a set of solutions with good convergence and diversity for most test problems. However, it did not perform well enough on some benchmark functions, such as DF9, DF11, and DF12. This shortcoming may be due to the large variation in the POS of these problems, which made accurate prediction difficult in the target domain processing stage, so the quality of the final output population was poor.
At the same time, it can be seen from Table 2 that, except for F8, F9, and F10, TCD-DMOEA achieved good performance on the F test function set in most cases. These three test functions are characterized by POS jumps from one region to another, and the geometries of two consecutive POFs are completely different. This results in a low degree of similarity between the source domain and the target domain, which ultimately increases the likelihood of negative transfer.

Running Speed
The most obvious sign of a reduced possibility of negative transfer is reduced running time, so the running times of the different algorithms are compared in Table 4. Table 4 shows that the running time of TCD-DMOEA was the shortest among the six algorithms, demonstrating that the proposed clustering difference strategy was very effective. TCD-DMOEA ran much faster than Tr-DMOEA. The important difference is that Tr-DMOEA mapped the different distributions that the solutions obey at different moments into a new latent space through a domain adaptation method and then searched for solutions in that latent space, and constructing the latent space required a huge amount of resources. In contrast, TCD-DMOEA improved the similarity between the source domain and the target domain by preprocessing the population and distinguished the quality of the population through classifiers, which significantly shortened the running time. Figure 6 is a histogram of the running times of the other algorithms except Tr-DMOEA, through which the running time of TCD-DMOEA can be compared visually; it was again the shortest, while the running times of the other algorithms did not differ much. These experimental results show that the proposed TCD-DMOEA method not only improved the running speed of the algorithm, but also greatly improved the quality of the Pareto optimal solution set, resulting in better convergence and distribution for the algorithm.

Conclusions
In recent years, transfer learning has been proven to be one of the effective means of solving dynamic multi-objective optimization problems. However, the effectiveness of transfer learning decreases significantly when the target domain is poorly similar to the source domain or when the transfer learning method is applied incorrectly, a phenomenon known as negative transfer. Negative transfer forces the search for solutions in the wrong direction, wasting a lot of computing resources; as a result, the running speed becomes slower and the convergence becomes worse.
In this article, a transfer learning method based on clustering difference for dynamic multi-objective optimization, TCD-DMOEA, was proposed. TCD-DMOEA applies a clustering difference strategy to increase the similarity between the target domain and the source domain, reducing the likelihood of negative transfer, and uses the TrAdaBoost algorithm to identify good-quality populations. Therefore, when the environment changes drastically, the proposed method can improve the quality of the population, thereby improving the convergence and diversity of the dynamic multi-objective algorithm while also improving its running speed.
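The TrAdaBoost mechanism mentioned above can be sketched in one weight update (Dai et al., 2007); the variable names are illustrative, and the step assumes 0 < eps_t < 0.5 for the classifier's weighted error on the target set:

```python
import math

def tradaboost_update(weights, misclassified, is_source, eps_t, n_rounds):
    """One TrAdaBoost weight update, sketched: misclassified SOURCE-domain
    samples are down-weighted (they look unlike the target domain), while
    misclassified TARGET-domain samples are up-weighted as in AdaBoost.
    weights: current sample weights; misclassified: 1 if wrong, else 0;
    is_source: domain flags; eps_t: weighted target error this round."""
    n_source = sum(is_source)
    beta = 1.0 / (1.0 + math.sqrt(2.0 * math.log(n_source) / n_rounds))
    beta_t = eps_t / (1.0 - eps_t)
    return [
        w * (beta ** m if src else beta_t ** (-m))
        for w, m, src in zip(weights, misclassified, is_source)
    ]

# Four samples: [source-correct, source-wrong, target-correct, target-wrong]
new_w = tradaboost_update([1.0, 1.0, 1.0, 1.0], [0, 1, 0, 1],
                          [True, True, False, False], eps_t=0.25, n_rounds=10)
print(new_w)  # correct samples keep weight 1.0; source-wrong shrinks, target-wrong grows
```

This asymmetric update is what lets the classifier gradually discount source-domain solutions that do not resemble the new environment, which is the sense in which it reduces negative transfer.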
The above experimental results fully demonstrate that the algorithm significantly improves the performance of dynamic optimization. Compared with the existing transfer learning-based algorithm, the proposed algorithm is tens or even hundreds of times faster at finding the POS.
Although the proposed TCD-DMOEA can generate a high-quality initial population, when the environmental changes are more complex, the accuracy of the clustering-based difference strategy decreases and the reliability of the acquired individuals becomes poor; moreover, the added classifier increases the computational complexity of the algorithm. Therefore, in future research, we will explore the following promising directions. First, it would be beneficial for the static evolution process to combine multiple response mechanisms to cope with environmental changes, rather than employing a single strategy. Second, classifiers with lower complexity could be used to speed up the optimization process. Additionally, it will be worthwhile to test TCD-DMOEA on a wider range of problems with different types of variation.

Figure 1. Description of environmental response strategies and methods: (a) diversity maintenance; (b) memory-based approach; (c) prediction-based approach; (d) manifold transfer learning method.

Figure 3. Classification of clustering dynamics. Algorithm 3 gives the pseudocode of target domain processing. The basic principle of the K-means algorithm is as follows: given a data sample X containing n objects, X = {X1, X2, X3, ..., Xn}, each object having m-dimensional attributes, the goal of K-means is to cluster the n objects into a specified k class clusters according to the similarity between objects, with each object belonging only to the class cluster whose center is the smallest distance away. For this purpose, K-means maintains k cluster centers {C1, C2, C3, ..., Ck}.
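The K-means principle described above can be sketched in a few lines (a minimal, dependency-free version; the real implementation would also use a convergence test rather than a fixed iteration count):

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal K-means sketch: assign each object to the cluster whose
    center is nearest, then recompute each center as the mean of its
    members, for a fixed number of iterations."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centers

# Two well-separated blobs should end up in two different clusters:
data = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
        [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0]]
labels, centers = kmeans(data, k=2)
print(labels)
```
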

Figure 4. Schematic of procedure case transfer. Algorithm 4 shows the main program of the proposed TCD-DMOEA method. During the transfer process, the target domain D_T is the predicted population, and the source domain D_S is the set of solutions obtained in the past environment. These solutions are then labeled by c(x): x ∈ D_T ∪ D_S → y, y ∈ {+1, −1}: in each domain, a non-dominated solution is labeled +1 and a dominated solution is labeled −1. Learner is called on the combined training data D, with the weight distribution P_t on D and the unlabeled data S, to obtain a classifier h_t: X → Y on S; the error rate of h_t on D_T is then computed by Formula (9).
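The dominance-based labeling described above can be sketched as follows (function names are illustrative, and minimization of all objectives is assumed):

```python
def dominates(a, b):
    """a Pareto-dominates b (minimization): no worse in every objective
    and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def label_solutions(objectives):
    """Label each solution +1 if it is non-dominated within `objectives`
    and -1 otherwise, mirroring the c(x) labeling that feeds the
    transfer classifier."""
    return [
        -1 if any(dominates(other, f) for other in objectives if other is not f)
        else +1
        for f in objectives
    ]

# [1,2] and [2,1] are mutually non-dominated; [2,2] is dominated by [1,2].
print(label_solutions([[1.0, 2.0], [2.0, 1.0], [2.0, 2.0]]))  # [1, 1, -1]
```
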


Figure 5. IGD values of six algorithms under the S2 configuration.


Figure 6. Average running time (s) obtained by comparing algorithms under the configuration of n_t = 5, τ_t = 10.

Table 1. The dynamic parameters.

Table 2. The MIGD values of six comparative algorithms with different dynamic test settings.

Table 3. The MS values of six comparative algorithms with different dynamic test settings.

Table 4. The average running time (seconds) of the six algorithms under the S2 configuration.
