Support Vector Machine-Assisted Importance Sampling for Optimal Reliability Design

Abstract: A population-based optimization algorithm combining the support vector machine (SVM) and importance sampling (IS) is proposed to achieve a global solution to optimal reliability design. The proposed approach is a greedy algorithm that starts with an initial population. At each iteration, the population is divided into feasible/infeasible individuals by the given constraints. After that, feasible individuals are classified as superior/inferior individuals in terms of their fitness. Then, SVM is utilized to construct the classifier dividing the feasible/infeasible domains and the classifier separating superior/inferior individuals, respectively. A quasi-optimal IS distribution is constructed by leveraging the established classifiers, from which a new population is generated to update the optimal solution. The iteration is executed repeatedly until the preset stopping condition is satisfied. The merit of the proposed approach is that the utilization of SVM avoids repeatedly invoking the reliability (objective) function and constraint functions. When the actual functions are very complicated, this can significantly reduce the computational burden. In addition, IS thoroughly explores the feasible domain so that the produced offspring cover almost the entire feasible domain, and thus effectively escape local optima. The presented examples showcase the promise of the proposed algorithm.


Introduction
There are various approaches to increase system reliability [1], for example, raising component reliability, increasing the redundancy level, exchanging the positions of important components, selecting among different redundancy methods (e.g., active vs. standby), maintenance, etc. These measures undoubtedly increase the system budget. Therefore, there needs to be a trade-off between reliability improvement and system budget [2]. No matter which measure we choose to enhance system reliability, the corresponding problem can be abstracted into a similar mathematical model, i.e., either maximizing system reliability under various resource constraints, or minimizing resource requirements under a minimum reliability requirement. Of course, different methods of increasing system reliability correspond to different design variables, and the expression of system reliability also differs. Practically, we have to choose the most appropriate measure, or combination of measures, to improve system reliability according to the problem at hand. A related problem is how to procure the optimal decision scheme from the established mathematical model.
It is widely recognized that almost all optimal reliability designs (ORDs) are nondeterministic polynomial-time hard (NP-hard) problems [3]. Canonical methods for optimization, such as linear/dynamic programming techniques, often fail or reach local optima when solving high-dimensional and complex problems. These are usually referred to as exact methods. Although much more computational effort is involved, exact methods are able to provide precise optimal solutions. These approaches are particularly advantageous for small-scale systems. More importantly, their solutions can be used to measure the performance of newly developed optimization strategies. Hence, there are also some improved versions of this kind of algorithm [4]. The difficulties associated with applying mathematical programming to large-scale engineering systems have contributed to the development of alternative solutions.
One of the alternatives is the heuristic approach. Heuristics do not guarantee precise optimal solutions, but are highly recommended for solving ORD. This is because heuristics achieve reasonable solution quality for large-scale systems within relatively short periods [5]. Interestingly, heuristics usually leverage the ranking information provided by importance measures to guide the search direction. Importance measures, which evaluate the relative importance of different components/positions in a system, can be used to prioritize components/positions by quantitatively measuring their impact on system reliability; their numerical value may not be as useful as their relative ranking [6]. The application of importance measures guides the convergence direction and limits the randomness in heuristics, so that they can reach the (near) global optimal solution sooner. The role of importance measures in system design has been proven to be crucial. The heuristic approach is more efficient than the exact method, but it still takes a long time if the problem's scale is very large.
To further accelerate convergence, researchers have turned their attention to intelligent algorithms (i.e., metaheuristics). These are generally stochastic search methods that mimic natural biological evolution, the social behavior of species, and natural/physical phenomena. Metaheuristics are currently considered to be the most promising solutions, because they can find (near) optimal solutions within reasonable CPU time. One of the earliest intelligent algorithms is simulated annealing (SA), which was invented on the basis of the annealing (slow cooling after heating) of melted metals to crystallize their structures [7]. SA can jump out of a local optimum with a certain probability, and eventually tends to the global optimum. For applications of SA to solving ORD, we refer to [8,9]. Another well-known intelligent algorithm is the genetic algorithm (GA). GA is an evolution-based optimization technique, proposed by mimicking natural selection and genetic mechanisms [10]. Selection, crossover, and mutation are the core operations of GA. Examples of papers applying GA to solve ORD are [11][12][13]. Particle swarm optimization (PSO) is a random search algorithm based on group cooperation [14]. PSO is initialized as a group of random particles within the feasible domain. In each iteration, the particles update themselves by tracking the optimal solution found by the particle itself (i.e., the individual or personal best) and the optimal solution found by the whole population (i.e., the global best). For papers that apply PSO to solve ORD, see [15][16][17]. Ant colony optimization (ACO) is a probabilistic algorithm used to find the optimal decision scheme [18]. It uses the walking paths of ants to represent feasible solutions of the optimization problem. For applications of ACO to solving ORD, see [19,20]. There are also other metaheuristics for solving ORD, such as those in [21,22].
Nonconvex optimization problems can present many discontinuous discrete feasible regions and local optima. These may trap the algorithm's iterations and degrade its ability to tackle the problem at hand. Therefore, global optimization methods must be sought to escape local optima. Revisiting the collected literature, we can see that most population-based intelligent approaches are greedy algorithms. These explore the optimal solution gradually and iteratively. The decision at each iteration is usually made according to a certain criterion based on the current situation, without considering all possible situations. The algorithm makes successive greedy choices until the optimal solution emerges. In this process, the generation (or renewal) of the next-generation population (or particle positions, etc.) is a crucial operation. For example, SA draws new candidate solutions by simulating an ergodic Markov chain whose stationary distribution is the target distribution; GA produces new individuals through selection, crossover, and mutation; and PSO updates the positions of particles in terms of particle velocity, the local optimum, and the global optimum. Despite the benefits of intelligent algorithms, there are still many issues associated with implementing these approaches. For example, it is difficult to determine the initial temperature and the temperature gradient in SA; GA may require long processing for a feasible solution to evolve; and PSO is easily trapped in local optima and lacks rigorous mathematical analysis.
In an attempt to reduce the processing time and improve the quality of solutions, particularly to escape local optima, this paper proposes a new population-based greedy algorithm that is able to reach the (near) global optimum in a relatively short time. The key ingredients of the proposed algorithm are importance sampling (IS) [23] and the support vector machine (SVM) [24,25]. Starting from an initial group of individuals uniformly generated from the design domain, a new population is produced based on the existing information about the feasible/infeasible domains and the fitness values of feasible individuals. New populations are generated iteratively until the optimal solution appears. To generate the new population in each iteration, a quasi-optimal IS probability density function (PDF) is constructed as the target distribution from which the new-generation population is drawn, leveraging the information about the constraint boundary and the fitness values of feasible individuals. To alleviate the computational burden, SVM is utilized to manage the information used to construct the IS PDF, so as to avoid invoking the objective and constraint functions numerous times. Obviously, this advantage is of great significance for complex problems. Furthermore, to speed up convergence, a number of candidate solutions are generated at each iteration. The merits of the proposed algorithm are twofold. On the one hand, IS prevents sample degeneracy (it keeps the sample diversity), and thus the exploration of the feasible space is more adequate. In addition, the constructed optimal IS PDF effectively avoids local optima. On the other hand, the utilization of SVM avoids the repeated invocation of complex functions, thus saving computation time. This advantage is evident when the investigated problem involves complicated black-box functions.
The innovations and contributions of this paper are threefold. (a) A deterministic target distribution is constructed by utilizing IS, without the need to set a series of parameters. (b) SVM is used to construct alternative models for dividing the feasible/infeasible domains and distinguishing the superior/inferior individuals. This facilitates the sampling process, because it does not need to repeatedly invoke the complex functions involved in the optimization model. (c) New individuals can be simply generated via the constructed quasi-optimal IS PDF without complicated operations. The diversity of new individuals is ensured and local optima are avoided. The rest of this paper is organized as follows. Section 2 revisits the mathematical model related to ORD. Section 3 introduces the proposed algorithm with explanations of the rationale behind it. Numerical results are given in Section 4 to showcase the feasibility of the proposed algorithm. Conclusions are drawn in Section 5.

Model Description
In the light of the requirements of designers, ORD can be formulated either to maximize the system reliability under resource constraints or to minimize the resource consumption under a minimum demand on system reliability. For brevity, we only take the former to illustrate; the proposed algorithm applies equally to the latter. Put more clearly, the mathematical model of ORD is given by [26]:

$$\max_{z} \; R_s(z) \quad \text{s.t.} \quad G_i(z) \le G_i^t, \; i = 1, 2, \cdots, n_c; \quad z^L \le z \le z^U, \tag{1}$$

where R_s(z) is the objective function (system reliability) related to design variables z, G_i(z) is the ith constraint function with preset threshold G_i^t for i = 1, 2, ..., n_c, and n_c is the number of constraints. z^L and z^U are the lower and upper bound vectors of the decision variables z.
Example 1. Take the Wi-Fi system shown in Figure 1 to illustrate. The whole area is covered by three signal networks, namely, Verizon, AT&T, and T-Mobile. Each carrier has four relay stations, and each relay station can send and receive signals in a specific block. Here, each block is covered by three consecutive staggered relay stations operated by these three carriers. The Wi-Fi system uses the strongest detected signal from the different carriers. Unequivocally, Wi-Fi signal loss in a particular area occurs if and only if the three consecutive staggered relay stations fail altogether. This Wi-Fi system can be abstracted into a Lin/Con/k/n:F system where k = 3 and n = 12. The Lin/Con/k/n:F system is a special two-terminal network that consists of an ordered sequence of n components arranged in a line. The system fails if and only if at least k consecutive components fail. The Lin/Con/k/n:F system is important and has many applications, such as pipeline systems, streetlight systems, and telecommunications systems.
To improve the reliability of this Wi-Fi system, we can increase the reliability of the relay stations or the number of relay stations. Without loss of generality, we consider a Lin/Con/k/n:F system with redundant components, as shown in Figure 2, in which z_j is the number of redundant components of the jth subsystem for j = 1, 2, ..., n, and n is the number of subsystems. The system fails if k successive subsystems fail. In this example, subsystem j contains an active-standby component and z_j − 1 cold-standby redundant components.
Suppose that the switch is required at all times, and there is a constant probability that the switching will be successful [8]. In addition, the following assumptions are made. (1) Each component/switch possesses only two states: normal and abnormal.
(2) The performance of each component/switch is not affected by the others. (3) There is no repair/maintenance during the whole service cycle. (4) The components or switches of a subsystem are of the same type. (5) The switching to activate the cold-standby redundant components is imperfect. (6) The time to failure of components is exponentially distributed.
Then, following [27,28], the reliability of subsystem j is:

$$R_j(t) = r_j(t) + \sum_{s=1}^{z_j-1} \int_0^t \rho_j(u)\, f_j^{(s)}(u)\, r_j(t-u)\, \mathrm{d}u, \tag{2}$$

where r_j(t) is the component reliability at moment t for the jth subsystem, i.e., the probability that the lifetime of a component in the jth subsystem is larger than t; ρ_j(t) is the reliability of the switching mechanism at moment t; and f_j^{(s)}(u) is the PDF of the sth failure in the jth subsystem, i.e., the probability density that the sth failure in the jth subsystem arrives at moment u.
The first term of (2) indicates that the active-standby component remains in a good state until moment t; during this period, no cold-standby redundant components are put into operation. The summation term in (2) represents s cold-standby redundant components being sequentially activated through the switch. This implies that the initial active-standby component and the first s − 1 cold-standby redundant components have failed before moment t, and the sth cold-standby redundant component works until moment t. There are s failures in total, and all s switchings are required to be successful to ensure that the subsystem is reliable at moment t.
However, it is difficult to derive the closed form of (2) because of the intractability of the integration. A more accessible lower bound R̲_j(t) of the concerned reliability is given in [27], based on the non-increasing property of the switch reliability (i.e., ρ_j(u) ≥ ρ_j(t) for u ≤ t):

$$R_j(t) \ge \underline{R}_j(t) = r_j(t) + \rho_j(t) \sum_{s=1}^{z_j-1} \int_0^t f_j^{(s)}(u)\, r_j(t-u)\, \mathrm{d}u. \tag{3}$$

Obviously, R̲_j(t) is a conservative estimate of R_j(t). When ρ_j(t) is close enough to 1, (3) is a good estimate of (2). For brevity, we no longer distinguish between R_j(t) and R̲_j(t).
Henceforth, unless otherwise specified, the system reliability refers to its lower bound.
Since the switch's reliability is a constant, (3) can be simplified as:

$$R_j(t) = r_j(t) + \rho_j \sum_{s=1}^{z_j-1} \int_0^t f_j^{(s)}(u)\, r_j(t-u)\, \mathrm{d}u, \tag{4}$$

where ρ_j is the reliability of the switches in subsystem j. In terms of the exponential time-to-failure assumption, the occurrences of subsystem failures can be treated as a homogeneous Poisson process prior to the z_jth failure. On this basis, the probability that exactly s failures arrive by moment t is Poisson-distributed [27,29,30]. Therefore:

$$\int_0^t f_j^{(s)}(u)\, r_j(t-u)\, \mathrm{d}u = \frac{(\beta_j t)^s}{s!}\, e^{-\beta_j t}, \tag{5}$$

where β_j is the component failure rate (the exponential distribution parameter) of the jth subsystem. Taking (5) into (4), we can obtain:

$$R_j(t) = e^{-\beta_j t} \left[ 1 + \rho_j \sum_{s=1}^{z_j-1} \frac{(\beta_j t)^s}{s!} \right]. \tag{6}$$

After that, the reliability of this Lin/Con/k/n:F system is obtained by the recursive function, as follows:

$$R_s(t; k, n) = R_s(t; k, n-1) - R_{n-k}(t) \prod_{j=n-k+1}^{n} \left[ 1 - R_j(t) \right] R_s(t; k, n-k-1), \tag{7}$$

with the boundary condition R_s(t; k, n) = 1 for n < k (and the convention R_0(t) = 1). Now, the goal is to design a Lin/Con/k/n:F system under system-level constraints such that the system reliability (7) is maximized. For simplicity, the design variables are temporarily set as the redundancy levels here, that is, z = {z_1, z_2, ..., z_n}. Then, the mathematical model of this design task is as follows:

$$\max_{z} \; R_s(t; k, n) \quad \text{s.t.} \quad G_1(z) = \sum_{i=1}^{n} c_i z_i \le C, \quad G_2(z) = \sum_{i=1}^{n} v_i z_i \le V, \quad z_i \in \mathbb{N}^+, \tag{8}$$

where G_1(z) and G_2(z) are the cost and volume constraints, respectively; c_i and v_i are parameters related to these two constraints; C and V are the thresholds of these two constraints; r_i is the reliability of the components of subsystem i; and N+ is the set of positive integers. In addition, these constraints are dimensionless and can be regarded as constraints after standardization. It is seen that (8) involves a complex system reliability function of the decision variables. To explore the optimal decision scheme, the recursive approach is usually adopted to estimate the system reliability under each candidate decision. However, for each candidate solution, the recursion takes a long time to procure a precise estimate. This consumes large computational effort and reduces the efficiency of the whole optimization procedure. To mitigate the computational burden, we propose an SVM-assisted IS approach to address the formulated ORD.
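As a concreteness check, the Poisson-form subsystem reliability of (6) and the Lin/Con/k/n:F recursion of (7) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names are ours, and the recursion is the standard consecutive-k-out-of-n:F one with the boundary condition stated above.

```python
import math
from functools import lru_cache

def subsystem_reliability(z_j, beta_j, rho_j, t):
    """Lower-bound reliability of subsystem j (Eq. (6)):
    one active unit plus z_j - 1 cold-standby units, imperfect switching rho_j."""
    poisson_terms = sum((beta_j * t) ** s / math.factorial(s) for s in range(1, z_j))
    return math.exp(-beta_j * t) * (1.0 + rho_j * poisson_terms)

def system_reliability(R, k):
    """Reliability of a Lin/Con/k/n:F system via the recursion of Eq. (7).
    R is the list of subsystem reliabilities [R_1, ..., R_n]."""
    n = len(R)

    @lru_cache(maxsize=None)
    def rec(m):
        # Reliability of the first m subsystems; fewer than k cannot fail.
        if m < k:
            return 1.0
        prod_fail = 1.0
        for j in range(m - k + 1, m + 1):     # (1 - R_{m-k+1}) ... (1 - R_m)
            prod_fail *= 1.0 - R[j - 1]
        p_front = 1.0 if m - k == 0 else R[m - k - 1]  # R_{m-k}, with R_0 = 1
        return rec(m - 1) - p_front * prod_fail * rec(m - k - 1)

    return rec(n)
```

For k = 1 the recursion reduces to the series system (product of reliabilities), and for k = n it reduces to the parallel system, which gives an easy sanity check.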

Proposed Solution Procedure
To facilitate understanding, we use (1) to illustrate the proposed population-based optimization algorithm. Following custom, we first transform (1) into a minimization problem, as follows:

$$\min_{z} \; H(z) \quad \text{s.t.} \quad G_i(z) \le G_i^t, \; i = 1, 2, \cdots, n_c; \quad z^L \le z \le z^U. \tag{9}$$

Here, H(z) = −R_s(z) is the new objective function.
The general process of the population-based greedy approach for exploring the optimal solution of (9) is presented in Algorithm 1, in which l stands for the lth iteration and Iter_max is the maximum number of iterations.
Algorithm 1 General process of the population-based optimization approach
1. Produce the first-generation population.
2. For l = 1 to Iter_max:
3. Sift out feasible individuals from the whole population.
4. Evaluate the fitness values of feasible solutions.
5. Produce the next-generation population.
6. End for
The first-generation individuals are usually generated by evenly occupying the whole design space, in order to capture more information about the feasible domain and procure a relatively good solution at the initial design stage. To achieve this goal, we can use stratified sampling approaches, such as Latin hypercube sampling, or low-discrepancy sampling strategies, such as the Sobol sequence.
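A minimal pure-Python sketch of Latin hypercube sampling for the first-generation population follows. The function name and representation of `bounds` are ours; any stratified or low-discrepancy generator would serve the same purpose.

```python
import random

def latin_hypercube(n_samples, bounds, rng=None):
    """Latin hypercube sampling: each dimension is split into n_samples
    equal-probability strata, one sample per stratum, with strata randomly
    paired across dimensions."""
    rng = rng or random.Random(42)
    dim = len(bounds)
    samples = [[0.0] * dim for _ in range(n_samples)]
    for d, (lo, hi) in enumerate(bounds):
        strata = list(range(n_samples))
        rng.shuffle(strata)                   # random pairing across dimensions
        for i in range(n_samples):
            u = (strata[i] + rng.random()) / n_samples  # uniform point in stratum
            samples[i][d] = lo + u * (hi - lo)
    return samples
```

By construction, every one-dimensional projection of the resulting point set hits each of the n_samples strata exactly once, which is what "evenly occupying the design space" asks for.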
Then, we process the initial population: the given constraints are utilized to filter out infeasible individuals while retaining feasible ones. The current optimal solution is updated as the feasible individual with the minimum objective function value. After that, a new-generation population should be produced with the intention of improving the solution. Before this, a criterion, dubbed the fitness, is usually used to evaluate the existing feasible individuals, so as to determine the informative parent individuals (i.e., superior individuals) for the next generation. These superior individuals may be directly inherited by the offspring or act as guides to produce better offspring. Then, we process the new population with the same strategy used for the previous population, in order to further refine the solution. This process proceeds iteratively until the termination criterion is met or the maximum iteration number (Iter_max) is reached.
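The filter-evaluate-update skeleton described above can be sketched as follows. This is a hedged outline: `produce_offspring` is a stand-in for the population-generation step (step 5 of Algorithm 1), `constraints` is assumed to be a list of (function, threshold) pairs, and all names are illustrative.

```python
def evolve(population, objective, constraints, produce_offspring, max_iter=50):
    """Greedy population loop: filter by constraints, update the incumbent
    best feasible solution, then delegate offspring generation to the caller."""
    best, best_val = None, float("inf")
    for _ in range(max_iter):
        # Keep individuals satisfying every constraint g(z) <= threshold.
        feasible = [z for z in population if all(g(z) <= t for g, t in constraints)]
        for z in feasible:
            val = objective(z)  # minimization: smaller fitness is better
            if val < best_val:
                best, best_val = z, val
        population = produce_offspring(feasible)
    return best, best_val
```

Any population-based method in this family (GA, PSO, or the proposed SVM-assisted IS) differs essentially only in how `produce_offspring` is realized.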
From the above analyses, it is seen that step 5 is the core of the whole optimization approach, i.e., the way the next-generation population is produced is the key ingredient of the optimization algorithm. The quality of the offspring severely affects the quality of the final solution and the convergence speed of the algorithm. Generally, we hope that the new population has the following peculiarities: (i) falling within the feasible region as far as possible; (ii) mining the information of the feasible domain as much as possible; and (iii) possessing better fitness than their parents. These are the directions along which the proposed approach improves the efficacy of population-based approaches. Obviously, for desired feature (i), we need to resort to the constraint functions G_i(z), because they decide whether an individual is feasible. For feature (ii), new individuals should be produced as evenly as possible in the feasible area, in order to fully mine the information of the feasible domain and escape local optima. As for (iii), we turn to the current feasible individuals for help, striving to make new individuals better than their parents. To produce better offspring, the first idea that comes to mind is to straightforwardly generate new individuals according to a given proposal (or target) distribution. As such, the problem is transformed into how to establish a suitable proposal distribution that generates excellent offspring.
Motivated by these facts, we propose a new way to produce offspring by drawing upon the principles of IS and SVM. The merits of the proposed algorithm can be explained from two perspectives. From the sampling perspective, the proposed algorithm helps to overcome sample degeneracy, keeps the diversity of individuals, and ensures that each individual is informative. From the optimization perspective, the proposed algorithm brings more exploration to the neighborhoods of good candidate solutions. It pays equal attention to all possible solution regions, rather than focusing only on elite parents, in order to avoid local optima. This advantage is very important for problems with multiple discrete feasible regions, especially when the importance of each feasible region is close.

Importance Sampling for Optimal Proposal Distribution
Let f(z) be the prior joint PDF of variables z, and g(z) be the PDF of the needed proposal distribution. Then, for any integrable function ϕ(z), its integral with respect to f(z) equals:

$$I_\phi = \int \phi(z) f(z)\, \mathrm{d}z. \tag{10}$$

Taking advantage of the instrumental PDF g(z), (10) can be equivalently expressed as:

$$I_\phi = \int \phi(z) \frac{f(z)}{g(z)}\, g(z)\, \mathrm{d}z. \tag{11}$$

If we draw N independent and identically distributed (i.i.d.) samples {z_i}_{i=1}^N from g(z) and set their weights {ω_i}_{i=1}^N according to:

$$\omega_i = \frac{f(z_i)}{g(z_i)}, \quad i = 1, 2, \cdots, N, \tag{12}$$

then, in view of (11), the estimate of I_ϕ is:

$$\hat{I}_\phi = \frac{1}{N} \sum_{i=1}^{N} \omega_i\, \phi(z_i). \tag{13}$$

This instrumental PDF g(z) is also referred to as the IS PDF corresponding to f(z). The most direct choice of IS PDF g(z) is to transfer the sampling center from the mean point to an informative point, as shown in Figure 3. Figure 3 depicts a 2D case in the standard normal space. The dashed lines stand for the iso-probability-density lines of f(z) or g(z). The mean point of f(z) is the origin, and the sampling center of g(z) is z*. Now, suppose that subspace 1 is the region of interest (e.g., the feasible domain), while subspace 2 is a region of no concern (e.g., the infeasible domain). There is a boundary separating these two spaces. The purpose of sampling is to place samples in subspace 1 as much as possible. Obviously, f(z) cannot accomplish this goal, but g(z) can. This IS PDF is easy to understand, but its defects are evident. If a problem has multiple informative points whose importance is close, this IS will be trapped in local optima; moreover, for a practical problem, we cannot know in advance whether it has multiple informative points. For the investigated problem, an informative point can be viewed as a local optimal solution. Thus, we need to explore a more suitable IS strategy that globally explores the domain of interest.
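The reweighting scheme of (10)-(13) is easy to exercise on a toy problem. The sketch below (pure Python, illustrative names) estimates a standard normal tail probability with a proposal centered at the informative point z* = 2, i.e., the shifted-center IS of Figure 3.

```python
import math
import random

def importance_sampling_estimate(phi, f_pdf, g_pdf, g_sampler, n, rng=None):
    """Estimate I = E_f[phi(Z)] by drawing from the proposal g and
    reweighting each sample with w_i = f(z_i) / g(z_i)."""
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(n):
        z = g_sampler(rng)
        total += (f_pdf(z) / g_pdf(z)) * phi(z)
    return total / n

# Demo: tail probability P(Z > 2) under N(0, 1), proposal N(2, 1).
std_norm = lambda z, mu: math.exp(-(z - mu) ** 2 / 2) / math.sqrt(2 * math.pi)
p_tail = importance_sampling_estimate(
    lambda z: 1.0 if z > 2 else 0.0,
    lambda z: std_norm(z, 0.0),
    lambda z: std_norm(z, 2.0),
    lambda rng: rng.gauss(2.0, 1.0),
    n=50000,
)
```

Sampling directly from N(0, 1) would place roughly 2% of the samples in the tail; the shifted proposal places about half of them there, which is exactly the variance-reduction effect the text describes.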
It is seen that the expectation of the estimate Î_ϕ is:

$$\mathbb{E}\big(\hat{I}_\phi\big) = \mathbb{E}\left[ \frac{1}{N} \sum_{i=1}^{N} \omega_i\, \phi(z_i) \right] = \frac{1}{N} \sum_{i=1}^{N} \mathbb{E}\left[ \phi(z_i) \frac{f(z_i)}{g(z_i)} \right]. \tag{14}$$

Since {z_i}_{i=1}^N are i.i.d. samples from g(z), (14) can be further transformed into:

$$\mathbb{E}\big(\hat{I}_\phi\big) = \int \phi(z) \frac{f(z)}{g(z)}\, g(z)\, \mathrm{d}z = I_\phi. \tag{15}$$

This indicates that (13) is an unbiased approximation of I_ϕ. Then, the variance of the estimate Î_ϕ is:

$$\mathbb{V}\big(\hat{I}_\phi\big) = \mathbb{V}\left[ \frac{1}{N} \sum_{i=1}^{N} \omega_i\, \phi(z_i) \right]. \tag{16}$$

In the same vein, because {z_i}_{i=1}^N are i.i.d. samples from g(z), V(Î_ϕ) can be converted into:

$$\mathbb{V}\big(\hat{I}_\phi\big) = \frac{1}{N}\, \mathbb{V}_g\left[ \phi(z) \frac{f(z)}{g(z)} \right]. \tag{17}$$

Because the variance of the samples converges to that of the population in the sense of probability, we can obtain:

$$\mathbb{V}_g\left[ \phi(z) \frac{f(z)}{g(z)} \right] \approx \frac{1}{N} \sum_{i=1}^{N} \big[ \omega_i\, \phi(z_i) \big]^2 - \hat{I}_\phi^2. \tag{18}$$

Substituting (18) into (17), V(Î_ϕ) can be approximated by:

$$\mathbb{V}\big(\hat{I}_\phi\big) \approx \frac{1}{N} \left\{ \frac{1}{N} \sum_{i=1}^{N} \big[ \omega_i\, \phi(z_i) \big]^2 - \hat{I}_\phi^2 \right\}. \tag{19}$$

Reducing the variance V(Î_ϕ) to 0, we can obtain:

$$g_{opt}(z) = \frac{|\phi(z)|\, f(z)}{\int |\phi(z)|\, f(z)\, \mathrm{d}z}, \tag{20}$$

where g_opt(z) is the optimal choice of g(z), i.e., the optimal IS PDF. This optimal IS PDF g_opt(z) no longer gives maximum priority to a certain point, but assigns priority according to the contribution of each point itself to the solution. Its advantages are escaping from local optima and avoiding the search for an important point around which to construct the IS PDF.
Figure 4a shows a 2D problem with multiple important points (regions), where the shaded area represents the region of interest. It is seen that this example possesses discrete domains of interest that look like a chessboard. If we use the IS shown in Figure 3 to sample these domains, a possible result is shown in Figure 4b. It can be observed that the vast majority of samples are concentrated in a local area. If the best solution lies in this local area, this case happens to reach the global optimum. However, if the global optimum is far away from this region, this case is obviously caught in a local optimum. Figure 4c presents the sampling result obtained by the optimal IS PDF g_opt(z). Compared with Figure 4b, the generated samples cover multiple regions of interest. Therefore, the feasible regions are explored more fully, and the possibility of obtaining the global optimal solution is obviously larger. Now, recall that our purpose is to produce new individuals within the feasible domain that have better fitness than their parents. Let I_F(z) be an indicator function such that I_F(z) = 1 if z belongs to the feasible domain, and I_F(z) = 0 otherwise. Furthermore, let I_λ(z) be the indicator function such that I_λ(z) = 1 if H(z) ≤ λ, and I_λ(z) = 0 otherwise. λ is a constant related to the fitness. For two feasible individuals, z_i and z_j, if H(z_i) ≤ H(z_j), we say that the fitness of z_i is better than that of z_j. Feasible individuals that satisfy H(z) ≤ λ are referred to as superior individuals, and those with H(z) > λ are inferior individuals.
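The two indicator functions can be sketched directly. This is an illustrative helper, with `constraints` assumed to be (function, threshold) pairs and `H` the minimization objective.

```python
def make_indicators(constraints, H, lam):
    """Build the feasibility indicator I_F and the superiority indicator
    I_lambda: I_F(z) = 1 iff every g(z) <= threshold; I_lam(z) = 1 iff
    H(z) <= lam (superior individual)."""
    def I_F(z):
        return 1 if all(g(z) <= t for g, t in constraints) else 0

    def I_lam(z):
        return 1 if H(z) <= lam else 0

    return I_F, I_lam
```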
After that, we clarify the specific form of ϕ(z) as:

$$\phi(z) = I_F(z)\, I_\lambda(z). \tag{21}$$

Here, ϕ(z) is an indicator function such that ϕ(z) = 1 if z is a feasible point with objective function value no larger than λ, and ϕ(z) = 0 otherwise.
After that, we clarify the specific form of () as: Here, () stands for an indicator function so that () = 1 if  is a feasible point with objective function value smaller than , and () = 0 otherwise.Using (21), the optimal proposal distribution (20) can be further expressed as: Sampling from   () theoretically can obtain the optimal desired offspring.However, this optimal IS PDF   () is not available in practice, because we do not have any information of the feasible domain in advance.That is,   () is unknown and should be explored.Meanwhile, we need to determine the threshold value  , in order to determine   () .Hence, we can only integrate the current available information to establish an asymptotical alternative to the optimal IS PDF   (), in order to generate Using (21), the optimal proposal distribution (20) can be further expressed as: Sampling from g opt (z) theoretically can obtain the optimal desired offspring.However, this optimal IS PDF g opt (z) is not available in practice, because we do not have any informa- tion of the feasible domain in advance.That is, I F (z) is unknown and should be explored.Meanwhile, we need to determine the threshold value λ, in order to determine I λ (z).Hence, we can only integrate the current available information to establish an asymptotical alternative to the optimal IS PDF g opt (z), in order to generate offspring according to the proposal distribution.Obviously, the current information we have is that from parent populations.The indicator function I F (z) is unknown, but we can construct an alternative model ÎF (z) for it by leveraging the available information of the feasible/infeasible domains.In the same vein, the alternative model Îλ (z) for I λ (z) can be also established through the data set including superior individuals (with H(z) ≤ λ) and inferior individuals (H(z) > λ).
Then, an asymptotic model ĝ_opt(z) for g_opt(z) is constructed as follows:

ĝ_opt(z) = ÎF(z) Îλ(z) f(z) / ∫ ÎF(z) Îλ(z) f(z) dz ∝ ÎF(z) Îλ(z) f(z),

where ĝ_opt(z) is also referred to as the quasi-optimal IS PDF.
The remaining issue is how to construct the alternative models ÎF(z) and Îλ(z). To construct the alternative model for I_F(z), we utilize the existing feasible and infeasible individuals as the training data set. Meanwhile, the alternative model for I_λ(z) is constructed by using two sets of feasible individuals: one set contains feasible individuals whose objective function values are larger than λ (inferior individuals), and the other contains feasible individuals whose objective function values are smaller than λ (superior individuals). The alternative models are constructed by SVM using these data sets, since both tasks are binary-classification problems and SVM is good at handling such problems. In the following section, we first present a brief review of SVM. Then, the concrete procedures for constructing the alternative models ÎF(z) and Îλ(z) via SVM are presented.
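As a minimal, self-contained sketch of this construction (assuming scikit-learn is available, and reusing the same toy constraint g and objective H as the illustrative stand-ins, not the paper's actual functions), the two classifiers can be fitted as follows:

```python
import numpy as np
from sklearn.svm import SVC

# Toy problem (illustrative stand-ins, not the paper's reliability system):
def g(z):  # constraint: z is feasible iff g(z) <= 0
    return np.sum(z**2, axis=1) - 4.0

def H(z):  # objective to minimize
    return np.sum((z - 1.0)**2, axis=1)

rng = np.random.default_rng(1)
pop = rng.uniform(-3, 3, size=(300, 2))
feas_mask = g(pop) <= 0.0

# Classifier I_F_hat: feasible (+1) versus infeasible (-1) individuals.
y_F = np.where(feas_mask, 1, -1)
clf_F = SVC(kernel="rbf", C=100.0).fit(pop, y_F)

# Classifier I_lam_hat: superior (+1) versus inferior (-1) feasible individuals.
feas = pop[feas_mask]
lam = np.median(H(feas))
y_lam = np.where(H(feas) <= lam, 1, -1)
clf_lam = SVC(kernel="rbf", C=100.0).fit(feas, y_lam)

def I_F_hat(z):
    """Alternative model for the feasibility indicator I_F(z)."""
    return (clf_F.predict(z) == 1).astype(int)

def I_lam_hat(z):
    """Alternative model for the superiority indicator I_lam(z)."""
    return (clf_lam.predict(z) == 1).astype(int)
```

Once fitted, evaluating ÎF(z) and Îλ(z) requires no further calls to g or H, which is the source of the computational savings the paper describes.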

SVM for Alternative Model
Given a binary-classification problem, let D = {(z^(i), y^(i)), i = 1, 2, ..., N_t} be the set of labeled training data, where z^(i) is the ith training sample, y^(i) ∈ {−1, +1} is the label of z^(i), and N_t is the number of training samples. SVM aims to search for an optimal decision hyperplane for which all points labeled "−1" are located on one side and all points labeled "+1" on the other side [24]. As shown in Figure 5, Figure 5a shows arbitrary hyperplanes that can distinguish the two types of samples, while Figure 5b represents the optimal classification hyperplane.

A possible hyperplane that divides the sample space into two subspaces is:

P: a^T z + b = 0,

where the weight vector a is perpendicular to the hyperplane, and b is a scalar parameter that represents the bias.
To determine a and b so as to orient the hyperplane as far as possible from the closest samples, two hyperplanes P_1 and P_2 parallel to the decision boundary P are introduced as follows:

P_1: a^T z + b = +1,  P_2: a^T z + b = −1. (25)

There are no points between P_1 and P_2. The shortest distance from the decision boundary P to P_1 (or P_2) is 1/||a||; thus the margin between P_1 and P_2 is 2/||a||. All training points should satisfy y^(i)(a^T z^(i) + b) ≥ 1. Therefore, determining the optimal hyperplane P with maximum margin is equivalently reduced to finding P_1 and P_2 that give the maximum margin, as follows:

min: (1/2)||a||^2, s.t. y^(i)(a^T z^(i) + b) ≥ 1, i = 1, 2, ..., N_t. (26)

For nonlinearly separable samples, SVM first maps the data into a higher-dimensional feature space where the points are linearly separable, as shown in Figure 6. Let Φ(z) be the nonlinear mapping function; then (26) in the higher-dimensional feature space becomes:

min: (1/2)||a||^2, s.t. y^(i)(a^T Φ(z^(i)) + b) ≥ 1, i = 1, 2, ..., N_t. (27)

Furthermore, SVM can be extended to allow for imperfect separation by penalizing the data falling between P_1 and P_2. First, we introduce the nonnegative slack variables ξ_i ≥ 0 so that:

y^(i)(a^T Φ(z^(i)) + b) ≥ 1 − ξ_i, i = 1, 2, ..., N_t.

Then, adding a penalizing term to the objective function in (27), the optimization problem is now formulated as:

min: (1/2)||a||^2 + η Σ_{i=1}^{N_t} ξ_i, s.t. y^(i)(a^T Φ(z^(i)) + b) ≥ 1 − ξ_i, ξ_i ≥ 0, (29)

where η is the penalty factor. The Lagrangian function for (29) is:

L(a, b, ξ; α, γ) = (1/2)||a||^2 + η Σ_i ξ_i − Σ_i α_i [y^(i)(a^T Φ(z^(i)) + b) − 1 + ξ_i] − Σ_i γ_i ξ_i,

where {α_i}_{i=1}^{N_t} and {γ_i}_{i=1}^{N_t} are Lagrange multipliers satisfying α_i ≥ 0 and γ_i ≥ 0. Then, the optimization problem (29) can be converted into:

max_{α, γ} min_{a, b, ξ} L(a, b, ξ; α, γ). (31)

The KKT conditions corresponding to (31) are as follows:

∂L/∂a = 0 ⇒ a = Σ_i α_i y^(i) Φ(z^(i)), (32)
∂L/∂b = 0 ⇒ Σ_i α_i y^(i) = 0, (33)
∂L/∂ξ_i = 0 ⇒ η − α_i − γ_i = 0, (34)
α_i [y^(i)(a^T Φ(z^(i)) + b) − 1 + ξ_i] = 0, (35)
γ_i ξ_i = 0. (36)

From conditions (35) and (36), it is seen that when 0 < α_i < η, we get ξ_i = 0 and y^(i)(a^T Φ(z^(i)) + b) = 1, i.e., such points lie exactly on P_1 or P_2. Taking (32)-(34) into (31), we obtain:

max: Σ_{i=1}^{N_t} α_i − (1/2) Σ_{i=1}^{N_t} Σ_{j=1}^{N_t} α_i α_j y^(i) y^(j) k(z^(i), z^(j)), s.t. Σ_{i=1}^{N_t} α_i y^(i) = 0, 0 ≤ α_i ≤ η,

where k(z_i, z_j) = Φ(z_i)^T Φ(z_j) is the kernel function. We do not need to know the explicit expression of the mapping function Φ(z), as long as the kernel function k(z_i, z_j) is given. For an arbitrary untrained point z, its label predicted by the trained SVM is:

ŷ(z) = s(Σ_{j=1}^{N*} α*_j y*_j k(z*_j, z) + b),

where s(·) is the symbolic (sign) function, z*_j for j = 1, 2, ..., N* are the N* support vectors, y*_j is the label of z*_j, and α*_j is the Lagrange multiplier corresponding to support vector z*_j.
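The prediction rule above can be checked numerically against a fitted kernel SVM. In scikit-learn's SVC, the products α*_j y*_j are stored in `dual_coef_` and the support vectors in `support_vectors_`, so the sign rule can be reassembled by hand (the data set here is hypothetical):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X = rng.normal(size=(120, 2))
y = np.where(X[:, 0] ** 2 + X[:, 1] ** 2 > 1.0, 1, -1)  # nonlinearly separable labels

gamma = 0.5
clf = SVC(kernel="rbf", gamma=gamma, C=10.0).fit(X, y)

def rbf(a, b):
    # Kernel function k(z_i, z_j) = exp(-gamma * ||z_i - z_j||^2)
    return np.exp(-gamma * np.sum((a - b) ** 2, axis=-1))

def predict_from_support_vectors(z):
    # s( sum_j alpha*_j y*_j k(z*_j, z) + b ):
    # SVC stores alpha*_j * y*_j in dual_coef_ and b in intercept_.
    coeff = clf.dual_coef_.ravel()          # alpha*_j * y*_j for each support vector
    kvals = rbf(clf.support_vectors_, z)    # k(z*_j, z) for all support vectors
    return int(np.sign(coeff @ kvals + clf.intercept_[0]))

z_new = np.array([0.2, -0.1])
assert predict_from_support_vectors(z_new) == clf.predict(z_new[None, :])[0]
```

Note that only the support vectors enter the sum; the non-support vectors have α_i = 0 and contribute nothing, which is exactly Remark 1 below.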
Remark 1. Only those samples that lie closest to the decision boundary P satisfy α_i > 0, and these samples are referred to as the support vectors (the "*" points in Figure 5b). For the non-support vectors, the corresponding Lagrange multipliers equal zero.
Remark 2. Parameter b can be solved for using any single support vector; for accuracy, however, the estimate of b corresponding to each support vector is calculated, and their mean value is taken as the final estimate of b.
Remark 3. The soft penalty η permits misclassification, and increasing η produces a stricter classification: reducing η towards 0 makes misclassification less important, while increasing η to infinity means no misclassification is allowed.
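Remark 3 can be illustrated with scikit-learn's SVC, where the penalty factor (η in this paper, the `C` parameter in the library) controls how strongly margin violations are punished. On a hypothetical overlapping data set, a small penalty leaves many points inside the margin (many support vectors), while a large penalty tightens the fit:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
# Two overlapping Gaussian classes, so some misclassification is unavoidable.
X = np.vstack([rng.normal(-1.0, 1.0, (100, 2)), rng.normal(1.0, 1.0, (100, 2))])
y = np.array([-1] * 100 + [1] * 100)

loose = SVC(kernel="linear", C=0.01).fit(X, y)    # small penalty: wide, tolerant margin
strict = SVC(kernel="linear", C=100.0).fit(X, y)  # large penalty: violations punished hard

# With a larger penalty, fewer points remain inside or beyond the margin,
# so the strict model typically keeps fewer support vectors.
n_sv_loose, n_sv_strict = len(loose.support_), len(strict.support_)
```

This is only a sketch of the trade-off; in practice η (and the kernel parameters) are tuned, e.g., by cross-validation.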
So far, we have expounded the basic idea of SVM. Hereinafter, the procedures for constructing the alternative models ÎF(z) and Îλ(z) via SVM are demonstrated, as shown in Algorithms 2 and 3, respectively.
(3) In step 9, we construct the initial SVM model ÎF(z) by using the initial information on the feasible/infeasible domains. Then, the SVM model ÎF(z) is updated with the expanded data set (see step 12). This adequately excavates the information of the design domain, so a more precise asymptotic boundary can be constructed to separate the feasible domain from the infeasible domain. (4) The quasi-optimal IS PDF ĝ_opt(z) is established by using the constructed SVM models ÎF(z) and Îλ(z). Since the denominator ∫ ÎF(z) Îλ(z) f(z) dz is a constant that does not affect the probability density, we can simply utilize the numerator ÎF(z) Îλ(z) f(z) to produce the offspring. Furthermore, since we only know the lower and upper bounds of the decision variables z, it is convenient to regard the prior distribution of z as uniform. That is, f(z) is a constant, so we can use ÎF(z) Îλ(z) to produce new individuals. The modified Metropolis-Hastings sampler is applied to generate the quasi-optimal new individuals, and a thinning procedure is used to ensure these individuals are independent [31].
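A minimal sketch of this offspring-generation step follows. Since f(z) is uniform, the target is proportional to ÎF(z)Îλ(z), and a random-walk Metropolis sampler with thinning can draw approximately independent new individuals. The two classifiers are replaced here by hypothetical analytic indicators so the example is self-contained; this is plain random-walk Metropolis, not the paper's modified sampler of [31]:

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical stand-ins for the fitted SVM classifiers I_F_hat and I_lam_hat:
def I_F_hat(z):                      # feasible domain: a disc of radius 2
    return 1 if z @ z <= 4.0 else 0

def I_lam_hat(z):                    # superior region: H(z) <= lambda
    return 1 if np.sum((z - 1.0) ** 2) <= 1.0 else 0

def target(z):                       # g_opt_hat(z) proportional to I_F_hat * I_lam_hat (uniform f)
    return I_F_hat(z) * I_lam_hat(z)

def mh_offspring(z0, n, step=0.3, thin=10):
    """Random-walk Metropolis with thinning; z0 must satisfy target(z0) == 1."""
    out, z = [], np.asarray(z0, float)
    while len(out) < n:
        for _ in range(thin):        # thinning reduces autocorrelation between kept samples
            prop = z + step * rng.normal(size=z.shape)
            # For a 0/1 target and a symmetric proposal, the acceptance
            # probability is 1 inside the region and 0 outside it.
            if target(prop) == 1:
                z = prop
        out.append(z.copy())
    return np.array(out)

offspring = mh_offspring(z0=[1.0, 1.0], n=50)
```

Every kept sample lies in the intersection of the feasible and superior regions, i.e., every offspring is a feasible individual at least as fit as the threshold λ.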

Algorithm 4 The general process of the proposed optimization approach
1. Let z*_l be the initial solution of the decision variables, and set l = 0.
2. Set a value for p_0.
3. Produce the first-generation population S^(l) = {z_1^(l), ..., z_N^(l)}.
…
Evaluate whether the individuals in S^(l) satisfy the given constraints.
6. Sift out the feasible individuals S_f from S^(l).
…
Update the current optimal solution as z*_{l+1} = argmin_{z ∈ S_f} H(z).
10. Construct the classifier Îλ(z) via SVM:
10.1. Rank the objective function values of S_f in descending order.
…
Output the optimal decision scheme z*_opt = z*_{l+1}.

Example 2. Consider the case study in Example 1. Here, we set k = 1 and n = 2; the Lin/Con/k/n:F system is then reduced to a series system with two subsystems, as shown in Figure 8. Each subsystem j ∈ {1, 2} involves one active component and z_j − 1 cold-standby redundant components.
The mathematical model of the optimization problem of this example under the budget constraint is:

max: R_s(z), s.t. G(z) ≤ 0,

where the redundancy level z = {z_1, z_2} is the decision vector and C = 27 is the budget threshold in the cost constraint G(z). Since the system is a series connection of the two subsystems, the system reliability R_s(z) reduces to the product of the two subsystem reliabilities. Suppose that the component reliabilities for subsystem 1 and subsystem 2 are 0.93 and 0.92, respectively. The reliability of each switch is 0.9998. In addition, β_1 = 5 × 10^−5, β_2 = 4 × 10^−5, and t = 1400 (h).
For comparison, we first utilize GA to explore the optimal decision scheme of this optimization problem. The obtained optimal decision scheme is z*_opt = {3, 3}, and the corresponding system reliability is R_s(z) = 0.971999. The value of the constraint function is G(z) = −3.1665, which indicates that the constraint is satisfied. Then, we implement the proposed approach to address this optimization problem. The initial solution is chosen as z*_0 = {2, 2}. After two iterations, the optimal solution obtained by the proposed approach is also z*_opt = {3, 3}, consistent with that obtained by GA. To showcase the quality of the alternative model constructed by SVM, the feasible/infeasible candidates distinguished by SVM and by the actual constraint function are shown in Figures 9a and 9b, respectively. It is seen that the constructed alternative model is sufficiently accurate to separate feasible individuals from infeasible individuals.

Discussion
(1) The quasi-optimal IS PDF embedded in the proposed approach facilitates producing high-quality offspring. Different from other reproduction algorithms, it is a deterministic rather than a stochastic strategy. Most importantly, it does not need a series of manually determined parameters, which is critical to the robustness of the algorithm. In addition, this distribution does not give a larger weight to any particular individual, but assigns weight according to the contribution of each feasible individual. This ensures the diversity of offspring and avoids the degeneration of offspring or the emergence of super individuals (local optima). (2) The classifiers constructed by SVM avoid repeated invocation of the objective and constraint functions during the process of producing offspring. For complex systems, this is of great significance in mitigating the computational burden; we have to point out, however, that if the actual boundary is highly nonlinear, the alternative boundary constructed by SVM may deviate from the actual one. In addition, standardizing the training data helps improve the quality of the constructed classifier. (3) There is no restriction on the objective function or constraint functions. We merely utilize the objective function to measure the fitness of feasible individuals, and use the constraint functions to judge the feasibility of individuals.

We compare the results under different choices of p_0 for the proposed approach, crossover fractions c_f for GA, and population sizes N. From the listed results, we can draw the following conclusions. (1) Under the same population size N, the system reliability corresponding to the optimal solution obtained by the proposed algorithm tends to be higher than that obtained by GA, because the system reliability curve obtained by the proposed approach lies almost entirely above that obtained by GA, except in several special cases. Meanwhile, the CPU time consumed by the proposed algorithm is longer than that consumed by GA; that is, the efficiency of the proposed algorithm is slightly lower than that of GA. (2) For this example, in general, the selection of p_0 has little impact on the system reliability obtained by the proposed algorithm, except for the case with N = 100 (there is a sudden drop when p_0 = 0.9). In contrast, the choice of p_0 has an obvious impact on the efficiency of the proposed algorithm, because the curve related to the CPU time fluctuates greatly. (3) The choice of the crossover fraction c_f largely influences the accuracy of GA, because the system reliability curve obtained by GA obviously fluctuates greatly with c_f. In addition, the crossover fraction c_f does not seem to have much effect on the efficiency of GA. (4) As the population size N increases, the CPU time required by the proposed algorithm or GA increases gradually; of course, this is a predictable result.
Table 1. Optimal solutions of the Lin/Con/2/10:F system.

Figure 10f demonstrates the best results obtained by GA and the proposed approach under different population sizes. We can also observe that the population size N has almost no effect on the final solution obtained by the proposed approach, but has a large effect on the solution obtained by GA. The best optimal solutions obtained by GA and the proposed approach are listed in Table 1. From Table 1, we can see that the final system reliability obtained by the proposed approach is larger than that obtained by GA. Moreover, the final decision schemes obtained by the two approaches both satisfy the given constraints, because the values of the two constraint functions are both smaller than 0. However, the proposed approach needs a longer time to explore the optimal solution. That is, the proposed approach tends to obtain a more reliable system, but sacrifices CPU time. For this example, the parameter settings of the proposed approach have almost no effect on the final system reliability, but somewhat affect the computational efficiency.

Lin/Con/3/50:F System
In this section, we consider a Lin/Con/3/50:F system, i.e., n = 50 and k = 3. This system fails if three consecutive subsystems fail. The design task is to find the optimal reliability choice for each subsystem and its corresponding redundancy level; thus, it is a 100-dimensional problem. As for the parameters involved in the constraint functions, we set C = 370, V = 170, c = 0.5 × 1_{1×n}, and ν = 1.2 × 1_{1×n}. The parameters related to the exponential time-to-failure assumption are also prescribed. The reliability of the switch of each subsystem is 0.9998. The time interval we investigate is [0, 1200]. The decision variables satisfy 0.9 ≤ z_i ≤ 0.94 (i = 1, 2, ..., n) and z_i ∈ {2, 3, 4} (i = n + 1, n + 2, ..., 2n). Similarly, we first investigate the performance of the proposed approach under different parameter settings. Specifically, we study the performance of the proposed approach varying with the population size N or the choice of p_0. Furthermore, for comparison, we also list the results obtained by GA under different parameter settings. The comparison results are depicted in Figure 11.
From Figure 11, we can observe that, from the perspective of the final solution, the proposed approach is likely to procure more reliable systems than GA, because under different scenarios the system reliability curve obtained by the proposed approach always lies above that obtained by GA. As for the computational effort, the proposed approach needs a longer time to explore the design domain than GA.
Under the assumptions of this example, we can also conclude that the choice of p_0 has little effect on the final decision scheme, because under a fixed population size N, the system reliability curve varies only slightly. In contrast, the computational cost curve fluctuates greatly with p_0. As for the population size N, from Figure 11f we can see directly that N has a relatively large influence on the final system reliability; meanwhile, it also severely affects the computational efficiency.
The best optimal solutions obtained by GA and the proposed approach are listed in Table 2 for further comparison. It is seen that the maximum system reliabilities obtained by GA and the proposed approach are 0.994916477883922 and 0.994922886980181, respectively. This indicates that the proposed approach tends to procure more reliable systems than GA, but this is achieved by sacrificing CPU time.

Application Results
Consider the electrical power network system shown in Figure 12a. This system contains transformer substations and electric wires. The electricity starts from the supplier city and is delivered to the target city through electrical wires and transformer substations. Here, we suppose that the wires are very reliable (with reliability approaching 1), so the system reliability only depends on the reliability of the transformer substations shown in Figure 12b. Following [26], the reliability of this network system can be expressed in terms of the substation reliabilities, where a_i (i = 1, 2, ..., n) is the nonnegative real coefficient of the ith single-variable term.

The transformer substation is a 25-bar space truss structure whose material mass density is 0.1. The three coordinates of each node and the member grouping information are listed in Tables 3 and 4, respectively. The cross-sectional areas and Young's moduli of the bars in the six groups are denoted A_I-A_VI and E_I-E_VI, respectively. Four nodal forces p_1Y = p_1Z = p_2Y = p_2Z = −10^4 (N) are applied at node 1 and node 2, while the forces p_3X and p_6X on node 3 and node 6 take random values. {A_I-A_VI, E_I-E_VI, p_3X, p_6X} are the input variables following normal distributions, and the distribution parameters are listed in Table 5. The transformer substation fails if its maximum displacement exceeds 0.80 (m). Here, the displacement is an implicit function of the random inputs, obtained by the finite element model (FEM). The FEM analysis result is demonstrated in Figure 13.

To improve the system reliability, we can add redundant bars to the transformer substation and construct the optimization model accordingly, where G(z) is the cost constraint with threshold C = 33. The design variable is the redundancy level of each group of bars. Supposing that all the transformer substations are identical, the objective of this problem is equivalent to maximizing the reliability of a single transformer substation. We implement the proposed approach and GA to mine the optimal decision scheme of this system, and the obtained results are listed in Table 6. It is seen that the solution obtained by GA is {2, 2, 1, 2, 1, 2}. This implies that the redundancy level of the bars in group 3 and group 5 is one, and that of the other groups is two. The system reliability corresponding to this design is 0.9998. The constraint value is G = −15, which means that the constraint is satisfied. In addition, the CPU time consumed by GA is 42.3 (h). As for the proposed approach, the obtained optimal solution is {2, 2, 2, 2, 1, 1}, which indicates that the redundancy level of the bars in group 5 and group 6 is one, while that of the other groups is two. The system reliability corresponding to this solution is 1.0000, and G = −15 implies that the constraint is met. The running time of the proposed approach for searching for the optimal solution is 4.1 (h). Comparing the results obtained by the two approaches, we can conclude that (1) the proposed approach can obtain a more reliable system than GA, and (2) the computational cost is significantly reduced by using the proposed approach. This example fully demonstrates the merits of the proposed approach for solving complex engineering problems.

Conclusions
This paper develops a more effective population-based greedy metaheuristic algorithm to solve ORD. The proposed algorithm is inspired by the principles of IS and SVM. Specifically, it first utilizes the idea of IS to establish the optimal proposal distribution, in order to obtain better new individuals. For complex problems, to avoid repeatedly invoking the system reliability and constraint functions, the algorithm uses the classification capability of SVM to establish one hyperplane that distinguishes feasible/infeasible individuals and another that divides superior/inferior individuals. As a result, the sampling process no longer needs to evaluate the original complicated functions, relying only on the currently available information. The proposed algorithm requires few parameters to be determined manually, so it has a wide scope of application. In addition, the use of SVM makes it well suited to complex practical engineering problems. The results of the numerical examples showcase that the proposed algorithm can obtain a system with higher reliability, but requires more computation time. However, if a practical problem involves a complex finite element model (or a black box), the merit of the proposed algorithm in saving calculation cost will be considerable. Considering component dependence and degradation in ORD is a direction for future research on this topic.

Figure 3. Illustration of IS in the standard normal space.

Figure 4. A case study on the optimal IS PDF. (a) 2D problem with multiple domains of interest; (b) sampling result obtained by the IS in Figure 3; (c) sampling result obtained by the optimal IS PDF.


Figure 6. A nonlinear separating region transformed into a linear one.


10.3. Divide S_f into superior individuals S+ with objective function values smaller than λ, and inferior individuals S− with objective function values larger than λ.
10.4. Assign label "+1" to the individuals in S+ and "−1" to the individuals in S−.
10.5. Construct the classifier Îλ(z) by using {S_f, L_{S_f}}.

Figure 7. Flowchart of the proposed solution procedure.


Figure 12. A schematic view of the electrical power network system. (a) The electric power network system; (b) transformer substation.


Figure 13. Deformation distribution of the 25-bar space truss structure.

Table 3. Nodal coordinates of the truss structure.

Table 4. Group membership for the truss structure.


Table 5. Input variables for the truss structure.

Table 6. Optimal solutions for the application case.