Modeling and HDA-CR Solution of Multi-Period Allocation Scheme of Hazardous Materials under Uncertainty

Featured Application: In this study, an improved multi-objective difference evolution algorithm (HDA-CR) is proposed and applied to the formulation of a multi-period allocation scheme for hazardous materials under uncertainty, which can provide a reference for solving similar large-scale multi-dimensional assignment problems. Abstract: Developing a multi-period allocation scheme for life-limited hazardous materials is essential to ensure safe and sustainable hazardous material management. In this study, the allocation risk under uncertainty is measured by a type-II fuzzy number, and a bilevel chance constrained programming model is established with the minimum cumulative number of reserve points participating in allocation and the minimum cumulative allocation risk as to the objective functions. Aiming at the multi-dimensional characteristics of multi-period, multi-reserve points, multi-consumption points, and multi-hazardous materials types, and the resource conﬂict problem in the allocation scheme formulation process, a multi-objective hierarchical differential evolution algorithm with coding repair strategy was designed. By comparing with the classical multi-objective optimization algorithm, the algorithm can search for a more excellent Pareto solution set at the expense of certain time complexity. At the same time, when the decision-maker’s preference is introduced, the method can select a more appropriate multi-period allocation scheme from the perspective of the overall situation and the decision-maker. It provides a reference for determining the rational allocation scheme of resources under the long-term allocation of hazardous materials.


Introduction
With the continuous development of the industrial level and productivity of the society, the types and the market demand are also increasing in hazardous materials. For some special hazardous materials with storage life limitation, such as weapons and ammunition, nuclear materials, and hazardous chemicals, a certain quantity of hazardous materials with less remaining storage life needs to be regularly allocated from the reserve warehouse to the designated location for use. How to reduce the allocation risk of specific hazardous materials while ensuring the multiple demands of industry, military and energy, and how to use new solutions to deal with chemical, biological, radiological, and nuclear threats brought by hazardous materials allocation to achieve sustainable development remains to be studied [1].
In practical application, the reserve warehouse mainly determines the allocation scheme according to the demand for various types of hazardous materials in the next period and the remaining storage life of the hazardous materials to which it belongs; however, it lacks multiple periods of a hazardous materials allocation scheme. However, the differences in such factors as the geographical location and inventory of the warehouses of reserve hazardous materials, the location of the demand points, and the hazardous materials differences in such factors as the geographical location and inventory of the warehouses of reserve hazardous materials, the location of the demand points, and the hazardous materials demand of each period cause the phenomenon of unreasonable resource allocation of hazardous material in the subsequent allocation process, resulting in the phenomenon of more remaining reserve life in the hazardous materials used in the allocation and some hazardous materials being scrapped due to their service exceeding the storage time limit. Therefore, it is especially important to coordinate the hazardous materials inventory in different places and formulate sustainable multi-periods of allocation scheme in advance.
This work aims to study the method of developing the optimal sustainable multiperiod allocation scheme for a class of hazardous materials that need to be distributed regularly under uncertainty. On the premise of determining the location and transportation path of each reserve point and consumption point, we must fully consider the risk uncertainty factors in the allocation process of hazardous materials, use type-II fuzzy numbers to measure the unit allocation risk, and establish a bilevel chance constraint programming model for multi-period allocation of hazardous materials. According to the characteristics of the problem, a new hierarchical differential evolution algorithm with coding repair strategy (HDA-CR) is designed, which can provide an algorithm reference for solving multi-dimensional complex hazardous material assignment problems. Finally, the effectiveness of the algorithm is verified through comparative experiments.
The rest of this paper is stated as follows. First, Section 2 summarizes the literature research about hazardous material allocation. The description of hazardous materials allocation risk model and multi-period allocation model is shown in Section 3. Section 4 describes the proposed HDA-CR. Section 5 provides some numerical experiments to test the proposed algorithm. Finally, the work is summarized in Section 6. The overall research framework is shown in Figure 1.

Research Overview
At present, the research on hazardous material allocation mainly focuses on the path optimization and risk assessment. In the aspect of path optimization, Izdebski proposed a minimization model considering the probability of hazardous materials transport accidents, on which the ant colony and genetic algorithm are used to determine the hazardous mate-rials transport route [2]; Zografos designed a heuristic algorithm to solve the scheduling problem of hazardous materials, and defined a double-objective scheduling model with minimum risk and minimum cost [3]. Li designed a novel optimization model of a multimodal hub-and-spoke network with a detour (MHSNWD) for hazardous materials on the strategic level, considering the dual goals of risk and cost management of hazardous materials transportation [4]. However, for some specific hazardous materials, cost is not the most important factor to be considered. For example, Saat carried out the research on the safety design optimization of the generalized railway tank car for the transportation of hazardous materials from the perspective of transportation efficiency and safety [5]. Therefore, how to meet the demand for use and ensure sustainable resource allocation under the premise of safety is the most important issue to consider.
In terms of risk assessment, Liu analyzed the different factors affecting accidents in different populations and environments, and presented a risk assessment model of hazardous materials road transportation under time-varying conditions by considering the bearing capacity of the assessed area [6]. Korytárová monitored the frequency of individual risk factors in qualitative risk analysis, and used a Monte Carlo approach to test the interaction of the impact of key variables on project efficiency [7]. Park proposed a model for real-time prediction of accident consequences for dynamic risk assessment of LNG fuel supply system rooms [8]; Krejci provided a series of risk assessment methods for hazardous material transportation, such as the expert consultation method, fault-tree analysis, failure mode impact and cause analysis, and reliability assessment [9]. Verter presented a methodology for assessment of the hazardous materials transport risk in a multicommodity, multiple origin-destination setting, which was integrated with a Geographical Information System (GIS) and made large-scale implementation possible [10]. Topuz used the analytic hierarchy process and fuzzy logic to integrate the environment and human health into the risk assessment of hazardous materials in order to support environmental decision-makers with quantitative and directive results [11].
Due to the lack of consistent and sufficient hazardous material transportation accident data, it is usually difficult to obtain the parameters required for accurate transportation risk measurement during actual hazardous materials allocation. Different allocation risk measurement methods will lead to different decision-making results [12]. Many studies show that the high uncertainty of allocation risk significantly affects the decision making of hazardous materials allocation [13,14]. Therefore, the allocation decisions obtained in a deterministic environment may have difficulty meeting the actual needs of mobile hazardous material risk management.
Risk modeling of hazardous material under uncertainty received extensive attention. Du assumed that the path length in the traffic network is a fuzzy variable, and proposed a fuzzy multi-objective programming model to minimize transportation risk, time, and fuel consumption, and build a chance constraint model based on the fuzzy credibility theory [15]. Zero used fuzzy random theory to balance the lowest cost and the lowest risk, and solved the problem of gas station positioning [16]. Yan established a transport vehicle scheduling model considering random interference and travel time [17]. Ljubojevic used an adaptive neuro-fuzzy inference system to investigate the cost and risk assessment of multiobjective routing of hazardous materials transport on urban road networks and tested it in a case study of oil and oil derivatives allocation in Belgrade, Serbia [18]. Berglund studied the impact of uncertainty and the robust optimization problem of hazardous material transportation [19].
The multi-period allocation scheme of hazardous material studied in this paper belongs to a kind of optimization problem involving multiple objective functions, multiple reserve points, multiple consumption points, multiple types of hazardous materials, and multiple periods. Its decision space is huge, and it is often difficult to solve the problem by enumerating. In solving this kind of optimization model, various heuristic algorithms received extensive attention [20][21][22]. For example, in research of road screening issues of hazardous materials transportation, Ma built a road screening algorithm of hazardous materials transportation based on genetic algorithms and the Levenberg Marquardt neural network (GA-LM-NN) by analyzing 15 attribute data of each road network section [23]. Bula used variable neighborhood search to solve the vehicle routing problem for hazardous materials transportation [24]. Zero proposed two algorithms to the bi-objective shortest path problem for the road transport of hazardous materials, including the first one, using a modified Gandibleux revision of Martin's algorithm and the second one based on the A* algorithm acceleration technique [25].

Problem Description
This type of hazardous materials allocation problem can be defined as a complex largescale assignment problem under uncertainty, assuming that there are n hazardous materials reserve points in a region with a total of K different hazardous material types. In each period t ∈ {1, 2, . . . , T}, all kinds of hazardous materials need to be allocated to the hazardous materials consumption points to meet the needs c tj k of each consumption point in that period, and there are m consumption points. On the basis of considering the hazardous materials inventory near the rated life of the existing reserve points, the allocation scheme X = (X 1 , X 2 , . . . , X T ) of multi-periods, multi-reserve points, multi-consumption points, and multi-hazardous material types is prepared for such hazardous materials. The model parameters are described in Table 1.

Parameters
Parameter Significance The jth consumption point y k i Initial reserve quantity of hazardous materials of type k at a i ε k ti Remaining reserve quantity of hazardous materials of type k at a i after period t c tj k Demand for hazardous materials of type k at d j in period t l ij Generalized distance between a i and d j x k tij Decision variables, quantity of hazardous materials of type k allocated from a i to d j in period t r ij Population density on the section from a i to d j pj ij Vehicle traffic accidents on the section from a i to d j s k Accident impact range of hazardous materials of type k θ R Risk threshold f Cumulative allocation risk g Cumulative quantity of reserve points performing the allocation task α u , α l degree of confidence

Problem Assumptions
Assumption 1. The inventory of hazardous material at each reserve point that is eligible for allocation with a remaining life below a specific standard will not increase during the allocation.  The demand for all types of hazardous materials at consumption points in each period is known.

Assumption 4.
Road transportation is considered as the only allocation mode of hazardous materials.

Allocation Risk Model
The allocation risk of hazardous materials can usually be measured in combination with the probability and consequence of hazardous accidents [26], which is given as follows: where, f tij is the allocation risk from a i to d j in period t, considering the transportation risk. P ij is the transportation accident probability on the road section from a i to d j , and C tij is the accident consequence on the road section from a i to d j in period t.
Referring to the risk modeling approach proposed by Errkut [27] for transporting hazardous materials, the allocation risk described above can be defined as: where P ij = pj ij × l ij is used to measure the transportation accident probability on the road section from a i to d j , which is measured by the probability of traffic accidents pj ij and the length l ij on this road section, and C tij = pd ij × K ∑ k=1 (s k × x tij k ) is used to measure the accident consequences of this allocation. The hazard range and surrounding population density are used to measure the severity of accidents when hazardous accidents occur. However, the traffic accident probability, population density, and accident impact range that determine the allocation risk are difficult to accurately measure. How to quantitatively describe these uncertain factors determines the allocation risk value at each period.

The Traffic Accident Probability
Considering the lack of data on the occurrence of accidents involving hazardous materials transport vehicles on the corresponding road sections, the accident probability for measuring the allocation risk can refer to the vehicle traffic accidents on the road sections. The historical interval frequency is used to describe the probability of traffic accidents for hazardous materials in the future allocation periods, i.e., pj ij = [pj l ij , pj u ij ], where pj l ij and pj u ij refer to the minimum and maximum frequency of traffic accidents on the road section from a i to d j in the statistical period.

Population Density
Due to the differences of mobility and regional distribution of the population, the population density on the accident road section cannot be accurately predicted. In order to simplify the analysis, the population density on the accident roadway section is equated to the average population density on this roadway section. According to the results of the latest census, the population density on the allocation road section is described by the number of intervals, i.e., pd ij = [pd l ij , pd u ij ], where pd l ij and pd u ij are the minimum and maximum average population density on the road section from a i to d j .

Accident Impact Range
Different types of hazardous materials have different levels of hazard in the event of an accident. For example, the explosion range of explosion-prone hazardous materials can be used to measure the accident impact range. If the explosion range of allocated hazardous materials is large, the hazard range of the accident is large, and the explosion range is small, the hazard range of the accident is relatively small. However, it is difficult to estimate the specific type, quantity, and impact range of the superimposed explosion of hazardous materials when the accident occurs. Therefore, fuzzy numbers can be used to measure the hazard range of the accident of allocated hazardous materials, i.e., [s l , where s l and s u are, respectively, the minimum and maximum accident impact range in all types of hazardous materials.

Allocation Risk Measures
It can be seen from the above analysis that the allocation risk of hazardous materials allocated from a i to d j in period t can be expressed as: According to the multiplication rules of interval numbers, the multiplication of any two interval numbers [a b] and [c d] yields: where min(ac, ad, bc, bd) and max(ac, ad, bc, bd) are, respectively, the minimum and maximum values of the resulting four interval boundary combinations ac, ad, bc and bd. This multiplication operation preserves the uncertainty information of the fuzzy variables by enlarging the boundary of the interval, but inevitably enlarges the measurement of the allocation risk of the road section. The type-II fuzzy variable is proposed on the basis of traditional fuzzy set theory [28]. It adds a fuzzy membership function on the basis of the type-I fuzzy variable, increases the freedom of uncertainty modeling, and can more comprehensively describe the uncertainty information of the system [29]. In order to make more effective use of interval boundary information, using trapezoidal interval type-II fuzzy variables to process the boundary combination of the eight risk intervals pj l ij pd l ij s l , pj l ij pd u ij s u , pj l ij pd u ij s l , pj l ij pd u ij s u , pj u ij pd l ij s l , pj u ij pd l ij s u , pj u ij pd u ij s l , pj u ij pd u ij s u obtained by interval multiplication, the allocation risk on the road section can be defined as follows; where type-II fuzzy numbers r ij are defined as follows: where r u ij and r l ij are type-I fuzzy variables defined on the upper membership function (r u ij1 , r u ij2 , r u ij3 , r u ij4 , ω u ij ) and lower membership function (r l ij1 , r l ij2 , r l ij3 , r l ij4 , ω l ij ), the value of each trapezoidal boundary in r ij is determined according to the size of the eight different interval boundaries obtained by Equation (5), as shown in Figure 2.

Multi-Period Hazardous Materials Allocation Model
In order to ensure the reliability of the completion of allocation tasks in each period and reduce the probability that some reserve points cannot meet the allocation requirements due to their own reasons, the model takes the minimum number of reserve points performing the allocation task in each period and the minimum cumulative allocation risk as the double-objective functions to obtain the following model:

Multi-Period Hazardous Materials Allocation Model
In order to ensure the reliability of the completion of allocation tasks in each period and reduce the probability that some reserve points cannot meet the allocation requirements due to their own reasons, the model takes the minimum number of reserve points performing the allocation task in each period and the minimum cumulative allocation risk as the double-objective functions to obtain the following model: where Equation (7) represents that the cumulative number of reserve points providing hazardous materials at each period is the least; Equation (8) represents that the cumulative allocation risk is the minimum; Equation (9) means that the quantity of hazardous materials allocated at each period shall meet the demand quantity of each type of hazardous material at each consumption point during that period; Equation (10) represents that the quantity of each type of hazardous material allocated by the reserve point at each period shall not exceed the remaining quantity of this type of hazardous material at the reserve point during that period; Equation (11) represents the risk constraint, that is, the allocation risk from the reserve point to the consumption point shall not exceed the specified risk threshold; and Equation (12) represents the value limit on the quantity of each type of hazardous material allocated, and only natural numbers can be used.

Bilevel Chance Constrained Programming Model
According to the fuzzy chance constrained programming model [30], given confidence levels α u and α l , the above type-II fuzzy double-objective optimization model can be transformed into a bilevel chance constrained programming model.
According to the upper level membership function (r u ij1 , r u ij2 , r u ij3 , r u ij4 , ω u ij ) of the allocation risk, the upper level chance constrained programming model can be obtained as follows: Similarly, according to the lower level membership function (r u ij1 , r u ij2 , r u ij3 , r u ij4 , ω u ij ), the lower level chance constrained programming model can be obtained as follows: Appl. Sci. 2022, 12, 11970 8 of 22 Next, according to the confidence equivalent formula of the interval type-II fuzzy model [31], the above opportunity constraints can be converted as follows: where, Cr{A} represents the degree of confidence of event A, h ij u and h ij l are defined as follows.
Combining Equations (13)- (17), the above bilevel chance constrained programming model can be transformed as follows with the weighted method.

Solution Method
The differential Evolution (DE) algorithm is a global merit-seeking method based on the idea of difference to simulate the survival of the fittest in nature, mainly through mutation, crossover, and selection operations to complete the continuous evolution of the population. Since the hazardous materials allocation studied in this paper only considers the integer batch allocation, which is a kind of large-scale integer programming problem, it is appropriate to use the natural number coding method in the structure of the solution, and the DE algorithm can often find a more excellent solution in solving this kind of problem [32].
The multi-objective differential evolution (MODE) algorithm is the application of the DE algorithm in solving multi-objective optimization problems, and many scholars improved the differential evolution algorithm to different degrees for specific problems [33][34][35]. Considering the multi-dimensional characteristics of the multi-period allocation problem of hazardous materials, which contains multi-periods, multi-reserve points, multiconsumption points, and multi-hazardous material types, in order to improve the search efficiency of the algorithm, the MODE algorithm is improved in this paper by redesigning the crossover operation, variation operation and selection strategy, and designing a coding repair strategy combined with the constraints.

Structure of Individuals and Initialization of Populations
The above multi-dimensional allocation problem of hazardous materials containing multi-periods, multi-reserve points, multi-consumption points, and multi-hazardous material types can be regarded as a four-dimensional integer programming problem. The structure of the solution can take the form of a three-dimensional integer vector, i.e., P= [X 1 , · · · , X t , · · · , X T ], as shown in Figure 3.

Structure of Individuals and Initialization of Populations
The above multi-dimensional allocation problem of hazardous materials containing multi-periods, multi-reserve points, multi-consumption points, and multi-hazardous material types can be regarded as a four-dimensional integer programming problem. The structure of the solution can take the form of a three-dimensional integer vector, i.e., , as shown in Figure 3. where each matrix t X represents the hazardous material allocation scheme for each pe- represents the allocation scheme for allocating each type of hazardous material from i a to j d in period t .
According to the individual structure in Figure 3, the population initialization is completed by coding all individuals in the following way: where, randi[min, max] means to randomly generate an integer between min and max , and the operation is repeated until the required number PS of individuals for the initial population is satisfied. Where each matrix X t represents the hazardous material allocation scheme for each period t. Matrix element x tij = [x tij 1 , · · · , x tij k , · · · , x tij K represents the allocation scheme for allocating each type of hazardous material from a i to d j in period t.
According to the individual structure in Figure 3, the population initialization is completed by coding all individuals in the following way: where, randi[min, max] means to randomly generate an integer between min and max, and the operation is repeated until the required number PS of individuals for the initial population is satisfied.

Coding Repair Strategy
During the actual coding operation, there will be problems of individual coding conflicting with constraints, and these coding methods are invalid, which will lead to a large number of infeasible individuals in the evolutionary process of the population, reducing the convergence speed of the population and leading to a decrease in the efficiency of the algorithm search [36]. Therefore, whenever the coding changes, the population coding needs to be checked and repaired to ensure the validity of the individuals of the population and to improve the search efficiency of the solution space. The invalid coding repair strategy is as follows: Step 1: Calculate the final remaining reserves of each type of hazardous material ε k Ti ← y k i − ∑ T t=1 x k tij at each reserve point after the individual completed all periods of allocation tasks.
Step 2: Check ε k Ti . If there exists i ∈ {1, 2, . . . , n} satisfying ε Ti k < 0, it means that the individual has a conflict of reserve quantities at the reserve point. Select the period t when a i allocates the least quantity in this type k of hazardous materials, and update x t ij k = 0 and ε k Ti . Repeat this step until ε Ti k > 0 for any reserve point a i . Step 3: Check the allocation risk of each road section in each period. If there exists t ∈ {1, 2, . . . , T} satisfying (h ij u + h ij l ) × ∑ K k=1 x tij k ≥ θ R , the allocation risk of this period on this road section is too high to accept. It is necessary to reduce the number of hazardous materials on this road section at this allocation. Randomly select a type of hazardous material k that satisfies x tij k > 0, and do x tij k ← x tij k − 1 . Repeat this step until Step 4: Check the demand satisfaction of each period. ∀t = 1, 2, . . . , T, if there exist k ∈ {1, . . . , K} and j ∈ {1, . . . , m} satisfying ∑ n i=1 x tij k < c tj k , it means that the demand for hazardous materials of type k at consumption point d j is not satisfied within that period and additional allocation is required; then, choose randomly a reserve point a i that satisfies x ti * j k < ε ti * k , and complete Similarly, if there exist k ∈ {1, . . . , K} and j ∈ {1, . . . , m} satisfying ∑ n i=1 x tij k < c tj k , it means that the quantity of hazardous materials allocated in this period is too much, which causes waste of resources; then, randomly select a reserve point a i * meeting x ti * j k > 0, and complete After each update of x ti * j k , update ε k Ti at the same time. Repeat this step until ∑ n i=1 ∑ m j=1 x tij k = c t k is established for demands at all consumption points.
Step 5: Recheck ε k Ti and allocation risks of each road section according to Step 2-4. If all conditions are met, the repair will be completed, otherwise go to the corresponding step of Step 2-4.

Hierarchical Mutation Operation
In the traditional DE algorithm, coding mutation is mainly carried out by constructing a difference vector. For example, in the DE/rand/1 algorithm [37], three different individuals are randomly selected from the initial population, and the mutation operation is performed by constructing a difference vector by the following equation to obtain new individuals: where F is the scaling factor, and F = 1 considering the value limitation of individual coding. To ensure that the quantity of hazardous materials allocated will not be negative, take the absolute value of the above difference vector.
For the multi-period allocation scheme problem of hazardous materials proposed in this paper, considering the different selection schemes of reserve points and consumption points at different periods among individuals, the allocation schemes between reserve points and consumption points of new individuals obtained from the difference vector constructed by Equation (22) tend to be more dispersed, which makes the number of reserve points involved in allocation at each period greatly increased in new individuals, and although it enriches the diversity of individuals in the population, the new individuals produced are often of low quality, which is not conducive to the evolution of the population. Therefore, in order to improve the quality of the population, we consider grouping differential mutation based on the selection sequence of the same reserve points in each period of the allocation scheme to optimize the evolutionary direction of the population.
In order to solve the problem of population diversity decline and easily fall into the local optimum brought by population grouping differential mutation. Combined with the three-dimensional integer vector coding structure of individual in Figure 3, and referring to the two-point mutation strategy in genetic algorithm [38], each individual coding in the initial population is firstly subjected to a hierarchical progressive self-mutation operation in three dimensions from period scheme layer (X 1 , . . . , X T ), reserve point scheme layer (x t1 , . . . , x tn ), and consumption point scheme layer (x ti1 , . . . , x tim ) according to certain probability.
if c > r 1 ∀x ti , (. . . , x tia 2 , . . . , x tia 1 , . . .) ← (. . . , x tia 1 , . . . , where,c = rand[0, 1], r 1 , r 2 and r 3 are the probabilities of mutation at different levels, respectively. The initial population P 0 is grouped according to the same reserve point selection sequence, and after expanding the number of individuals in the group by performing two self-mutation operations on each group element separately, individuals are selected from the same group element to construct new individuals according to Equation (22), and the operation is repeated until a new population P 1 is generated.

Hierarchical Crossover Operation
In order to enrich the diversity of population individuals and improve the search efficiency of the Pareto solution set, similarly, crossover operations can be completed for individuals from three levels: period scheme layer, reserve point scheme layer, and consumption point scheme layer according to the structure of individual in Figure 3.
Individuals p 0 and corresponding p 1 are randomly selected from the initial population P 0 and its hierarchical mutation population P 1 . After the crossover probabilities are determined, and the crossover operation can be completed by assigning one from the period scheme layer, reserve the point scheme layer and consumption point scheme layer to produce two new individuals p 2 and p 3 according to the random interval probabilities, as shown in Figure 4.  Continue to select the remaining individuals in the population, and repeat the hierarchical crossover operation in Figure 4 Where, and represent the allocation scheme of each period w h i c h are two-dimensional integer vector matrices. Continue to select the remaining individuals in the population, and repeat the hierarchical crossover operation in Figure 4 until two new populations P 2 and P 3 are generated.

Population Selection Strategy
Calculate the fitness values of all individuals in the initial population and two crossover populations P 2 and P 3 , that is, the cumulative number of reserve points participating in the allocation and the cumulative allocation risk. The concepts of Pareto non-dominated sorting and individual crowding distance in non-determined sorting genetic algorithm (NSGA) are introduced to cut the population hierarchically [39], and the PS optimal individuals are retained as the initial population P 1 0 of the next iteration according to the elite retention mechanism. The steps are as follows: Step 1: Calculate the double-objective fitness value of all individuals, determine the dominance relationship between each individual according to the non-dominance ranking, and retain the non-dominated individuals in it as the first Pareto frontier Γ 1 .
Step 2: Individuals in Γ 1 are removed from the three populations above, the dominance relationships among the remaining individuals are determined again, and the non-dominant individuals among them are retained as the second Pareto frontier Γ 2 . The process is repeated until the hierarchy of all individuals is divided.
Step 3: The optimal individuals are retained as the initial population for the next iteration according to the hierarchical relationship. The more advanced the hierarchy, the better the quality of the individual solution. If the number of individuals exceeds PS in the hierarchical retention, the last layer to be retained is cut according to the individual crowding distance, and the individuals with smaller crowding distance are removed to meet the population size PS. The population selection strategy is shown in Figure 5. Step 1: Calculate the double-objective fitness value of all individuals, determine the dominance relationship between each individual according to the non-dominance ranking, and retain the non-dominated individuals in it as the first Pareto frontier 1 Γ .
Step 2: Individuals in 1 Γ are removed from the three populations above, the dominance relationships among the remaining individuals are determined again, and the nondominant individuals among them are retained as the second Pareto frontier 2 Γ . The process is repeated until the hierarchy of all individuals is divided.
Step 3: The PS optimal individuals are retained as the initial population for the next iteration according to the hierarchical relationship. The more advanced the hierarchy, the better the quality of the individual solution. If the number of individuals exceeds PS in the hierarchical retention, the last layer to be retained is cut according to the individual crowding distance, and the individuals with smaller crowding distance are removed to meet the population size PS . The population selection strategy is shown in Figure 5. Crowding distance is generally used to describe the size of the density around individuals and the retention of individuals with larger crowding distance makes the preserved individuals less similar and maintains the diversity of the population. The crowding distance is calculated by the following equation.
where 1 x g + ,  Crowding distance is generally used to describe the size of the density around individuals and the retention of individuals with larger crowding distance makes the preserved individuals less similar and maintains the diversity of the population. The crowding distance is calculated by the following equation.
where g x+1 , g x−1 , f x+1 and f x−1 are the corresponding values taken by the individuals before and after in the Pareto frontier of the individual when sorted by the number of cumulative reserve points participating in the allocation and cumulative allocation risk, respectively; g max , g min , f max and f min are the maximum and minimum values of the number of cumulative reserve points participating in the allocation and cumulative allocation risk on the Pareto frontier of the individual, respectively.

Experimental Scene Setup
The HDA-CR proposed in Section 3 is an improvement on the coding repair and non-determined sorting-based differential evolution (ERNS-DE) algorithm proposed in literature [40], which aims to solve the problem of multi-dimensional allocation problem of hazardous materials. The parameters of HDA-CR are set as shown in Table 2. Considering the lack of relevant hazardous materials transfer risk data, in order to verify the effectiveness of the model and algorithm as much as possible, this example contains two different scale experimental environment settings, two hazardous materials supply and demand scenarios, and two periods settings, including long-term allocation and short-term allocation, as well as a total of eight difference experimental scenarios, comparing HDA-CR with the classical multi-objective algorithm and the ERNS-DE algorithm in different scenarios. Environment 1: Consider the smaller area hazardous material allocation problem, four hazardous material types, seven reserve points, and three consuming points, and the locations of the reserve points and consuming points are randomly generated according to the generalized distance, as shown in Figure 6a. The numbers in Figure 6 represent the numbering of consumption points and reserve points respectively. Appl

Experimental Scene Setup
The HDA-CR proposed in Section 3 is an improvement on the coding repair and nondetermined sorting-based differential evolution (ERNS-DE) algorithm proposed in literature [40], which aims to solve the problem of multi-dimensional allocation problem of hazardous materials. The parameters of HDA-CR are set as shown in Table 2. Considering the lack of relevant hazardous materials transfer risk data, in order to verify the effectiveness of the model and algorithm as much as possible, this example contains two different scale experimental environment settings, two hazardous materials supply and demand scenarios, and two periods settings, including long-term allocation and short-term allocation, as well as a total of eight difference experimental scenarios, comparing HDA-CR with the classical multi-objective algorithm and the ERNS-DE algorithm in different scenarios. Environment 1: Consider the smaller area hazardous material allocation problem, four hazardous material types, seven reserve points, and three consuming points, and the locations of the reserve points and consuming points are randomly generated according to the generalized distance, as shown in Figure 6a. The numbers in Figure 6 represent the numbering of consumption points and reserve points respectively. Case 1: Consider hazardous materials allocation problem under resource-rich. The sum of the quantities of hazardous materials available at all reserve points is far greater than the sum of the demands at all consumption points in all periods.
Case 2: Consider the hazardous materials allocation problem under resource constraints. The sum of the quantities of hazardous materials available at all reserve points is exactly equal to the sum of the demands of all consumption points in all periods.
Period parameter setting: Consider the short-term allocation problem of hazardous materials, and let the number of periods T = 5; consider the long-term allocation problem of hazardous materials, and let the number of periods T = 24. The specific experimental scenarios are set up as shown in Table 3. In this experiment, the traffic accident probabilities pj l ij and pj u ij , population density pd l ij and pd u ij , the influence range s l and s u of hazardous materials, and the membership values , respectively. Additionally, θ R = 100 is adopted as the allocation risk threshold.

Algorithm Comparison Analysis
In order to test the performance of HDA-CR proposed in this paper and solve the multiperiod allocation model of hazardous materials, HDA-CR is compared with the traditional MODE, the single period multi-dimensional allocation problem solving algorithm (ERNS-DE) in literature [40], as well as other classical multi-objective algorithms, such as NSGA-II [41] and multi-objective particle swarm optimization (MOPSO) [42], by combining the double-objective optimization model in Equation (18) and setting the confidence level α u = α l = 0.05. Among them, the basic parameters of MODE and ERNS-DE are consistent with the corresponding parameter settings in Table 2. NSGA-II and MOPSO use the corresponding parameters in literature [41] and literature [42], respectively. Figure 7a-h shows the comparison of the Pareto optimal solution evolution curves of five algorithms for solving the multi-objective model of the multi-period allocation of hazardous materials (Equation (18)) under different scenarios. It can be seen from the evolution curves of the Pareto optimal solution under eight scenarios that HDA-CR converges faster than other algorithms, and the size of Pareto solution space searched is larger than other algorithms. For example, as shown in Figure 7a, the evolution curve of the HDA-CR converges after about 120 iterations and can search 12 Pareto optimal solutions in Scenario 1, while other algorithms converge relatively slowly and the number of Pareto optimal solutions searched does not exceed 9. It shows that HDA-CR can effectively improve the diversity of the population in the experimental scenario and accelerate the evolution efficiency of the population compared with other algorithms. In addition, it can be seen from the comparison results of (a-d) and (e-h) in Figure 7 that with the increase in the number of periods, the above five algorithms need more iterations to converge, but the advantages of HAD-CR still exist, and a greater difference in the number of Pareto optimal solutions is found. This indicates that HDA-CR has a better solution effect in the long-term allocation of hazardous materials. For example, Figure 7a,e is in the same experimental environment and resource configuration. In Figure 7e, HDA-CR can search 30 Pareto optimal solutions, and other algorithms can search no more than 23 Pareto optimal solutions. This shows that HDA-CR has more obvious search advantages when considering the number of allocated periods is 24 rather than 5. of periods, the above five algorithms need more iterations to converge, but the advantages of HAD-CR still exist, and a greater difference in the number of Pareto optimal solutions is found. This indicates that HDA-CR has a better solution effect in the long-term allocation of hazardous materials. For example, Figure 7a,e is in the same experimental environment and resource configuration. In Figure 7e, HDA-CR can search 30 Pareto optimal solutions, and other algorithms can search no more than 23 Pareto optimal solutions. This shows that HDA-CR has more obvious search advantages when considering the number of allocated periods is 24 rather than 5.  Figure 8 shows the distribution of the Pareto solution set obtained by the above five algorithms in different scenarios clearly. According to the properties of the Pareto solution, it can be seen intuitively that the Pareto optimal solution obtained by HDA-CR is superior to the other four algorithms in terms of quantity and scale in different scenarios.  Figure 8 shows the distribution of the Pareto solution set obtained by the above five algorithms in different scenarios clearly. According to the properties of the Pareto solution, it can be seen intuitively that the Pareto optimal solution obtained by HDA-CR is superior to the other four algorithms in terms of quantity and scale in different scenarios. For example, Figure 8h shows the distribution of two objective function values of the Pareto solution obtained by five algorithms in the multi-period allocation model of hazardous materials (Equation (18)) under Scenario 8. It can be seen that the distribution of Pareto solution set searched by HDA-CR is more uniform and the quality is obviously superior (the smaller the values of the two objective functions, the higher the quality of the solution).  In order to compare the quantity of Pareto optimal solutions of different algorithms more accurately, the average running time (T), two-set coverage (SC), and hypervolume (HV) of the algorithms are selected to further compare the optimization performance of each algorithm from the perspectives including computational cost, accuracy, diversity, and convergence of Pareto optimal solution, respectively.
1. Average running time (T): it indicates the average time of algorithm running, which can be used to measure the time complexity of each algorithm, and the smaller the value, the higher the computational efficiency of the algorithm. In order to compare the quantity of Pareto optimal solutions of different algorithms more accurately, the average running time (T), two-set coverage (SC), and hypervolume (HV) of the algorithms are selected to further compare the optimization performance of each algorithm from the perspectives including computational cost, accuracy, diversity, and convergence of Pareto optimal solution, respectively.
1. Average running time (T): it indicates the average time of algorithm running, which can be used to measure the time complexity of each algorithm, and the smaller the value, the higher the computational efficiency of the algorithm.
where T i is the time consumption of each independent operation and N is the number of independent runs of the algorithm. 2. Two-set coverage (SC) [43]: also known as C metric, it is used to compare the accuracy of the Pareto solutions obtained by each algorithm, and its lower value indicates the better quality of the Pareto solution set.
where the set A i and the set B i are the sets of Pareto solutions obtained by different algorithms in each operation, respectively, and a ≺ b indicates that the individual a is dominated by the individual b. For comparison, the ERNS-DE is chosen as the benchmark algorithm to compare the accuracy of Pareto solutions obtained by the HDA-CR with the other three algorithms in this paper; when calculating the two-set coverage of ERNS-DE, the HDA-CR is chosen as the benchmark algorithm to calculate and compare the two-set coverage with that of HDA-CR in a two-by-two comparison.
3. Hypervolume (HV) [44]: It is used to measure the ratio of the size width of the Pareto solution set obtained by the algorithm in the area covered by the target space, and this index can measure the convergence and diversity of the algorithm at the same time. The larger the value, the more uniform the individual distribution and better the convergence of the Pareto solution set obtained by the algorithm.
where v i is the hypervolume enclosed between the reference point and individual i. After normalizing the fitness values of the individuals, the point (1,1) is selected as the reference point. According to the above multi-objective algorithm evaluation index, the values of each index of different algorithms under different scenarios are calculated, and the results are shown in Table 4. As can be seen from the comparison of the three indicators in Table 4, the HDA-CR proposed in this paper can effectively improve the quality of the solutions in the Pareto optimal solution set searched for at the expense of certain time resources, and outperforms other algorithms in terms of solution accuracy, diversity, and convergence; at the same time, it can effectively alleviate the resource conflict problem, reducing the cost of computing in resource-constrained scenarios. As for the formulation of multi-period allocation schemes for hazardous materials, the main purpose is to determine the optimal long-term allocation scheme, that is, to ensure the quality of the Pareto solution, so that the hazardous materials at each reserve point are consumed in an orderly and reasonable manner, and the running time of the algorithm can be disregarded. Therefore, it can be considered that the HDA-CR performs better in the multi-period resource allocation of hazardous materials.

Discussion
The solution obtained by the above method is often a set of Pareto solution sets, that is, a set of generally incomparable compromise solutions, while the decision process usually needs to provide the decision-maker with the optimal solution choice. The decisionmaker's preference weight w can be introduced to select the optimal solution satisfying the decision-maker from the Pareto solution set.
where g and f r are the normalized objective function values, and different preference weights are selected according to the decision-maker's preference for different objectives to provide the decision-maker with the corresponding most appropriate multi-period allocation scheme of hazardous materials. Figure 9 shows the allocation scheme with the lowest number of reserve points participating in allocation in Sce 1. considered that the HDA-CR performs better in the multi-period resource allocation of hazardous materials.

Discussion
The solution obtained by the above method is often a set of Pareto solution sets, that is, a set of generally incomparable compromise solutions, while the decision process usually needs to provide the decision-maker with the optimal solution choice. The decisionmaker's preference weight w can be introduced to select the optimal solution satisfying the decision-maker from the Pareto solution set.
where g′ and r f ′ are the normalized objective function values, and different preference weights are selected according to the decision-maker's preference for different objectives to provide the decision-maker with the corresponding most appropriate multi-period allocation scheme of hazardous materials. Figure 9 shows the allocation scheme with the lowest number of reserve points participating in allocation in Sce 1. From the perspective of material allocation, the current research mainly focuses on the allocation of hazardous materials in a single period, where the best scheme for the allocation of hazardous materials is found from different perspectives and methods. For example, in literature [10], transportation risk, transportation time, and fuel consumption are considered to find the optimal distribution network. However, from the perspective of supply and demand, the supply and demand relationship will change dynamically in each period. It is not enough to consider only the optimal scheme setting in a single period.
In order to further compare the impact of single-period and multi-period decision making methods on hazardous material allocation scheme. The proposed method in this paper is compared with the traditional single-period decision making method under different scenarios to calculate the cumulative number of reserve point participating allocation tasks, the cumulative allocation risk, and the comprehensive objective value. The results are shown in Figure 10 and the comprehensive objective value is defined as follows: From the perspective of material allocation, the current research mainly focuses on the allocation of hazardous materials in a single period, where the best scheme for the allocation of hazardous materials is found from different perspectives and methods. For example, in literature [10], transportation risk, transportation time, and fuel consumption are considered to find the optimal distribution network. However, from the perspective of supply and demand, the supply and demand relationship will change dynamically in each period. It is not enough to consider only the optimal scheme setting in a single period.
In order to further compare the impact of single-period and multi-period decision making methods on hazardous material allocation scheme. The proposed method in this paper is compared with the traditional single-period decision making method under different scenarios to calculate the cumulative number of reserve point participating allocation tasks, the cumulative allocation risk, and the comprehensive objective value. The results are shown in Figure 10 and the comprehensive objective value is defined as follows: where f max , f min , g max and g min are the maximum and minimum values of the cumulative allocation risk and cumulative number of reserve points participating in the allocation, respectively, in the Pareto solution under the difference scenarios. The traditional decision making approach only considers the optimal target of this period when developing the allocation scheme without considering the future allocation of hazardous materials. From the comparison results in Figure 10, it can be seen clearly that the integrated consideration of multiple periods can lead to a safer and more suitable ideal allocation scheme. Moreover, in the scenario of long-term allocation and limited resources, the comparison of the two decision methods becomes more and more obvious. For example, as shown in Figure 10a, the cumulative numbers of reserve points of the two decision making methods are both 6 in Scenario 3; there is no difference between them and the difference is also small in Scenario 1. However, when resources are limited or the number of periods increases, the multi-period decision making method can obtain an allocation scheme with fewer cumulative reserve points participating in allocation, and the difference between them is larger, which can be seen from the comparison results of Scenario 2, Scenario 4, Scenario 5, Scenario 6, Scenario 7, and Scenario 8 in Figure 10a. Therefore, it is very important for the sustainable management of hazardous materials to develop a suitable multi-period allocation scheme of hazardous materials.

Conclusions
This paper solves the problem of formulating the optimal allocation scheme of hazardous materials and realizing the sustainable management of hazardous materials. Firstly, it abstractly models the realistic situation that needs to be considered for the multiperiod allocation scheme of hazardous materials under uncertainty, and gives a Bilevel chance constrained programming model for formulating a double-objective multi-period allocation scheme of hazardous materials that considers the minimum cumulative number of reserve points participating in the allocation and the minimum cumulative allocation risk. Then, we improved the differential evolution algorithm and established a hierarchical difference evolution algorithm considering multi-dimensional problems to solve the model and find the optimal allocation scheme of hazardous materials.
The experience results provide the following conclusions: (1) In solving the multiperiod allocation model, the hierarchical differential evolution algorithm converges faster than the other multi-objective evolution algorithm, and can search more Pareto solutions, which means there is more space for selecting the optimal allocation scheme. (2) The Pareto solution set obtained by the hierarchical difference algorithm is densely and evenly distributed between the cumulative allocation risk and the cumulative number of reserve points participating in the allocation, with high similarity, and better solution quality (the quality of the searched allocation scheme set is better). (3) Compared with the single-period decision making method, the method proposed in this paper can obtain a safer and more efficient hazardous material allocation scheme, which is conducive to the sustainable management of hazardous materials.
The practical implications of this study include the following: (1) Type-II fuzzy numbers are used in the hazardous materials allocation model proposed in Section 3.3 to The traditional decision making approach only considers the optimal target of this period when developing the allocation scheme without considering the future allocation of hazardous materials. From the comparison results in Figure 10, it can be seen clearly that the integrated consideration of multiple periods can lead to a safer and more suitable ideal allocation scheme. Moreover, in the scenario of long-term allocation and limited resources, the comparison of the two decision methods becomes more and more obvious. For example, as shown in Figure 10a, the cumulative numbers of reserve points of the two decision making methods are both 6 in Scenario 3; there is no difference between them and the difference is also small in Scenario 1. However, when resources are limited or the number of periods increases, the multi-period decision making method can obtain an allocation scheme with fewer cumulative reserve points participating in allocation, and the difference between them is larger, which can be seen from the comparison results of Scenario 2, Scenario 4, Scenario 5, Scenario 6, Scenario 7, and Scenario 8 in Figure 10a. Therefore, it is very important for the sustainable management of hazardous materials to develop a suitable multi-period allocation scheme of hazardous materials.

Conclusions
This paper solves the problem of formulating the optimal allocation scheme of hazardous materials and realizing the sustainable management of hazardous materials. Firstly, it abstractly models the realistic situation that needs to be considered for the multi-period allocation scheme of hazardous materials under uncertainty, and gives a Bilevel chance constrained programming model for formulating a double-objective multi-period allocation scheme of hazardous materials that considers the minimum cumulative number of reserve points participating in the allocation and the minimum cumulative allocation risk. Then, we improved the differential evolution algorithm and established a hierarchical difference evolution algorithm considering multi-dimensional problems to solve the model and find the optimal allocation scheme of hazardous materials.
The experience results provide the following conclusions: (1) In solving the multiperiod allocation model, the hierarchical differential evolution algorithm converges faster than the other multi-objective evolution algorithm, and can search more Pareto solutions, which means there is more space for selecting the optimal allocation scheme. (2) The Pareto solution set obtained by the hierarchical difference algorithm is densely and evenly distributed between the cumulative allocation risk and the cumulative number of reserve points participating in the allocation, with high similarity, and better solution quality (the quality of the searched allocation scheme set is better). (3) Compared with the singleperiod decision making method, the method proposed in this paper can obtain a safer and more efficient hazardous material allocation scheme, which is conducive to the sustainable management of hazardous materials.
The practical implications of this study include the following: (1) Type-II fuzzy numbers are used in the hazardous materials allocation model proposed in Section 3.3 to measure the transportation risk in the hazardous materials allocation process, which provides a new measurement method for the subsequent risk assessment of nuclear materials, explosives, hazardous chemicals, and other hazardous materials. (2) This study can solve the long-term allocation scheme of hazardous materials from a global perspective, and realizes that multiple storage points can allocate multiple hazardous material resources for multiple distribution points in a coordinated manner without causing resource conflicts, which provides a certain method reference for the subsequent solution to the long-term optimal allocation of large-scale resources in industrial production. (3) This paper designs a new hierarchical differential evolution algorithm, which provides an alternative global optimization algorithm for subsequent solution of multi-dimensional and multi-objective task allocation problems.
It is worth noting that the method is proposed to make a multi-period scheduling plan for hazardous materials on the basis of inherent inventory without considering the replenishment of inventory, the uncertainty of demand at each consumption point in each period, and the inherent attributes of each hazardous material, such as residual life, economic benefits, and other factors, which cannot be ignored when making a long-term allocation scheme for hazardous materials, and further research is still needed in the follow-up work.