Compilation of Load Spectrum for 5MN Metal Extruder Based on Long Short-Term Memory Network

Abstract: As an important input for fatigue analysis and life prediction, the load spectrum is widely used in various engineering fields. The extrapolation of load samples is an important step in compiling a load spectrum, so selecting an appropriate load extrapolation method is of great significance. This paper proposes a load extrapolation method based on the long short-term memory (LSTM) network, introduces the basic principle of the extrapolation method, and applies the method to a data set collected under the working state of a 5MN metal extruder. The comparison between the extrapolated load data and the actual load shows that the trend of the extrapolated load data is basically consistent with that of the original data. In addition, this method is compared with the rain flow extrapolation method based on statistical distribution. The comparison of the short-term load spectra compiled by the two extrapolation methods shows that the load spectrum extrapolation method based on the LSTM network can better realize load prediction and optimize the compilation of the load spectrum.


Introduction
Metal extrusion is one of the main methods for producing non-ferrous metal and steel materials and for forming and processing parts. It is applied in mechanical, aerospace, transportation and other manufacturing fields because of its high efficiency, low consumption and low chip generation. The extruder is the main equipment for extrusion processing, and its safe and reliable operation is the basis for producing high-quality extruded products. In the design of an extruder, analogy design and traditional design methods based on strength of materials are used, and the finite element method is used to analyze the key components in detail to ensure their static strength. However, static strength failure is not the only risk in practice: the extruder constantly bears complex random loads in service, so fatigue failure is also likely to occur. In order to better evaluate the fatigue life of the extruder, it is necessary to compile its load spectrum.
The load spectrum truly reflects the corresponding relationship between the load characteristic parameters and their occurrence frequency in the real load-time history of mechanical structure in a certain period of time. Load spectrum is an important condition for fatigue life prediction, reliability design and structural fatigue test of mechanical structures. It is often expressed in many forms, such as tables, matrices and graphics. At present, load spectrum is widely used in aerospace [1], vehicle [2], wind energy [3], railway [4] and other fields [5]. The compilation of load spectrum is mainly composed of load signal testing, load data processing, load cycle counting and load extrapolation. However, due to the limitations of testing technology, time and cost, the complete load information is difficult to obtain directly. Load extrapolation has become the key point in the compilation of load spectrum. In order to solve different problems in various fields of engineering, a variety of extrapolation methods have been developed, such as rain flow matrix extrapolation [6], peak threshold extrapolation [7], mileage and quantity extrapolation [8], and parameter extrapolation [9,10].
The earliest load spectrum extrapolation technique to be applied is the rain flow extrapolation method. This method is based on mathematical statistics and extrapolates the short-term load spectrum from its statistical characteristics. Initially, the short-term load-time history was cycle counted, the amplitude distribution was fitted to the amplitude-frequency histogram, and the load amplitude was then extrapolated. However, this approach only considers the influence of load amplitude on fatigue damage and ignores the mean load, so the final load spectrum is incomplete. Subsequently, Nagode and co-workers applied rain flow counting to random loads, used a mixed two-parameter Weibull distribution to fit the load amplitude, and applied it to forklift parts to achieve parameter extrapolation [11,12]. On the basis of one-dimensional amplitude extrapolation, Nagode extended the idea to the two-dimensional rain flow matrix, using a mixed Weibull distribution and a mixed normal distribution to fit the amplitude and mean of the load respectively. A parametric rain flow extrapolation method based on mixed distributions was then proposed from the joint probability density function of the load [13]. Because statistical distributions apply to a wide range of cases and extrapolate well, rain flow load spectrum extrapolation based on mathematical statistics has gradually attracted attention and is now widely used. Although the mixed distribution fits the load well, the parameter estimation process is relatively cumbersome, and the calculation time becomes the main problem of this method.
Because the load signal generated during the service of the extruder is only weakly periodic, a simple linear model struggles to achieve good prediction accuracy. With the development of artificial intelligence, deep learning methods have flourished and are gradually being applied to engineering problems. Under the guidance of deep learning theory, many variants of neural networks have been proposed that can learn complex nonlinear data well. At present, as one of the important branches of deep learning, the recurrent neural network (RNN) has achieved many successes in the fields of computer vision [14], speech recognition [15] and natural language processing [16]. Because the RNN model retains a memory of time series data, it is widely used in data prediction. However, the basic RNN suffers from vanishing and exploding gradients, and the LSTM network [17] was proposed as a variant of the RNN model to address this. Compared with the traditional neural network, the LSTM network can capture characteristics over a longer time series. Therefore, this paper applies the LSTM network to the 5MN metal extruder, using load data prediction to obtain higher prediction accuracy and optimize the compilation of the load spectrum.

Rain Flow Counting Method
Load cycle counting is the central part of the statistical processing of a random load-time history. The essence of counting is to study the load characteristic values and their frequencies in the random load-time history from the perspective of fatigue damage. The three cycle counting methods most commonly used in engineering [18] are the level crossing method, the peak counting method and the rain flow counting method. Frendahl and Socie [19] confirmed through extensive data analysis that the fatigue life predicted with rain flow cycle counting is the most consistent with the actual fatigue life of the mechanical structure. Counting methods are divided into single-parameter and two-parameter methods. Because the single-parameter counting method only considers one variable, the peak or valley value, it cannot accurately describe the characteristics of the load cycle. As a two-parameter counting method, the rain flow counting method has become a widely accepted method for processing random signals and performing fatigue analysis. The rain flow counting method was proposed by Endo [20] in 1968, and Rychlik [21] later gave its mathematical definition. The method directly extracts the load cycles in the load sequence and obtains two parameters for each cycle, the load mean and the load amplitude, which represent the static strength and dynamic strength of the mechanical structure respectively. It therefore yields more complete information about the loads that affect the strength of the mechanical structure, which lays the foundation for load spectrum design and fatigue life prediction. In addition, this method not only counts the trend of load changes but also relates to the fatigue characteristics of materials, so it is widely used in the field of fatigue life prediction.
Figure 1a shows the strain-time history generated under the action of an alternating load, and the corresponding stress-strain response of the mechanical structure is shown in Figure 1b. Figure 1c shows that the load cycles obtained by rain flow counting are consistent with this response. Therefore, the load cycles extracted by the rain flow counting method can represent the stress-strain response of the mechanical structure.
Appl. Sci. 2021, 11, x FOR PEER REVIEW
The obtained load-time history is placed as shown in Figure 2, so that it resembles a multilayer roof. Assuming that raindrops flow down the multilayer roof from point O, the rain flow counting rules corresponding to the stress-strain response are as follows: (1) A raindrop flowing down the roof continues to flow until it stops, provided that no roof blocks it; (2) A raindrop that starts at a load peak stops when it encounters a load peak higher than its starting peak; (3) A raindrop that starts at a load valley stops when it encounters a load valley lower than its starting valley; (4) A flowing raindrop stops when it encounters the rain stream from the roof above.
The path along which a raindrop flows from its starting point to its ending point represents a load cycle. The amplitude and mean value data sets can be calculated from the from-to values of all load cycles, and together the amplitude and mean value represent the load cycles extracted by rain flow counting. The amplitude ($S_a$) and mean ($S_m$) value of a load cycle can be expressed as:
$$S_a = \frac{\left| S_{\mathrm{from}} - S_{\mathrm{to}} \right|}{2}, \qquad S_m = \frac{S_{\mathrm{from}} + S_{\mathrm{to}}}{2}$$
where $S_{\mathrm{from}}$ and $S_{\mathrm{to}}$ denote the load values at the starting point and the ending point of the cycle.
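The counting rules and the amplitude and mean definitions can be illustrated with a short sketch. It does not follow the pagoda-roof description literally; it uses the three-point stack formulation that is commonly used in its place, and it counts unclosed ranges as half cycles. The function names and the sample history are illustrative assumptions, not part of the measured data.

```python
def turning_points(series):
    """Reduce a load history to its alternating peaks and valleys."""
    pts = []
    for x in series:
        if pts and x == pts[-1]:
            continue  # ignore repeated values
        if len(pts) >= 2 and (pts[-1] - pts[-2]) * (x - pts[-1]) > 0:
            pts[-1] = x  # same direction as before: extend the current run
        else:
            pts.append(x)
    return pts


def rainflow(series):
    """Return (amplitude, mean, count) tuples; count is 1.0 or 0.5."""
    cycles = []
    stack = []
    for point in turning_points(series):
        stack.append(point)
        while len(stack) >= 3:
            x = abs(stack[-1] - stack[-2])  # most recent range
            y = abs(stack[-2] - stack[-3])  # previous range
            if x < y:
                break
            mean = (stack[-2] + stack[-3]) / 2
            if len(stack) == 3:
                # the previous range contains the start point: half cycle
                cycles.append((y / 2, mean, 0.5))
                stack.pop(0)
            else:
                # the previous range is closed by the new one: full cycle
                cycles.append((y / 2, mean, 1.0))
                last = stack.pop()
                stack.pop()
                stack.pop()
                stack.append(last)
    # whatever remains on the stack is counted as half cycles
    for a, b in zip(stack, stack[1:]):
        cycles.append((abs(a - b) / 2, (a + b) / 2, 0.5))
    return cycles


if __name__ == "__main__":
    # hypothetical load history, for illustration only
    for amplitude, mean, count in rainflow([-2, 1, -3, 5, -1, 3, -4, 4, -2]):
        print(amplitude, mean, count)
```

Each tuple carries the two parameters used in the rest of the paper, so the amplitude and mean data sets are obtained by collecting the first and second entries.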

LSTM
LSTM can predict future values by extracting the salient characteristics of the collected data set, and it has become one of the most feasible tools for data prediction tasks. LSTM is a special type of recurrent neural network and an improvement on the basic RNN. Through its deliberately designed network structure, it can retain more long-term information, which solves the problem of model failure caused by gradient vanishing and gradient explosion in the traditional RNN algorithm. The key to LSTM is the cell state, which is similar to a conveyor belt: information flows along it with little interference from other information, which gives the network long-term memory and excellent generalization ability.
The core of the design of LSTM is the addition of a structure called a gate, which is a method of selecting information. The structure of the LSTM algorithm is shown in Figure 3. LSTM has a total of three gates to control the addition or deletion of the content of cells. The first gate is the forget gate, which reads the output of the previous unit ($h_{t-1}$) and the input information at the current moment ($X_t$), and then decides whether to transmit or discard the information from the previous moment.

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, X_t] + b_f\right)$$
Here, σ is the logistic sigmoid function, which outputs values in the range from 0 to 1, $f_t$ is the forget gate, $W_f$ is the weight of the forget gate and $b_f$ is the bias of the forget gate.
The second gate is the input gate, and this structure is divided into two parts. The first part is the sigmoid layer, which determines the content that needs to be updated in the old cell. In the second part, a new candidate value is generated in the tanh layer, and the sigmoid layer and the tanh layer are used together to update the cell state.
$$i_t = \sigma\left(W_i \cdot [h_{t-1}, X_t] + b_i\right)$$
$$\tilde{C}_t = \tanh\left(W_c \cdot [h_{t-1}, X_t] + b_c\right)$$
where $W_i$ and $W_c$ represent the corresponding weights, $b_i$ and $b_c$ represent the corresponding biases, and tanh is the hyperbolic tangent activation function, whose output range is −1 to 1.
The results of the first two steps are combined to update the state of the cell and obtain a new cell state ($C_t$).
$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$$
The third gate is the output gate, which determines the output value based on the cell state. The sigmoid layer of the output gate first determines which part of the cell state is to be output, and the cell state is then passed through the tanh layer and multiplied by the output of the sigmoid layer to obtain the output value.
$$o_t = \sigma\left(W_o \cdot [h_{t-1}, X_t] + b_o\right)$$
$$h_t = o_t * \tanh\left(C_t\right)$$
where $W_o$ and $b_o$ respectively represent the weight and bias of the output gate.
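The three gates can be traced through one time step with a minimal sketch. It uses the standard LSTM cell equations in plain Python lists and is not the network trained in this paper; the parameter layout (a dictionary keyed by "f", "i", "c" and "o") and the numerical values are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, v, b):
    """Compute W.v + b for a matrix W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias
            for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; params maps 'f', 'i', 'c', 'o' to (W, b)."""
    z = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, X_t]
    W_f, b_f = params["f"]
    W_i, b_i = params["i"]
    W_c, b_c = params["c"]
    W_o, b_o = params["o"]
    f_t = [sigmoid(a) for a in affine(W_f, z, b_f)]      # forget gate
    i_t = [sigmoid(a) for a in affine(W_i, z, b_i)]      # input gate
    g_t = [math.tanh(a) for a in affine(W_c, z, b_c)]    # candidate state
    o_t = [sigmoid(a) for a in affine(W_o, z, b_o)]      # output gate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


if __name__ == "__main__":
    # hypothetical cell with one hidden unit and one input feature
    weights = {
        "f": ([[0.5, -0.2]], [1.0]),
        "i": ([[0.1, 0.4]], [0.0]),
        "c": ([[0.3, 0.8]], [0.0]),
        "o": ([[-0.6, 0.2]], [0.0]),
    }
    h, c = [0.0], [0.0]
    for load in [0.2, 0.9, 0.4]:  # illustrative normalized load samples
        h, c = lstm_step([load], h, c, weights)
    print(h, c)
```

Because the forget gate multiplies the previous cell state directly, a gate value close to 1 carries that state forward almost unchanged, which is the conveyor-belt behaviour described above.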

Selection of Measuring Point
The frame is the main load-bearing component of the 5MN metal extruder and is mainly composed of the front beam, back beam, tie rods, compression sleeves and nuts. The front beam undergoes relatively large bending deformation when subjected to preload and working load. The finite element analysis results for the outer and inner sides of the front beam are shown in Figure 4a,b respectively. Because the stress on the outer side is greater, the outer side is chosen as the stress measurement position. The critical regions on this surface are mainly the front beam outlet and the middle of the four edges. Owing to the symmetry of the front beam structure, only some of the critical regions with greater stress are selected as stress measurement points. The distribution of the measurement points is shown in Figure 5.

Data Description
The stress of the front beam of the extruder was measured over a certain period of time during the production of titanium electrodes on the 5MN metal extruder. In this experiment, the DH3820 stress-strain testing system shown in Figure 6 is used for testing. The system can filter out high-frequency noise, and the filter cut-off frequency is 1/2.56 of the sampling frequency. Based on previous experience in testing extruder stress, the sampling frequency of the system is set to 10 Hz for data acquisition. Because the principal stress direction of the front beam of the frame is uncertain, the stress state at a point must be determined by three independent quantities. Therefore, four 45° three-directional strain gauge rosettes are installed at the selected measurement points of the front beam, and the extrusion load data are collected for a period of time while the extruder is in use. It can be seen from Figure 7 that the stress at the Front Beam 1# measuring point of the extruder is the largest (fatigue failure is most likely to occur there), so the stress data of the 1# measuring point are used as the basic data for subsequent analysis and processing in this paper. After data preprocessing, including the removal of abnormal peaks and valleys and load signal filtering, about 30 groups of extrusion cycle data are finally obtained. The original data are divided into two parts: the first 25 groups of extrusion cycle data are used as the training set, and the rest are used as test data.
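To illustrate why three independent readings are needed, the sketch below converts the three strains of a 0°/45°/90° rosette into principal stresses using the textbook plane-stress relations. It is not the DH3820 processing chain; the elastic modulus, Poisson's ratio and strain readings are hypothetical values chosen only for the example.

```python
import math


def principal_strains(eps_0, eps_45, eps_90):
    """Principal strains from a rectangular (0/45/90 degree) rosette."""
    centre = (eps_0 + eps_90) / 2
    radius = math.sqrt((eps_0 - eps_45) ** 2 + (eps_45 - eps_90) ** 2) / math.sqrt(2)
    return centre + radius, centre - radius


def principal_stresses(eps_0, eps_45, eps_90, young, poisson):
    """Principal stresses under plane stress (Hooke's law)."""
    e1, e2 = principal_strains(eps_0, eps_45, eps_90)
    factor = young / (1 - poisson ** 2)
    return factor * (e1 + poisson * e2), factor * (e2 + poisson * e1)


if __name__ == "__main__":
    # hypothetical material constants (MPa, dimensionless) and strain readings
    E, NU = 200e3, 0.3
    s1, s2 = principal_stresses(400e-6, 150e-6, -100e-6, E, NU)
    print(f"sigma_1 = {s1:.1f} MPa, sigma_2 = {s2:.1f} MPa")
```

With only one or two gauges the radius term could not be evaluated, so the principal values would remain undetermined whenever the principal direction is unknown.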

Evaluation for Forecast Result
Three indicators are commonly used to evaluate the performance of load forecasting models: root mean square error (RMSE), mean square error (MSE), and mean absolute error (MAE). They are defined as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2}$$
$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$
$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|y_i - \hat{y}_i\right|$$
Here $y_i$ is the actual value of the i-th sample, $\hat{y}_i$ is the predicted value of the i-th sample, and N is the total number of sample data. These loss functions give a good evaluation of model prediction performance and can guide further algorithm optimization.
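The three indicators can be computed directly from their definitions, as in the following minimal sketch. The function name and the sample values are illustrative and are not taken from the measured data.

```python
import math


def forecast_errors(actual, predicted):
    """Return (RMSE, MSE, MAE) for two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    residuals = [y - y_hat for y, y_hat in zip(actual, predicted)]
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    return math.sqrt(mse), mse, mae


if __name__ == "__main__":
    # hypothetical actual and predicted load values
    rmse, mse, mae = forecast_errors([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
    print(rmse, mse, mae)
```

Since RMSE is the square root of MSE, the two always rank models identically; MAE is less sensitive to a few large errors, which is why it is reported alongside them.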

Experiment Result
To show how the LSTM model improves on the RNN model from which it is derived, both an LSTM network and an RNN are built and compared in this paper. The experiment was implemented on a computer equipped with an Intel i7 3.4 GHz CPU, 16 GB of memory and an NVIDIA GTX 1060 GPU. Both algorithms were trained 100 times under the same experimental conditions. In our experiment, the 25 sets of extrusion cycle data collected at the 1# measuring point were used to train the models. The prediction results and the original load data for the five sets of extrusion cycles in the test set are shown in Figure 8, which compares the two models under the same experimental environment and number of training runs. Owing to the problems of gradient vanishing and gradient explosion, the unmodified RNN algorithm cannot meet the prediction requirements in the burst stage of the data, although it roughly follows the rising and falling trend. The load predicted by the LSTM algorithm has extrusion cycle characteristics similar to those of the actual extrusion load, and the predicted results are closer to the actual data, which reflects the strong memory and learning ability of the LSTM network for time series.
Comparing the loss function values of the two models on the test set, the prediction error of the LSTM network is closer to zero than that of the RNN model. The higher prediction accuracy further demonstrates the prediction performance of the LSTM network, so the LSTM model can better adapt to random load prediction and meet the needs of load spectrum extrapolation.

Classification of Load Spectrum
The load amplitude and mean data sets for the 25 extrusion cycles collected at the 1# measuring point during the extrusion process of the 5MN metal extruder are obtained by the rain flow counting method. However, it is difficult to analyze and process the continuous distribution of load amplitude and mean data. Therefore, loads that differ only slightly are placed in one group, and a single representative value for the group replaces each load value in it to achieve classification. The continuous cumulative load cycle frequency curve is thereby transformed into a stepped cumulative frequency curve, which is generally divided into four, eight, sixteen or thirty-two levels. The German engineer Gassner proposed dividing the load spectrum amplitude into eight levels, and Conover also found that dividing the load spectrum into eight levels can accurately reflect the fatigue effect; this division has now been applied in engineering practice [22].
In this paper, the load spectrum is divided into eight levels, which can be done by either the equal interval method or the amplitude ratio coefficient method. Because the damage caused by a large amplitude load is much greater than that caused by a small amplitude load, the division density for large amplitude values should be greater than that for small amplitude values. The amplitude ratio coefficients are set as 1, 0.95, 0.85, 0.725, 0.575, 0.425, 0.275 and 0.125, and the mean ratio coefficients are set as 1, 0.875, 0.75, 0.625, 0.5, 0.375, 0.25 and 0.125. The eight-level load spectrum is shown in Table 2.
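The eight-level division can be sketched as follows. The ratio coefficients are those given above; the rule that each cycle is assigned to the smallest level value not below its amplitude is an assumption made here for illustration, as are the function names and the sample amplitudes.

```python
AMPLITUDE_RATIOS = [1, 0.95, 0.85, 0.725, 0.575, 0.425, 0.275, 0.125]
MEAN_RATIOS = [1, 0.875, 0.75, 0.625, 0.5, 0.375, 0.25, 0.125]


def level_values(max_value, ratios):
    """Representative value of each level, from largest to smallest."""
    return [r * max_value for r in ratios]


def assign_level(value, levels):
    """Index of the smallest level value that is still >= value."""
    index = 0
    for k, level in enumerate(levels):
        if value <= level:
            index = k
    return index


def level_counts(values, ratios):
    """Number of cycles falling in each of the levels."""
    levels = level_values(max(values), ratios)
    counts = [0] * len(levels)
    for v in values:
        counts[assign_level(v, levels)] += 1
    return counts


if __name__ == "__main__":
    amplitudes = [100, 96, 90, 50, 10]  # hypothetical amplitudes
    print(level_values(max(amplitudes), AMPLITUDE_RATIOS))
    print(level_counts(amplitudes, AMPLITUDE_RATIOS))
```

The amplitude coefficients are spaced more closely near 1 than the equally spaced mean coefficients, which gives the finer division at large amplitudes called for above.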

Comparison of Methods
To better evaluate the load extrapolation based on the LSTM model, we use five sets of unused extrusion cycle data to compare the load spectra obtained with the two methods.
For the rain flow extrapolation method, we extract the load amplitude and mean value from the original data. From the joint distribution function of load amplitude and mean value, the number of occurrences of the original load data in each load region can be calculated. The frequency calculation formula is:

$$N_{ij} = N_w \int_{S_{a,i}}^{S_{a,i+1}} \int_{S_{m,j}}^{S_{m,j+1}} f(S_a, S_m)\, \mathrm{d}S_m\, \mathrm{d}S_a$$

where $N_w$ is the total cumulative frequency of load obtained from the data set, $S_{a,i}$ and $S_{a,i+1}$ are the lower and upper limits of the load amplitude of each group, $S_{m,j}$ and $S_{m,j+1}$ are the lower and upper limits of the load mean of each group, $f(S_a, S_m)$ is the fitted joint probability density of load amplitude and mean, and $N_{ij}$ is the frequency of loads whose amplitude falls in interval $i$ and whose mean falls in interval $j$.
To obtain the load spectrum of the extruder for a specific number of extrusion cycles, the total cumulative frequency must be divided by the corresponding number of extrusion cycles. The expression for averaging the above load frequency to $M$ extrusion cycles is:

$$N^{*}_{ij} = \frac{N_{ij}}{M}$$

The data of 25 extrusion cycles are processed by the rain flow counting method to obtain statistical histograms of the mean and amplitude, and statistical distribution functions are then fitted to them. The fitting results of load amplitude and mean are shown in Figure 9. The parameter estimation results show that the correlation coefficient for the amplitude following the Weibull distribution is 0.92 and the correlation coefficient for the mean following the normal distribution is 0.97, which supports the fitted distributions of the load amplitude and the mean. The short-term load spectrum compiled from the frequencies calculated with the joint distribution function and averaged to five extrusion cycles is shown in Table 3.
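The frequency calculation can be sketched as follows, under stated assumptions. The amplitude is taken to follow a two-parameter Weibull distribution and the mean a normal distribution, as in the fit above, and the two are assumed independent so that the joint probability of a region is the product of the two marginal probabilities. The distribution parameters in the example are hypothetical, since the fitted values are not given in this section, and `scale_to_cycles` reflects one reading of the averaging step (rescaling the frequency from the recorded cycles to a target number of cycles).

```python
import math


def weibull_cdf(x, shape, scale):
    """Two-parameter Weibull CDF; zero for non-positive x."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))


def normal_cdf(x, mu, sigma):
    """Normal CDF expressed through the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def interval_frequency(n_w, amp_lo, amp_hi, mean_lo, mean_hi,
                       shape, scale, mu, sigma):
    """Frequency of loads in one amplitude/mean region.

    Assumption: amplitude and mean are independent, so the joint
    probability of the region is the product of the marginals.
    """
    p_amp = weibull_cdf(amp_hi, shape, scale) - weibull_cdf(amp_lo, shape, scale)
    p_mean = normal_cdf(mean_hi, mu, sigma) - normal_cdf(mean_lo, mu, sigma)
    return n_w * p_amp * p_mean


def scale_to_cycles(n_ij, recorded_cycles, target_cycles):
    """Rescale a frequency from the recorded cycles to the target cycles."""
    return n_ij * target_cycles / recorded_cycles


# Hypothetical parameters and limits, for illustration only.
n_ij = interval_frequency(n_w=1000, amp_lo=0.5, amp_hi=1.0,
                          mean_lo=-1.0, mean_hi=1.0,
                          shape=2.0, scale=1.0, mu=0.0, sigma=1.0)
n_short = scale_to_cycles(n_ij, recorded_cycles=25, target_cycles=5)
```

With `target_cycles=1` the rescaling reduces to dividing by the number of recorded extrusion cycles.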

The frequency distributions of the eight-level load spectra drawn from the three types of data are shown in Figure 10a-c. The short-term load spectra compiled with the rain flow extrapolation method and with the LSTM model are compared with the actual load spectrum. For the frequency distribution of load amplitude and mean value at all levels, the results predicted by the LSTM model are more consistent with the actual spectrum than those of the rain flow extrapolation method. This is because the load spectrum compiled by the rain flow extrapolation method is only a statistical analysis of the original data, and no new load information is generated from the original data. Faced with the random load generated by the 5MN metal extruder, the method therefore has little ability to predict the trend of the load change.
In addition, artificially fitting the joint distribution of the load amplitude and the mean value also introduces errors and makes the extrapolated load spectrum less reliable. The LSTM network, by contrast, can retain load time series information over long spans, so the load spectrum compiled from its predicted data matches the actual load spectrum better. LSTM also captures the load frequency at each level of the load spectrum. Therefore, the LSTM method can be used to extrapolate the extruder load and compile the load spectrum in engineering practice.
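One way to put a number on how closely a compiled spectrum matches the actual one is sketched below. The comparison in this paper is made from the frequency distributions in Figure 10; the distance measure used here (total variation distance between the level-frequency distributions) is an assumption introduced for illustration, not the authors' criterion, and the function names are hypothetical.

```python
def relative_frequencies(counts):
    """Convert per-level cycle counts to relative frequencies."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one cycle")
    return [c / total for c in counts]


def spectrum_distance(predicted_counts, actual_counts):
    """Total variation distance between two level-frequency distributions.

    Returns 0 for identical distributions and 1 for distributions that
    share no level. Both inputs must use the same levels in the same order.
    """
    if len(predicted_counts) != len(actual_counts):
        raise ValueError("spectra must have the same number of levels")
    p = relative_frequencies(predicted_counts)
    q = relative_frequencies(actual_counts)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))
```

Under this measure, the method whose eight-level counts give the smaller distance to the actual counts would be judged the closer match.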
Figure 10. Frequency distribution of the eight-level load spectrum: (a) load spectrum from rain flow extrapolation data; (b) load spectrum from LSTM forecast data.

Conclusions and Discussion
Although the rain flow extrapolation method can extrapolate the load frequency, it does not complete the two-way extrapolation of load and frequency, and it does not accurately predict, and may even ignore, the extreme loads that can have a great impact on fatigue life over the whole life cycle of the 5MN metal extruder. In addition, because the test load is random, the load data measured in each test are different. For such irregular load information, describing its characteristics with a single basic distribution function inevitably introduces errors. More distributions or mixed distributions must therefore be tried in the extrapolation process. The better the fit, the more accurate the extrapolation result, but the fitting then faces serious parameter estimation and calculation problems and depends strongly on human judgment.
The LSTM model essentially belongs to the category of time domain extrapolation. As the number of products increases, working objects change frequently, and design and manufacturing technology is renewed, the load information becomes very complex. Time domain extrapolation extrapolates the tested load sequence directly, without a series of transformations, so the generated cyclic sequence is more realistic. The method reduces the error accumulated over multiple calculation steps and can generate new extreme value data, which allows the service life of the system or parts to be predicted better. Time domain extrapolation thus compensates for the uncertainty of fitting an extrapolation distribution function and avoids the subjectivity involved.
Another advantage of time domain extrapolation is that the result retains the sequence of load cycles, and the extrapolated load of any mileage can also be obtained. By contrast, the rain flow extrapolation result yields a time domain sequence only after a Markov chain model transformation is applied.
Load extrapolation is a key step in compiling the load spectrum, and the accuracy of the extrapolated data directly determines the validity of the load spectrum. This paper proposes using the LSTM method to extrapolate the load spectrum of the 5MN metal extruder, and uses the ability of LSTM to learn long-distance time series dependence to predict the trend of the load data. The model was verified with actual load data, and the short-term load spectrum compiled from the forecast data was compared with that compiled by the rain flow extrapolation method. The results show that the proposed method improves the reliability of the load spectrum and has great potential in engineering applications. In the
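The way a time domain method keeps the load sequence and can be run to any length can be sketched as a recursive forecast loop. This is a minimal illustration, assuming a one-step predictor that maps the most recent window of loads to the next load; the recursive scheme, the window length, and the function names are assumptions, and `mean_of_window` is only a stand-in for a trained LSTM, which is not implemented here.

```python
from typing import Callable, List, Sequence


def extrapolate(history: Sequence[float],
                predict_next: Callable[[Sequence[float]], float],
                window: int,
                steps: int) -> List[float]:
    """Extend a load sequence by feeding each prediction back as input.

    The output is itself an ordered time series, so the sequence of load
    cycles is retained and `steps` can be chosen freely.
    """
    if len(history) < window:
        raise ValueError("history must contain at least `window` samples")
    series = list(history)
    for _ in range(steps):
        series.append(predict_next(series[-window:]))
    return series[len(history):]


def mean_of_window(values: Sequence[float]) -> float:
    """Stand-in predictor (NOT an LSTM): mean of the input window."""
    return sum(values) / len(values)


# Hypothetical measured loads, for illustration only.
future = extrapolate([1.0, 3.0], mean_of_window, window=2, steps=2)
```

In practice `predict_next` would wrap the trained network, and the extrapolated sequence would then be passed to rain flow counting to compile the load spectrum.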
