Machine-Learning-Based Optimization of Energy Management in a Novel Hybrid Powertrain of Concrete Truck Mixers

Abstract: Converting concrete truck mixers to new-energy powertrains is of great significance for energy conservation and emission reduction. Unlike general-purpose vehicles, concrete truck mixers must account not only for driving conditions but also for the conditions of the upper mixing system, the operation scenarios, and variable loads. This study focuses on the machine-learning-based approximate optimal energy management design for a concrete truck mixer equipped with a novel extended-range powertrain from two aspects: trip information and energy management strategy. Firstly, an optimal control database is constructed using a global optimization algorithm with dimension reduction for the constrained time-varying two-point boundary value problem with two control variables, together with driving data analysis through machine learning and data-driven methods. Then, different machine-learning-based driving condition identifiers are constructed and compared. Finally, a novel neural network energy management strategy that takes the vehicle mass and the power demand of the upper mixing system into account is designed based on the constructed optimal control database. Simulation results show that the intelligent optimization algorithm based on the ML of trip information and energy management is an appropriate way to solve the online energy management problem of the concrete truck mixer equipped with the proposed novel powertrain.


Introduction
Currently, hybridization is still an effective solution for improving fuel economy [1,2], especially for medium and heavy commercial vehicles. Powertrain configuration, system modeling, and energy management strategies are the key factors in hybridization [3]. This research is committed to improving the fuel economy of concrete truck mixers (CTMs). At present, most of the CTMs on the road are conventional vehicles. The few hybrid or pure electric structures [4,5] target only the upper mixing system or CTMs with small agitation capacity, respectively. Unlike general-purpose vehicles, the concrete truck mixer is characterized by significantly variable vehicle mass during a trip, multiple operation scenarios, dual energy sources and dual energy consumption sources, and two sets of time-varying operating conditions (those of the vehicle driving system and those of the upper mixing system), which means that existing technology cannot be directly applied to CTMs. Thus, it is necessary to explore methods to improve fuel economy for such a highly coupled multi-source system.
The performance of a CTM is inseparable from its energy management strategy, which distributes the power of the energy sources between the dual energy consumption sources, namely the vehicle driving system and the upper mixing system. To achieve better vehicle control, the development of real-time energy management strategies has been a matter of great interest. The main real-time energy management strategies comprise rule-based strategies [6,7], Pontryagin's minimum principle (PMP) [8], model predictive control

Powertrain Configuration and Model
In this work, a novel hybrid powertrain structure is proposed for the concrete truck mixers, and the characteristics of the structure and system models are presented in the following.

Powertrain Specification and System Modeling
As shown in Figure 1, the novel hybrid powertrain consists of dual energy sources (the engine and the battery) and dual energy consumption sources (the upper mixing system and the vehicle driving system). The clutch between the engine and generator diversifies the powertrain modes and improves reliability, giving the powertrain an extended-range configuration for the vehicle driving system and a single-axle parallel configuration for the upper mixing system. The driving system can be propelled by the battery, by the auxiliary power unit (APU) composed of the engine, clutch, and generator, or by both, and the power demand of the upper mixing system is provided by the engine or generator individually or jointly. Therefore, the CTM studied in this research is regarded as an extended-range electric concrete truck mixer (E-RE-CTM). The proposed powertrain configuration is also applicable to other vehicles with similar characteristics.

A CTM with a 12 m³ agitator capacity carrying the proposed novel hybrid powertrain is taken as the research object, and the major parameters of the E-RE-CTM are listed in Table 1. To realize the E-RE-CTM control, the powertrain models, mainly including the engine, motor, generator, battery pack, and upper mixing system, are established; details are available in a previous study [23].

Energy Optimal Problem Formulation
For the E-RE-CTM, the core task of energy management control is to distribute power dynamically between the dual energy sources and the dual energy consumption sources within each trip, so that vehicle fuel economy improves while the energy consumption sources operate normally. According to the powertrain analysis described in Section 2.1, the system needs two control variables. The output powers of the engine and the battery, denoted by $P_{ice}$ and $P_{batt}$, respectively, are selected as the control variables. The battery SOC is chosen as the state variable. Aiming at enhancing the battery recharging ability and improving the vehicle fuel economy, the energy optimization problem of the E-RE-CTM is formulated as Equations (1)-(3) in terms of a function $L(\cdot)$. Combined with the description of Section 2.1, the objective function can be further formulated using the following quantities: $\gamma$ is a positive weighting factor, $P^{e}_{gen}(k)$ is the electric power of the generator at time $k$, $P^{e}_{mot}(k)$ is the known disturbance of the vehicle driving system at time $k$, and $P_{up}(k)$ is the power demand of the upper mixing system. $P^{e}_{mot}(k)$ and $P_{up}(k)$ are system inputs, and $SOC_{k_f}$ and $SOC^{*}_{k_f}$ are the target battery SOC and the actual battery SOC at the end of the trip, respectively. $N$ denotes the length of a trip.
$N$ can be calculated by $N = T/\Delta t$, where $\Delta t$ is the time step and $T$ is the driving cycle duration. Some other equality and inequality constraints must be satisfied during the calculation (Equation (5)), where $SOC(k_0)$ is the allowable upper limit of battery SOC and $P^{m}_{gen}(k)$ is the mechanical power of the generator at time $k$. $P^{MIN}_{batt}(k)$, $P^{MIN}_{ice}(k)$, $P^{MAX}_{batt}$, and $P^{MAX}_{ice}(k)$ denote the lower and upper limits of the battery and engine output power at time $k$, respectively, and are calculated by Equation (6). In Equation (6), the lower engine limit $P^{MIN}_{ice}(k)$ is the maximum of the component limit $P^{min}_{ice}$ and a term that depends on $P_{ice}$, $P_{up}$, $P^{e}_{mot}$, $\eta_{gen}$, and $P^{m}_{gen}$; the upper engine limit $P^{MAX}_{ice}(k)$ is the minimum of the corresponding term, the rate-limited value $P_{ice}(k) + \Delta P^{max}_{ice}$, and the component limit $P^{max}_{ice}$. Here, $P^{min}_{batt}$, $P^{min}_{ice}$, $P^{max}_{batt}$, and $P^{max}_{ice}$ are the allowable power ranges of the corresponding components, $C_{ch}$ and $C_{dis}$ are the maximum charge and discharge rates of the battery, and $\Delta P^{max}_{ice}$ represents the maximum rate of change of the diesel engine output power per unit time, chosen as 15 kW/s. Thus, the rate of change of the APU generating power is less than 15 kW/s. Synthesizing Equations (1)-(6), the energy optimization problem of the E-RE-CTM is a two-point boundary value problem in the finite time domain that is constrained, time-varying, and has two control variables.
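The stage count and the engine rate limit described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the 200 kW upper component limit used in the example, and the symmetric lower ramp bound are assumptions made for illustration, and the additional terms of Equation (6) that depend on the mixing and driving demands are omitted.

```python
def trip_length(duration_s, dt_s):
    """Number of stages N for a cycle of duration T and time step dt (N = T / dt)."""
    return int(round(duration_s / dt_s))


def engine_power_window(p_prev_kw, dt_s, p_min_kw, p_max_kw, ramp_kw_per_s=15.0):
    """Admissible engine power interval for the next step.

    Combines the component limits [p_min_kw, p_max_kw] with a ramp limit of
    ramp_kw_per_s around the previous engine power. Applying the ramp limit
    in both directions is an assumption of this sketch.
    """
    step_kw = ramp_kw_per_s * dt_s
    lower = max(p_min_kw, p_prev_kw - step_kw)
    upper = min(p_max_kw, p_prev_kw + step_kw)
    return lower, upper


def clamp_engine_power(p_request_kw, p_prev_kw, dt_s, p_min_kw, p_max_kw):
    """Project a requested engine power onto the admissible interval."""
    lower, upper = engine_power_window(p_prev_kw, dt_s, p_min_kw, p_max_kw)
    return min(max(p_request_kw, lower), upper)


if __name__ == "__main__":
    # Hypothetical numbers: 1 s step, engine previously at 50 kW, limits 0..200 kW.
    print(trip_length(3600.0, 1.0))
    print(engine_power_window(50.0, 1.0, 0.0, 200.0))
    print(clamp_engine_power(120.0, 50.0, 1.0, 0.0, 200.0))
```

With a 1 s step, a request far above the previous engine power is cut back to the previous power plus 15 kW, which is the behavior implied by the 15 kW/s limit.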

Global Optimal Energy Management Strategy
In this section, DP is applied to solve the energy optimization problem of the E-RE-CTM. DP requires the entire driving cycle to be known in advance [24]. Meanwhile, because DP is computed iteratively, the computational burden and memory utilization grow with the complexity of the problem and the length of the trip [25,26]. For these reasons, as shown in Figure 2, real-world CTM driving cycle data were collected, and a DP method with dimensionality reduction is presented, which is divided into the two steps described below.

Driving Data Acquisition
As shown in Figure 2a, a driving cycle of a CTM can be divided into five route segments: charging, transporting, waiting, discharging, and returning. Although the route segments of the CTM are the same every day, the driving data of each cycle differ. According to the survey and the real-world driving cycle data, the total driving mileage of the CTM is about 120 km per day. Figure 2b shows the real-world CTM driving cycle data collected over eight days in Beijing with a vehicle data recorder. The data of the first seven days are used to construct the sample library, and the data of the last day are used for testing [27].

Dynamic Programming to Solve the Optimization Problem
Step 1: To achieve lower complexity and memory utilization in the solution process, it is necessary to reduce the dimension of the control variables. In this work, by designing an optimal efficiency curve of the generator in driving mode, the two-point boundary value problem in the finite time domain with two control variables described in Section 2.2 is transformed into a single-control-variable problem. The optimal efficiency curve of the generator in driving mode is shown in Figure 2c, and the generator efficiency model of the powertrain is given by Equation (7). According to the equality constraint in Equation (5), once the generator efficiency $\eta_{gen}$ is determined, either the engine output power $P_{ice}$ or the battery output power $P_{batt}$ can be selected as the single control variable. Since the engine and the generator are coaxially connected, their speeds are the same. Therefore, the objective function can be formulated with $P_{ice}$ as the only control variable, as given in Equation (8).
Step 2: The DP method is implemented in MATLAB to minimize the total fuel consumption of the E-RE-CTM and enhance the recharging ability of the battery over the given driving conditions. The schematic diagram of the DP process is shown in Figure 2d. The driving cycle time is divided into $N$ stages. The control variable and state variable are $P_{ice}$ and the battery SOC, respectively, and $s^{i}_{k}$ denotes the $i$-th state at stage $k$. The cost-to-go is the objective function $J$ in Equation (8). For the detailed implementation of DP, refer to [24]. Notably, to further improve the calculation efficiency, the feasible control domain of stage $k$ and the feasible state domain of stage $k+1$, given by $U^{i}_{k}$ and $S^{i}_{k+1}$, respectively, are determined before the backward solution in this work.
As a global optimization algorithm, DP is quite difficult to implement online because it requires global trip information in advance and imposes a high computational burden and memory utilization [25]. However, the theoretically optimal control database it produces can serve as the basis for designing other strategies.
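The backward recursion of Step 2 can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the MATLAB implementation used in this work: the affine fuel model `fuel_rate`, the lumped power demand `p_dem_kw` (driving plus mixing demand, with all losses ignored), the 1 kWh toy battery, and the quadratic terminal SOC penalty are all hypothetical. Infeasible controls are simply skipped, which plays the role of restricting the search to the feasible control and state domains.

```python
def fuel_rate(p_ice_kw):
    """Hypothetical affine fuel model in g/s; zero when the engine is off."""
    return 0.0 if p_ice_kw <= 0.0 else 0.5 + 0.06 * p_ice_kw


def interp(grid, values, x):
    """Linear interpolation of values on a uniform grid (x lies inside the grid)."""
    pos = (x - grid[0]) / (grid[1] - grid[0])
    lo = min(int(pos), len(grid) - 2)
    w = pos - lo
    if w <= 1e-12:
        return values[lo]
    if w >= 1.0 - 1e-12:
        return values[lo + 1]
    return (1.0 - w) * values[lo] + w * values[lo + 1]


def solve_dp(p_dem_kw, soc_grid, p_ice_grid, dt_s=1.0, e_batt_kwh=1.0,
             soc_target=0.5, gamma=1.0e4):
    """Backward DP with SOC as the state and engine power as the only control.

    Returns (cost, policy): cost[i] is the cost-to-go from soc_grid[i] at
    stage 0, and policy[k][i] is the index of the best engine power.
    """
    n_stages, n_states = len(p_dem_kw), len(soc_grid)
    # Terminal cost penalizes deviation from the target SOC.
    cost = [gamma * (soc - soc_target) ** 2 for soc in soc_grid]
    policy = [[None] * n_states for _ in range(n_stages)]
    for k in range(n_stages - 1, -1, -1):
        new_cost = [float("inf")] * n_states
        for i, soc in enumerate(soc_grid):
            for j, p_ice in enumerate(p_ice_grid):
                p_batt = p_dem_kw[k] - p_ice  # battery covers the remainder
                soc_next = soc - p_batt * dt_s / 3600.0 / e_batt_kwh
                if not soc_grid[0] <= soc_next <= soc_grid[-1]:
                    continue  # outside the feasible state domain
                total = fuel_rate(p_ice) * dt_s + interp(soc_grid, cost, soc_next)
                if total < new_cost[i]:
                    new_cost[i], policy[k][i] = total, j
        cost = new_cost
    return cost, policy


if __name__ == "__main__":
    grid = [0.2 + 0.01 * i for i in range(61)]
    cost, policy = solve_dp([30.0] * 20, grid, [0.0, 20.0, 40.0, 60.0])
    print(cost[30], policy[0][30])
```

The stored `policy` table is the kind of optimal control database that the later learning stages draw on; the memory it needs grows with the number of stages and grid points, which is why the dimension reduction of Step 1 matters.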

Approximate Optimal Energy Management Design Based on Machine Learning
To bring the performance of the developed energy management strategy closer to that of DP, an effective solution is to use an ML algorithm to learn the optimal decisions of DP. Meanwhile, an ML algorithm is also needed to obtain driving condition information, which further improves the performance of the energy management strategy. The structure of the ML-based optimal power control for the E-RE-CTM is described in Figure 3.

Machine Learning of the Trip Information
In this section, typical driving conditions construction based on unsupervised learning and driving condition identifier development based on supervised learning are introduced as follows.

Typical Driving Conditions Construction Based on Unsupervised Learning
To realize the driving condition construction, the following four steps are required.
Step 1: Feature vector selection and dimension reduction. To build the driving data sample library, a sample division method based on micro-trips is adopted, and the collected real-world driving data are subdivided into $n$ ($n = 1788$) micro-trips (Figure 3a) after data supplementation and elimination. Data are supplemented mainly by interpolation, and samples containing erroneous data are eliminated mainly by checking the limits of the driving motor power and acceleration.
To describe the characteristics of the obtained micro-trips, the six characteristic parameters listed in Table 2 are retained through correlation analysis. The features matrix $V_F$ of the micro-trips collects the feature vectors of all samples, where $v = [v_{max},\ a_{n\,max},\ \sigma(a_p \cdot v),\ (a_n \cdot v),\ r^{0,15}_{v},\ r^{30,50}_{v}]$ is the feature vector of a sample, and $m$ and $n$ denote the number of features and samples, respectively.
The above six features are still correlated to different degrees, and the more features there are, the higher the computational complexity. Principal component analysis (PCA) is a dimension reduction method that projects high-dimensional interrelated variables into a low-dimensional orthogonal principal component space. The first several principal components with a large variance contribution rate are selected to describe the original variables. The principal components matrix $M_p$ is obtained from the features through the load coefficient matrix $A_{rp}$ of the principal components, where $p$ ($1 \le p \le 3$) and $r$ ($1 \le r \le m$) are the indices of the principal components and features, respectively. Figure 4a shows the variance contribution rate of each principal component. The cumulative variance contribution rate of the first three principal components is 86.4305%, which is greater than 85%. Therefore, three principal components are retained, which means that the six features of the micro-trips can be characterized by three features.
The absolute value of a feature's load coefficient is related to how strongly that feature differentiates the samples. Figure 4b shows the load coefficients of the six features in each of the three principal components. The larger the absolute value of the load coefficient of a feature, the greater the influence of the feature on the sample difference. As can be seen, the contribution level of the six features to the first principal component is the same, and the fifth and sixth features, the ratio of vehicle speed between 0 and 15 and the ratio of vehicle speed between 30 and 50, have the highest contribution to the second principal component.
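The rule used above for choosing the number of principal components (keep the smallest number whose cumulative variance contribution rate reaches 85%) can be sketched as follows. The eigenvalues in the example are hypothetical and are not the values behind Figure 4a; in practice they would be the eigenvalues of the covariance matrix of the standardized features.

```python
def variance_contribution(eigenvalues):
    """Variance contribution rate of each principal component, largest first."""
    total = sum(eigenvalues)
    return [ev / total for ev in sorted(eigenvalues, reverse=True)]


def n_components(eigenvalues, threshold=0.85):
    """Smallest number of components whose cumulative rate reaches the threshold.

    Returns (count, cumulative_rate).
    """
    cumulative = 0.0
    for count, rate in enumerate(variance_contribution(eigenvalues), start=1):
        cumulative += rate
        if cumulative >= threshold:
            return count, cumulative
    return len(eigenvalues), cumulative


if __name__ == "__main__":
    # Hypothetical eigenvalues for six standardized features (they sum to 6).
    eigenvalues = [3.0, 1.5, 0.7, 0.4, 0.3, 0.1]
    print(n_components(eigenvalues))
```

For these illustrative eigenvalues, two components explain 75% of the variance and three explain about 86.7%, so three components are retained, mirroring the selection made in this work.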
Step 2: Sample clustering and performance verification. K-means, a widely used unsupervised learning algorithm, selects its initial cluster centers randomly, so the classification results are strongly affected by this selection. Given this, k-means++, proposed by Arthur and Vassilvitskii [28], is an iterative clustering algorithm that selects the initial centroids with a probability-based method, and it can generate better-quality clusters [29]. K-means++ is utilized in this work. The elbow method is employed to determine the optimal cluster number, and three types of driving conditions are obtained, as shown in Figure 3b.
The Silhouette value, an indicator of clustering quality, is employed in this work, and the Silhouette values based on the k-means++ algorithm are presented in Figure 4c. The closer the value is to one, the better the clustering effect. Except for a few samples belonging to type 2 whose Silhouette values are less than zero, the Silhouette values of the samples are relatively large, which indicates that the k-means++ method is reasonable and feasible. The 100 micro-trips closest to the center of each type are presented in Figure 4d. It can be observed that the maximum speed of micro-trips under type 3 is the lowest and the time during which the speed is greater than zero is short, which corresponds to congested urban driving conditions. The maximum speed of micro-trips under type 2 is greater than that under the other types, and the time during which the speed is greater than zero is the longest, which corresponds to smoother urban driving conditions. The micro-trips under type 1 represent relatively smooth urban working conditions with medium speed and a high idle ratio.
To test the sensitivity of the clustering method to the data sample library, a standard driving cycle, the China bus driving cycle, divided into 14 micro-trips, is used in this section. The standard driving cycle and classification results are shown in Figure 4e. The results and the characteristics of the corresponding micro-trips are highly consistent with those described in Figure 4d, indicating that the method has low sensitivity to the driving data sample library.
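The k-means++ seeding idea described in Step 2 can be sketched in pure Python as follows. This is an illustrative toy, not the code used in this work: the points are made-up two-dimensional samples, and the elbow method and Silhouette evaluation are not included.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: each new center is drawn with probability
    proportional to its squared distance from the nearest existing center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def kmeans(points, k, iters=50, seed=0):
    """Lloyd iterations started from k-means++ seeds. Returns (labels, centers)."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


if __name__ == "__main__":
    # Hypothetical samples forming two well-separated groups.
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(kmeans(pts, k=2))
```

Because distant points receive larger sampling weights, the seeds tend to start in different groups, which is the reason the method is less sensitive to initialization than plain k-means.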
Step 3: Typical driving conditions construction based on random sampling. The typical driving conditions are the basis for designing the machine-learning-based energy management strategy of E-RE-CTMs. Considering the dynamic performance of the E-RE-CTM and the equal probability of each sample, random sampling without replacement is performed to construct the typical driving conditions. Notably, if the target mileage of a typical driving condition is greater than the total mileage of all samples of a type, the random sampling process without replacement is repeated. Figure 3c shows the three typical driving conditions obtained, each with a mileage of 120 km. The average working time is used for charging, waiting, and discharging under each driving cycle, and the vehicle goes from full load to no load.
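The sampling procedure of Step 3 can be sketched as follows. The micro-trip identifiers and lengths are hypothetical, and the sketch only assembles an ordered list of micro-trips up to a target mileage; it does not reproduce how the speed traces are concatenated in this work.

```python
import random


def build_cycle(micro_trips, target_km, seed=0):
    """Draw micro-trips without replacement until the target mileage is reached.

    micro_trips is a list of (trip_id, length_km) pairs. When every sample has
    been used and the target is still not reached, the pool is refilled and
    sampling without replacement starts again.
    Returns (list_of_trip_ids, total_km).
    """
    if not any(length > 0.0 for _, length in micro_trips):
        raise ValueError("at least one micro-trip must have positive length")
    rng = random.Random(seed)
    cycle, total_km = [], 0.0
    while total_km < target_km:
        pool = list(micro_trips)
        rng.shuffle(pool)  # a random order is equivalent to drawing without replacement
        for trip_id, length_km in pool:
            cycle.append(trip_id)
            total_km += length_km
            if total_km >= target_km:
                break
    return cycle, total_km


if __name__ == "__main__":
    # Hypothetical micro-trips of one driving condition type.
    trips = [("a", 2.0), ("b", 3.0), ("c", 5.0)]
    print(build_cycle(trips, target_km=25.0))
```

In this toy example the three samples total only 10 km, so the pool is refilled twice before the 25 km target is met, which corresponds to the repeated sampling case noted above.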
Step 4: Results of optimal control. The three types of driving conditions, together with the power demand of the mixing drum system, the known disturbance of the vehicle driving system, and the whole vehicle mass in the five route segments, are chosen as the simulation input. The power of the dual energy sources and dual energy consumption sources and the battery SOC obtained by DP under the three constructed typical driving conditions are shown in Figure 5a-c, respectively. As can be seen, under the different driving condition types, the engine output power is always greater than a fixed value, ranging from 21 to 23 kW, when the engine is in a stable operation state, which indicates that DP keeps the engine working in its high-efficiency area to the greatest extent. The battery SOC just reaches the minimum allowable value at the end of the trip under each type, which improves vehicle fuel economy and recharging capacity.

Development of Driving Condition Identifier Based on Supervised Learning
In this section, we establish three different identifiers and compare their performance.
Step 1: Based on the established sample library of the E-RE-CTM, three methods based on ML, including LVQ, Random Forest (RF), and Extreme Learning Machine (ELM), are designed to realize the online identification of driving condition types, which can be described as follows.
Learning Vector Quantization: An LVQ neural network is a supervised feedforward neural network in which a global optimum can be obtained by directly calculating the distance between the input vector and the competitive layer without preprocessing the input vectors [21,30]. As shown in Figure 3d, the LVQ neural network consists of three layers: an input layer, a competition layer, and a linear output layer; the input and competition layers are fully connected, and the competition and output layers are partially connected. The input layer contains six neurons corresponding to the six features listed in Table 2. The number of competition layer neurons is set to 30 through repeated experiments. The linear output layer contains three neurons corresponding to the three types of driving conditions described in Section 4.1.1. The LVQ neural network selects the winning neuron by calculating the Euclidean distance between the input vector and the weights of the competitive neurons, as:

$$d_i = \sqrt{\sum_{j=1}^{I} \left(x_j - \omega_{ij}\right)^2}, \quad i = 1, 2, \ldots, C$$

where $x_j$ is the $j$-th component of the input vector, $\omega_{ij}$ is the connection weight between the $i$-th competitive layer neuron and the $j$-th input layer neuron, and $I$ and $C$ represent the dimension of the input vector and the number of competitive neurons, respectively. Only the winning neuron and its corresponding output neuron are set to one [31]. On that basis, the identified type is obtained. Then the weights between the input vector and the winning competitive neuron are updated by:

$$\omega_{ij}(k+1) = \omega_{ij}(k) \pm \eta \left[x_j - \omega_{ij}(k)\right]$$

where the plus sign applies when the identified type is correct and the minus sign applies otherwise, $k$ is the number of iterations, and $\eta$ represents the learning rate.
Random Forest: RF is an ensemble learning method, a branch of ML, built on individual decision trees [32]. The final decision result $C_m(\cdot)$ of RF is generated by equal voting of all decision trees $C_p$. RF has attracted extensive attention due to its excellent performance [33,34]; however, there is still a lack of research on its use for driving condition identification. The calculation process of RF is described in Figure 3e. The number of decision trees affects the performance of the random forest algorithm, and the value 450, which offers high mean accuracy at a moderate scale, is selected through repeated experiments.
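The winner selection and weight update of the LVQ identifier described above can be illustrated with the standard LVQ1 rule. This is a simplified sketch rather than the network used in this work: each competitive neuron is represented as a prototype vector with a fixed class label (an assumption that stands in for the linear output layer), and the prototypes and learning rate in the example are made up.

```python
import math


def winner(x, prototypes):
    """Index of the competitive neuron closest to x in Euclidean distance."""
    return min(range(len(prototypes)), key=lambda i: math.dist(x, prototypes[i]))


def lvq1_step(x, label, prototypes, proto_labels, eta=0.1):
    """One LVQ1 update, performed in place on the winning prototype.

    The winner moves toward x when its class matches the sample label and
    away from x otherwise. Returns the index of the winning neuron.
    """
    i = winner(x, prototypes)
    sign = 1.0 if proto_labels[i] == label else -1.0
    prototypes[i] = [w + sign * eta * (xj - w) for w, xj in zip(prototypes[i], x)]
    return i


if __name__ == "__main__":
    # Two hypothetical prototypes for driving condition types 1 and 2.
    protos = [[0.0, 0.0], [10.0, 10.0]]
    proto_labels = [1, 2]
    print(lvq1_step([1.0, 1.0], 1, protos, proto_labels), protos)
    print(lvq1_step([9.0, 9.0], 1, protos, proto_labels), protos)
```

In the first call the winner has the correct class and is pulled toward the sample; in the second call the winner has the wrong class and is pushed away from it.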
Extreme Learning Machine: An ELM neural network is a single-hidden-layer neural network proposed by Huang [35] to address the shortcomings of conventional single-hidden-layer feedforward neural network algorithms. It has the advantages of good generalization performance, fast learning speed, and ease of use. A unique optimal solution can be obtained once the number of hidden layer neurons is selected. Figure 3f shows the structure of the ELM neural network. The thresholds $B$ of the hidden neurons and the input weights $W$ are randomly generated and remain unchanged during training. The corresponding output weights $\beta$ are obtained without iteration by solving Equation (13) [20]:

$$\beta = H^{+} T \tag{13}$$

where $H$ is the hidden layer output matrix, $T$ is the expected output, and $H^{+}$ denotes the Moore-Penrose generalized inverse of $H$. The input vectors are the same as those of LVQ. The number of hidden layer neurons is again determined experimentally to balance computational efficiency and recognition accuracy.
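The non-iterative training of an ELM can be sketched in pure Python as follows. This is a minimal illustration under stated assumptions, not the identifier built in this work: the sigmoid activation, the uniform weight range, and the tiny data set are assumptions, and a small ridge term is swapped in so that the output weights can be obtained by solving a linear system instead of implementing a Moore-Penrose inverse by hand (the two agree in the limit of a vanishing ridge term).

```python
import math
import random


def hidden_output(X, W, B):
    """Hidden layer output matrix H with a sigmoid activation."""
    return [[1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(W[l], row)) + B[l])))
             for l in range(len(W))] for row in X]


def solve(A, Y):
    """Solve A @ beta = Y by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + list(Y[i]) for i in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]


def train_elm(X, T, n_hidden=5, ridge=1e-6, seed=0):
    """Random, fixed input weights W and thresholds B; output weights beta
    from the ridge-regularized normal equations (H'H + ridge*I) beta = H'T."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1.0, 1.0) for _ in X[0]] for _ in range(n_hidden)]
    B = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    H = hidden_output(X, W, B)
    HtH = [[sum(h[a] * h[b] for h in H) + (ridge if a == b else 0.0)
            for b in range(n_hidden)] for a in range(n_hidden)]
    HtT = [[sum(h[a] * t[c] for h, t in zip(H, T)) for c in range(len(T[0]))]
           for a in range(n_hidden)]
    return W, B, solve(HtH, HtT)


def predict(X, model):
    """Network outputs H @ beta for the rows of X."""
    W, B, beta = model
    return [[sum(h[l] * beta[l][c] for l in range(len(beta)))
             for c in range(len(beta[0]))] for h in hidden_output(X, W, B)]


if __name__ == "__main__":
    # Hypothetical one-feature samples with one-hot targets for two classes.
    X = [[0.0], [1.0], [2.0], [3.0]]
    T = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    print(predict(X, train_elm(X, T)))
```

Only the output weights are fitted, in a single linear solve, which is the source of the fast learning speed mentioned above.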
Step 2: Performance Comparison. To verify and compare the performance of the above identification methods, Figure 6 shows a combined driving cycle constructed from the testing driving data by the random sampling method and its real types.
As presented in Table 3, the Kappa coefficient $\kappa$, total accuracy $A_m$, and identification time $t_a$ are chosen as indexes to evaluate and compare the performance of the identifiers. The total accuracy $A_m$ directly reflects the proportion of correct classifications; however, it becomes unreliable when the samples are unbalanced. The Kappa coefficient $\kappa$, ranging from −1 to 1, penalizes model bias that the total accuracy $A_m$ does not capture [36], so it is more suitable for evaluating classification problems with unbalanced category samples. Reference [37] proposed criteria for analyzing model performance according to the Kappa value. The higher $\kappa$ or $A_m$, the better the performance; conversely, a lower $t_a$ indicates better performance. The window size $W_t$ and the identification interval $I_t$ are vital to identification accuracy and real-time implementation. According to the comparison of identifier performance under different combinations of these two key parameters, $W_t = 120$ s and $I_t = 1$ s are optimal for RF and ELM, and $W_t = 80$ s and $I_t = 1$ s are optimal for LVQ. The Kappa coefficients of the three methods are all greater than 0.40, which means that the identification results have reached an ideal consistency with the real results [37]. If $\kappa$ is greater than 0.61, the identification is highly consistent with the real situation [37]. This indicates the accuracy and feasibility of the three methods. Meanwhile, comparing the three indexes under the parameter combination $[W_t, I_t] = [120, 1]$ shows that the ELM method outperforms the others. Although the Kappa coefficient and total accuracy of LVQ are superior to those of RF and ELM under the parameter combination $[W_t, I_t] = [80, 1]$, its identification time is the longest. Comparing the optimal results of ELM and LVQ mentioned above, it can be concluded that ELM has better recognition performance in this work. Hence, the ELM-based driving condition identifier is chosen for the online energy management strategy.
Table 3 notes: 1 the optimal window size and identification interval of RF and ELM; 2 the optimal window size and identification interval of LVQ.
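The two accuracy indexes used above can be computed as in the following sketch, which uses the standard definition of Cohen's Kappa (observed agreement corrected for the agreement expected by chance). The label sequences are made up for illustration and are not results from Table 3.

```python
from collections import Counter


def total_accuracy(y_true, y_pred):
    """Proportion of samples whose identified type equals the real type."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement and p_e the agreement expected by chance
    from the class frequencies. Undefined when p_e equals one.
    """
    n = len(y_true)
    p_o = total_accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical real and identified driving condition types.
    real = [1, 1, 1, 1, 2, 2, 2, 3, 3, 3]
    identified = [1, 1, 1, 2, 2, 2, 3, 3, 3, 1]
    print(total_accuracy(real, identified), cohen_kappa(real, identified))
```

In this example the total accuracy is 0.7 while the Kappa coefficient is about 0.55, which illustrates how Kappa discounts the agreement that would be expected by chance.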
Step 3: Optimization of the identifier based on ELM. The input weights and hidden layer thresholds of the basic ELM are generated randomly, so there is still potential to improve its performance. In this section, an ELM identifier optimized by a genetic algorithm is proposed. Meanwhile, to obtain better performance, the k-fold (k = 10) cross-validation method is used to determine the optimal number of hidden layer nodes. Taking the recognition accuracy as the fitness value, the flow chart of the optimized ELM method is presented in Figure 7. The identification results under the same driving cycle as in Step 2 are listed in Table 3. The Kappa coefficient $\kappa$ of the optimized ELM is 0.65, which is 0.02 higher than that of the unoptimized ELM, and the total accuracy $A_m$ of the optimized ELM is 0.01 higher than that of the unoptimized ELM. The identification time $t_a$ is also shortened to a certain extent.
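The k-fold cross-validation used above to choose the number of hidden layer nodes can be sketched as follows. The function names and the scoring callback are hypothetical; in practice `score_fn` would train an identifier on the training indices and return its recognition accuracy on the validation indices. The genetic algorithm itself is not sketched here.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, validation_indices) for each of the k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, folds[i]


def select_hidden_nodes(candidates, score_fn, n_samples, k=10):
    """Return (best_candidate, mean_score) by mean validation score.

    score_fn(n_hidden, train_indices, validation_indices) must return a score
    where higher is better, for example recognition accuracy.
    """
    best = None
    for n_hidden in candidates:
        scores = [score_fn(n_hidden, tr, va)
                  for tr, va in k_fold_indices(n_samples, k)]
        mean_score = sum(scores) / len(scores)
        if best is None or mean_score > best[1]:
            best = (n_hidden, mean_score)
    return best


if __name__ == "__main__":
    # Hypothetical scoring function that peaks at 30 hidden nodes.
    print(select_hidden_nodes([10, 20, 30, 40],
                              lambda n, tr, va: -abs(n - 30), n_samples=25))
```

Averaging the score over all folds makes the choice of the node count less dependent on one particular split of the sample library.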

Neural Network Structure Determination
Based on the optimal control results of different driving conditions obtained in Section 3.2, neural network models are designed to find the optimal power split associated with driving condition types.
The input and output variables of the neural network need to be determined in advance [15]. According to Equation (8), the control variable P ice is selected as the NN module output variable. Considering the characteristics of the system, in addition to basic input variables, such as the power demand of the vehicle driving system, battery SOC, and the vehicle speed, the vehicle mass and power demand of the upper mixing system are also significant for the NN module construction. The vehicle mass that affects the power demand of the dual energy consumption source is chosen as the NN module input variable. The power demand of the upper mixing system affects whether the dual energy source will output power or how much power it will output. As an inertial system, the change rate of the diesel engine is limited in unit time [8], so the last power output of the engine is also taken as an input variable of the NN module. Vehicle acceleration, total and current mileage, and time are also treated as the input variable of the NN module. Therefore, the novel efficient NN module structure is preliminarily constructed with 11 input variables.
To further reduce the calculation time and the memory usage of the control strategy, the input variables are reduced to eight by calculating the Spearman correlation coefficient. Figure 3g describes the final NN module structure with eight input variables, two hidden layers, and one output variable. In this study, the number of hidden layer nodes of the NN module differs between the types of driving conditions.
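The Spearman-based input reduction can be illustrated with a short sketch. The text does not state the exact selection rule, so this example assumes that candidate inputs are ranked by the absolute Spearman coefficient with the target output and that the top-ranked ones are kept; the variable names and sample values are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0  # mean of the 1-based positions
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the rank series."""
    return pearson(average_ranks(x), average_ranks(y))


def select_inputs(candidates, target, keep):
    """Keep the `keep` candidates with the largest |Spearman rho| to the target."""
    scores = {name: abs(spearman(col, target)) for name, col in candidates.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:keep], scores


# Hypothetical samples: three candidate inputs and the target engine power.
candidates = {
    "power_demand": [1.0, 2.0, 3.0, 4.0, 5.0],
    "soc": [0.9, 0.8, 0.7, 0.6, 0.5],
    "time": [3.0, 1.0, 4.0, 1.0, 5.0],
}
target_power = [10.0, 20.0, 30.0, 40.0, 50.0]
selected, scores = select_inputs(candidates, target_power, keep=2)
print(selected, scores)
```

The Spearman coefficient works on ranks, so it captures monotonic but nonlinear relationships between an input and the engine power, which a linear correlation measure could understate.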

Neural Network Training and Verification
The training data of the NN modules for the different driving condition types are generated by data preprocessing. The mean square error (MSE) is the main evaluation index of the neural network training effect: the smaller the MSE, the better the learning effect [38]. It can be calculated as:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(P_{ice}^{out}(i) - P_{ice}^{tar}(i)\right)^{2}$$

where $P_{ice}^{out}$ is the NN module output, $P_{ice}^{tar}$ is the target output value, and $N$ is the number of samples. The MSE is 0.00032, 0.0020 and 0.00021 for the three types, respectively, as presented in Table 4. From Figure 8a-c, the engine output power difference is concentrated within plus or minus 15 kW, which indicates that the engine output power generated by the NN modules is very close to the DP results under the three types and that the neural network designed above is reasonable and feasible. Therefore, the intelligent optimization algorithm based on the ML of trip information and energy management is an appropriate way to solve the online energy management problem of E-RE-CTMs. In the online application, it can automatically switch to the optimal NN module for the current micro-trip, as shown in Figure 3h, according to the recognition results.
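The two checks used above, the MSE and the share of samples whose power difference stays within a band, can be sketched as follows. The function names and the four sample values are hypothetical; the values are in kW for readability, whereas the MSEs in Table 4 were presumably computed on normalized data.

```python
def mean_square_error(outputs, targets):
    """Mean of the squared differences between NN outputs and target values."""
    if not outputs or len(outputs) != len(targets):
        raise ValueError("outputs and targets must be non-empty and equally long")
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / len(outputs)


def share_within_band(outputs, targets, band):
    """Fraction of samples whose absolute error does not exceed `band`."""
    inside = sum(1 for o, t in zip(outputs, targets) if abs(o - t) <= band)
    return inside / len(outputs)


# Hypothetical engine power samples in kW: NN module output vs. DP target.
nn_power = [50.0, 62.0, 80.0, 41.0]
dp_power = [48.0, 60.0, 100.0, 40.0]

mse = mean_square_error(nn_power, dp_power)
share = share_within_band(nn_power, dp_power, band=15.0)
print(f"MSE = {mse:.2f} kW^2, share within +/-15 kW = {share:.2f}")
```

Reporting the band share next to the MSE is useful because a single large deviation dominates the squared error, as the third sample does here.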

Conclusions
In this paper, an ML-based approximate optimal energy management strategy for concrete truck mixers equipped with a novel hybrid powertrain was designed from two aspects: trip information and energy management strategy. The main work included the following:
1. For the CTMs equipped with the proposed novel hybrid powertrain, a global optimization algorithm based on DP was proposed to solve the constrained, time-varying two-point boundary value problem with two control variables in the finite time domain. By designing an optimal efficiency curve of the generator in driving mode and establishing the generator efficiency model, the complexity of solving the energy optimization problem can be reduced;
2. An optimal control database can be obtained based on the ML and data-driven method, and different ML-based driving condition identifiers were constructed and compared. Simulation results showed that the overall performance of ELM is superior to that of RF and LVQ in terms of Kappa coefficient, identification time, and identification accuracy. An optimized ELM identifier based on a genetic algorithm was presented, which can further improve online identification performance;
3. For the E-RE-CTM, a novel neural network energy management strategy that takes the vehicle mass and the power demand of the upper-part system into account was designed based on the constructed optimal control database. Simulation results showed that the designed neural network is reasonable and feasible.
Building on this study, our future work could focus on the recognition or prediction of the neural network input variables, in order to develop an online energy optimization algorithm with stronger adaptability to more driving information.

Data Availability Statement: The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.