2.1. Mathematical Model of Underground Articulated LHD (Scraper)
The kinematic model of the underground articulated scraper is widely used in path tracking because its motion mechanism is simple and an accurate model can be obtained [28].
2.1.1. Kinematics Model of Articulated Scraper
The underground intelligent scraper is an articulated vehicle [29]. Its body is divided into a front unit and a rear unit, which are connected at a hinge point [30]. In deep tunnels, this structure gives the vehicle a small turning radius and therefore greater maneuverability.
To determine the geometric relationship between the real-time position change and the motion variables of the LHD, it is essential to analyze the kinematics of the underground articulated LHD and establish the kinematics equation [31]. The body structure of the underground articulated scraper is shown in Figure 2.
In the coordinate system, the front part of the car body is P1(X1,Y1), the rear part of the car body is P2(X2,Y2); the center of mass velocity of the front body is v1, and the center of mass velocity of the rear body is v2; the length of the front body is L1, and the length of the rear body is L2; the slip angle of the front body is , and the slip angle of the rear body is ; the heading angle of the front body is and the heading angle of the rear body is ; the heading angular velocity of the front body is expressed as , and the heading angular velocity of the rear body is expressed as .
Define the course angular velocity:
Under normal circumstances, the underground scraper operates slowly, generally below 30 km/h [32]. Provided that the influence of tire deformation and vehicle body slip is ignored, that is,
=
, then the motion state model of the underground articulated scraper is [33]:
Note: is the articulated angular velocity.
By selecting the midpoint of the front axle as the reference point of the vehicle state, the kinematics model of the underground articulated scraper can be obtained as:
According to Equation (3), the motion state of the whole vehicle body can be controlled through the articulation angular velocity of the front body of the underground articulated scraper [34].
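The kinematic relation described above can be illustrated numerically. The following Python function is a minimal sketch, assuming the commonly used no-slip articulated-vehicle model with the front axle midpoint as the reference point. The function name, the argument names, and the body lengths in the usage note are illustrative assumptions (they are not the values of Table 1), and the sign convention for the articulation angle differs between references.

```python
import math


def articulated_step(x, y, theta, gamma, v, gamma_rate, L1, L2, dt):
    """One forward-Euler step of a no-slip articulated-vehicle model.

    Assumptions (not taken from the paper): (x, y) is the front axle
    midpoint, theta is the front-body heading, gamma is the articulation
    angle, and gamma_rate is the articulation angular velocity.
    """
    # Heading rate of the front body under the no-slip assumption.
    theta_rate = (v * math.sin(gamma) + L2 * gamma_rate) / (
        L1 * math.cos(gamma) + L2
    )
    x_next = x + v * math.cos(theta) * dt
    y_next = y + v * math.sin(theta) * dt
    theta_next = theta + theta_rate * dt
    gamma_next = gamma + gamma_rate * dt
    return x_next, y_next, theta_next, gamma_next
```

For instance, calling `articulated_step(0.0, 6.5, 0.0, 0.0, 3.5, 0.0, 1.5, 1.5, 0.1)` with hypothetical body lengths of 1.5 m advances the vehicle in a straight line, since both the articulation angle and its rate are zero.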
2.1.2. Location Prediction Model
The location prediction model uses the current motion state to estimate the motion state at the next moment. Introducing the predicted location into the control strategy allows the control quantity to be compensated in advance, which reduces the future error of the controller, yields a more reasonable control output, avoids over-control, and safeguards the quality of path tracking control for the underground articulated scraper (LHD) [35].
The running curve of an articulated vehicle and the parameter definitions are shown in Figure 3. Curve A is the ideal path of the underground articulated scraper.
Regarding the ideal path of track curve A and the actual path of track curve B,
is the ideal speed of the articulated scraper, and
is the actual running speed of the articulated scraper;
is the ideal heading angle of articulated scraper, and
is the actual heading angle of articulated scraper;
is the ideal reference point of the articulated scraper, and
is the actual reference point of the articulated scraper;
is the ideal steering center of the articulated scraper, and
is the actual steering center of the articulated scraper [36].
In addition, the center of the front car body of the underground articulated scraper is P, and the predicted position point of the scraper is ; the steering center of the front car body is , and the steering center of the rear car body is ; the steering radius of the front body is , and the steering radius of the rear body is .
According to the motion equation of the articulated vehicle, the change rate of the front steering angle of the vehicle is:
Set the sampling interval as
, and the predicted heading angle of the front end of the vehicle is:
Taking clockwise rotation of the vehicle body as the negative direction, the solution can be obtained:
To distinguish the rotating motion state of the front car body from the linear motion state of the rear car body, the steering state is set as
, and the following conditions are satisfied:
where the threshold value is selected to be small. When the degree of steering angle
of the underground articulated LHD falls within the threshold range, the LHD can be assumed not to rotate [37]. When the underground articulated scraper moves from
to
, the time taken is
, and the change rate of the steering angle of the front segment of the vehicle is
; then, the angle turned by the front segment of the vehicle is:
According to the geometric relation, the forward distance of the vehicle can be calculated:
It can be deduced that the midpoint
of the front end of the car body in the next period is:
Note: is the deviation between the predicted heading angle and the heading angle of the current position.
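The prediction steps above (heading update over one sampling interval, a small-angle threshold for straight-line motion, and a geometric forward distance) can be sketched as follows. This is a hedged illustration rather than the exact equations of this section: the function name, the threshold default `eps`, and the use of the arc chord as the forward distance are assumptions.

```python
import math


def predict_front_position(x, y, theta, theta_rate, v, dt, eps=1e-3):
    """Predict the front-body reference point after one sampling interval.

    eps is a hypothetical threshold on the heading change; below it the
    motion is treated as a straight line.
    """
    d_theta = theta_rate * dt  # heading change over the interval
    if abs(d_theta) < eps:
        dist = v * dt          # straight-line travel
        direction = theta
    else:
        radius = v / theta_rate                        # signed turning radius
        dist = 2.0 * radius * math.sin(d_theta / 2.0)  # chord of the arc
        direction = theta + d_theta / 2.0              # chord direction
    return (
        x + dist * math.cos(direction),
        y + dist * math.sin(direction),
        theta + d_theta,
    )
```

Using the chord of the circular arc keeps the predicted point exactly on the arc for a constant turning rate, which a plain straight-line extrapolation would not.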
2.1.3. Deviation Dynamics Model
The core of Part I is to take the speed and hinge angle of the underground articulated scraper in the current control scheme as the main control variables, so as to control the state of the hinge angle [38]. Based on the predicted heading angle and the current predicted heading deviation, the control system can effectively improve the accuracy of path tracking control [39]. The average moving speed of the vehicle body is
v, and the kinematic equation can be derived from the error obtained by comparing the actual path of the underground articulated scraper with the ideal path. The error model of the underground articulated scraper is shown in Figure 4.
Position deviation
is the lateral position error between the reference point
P of the underground articulated scraper and the relative point
P on the planned path:
Driving direction deviation
is the difference between the direction angle of the reference location point
P of the underground articulated scraper and the direction angle of the reference location point
P on the ideal motion trajectory:
Curvature deviation is the curvature error between the reference point
P of the underground articulated scraper and the reference point
P on the ideal path:
According to Equations (13)–(15) and L = L1 + L2, the following can be obtained:
Since the underground roadway is narrow, the underground scraper can only run at low speed where the two sides of the roadway are close to each other [40]. Therefore, real-time control of the articulated steering angle is the key difficulty in path tracking. To realize path tracking of the LHD in the roadway, the articulation angular velocity and the vehicle speed were selected as the control variables, and the error dynamics model was established based on the actual path and the expected path of the LHD [41]. Because the articulated vehicle moves slowly, the articulation angle changes little and the articulation angular acceleration is generally negligible. The deviation equation above is therefore simplified, and the deviation dynamic equation of the underground articulated scraper is obtained as follows [42]:
The heading angle deviation is defined as the difference between the instantaneous heading angle and the expected heading angle. The lateral deviation is the distance between the reference location point of the underground scraper and the expected reference location point (positive when the reference location point is on the right side of the expected path). The curvature deviation is the curvature difference between the reference location point and the expected reference location point.
Under the condition of uniform speed, the dynamic model of the deviation of the articulated scraper is a linear time-invariant system model, and the articulated angle can be controlled by controlling the error variable.
2.2. Path Tracking of Underground Articulated LHD Based on LQR Controller
The LQR optimal controller applies to a linear system expressed in state-space form, as in modern control theory, and requires the system to be fully controllable and observable [43]. The core concept of LQR control is to achieve the best control effect with the smallest control effort, that is, the minimum energy consumption [44]. LQR design consists of finding the state feedback gain K that minimizes a quadratic cost function weighted by the matrices Q and R. The state feedback matrix K is uniquely determined by the Q and R matrices, and both Q and R are positive definite matrices [45].
According to LQR control theory, the state-space equation of the controlled object must be determined first. The state equation of the underground articulated scraper can be set as the following equations [46]:
Note: is the system state space, , and R is the n-dimensional real matrix. is the input vector of the system, , and R is the n-dimensional vector. is the system output matrix, , and R is the n-dimensional real matrix. is the state feedback vector, , and R is the n-dimensional real vector. are system input variables, state variables, and outputs.
According to the above mathematical model of the scraper and the vehicle parameters of the articulated scraper in Table 1, the parameters of the state-space equation of the underground articulated scraper can be determined as follows [47]:
According to the above matrix parameters, the controllability and observability matrices of the system are:
From the above equation, it can be seen that the observability and controllability matrices of the system are full rank, which means that the underground articulated scraper system is fully controllable and observable and meets the basic conditions of LQR control [48]. The state-space equation model of the system can be determined by the deviation dynamics model and parameters of the underground articulated LHD:
Through the output deviation matrix
of the control system, the ideal articulation angle input of the car body can be obtained, and the optimal control performance index can be established [49].
Note: is the time domain integral of the deviation of the underground articulated scraper is the error performance index. represents the time domain integral of control quantity, namely the energy consumption index.
The core of LQR control is to achieve the best control effect with the minimum error performance index and the minimum energy consumption index, establish the feedback control law
to achieve
, and establish the Riccati equation:
Note: P is a positive definite matrix, and let , where E is the identity matrix.
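The rank test and the Riccati-based gain computation can be illustrated with a short NumPy sketch. Everything numerical here is an assumption: the three-state deviation model (lateral, heading, and curvature deviation), the values of `v`, `L`, and `dt`, and the weights `Q` and `R` are placeholders, not the matrices of this paper. The sketch also swaps in a discrete-time Riccati recursion after forward-Euler discretization instead of solving the continuous-time equation directly.

```python
import numpy as np


def ctrb_rank(A, B):
    """Rank of the controllability matrix [B, AB, ..., A^(n-1)B]."""
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.linalg.matrix_rank(np.hstack(blocks))


def dlqr(Ad, Bd, Q, R, tol=1e-9, max_iter=10000):
    """Discrete-time LQR gain by iterating the Riccati recursion."""
    P = Q.copy()
    for _ in range(max_iter):
        K = np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)
        P_next = Q + Ad.T @ P @ (Ad - Bd @ K)
        converged = np.max(np.abs(P_next - P)) < tol * max(1.0, np.max(np.abs(P)))
        P = P_next
        if converged:
            break
    K = np.linalg.solve(R + Bd.T @ P @ Bd, Bd.T @ P @ Ad)
    return K, P


# Hypothetical deviation model: placeholder structure and values.
v, L, dt = 3.5, 3.0, 0.05
A = np.array([[0.0, v, 0.0], [0.0, 0.0, v], [0.0, 0.0, 0.0]])
B = np.array([[0.0], [0.0], [1.0 / L]])
C = np.eye(3)

# Observability is checked through the dual pair (A^T, C^T).
print("controllability rank:", ctrb_rank(A, B))
print("observability rank:", ctrb_rank(A.T, C.T))

Ad, Bd = np.eye(3) + A * dt, B * dt      # forward-Euler discretization
Q, R = np.diag([10.0, 5.0, 1.0]), np.array([[1.0]])
K, P = dlqr(Ad, Bd, Q, R)
rho = float(np.max(np.abs(np.linalg.eigvals(Ad - Bd @ K))))
print("closed-loop spectral radius:", rho)
```

A spectral radius below one indicates that the state feedback u = -Kx stabilizes the discretized placeholder model.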
To achieve the best input of the control system, the Q and R matrices must be configured reasonably so that the control quantity reaches its ideal output [50]. The configuration process of the Q and R parameters is shown in Figure 5.
The input of the LQR controller is composed of three parts. The first is the position and posture information of the scraper at its current position [51]. The second is the ideal path information, that is, the position information of the ideal path. The third is the predicted position information, which is solved by using the deviation dynamics model [52]. The controller input of the LQR should therefore be a linear superposition of these variables. To prevent over-control and under-control of the underground articulated LHD, different weights should be given to them after comprehensive consideration. The strategy adopted in Part I is to take the difference between the current position information and the ideal path as the current tracking deviation, and to obtain the prediction deviation from the predicted position information through the deviation dynamics model. The final input of the LQR controller is the weighted superposition of the two, as shown in Equation (24).
Note: is the Current Tracking Deviation Matrix; is the Prediction Information Deviation Matrix; a and b are the weight factors, a + b = 1; is the final input for the LQR controller.
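The weighted superposition with a + b = 1 can be written in a few lines. The sketch below is illustrative only: the function name and the default weight `a = 0.6` are assumptions, since the actual weights are not stated in this section.

```python
import numpy as np


def blended_deviation(e_current, e_predicted, a=0.6):
    """Weighted sum of current and predicted deviations, with b = 1 - a."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("weight a must lie in [0, 1]")
    b = 1.0 - a
    e_current = np.asarray(e_current, dtype=float)
    e_predicted = np.asarray(e_predicted, dtype=float)
    return a * e_current + b * e_predicted
```

A larger `a` makes the controller react mainly to the present deviation, while a larger `b` emphasizes the predicted deviation.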
2.3. Algorithms
In LQR control, the state weight matrix Q and the control weight matrix R together determine the state feedback vector, so the selection of the Q and R parameters directly affects the control performance. As with parameter tuning in classical control theory, the selection of the two positive definite matrices Q and R tends to rely on a large amount of engineering experience, which is time-consuming [53]. Moreover, the optimal configuration of Q and R cannot be guaranteed in this way. The selection of the state weight matrix Q and the control weight matrix R can be regarded as an optimization problem analogous to the Traveling Salesman Problem. It is difficult to obtain the optimal solution of such problems with general methods, so heuristic intelligent optimization algorithms are needed, such as the genetic algorithm (GA), the ant colony algorithm (ACA), and particle swarm optimization (PSO) [54].
2.3.1. Adaptive GA Algorithm Optimization
- (1)
Disadvantages of simple genetic algorithms
The simple genetic algorithm (SGA) is widely used in practical engineering, but engineering practice has exposed many of its defects, such as premature convergence of the population, population differentiation, and the failure of the groups to become uniform after repeated selection [55]. Poorly designed selection, crossover, and mutation operators are the fundamental cause of premature convergence. Premature convergence cannot be avoided entirely, and it is a general feature of intelligent optimization algorithms. Therefore, the crossover operator and mutation operator of the classical genetic algorithm need to be improved to mitigate premature convergence [56].
- (2)
Improved adaptive genetic algorithm LQR control (LQR–AGA)
In LQR control, the improved adaptive genetic algorithm overcomes the difficulty of tuning a traditional LQR controller by hand. The Q and R parameters are optimized by the population, and the improved crossover and mutation operators give the algorithm a strong global optimization ability, so it can find the best state feedback matrix in the selected search space [57].
The improved adaptive genetic algorithm process is shown in Figure 6. Its main function is to optimize the Q and R matrices in the LQR controller, that is, to determine the parameters of the optimal Q and R matrices.
Encoding
The improved adaptive genetic algorithm encodes chromosomes and genes with real-number encoding, because this method is intuitive, simple, and easy to compute [58]. It is suitable for genetic algorithms with a complex fitness function and can greatly reduce the computational cost, thereby speeding up the genetic algorithm [59].
For example, the chromosome is assumed to be Q = [q1, q2, q3], where q1, q2, and q3 are the genes on the chromosome and Q is the individual that takes part in the selection, crossover, and mutation operations.
Group Value Range
Chromosome Q is generated from a MATLAB random number matrix, so the initial population is highly random, which strengthens the global optimization ability of the improved adaptive genetic algorithm.
It can be seen from Equation (25) that
,
,
, which guarantees the stability of the whole system and the existence of the state feedback matrix [60].
Crossover Mode
The improved crossover operator is used to select the parent generation for crossover change to produce the offspring with strong search ability.
Note: is the n gene above the parent line i chromosome; is the n gene above the offspring clause i chromosome; , b, and c represent the cross variants.
One of the two crossover modes is selected according to whether the offspring fall outside the boundary. If they do, crossover follows Equation (26); otherwise, it follows Equation (27) [61].
Mutation
According to the adaptive mutation operator, a randomly chosen gene position on the chromosome is mutated to maintain the diversity of the population and enhance its global optimization ability. At the same time, the operator allows the population to become uniform and converge to the optimal solution in the later iterations.
Note: ; represents size range of the variation action; represents genes; represents the minimum range of variation of the previous generation; represents the maximum range of variation of the previous generation; represents algebraic variants.
When the number of iterations is small, the mutation probability is large, and the global optimization ability of the population is strong. When the number of iterations is large, the mutation probability is small, the genetic algorithm runs faster, and the required time is short [62].
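The behavior described above, strong exploration in early generations and fine adjustment in late generations, can be illustrated with a standard non-uniform mutation operator in the style of Michalewicz. This is a swapped-in textbook operator shown under stated assumptions, not necessarily the adaptive operator of this paper; the function name is hypothetical, and the defaults reuse the range [0, 50] and the constant b = 3 quoted in this section.

```python
import random


def nonuniform_mutate(gene, generation, max_generation,
                      lower=0.0, upper=50.0, b=3.0, rng=random):
    """Non-uniform mutation: the step size shrinks as generations advance.

    The gene is assumed to lie in [lower, upper]; the result stays there.
    """
    r = rng.random()
    # Fraction of the remaining range to move; tends to 0 at the last generation.
    shrink = 1.0 - r ** ((1.0 - generation / max_generation) ** b)
    if rng.random() < 0.5:
        return gene + (upper - gene) * shrink  # move toward the upper bound
    return gene - (gene - lower) * shrink      # move toward the lower bound
```

Because the step is a fraction of the distance to a bound, the mutated gene never leaves the search range, and at the final generation the gene is left unchanged.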
Parameter Selection
The number of initialized population individuals was set to be 30, and 50 generations were bred. The probability of crossover between two chromosomes was 0.2, the variation action constant b = 3, and the range of population living space was [0, 50].
Through discretization of the LQR, the fitness equation can be obtained as follows:
Note: T represents the total sampling time length; q1, q2, and q3 represent the Q matrix diagonal elements.
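A discretized quadratic cost of this kind can be evaluated by simulating the closed loop and summing the stage costs. The sketch below is an assumption about the general form only (a sum of x'Qx + u'Ru over the sampling steps under u = -Kx); the exact fitness expression of this paper is not reproduced, and all names are hypothetical.

```python
import numpy as np


def quadratic_cost(Ad, Bd, K, Q, R, x0, steps):
    """Sum of x'Qx + u'Ru over a closed-loop simulation with u = -Kx."""
    x = np.asarray(x0, dtype=float).reshape(-1, 1)
    cost = 0.0
    for _ in range(steps):
        u = -K @ x
        cost += (x.T @ Q @ x + u.T @ R @ u).item()
        x = Ad @ x + Bd @ u
    return cost
```

In a GA loop, a candidate chromosome (q1, q2, q3) would be turned into Q = diag(q1, q2, q3), a gain K would be computed for it, and a cost of this type (or a tracking-error measure) would be used to rank the candidate.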
To sum up, the block diagram of the LQR-AGA control system can be drawn, as shown in Figure 7 [63].
- (3)
Simulation experiment of LQR-AGA control algorithm
For the path simulation, Figure 8 shows the wavy roadway. The midline of the roadway cross-section is taken as the ideal running path of the scraper, that is, the target path for control. This path contains continuous turns and other complex road conditions, so it provides a strict test of the controller that reflects the movement of the scraper in actual operation [64]. In addition, to ensure the safe operation of the scraper, the maximum lateral deviation, namely the safe distance, should be kept within 0.6 m [65].
The population size N is set to 30 and the number of iterations G to 50 generations. A crossover probability of 0.2 gives the population strong adaptability and a good balance between global optimization ability and iteration speed. An adaptive mutation constant of 0.2 preserves a relatively good global optimization ability and helps avoid local optima during iteration. The mutation action constant b = 3 provides global search capability in the early stage of mutation and helps prevent the search from being trapped in a local optimum in the late stage. The population range [0, 50] ensures the positive definiteness of the weight matrix and the stability of the whole control system [66].
In summary, all the parameters of LQR-AGA are set as shown in Table 2.
The initial coordinate of the articulated scraper is set as [0.00, 6.50], the initial heading angle is set to 0, the articulated steering angle is reset to zero, and the driving speed is constant at 3.5 m/s. The AGA algorithm is used to optimize the weight matrix in the path tracking controller of the articulated scraper. After repeated experiments, we found that in the first 20 iterations of the AGA algorithm, the population is widely distributed in the search space, varies over a wide range, and has strong global search ability. In the last 20 generations, the population shows strong convergence and quickly converges to the optimal solution of the search space. This behavior resembles the way natural populations adapt to their environment [67].
Table 3 shows the results of parameter optimization. The results show that the optimized parameters reach the best fitness within the iteration budget. Over the 50 iterations, the fitness of the AGA algorithm increased monotonically, indicating that both the individuals and the population were evolving toward the optimal solution, and the fitness of the population remained stable at the end of the iteration, indicating that the entire population had converged to the optimal solution [68].
As can be seen from Figure 9, Figure 10 and Figure 11, the population has a strong global optimization ability at the beginning, the convergence rate is fast in the later stages of iteration, and the average fitness of the population is high at the end of the iteration. In the simulation environment, the lateral error of the LHD on the simulated path is less than 0.1 m, so the weight matrix Q optimized by the AGA algorithm makes the actual route of the articulated LHD nearly coincide with the ideal path [69].
2.3.2. Optimization of QPSO Algorithm
- (1)
Disadvantages of simple PSO algorithm
The simple PSO algorithm is based on the BOID (bird-oid) model of the foraging behavior of birds, which simulates the foraging characteristics of gregarious creatures. Because it places few requirements on the objective function and is simple to program, this algorithm plays an important role in data scheduling, optimization, function optimization analysis, neural network training, and other emerging disciplines. However, the BOID model also has obvious disadvantages, such as severe premature convergence of the population, strong randomness of the optimization results, and the fact that the global optimum is only guaranteed to be found as the number of iterations approaches infinity. As a result, the PSO algorithm based on the BOID model cannot satisfy the parameter optimization needs of the LQR controller: the objective function of the LQR is a complex multi-peak function, the randomness of solving it with the BOID model is too large, and there is no guarantee that the optimization results meet the path tracking requirements of the underground articulated scraper. Therefore, the simple PSO algorithm needs to be improved so that it can optimize the LQR objective function [70].
- (2)
Quantum Behavior PSO Algorithm (QPSO)
This particle swarm optimization algorithm strengthens the global optimization ability of each individual in the population. Combined with a linearly decreasing inertia weight strategy, the inertia of the particles can be reduced at the end of the iteration to speed up convergence and improve the search efficiency of the swarm. The optimized particle swarm velocity update is shown in Equation (30):
Note: represents the velocity of the i-th particle; represents the location of the i-th particle; Pbest represents the historical optimal position of the particle; and Gbest represents the historical optimal location of the population.
In high-dimensional problems, the particle swarm optimization (PSO) algorithm easily falls into local optima in the late iterations. Therefore, a velocity factor K is introduced to adjust the velocity of particles at the end of the iteration. Its formula is shown in Equation (31):
In the early stage of the algorithm iteration, the particles are relatively scattered and have a large inertia weight, so they explore the space globally according to their initial velocity and have a strong global exploration ability. Therefore, the K values should be large initially. In the late iterations, the population needs strong convergence to speed up the algorithm, and it should change only within a small spatial range. To sum up, K should decrease monotonically as the number of iterations increases, so the function of K versus the number of iterations can be set as shown in Equation (32).
Note: T represents number of current iterations; Max_Gen represents the maximum number of iterations.
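The velocity update and a monotonically decreasing schedule can be sketched as below. This is the textbook PSO velocity update with a linearly decreasing weight, shown as an assumption; the endpoint values 0.9 and 0.4, the acceleration constants c1 = c2 = 2, and the function names are illustrative and are not taken from Table 4, and the exact form of K in Equations (31) and (32) is not reproduced.

```python
import numpy as np


def inertia_weight(t, max_gen, w_start=0.9, w_end=0.4):
    """Linearly decreasing weight from w_start (t = 0) to w_end (t = max_gen)."""
    return w_start - (w_start - w_end) * t / max_gen


def pso_velocity(v, x, pbest, gbest, w, c1=2.0, c2=2.0, rng=None):
    """Textbook PSO velocity update for one particle (arrays of equal shape)."""
    rng = np.random.default_rng() if rng is None else rng
    r1 = rng.random(x.shape)  # random factor for the personal-best term
    r2 = rng.random(x.shape)  # random factor for the global-best term
    return w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
```

A large weight in early generations keeps particles moving across the search space, while the smaller weight near Max_Gen lets the swarm settle around the best positions found.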
QPSO changes the position update strategy of the simple PSO algorithm by removing the velocity attribute and replacing it with a probability distribution function, which means that particles are distributed according to probability rather than velocity. The position of each particle is therefore determined by sampling at every iteration. The formula for calculating the average of the historical optimal values of the individual particles is as follows:
Note: N is the sum of the number of particles; represents the optimal fitness of a single particle in the i-th iteration.
The position update of particles is based on the probability distribution function, as shown in Equation (34):
Note: represents the probability function obeying the uniform distribution on (0,1); represents the expansion coefficient, the probability of positive is 50%, and the probability of negative is 50%.
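A compact QPSO loop following the commonly published form of the algorithm (mean best position, a random attractor between personal and global best, and a logarithmic jump whose sign is positive or negative with equal probability) is sketched below. It is a hedged illustration: the linearly decreasing expansion coefficient from 1.0 to 0.5, the clipping to the search range, and all names are assumptions rather than the configuration of this paper.

```python
import numpy as np


def qpso_minimize(f, dim, lower, upper, n_particles=30, max_gen=80, seed=0):
    """Minimize f over [lower, upper]^dim with a basic QPSO loop."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lower, upper, size=(n_particles, dim))
    pbest = x.copy()
    pbest_f = np.array([f(p) for p in pbest])
    g = int(np.argmin(pbest_f))
    gbest, gbest_f = pbest[g].copy(), float(pbest_f[g])
    history = [gbest_f]
    for t in range(max_gen):
        alpha = 1.0 - 0.5 * t / max_gen            # expansion coefficient
        mbest = pbest.mean(axis=0)                 # mean of personal bests
        phi = rng.random((n_particles, dim))
        attractor = phi * pbest + (1.0 - phi) * gbest
        u = 1.0 - rng.random((n_particles, dim))   # uniform on (0, 1]
        sign = np.where(rng.random((n_particles, dim)) < 0.5, -1.0, 1.0)
        x = attractor + sign * alpha * np.abs(mbest - x) * np.log(1.0 / u)
        x = np.clip(x, lower, upper)               # stay inside the range
        fx = np.array([f(p) for p in x])
        improved = fx < pbest_f
        pbest[improved] = x[improved]
        pbest_f[improved] = fx[improved]
        g = int(np.argmin(pbest_f))
        if pbest_f[g] < gbest_f:
            gbest, gbest_f = pbest[g].copy(), float(pbest_f[g])
        history.append(gbest_f)
    return gbest, gbest_f, history
```

For the LQR problem, `f` would map a candidate (q1, q2, q3) in the range [0, 50] to a tracking cost; here any function of a NumPy vector can be used for a quick check.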
The particle swarm optimization algorithm operates directly on a group of particles. The information of each individual particle is shared and analyzed across the group, which drives the coordinated motion of the swarm, so that the search evolves from disorder to order in the solution space and the optimal solution of the problem is obtained. The basic flow of the algorithm and the block diagram of the LQR-QPSO system are shown in Figure 12 and Figure 13, respectively.
- (3)
LQR-QPSO control algorithm simulation experiment
The simulation environment is shown in Figure 8; see Table 1 for the body parameters of the underground articulated scraper; the QPSO parameter configuration is shown in Table 4. The maximum lateral deviation is required to be less than 0.6 m [71].
The initial coordinate of the articulated scraper is set as [0.00, 6.50], and the initial heading angle is set to 0°. In addition, the steering angle between the front and rear car bodies of the articulated scraper is set to zero, which means that the body keeps moving straight forward, and the traveling speed is kept constant at 3.5 m/s. Repeated simulation tests show that, in all experiments, the population convergence rate of the QPSO algorithm is slow. In the first 30 generations, each particle varies greatly over the global range and shows obvious divergence, but after 60 generations the particle swarm gradually converges to the optimal value. The controller parameters obtained after 80 generations keep the errors in the operation of the articulated scraper within a reasonable range [72], as shown in Figure 14.
In addition, the optimization results of the QPSO algorithm are shown in Table 5, the simulation results of QPSO are shown in Figure 15, and the deviation range of QPSO is shown in Figure 16.
2.3.3. Optimization of Ant Colony Algorithm (ACA)
- (1)
Ant Colony Algorithm LQR Controller (LQR-ACA)
An LQR-ACA path tracking controller can be established as shown in Figure 17 [73].
- (2)
LQR-ACA control algorithm simulation experiment
The validity and reliability of the LQR-ACA path tracking controller were tested and verified by MATLAB simulation [74]. To give the ant colony good search ability and iteration speed, the number of ants is set as 30, and the number of iterations G is set as 50 generations. The pheromone evaporation factor is set to 0.4, and the search range is set to (0, 50) [75]. The ACA parameter configuration is shown in Table 6.
The initial coordinate of the articulated scraper is set as [0.00, 6.50], the initial heading angle is set to 0, the articulated steering angle is reset to zero, and the driving speed is constant at 3.5 m/s. The ant colony algorithm is used to configure the parameters of the LQR controller of the articulated scraper. Repeated tests show that, within 100 iterations, the ant colony converges to different extreme points, and the positions of most ants no longer change after about 20 generations. This indicates that the fitness function of the LQR controller is a multi-extremum function with many points of equal fitness in the solution space [76], as shown in Figure 18.
In addition, the ACA optimization results are shown in Table 7.
The parameter optimization results are substituted into the simulation environment of the articulated scraper to complete the path tracking simulation, and the results are shown in Figure 19. The weight matrix Q obtained by the ACA algorithm makes the actual route of the articulated scraper nearly coincide with the ideal path [77]. In addition, the deviation range of ACA is shown in Figure 20.