Cooperative Carrying Control for Multi-Evolutionary Mobile Robots in Unknown Environments

This study provides an effective cooperative carrying and navigation control method for mobile robots in an unknown environment. A manager mode switches between two behavioral control modes—wall-following mode (WFM) and toward-goal mode (TGM)—based on the relationship between the mobile robot and the unknown environment. An interval type-2 fuzzy neural controller (IT2FNC) based on a dynamic group differential evolution (DGDE) is proposed to realize the carrying control and WFM control for mobile robots. The proposed DGDE uses a hybrid method that combines a group concept with an improved differential evolution to overcome the drawbacks of the traditional differential evolution algorithm. A reinforcement learning strategy was adopted to develop an adaptive WFM control and achieve cooperative carrying control for mobile robots. The experimental results demonstrate that the proposed DGDE is superior to other algorithms in WFM control. Moreover, the experimental results demonstrate that the proposed method can complete the cooperative carrying task and can realize navigation control that enables the robot to reach the target location.


Introduction
Mobile robot control has been widely used in several applications, such as navigation, obstacle avoidance, path planning, and cooperative transport. To enhance the robot control quality, Cupertino [1] adopted a fuzzy controller. The fuzzy controller possesses robustness and an anti-noise ability; it can process signals with uncertainties. However, to design an applicable fuzzy controller, designers must spend a considerable amount of time analyzing the experimental input and output data of a mobile robot. Thus, machine-learning technology has gradually attracted considerable research attention. Zhu and Yang [2] and Rusu et al. [3] used supervised learning methods to adjust the parameters of the if-then rules in fuzzy neural networks using training data. The disadvantages of supervised learning [2,3] are that it is difficult to collect training data in advance and to obtain precise training data.
Recently, reinforcement learning [4] has been widely used in control applications for mobile robots. The method not only can automatically construct a complete fuzzy neural network in the absence of precise training data but can also adjust the parameters of the system through a machine learning algorithm to complete the navigation task. Therefore, designing a mobile robot using evolutionary computing in unknown environments has become a topic of interest. Hsu and Juang [5] implemented a wall-following control using reinforcement learning, in which a fitness function was used to evaluate the wall-following performance.

The e-puck mobile robot is a two-wheeled mobile robot with an axle diameter of 4.12 cm and a maximum speed of 15 cm/s. Its infrared sensors provide a symmetrical sensing range of approximately 360° and perform tasks such as object detection, distance detection, and obstacle avoidance. The sensors S0-S3 on the right side of the robot are mounted at 10°, 45°, 90°, and 135°, respectively. Each sensor can detect a distance between 1 and 6 cm, as shown in Figure 1b.


Proposed Type-2 Fuzzy Controller Based on an Evolutionary Algorithm
An IT2FNC was proposed to realize wall-following control. The associated DGDE learning algorithm can be used to adjust the parameters of the IT2FNC.

Interval Type-2 Fuzzy Neural Controller
In this section, the structure of the IT2FNC is introduced. Figure 2 displays the structure of the IT2FNC. X 1 , · · · , X n represent the inputs of the IT2FNC, whereas Y Left and Y Right represent the left and right wheel speeds of the robot, respectively. To reduce the computational complexity of the order reduction during defuzzification, this study adopted the centers of sets (COS) method [11,19] to conduct the order reduction process. A functional link neural network (FLNN) with a nonlinear combination input was added in the consequent part of each fuzzy rule [20]. Figure 2 presents the five-layer structure of the IT2FNC, which consists of an input layer, a membership function layer, a firing layer, a consequent layer, and an output layer. The if-then rule can be expressed as follows:

Rule j: IF x 1 is A 1j and x 2 is A 2j . . . and x i is A ij . . . and x n is A nj THEN y j = ∑ M k=1 ω kj ∅ k = ω 1j ∅ 1 + ω 2j ∅ 2 + . . . + ω Mj ∅ M , (1)

where n is the number of inputs, x i represents the ith input, y j denotes the output of the jth fuzzy rule, A 1j , A 2j , . . . , A nj represent the interval type-2 fuzzy sets, ω kj is the link weight, ∅ k represents the basis trigonometric function, and M is the number of basis functions.

The five-layer structure of the IT2FNC is described as follows:

Layer 1 (input layer): This layer only imports the input data into the next layer.

Layer 2 (membership function layer): This layer performs the fuzzification. Each node in this layer defines an interval type-2 fuzzy set, as displayed in Figure 3.
The Gaussian primary membership function has an uncertain mean in [m ij1 , m ij2 ] and a standard deviation σ ij . The region between the upper bound and the lower bound of the membership degree u (2) ij is called the footprint of uncertainty (FOU). The output of each node is therefore represented by the interval between the lower and upper membership bounds of u (2) ij .
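As an illustration, the interval bounds of such a membership function can be computed as follows. This is a minimal sketch assuming the standard uncertain-mean FOU construction (upper bound equal to one inside the mean interval, lower bound taken from the farther mean); the function names are illustrative, not from the paper.

```python
import math

def gaussian(x, m, sigma):
    """Type-1 Gaussian membership value centred at m."""
    return math.exp(-((x - m) / sigma) ** 2)

def it2_gaussian_bounds(x, m1, m2, sigma):
    """Upper/lower membership bounds of an interval type-2 Gaussian
    set with uncertain mean in [m1, m2] (standard FOU construction)."""
    # Upper bound: 1 inside the mean interval, nearest-mean Gaussian outside.
    if x < m1:
        upper = gaussian(x, m1, sigma)
    elif x > m2:
        upper = gaussian(x, m2, sigma)
    else:
        upper = 1.0
    # Lower bound: Gaussian centred on the farther of the two means.
    if x <= (m1 + m2) / 2:
        lower = gaussian(x, m2, sigma)
    else:
        lower = gaussian(x, m1, sigma)
    return lower, upper
```

For any input, the lower bound never exceeds the upper bound, and inputs falling inside the uncertain-mean interval receive an upper membership of one.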


Layer 3 (firing layer):
Each node is a rule node and uses an algebraic product operation to obtain the firing strength interval of each rule node. The upper and lower firing strengths of rule j are the products ∏ i u (2) ij of the upper membership bounds and of the lower membership bounds, respectively.

Layer 4 (consequent layer): Interval type-2 fuzzy sets are reduced to interval type-1 fuzzy sets through a type-reduction operation. Because the traditional type-2 order reduction method involves highly complex calculation, the centers of sets (COS) method [11] is adopted for implementing the reduction process, in which ∑ M k=1 ω kj ∅ k represents a nonlinear combination of FLNN inputs, ω kj represents the link weight, and ∅ k represents the functional expansion of the FLNN inputs. The functional expansion is based on basis trigonometric functions, where M = 3 × n is the number of basis functions and n is the number of inputs.
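The firing layer and the FLNN consequent can be sketched as follows. The product firing strengths follow directly from the text; the particular expansion {x, sin(πx), cos(πx)} (giving M = 3 × n) is a common FLNN choice and is assumed here, since the paper's exact basis functions are not reproduced.

```python
import math

def firing_bounds(lower_u, upper_u):
    """Layer 3: algebraic-product firing strength interval of one rule,
    given the per-input lower/upper membership bounds."""
    f_lower, f_upper = 1.0, 1.0
    for lo, up in zip(lower_u, upper_u):
        f_lower *= lo
        f_upper *= up
    return f_lower, f_upper

def trig_basis(x):
    """FLNN functional expansion; a common choice is
    {x_i, sin(pi*x_i), cos(pi*x_i)}, giving M = 3 * n basis functions."""
    phi = []
    for xi in x:
        phi.extend([xi, math.sin(math.pi * xi), math.cos(math.pi * xi)])
    return phi

def rule_consequent(weights, phi):
    """y_j = sum_k w_kj * phi_k, as in Equation (1)."""
    return sum(w * p for w, p in zip(weights, phi))
```

With two inputs, the expansion yields six basis functions, and each rule's output is a weighted sum over them.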

Layer 5 (output layer):
The node output of layer 5 is defuzzified by computing the average of the upper and lower outputs of layer 4. The crisp value y is obtained as follows:
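A simplified sketch of the reduction and defuzzification is shown below: the left/right end points are taken as weighted averages using the lower/upper firing strengths, and the crisp output is their mean. This is an assumption for illustration; the exact COS end-point computation may use an iterative switching procedure.

```python
def crisp_output(f_lower, f_upper, y_rule):
    """Simplified COS type reduction plus layer-5 defuzzification.
    f_lower/f_upper: lower/upper firing strengths per rule;
    y_rule: consequent output of each rule."""
    y_l = sum(f * y for f, y in zip(f_lower, y_rule)) / sum(f_lower)
    y_r = sum(f * y for f, y in zip(f_upper, y_rule)) / sum(f_upper)
    # Layer 5: average of the two reduced end points gives the crisp value.
    return (y_l + y_r) / 2.0
```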

Proposed DGDE
DE has the advantages of fast convergence and simple implementation. However, some problems, such as low precision and becoming easily trapped in local optima, are encountered when complex problems are solved. Therefore, an efficient DGDE algorithm is proposed to overcome the shortcomings of traditional DE. The steps of DGDE are described as follows:

Step 1: Initialization and coding
By setting the parameter vectors of DE and randomly initializing the target vector X i,G , the mathematical model is expressed as follows: where i = 1, 2, · · · , NP; NP is the population size; X i,G represents the ith parameter vector in the Gth generation; and D is the number of dimensions. The proposed DGDE is used to adjust the parameters of the IT2FNC. All the parameters in the IT2FNC are coded into one vector, and each vector represents an IT2FNC. The adjustable parameters in each IT2FNC are the uncertainty mean m ij , the standard deviation σ ij , the displacement value of the uncertainty mean d ij , and the link weight ω kj , where d ij = m ij2 − m ij1 > 0. The coding format is presented in Figure 4.
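Step 1 can be sketched as follows. The initialization bounds and the assumption that m ij is the centre of the uncertain-mean interval are illustrative choices, not taken from the paper.

```python
import random

def init_population(np_size, dim, lo=-1.0, hi=1.0):
    """Step 1: randomly initialise NP target vectors X_{i,G} in R^D.
    Each vector encodes all adjustable IT2FNC parameters."""
    return [[random.uniform(lo, hi) for _ in range(dim)]
            for _ in range(np_size)]

def decode_mean_interval(m, d):
    """Recover the uncertain-mean interval [m1, m2] from (m, d), where
    d = m2 - m1 > 0; here m is assumed to be the interval centre."""
    return m - d / 2.0, m + d / 2.0
```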
Step 2: Vector group
The fitness values of all parameter vectors are sorted in descending order, and the initial group number of all vectors is set to zero (Figure 5). The computation of the fitness values for the individuals in the population is discussed later (Equations (24)-(30)). The vector with the highest fitness value is set as the new group leader; its group number is updated from zero to one, as shown in Figure 6. On the basis of the average distance difference and the average fitness difference between the ungrouped vectors (i.e., group number 0) and the group leader, the threshold of similarity comprises the threshold values of distance and fitness, ADIS g and AFIT g , where D represents the encoded dimension, NP denotes the number of parameter vectors, L g j is the jth dimension of the gth group leader, N I is the total number of ungrouped vectors, and ADIS g and AFIT g represent the distance threshold value and the fitness threshold value of the gth group, respectively.
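The grouping procedure of Step 2 can be sketched as follows, assuming Euclidean distance between parameter vectors and average distance/fitness differences to the current leader as the thresholds ADIS g and AFIT g (the paper's exact threshold equations are not reproduced here).

```python
def euclidean(a, b):
    """Euclidean distance between two parameter vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def group_vectors(vectors, fitness):
    """Step 2 (sketch): dynamic grouping.  Ungrouped vectors join the
    current leader's group when both their distance and fitness
    differences fall below the group's average thresholds."""
    order = sorted(range(len(vectors)), key=lambda i: fitness[i], reverse=True)
    group = {i: 0 for i in order}          # 0 = ungrouped
    g = 0
    while any(group[i] == 0 for i in order):
        g += 1
        # Best ungrouped vector becomes the leader of a new group.
        leader = next(i for i in order if group[i] == 0)
        group[leader] = g
        rest = [i for i in order if group[i] == 0]
        if not rest:
            break
        adis = sum(euclidean(vectors[i], vectors[leader]) for i in rest) / len(rest)
        afit = sum(abs(fitness[i] - fitness[leader]) for i in rest) / len(rest)
        for i in rest:
            if (euclidean(vectors[i], vectors[leader]) < adis
                    and abs(fitness[i] - fitness[leader]) < afit):
                group[i] = g
    return group
```

Vectors close to a leader in both distance and fitness share its group; distant vectors eventually seed groups of their own.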
The difference of the distance values Dis i and the difference of the fitness values Fit i between the ungrouped vectors and the leader vectors are calculated as follows: if the conditions Dis i < ADIS g and Fit i < AFIT g are both satisfied for a vector, the vector is similar to the gth group leader; these vectors are grouped together, and their group number is updated to g. Otherwise, the vectors are not grouped. Figure 7 shows that the vectors X 2,G and X i,G are similar to the 1st group leader. Next, if any ungrouped vectors remain, the ungrouped vector with the highest fitness value is set as the leader of a new group. Figure 8 shows that the vector X 3,G has the highest fitness value among the ungrouped vectors and is set as a new group leader (i.e., the 2nd group leader). The grouping process is completed when no ungrouped vectors remain.

Step 3: Mutation
Two new mutation methods are proposed to overcome the disadvantages of the traditional DE algorithms. The modified formulas are expressed as follows:

Mutation method 1:

Mutation method 2:

where F is the mutation weight factor, which is set to 0.5 as a general rule; X best,G is the best fitness vector; and X rL,G is a random leader selected from all the group leaders.
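The two mutation methods can be sketched as follows. The exact forms of Equations (19) and (20) are not reproduced in this text, so the difference-vector arrangements below are hypothetical, chosen only to match the description: method 1 uses a random leader as the base vector, and method 2 uses the best vector as the base vector together with three random vectors and one random leader.

```python
import random

F = 0.5  # mutation weight factor, set to 0.5 as a general rule

def mutate_method1(pop, leaders):
    """Mutation method 1 (sketch of Eq. (19)): base vector is a random
    group leader X_{rL,G}; hypothetical form X_rL + F*(X_r1 - X_r2)."""
    r1, r2 = random.sample(range(len(pop)), 2)
    x_rl = random.choice(leaders)
    return [x_rl[j] + F * (pop[r1][j] - pop[r2][j]) for j in range(len(x_rl))]

def mutate_method2(pop, best, leaders):
    """Mutation method 2 (sketch of Eq. (20)): base vector is the best
    vector X_{best,G}, combined with three random vectors and one
    random leader (hypothetical arrangement)."""
    r1, r2, r3 = random.sample(range(len(pop)), 3)
    x_rl = random.choice(leaders)
    return [best[j] + F * (pop[r1][j] - pop[r2][j])
                    + F * (pop[r3][j] - x_rl[j]) for j in range(len(best))]
```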
Because the traditional DE method easily falls into local optima, this study employed the random leader as the base vector to effectively increase the search ability, as presented in Equation (19). In Equation (20), the best vector is set as the base vector, and three random vectors and one random leader vector are used. The mutated vector in Equation (20) revolves around the best vector and enhances the search ability in the solution space.

Step 4: Recombination
The recombination operation crosses the mutation vector with the target vector and generates a new vector V i,G+1 . The definition of V i,G+1 is expressed as follows: where rand j (0, 1) represents a random value between zero and one in the jth dimension, and CR is the crossover rate. A higher CR value yields a higher similarity between the new vector and the mutation vector.
Step 5: Selection
The fitness values of the vectors are evaluated for selecting the target vectors of the next generation. If the fitness value of a new vector is less favorable than that of the current target vector, the target vector remains in the next generation. The selection operation is described as follows: The flowchart of DGDE is presented in Figure 9.
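Steps 4 and 5 can be sketched as follows, assuming the standard binomial crossover of DE (in which one dimension is conventionally forced from the mutant so that the trial vector differs from the target); the function names are illustrative.

```python
import random

def recombine(target, mutant, cr):
    """Step 4 (sketch): binomial crossover.  Each dimension takes the
    mutant's value with probability CR; dimension j_rand is forced
    from the mutant by convention."""
    j_rand = random.randrange(len(target))
    return [mutant[j] if (random.random() < cr or j == j_rand) else target[j]
            for j in range(len(target))]

def select(target, trial, fitness_fn):
    """Step 5: keep the trial vector only if it is at least as fit;
    otherwise the target vector survives into the next generation."""
    return trial if fitness_fn(trial) >= fitness_fn(target) else target
```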
Figure 9. Flowchart of the proposed dynamic group differential evolution (DGDE).

Wall-Following Control of Mobile Robots
Recently, some researchers have proposed wall-following control of mobile robots using reinforcement learning. Jhang et al. [21] used an interval type-2 recurrent fuzzy cerebellar model articulation controller (IT2RFCMAC) and adopted a Takagi-Sugeno-Kang (TSK) model in its consequent part. The TSK model is a linear function and is simple to implement. However, for complex problems, nonlinear functions perform better than linear functions. Therefore, in this study, the nonlinear functional link neural network of the proposed IT2FNC was used as the consequent part to improve the control performance. The proposed IT2FNC based on the DGDE was used to implement the wall-following control of the mobile robot. A reinforcement learning strategy was used to adjust the controller parameters of the proposed IT2FNC. The block diagram for the wall-following control of the mobile robot is presented in Figure 10. The four input signals of the proposed IT2FNC are the S 0 , S 1 , S 2 , and S 3 distances, which are measured by the infrared sensors. The outputs of the IT2FNC are the rotational speeds V L and V R of the two wheels.
To allow mobile robots to be used in different environments, the training environment in this study featured straight lines, corners, right-angled corners, and slope lines. Figure 11 presents the 1.7 × 1.6 m² training environment. To avoid collision with obstacles and deviation from the wall during the wall-following control learning process, three terminal conditions of wall-following control learning were specified:

1. If the total moving distance of the mobile robot was larger than the predefined maximal distance of the training environment, the mobile robot successfully moved in a circular path in an unknown environment.

2. The mobile robot collided with the wall when the measured distance from any infrared sensor was less than 1 cm, as displayed in Figure 12a.

3. The mobile robot deviated from the wall when the measured distance S 2 was greater than 6 cm, as displayed in Figure 12b.

The proposed DGDE was used to train the parameters of the IT2FNC. Each vector in the DGDE represents a solution of an IT2FNC. When any terminal condition during the wall-following control learning process is satisfied, a fitness function is used to evaluate the performance of the mobile robot. The proposed fitness function comprises three sub-fitness functions—SF 1 , SF 2 , and SF 3 —which represent the total moving distance, the distance between the robot and the wall, and the degree of parallelism between the robot and the wall, respectively. The three sub-fitness functions are defined as follows:

(1) SF 1 : If the moving distance R dis is greater than the predefined value R stop , the robot has successfully moved around a circular path in the training environment, and R dis is set to R stop . The sub-fitness function SF 1 is defined as follows:

(2) SF 2 : The goal of the wall-following control is to maintain a fixed distance between the side of the robot and the wall. Therefore, the sub-fitness function SF 2 is defined as the average of WD(t), where WD(t) represents the distance between the side of the robot and the wall at each time step, d wall is the pre-defined fixed distance (i.e., d wall = 4 cm), as presented in Figure 13a, and T stop is the total number of time steps in a learning process. If the robot remains at the fixed distance from the wall, the SF 2 value is equal to zero.
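The first two sub-fitness functions can be sketched as follows. The exact equations are not reproduced in this text, so a normalised distance form for SF 1 and an average absolute deviation from d wall for SF 2 are assumed; both are consistent with the stated behaviour (SF 2 = 0 when the robot holds the fixed distance).

```python
def sf1(r_dis, r_stop):
    """SF1 (sketch): total moving distance, capped at R_stop and
    normalised; a hypothetical form consistent with the description."""
    return min(r_dis, r_stop) / r_stop

def sf2(wall_distances, d_wall=4.0):
    """SF2 (sketch): average deviation WD(t) of the robot-wall distance
    from the target d_wall = 4 cm over T_stop time steps; zero when the
    fixed distance is held perfectly."""
    return sum(abs(wd - d_wall) for wd in wall_distances) / len(wall_distances)
```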
(3) SF 3 : This sub-fitness function was used for evaluating the degree of parallelism between the robot and the wall. If the robot is parallel to the wall, the angle θ between the robot and the wall is 90°. On the basis of the law of cosines, x(t) must have the same value as RS 2 , as presented in Figure 13b.
where r is the radius of the robot; δ 1 and δ 2 represent the distances between sensor S1 and the wall and between sensor S2 and the wall, respectively; and SF 3 represents the average value of the degree of parallelism during movement. If the robot is parallel to the wall, SF 3 is equal to zero.
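A sketch of SF 3 is shown below. Because the paper's law-of-cosines derivation of x(t) is not reproduced here, the average absolute difference between the two side-sensor distances δ 1 and δ 2 is assumed as the parallelism measure; it is likewise zero when the robot is parallel to the wall.

```python
def sf3(delta1_series, delta2_series):
    """SF3 (sketch): average degree of parallelism over a run, taken
    here as the mean |delta1 - delta2| between the two side-sensor
    distances (an assumed proxy; zero when parallel to the wall)."""
    diffs = [abs(d1 - d2) for d1, d2 in zip(delta1_series, delta2_series)]
    return sum(diffs) / len(diffs)
```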
Therefore, the proposed fitness function is defined as follows:

Figure 13. Definition of (a) and (b) degree of parallelism.

Figure 14 shows the block diagram of the learning process of the wall-following control. Each solution represents an IT2FNC controller, which is evaluated by the fitness function. The parameters of the IT2FNC are automatically adjusted using evolutionary strategies. The best solution is replaced when a better solution exists in each generation.


Experimental Results of the Wall-Following Control
To verify the effectiveness of the proposed method, the performance of the WFM controller while using the proposed DGDE-1 (mutation method 1) and DGDE-2 (mutation method 2) were compared with the performance of the WFM control while using other methods. Each method was evaluated 10 times to verify the stability of each algorithm.
The initial parameters of the DGDE are the number of the population (NP), crossover rate, generation, weighting factor of mutation, and number of fuzzy rules, as presented in Table 1. Moreover, we considered different fuzzy rule numbers for performance evaluation. Table 2 presents the performance evaluation results of different fuzzy rule numbers. The IT2FNC with six fuzzy rules was more efficient than those with five and seven fuzzy rules.   Table 3 presents the performance evaluations of different algorithms. In this table, the performance indexes include the best fitness function, the worst fitness function, the average fitness function, the standard deviation (STD), the number of successful runs, and computation time for one training. The number of successful runs is the number of times the robot moved successfully around a circular path in the training environment. Figure 15 presents the learning curves of the WFM control when various evolutionary algorithms are used. The proposed DGDE achieved superior fitness values and successful runs than other methods. On the other hand, the proposed method also compares with the IT2RFCMAC controller [21]. Table 4 shows that the proposed IT2FNC with nonlinear functional link neural network performs better than IT2RFCMAC with linear TSK architecture [21].  To verify the WFM control performance of different learning algorithms, a training environment and two unknown testing environments were created, as presented in Figures 16-18. The testing environment in Figure 17 focuses on many difficult and large curves, whereas the other testing environment in Figure 18 focuses on many right angle curves. The fitness value was used to evaluate the WFM, and the detailed comparison results are presented in Table 5. To verify the WFM control performance of different learning algorithms, a training environment and two unknown testing environments were created, as presented in Figures 16-18. 
Figure 15. Learning curves of the wall-following control when various evolutionary algorithms are employed.

Cooperative Carrying and Navigation Control of Multi-Evolutionary Mobile Robots
In this section, the cooperative carrying and navigation control of multi-evolutionary mobile robots is discussed. Figure 19 shows that the distance R d between the two robots was set to 15 cm and a rectangular object was placed on the two robots. The front and rear robots represent the leader and the follower, respectively. In the experiments, the leader explored the environment ahead, and the follower assisted the leader in lifting the object to achieve obstacle avoidance and to prevent the object from being dropped during cooperative carrying.


Figure 19. Cooperative carrying of two mobile robots.

Wall-Following Control of the Cooperative Carrying Method
A dual controller is proposed for the cooperative carrying of two mobile robots. An auxiliary controller was incorporated in the follower robot to learn the WFM of cooperative carrying; the block diagram is presented in Figure 20. The auxiliary controller contains five input signals and two output signals. The inputs are the distances sensed by the follower's sensors (S 0 , S 1 , S 2 , and S 3 ) and the distance between the two robots (R d ). The outputs are the rotational speeds V L and V R of the two wheels. To implement cooperative carrying in an unknown environment, the training environment featured straight lines, smooth curves, U-shaped curves, and continuous curves to train the follower's auxiliary controller. Figure 21 displays the 1.5 × 1.4 m 2 training environment. To avoid collisions with obstacles and the object being dropped during the cooperative carrying learning process, five terminal conditions were specified:
(1) If the measured distance from one of the sensors of the follower robot is less than 1 cm, the follower robot collides with an obstacle.
(2) If the measured distance of sensor S 2 is greater than 6 cm, the follower robot deviates from the wall.
(3) If the measured distance R d between the leader robot and the follower robot is less than 10 cm or greater than 20 cm, the two robots are inferred to be too close or too far apart.
(4) If the measured distance of the sensor is less than the height R h of the robot, the object has been dropped by the robots.
(5) If the measured distance S wall of the sensor is less than 1 cm or greater than 7.5 cm, the object approaches the wall or deviates from the wall.
When one of the aforementioned conditions is satisfied, the cooperative carrying of the robots has failed. If the robots engaging in cooperative carrying successfully move around a circular path in the training environment, the auxiliary controller completes the training process. The number of time steps for which the robot moves in the training environment is used to define the fitness function F(·).
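The five terminal conditions above can be sketched as a single check. This is a minimal illustration: the sensor names and thresholds follow the text, while the function name, the upward-facing sensor reading `s_top` used for condition (4), and the data layout are assumptions.

```python
def carrying_failed(sensors, r_d, s_top, s_wall, r_h):
    """Return True if any of the five terminal conditions is met.

    sensors: distances (cm) measured by the follower's sensors S0-S3
    r_d:     distance (cm) between the leader and follower robots
    s_top:   distance (cm) from an assumed upward-facing sensor to the object
    s_wall:  distance (cm) between the carried object and the wall
    r_h:     height (cm) of the robot
    """
    if min(sensors) < 1.0:            # (1) follower collides with an obstacle
        return True
    if sensors[2] > 6.0:              # (2) follower (sensor S2) deviates from the wall
        return True
    if r_d < 10.0 or r_d > 20.0:      # (3) robots too close or too far apart
        return True
    if s_top < r_h:                   # (4) the carried object has been dropped
        return True
    if s_wall < 1.0 or s_wall > 7.5:  # (5) object too close to / too far from the wall
        return True
    return False
```

A training episode would terminate, and count as a failure, as soon as this check returns True.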

Navigation Control of Cooperative Carrying
This paper also proposes an effective navigation control method for cooperative carrying in unknown environments. The manager mode automatically selects between the TGM and WFM on the basis of the relative position of the mobile robot and the target location.

(1) Toward-Goal Mode
In the navigation control in the unknown environment, the robot used infrared sensors to detect objects. To turn toward the goal position, the mobile robot calculates the angle difference θ TG between its current direction and the target direction, as presented in Figure 22: θ TG = θ Goal − θ Robot , where θ Robot is the angle between the mobile robot and the x axis and θ Goal is the angle between the goal and the x axis in the inertial coordinate system.
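The angle difference θ TG can be computed from the robot and goal positions as follows. This is a sketch: the function name and the wrapping of the result into (−π, π] (so the robot always takes the shorter turn) are assumptions not stated in the text.

```python
import math

def toward_goal_angle(robot_pos, goal_pos, theta_robot):
    """Angle difference theta_TG = theta_Goal - theta_Robot, wrapped to (-pi, pi].

    robot_pos, goal_pos: (x, y) positions in the inertial coordinate frame
    theta_robot:         robot heading (rad) measured from the x axis
    """
    theta_goal = math.atan2(goal_pos[1] - robot_pos[1],
                            goal_pos[0] - robot_pos[0])
    theta_tg = theta_goal - theta_robot
    # Wrap into (-pi, pi] so a small misalignment yields a small turn command.
    while theta_tg <= -math.pi:
        theta_tg += 2 * math.pi
    while theta_tg > math.pi:
        theta_tg -= 2 * math.pi
    return theta_tg
```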

To avoid the object being dropped during the cooperative carrying process, both the follower and the leader move in the same direction and at the same speed, as displayed in Figure 23.

(2) Manager Mode
The area around the robot is divided into three zones (O 1 , O 2 , and O 3 ), as displayed in Figure 24. In the process of the TGM, the manager mode switches to the left WFM or the right WFM on the basis of the zone O i in which the goal direction is located and of which sensor S i (i = 0, 1, 2, . . . , 7) detects an obstacle.


• Left wall-following control: (i) the goal direction is located at O 1 , and S 7 or S 6 detects obstacles; (ii) the goal direction is located at O 2 , and S 7 , S 6 , and S 5 detect obstacles.
• Right wall-following control: (i) the goal direction is located at O 1 , and S 0 or S 1 detects obstacles; (ii) the goal direction is located at O 3 , and S 0 , S 1 , and S 2 detect obstacles.
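The switching rules above can be sketched as a small decision function. The zone labels and sensor indices follow the text; the function name, the set-based encoding of triggered sensors, the 'TGM' fallback when no rule fires, and the order in which the rules are checked are assumptions for illustration.

```python
def select_mode(goal_zone, obstacles):
    """Manager-mode switching from TGM to a wall-following mode.

    goal_zone: zone of the goal direction: 'O1', 'O2', or 'O3'
    obstacles: set of indices i for which sensor S_i currently detects an obstacle
    Returns 'LEFT_WFM', 'RIGHT_WFM', or 'TGM' (no rule fired; keep moving to goal).
    """
    # Left wall-following: goal in O1 with S7 or S6 firing,
    # or goal in O2 with S7, S6, and S5 all firing.
    if goal_zone == 'O1' and obstacles & {6, 7}:
        return 'LEFT_WFM'
    if goal_zone == 'O2' and {5, 6, 7} <= obstacles:
        return 'LEFT_WFM'
    # Right wall-following: goal in O1 with S0 or S1 firing,
    # or goal in O3 with S0, S1, and S2 all firing.
    if goal_zone == 'O1' and obstacles & {0, 1}:
        return 'RIGHT_WFM'
    if goal_zone == 'O3' and {0, 1, 2} <= obstacles:
        return 'RIGHT_WFM'
    return 'TGM'
```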
Before switching to the WFM, the manager mode determines which robot is closest to the obstacle. If the leader robot encounters an obstacle, it executes a pre-rotation process. The aim of this process is to enable the two robots to approach the wall for smooth navigation control, as displayed in Figure 25.

The leader robot turns so that it is positioned parallel to the wall and records the rotation angle θ L ; this message is then sent to the follower robot. The follower robot turns by π − θ L , maintains a fixed distance from the leader robot, and moves toward the obstacle. The manager modes of the two robots switch to the WFM once the pre-rotation process of the follower robot is completed; the pre-rotation process is presented in Figure 26. If the follower robot encounters an obstacle, the manager mode switches to the WFM directly; this process is displayed in Figure 27. If the target direction is located in O 1 of the follower robot and the distance between the right-side sensor S 1 (or the left-side sensor S 6 ) and the wall is greater than 6 cm, the object held by the two robots is inferred to have passed the obstacle; the manager mode then switches to the TGM, as displayed in Figure 28.

Experimental Results of Cooperative Carrying Control
In this subsection, the proposed method is used to verify the success of the cooperative carrying control in unknown environments. Figure 29 demonstrates that the robots complete the wall-following control of cooperative carrying in the training environment. Moreover, to verify the performance of the navigation control, two different test environments were created for testing whether the robots successfully accomplished cooperative carrying and navigation control. The experimental results for the two test environments are presented in Figure 30. In this experiment, the average distance (RD) between the two robots and the average distance (FWD) between the follower robot and the wall were evaluated; the results are presented in Table 6. If the RD is large, the two robots (i.e., the leader robot and the follower robot) are not at a suitable distance during cooperative transport control, and the object drops easily. Moreover, if the FWD is very large or very small, the robots pass the curves with poor efficiency, and the object is easily shifted and dropped.
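The two evaluation indexes can be computed from per-step logs as follows. The index names RD and FWD mirror the text; the log format (one sample per control step), the nominal 15 cm spacing default, and the added deviation-from-nominal index are assumptions for illustration.

```python
def carrying_metrics(rd_samples, fwd_samples, rd_nominal=15.0):
    """Evaluation indexes for one cooperative-carrying trial.

    rd_samples:  robot-to-robot distances (cm), one sample per control step
    fwd_samples: follower-to-wall distances (cm), one sample per control step
    rd_nominal:  nominal spacing between the two robots (15 cm in the text)

    Returns (average RD, average FWD, mean absolute deviation of RD from
    nominal). The deviation index is an added illustration, not from the text.
    """
    avg_rd = sum(rd_samples) / len(rd_samples)
    avg_fwd = sum(fwd_samples) / len(fwd_samples)
    rd_dev = sum(abs(d - rd_nominal) for d in rd_samples) / len(rd_samples)
    return avg_rd, avg_fwd, rd_dev
```

A large average RD or RD deviation indicates the robots failed to hold their spacing, which is the condition under which the object drops easily.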

Conclusions
Aiming at the navigation control for cooperative carrying in an unknown environment, this study proposed an IT2FNC based on a dynamic group differential evolution (DGDE) to realize the carrying control and WFM control for mobile robots. The developed DGDE learning algorithm adopts dynamic grouping and local search methods, which enhance the search ability and convergence stability of the traditional DE method, and is used to adjust the IT2FNC parameters. On this basis, a manager mode is established to assist the mobile robots in navigation control. The manager mode automatically selects between the WFM and the TGM based on the relative position of the mobile robot and the target location. In addition, a pre-rotation mechanism was employed to accomplish cooperative carrying control. In the training process, the best fitness value, the worst fitness value, the average fitness value, the standard deviation (STD), the number of successful runs, and the computation time of the proposed DGDE were 0.962, 0.919, 0.942, 0.009, 10, and 4:38:56, respectively. Although DE and ABC required less computation time than the proposed method, they achieved only eight successful runs out of 10. The experimental results revealed that the proposed method achieved superior WFM performance compared with the other methods and successfully accomplished the navigation control of cooperative carrying to target locations in unknown environments. Because training the controller through reinforcement learning requires a long time, future research will implement the proposed learning algorithm on a chip to improve the learning speed.