Classification and Recognition of Goat Movement Behavior Based on SL-WOA-XGBoost

Abstract: Monitoring goat movement behavior (lying, standing, walking, and running) by manual observation is time-consuming, labor-intensive, and inaccurate. Using three-axis acceleration data collected from a point on the goat's back, this paper proposes a behavior recognition method that combines the Whale Optimization Algorithm (WOA), enhanced with a social learning (SL) strategy, and XGBoost. In this method, the XGBoost parameters are optimized by the WOA combined with social learning strategies to improve classification and recognition accuracy. The results show that the recognition rate of lying behavior reached 97.14% and the average recognition rate of the four movement behaviors was 94.42%, meeting the requirements of goat movement behavior recognition. Compared with the conventional XGBoost algorithm, the average recognition rate increased by 3.41%. The results of this study can provide a reference for goat health assessment and intelligent disease warning.


Introduction
The abundant grassland resources of Inner Mongolia have allowed the sheep husbandry industry to flourish, which has greatly promoted the economic development of the region [1]. In recent years, sheep breeding has gradually changed from traditional stocking to large-scale house feeding and captivity, and goat breeding has likewise changed from traditional stocking to large-scale breeding. As a result, the density of goat herds has increased and their movement space has been compressed, so a sufficient amount of exercise and a proper living environment are difficult to guarantee, which makes goats extremely susceptible to diseases and affects their growth and health [2,3]. Because goats tolerate illness well, symptoms are not obvious at the early stage of disease, but once goats become sick, the frequency and duration of their lying, standing, walking, and running behaviors change. By observing the movement behavior of goats over time, their health status can be indirectly predicted, and appropriate response measures can then be taken in time to ensure their health.
The traditional method is to observe the behavior of animals based on the experience of breeders, but this requires a lot of manpower, with high work intensity and extremely low efficiency, which can no longer meet the needs of large-scale breeding [4]. With technological developments and updates, visual detection has gradually transitioned to automatic monitoring. At present, mainstream automatic monitoring methods are mainly based on various sensors to establish a data-driven identification model, which has been

Subjects
The trial was carried out at Yi Wei White Velvet Goat LLC in Inner Mongolia between 2020 and 2021. The 10-month-old Albas cashmere goats were in basically the same physical condition, healthy and disease-free. Each one wore a data collection device to obtain motor behavior data. During data collection, video surveillance equipment was used to simultaneously calibrate the behavior of goats, determine behavior categories, and verify the effect of the classification algorithm proposed in this paper based on the calibration results.

Data Acquisition
The MPU9250 three-axis acceleration sensor was selected for data acquisition in the test [12]. The collected data were transmitted to a PC through a wireless module, and then recorded and stored. The data acquisition device and collection point are shown in Figure 1. The data acquisition device is composed of the three-axis acceleration sensor module, a power module, and a wireless module, and is enclosed in a waterproof box, which is fixed on the goat's back with straps. In this paper, we stipulate that when a test goat is wearing the data acquisition device and standing still, the positive direction of the X-axis is the direction pointing to the goat's head, the positive direction of the Y-axis is the direction pointing to the right side of the goat's body, and the positive direction of the Z-axis is vertical to the ground. The three positive directions are indicated in Figure 1b.

Data Processing
For collected raw data, noise reduction is often required [13]. In addition, because the position of the data acquisition device shifts as the goat moves, the acceleration values in the three directions deviate when the goat is standing still, so the data must be corrected [14]. Let $\theta$ be the included angle between the positive direction of the sensor's Z-axis and gravitational acceleration $g$, let $\alpha$ be the included angle between the projection of $g$ onto the XOY reference plane and the positive direction of the X-axis, and let $a_{1x}$, $a_{1y}$, and $a_{1z}$ be the values of the three-axis acceleration sensor in the static state in the X, Y, and Z directions, respectively. According to spatial geometry, $\alpha$ and $\theta$ can be expressed in terms of $a_{1x}$, $a_{1y}$, and $a_{1z}$. The collected values of the three-axis acceleration sensor in the three directions are $a_x$, $a_y$, and $a_z$, and the corrected values are $A_x$, $A_y$, and $A_z$, with

$$A_y = (a_x \cos\alpha + a_y \sin\alpha)\sin\theta - a_y \cos\theta \sin\alpha,$$

$$A_z = a_z \sin\theta - a_x \cos\theta + \cos\theta.$$
The acceleration values in the three directions before and after correction are shown in Figure 2. It can be seen that the corrected values are close to the values before correction. The acceleration of different behaviors varies greatly in each direction. For example, goats may have smaller accelerations when lying and standing, and relatively larger accelerations when walking and running. The frequency characteristics of acceleration data for different behaviors are also different in all directions. Active behaviors such as walking and running produce higher frequency changes in acceleration, while stationary behaviors such as lying and standing may exhibit low-frequency or near-stable characteristics. Looking at the temporal relationship of accelerations under different behaviors can also help to distinguish them. For example, changes in acceleration in walking and running behaviors may have distinct repetitive patterns or periodicity, while lying and standing behaviors may exhibit relatively flat or no apparent periodicity. For goat behavior recognition, acceleration data alone may have certain limitations, so this paper added acceleration sensor data to improve the accuracy of behavior recognition. In addition, a machine learning training model, XGBoost, was built to learn and classify these behaviors.
Lying behavior: When the goat is lying, the motion amplitude is very small, and the acceleration values in the three directions are close to the initial values of the sensor.
Standing behavior: When the goat stands, it is usually feeding as well, with a certain shift of the body, a certain change in the vertical direction, and a longer duration.
Walking behavior: When the goat is walking, the acceleration value in the forward direction changes noticeably. As the forward direction changes, the lateral and vertical acceleration values also change to some extent.
Running behavior: When the goat is running, the acceleration in all three directions changes dramatically, especially in the vertical direction. As the goat jumps, the acceleration value in the vertical direction fluctuates more than the forward and lateral values.
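The qualitative differences described above (amplitude, fluctuation, and periodicity) can be turned into simple per-axis numbers. The following is only a minimal sketch: the paper does not list its features, so the function name `axis_features`, the three chosen statistics, and the 50 Hz sampling rate are assumptions made for illustration.

```python
import math
import statistics


def axis_features(samples, sample_rate_hz):
    """Summarize one axis of an acceleration window (illustrative features only)."""
    mean = statistics.fmean(samples)
    centered = [s - mean for s in samples]
    # Count sign changes of the mean-removed signal as a rough periodicity cue.
    crossings = sum(
        1 for a, b in zip(centered, centered[1:]) if (a < 0) != (b < 0)
    )
    duration_s = len(samples) / sample_rate_hz
    return {
        "std": statistics.pstdev(samples),             # fluctuation around the mean
        "peak_to_peak": max(samples) - min(samples),   # motion amplitude
        "crossing_hz": crossings / (2.0 * duration_s), # two crossings per cycle
    }


# Hypothetical 2 s window sampled at 50 Hz containing a 2 Hz oscillation.
signal = [math.sin(2 * math.pi * 2 * (i / 50) + 0.3) for i in range(100)]
features = axis_features(signal, sample_rate_hz=50)
```

Under these assumptions, a lying or standing window would be expected to give small `std` and `peak_to_peak` values, while walking and running windows would give larger values and a clearer crossing rate.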

XGBoost Algorithm
XGBoost is an ensemble learning algorithm that combines multiple decision tree models and trains them with gradient boosting to solve multi-class classification problems. When training the XGBoost model, the main goal is to determine the best tree structure, and the core problem is to find the optimal split point: a greedy algorithm calculates the gain by comparing the objective before and after the split to decide whether a node should be split. The parameters of the model are then selected based on the objective function. The XGBoost model is finally obtained by training according to the above process, and the process of using this model for prediction is shown in Figure 3.
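The gain comparison mentioned above is not written out in the text. As a sketch, the split gain commonly given for XGBoost's regularized objective can be computed as follows, where the sums of first-order gradients (`g_*`) and second-order gradients (`h_*`) of the samples on each side of the split are assumed to be available; the function names are illustrative.

```python
def leaf_score(grad_sum, hess_sum, reg_lambda):
    """Structure score of a leaf: G^2 / (H + lambda)."""
    return grad_sum ** 2 / (hess_sum + reg_lambda)


def split_gain(g_left, h_left, g_right, h_right, reg_lambda=1.0, gamma=0.0):
    """Gain of splitting one node into a left and a right child."""
    parent = leaf_score(g_left + g_right, h_left + h_right, reg_lambda)
    children = (leaf_score(g_left, h_left, reg_lambda)
                + leaf_score(g_right, h_right, reg_lambda))
    # gamma is the penalty paid for adding one more leaf.
    return 0.5 * (children - parent) - gamma


# A split that separates opposite gradients is worth making when gamma is small...
gain_cheap_leaf = split_gain(-4.0, 3.0, 4.0, 3.0, reg_lambda=1.0, gamma=0.0)
# ...and is rejected (negative gain) when the leaf penalty is large.
gain_costly_leaf = split_gain(-4.0, 3.0, 4.0, 3.0, reg_lambda=1.0, gamma=5.0)
```

A node is split only when the gain is positive, which is how the penalty factors influence the tree structure.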
In the model, $y_0(x)$ is the base classifier, $K$ is the number of iterations, $T_k$ is the number of leaf nodes of the classification and regression tree (CART) in the $k$th iteration, $\omega_{j,k}$ is the value assigned to the samples at the $j$th node of the $k$th iteration, and $\eta$ is the learning rate.
The XGBoost model accepts a two-dimensional feature matrix as input. The goat movement behavior features extracted from the three-dimensional data form the columns of the input matrix, and each row is the one-dimensional vector obtained by flattening the three-dimensional data. In this way, the three-dimensional data are converted into a two-dimensional matrix and passed to the XGBoost model for training and prediction. When the XGBoost model is trained, the objective function to be optimized can be expressed as

$$Obj = \sum_{i} l(\hat{y}_i, y_i) + \sum_{k} \Omega(f_k), \qquad \Omega(f_k) = \gamma T + \frac{1}{2}\lambda \sum_{j=1}^{T} \omega_j^2,$$

where $y_i$ is the true value, $\hat{y}_i$ is the predicted value, $l(\hat{y}_i, y_i)$ is the loss function of $\hat{y}_i$ and $y_i$, $\Omega(f_k)$ is a regularization term, $\gamma$ and $\lambda$ are penalty factors, $T$ is the number of leaf nodes of the CART tree, and $\omega_j$ is the output score of the $j$th leaf node.
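The conversion of three-dimensional data into a two-dimensional input matrix can be sketched as below. The layout assumed here (a list of windows, each a list of (x, y, z) samples) is an interpretation of the description, not a structure stated in the paper, and `flatten_windows` is a hypothetical helper name.

```python
def flatten_windows(windows):
    """Turn (n_windows, n_samples, 3) nested data into (n_windows, n_samples * 3) rows."""
    return [
        [value for sample in window for value in sample]
        for window in windows
    ]


# Two hypothetical windows of two (x, y, z) samples each.
windows = [
    [(1, 2, 3), (4, 5, 6)],
    [(7, 8, 9), (10, 11, 12)],
]
matrix = flatten_windows(windows)
```

Each row of `matrix` then corresponds to one labeled window and can be passed to a tree-based classifier together with the behavior labels.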
It can be seen from the above equation that XGBoost model training involves many empirical parameters, such as the learning rate η and the penalty coefficients γ and λ. If default values are used, it is difficult to ensure the quality of the model's predictions, so these parameters need to be optimized [10]. The trained XGBoost model was evaluated on a test set of 23,475 samples, including 5941 of lying behavior, 5455 of standing behavior, 6151 of walking behavior, and 5928 of running behavior. The results of goat locomotion behavior recognition based on XGBoost are shown in Tables 1 and 2. From Table 1, it can be seen that the recognition rate of lying behavior is the highest at 93.97% and that of standing behavior is the lowest at 81.63%. The high rate for lying may be because the change of acceleration in the three directions during recumbent rest is small compared with the other three behaviors. The low rate for standing may be because goats are nearly static when standing, so the change in acceleration is small and resembles that of lying. The average recognition rate of the four behaviors was only 89.37%, so there is room for further improvement.
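As a small illustration of how per-behavior recognition rates relate to a confusion matrix, the sketch below computes each rate as the fraction of a behavior's samples that were predicted correctly. The toy matrix is made up, and taking the average as the unweighted mean of the per-class rates is an assumption, since the text does not say how the average was formed. The class counts are the test-set sizes quoted above.

```python
def recognition_rates(confusion):
    """confusion[i][j]: number of samples of true class i predicted as class j."""
    rates = [row[i] / sum(row) for i, row in enumerate(confusion)]
    return rates, sum(rates) / len(rates)


# Test-set sizes reported in the text; their sum should equal the stated total.
test_counts = {"lying": 5941, "standing": 5455, "walking": 6151, "running": 5928}
total_samples = sum(test_counts.values())

# Hypothetical two-class confusion matrix, for illustration only.
toy_confusion = [[9, 1],
                 [2, 8]]
rates, average = recognition_rates(toy_confusion)
```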

WOA
The Whale Optimization Algorithm (WOA) is a search algorithm that simulates whale predation. It uses two methods to randomly engage in bubble-net attacks, and then uses this random behavior to achieve global search [15]. The WOA was used to increase the global search capability of the XGBoost model as well as the applicability of the algorithm, so that the XGBoost model combined with the WOA can be applied to other datasets in order to find the optimal solution or a near-optimal solution.
WOA is an optimization algorithm based on whale foraging behavior, while goat behavior recognition involves animal behavior recognition and image processing technology. The following is a basic framework for using WOA to achieve the identification of goat behavior. First of all, for data collection and labeling, it is necessary to collect and label a large number of goat behavior data, such as eating grass, running, lying, and other behaviors. Then, feature extraction is carried out, using XGBoost model to extract features from goat behavior images. The third step is data preprocessing, that is, noise reduction described in Section 2.3 of this paper. Finally, WOA training is carried out. The preprocessed features are taken as input and WOA is used for training. WOA simulates whale foraging behavior and searches for the best solution by adjusting the location of candidate solutions. Here, the candidate solution can be seen as a classifier or recognition model of goat behavior. WOA is an optimization algorithm that uses a portion of the labeled data set to evaluate the model and calculate metrics such as accuracy and recall rate. According to the evaluation results, the parameters of WOA or the structure of the model are adjusted to further optimize the model. Goat behavior recognition is designed to recognize the new goat behavior image through the trained model. After the image is input into the model, the model will output the corresponding behavior category, that is, realize the recognition of goat movement behavior.
The specific principles of the WOA are as follows:
(1) Encircle prey
When searching for prey, since the location of the prey is unknown, the whale with the best fitness is first assumed to indicate the location of the prey, and the rest of the individuals then approach it. The process can be expressed as follows:

$$D = |C \cdot W^*(t) - W(t)|$$

$$W(t+1) = W^*(t) - A \cdot D$$

where $D$ is the encirclement step size, $C$ and $A$ are random coefficient vectors, $W^*(t)$ is the optimal position vector at iteration $t$, $t$ is the current number of iterations, and $W(t+1)$ is the position vector obtained after the next iteration update. During iteration, whenever a better solution emerges, $W^*(t)$ is updated.
$A$ and $C$ are calculated by the following equations:

$$A = 2\alpha \cdot r - \alpha, \qquad C = 2r, \qquad \alpha = 2 - \frac{2t}{t_{max}}$$

where $\alpha$ is the control parameter, which decreases linearly from 2 to 0 as the number of iterations increases; $t_{max}$ is the set maximum number of iterations; and $r$ is a random vector with components between 0 and 1.

(2) Engage in hunting behavior
Hunting behavior is a process of approaching and gradually surrounding prey along a helix, which can be expressed as

$$W(t+1) = D' \cdot e^{bl} \cdot \cos(2\pi l) + W^*(t) \qquad (14)$$

where $D' = |W^*(t) - W(t)|$ is the distance between the whale and the prey, $b$ is a constant that controls the shape of the helix, and $l$ is a random number between −1 and 1. While the whales spiral around the prey, the distance between the whales and the prey should also shrink. A probability of 50% is therefore used to choose at random between the shrinking encirclement and the spiral update: when $p < 0.5$, the whale performs the shrinking encirclement, and when $p \geq 0.5$, it performs the spiral update. The specific process can be expressed as

$$W(t+1) = \begin{cases} W^*(t) - A \cdot D, & p < 0.5 \\ D' \cdot e^{bl} \cdot \cos(2\pi l) + W^*(t), & p \geq 0.5 \end{cases}$$

where $p$ is a random probability with a value between 0 and 1.

(3) Search for prey
In addition to hunting known prey, whales also search for new prey, which is the random search phase. In this phase, a whale is selected at random from the population and used as the reference for the position update in place of the optimal individual found so far. The process can be expressed as

$$D = |C \cdot W_{rand} - W(t)|$$

$$W(t+1) = W_{rand} - A \cdot D$$

where $W_{rand}$ is a random position vector.
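The three phases above can be combined into a compact loop. This is a minimal sketch of a standard WOA minimizer rather than the authors' implementation: the switch between encircling and random search on |A| < 1 follows the usual WOA formulation and is not stated in the text, scalar A and C are drawn once per whale for brevity, and the sphere function is only a stand-in objective. For a fitness that should be maximized, such as accuracy, one could minimize 1 − accuracy instead.

```python
import math
import random


def woa_minimize(objective, dim, bounds, n_whales=20, t_max=100, b=1.0, seed=0):
    """Minimal Whale Optimization Algorithm sketch for box-bounded minimization."""
    rng = random.Random(seed)
    low, high = bounds
    whales = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(n_whales)]
    best = min(whales, key=objective)[:]
    best_value = objective(best)
    for t in range(t_max):
        a = 2.0 - 2.0 * t / t_max              # control parameter, 2 -> 0
        for i, whale in enumerate(whales):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            p = rng.random()
            l = rng.uniform(-1.0, 1.0)
            if p < 0.5:
                # Assumed rule: exploit the best whale when |A| < 1, else explore.
                target = best if abs(A) < 1.0 else rng.choice(whales)
                new = [tg - A * abs(C * tg - w) for tg, w in zip(target, whale)]
            else:
                # Spiral update around the best position found so far.
                new = [abs(bw - w) * math.exp(b * l) * math.cos(2 * math.pi * l) + bw
                       for bw, w in zip(best, whale)]
            whales[i] = [min(max(v, low), high) for v in new]  # keep inside bounds
            value = objective(whales[i])
            if value < best_value:
                best, best_value = whales[i][:], value
    return best, best_value


def sphere(x):
    """Stand-in objective with its minimum at the origin."""
    return sum(v * v for v in x)


best, value = woa_minimize(sphere, dim=3, bounds=(-10.0, 10.0))
```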

Social Learning Strategies
Social learning strategies enable individuals with poor fitness in the whale population to learn from other whales with better fitness [16]. Based on the WOA, this paper proposes the Whale Optimization Algorithm with Social Learning Strategy (SL-WOA), which introduces the mechanism of cooperative learning and knowledge sharing based on the original algorithm to improve the search efficiency, accelerate the convergence process, and enhance the robustness and adaptability of the algorithm to better solve complex optimization problems. The specific process is described below and illustrated in Figure 4.

(1) Initialize whale population parameters. The population size $N$ is determined from $M$ and $Q$, where $M$ is the minimum population size needed for normal operation of the algorithm, $Q$ is the dimension of the problem, and $[\,]$ is a remainder operation.
(2) Sort. The fitness of the whales $P_{i,j}(t)$ is calculated and sorted in descending order. Fitness is measured by the recognition accuracy of XGBoost on the test set, which is expressed as

$$accuracy = \frac{n_p}{n}$$

where $n$ is the total number of test samples and $n_p$ is the number of samples that are classified correctly; $fitness$ is defined from this accuracy.
(3) Calculate the social learning probability $P^L_i$. The social learning probability $P^L_i$ represents the learning ability of an individual whale. When the random probability $p_i(t)$ of the individual is less than $P^L_i$, the individual engages in social learning. The calculation expression is as follows:

$$P^L_i = \left(1 - \frac{i-1}{N}\right)^{\alpha \cdot \log([D/N])}$$

where $i$ is the fitness rank, which is negatively related to fitness, meaning the larger $i$ is, the lower an individual's fitness is. It can be seen from the above formula that $P^L_i$ depends on both the fitness rank and the dimension $D$ of the optimization problem. The larger $i$ is, the smaller $1 - (i-1)/N$ is; this indicates that the individual is better and that its ability to learn from other individuals is weak. The exponent $\alpha \cdot \log([D/N])$ shows that $P^L_i$ is negatively correlated with $D$, that is, the larger $D$ is, the smaller $P^L_i$ is, which is conducive to maintaining the diversity of the population when optimizing high-dimensional problems and to preventing the population from falling into local optima.
(4) Calculate random probabilities. The random probability $p_i(t)$ is used to decide whether the position of a non-optimal individual is updated. In the update, $r_1(t)$, $r_2(t)$, and $r_3(t)$ are random numbers between 0 and 1; $\varepsilon$ is a social learning factor; and $\Delta p_{i,j}(t)$ is a learning offset, which consists of local learning $I_{i,j}(t)$ and global learning $W_{i,j}(t)$. The updated position is thus determined by the current position $P_{i,j}(t)$ and by two learning terms: $I_{i,j}(t)$, the learning of the current individual from a better individual $k$, which is measured by the distance between $P_{i,j}(t)$ and $P_{k,j}(t)$; and $W_{i,j}(t)$, the learning of the current individual from the group as a whole, which is measured by the distance between the individual and the average optimal position $G(t)$ of the group.
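Steps (3) and (4) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the probability is taken as a power of 1 − (i − 1)/N with exponent α·log([D/N]), the bracket is treated as a ceiling, and the offset update combines the previous offset, the local term, and the global term in the style of social-learning particle swarm optimization. The population is assumed to be sorted with the best individual at index 0, and the learning probabilities are passed in by the caller rather than fixed to a ranking convention.

```python
import math
import random


def learning_probability(rank, pop_size, dim, alpha=0.5):
    """Assumed form: (1 - (rank - 1) / N) ** (alpha * log(ceil(D / N)))."""
    exponent = alpha * math.log(math.ceil(dim / pop_size))
    return (1.0 - (rank - 1) / pop_size) ** exponent


def social_learning_step(positions, offsets, epsilon, probs, rng):
    """One social-learning pass; positions[0] is the best individual and is kept."""
    dim = len(positions[0])
    mean = [sum(p[j] for p in positions) / len(positions) for j in range(dim)]
    new_positions, new_offsets = [positions[0][:]], [offsets[0][:]]
    for i in range(1, len(positions)):
        if rng.random() > probs[i]:
            # Random probability exceeds the learning probability: no update.
            new_positions.append(positions[i][:])
            new_offsets.append(offsets[i][:])
            continue
        pos, off = [], []
        for j in range(dim):
            k = rng.randrange(0, i)  # index of a better individual
            delta = (rng.random() * offsets[i][j]
                     + rng.random() * (positions[k][j] - positions[i][j])
                     + rng.random() * epsilon * (mean[j] - positions[i][j]))
            off.append(delta)
            pos.append(positions[i][j] + delta)
        new_positions.append(pos)
        new_offsets.append(off)
    return new_positions, new_offsets


rng = random.Random(1)
positions = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
offsets = [[0.0, 0.0] for _ in positions]
unchanged, _ = social_learning_step(positions, offsets, 0.1, [0.0, 0.0, 0.0], rng)
moved, _ = social_learning_step(positions, offsets, 0.1, [1.0, 1.0, 1.0], rng)
```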

Classification and Recognition Algorithm Flow Based on SL-WOA-XGBoost
The SL-WOA-XGBoost algorithm flow is as follows, and the flowchart is shown in Figure 5.
(1) Initialize algorithm parameters, including the WOA population size, the number of iterations, etc.
(2) Initialize the population and calculate the fitness value of each individual in the population.
(3) Sort the individuals in descending order of fitness value.
(4) According to Equations (20)-(24), starting from the individual with the greatest fitness, let each individual learn from a randomly chosen individual that is better than itself and update its position. An individual whose random probability exceeds its social learning probability, $p_j(t) > P^L_j(t)$, keeps its position: $P_{jd}(t+1) = P_{jd}(t)$.
(5) Calculate the fitness value of each individual after the position update.
(6) Determine whether the iteration termination conditions are met. If so, the global optimal solution is output. Otherwise, return to step (3).
In order to quantitatively evaluate the classification effect of the optimized XGBoost model, the test set data used in Section 3 was used for testing, and the results are shown in Tables 3 and 4. From Table 3, it can be seen that the improved XGBoost model incorporating the SL-WOA strategy has the highest recognition rate, 97.14%, for lying behavior and the lowest, 91.47%, for standing behavior, and that the recognition rate is higher for walking than for running; this ordering is the same before and after the improvement. Standing has the lowest recognition rate for the XGBoost model both before and after improvement because both models incorrectly recognize standing as lying, which may be because both behaviors tend to be stationary, with small acceleration changes and similar acceleration values. In the recognition of walking and running, walking was incorrectly identified as running and running was incorrectly identified as walking, which may be because the acceleration and posture of the goat change considerably during both behaviors.
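In the flow above, each whale position has to be translated into an XGBoost parameter setting before its fitness can be evaluated. One way to do this is sketched below. The search ranges in `PARAM_BOUNDS` are illustrative assumptions, not values from the paper, and `evaluate_accuracy` is a hypothetical callback that would train XGBoost with the given parameters and return the test accuracy. The keys `learning_rate`, `gamma`, and `reg_lambda` are the names used by XGBoost's scikit-learn interface for η, γ, and λ.

```python
# Illustrative search ranges (assumed, not taken from the paper).
PARAM_BOUNDS = {
    "learning_rate": (0.01, 0.5),  # eta
    "gamma": (0.0, 5.0),           # penalty per leaf
    "reg_lambda": (0.0, 10.0),     # L2 penalty on leaf scores
}


def decode_position(position):
    """Map a whale position in [0, 1]^3 to a dictionary of hyperparameters."""
    params = {}
    for value, (name, (low, high)) in zip(position, PARAM_BOUNDS.items()):
        clipped = min(max(value, 0.0), 1.0)  # keep the whale inside the search box
        params[name] = low + clipped * (high - low)
    return params


def fitness(position, evaluate_accuracy):
    """Fitness of a whale: accuracy returned by the (hypothetical) evaluator."""
    return evaluate_accuracy(decode_position(position))


example_params = decode_position([0.0, 0.5, 1.0])
```

Keeping the search space normalized to the unit box means the position update equations do not need to know the scale of each parameter.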

Results and Analysis
A comparison of the recognition rates of XGBoost and SL-WOA-XGBoost is shown in Table 5, and the recognition effects of the XGBoost model before and after improvement are compared in Figure 6. It can be seen that the rate of correct recognition of standing behavior increased by 6.42% after the improvement, a significant gain. The average recognition rate of the improved XGBoost model is 94.42%, which is 3.12% higher than before the improvement, indicating that the proposed strategy can improve the goat movement behavior recognition rate.
As can be seen from Table 5, the recognition rates of the four behaviors are improved by 2.51%, 6.42%, 1.98%, and 3.09%, and the average recognition rate is improved by 3.50%; the improvements for standing and walking behaviors are 6.42% and 1.98%, respectively. These results show that the proposed strategy can improve the recognition rate of goat movement behavior, which verifies the effectiveness of the classification method described in this paper.
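The 3.50% average quoted above is the unweighted mean of the four per-behavior improvements, which can be checked with a few lines of arithmetic:

```python
# Per-behavior improvements (percentage points) as listed in the text.
improvements = [2.51, 6.42, 1.98, 3.09]
mean_improvement = sum(improvements) / len(improvements)
print(f"{mean_improvement:.2f}")  # prints 3.50
```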

Conclusions
(1) In this paper, a goat movement behavior monitoring system based on a collection point on the goat's back is designed. The wearable data acquisition device, equipped with a three-axis acceleration sensor, collects, transmits, and stores goat movement behavior data in real time, which is of great significance for goat health monitoring and intelligent early disease warning.
(2) WOA is only an optimization algorithm; the algorithm that actually classifies behavior is the XGBoost model, and WOA is used only to select the optimal parameter settings for the XGBoost model. The overall movement behavior identification process of goats is as follows. First, data collection and labeling are carried out, that is, a large amount of data on goats' standing, walking, running, and lying behaviors is collected and labeled. Second, feature extraction is performed: the 3D data are transformed into a 2D matrix that is passed to the XGBoost model for training and prediction. Third, the data are preprocessed; the noise reduction process is described in Section 2.3 of this article. Finally, WOA is used to optimize the training and prediction results of the XGBoost model: it simulates whale foraging behavior and finds the best solution by adjusting the location of candidate solutions. Here, a candidate solution can be viewed as a classifier or recognition model for goat behavior. Part of the labeled data set is used to evaluate the model and calculate metrics such as accuracy and recall. According to the evaluation results, the parameters of WOA or the structure of the model are adjusted to further optimize the model. The trained model is then used to recognize new goat behavior images: after an image is input into the model, the model outputs the corresponding behavior category, that is, it realizes the recognition of goat movement behavior.
(3) To solve the problem of low accuracy in goat movement recognition, a classification and recognition method based on SL-WOA-XGBoost is proposed. The WOA improved by a social learning strategy is introduced to optimize the XGBoost parameters and finally realize the classification and recognition of four behaviors: lying, standing, walking, and running. The experimental results show that the average accuracy of this method is 94.42%, which is 3.12% higher than that of the unmodified XGBoost model.
Data Availability Statement: Some or all of the data, models, or code generated or used during the study are available from the corresponding author by request (raw data, site data, algorithm model).