Development of a Device and Algorithm Research for Akhal-Teke Activity Level Analysis

Featured Application: This research introduces a novel wearable device that uses an acceleration threshold behavior recognition method to classify horse activities into three levels: low (standing), medium (walking), and high (trotting, cantering, and galloping). The recognition algorithm is directly implemented in the hardware, which horses wear during their training sessions. This device allows for the real-time analysis of horse activity levels and the accurate calculation of the time spent in each activity state. This method provides scientific data support for horse training, facilitating the optimization of training programs.

Abstract: This study demonstrated that wearable devices can distinguish between different levels of horse activity, categorized into three types based on the horse’s gaits: low activity (standing), medium activity (walking), and high activity (trotting, cantering, and galloping). Current research in activity level classification predominantly relies on deep learning techniques, known for their effectiveness but also their demand for substantial data and computational resources. This study introduces a combined acceleration threshold behavior recognition method tailored for wearable hardware devices, enabling these devices to classify the activity levels of horses directly. The approach comprises three sequential phases: first, a combined acceleration interval counting method utilizing a non-linear segmentation strategy for preliminary classification; second, a statistical analysis of the variance among these segments, coupled with multi-level threshold processing; third, a method using variance-based proximity classification for recognition. The experimental results show that the initial stage achieved an accuracy of 87.55% using interval counting, the second stage reached 90.87% with variance analysis, and the third stage achieved 91.27% through variance-based proximity classification. When all three stages are combined, the classification accuracy improves to 92.74%. Extensive testing with the Xinjiang Wild Horse Group validated the feasibility of the proposed solution and demonstrated its practical applicability in real-world scenarios.


Introduction
In horse training, tailoring the duration of training sessions to the horse's activity levels can help prevent errors and injuries due to overexertion [1]. Wearable sensors play a pivotal role in precisely classifying equine activity levels, providing trainers with robust data to develop scientific and reasonable training plans. Using the activity level assessment model to quantify equine motion data, trainers can specify training durations based on the horse's activity levels, thereby avoiding training errors and injuries caused by excessive fatigue.
Activity levels are differentiated based on the horse's gaits. Three-axis accelerometer data have been proven to detect animal gaits [2]. For instance, Casella et al. [3] collected accelerometer data using a smartwatch application and employed novel outlier detection, feature extraction, and machine learning methods to analyze horse motion states through monitoring devices on the saddle and the rider's wrist. Tran et al. [4] created a system for recognizing cattle behaviors, including walking, feeding, lying, and standing, using accelerometers attached to both the leg and collar. Giannetto et al. [5] explored interspecies comparisons of daily total locomotor activity under various management conditions for sheep and horses. McLennan et al. [6] validated an automated recording system for evaluating behavioral activity levels in sheep. Current research on gait recognition algorithms primarily focuses on analyzing behavioral patterns of sheep [7][8][9] and cattle [10][11][12]. This study has drawn on their research method, utilizing gait analysis to measure activity levels.
During horse training, the duration of training for different activity levels directly impacts the effectiveness of the training [13]. Trainers can finely adjust their strategies by analyzing the horse's activity levels. For example, in the early stages of training, medium activity contributes to building trust and foundational skills, while horses needing improvements in speed and strength benefit from higher activity level exercises [14]. To accurately determine each activity level, precise gait recognition methods are crucial. Current research tends to apply deep learning technology to analyze horse gaits precisely [15][16][17][18]. Methods based on deep learning are challenging to implement directly on resource-constrained wearable devices. This study proposes a horse activity level evaluation model based on acceleration threshold analysis to address this issue. This approach is well suited for low-power and embedded devices such as the STMicroelectronics 32-bit Arm Cortex-M Microcontroller (STM32) series hardware devices, which have been used in this study [19]. The proposed approach enables wearable devices to parse and record the time spent in different gaits accurately.
The focus of this study is to provide a new evaluation model for horse activity levels and run the model on hardware devices to evaluate horse activity, providing a more convenient and comprehensive solution for horse activity monitoring and management. Additionally, this study employs non-linear segmentation methods for the preliminary classification of horse activity levels, laying a foundation for the detailed classification of complex activity levels using the acceleration interval counting method. Building upon the preliminary classification, combined with inter-segment variance analysis and multi-level threshold processing, this study further achieves fine-grained division and precise identification of horse activity levels.

System Architecture
This study has developed a wearable device for monitoring the activity levels of horses. The sensors in the hardware device are connected as shown in Figure 1. The STM32 is the core control unit, integrating various sensors to achieve motion detection, gait perception, and data transmission functionalities. The MPU6050 sensor is connected to the STM32 via the inter-integrated circuit (I2C) interface and is responsible for accelerometer data collection. The display module is connected to the central control chip via the serial peripheral interface (SPI) for real-time data display. The power management module ensures a stable device power supply and includes the lithium battery charging management chip TP4054, the linear regulator XC6206P302MR, and the over-current protection chip SY6280. The 4G transmission module connects to the STM32 via the universal asynchronous receiver transmitter (UART) serial port interface, enabling data transmission to remote servers. The appearance of the wearable device is shown in Figure A1, the system flowchart is presented in Figure A2, and the comprehensive workflow of the entire system is detailed in Figure 2.

Data Acquisition
The experiment took place at the Urumqi Wild Horse group ranch in Xinjiang, China, from 29 June to 4 July 2023, during daily morning training sessions from 10:30 AM to 12:30 PM. To ensure the comparability and stability of the research findings, healthy Akhal-Teke horses aged 5-8 years (Figure 3) were selected. Observers used timers to record the duration of each horse's low, medium, and high activity (Table 1), ensuring accurate timing. In this study, six horses were observed, and wearable devices were used to collect acceleration data on each axis and combined acceleration data. The six devices were configured with a range of ±4 g (m/s²) for three-axis acceleration values and ±2000 dps (°/s) for three-axis angular velocity values. The sensors had a sampling frequency of 2 Hz. In the experimental process, the device collects three-axis acceleration data, computes combined acceleration data, analyzes horse gait data, and outputs predicted activity levels (Figure 4).

Data Processing
To capture the activity level characteristics of horses more accurately, this experiment adopted combined acceleration instead of relying on single-axis acceleration. Single-axis acceleration can be affected by slight horse vibrations, neck movements, and circuit noise, leading to inaccurate results [20]. The combined acceleration integrates acceleration data from each axis, providing a more comprehensive metric reflecting the activity level characteristics of horses. As seen in Figure 5, the introduction of combined acceleration has provided a more stable measurement standard (Figure 5d). Table 2 displays the raw data for the three activity levels.

Table 2. Motion data for the horse during low activity, medium activity, and high activity, with each row containing raw accelerometer data for the x, y, and z axes, along with the combined acceleration (ACC).

The experiment collected three-axis acceleration data, represented by acceleration values on each axis as ax, ay, and az. The formula for calculating the combined acceleration is presented in Equation (1).
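Equation (1) is not reproduced in this text. The sketch below assumes that the combined acceleration is the Euclidean norm of the three axis readings, which is consistent with the reported low-activity range lying close to gravitational acceleration. The function name and sample values are illustrative only.

```python
import math


def combined_acceleration(ax: float, ay: float, az: float) -> float:
    """Magnitude of the three-axis acceleration vector.

    Assumption: the combined acceleration (ACC) is the Euclidean norm
    of (ax, ay, az); the result has the same unit as the inputs.
    """
    return math.sqrt(ax * ax + ay * ay + az * az)


# Illustrative readings in m/s^2 (not taken from Table 2).
readings = [(0.3, -0.2, 9.8), (3.0, 4.0, 12.0)]
acc = [combined_acceleration(ax, ay, az) for ax, ay, az in readings]
```

Because the norm does not depend on how the sensor axes are oriented, a device that rotates slightly on the neck still yields a comparable value, which matches the motivation given for using combined rather than single-axis acceleration.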

Algorithm Design
The experiment assesses horse activity levels by analyzing their gaits and proposes a method for threshold analysis of combined acceleration [21][22][23]. This method includes three key stages: firstly, segmenting data through an interval counting approach to pinpoint specific gait patterns [24]; secondly, employing a variance analysis technique to examine the variability in acceleration across various datasets [25]; and finally, utilizing a variance-based proximity classification method for classification. Predicted horse activity levels are then compared with those recorded by observers to measure the accuracy of the assessment techniques.

Stage 1: Combined Acceleration Interval Counting Method
This study analyzed the collected acceleration data related to horse activity levels and identified the relationship between activity levels and acceleration (Figure 6). Statistical findings revealed a characteristic combined acceleration range (in m/s²) for each of the low, medium, and high activity levels. In Equation (3), N represents the size of the data collection window, which is determined by the specific experimental setup, and the behavior label represents the motion behavior. Equation (3) individually computes the number of combined acceleration values within specific intervals for low-activity, medium-activity, and high-activity behaviors. If this count exceeds a threshold proportion (see Table 3) of the total array count, it is considered indicative of the corresponding activity level occurring.

Stage 2: Combined Acceleration Variance Analysis Method
In the proposed approach, the combined acceleration variance analysis method deduces specific variance patterns under different motion states by computing the variance of the combined acceleration data within intervals associated with the three activity levels and examining the variations in variance, as illustrated in Figure 7. Analysis of test data from horses in different activity levels shows a characteristic range for the variance of combined acceleration. Through approximately 50 experiments, the boundaries for combined acceleration variance between low activity and medium activity (Table 4) and between medium activity and high activity (Table 5) were determined. This study defines the combined acceleration variance of horses in different activity levels as follows:

• When the horse is in a low-activity state, the variance ranges from 0 to 1.2.
• During medium activity, the variance ranges from 1.2 to 29.
• When the horse is moving rapidly in a high-activity state, the variance exceeds 29.

These findings serve as crucial features for identifying the horse's activity levels.

• Equation (5) determines the activity level of the horse based on the variance $\sigma^2$ of the combined acceleration data.
• Equation (8) counts the occurrences of each label among the k-nearest neighbor labels. The label with the highest frequency is selected as the prediction.
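The variance thresholds listed above can be sketched in a few lines. This is a minimal illustration, not the device firmware: it assumes a population variance (division by n) and assigns a variance of exactly 1.2 to the medium level, neither of which is stated explicitly in the text. The function and constant names are hypothetical.

```python
from typing import Sequence

LOW_MEDIUM_BOUNDARY = 1.2    # boundary reported with Table 4
MEDIUM_HIGH_BOUNDARY = 29.0  # boundary reported with Table 5


def window_variance(window: Sequence[float]) -> float:
    """Variance of one window of combined acceleration values.

    Assumption: population variance (divide by n, not n - 1).
    """
    n = len(window)
    if n == 0:
        raise ValueError("window must not be empty")
    mean = sum(window) / n
    return sum((a - mean) ** 2 for a in window) / n


def classify_by_variance(variance: float) -> str:
    """Map a window variance to an activity level using the two boundaries."""
    if variance < LOW_MEDIUM_BOUNDARY:
        return "low"
    if variance <= MEDIUM_HIGH_BOUNDARY:
        # The text says high activity "exceeds 29", so 29 itself stays medium.
        return "medium"
    return "high"
```

For example, `classify_by_variance(window_variance(window))` labels one data collection window; only two comparisons and one pass over the window are needed, which suits a microcontroller.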

Threshold Analysis of Combined Acceleration
The combined acceleration threshold analysis method attempts to combine the advantages of the Stage 1, Stage 2, and Stage 3 methods to enhance the overall accuracy and robustness of activity level classification.

• Equation (9) represents the combined acceleration set with a data collection window size of n.
• Equation (10) is used to calculate the data volume for low-activity behavior; I(condition) is the indicator function.
• Equation (11) is used to calculate the variance within the interval.

Algorithm A1 presents the pseudocode for the threshold analysis of combined acceleration.

Model Training Results
Figure 8 presents confusion matrices for the predicted results obtained from Stage 1, Stage 2, Stage 3, and the threshold analysis of combined acceleration, respectively. Analyzing the activity level classification results using the Stage 1 interval counting method (Figure 8a), Stage 2 variance analysis method (Figure 8b), and Stage 3 variance-based proximity classification method (Figure 8c), the interval counting method demonstrated high accuracy in identifying low activity (93.91%), the variance analysis method performed well in distinguishing medium activity (94.62%), and the variance-based proximity classification method excelled in identifying high activity (91.53%). Combining the strengths of the three former methods, the combined acceleration threshold analysis (Figure 8d) exhibited improved performance in identifying low-activity and high-activity behaviors, with accuracy rates reaching 95.03% and 91.09%, respectively, though the accuracy for medium activity slightly decreased to 92.55%.
The total number of samples in this experiment is 61,649, consisting of 12,564 samples for category 0, 35,991 samples for category 1, and 13,094 samples for category 2. Figure 8d shows the confusion matrix for the threshold analysis of combined acceleration, including the calculations for the number of true positives, true negatives, false positives, and false negatives. The values for sensitivity, accuracy, F-measure, and their confidence intervals were calculated, respectively.

• Using Equation (13), the sensitivity for low activity was calculated as 95.03%, for medium activity as 92.55%, and for high activity as 91.09%.
• Using Equation (15), the F-measure for low activity was 0.9248, for medium activity was 0.9440, and for high activity was 0.9041.
• In Equation (16), n represents the number of samples in the dataset; z represents the z-value corresponding to the chosen confidence level, which is 1.96 for a 95% confidence level.

These values illustrate that the model performs effectively and reliably in predicting activity levels. Overall, this method significantly enhances the identification of horse activity levels.

Figure 9 illustrates the relationship between the data collection window (n) and the classification accuracy. In the interval counting method (Figure 9a), the accuracy gradually improves as n increases, reaching its peak at a data collection window of 25 (87.55%) and stabilizing after that. In the variance analysis method (Figure 9b), the accuracy demonstrates an increasing trend with the increase in n, reaching its peak at n = 26 (90.87%). Therefore, with n = 26, a more precise analysis of activity levels is achieved. Threshold analysis of combined acceleration attains its highest overall recognition accuracy at n = 26, reaching 92.74%.
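The sensitivity, F-measure, and confidence interval calculations referred to in the bullets above can be sketched as follows. Equations (13) to (16) are not reproduced in this text, so the sketch assumes the usual definitions: sensitivity as TP / (TP + FN), the F-measure as the harmonic mean of precision and sensitivity, and a normal-approximation interval for a proportion. The confusion matrix values are made up for illustration and are not those of Figure 8d.

```python
import math
from typing import Dict, List, Tuple


def per_class_metrics(cm: List[List[int]]) -> List[Dict[str, float]]:
    """Per-class metrics; cm[i][j] counts true class i predicted as class j."""
    n_classes = len(cm)
    results = []
    for c in range(n_classes):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                                # missed members of class c
        fp = sum(cm[r][c] for r in range(n_classes)) - tp   # wrongly assigned to c
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + sensitivity
        f_measure = 2 * precision * sensitivity / denom if denom else 0.0
        results.append(
            {"sensitivity": sensitivity, "precision": precision, "f_measure": f_measure}
        )
    return results


def normal_interval(p: float, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation interval for a proportion p observed over n samples."""
    half_width = z * math.sqrt(p * (1.0 - p) / n)
    return p - half_width, p + half_width


# Hypothetical 3-class confusion matrix (rows: true level, columns: predicted).
toy_cm = [[90, 10, 0], [5, 90, 5], [0, 10, 90]]
metrics = per_class_metrics(toy_cm)
```

The width of the interval depends strongly on what is counted as n (individual samples or windows), so that choice should be stated alongside any reported interval.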

Validation of the Trained Models
To validate the accuracy of the combined acceleration threshold analysis model on new data, this study collected 20,000 data points for verification experiments. By comparing the accuracy of the trained models in actual application stages, the following results were found (Figure 10): the interval counting method achieved an accuracy of 82.49% on new data and 87.55% for the trained model; the variance analysis method achieved an accuracy of 90.13% on new data and 90.87% for the trained model; the proximity classification method achieved an accuracy of 91.55% on new data and 91.27% for the trained model; and the threshold analysis method achieved an accuracy of 92.71% on new data and 92.74% for the trained model.
The threshold analysis method combines the advantages of the three stages to achieve precise predictions of horse activity levels. Although the accuracy on new data slightly decreased, it still remained at 92.71%, almost consistent with the 92.74% accuracy of the trained model. This indicates that the threshold analysis method not only performs well during the training phase but also exhibits high robustness and reliability in practical applications. In this study, horse activity levels were analyzed using the combined acceleration threshold analysis model and compared with the KNN model (Figure 11). The results indicate that the overall accuracy of the combined acceleration threshold analysis model is 92.91%, slightly higher than the 92.31% of the KNN model (Figure 11c). Specifically, the combined acceleration threshold analysis model performed better in predicting categories 0 and 1, with accuracies of 95.03% and 92.55% (Figure 11a), respectively. In contrast, the KNN model excelled in predicting category 2, achieving an accuracy of 98.06% (Figure 11b). Although the combined acceleration threshold analysis model exhibited some misclassification in category 2 (8.91% misclassified as category 1), it had lower misclassification rates for categories 0 and 1, at 4.76% and 3.76%, respectively. In comparison, the KNN model had a higher misclassification rate in category 1, with 11.11% misclassified as category 2.
Compared to existing methods, the interval counting method and variance analysis method achieved accuracies of 82.49% and 90.13%, respectively, on new data, both lower than those of the combined acceleration threshold analysis model and the KNN model. The combined acceleration threshold analysis model, by leveraging the advantages of multiple stages, achieved higher accuracy and stability, making it suitable for wide application. Although the KNN model performed exceptionally well in certain categories, its overall balance was slightly inferior.

Statistical Time
In this study, the data collection window and sampling frequency were used to calculate the duration of each activity level. The overall analysis (Figure 12) indicates that the differences between predicted and actual times are minimal. Although there are some errors in the predictions for medium and high activity levels, the model's predictions are generally accurate across all activity intensity levels. The predicted duration for medium activity is slightly overestimated, while that for high activity is slightly underestimated. However, these differences are within an acceptable range. Optimizing acceleration thresholds and implementing personalized calibration strategies are expected to further enhance the precision and reliability of the prediction model.

The combined acceleration threshold analysis enhances the accuracy of classifying activity levels by effectively utilizing acceleration data and integrating fundamental machine learning principles, all while ensuring efficiency on hardware platforms. This method simplifies feature extraction and classification from acceleration data, allowing for quick and precise identification of low, medium, and high horse activity levels. To broaden the applicability of the proposed activity level classification method to other animals, this study envisions a generalized method based on the one proposed (Table A1).
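The duration calculation mentioned at the start of this section can be illustrated with a short sketch. It assumes non-overlapping data collection windows, one predicted label per window, a window size of 26 samples, and the 2 Hz sampling frequency reported for the device, so each window covers 13 s. The function name and label strings are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def activity_durations(
    window_labels: Iterable[str],
    window_size: int = 26,
    sample_rate_hz: float = 2.0,
) -> Dict[str, float]:
    """Seconds spent at each activity level.

    Assumption: windows do not overlap, so every label accounts for
    window_size / sample_rate_hz seconds.
    """
    seconds_per_window = window_size / sample_rate_hz
    counts = Counter(window_labels)
    return {label: count * seconds_per_window for label, count in counts.items()}


# Six consecutive windows of predicted labels (illustrative only).
labels = ["low", "low", "medium", "high", "medium", "medium"]
durations = activity_durations(labels)
```

Comparing such per-level totals with the observers' timer records gives the predicted versus actual durations of the kind plotted in Figure 12.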

Device Positions and Frequencies
Wearing the device on the horse's neck ensures it is easy to put on and can collect the horse's activity data.Despite potential shifts in the device's position due to neck movements, this study addresses these inaccuracies using combined acceleration data (Figure 5).This approach guarantees both high accuracy and reliability in the data gathered.
In the experiment, a frequency of 2 Hz was adopted for the MPU6050.This is due to the algorithm being deployed directly onto wearable devices, with the core microcontroller STM32 tasked with data collection and executing the analysis algorithm to ascertain activity levels.Utilizing too high a frequency would exceed the microcontroller's processing capabilities, thus impacting the efficiency and accuracy of the analysis.To mitigate the risk of collecting and processing an excess of unnecessary data, a balanced frequency of 2 Hz was chosen, striking a compromise between data quality and quantity.This decision aims to minimize data overload, optimize processing efficiency, and reduce computation.

Data Collection Window
This study also investigated the impact of data collection window size on activity classification in horse training. The research findings indicate that as the data collection window size increases, there is a corresponding improvement in the accuracy of activity classification on the test dataset. This aligns with Walton et al.'s [26] study on window size's influence on classification accuracy. However, larger data collection windows reduced the volume of training and validation data, subsequently decreasing the classification accuracy. This is because larger data collection windows encompass multiple gaits, and the selection of a data collection window is influenced by factors such as data collection frequency and the distribution of activity data. Mansbridge et al. [27] conducted a study on the impact of data collection windows on sheep behavior classification, confirming that larger data collection windows are not necessarily better.
The study identified that setting the data collection window to 26 for the analysis of activity levels resulted in optimal classification accuracy. This study investigated the close relationship between the biological characteristics of horses [28] and the size of the data collection window. Through a detailed analysis of the natural gait patterns of horses, including walking, trotting, and galloping, it was found that selecting the correct data collection window size is crucial for accurately capturing these dynamic changes. A data collection window size of 26 is sufficient to cover the range required for horses to perform a series of basic and complex behavioral transitions, ensuring a comprehensive record of horse activity patterns. In addition, this window size ensures data richness while avoiding the excessive data processing complexity caused by an overly large window, striking a balance between efficiency and accuracy.

Classification Algorithm
Threshold analysis of combined acceleration integrates three distinct stages. The first stage utilizes a combined acceleration interval counting method, facilitating quick activity level evaluation. The second stage, utilizing variance analysis of combined acceleration, delves deeper into motion characteristics, effectively distinguishing behaviors of varying intensities. The third stage applies variance-based proximity classification, a simplified KNN design. The combined acceleration threshold analysis method leverages the strengths of all three stages, significantly improving activity level classification accuracy.
Nauwelaerts et al. [16] explored the influence of acceleration on diverse horse activity levels (walk, trot, and canter). Their research offered valuable insights into the effects of acceleration on gait, contributing to a better understanding of equine motion. Barrey et al. [29] conducted an objective study on horse gait using an accelerometer. The results indicated that this accelerometer could effectively assess horse gaits. The present study validates the application of acceleration analysis and demonstrates how to translate these theoretical insights into practical animal activity level classification applications. It employs MPU6050 sensors to capture horse motion data and distinguishes low-activity, medium-activity, and high-activity behaviors through threshold analysis.
In equine gait classification, Serra Bragança et al. [17] achieved up to 97% accuracy by training machine learning models on raw accelerometer data and feature-extracted data.Waele et al. [30] focused on unsupervised deep learning techniques, using deep embedded clustering (DEC) methods and discrete Fourier transform (DFT) preprocessing for automatic equine gait classification, reaching an overall accuracy of about 83%.This research introduces a horse activity level classification method based on combined acceleration threshold analysis, surpassing the unsupervised deep learning approach with a classification accuracy of 91.57%, although lower than the machine learning-based approach.The unique aspect of this method is its capability to be directly deployed on hardware without relying on extensive server computations, making it particularly suitable for scenarios with limited computational resources or where processing is required.Despite a slight compromise in accuracy, this method offers a novel and practical solution for monitoring animal activity levels with its simplicity, efficiency, and ease of deployment, which is especially valuable in resource-constrained environments.

Limitations and Future Work
While this study proposes a promising horse activity classification technology based on wearable hardware devices and has achieved some success, certain limitations warrant future work.

Hardware resource limitations:
Due to hardware limitations, including restricted memory and processing capabilities, it was not feasible to incorporate additional features in algorithm design and experimentation.

Sample limitations:
The experimental samples in this study primarily cover specific types of horses, specific environmental conditions, and specific training states. Therefore, the generalizability of the results to other scenarios may be limited at present.

Algorithm simplification:
To adapt to the constraints of hardware devices, this study simplified machine learning algorithms to some extent. While this enhances the ability to deploy on resource-limited devices, it may also impact classification accuracy.

Activity level evaluation model accuracy:
Despite achieving some successes in experiments, the recognition accuracy for high-activity motion is lower than for low and medium activity.
Future research should focus on developing more intelligent adaptive algorithms to meet the demands of complex behavioral patterns. This involves integrating other sensory data and adopting advanced machine learning methods to construct more accurate models [31,32], requiring ongoing improvements in chip technology and algorithm optimization [33].

Conclusions
This study proposes a wearable device-based model for evaluating horse activity levels, optimally capturing activity level changes with a window size of 26. By using combined acceleration threshold analysis, an accuracy of 92.74% was achieved. Experiments on new datasets validated the feasibility of the model. This method provides a practical solution for resource-constrained wearable devices, enabling convenient monitoring of

Figure 2. The overall framework of data collection, algorithm processing and analysis, and data transmission to end-users for monitoring.

Figure 3. On-site data acquisition using a wearable device placed on the neck of the horse.

Table 1. An example of a horse training experiment record sheet documenting the training time for each horse's activity level, displaying the training data for the horse with device ID 17. The symbol (⋮) indicates that some intermediate time intervals have been omitted. The symbol (√) corresponds to the Activity Levels for the respective time periods.

Figure 4. Flowchart depicting the sequence of steps and procedures followed in the experiment.

Figure 5. Comparison between three-axis acceleration and combined acceleration under low activity: (a) x-axis acceleration, (b) y-axis acceleration, (c) z-axis acceleration, (d) combined acceleration.
The combined acceleration ranges (in m/s²) are [9.0, 10.5] for low activity, [7.0, 15.0] for medium activity, and [2.0, 40.0] for high activity. These specific combined acceleration ranges constitute the boundary values for the interval counting method.

Figure 6. Signal characteristics of combined acceleration for low-activity, medium-activity, and high-activity behaviors.

Specific analysis methods: Equation (2) is the counting function, where $A = [a_1, a_2, a_3, \ldots, a_n]$ represents a collection of combined acceleration values, and $a_n$ represents the nth combined acceleration. For each element $a_i$ in the array $A$, if it satisfies $l \le a_i \le u$, then the value of $I(l \le a_i \le u)$ is 1; otherwise, its value is 0. By assessing all these combined acceleration values, the total count of elements within the range $[l, u]$ is obtained.
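The counting function described above, together with the Stage 1 decision rule, can be sketched as follows. The interval bounds are the ones reported in the text. Two points are assumptions of this sketch rather than statements from the source: the nested intervals are tested from narrowest to widest, and `ratio` is a hypothetical placeholder for the threshold proportion listed in Table 3.

```python
from typing import Optional, Sequence

# Interval bounds in m/s^2 as reported in the text. The intervals are nested,
# so this sketch tests them from narrowest to widest (an assumption).
INTERVALS = (
    ("low", 9.0, 10.5),
    ("medium", 7.0, 15.0),
    ("high", 2.0, 40.0),
)


def count_in_interval(window: Sequence[float], lower: float, upper: float) -> int:
    """Number of values a with lower <= a <= upper (sum of indicator values)."""
    return sum(1 for a in window if lower <= a <= upper)


def classify_by_interval(window: Sequence[float], ratio: float = 0.9) -> Optional[str]:
    """Return the first level whose in-interval share exceeds `ratio`.

    `ratio` is a hypothetical placeholder for the Table 3 threshold.
    Returns None when the window is empty or no interval qualifies.
    """
    if not window:
        return None
    for label, lower, upper in INTERVALS:
        if count_in_interval(window, lower, upper) > ratio * len(window):
            return label
    return None
```

A window of values close to 9.8 m/s² is labelled low, while a window that spreads across 7 to 15 m/s² fails the low test and falls through to medium.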

Figure 7. Distribution range of the combined acceleration variance during low-activity, medium-activity, and high-activity behaviors.

Stage 3: Variance-Based Proximity Classification
This algorithm is based on a simplified K-Nearest Neighbor (KNN) design. First, the optimal parameters are determined by training the model using KNN. Due to the imbalance in the training dataset, KNN employs oversampling methods to balance the data. Ultimately, training and testing results indicate that the algorithm performs best when k = 3 and p = 2. The algorithm then calculates the Euclidean distance between each data point and the training data, sorts these distances, selects the k-nearest points, and determines the most frequently occurring label among these nearest neighbors as the predicted activity level. Specific analysis methods:

• Equation (6) calculates the Euclidean distance between a test point $x$ and each training point $x_i$: $d_i = d(x, x_i) = \sqrt{(x - x_i)^2}$.

The calculated confidence interval (CI) boundaries are [91.70, 93.7].
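The proximity classification steps described in this stage (distance, sort, select k, vote) can be sketched as below, using k = 3 as reported. The sketch assumes a single feature per window, the combined acceleration variance, so the Euclidean distance reduces to an absolute difference. The training pairs are invented for illustration and the function name is hypothetical.

```python
from collections import Counter
from typing import Sequence, Tuple


def knn_predict(train: Sequence[Tuple[float, str]], query: float, k: int = 3) -> str:
    """Predict a label for `query` from (variance, label) training pairs."""
    if k <= 0 or k > len(train):
        raise ValueError("k must be between 1 and the number of training points")
    # Euclidean distance on one feature is the absolute difference.
    nearest = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    # The label with the most votes among the k nearest neighbours wins.
    return votes.most_common(1)[0][0]


# Hypothetical training set of (window variance, observed activity level).
training = [
    (0.2, "low"), (0.5, "low"), (0.9, "low"),
    (3.0, "medium"), (10.0, "medium"), (20.0, "medium"),
    (35.0, "high"), (50.0, "high"), (80.0, "high"),
]
prediction = knn_predict(training, 12.0)
```

Because every prediction scans the stored training points, the memory and time cost grows with the size of the training set, which is one reason a reduced training set may be preferable on a microcontroller.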

Figure 8. Comparing the confusion matrices of the three methods: (a) Stage 1: combined acceleration interval counting method, (b) Stage 2: combined acceleration variance analysis method, (c) Stage 3: variance-based proximity classification method, and (d) Stages 1, 2, and 3 together: threshold analysis of combined acceleration.

Figure 9. Comparing the correlation between the data collection window and the classification accuracy in (a) Stage 1: combined acceleration interval counting method and (b) Stage 2: combined acceleration variance analysis method.

Figure 10. Comparison of the accuracies between the interval counting, variance analysis, proximity classification, and combined threshold analysis methods on both training models and practical applications.

Figure 11. (a) Confusion matrix of the combined acceleration threshold analysis model, (b) confusion matrix of the KNN model, (c) accuracy comparison between the two models.

Figure 12. Comparison results between actual and predicted activity durations.

Table 3. The relationship between interval analysis threshold and classification accuracy.

Table 4. Boundary of combined acceleration variance between low activity and medium activity, experimentally validated with the highest classification accuracy when the boundary value is 1.2.

Table 5. Combined acceleration variance threshold between medium activity and high activity, experimentally validated with the highest classification accuracy when the threshold value is 29.
• Combined accelerometer data are collected periodically via a timer. $A = [a_1, a_2, a_3, \ldots, a_n]$ is a set of combined acceleration values, where $a_n$ represents the nth combined acceleration.
• Equation (4) calculates the variance with a data collection window size of n. $\sigma^2$ represents the variance, and $\mu$ stands for the mean, calculated as $\mu = \frac{1}{n}\sum_{i=1}^{n} a_i$.