A Novel Combined Model for Predicting Humidity in Sheep Housing Facilities

Simple Summary
A hybrid model is proposed to predict humidity in sheep barns, based on a machine learning model combining a light gradient boosting machine with gray wolf optimization and support-vector regression (LightGBM–CGWO–SVR). Influencing factors with a high contribution to humidity were extracted using LightGBM to reduce the complexity of the model, and the required hyperparameters in SVR were optimized using the CGWO algorithm to avoid the local-extremum problem. The experimental results indicated that the proposed LightGBM–CGWO–SVR model outperformed eight existing models used for comparison on all evaluation metrics; it achieved minimum values of 0.0662, 0.2284, 0.0521, and 0.0083 in terms of MAE, RMSE, MSE, and NRMSE, respectively, and a maximum value of 0.9973 in terms of the R2 index.

Abstract
Accurately predicting humidity changes in sheep barns is important to ensure the healthy growth of the animals and to improve the economic returns of sheep farming. In this study, to address the limitations of conventional methods in establishing accurate mathematical models of dynamic changes in humidity in sheep barns, we propose a method to predict humidity in sheep barns based on a machine learning model combining a light gradient boosting machine with gray wolf optimization and support-vector regression (LightGBM–CGWO–SVR). Influencing factors with a high contribution to humidity were extracted using LightGBM to reduce the complexity of the model. To avoid the local-extremum problem, the CGWO algorithm was used to optimize the required hyperparameters in SVR and determine the optimal hyperparameter combination. The combined algorithm was applied to predict the humidity of an intensive sheep-breeding facility in Manas, Xinjiang, China, in real time for the next 10 min. The experimental results indicated that the proposed LightGBM–CGWO–SVR model outperformed eight existing models used for comparison on all evaluation metrics.
It achieved minimum values of 0.0662, 0.2284, 0.0521, and 0.0083 in terms of mean absolute error, root mean square error, mean squared error, and normalized root mean square error, respectively, and a maximum value of 0.9973 in terms of the R2 index.


Introduction
Sheep farming for meat is a leading industry in the livestock economy of rural inland Northwest China (i.e., Xinjiang), and large-scale intensive farming is the primary mode.

A typical hybrid design helps remove noise, reduce dimensionality, or extract features from high-dimensional model input data, e.g., via wavelet transform or principal component analysis, and subsequently uses the obtained data as input to construct a predictive model with machine learning or deep learning methods. Some hybrid models have been applied to predict humidity and other environmental parameters in livestock houses or greenhouses. For example, Besteiro et al. [14] developed a CO2 prediction model for a commercial piglet farm based on a wavelet neural network designed to consider indoor and outdoor temperatures, indoor humidity, and ventilation rates as input variables. He and Ma [15] extracted four main factors from eight parameters, namely, outside air temperature and humidity, wind speed, and others, based on principal component analysis (PCA), and then constructed a prediction model for indoor greenhouse humidity in north China in winter based on a back-propagation neural network (BPNN) model. The test-set results showed that the model was not accurate enough. In addition to PCA for feature selection, the light gradient boosting machine (LightGBM) algorithm, developed in recent years [16] as a variant of the improved gradient-boosted decision tree (GBDT), has been widely used as a prediction model in several fields [17][18][19]. Since the heuristic information of the iterative tree can be used as an important measure of features [20], an iterative tree can be constructed using LightGBM to calculate the feature-importance measure of a sample and thereby perform feature selection.
The built-in exclusive feature bundling algorithm of the LightGBM model compensates for the drawback of PCA, in which useful information tends to be lost when computing the dimensionality reduction for less relevant high-dimensional features. Mutually exclusive feature bundling also allows for certain cases of non-mutual exclusion among a small number of features, which better separates redundant features from bundled features to maintain the accuracy of information filtering. A few works have used LightGBM to reduce the dimensionality of the features of an input model [16,21,22].
In data-driven predictive modeling algorithms, the support-vector regression (SVR) approach rests on a stronger mathematical theoretical foundation than neural-network-based "black box" predictive models [23]. It can effectively solve the problem of constructing high-dimensional data models under limited sample conditions and possesses the advantages of strong generalization ability, convergence to the global optimum, and insensitivity to dimensionality. This method has shown considerable potential to learn nonlinear time-series data and has been widely used for prediction in several fields, such as late potato blight [24], apricot yields [25], and solar radiation [26].
As the SVR parameter selection is often stochastic and empirical, inappropriate parameter combinations affect the performance and prediction accuracy of such models. Thus, for SVR prediction models, the best parameter combination should be selected to obtain good prediction results. The gray wolf optimization (GWO) algorithm proposed by Mirjalili et al. [27] possesses the advantages of high stability and search capability and has been widely applied to optimize SVR [28,29], BPNN [30], and other models. It has been proven to effectively improve the accuracy and generalization ability of prediction models. As a new metaheuristic optimization technique, GWO possesses stronger global search capability and multimodal function exploration capability than genetic algorithms (GA) and particle swarm optimization (PSO) algorithms [31]. However, it is relatively inefficient and has a tendency to converge to a local optimum [32]. A few researchers have improved the GWO algorithm by introducing new operators, changing control parameters, or updating the GW position. Song et al. [33] proposed a nonlinear model predictive controller, based on the shaded GWO algorithm, to predict the maximum wind energy of wind turbines; the algorithm balanced local and global search by changing the formulae of the gray wolves' social ranks and position updates to prevent the algorithm from reaching a local optimum. Zhang and Hong [34] improved the search capability and search efficiency of the GWO algorithm by introducing a chaotic tent mapping function to solve the problems of incomplete diversity and premature convergence of wolf packs. Zhao et al. [35] optimized the parameters of a gated recurrent unit (GRU) model using an improved version called IGWO to balance the overfitting and underfitting of the traditional GWO algorithm by improving the objective function. Liu et al.
[36] addressed the tendency of the GWO algorithm to fall into local extremes in early stages by introducing a new operator to construct a dynamic variational operator to adjust the difference vector of the wolf pack. Based on these works, and by combining the advantages of other algorithms, in this study, we propose an algorithm called Chaos-GWO (CGWO), based on chaos theory, which is designed to avoid falling into local optima. The proposed approach is further discussed in Sections 2.4 and 3.3.
Therefore, to address the limited performance of existing humidity prediction methods for livestock houses and greenhouses, and to overcome the shortcomings of a single SVR model, which may be difficult to adapt to sheep barn humidity prediction, we propose a hybrid prediction model for ambient air humidity in sheep barns based on the combined LightGBM-CGWO-SVR method, to predict the air humidity 10 min ahead. First, redundant or overlapping information in the environmental features obtained from the sensors is filtered out using the LightGBM model's built-in dimensionality-reduction technique, to avoid the excessive computational complexity and overfitting caused by the direct input of multidimensional features. Subsequently, the SVR model hyperparameters are searched and optimized using the GWO algorithm, improved with the chaos operator for better accuracy and reliability.

Data Sources
The source of the data was a sheep-breeding facility for meat production located at 44°27′18″ N latitude and 86°10′47″ E longitude in Manas County, Changji Hui Autonomous Prefecture, Xinjiang Uygur Autonomous Region, China. The main space of the facility, with an area of ~422 m2, was selected to collect the experimental data. The facility mainly breeds and raises Suffolk sheep. The sheep barn structure is semi-enclosed and generally divided into three parts, as shown in Figure 1: the main area, used as the daily resting area for the sheep and located in the middle of the barn; a shaded area located in the north; and an eating area located in the south. The main area of the farm was ~33.75 m long and 12.5 m wide, with brick-and-concrete walls on all sides, windows and doors, a roof made of steel plating, and a floor of compacted soil. In summer, the sheep barn is protected from heat by natural ventilation and a shaded tent on the north side. In winter, the sheep are kept in the main area for warmth and enclosed breeding, and ventilation is provided by exhaust fans. To monitor the environmental parameters of the sheep barn online, temperature, humidity, noise, light, particulate matter 2.5 (PM2.5), particulate matter 10 (PM10), total suspended particulate (TSP), CO2, ammonia, and hydrogen sulfide sensors were deployed in the main area. The ammonia, hydrogen sulfide, noise, and other sensors were installed at 3.2, 3.7, 3.1, and 3.3 m, respectively, from the ground.


Experimental Data Acquisition
The sensors installed in the sheep barn performed online data collection at 10-min intervals, and the data were uploaded to a remote IoT cloud service platform through a gateway. The 2000 units of data collected online from 8 to 22 February 2021 were used as the data source for this study; 70% of these data were selected for the training set, and 30% were used for the testing set to train and validate the performance of the proposed humidity prediction model. The raw environmental data are provided in Table 1, and the sequence of the observed air humidity data is illustrated in Figure 2.
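The chronological 70/30 split described above can be sketched as follows; the placeholder array and the assumption that the series was split in time order (rather than shuffled) are ours, since the text does not specify.

```python
import numpy as np

# Placeholder stand-in for the 2000 environmental records collected at 10-min intervals.
n_samples = 2000
data = np.arange(n_samples, dtype=float)

# Chronological 70/30 split; preserving temporal order is an assumption,
# since the paper does not state whether the series was shuffled.
split = int(n_samples * 0.7)
train, test = data[:split], data[split:]

print(len(train), len(test))  # 1400 600
```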


Experimental Data Preprocessing
As the parameters in the sheep barn environment had different magnitudes and units, which differed significantly and affected the data analysis and learning results, the influence of magnitudes between parameter sets needed to be eliminated through normalization, to enable the algorithms to learn and analyze the hidden structural relations in the data more easily [37]. In addition, standardization was performed to accelerate the training of the learning models; it was more stable than an unstandardized training procedure under the same conditions, providing a more uniform fit and better prediction results. The standardization equation is provided as Equation (1):

Z_m* = (Z − Z_m)/Z_S, (1)

where Z_S, Z_m, and Z_m* denote the standard deviation, the mean, and the value obtained after standardization, respectively, and Z denotes the original value.

LightGBM
LightGBM is a lightweight framework implementing a gradient-boosting decision tree, an algorithm that combines multiple decision trees into a strong learner using weighted linear combinations. It retains samples with large gradients and randomly selects samples with small gradients during training. To counteract the effect of this sampling on the data distribution, LightGBM introduces a constant multiplier for samples with small gradients when calculating the information gain. For the training set D : {Y_1, Y_2, ..., Y_n}, the negative gradient of the loss function at each iteration corresponds to {h_1, h_2, ..., h_n}, the sampling rate of the large-gradient samples is o, and the sampling rate of the small-gradient samples is p. The steps used to learn the feature weighting are performed as follows:

Step 1: Sort the sample points in descending order according to the absolute values of their gradients.
Step 2: Select the top o × n samples of the sorted results to generate a subset of the large gradient samples O D .
Step 3: Randomly select p × n samples from the remaining samples and generate them as a small gradient sample subset P D .
Step 4: Combine the large- and small-gradient samples (O_D ∪ P_D) to learn a new decision tree; to calculate the information gain (IG), the small-gradient samples are multiplied by a weight factor (1 − o)/p, and the estimated information gain of the split feature m at the split point d is calculated as:

Ṽ_m(d) = (1/n) [ ( Σ_{Y_i∈O_L} h_i + ((1 − o)/p) Σ_{Y_i∈P_L} h_i )² / n_L^m(d) + ( Σ_{Y_i∈O_R} h_i + ((1 − o)/p) Σ_{Y_i∈P_R} h_i )² / n_R^m(d) ],

where O_L and P_L (O_R and P_R) denote the large- and small-gradient samples falling to the left (right) of split point d, and n_L^m(d) and n_R^m(d) denote the corresponding sample counts.

Step 5: Repeat Steps 1-4 for a specified number of iterations or until convergence is reached.
After traversing the entire model, the sum of the IG of all split nodes for a feature is taken as its importance weight.
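Under the stated sampling rates o and p, Steps 1-4 can be sketched in NumPy as below; the gradients, sample counts, and rate values are invented placeholders, and the information-gain computation is reduced to the (1 − o)/p weighting step.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 100          # number of training samples (placeholder)
o, p = 0.2, 0.3  # sampling rates for large- and small-gradient samples (assumed values)

gradients = rng.normal(size=n)           # placeholder per-sample gradients h_i
order = np.argsort(-np.abs(gradients))   # Step 1: sort by |gradient|, descending

top_k = int(o * n)
large_idx = order[:top_k]                # Step 2: subset O_D of large-gradient samples

rest = order[top_k:]
small_idx = rng.choice(rest, size=int(p * n), replace=False)  # Step 3: subset P_D

# Step 4: combine the subsets; small-gradient samples are up-weighted
# by (1 - o) / p when the information gain is computed.
weights = np.ones(n)
weights[small_idx] = (1 - o) / p

subset = np.concatenate([large_idx, small_idx])
print(len(subset), round((1 - o) / p, 2))  # 50 2.67
```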
The collected environmental parameters of the sheep barns were ranked by feature importance using LightGBM, and the weights obtained are listed in Table 2. To select strongly correlated features, the key influencing factors of CO2 concentration, light intensity, air temperature, and PM2.5 were screened and applied to the LightGBM model, and the training and testing sets of the Xinjiang sheep barn humidity prediction model were constructed using the screened parameter data.

GWO Algorithm
The GWO algorithm was inspired by the social hierarchy of gray wolves (GWs) and their hunting mechanism. The model is based on the assumption of a strict social hierarchy in a GW pack, which is divided into four levels. The first level is the α wolf, the leader of the pack, responsible for the decision-making of the whole population; the second level comprises the β wolves, which assist the α wolf in decision-making and other hunting activities; the third level comprises the δ wolves, which obey the commands of the first two levels and perform tasks such as scouting and sentry duty; the fourth level comprises the ω wolves, which obey the leadership and commands of the first three levels. The main components of the GWO algorithm are described below.
(1) Population structure
In the GWO algorithm, each GW represents a potential solution: the optimal solution is denoted as α, the second best as β, and the third best as δ, while ω denotes the remaining candidate solutions, which are guided by α, β, and δ in their search.
(2) Surrounding the prey
In the GWO algorithm, the GWs encircle the prey, as described by the mathematical model defined in Equations (4) and (5):

D = |C · U_i(n) − U(n)|, (4)
U(n + 1) = U_i(n) − A · D, (5)

where n denotes the current iteration number, A · D denotes the encircling step, U_i denotes the position vector of the prey, and U denotes the current position vector of an individual GW. A and C denote vectors of random coefficients, calculated as given in Equations (6) and (7):

A = 2a · r_1 − a, (6)
C = 2r_2, (7)

where C varies in the range (0, 2), which increases the randomness of the distances of other individual GWs from the α, β, and δ wolves, enhances the detection ability of the group, and helps the GWO algorithm jump out of local optima; r_1 and r_2 denote vectors randomly generated on the interval (0, 1); and a denotes a factor that decreases linearly from 2 to 0 with the number of iterations, calculated as:

a = 2(1 − n/n_max), (8)

where n and n_max denote the current and maximum iteration numbers, respectively. As the value of a decreases from 2 to 0, A varies in the range (−a, a). When |A| < 1, the GW launches an attack on its prey; that is, an optimal solution is approached. When |A| > 1, the GW abandons the current prey and continues searching; this behavior avoids getting stuck in a local optimum and drives the GWO algorithm to keep searching for the global optimum.
(3) Hunting process
The α, β, and δ wolves are more aware of the potential location of the prey and occupy optimal positions to attack. The other individual GWs are guided by the α, β, and δ wolves during the hunting phase: they move toward and surround the prey, updating their positions by moving and repositioning multiple times until they reach the prey's position. The specific equations for the position updates of the GW pack during the siege are given as:

L_α = |C_1 · U_α − U|, L_β = |C_2 · U_β − U|, L_δ = |C_3 · U_δ − U|, (9)
U_1 = U_α − A_1 · L_α, U_2 = U_β − A_2 · L_β, U_3 = U_δ − A_3 · L_δ, (10)
U_i(n + 1) = (U_1 + U_2 + U_3)/3, (11)

where L_α, L_β, and L_δ denote the distances of an individual GW from the α, β, and δ wolves, respectively; U_α, U_β, and U_δ denote the current positions of the α, β, and δ wolves, respectively; U_1, U_2, and U_3 denote the updated positions of the individual GW toward the α, β, and δ wolves, respectively; and U_i(n + 1) denotes the mean of the updated positions of the individual GW along the directions of the α, β, and δ wolves.
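A single hunting-phase position update can be sketched in NumPy as follows; the leader positions, the ordinary wolf's position, and the value of a are invented toy values used only to illustrate the update.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented toy positions for the alpha, beta, delta wolves and one ordinary wolf.
U_alpha = np.array([1.0, 2.0])
U_beta = np.array([0.8, 1.5])
U_delta = np.array([1.2, 1.8])
U = np.array([4.0, -3.0])

a = 1.0  # decreasing factor at some mid-run iteration (assumed value)

updates = []
for leader in (U_alpha, U_beta, U_delta):
    r1, r2 = rng.random(2), rng.random(2)
    A = 2 * a * r1 - a              # random coefficient A
    C = 2 * r2                      # random coefficient C
    L = np.abs(C * leader - U)      # distance from the leader
    updates.append(leader - A * L)  # guided position U_1, U_2, U_3

U_next = np.mean(updates, axis=0)   # mean of the three guided positions
print(U_next.shape)  # (2,)
```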

Improvement of GWO Based on Chaotic Operators
(1) Improvement of the update mechanism based on the random coefficient vector C
For GWO, the exploration and exploitation capabilities are crucial to optimal performance and depend mainly on the value of the stochastic exploration coefficient C. Thus, choosing a suitable value of C to coordinate the exploration and exploitation capabilities of GWO is critical.
Inspired by chaos theory, we introduced time-varying acceleration constants in C_1, differential-mean perturbed time-varying parameters in C_2, and sigmoid-based acceleration coefficients in C_3 to enhance the randomness of C. Compared with a random search, a chaotic algorithm can search more thoroughly, with higher probability and speed. This overcomes the tendency of the GWO algorithm to fall into a local optimum and distributes the search more uniformly, yielding better global convergence performance and faster convergence. The improved coefficients C_1, C_2, and C_3 are defined by Equations (12)-(14), where n and n_max denote the current and maximum iteration numbers, respectively; Equations (12)-(14) are used instead of Equation (7).
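Since Equations (12)-(14) are not reproduced in the text, the sketch below only illustrates the kind of chaotic mapping involved, using the tent map of the sort cited from Zhang and Hong [34]; the map parameter, seed, and sequence length are assumptions, not the paper's actual coefficient formulae.

```python
def tent_map(x, mu=2.0):
    """One step of the tent map, a standard chaotic mapping on (0, 1)."""
    return mu * x if x < 0.5 else mu * (1.0 - x)

# Generate a chaotic sequence that could perturb a coefficient such as C;
# the seed value 0.37 is an arbitrary assumption.
x = 0.37
sequence = []
for _ in range(5):
    x = tent_map(x)
    sequence.append(round(x, 4))

print(sequence)  # [0.74, 0.52, 0.96, 0.08, 0.16]
```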
(2) Algorithm implementation of CGWO
In summary, the implementation process of the proposed CGWO algorithm is as follows:

Step 1: Initialize the algorithm parameters: the GW population size M and the maximum number of iterations n_max.

Step 2: Iterate through all the individual wolves, calculate the fitness values of the M individual GWs {f(U_i), i = 1, 2, ..., M}, and identify the α, β, and δ wolves: U_α, U_β, and U_δ.

Step 3: Initialize the nonlinear decreasing factor a and the random exploration vectors A and C.
Step 4: If n < n_max (i.e., the iteration condition is satisfied), proceed with Steps 5-8.
Step 5: Update the distances and directions of the movement of individual GWs in the current pack from the α, β, and δ wolves using Equations (9)-(11), respectively, and update the location of each individual GW using Equations (4) and (5).
Step 6: Update the random exploration vectors A, C, and the convergence factor a using Equations (6)-(8).
Step 7: Calculate the fitness of each individual GW for the current iteration { f (U i (n)), i = 1, 2, · · · , M}, and compare it with the fitness of the last iteration.
Step 8: If the new fitness is better than the current fitness, the new fitness replaces the current fitness as the global optimal fitness, and the wolf's position is updated; otherwise, both remain unchanged.
Step 9: Set n = n + 1 and return to Step 4 while the iteration condition is satisfied. Exit the loop when the maximum number of iterations (n_max) is reached, and obtain the position of the α wolf, which gives the best values of the SVR parameters C, g, and ε; the algorithm then ends.
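The CGWO search over the SVR hyperparameters (C, g, ε) can be sketched as a plain GWO-style loop over the paper's [0.01, 10] interval. The fitness function below is an arbitrary smooth stand-in for the real objective (the validation error of an SVR trained with those hyperparameters), and the chaotic C_1-C_3 coefficients are replaced by the standard random C for brevity.

```python
import numpy as np

rng = np.random.default_rng(7)

# Each wolf encodes an SVR hyperparameter triple (C, g, epsilon);
# the search interval [0.01, 10] follows the paper's experimental setup.
LOW, HIGH = 0.01, 10.0

def fitness(u):
    # Stand-in for the real objective (validation error of an SVR trained
    # with hyperparameters u); an arbitrary smooth function is used here.
    C, g, eps = u
    return (C - 2.0) ** 2 + (g - 0.5) ** 2 + (eps - 0.1) ** 2

n_wolves, n_max = 20, 60
pack = rng.uniform(LOW, HIGH, size=(n_wolves, 3))

for n in range(n_max):
    a = 2 * (1 - n / n_max)                    # linearly decreasing factor
    idx = np.argsort([fitness(u) for u in pack])
    leaders = pack[idx[:3]].copy()             # alpha, beta, delta wolves
    for i in range(n_wolves):
        new_pos = np.zeros(3)
        for leader in leaders:
            r1, r2 = rng.random(3), rng.random(3)
            A = 2 * a * r1 - a
            C_vec = 2 * r2                     # standard GWO C (chaotic C_1-C_3 omitted)
            L = np.abs(C_vec * leader - pack[i])
            new_pos += leader - A * L
        pack[i] = np.clip(new_pos / 3, LOW, HIGH)

best = min(pack, key=fitness)                  # position of the alpha wolf
print(np.round(best, 2))
```

The clipping step keeps every candidate inside the stated search interval; in the real model, `fitness` would train and score an SVR on the validation data.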

SVR
For a given training set (m_i, n_i), i = 1, 2, ..., N, m denotes the independent variable, n denotes the dependent variable, and N is the number of samples. SVR performs a linear fit through the nonlinear feature mapping function φ(m), which maps the data m onto a high-dimensional linear space. The linear function of SVR in the high-dimensional feature space can be expressed as:

f(m) = μ · φ(m) + b, (15)

where μ denotes the normal vector and b denotes the bias term.
To control the accuracy of the fit, an ε-insensitive loss function is introduced. If the difference between the predicted value f(m) and the actual value n is less than ε, the prediction suffers no loss; otherwise, it is considered to have suffered a loss, and the optimization model is expressed in Equation (16):

min (1/2)||μ||², s.t. |n_i − f(m_i)| ≤ ε, i = 1, 2, ..., N. (16)

A penalty coefficient C and positive and negative slack variables, τ_i and τ_i*, were introduced to allow errors beyond ε, with the minimum of the sum of the complexity of the regression function and the fitting error as the final optimization objective, as expressed in Equation (17):

min (1/2)||μ||² + C Σ_{i=1}^{N} (τ_i + τ_i*),
s.t. n_i − f(m_i) ≤ ε + τ_i, f(m_i) − n_i ≤ ε + τ_i*, τ_i ≥ 0, τ_i* ≥ 0. (17)

Using the Lagrange multipliers λ_i and λ_i*, and setting the partial derivatives with respect to μ, b, τ_i, and τ_i* to zero, the above function is transformed into the dual problem of SVR, as expressed in Equation (19):

max −(1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} (λ_i − λ_i*)(λ_j − λ_j*) κ(m_i, m_j) − ε Σ_{i=1}^{N} (λ_i + λ_i*) + Σ_{i=1}^{N} n_i (λ_i − λ_i*),
s.t. Σ_{i=1}^{N} (λ_i − λ_i*) = 0, 0 ≤ λ_i, λ_i* ≤ C, (19)

where κ(m_i, m_j) denotes the kernel function, implying that the inner product is calculated after mapping m_i and m_j onto the high-dimensional feature space. By solving the dual problem, the SVR regression function can be expressed as:

f(m) = Σ_{i=1}^{N} (λ_i − λ_i*) κ(m_i, m) + b. (20)

In this experiment, the widely used radial basis function (RBF) kernel was adopted, expressed as:

κ(m_i, m_j) = exp(−g ||m_i − m_j||²), (21)

where g denotes the width parameter of the RBF kernel.

The overall flow of the combined LightGBM-CGWO-SVR model is shown in Figure 3. The specific algorithmic steps are as follows:

Step 1: Perform data repair on the collected data.
Step 2: Use LightGBM to calculate the IG of the repaired sheep barn environmental data, rank each parameter factor by feature importance, and select the strongly correlated features.
Step 3: Normalize the data after the LightGBM-based dimensionality reduction, and divide them into training and test sets in a ratio of 7:3.
Step 4: Based on the training set, optimize C, g, and ε for the SVR model using the CGWO algorithm described in Section 2.4.3.
Step 5: Apply the best hyperparameters obtained using CGWO, that is, C, g, and ε, to the SVR model to obtain the combined LightGBM-CGWO-SVR prediction model.
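Once CGWO has fixed the hyperparameters, prediction reduces to evaluating the SVR regression function f(m) = Σ(λ_i − λ_i*)κ(m_i, m) + b with the RBF kernel; the support vectors, dual coefficients, and parameter values below are invented toy values, not ones fitted by the paper's model.

```python
import numpy as np

def rbf_kernel(a, b, g):
    """RBF kernel: kappa(a, b) = exp(-g * ||a - b||^2)."""
    return np.exp(-g * np.sum((a - b) ** 2))

def svr_predict(m, support_vectors, coeffs, b, g):
    """Evaluate f(m) = sum_i (lambda_i - lambda_i*) * kappa(m_i, m) + b."""
    return sum(c * rbf_kernel(sv, m, g) for sv, c in zip(support_vectors, coeffs)) + b

# Invented toy support vectors, dual coefficients (lambda_i - lambda_i*),
# bias b, and kernel width g.
support_vectors = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
coeffs = [0.5, -0.3]
b, g = 0.1, 1.0

f = svr_predict(np.array([0.0, 0.0]), support_vectors, coeffs, b, g)
print(round(f, 4))  # 0.5594
```

At the query point, the first kernel value is exp(0) = 1 and the second is exp(−2), so the result is 0.5 − 0.3·exp(−2) + 0.1 ≈ 0.5594.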

Experimental Setup and Parameter Determination
The experimental calculations were performed using a computer with an 11th Gen Intel(R) Core (TM) i7-1165G7 processor at 2.80 GHz and 16 GB of memory, running the Windows 10 64-bit operating system with the Anaconda3 platform and the Python 3.6 programming language (64-bit). The simulation environment was completed by installing the NumPy, Sklearn, and LightGBM packages.
The CGWO parameters were set as follows. The size of the wolf pack was set to 20, the random exploration vectors A and C were set to 20, and the SVR parameters were searched in the range of [0.01, 10]. SVR used the RBF kernel, and the optimal combination of hyperparameters was obtained by searching the SVR model globally using the CGWO algorithm. The maximum number of iterations was 20, although convergence began at the fourth iteration in the experiment.


Performance Criteria
In this study, the coefficient of determination (R2), mean absolute error (MAE), root mean square error (RMSE), normalized root mean square error (NRMSE), and mean squared error (MSE) were used as evaluation metrics to assess the prediction results.
(1) R2 was used to assess the fit between the predicted and actual humidity; it takes values in the range (0, 1), and a value closer to 1 represents a better fit between the predicted and actual humidity values.
(2) MAE was the average distance between the individual and predicted humidity data; a smaller MAE implied that the predicted value was closer to the actual humidity data (Equation (23)).
(3) MSE represented the mean of the square of the difference between the individual humidity data and the predicted humidity data. RMSE was the square root of the MSE, which was more intuitive in terms of order of magnitude; NRMSE converted the RMSE value into the range (0, 1). The smaller the MSE, RMSE, and NRMSE, the smaller the error between the predicted and true values, and the more accurate the model. In the corresponding equations, f_k, f̂_k, r, and f̄ denoted the original data value, the predicted value of f_k, the amount of data, and the average value of f_k, respectively.
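The five metrics can be computed directly from the definitions above; the humidity values below are invented for illustration, and the normalization of NRMSE by the observed range is an assumption, since the paper does not state the exact normalization it used.

```python
import numpy as np

def evaluate(actual, predicted):
    """Compute R2, MAE, MSE, RMSE, and a range-normalized RMSE."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    residuals = actual - predicted
    mae = np.mean(np.abs(residuals))
    mse = np.mean(residuals ** 2)
    rmse = np.sqrt(mse)
    # Assumption: NRMSE normalized by the observed range; the paper does
    # not state which normalization it uses.
    nrmse = rmse / (actual.max() - actual.min())
    r2 = 1 - np.sum(residuals ** 2) / np.sum((actual - actual.mean()) ** 2)
    return {"R2": r2, "MAE": mae, "MSE": mse, "RMSE": rmse, "NRMSE": nrmse}

# Invented humidity values for illustration.
scores = evaluate([80.0, 82.0, 85.0, 83.0], [80.5, 81.5, 84.5, 83.5])
print({k: round(v, 4) for k, v in scores.items()})
```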

Results and Analysis
To demonstrate the superior performance and effectiveness of the proposed LightGBM-CGWO-SVR model in predicting the humidity of sheep barn facilities, our approach was compared with the following models: GRU, SVR, extreme gradient boosting (XGBoost), GA-SVR, GWO-SVR, CGWO-XGBoost, CGWO-SVR, and LightGBM-SVR. These were computed using the same training and validation datasets, and parameters of different SVR models are shown in Table 3. The prediction performance for 10 min details of the different models on the final validation dataset are listed in Table 4. The real humidity measurements on the validation set with the predicted values of each model are shown in Figure 4. According to the results of the comparison presented in Table 4, the evaluation indices of the proposed LightGBM-CGWO-SVR-based sheep barn humidity prediction model were better than those of the other prediction models. The minimum values of MAE, RMSE, MSE, and NRMSE of the model were 0.0662, 0.2284, 0.0521, and 0.0083, respectively, among all the models, and R 2 achieved the highest value of 0.9973. As shown in Figure 4, the prediction curve of the LightGBM-CGWO-SVR model was able to significantly fit the true value curve, indicating that the LightGBM-CGWO-SVR-based sheep barn humidity prediction model exhibited high accuracy and fitting performance.   For a single model, the prediction accuracy of the SVR model was higher than that of the GRU and XGBoost models. The MAE, RMSE, MSE, and NRMSE values of the SVR model were reduced by 76%, 30%, 51%, and 30%, respectively, compared with those of GRU, and 76%, 19%, 34%, and 19%, respectively, compared with those of XGBoost. Compared with GRU and XGBoost, the R 2 of SVR improved by 9% and 4%, respectively. This indicated that the SVR model exhibited good adaptability to sheep barn humidity data. The real observed values of the local humidity and the predicted values of the GRU, XGBoost, and SVR models are shown in Figure 5. 
The prediction curves of the SVR model were closer to the real values than those of the GRU and XGBoost models, particularly for the prediction curves of the GRU model, which fluctuated more than those of the XGBoost and SVR models. The values of the GRU and XGBoost models deviated more from the predicted value when the humidity value was approximately 85%, whereas SVR achieved better coverage. This might have occurred because GRU models exhibit issues of gradient disappearance or explosion for increasing amounts of data and model volume, and XGBoost suffers from the difficulty of tuning parameters to cope with high-dimensional nonlinear features. In contrast, SVR models, the computational complexity of which is not highly correlated with the dimensionality of the input parameters, demonstrate better generalization ability and prediction accuracy when dealing with high-dimensional data. For a single model, the prediction accuracy of the SVR model was higher than that of the GRU and XGBoost models. The MAE, RMSE, MSE, and NRMSE values of the SVR model were reduced by 76%, 30%, 51%, and 30%, respectively, compared with those of GRU, and 76%, 19%, 34%, and 19%, respectively, compared with those of XGBoost. Compared with GRU and XGBoost, the R 2 of SVR improved by 9% and 4%, respectively. This indicated that the SVR model exhibited good adaptability to sheep barn humidity data. The real observed values of the local humidity and the predicted values of the GRU, XGBoost, and SVR models are shown in Figure 5. The prediction curves of the SVR model were closer to the real values than those of the GRU and XGBoost models, particularly for the prediction curves of the GRU model, which fluctuated more than those of the XGBoost and SVR models. The values of the GRU and XGBoost models deviated more from the predicted value when the humidity value was approximately 85%, whereas SVR achieved better coverage. 
This might have occurred because GRU models exhibit issues of gradient disappearance or explosion for increasing amounts of data and model volume, and XGBoost suffers from the difficulty of tuning parameters to cope with high-dimensional nonlinear features. In contrast, SVR models, the computational complexity of which is not highly correlated with the dimensionality of the input parameters, demonstrate better generalization ability and prediction accuracy when dealing with high-dimensional data. When the SVR algorithm hyperparameters were globally searched and optimi using three different search algorithms (i.e., GA, GWO, and CGWO), the SVR algorit optimized using CGWO exhibited lower MAE, RMSE, MSE, and NRMSE values of 8 43%, 73%, and 48%, respectively, compared to the SVR algorithm optimized using G whereas R 2 was improved by 1%. Compared with the GWO-optimized SVR algorith the MAE, RMSE, MSE, and NRMSE values were reduced by 4%, 30%, 48%, and 2 respectively, and R 2 was improved by 0.4%. Thus, CGWO outperformed GA and GWO terms of optimization and could maintain a good balance between local and glo searches. Moreover, the real measurement values of humidity localized in the validat set were compared with the predicted values obtained from the GA-SVR, GWO-SVR, a CGWO-SVR models ( Figure 6). Using the optimization algorithm to perform a glo search of the SVR hyperparameters significantly improved the prediction performanc the SVR model, and the CGWO-optimized SVR model was able to fit the original cu more ideally. The result was closer to the true values in the prediction of peaks than models optimized by GA and GWO. Compared with GA, the GWO algorithm introdu nonlinear decreasing factors and random coefficient vectors to increase the randomn of the distance between other GW individuals and the head, which improved detection ability of the group. 
When the SVR hyperparameters were globally searched and optimized using three different search algorithms (i.e., GA, GWO, and CGWO), the SVR algorithm optimized using CGWO exhibited MAE, RMSE, MSE, and NRMSE values that were lower by 80%, 43%, 73%, and 48%, respectively, than those of the SVR algorithm optimized using GA, whereas R2 was improved by 1%. Compared with the GWO-optimized SVR algorithm, the MAE, RMSE, MSE, and NRMSE values were reduced by 4%, 30%, 48%, and 28%, respectively, and R2 was improved by 0.4%. Thus, CGWO outperformed GA and GWO in terms of optimization and could maintain a good balance between local and global searches. Moreover, the real measured humidity values in the validation set were compared with the predicted values obtained from the GA-SVR, GWO-SVR, and CGWO-SVR models (Figure 6). Using an optimization algorithm to perform a global search of the SVR hyperparameters significantly improved the prediction performance of the SVR model, and the CGWO-optimized SVR model fitted the original curve more closely. Its predictions of the peaks were closer to the true values than those of the models optimized by GA and GWO. Compared with GA, the GWO algorithm introduces nonlinear decreasing factors and random coefficient vectors to increase the randomness of the distance between the other grey wolf individuals and the head wolf, which improves the search ability of the pack.
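A simplified sketch may clarify how a chaos operator enters the grey wolf search. The version below is illustrative only: it uses a logistic map for chaotic initialization (one common way to add a chaos operator to GWO, assumed here rather than taken from the paper) and a toy sphere objective standing in for the SVR cross-validation error over the hyperparameters (C, epsilon, gamma).

```python
import random

def sphere(x):
    # Placeholder objective; in the paper's setting this would be the
    # SVR cross-validation error as a function of (C, epsilon, gamma).
    return sum(xi * xi for xi in x)

def cgwo(objective, dim, bounds, n_wolves=20, n_iter=100, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    # Chaotic (logistic-map) initialization instead of uniform sampling.
    z = 0.7
    wolves = []
    for _ in range(n_wolves):
        pos = []
        for _ in range(dim):
            z = 4.0 * z * (1.0 - z)          # logistic map with r = 4
            pos.append(lo + z * (hi - lo))
        wolves.append(pos)
    for t in range(n_iter):
        wolves.sort(key=objective)
        alpha, beta, delta = wolves[0], wolves[1], wolves[2]
        a = 2.0 * (1 - t / n_iter)           # decreasing coefficient
        new_wolves = []
        for pos in wolves:
            new_pos = []
            for d in range(dim):
                cands = []
                for leader in (alpha, beta, delta):
                    r1, r2 = rng.random(), rng.random()
                    A = 2 * a * r1 - a       # exploration/exploitation factor
                    C = 2 * r2               # random coefficient vector
                    D = abs(C * leader[d] - pos[d])
                    cands.append(leader[d] - A * D)
                x = sum(cands) / 3.0         # move toward the three leaders
                new_pos.append(min(hi, max(lo, x)))
            new_wolves.append(new_pos)
        wolves = new_wolves
    return min(wolves, key=objective)

best = cgwo(sphere, dim=3, bounds=(-5.0, 5.0))
```

Replacing the toy objective with an SVR cross-validation score would turn this sketch into a hyperparameter search of the kind described above.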
The results obtained in this study, by introducing chaos operators into the GWO algorithm to form the CGWO algorithm, further validated that chaos operators can overcome the GWO algorithm's tendency to fall into local optima and its low efficiency in path planning.

The data in Table 4 indicate that the MAE, RMSE, MSE, and NRMSE of the SVR model optimized by CGWO were reduced by 5%, 24%, 43%, and 24%, respectively, compared with those of XGBoost optimized by CGWO, which further demonstrated the superior fitting ability of SVR for nonlinear problems. The MAE, RMSE, MSE, and NRMSE of the LightGBM-CGWO-SVR model, obtained by further reducing the dimensionality of the data using LightGBM on top of the CGWO-SVR model, were reduced by 9%, 25%, 44%, and 26%, respectively, compared with those of CGWO-SVR. This demonstrated that feature extraction of sheep barn data using LightGBM could reduce the input dimensionality of the model, filter out redundant information and irrelevant features, reduce the size of the algorithm, and enhance the generalization performance of the model. In addition, the MAE, RMSE, MSE, and NRMSE of the LightGBM-SVR model, obtained by using LightGBM to reduce the dimensionality, were 25%, 3%, 6%, and 3% lower, respectively, than those of the single SVR model, which implied that using LightGBM to reduce the dimensionality of the data could improve the accuracy of the model. This was supported by the comparison of the local humidity predictions of the single SVR, CGWO-SVR, and LightGBM-CGWO-SVR models (Figure 7).
The error statistics of each model's predictions on the validation set compared with the actual values are shown in Figure 9, where each point represents the error statistic of a model's predicted humidity values against the actual values. Among the models, the LightGBM-CGWO-SVR model was closest to the arc on which the actual values lie and had the shortest distance from the actual values, indicating that it possessed the smallest root mean square error between the predicted and actual values. Moreover, the Taylor plot shows that the proposed model exhibited the highest correlation coefficient among the compared models.

In addition, the autocorrelation of the observed time-series data is often an important factor affecting prediction performance. Table 5 shows the accuracy of the proposed LightGBM-CGWO-SVR model on datasets with lag times of 10, 30, 60, and 90 min, respectively. The results showed that the accuracy of the model decreased slowly as the lag increased. Nevertheless, the observed humidity data with a lag of less than 100 min still carried sound information, and the dataset with a lag of 90 min also showed good prediction performance: its R2 reached 0.9940, which was only 0.39% less than that of the dataset with a lag of 10 min.
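The lagged datasets of Table 5 can be built by sliding a window over the humidity series. The sketch below shows one plausible construction; the mapping between lag minutes and window length is an assumption based on the 10-min sampling interval, not a detail stated in the text.

```python
def make_lagged_dataset(series, lag_steps, horizon=1):
    """Turn a time series into (input, target) pairs for supervised learning.

    `lag_steps` past observations predict the value `horizon` steps
    ahead; with 10-min sampling, lag_steps=9 and horizon=1 would cover
    a 90-min lag window (an assumed mapping, for illustration).
    """
    X, y = [], []
    for i in range(len(series) - lag_steps - horizon + 1):
        X.append(series[i:i + lag_steps])       # window of past values
        y.append(series[i + lag_steps + horizon - 1])  # future target
    return X, y
```

Training the same model on windows of increasing length (10-, 30-, 60-, 90-min lags) and comparing the resulting metrics reproduces the kind of comparison reported in Table 5.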

Discussion
In a previous related study, He and Ma [15] obtained good performance on the training set using a combined BPNN and PCA forecasting model, with an R2 of 0.9768 and an RMSE of 1.6745. However, the test-set R2 of 0.8842 showed that the generalization ability of the model was still insufficient. The RNN-LSTM prediction model developed by Hongkang et al. [11] achieved good performance for temperature prediction in a greenhouse, but it was not satisfactory for humidity prediction, with an R2 of only 0.82 for the dataset with a 10-min lag.
The proposed LightGBM-CGWO-SVR prediction model exhibited high prediction accuracy on the collected time-series data. It can help accurately predict humidity changes in sheep barn facilities at intervals of 10 min or even 90 min. Furthermore, it could effectively adapt to the nonlinear, high-dimensional humidity data of sheep barns and exhibited superior generalization ability and prediction accuracy compared to advanced algorithms such as the GRU and XGBoost models. The constructed combined model could thus be applied to predict and regulate air humidity in intensive sheep farming barns.
It should be noted that the data were acquired in February, when Xinjiang is in winter and very cold. For this reason, the sheep barn was not ventilated continuously but intermittently and at regular intervals; that is, a large part of the humidity data in this dataset was not significantly affected by ventilation. The ventilation factor should, of course, not be ignored when constructing models for the warm season, when the sheep barn would be frequently ventilated.

Conclusions
The final simulation results of the proposed LightGBM-CGWO-SVR model and a comparison with other advanced models implied the following points: (1) The environmental factors of the sheep barn were screened using LightGBM, discarding environmental variables with small effects on humidity. The key influencing factors of CO2 concentration, light intensity, air temperature, and PM2.5 were screened from the nine candidate variables (air humidity, CO2 concentration, PM2.5, PM10, light intensity, noise, TSP, NH3 concentration, and H2S concentration) and used for modeling, which effectively improved the computational efficiency and accuracy of the model. (2) The CGWO algorithm, obtained by introducing chaotic operators in this study, retained the advantages of the GWO algorithm, namely a simple structure, few control parameters, and straightforward implementation, while improving the global search capability of the traditional GWO algorithm and addressing the randomness and empiricism in the selection of SVR parameters. The optimized SVR algorithm exhibited higher prediction accuracy and better generalization ability than SVR algorithms optimized using GA and GWO.
The minimum values of the MAE, RMSE, MSE, and NRMSE indices were 0.0662, 0.2284, 0.0521, and 0.0083, respectively, and R2 achieved the highest value of 0.9973. The results indicated that the proposed model can provide effective guidance for the accurate prediction of humidity in sheep barns for Suffolk sheep in Xinjiang, and could serve as a reference for future research on the prediction of other environmental factors, such as NH3 and H2S concentrations.