Non-Contact Temperature Monitoring in Dairy Cattle via Thermal Infrared Imaging and Environmental Parameters

Zhao, Kaixuan; Ge, Shaojuan; Chen, Yinan; Li, Qianwen; Guo, Mengyun; Nian, Yue; Ren, Wenkai

doi:10.3390/agriculture16030306

by Kaixuan Zhao ¹,²,*, Shaojuan Ge ¹, Yinan Chen ¹, Qianwen Li ¹, Mengyun Guo ¹, Yue Nian ¹ and Wenkai Ren ¹

¹ College of Agricultural Equipment Engineering, Henan University of Science and Technology, Luoyang 471023, China
² Science & Technology Innovation Center for Completed Set Equipment, Longmen Laboratory, Luoyang 471023, China
* Author to whom correspondence should be addressed.

Agriculture 2026, 16(3), 306; https://doi.org/10.3390/agriculture16030306

Submission received: 4 December 2025 / Revised: 21 January 2026 / Accepted: 22 January 2026 / Published: 26 January 2026

(This article belongs to the Section Farm Animal Production)

Abstract

Core body temperature is a critical physiological indicator for assessing and diagnosing animal health status. In bovines, continuously monitoring this metric enables accurate evaluation of their physiological condition; however, traditional rectal measurements are labor-intensive and cause stress in animals. To achieve intelligent, contactless temperature monitoring in cattle, we proposed a non-invasive method based on thermal imaging combined with environmental data fusion. First, thermal infrared images of the cows’ faces were collected, and the You Only Look Once (YOLO) object detection model was used to locate the head region. Then, the YOLO segmentation network was enhanced with the Online Convolutional Re-parameterization (OREPA) and High-level Screening-feature Fusion Pyramid Network (HS-FPN) modules to perform instance segmentation of the eye socket area. Finally, environmental variables—ambient temperature, humidity, wind speed, and light intensity—were integrated to compensate for eye socket temperature, and a random forest algorithm was used to construct a predictive model of rectal temperature. The experiments were conducted using a thermal infrared image dataset comprising 33,450 frontal-view images of dairy cows with a resolution of 384 × 288 pixels, along with 1471 paired samples combining thermal and environmental data for model development. The proposed method achieved a segmentation accuracy (mean average precision, mAP_50–95) of 86.59% for the eye socket region, ensuring reliable temperature extraction. The rectal temperature prediction model demonstrated a strong correlation with the reference rectal temperature (R² = 0.852), confirming its robustness and predictive reliability for practical applications. These results demonstrate that the proposed method is practical for non-contact temperature monitoring of cattle in large-scale farms, particularly those operating under confined or semi-confined housing conditions.

Keywords:

dairy cows; rectal temperature monitoring; instance segmentation; machine learning

1. Simple Overview

Monitoring the body temperature of dairy cows is essential for assessing their health and welfare. Traditional rectal temperature measurement, although accurate, is time-consuming and can cause discomfort and stress in animals. To overcome these limitations, this study developed a contactless and intelligent method for estimating cow body temperature using thermal infrared imaging combined with environmental information. A deep learning model was applied to automatically detect and extract the eye socket region from thermal images, which closely reflects the cow’s internal temperature. By integrating ambient temperature, humidity, wind speed, and light intensity, a random forest model was built to predict the rectal temperature. The results showed that the predicted temperatures were strongly correlated with rectal measurements, confirming the reliability of the method. This approach offers a practical and animal-friendly tool for continuous temperature monitoring in modern dairy farms, contributing to precision livestock farming and improved animal welfare.

2. Introduction

Measuring physiological indicators in animals plays a crucial role in monitoring their welfare and health [1]. The physiological parameters of dairy cows reflect their physiological status and health condition, with core body temperature being the most representative [2]. Cows are homeothermic animals whose normal physiological functions depend on a relatively constant core body temperature [3]. Changes in core body temperature directly or indirectly reflect the cow’s physical condition. Rectal temperature is recognized as the primary physiological indicator for assessing an animal’s thermal balance and is clinically used to represent the cow’s core body temperature [4,5]. The normal rectal temperature range for dairy cows is 37.5–39.5 °C, showing regular variations associated with physiological activities such as estrus, ovulation, pregnancy, and parturition. Convenient, accurate, and effective monitoring of temperature changes not only aids in estrus detection, pregnancy diagnosis, and parturition prediction but also enables proactive disease monitoring, prevention, and control [6].

Traditional body temperature monitoring in livestock is primarily conducted manually, which is time-consuming and poses potential disease transmission risks [7]. Moreover, this approach requires directly contacting or physically restraining the animals, which may induce stress responses, thereby affecting the stability and accuracy of physiological measurements [8].

Temperature monitoring methods are generally classified into contact and non-contact approaches. Contact-based monitoring in cattle mainly relies on high-precision sensors to capture temperature from specific body regions [9]. Common implementations include subcutaneous or intramuscular implantation or the insertion of temperature loggers into the rectum [10], vagina [11], subcutaneous tissue near the neck [12], or behind the ear [13]. When implanted beneath the skin or within the vagina, these devices typically provide higher measurement accuracy and sensitivity to minor temperature changes. For instance, Kou et al. [14] designed a device using a thermistor sensor enclosed in a protective casing, which was attached to the metatarsal region of dairy cows to enable automatic surface temperature monitoring.

However, contact-based approaches are often limited by poor animal compliance and the risk of stress responses during use. In addition, complex rearing environments and external disturbances can damage the sensors and lead to abnormal readings. Consequently, developing non-contact and intelligent temperature detection technologies has gained interest for achieving accurate and stress-free monitoring in precision livestock farming.

Infrared thermal imaging (IRT) is emerging as the primary method for measuring an animal’s body surface temperature because it is non-contact, rapid, and capable of real-time monitoring [15]. M. Z. Lu et al. [16] collected 600 datasets from 20 piglets using IRT. They employed a Support Vector Machine (SVM) classifier combined with contour features to identify the ear base region and extracted the highest temperature within this region as the ear base temperature, enabling automated measurement from top-view thermal images. Similarly, D. He et al. [17] captured lateral thermal images of dairy cows and proposed an automated eye temperature detection method based on a skeletal tree model, achieving a mean error of 0.35 °C in estimating eye socket temperature. These studies demonstrate the potential of IRT for non-invasive and automated body temperature monitoring in livestock.

Nevertheless, IRT measurements are highly susceptible to environmental factors. Gloster et al. [18] showed that ambient temperature affected infrared readings of cattle hooves, with the largest variation occurring under lower temperatures. Church et al. [19] found that humidity had minimal impact at a 1 m distance, but became significant at greater distances and higher temperatures. Additionally, wind speeds of 12 km/h introduced an error of 0.78 °C, while direct solar radiation caused a 0.6 °C difference between eyes over 30 min compared to shaded conditions. These findings indicate that airflow, solar radiation, and other environmental factors can directly influence surface temperature measurements.

Accurate assessment of core body temperature relies on reliable reference indicators. Rectal temperature correlates closely with core body temperature [20]. To enhance the accuracy and convenience of non-contact rectal temperature detection, this study proposes a method for detecting cow rectal temperature by integrating thermal imaging with environmental factor compensation. Thermal infrared images of the facial region are acquired, and deep learning techniques are employed to localize and segment the eye region. By combining the segmented regions with a temperature matrix and accounting for environmental factors—including ambient temperature, humidity, wind speed, and light intensity—this approach enables automatic extraction of eye temperature and precise prediction of cattle rectal temperature. The contributions of this paper are as follows.

(1): A thermal imaging and environmental parameter acquisition platform for dairy cow heads has been established, laying the foundation for subsequent body temperature prediction research.
(2): A cascaded deep learning approach for segmenting the cow’s eye region was proposed to reduce the influence of environmental conditions and animal movement on eye socket localization.
(3): A non-contact method for predicting rectal temperature has been developed by fusing thermal imaging with environmental data, aimed at achieving precise estimation of dairy cows’ rectal temperature under real-world farming conditions.

3. Materials and Methods

3.1. Data Acquisition

The experimental data were collected on 10 July 2025 at Shengsheng Ranch in Luoyang City, a region experiencing a temperate continental monsoon climate. The annual average wind speed excluding calm periods is 3.2 m per second, annual average temperature ranges from 12.2 to 24.6 °C, annual precipitation is 528–800 mm, and annual sunshine duration reaches 2200–2300 h, with annual relative humidity maintained between 60 and 70%. The farm employs open-style barns with barred-style rearing. The experimental barn is a covered free-stall structure, featuring a fully roofed design with open sides that allow natural ventilation. This semi-enclosed configuration effectively shields the animals from direct solar radiation; therefore, solar radiation was not included as an independent environmental variable in this study. The ambient temperature, relative humidity, wind speed, and light intensity were continuously monitored to characterize the microclimatic conditions within the barn. A comprehensive spray system combining fans and sprinklers was installed in the feeding area, cycling every 4–5 min. Each cycle included approximately 30 s of spraying to fully wet the cows’ backs, followed by fan drying to achieve temperature reduction.

The test subjects were 28 Holstein dairy cows, aged 3.25–7.33 years (mean 4.67 ± 1.18 years), with body weights ranging from 728 to 832 kg (mean 788.35 ± 31.88 kg). At the time of the study, 21 cows were in lactation and 7 cows were in the dry period. When cows extended their heads outside the stall bars to feed, a headlock feeding system restrained them to prevent excessive head movement. A MAG32 thermal camera (Shanghai Magnity Technologies Co., Ltd., Shanghai, China) was used to capture thermal imaging videos of the dairy cows. An STM32 development board (STMicroelectronics N.V., Plan-les-Ouates, Switzerland), an MB016 temperature/humidity sensor (Guangzhou Xingyi Electronic Technology Co., Ltd., Guangzhou, China), a BH1750 light sensor (Rohm Semiconductor, Kyoto, Japan), and a WS3054 ultrasonic anemometer (Chengdu SenTec Technology Co., Ltd., Chengdu, China) together formed the environmental parameter acquisition device. This device recorded barn conditions, including temperature, humidity, light intensity, and wind speed, during data collection. A computer collected the sensor data, while farm staff measured rectal temperature using Youmu veterinary electronic thermometers (Zhengzhou Youmu Agricultural Technology Co., Ltd., Zhengzhou, China) inserted into the cows’ rectums. The basic parameters of the sensors are shown in Table 1.

This experiment treated the collection of thermal infrared video, rectal temperature, and concurrent environmental data (ambient temperature, relative humidity, light intensity, and wind speed) from one cow as a single data acquisition task. The thermal camera and environmental data logger were mounted 2 m from the stall rail and 0.8 m above the ground. During the cow’s feeding period, a thermal imaging video of the animal’s face was recorded, with the imager’s emissivity set to 0.98. Environmental data were recorded and rectal temperature was measured concurrently with the video capture. The ear tag number of each cow was recorded so that every dataset could be linked to its specific cow ID. The data acquisition device is shown in Figure 1. In total, 28 datasets were collected. The thermal imaging video for each cow lasted approximately 1 min and was recorded at 25 frames per second (fps) with a resolution of 384 × 288 pixels (horizontal × vertical), totaling 42,000 frames. The environmental parameter collection device operated at 6 fps, gathering a total of 10,080 environmental data points.
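The reported totals follow directly from the recording parameters. The short check below is only a sketch: it assumes each video lasted exactly 60 s, whereas the text says "approximately 1 min", and the constant and function names are illustrative.

```python
# Recording parameters stated in the text.
N_COWS = 28
VIDEO_SECONDS = 60   # assumption: "approximately 1 min" taken as exactly 60 s
THERMAL_FPS = 25     # thermal camera frame rate
ENV_RATE = 6         # environmental samples per second ("6 fps" in the text)


def total_samples(n_cows: int, seconds: int, rate: int) -> int:
    """Number of samples recorded across all cows at a fixed rate."""
    return n_cows * seconds * rate


thermal_frames = total_samples(N_COWS, VIDEO_SECONDS, THERMAL_FPS)
env_points = total_samples(N_COWS, VIDEO_SECONDS, ENV_RATE)
print(thermal_frames, env_points)  # 42000 10080
```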

3.2. Dataset Construction

The thermal camera was connected to the computer through Ethernet, transmitting and storing the captured thermal imaging video on the computer. Meanwhile, the environmental parameter acquisition device transmitted environmental data to the computer through a serial port controlled by a microcontroller.

3.2.1. Preprocessing of Thermal Imaging Data

The captured thermal infrared video was analyzed using ThermoX (v2.5.4), the software supplied with the thermal camera. The pseudo-color temperature scale on the right side of the video was found to use a dynamic adaptive mapping: its displayed temperature range adjusts in real time to the highest and lowest temperatures within the current frame, so the same color can represent different temperatures in different frames. To eliminate the visual errors this causes, we fixed the temperature range of the color scale so that color changes reflect only actual temperature variations, improving the quantitative interpretability of the images and the stability of data analysis. The specific steps of this process are shown in Figure 2, and the remapped thermal image of the cow’s head region is shown in Figure 3.
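The difference between dynamic and fixed-range mapping can be illustrated with a minimal sketch. The fixed bounds of 20 °C and 40 °C, the 0–255 gray levels standing in for the pseudo-color palette, and the tiny 2 × 2 frames are all assumptions for illustration; the paper does not state the range it chose.

```python
FIXED_MIN_C = 20.0  # hypothetical lower bound of the fixed scale
FIXED_MAX_C = 40.0  # hypothetical upper bound of the fixed scale


def to_level(temp_c, t_min, t_max):
    """Map a temperature to a 0-255 level on [t_min, t_max], clipping outliers."""
    if t_max <= t_min:
        raise ValueError("t_max must exceed t_min")
    ratio = (temp_c - t_min) / (t_max - t_min)
    ratio = min(1.0, max(0.0, ratio))
    return round(ratio * 255)


def remap_dynamic(frame):
    """Per-frame scaling: the range follows this frame's own min and max."""
    lo = min(min(row) for row in frame)
    hi = max(max(row) for row in frame)
    return [[to_level(t, lo, hi) for t in row] for row in frame]


def remap_fixed(frame, t_min=FIXED_MIN_C, t_max=FIXED_MAX_C):
    """Fixed scaling: the same temperature always maps to the same level."""
    return [[to_level(t, t_min, t_max) for t in row] for row in frame]


# Two illustrative frames that both contain a 36.0 degC pixel at [0][1].
frame_a = [[30.0, 36.0], [33.0, 38.0]]
frame_b = [[25.0, 36.0], [30.0, 40.0]]

dyn_a, dyn_b = remap_dynamic(frame_a)[0][1], remap_dynamic(frame_b)[0][1]
fix_a, fix_b = remap_fixed(frame_a)[0][1], remap_fixed(frame_b)[0][1]
```

With dynamic scaling the 36.0 °C pixel receives a different level in each frame, while the fixed scale assigns it the same level in both, which is the property the remapping step relies on.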

3.2.2. Dataset

When creating the thermal imaging dataset, the remapped pseudo-color images were first manually inspected to remove those where the left or right eye socket area was not visible due to head movement. This process yielded a final dataset of 33,450 thermal infrared images. The head and eye socket regions of cows were manually annotated in thermal infrared images; examples of annotated images are shown in Figure 4. The dataset was then randomly divided into training, validation, and test sets at a 7:2:1 ratio.
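A random 7:2:1 split of this kind can be sketched as follows. The seed and function name are illustrative assumptions, and integer arithmetic is used for the subset sizes so that rounding does not shift the boundaries.

```python
import random


def split_indices(n_items, parts=(7, 2, 1), seed=0):
    """Shuffle item indices and cut them into train/val/test by integer ratio."""
    total = sum(parts)
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = n_items * parts[0] // total
    n_val = n_items * parts[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(33450)
print(len(train_idx), len(val_idx), len(test_idx))  # 23415 6690 3345
```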

During the construction of the rectal temperature prediction dataset, the mismatch in frame rates between the environmental monitoring device and the thermal infrared camera made frame-by-frame alignment infeasible. Thermal infrared data were therefore synchronized with environmental parameters at one-second intervals. The environmental monitoring device recorded ambient temperature, relative humidity, wind speed, and light intensity at 6 fps, and the per-second averages of these parameters were used for synchronization. For thermal data, the first frame within each second was selected, and the mean eye-region temperature in that frame was extracted as the model input variable, with the rectal temperature as the output target. Each dataset record thus represented one second of data, including the averaged environmental parameters, the mean eye-region temperature, and the corresponding rectal temperature. The final dataset, collected from 28 Holstein dairy cows, comprised 1471 records and was divided by cow identity into training and validation subsets at an 8:2 ratio to ensure individual independence.
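The one-second alignment described above can be sketched in a few lines. The record layout, field names, timestamps measured in seconds from the start of recording, and the sample values are assumptions made for illustration, not the authors' data format.

```python
from collections import defaultdict

ENV_KEYS = ("ambient_c", "humidity_pct", "wind_ms", "lux")  # illustrative names


def per_second_env_mean(env_samples):
    """env_samples: list of (timestamp_s, reading dict). Returns {second: mean dict}."""
    buckets = defaultdict(list)
    for t, reading in env_samples:
        buckets[int(t)].append(reading)
    return {
        sec: {k: sum(r[k] for r in rs) / len(rs) for k in ENV_KEYS}
        for sec, rs in buckets.items()
    }


def first_frame_per_second(frames):
    """frames: time-ordered list of (timestamp_s, mean eye temperature)."""
    first = {}
    for t, eye_temp in frames:
        first.setdefault(int(t), eye_temp)  # keep only the first frame of each second
    return first


def build_records(cow_id, rectal_c, env_samples, frames):
    """One record per second that has both environmental and thermal data."""
    env = per_second_env_mean(env_samples)
    eye = first_frame_per_second(frames)
    return [
        {"cow_id": cow_id, "second": s, "eye_c": eye[s], **env[s], "rectal_c": rectal_c}
        for s in sorted(env.keys() & eye.keys())
    ]


def split_by_cow(records, val_cow_ids):
    """Keep every record of a cow on the same side of the split."""
    train = [r for r in records if r["cow_id"] not in val_cow_ids]
    val = [r for r in records if r["cow_id"] in val_cow_ids]
    return train, val


# Two seconds of made-up data: 6 environmental samples/s and 25 thermal frames/s.
env_samples = [
    (i / 6, {"ambient_c": 30.0 + (i // 6), "humidity_pct": 60.0, "wind_ms": 1.0, "lux": 200.0})
    for i in range(12)
]
frames = [(i / 25, 36.0 + 0.01 * i) for i in range(50)]
records = build_records("cow_A", 38.6, env_samples, frames)
train_records, val_records = split_by_cow(records, {"cow_A"})
```

Splitting on `cow_id` rather than on individual records keeps near-identical consecutive seconds from the same animal out of both subsets at once, which is the independence the text refers to.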

To provide a clear overview of the complete methodology, including data acquisition, eye socket localization, eye-region temperature extraction, and rectal temperature prediction using the random forest model, an overall flow diagram is presented in Figure 5.

3.3. Eye Socket Detection Based on Cascade Deep Learning

Deep learning is a branch of artificial intelligence that learns high-level features from large datasets, which allows it to outperform traditional machine learning in many tasks and has led to its widespread application in agricultural breeding [21]. In this study, infrared thermal imaging was combined with deep learning to automatically segment key temperature measurement regions in dairy cows and extract the corresponding surface temperatures. Because cows’ feeding positions are uncertain, thermal infrared images often captured multiple individuals; direct eye segmentation could therefore lead to mismatches between eye temperature and individual identity. To ensure accurate temperature extraction, a cascading strategy was implemented, in which the cow head region was first detected and the eye socket area was subsequently segmented within the detected head image.

3.3.1. Cow Head Detection Based on YOLO

The YOLO (You Only Look Once) series is a family of end-to-end object detection models based on deep convolutional neural networks [22]. By transforming the detection task into a single regression problem, these models achieve fast and accurate object localization and classification [23]. The YOLOv8 network, released by Ultralytics (Frederick, MD, USA), is the first to integrate object detection, instance segmentation, and image classification tasks. Building upon YOLOv5, it introduces the C2f module to replace the C3 structure and employs a multi-scale feature fusion mechanism to enhance detection performance for objects of varying sizes. During training, the model incorporates diverse data augmentation strategies, including random scaling, cropping, flipping, and color perturbations. It also adopts the approach, introduced in YOLOX, of deactivating Mosaic augmentation in the later training stages to improve accuracy [24]. Considering detection accuracy, inference speed, and generalization capability together, we selected YOLOv8 as the detection model in this study.

The YOLOv8 network was then trained on the annotated dataset with a batch size of 16 for 240 epochs, using 8 parallel threads for data loading. To ensure detection performance and convergence stability, parameters were updated with the Stochastic Gradient Descent (SGD) optimizer, using an initial learning rate of 0.01 and a weight decay of 0.0005.
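For readers who want to reproduce a comparable setup, the stated hyperparameters map onto the training arguments of the Ultralytics Python package roughly as sketched below. This is an assumption about tooling, not the authors' script: the dataset YAML path and the `yolov8n.pt` starting weights are hypothetical, since the text does not name the model variant or file layout used for head detection.

```python
# Hyperparameters stated in the text, keyed by Ultralytics training argument names.
TRAIN_KWARGS = {
    "epochs": 240,
    "batch": 16,
    "workers": 8,            # parallel data-loading threads
    "optimizer": "SGD",
    "lr0": 0.01,             # initial learning rate
    "weight_decay": 0.0005,
}


def train_head_detector(data_yaml, weights="yolov8n.pt"):
    """Train a head detector; needs the ultralytics package (imported lazily).

    data_yaml and weights are hypothetical placeholders for illustration.
    """
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(data=data_yaml, **TRAIN_KWARGS)
```

Calling `train_head_detector("cow_head.yaml")` would start training only if the package and a dataset description file with that (hypothetical) name are available.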

3.3.2. Segmentation of the Eye Socket Based on OH-YOLO

Thermal infrared images were first processed using the trained detection model to locate and crop the cow head region based on the predicted bounding boxes, generating images containing only the head. Cropped head images were then manually screened to remove accidental inclusions of other cows, ensuring that the extracted eye regions corresponded accurately to individual identities.

Eye socket segmentation was performed using an optimized and lightweight model, OH-YOLO, developed based on the YOLOv8n-seg framework. In OH-YOLO, conventional convolutional blocks in the backbone were replaced with Online Convolutional Re-parameterization (OREPA) modules to reduce parameter redundancy and improve efficiency, while the High-level Screening-feature Fusion Pyramid Network (HS-FPN) was introduced in the neck to enhance multi-scale feature fusion and segmentation robustness. The overall network architecture of OH-YOLO is shown in Figure 6.

OREPA Module

OREPA (Online Convolutional Re-parameterization) (structure shown in Figure 7) is a two-stage training framework designed for structural reparameterization models, simplifying complex multi-branch modules during training into a single convolutional operation to significantly reduce computational and memory overhead [25].

The overall OREPA process consists of two phases: the first is Block Linearization, which utilizes a specialized linear scaling layer to optimize the performance of online blocks; the second is Block Squeezing, which leverages the linear additivity and associativity of convolutions to merge multi-layer, multi-branch linear structures into a single equivalent convolutional kernel (OREPA Conv). This significantly reduces feature-level computations and memory consumption.
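The linearity that Block Squeezing relies on can be checked numerically. The sketch below is a deliberately reduced case, not the OREPA implementation: it uses a single channel, two parallel branches (a 3 × 3 kernel and a 1 × 1 kernel zero-padded to 3 × 3), and one scalar scale per branch standing in for the linear scaling layers. All values are made up.

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation of a 2D list with a square kernel."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(k) for b in range(k))
            for j in range(w - k + 1)
        ]
        for i in range(h - k + 1)
    ]


def pad_1x1_to_3x3(weight):
    """Embed a 1x1 kernel at the centre of a 3x3 kernel of zeros."""
    return [[0.0, 0.0, 0.0], [0.0, weight, 0.0], [0.0, 0.0, 0.0]]


def squeeze(branches):
    """branches: list of (scale, 3x3 kernel). Returns one equivalent 3x3 kernel."""
    return [
        [sum(s * k[a][b] for s, k in branches) for b in range(3)]
        for a in range(3)
    ]


image = [[float(4 * r + c + 1) for c in range(4)] for r in range(4)]
branches = [
    (0.5, [[1.0, 0.0, -1.0], [2.0, 0.0, -2.0], [1.0, 0.0, -1.0]]),
    (2.0, pad_1x1_to_3x3(3.0)),
]

# Multi-branch form: run every branch, scale it, then add the outputs.
branch_outputs = [conv2d(image, k) for _, k in branches]
multi_branch = [
    [sum(s * out[i][j] for (s, _), out in zip(branches, branch_outputs)) for j in range(2)]
    for i in range(2)
]

# Squeezed form: a single convolution with the merged kernel.
single_conv = conv2d(image, squeeze(branches))
```

Both forms give the same output, but the squeezed form stores and computes only one kernel, which is where the savings in feature-level computation and memory come from.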

HS-FPN Module

HS-FPN [26], illustrated in Figure 8, consists of two components: feature selection and feature fusion. The feature selection module comprises two key components: Channel Attention (CA) and Dimension Matching (DM). The CA module processes the input feature map $f_{in} \in \mathbb{R}^{C \times H \times W}$ (where $C$, $H$, and $W$ denote the number of channels, height, and width, respectively) by employing both global average pooling and global max pooling to compute the average and maximum values for each channel. DM performs channel compression using a 1 × 1 convolution to align the channel dimensions of the multi-scale feature maps.

The feature fusion module primarily comprises a Selective Feature Fusion (SFF) mechanism, which employs high-level features as weights to filter out important semantic information embedded within low-level features, thereby enabling strategic feature fusion [27]. Given a high-level feature $f_{high} \in \mathbb{R}^{C \times H \times W}$ and a low-level feature $f_{low} \in \mathbb{R}^{C \times H_{1} \times W_{1}}$, the former is first up-sampled and aligned via transposed convolution and bilinear interpolation to produce $f_{att} \in \mathbb{R}^{C \times H_{1} \times W_{1}}$. The aligned high-level feature is converted into attention weights through the CA module, which are then applied to filter the low-level features. The refined low-level features are fused with the aligned high-level feature, yielding the output feature $f_{out} \in \mathbb{R}^{C \times H_{1} \times W_{1}}$. The fusion process can be formulated as follows:

$$f_{att} = \mathrm{BL}\left(\mathrm{T\text{-}Conv}(f_{high})\right) \tag{1}$$

$$f_{out} = f_{low} * \mathrm{CA}(f_{att}) + f_{att} \tag{2}$$

In the equations, $f_{high}$ and $f_{low}$ denote the input high-level and low-level features; $\mathrm{T\text{-}Conv}$ represents the transposed convolution operation; $\mathrm{BL}$ indicates bilinear interpolation; $f_{att}$ corresponds to the transformed high-level feature after processing; and $f_{out}$ denotes the output feature map after fusion.
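The fusion step in Equation (2) can be sketched on small nested lists. Two simplifications are assumed here: the up-sampling of Equation (1) is skipped, so `f_att` is taken as already aligned with `f_low`, and the channel attention combines the pooled average and maximum by summing them and applying a sigmoid. The text does not specify how CA combines the two statistics, so that choice is purely illustrative.

```python
import math


def channel_attention(feature):
    """feature: list of C channels, each an H x W list. Returns one weight per channel."""
    weights = []
    for channel in feature:
        values = [v for row in channel for v in row]
        avg = sum(values) / len(values)  # global average pooling
        peak = max(values)               # global max pooling
        # Assumed combination: sigmoid of the summed statistics.
        weights.append(1.0 / (1.0 + math.exp(-(avg + peak))))
    return weights


def selective_feature_fusion(f_low, f_att):
    """f_out = f_low * CA(f_att) + f_att, applied channel by channel."""
    w = channel_attention(f_att)
    return [
        [
            [lo * w[c] + hi for lo, hi in zip(row_low, row_att)]
            for row_low, row_att in zip(f_low[c], f_att[c])
        ]
        for c in range(len(f_low))
    ]


# One channel, 1 x 2 spatial size; an all-zero f_att gives a weight of 0.5.
f_low = [[[2.0, 4.0]]]
f_att = [[[0.0, 0.0]]]
f_out = selective_feature_fusion(f_low, f_att)
print(f_out)  # [[[1.0, 2.0]]]
```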

3.3.3. Detection Model Evaluation Metrics

To comprehensively evaluate the performance of these models, this study analyzes them along two dimensions: accuracy metrics and efficiency metrics. Accuracy was assessed with AP (average precision) and mAP (mean average precision). AP measures detection accuracy for a single class as the area under the precision–recall curve, while mAP averages AP across all classes. $AP_{50}$ considers a detection correct when the Intersection over Union (IoU) exceeds 0.5, and $mAP_{50\text{-}95}$ averages AP across IoU thresholds from 0.50 to 0.95 in steps of 0.05, reflecting detection and localization performance under varying overlap requirements. The two indicators are calculated as shown in Equations (3) and (4).

$$AP = \int_{0}^{1} P(R)\, dR \tag{3}$$

$$mAP = \frac{1}{n} \sum_{i=1}^{n} AP_{i} \tag{4}$$

In the formulas, $P$ represents the precision value of the detection result, $R$ denotes the recall rate, $n$ is the total number of categories in the dataset, and $AP_{i}$ is the average precision for category $i$.
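The quantities behind these metrics can be sketched briefly. The example uses axis-aligned boxes for IoU (segmentation evaluation applies the same ratio to masks) and a plain trapezoidal rule for the area under the precision–recall curve; common evaluation toolkits use interpolated precision instead, so this is an approximation for illustration, and the curve values are made up.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix1, iy1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(recalls, precisions):
    """Trapezoidal area under P(R); recalls must be sorted in increasing order."""
    area = 0.0
    for i in range(1, len(recalls)):
        area += (recalls[i] - recalls[i - 1]) * (precisions[i] + precisions[i - 1]) / 2
    return area


# IoU thresholds used by mAP50-95: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [0.50 + 0.05 * i for i in range(10)]

overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))                   # 1 / 7
ap = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.5])    # 0.875
```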

Efficiency was evaluated using the number of parameters (Params), computational complexity (GFLOPs), model size, and image processing time per frame. Params indicate model structural complexity, GFLOPs quantify floating-point operations during inference, model size reflects storage demands, and processing time measures real-time capability. Together with accuracy metrics, these indicators provide a comprehensive assessment of the model’s detection performance, computational efficiency, and deployment feasibility.

3.4. Temperature Extraction in the Eye Socket Area

In Section 3.2.1, during the thermal imaging preprocessing stage, we obtained the raw temperature matrix for each frame. Each element of this matrix corresponds one-to-one with the temperature value of a single pixel in the image. Therefore, when extracting temperature data from the eye socket region of dairy cows, the coordinates $(X_{pre}, Y_{pre})$ obtained from the segmentation of the eye socket region can be mapped to the indices in the temperature matrix. By extracting the corresponding temperature values, the average temperature of the eye region can be obtained.

However, to ensure segmentation accuracy, the image was cropped before the eye socket region was located, using the head region bounding box predicted by the detection model, specifically its top-left corner coordinates $(X_{0}, Y_{0})$ and bottom-right corner coordinates $(X_{1}, Y_{1})$. Eye socket segmentation was then performed on the cropped head image, so the resulting segmentation coordinates are expressed in the coordinate system of the cropped image. To achieve a one-to-one correspondence between segmented coordinates and temperature matrix indices, the cropping offset was added to convert the coordinates back to the original image coordinate system, as calculated using Equations (5) and (6). Finally, using the restored coordinates $(X_{global}, Y_{global})$ of the eye socket region, the corresponding pixel temperature values were extracted from the temperature matrix, and their mean was taken as the temperature of the eye socket region.

$$X_{global} = X_{pre} + X_{0} \tag{5}$$

$$Y_{global} = Y_{pre} + Y_{0} \tag{6}$$

In the equations, $(X_{pre}, Y_{pre})$ represents the coordinates of the segmented eye socket in the cropped image, $(X_{0}, Y_{0})$ denotes the upper-left corner coordinates of the head detection box, and $(X_{global}, Y_{global})$ indicates the coordinates of the eye socket region within the original thermal infrared image.
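Equations (5) and (6) and the subsequent averaging can be sketched as follows. The tiny temperature matrix, the head-box corner, and the eye pixel list are made-up values, and the convention that a point (x, y) indexes the matrix as `[row y][column x]` is an assumption about the data layout.

```python
def to_global(points_pre, box_top_left):
    """Shift crop-relative (x, y) points by the head box's top-left corner."""
    x0, y0 = box_top_left
    return [(x + x0, y + y0) for x, y in points_pre]


def mean_region_temperature(temp_matrix, points_global):
    """temp_matrix[row][col] holds one temperature per pixel; points are (x, y)."""
    values = [temp_matrix[y][x] for x, y in points_global]
    return sum(values) / len(values)


# Illustrative 4 x 5 temperature matrix (degC) for one frame.
temp_matrix = [
    [30.0, 30.0, 30.0, 30.0, 30.0],
    [30.0, 30.0, 36.0, 37.0, 30.0],
    [30.0, 30.0, 30.0, 38.0, 30.0],
    [30.0, 30.0, 30.0, 30.0, 30.0],
]
head_top_left = (2, 1)                 # (X0, Y0) of the detected head box
eye_pre = [(0, 0), (1, 0), (1, 1)]     # eye pixels in crop coordinates

eye_global = to_global(eye_pre, head_top_left)
eye_temp = mean_region_temperature(temp_matrix, eye_global)
print(eye_global, eye_temp)  # [(2, 1), (3, 1), (3, 2)] 37.0
```

Skipping the offset would read pixels (0, 0), (1, 0), and (1, 1) of the full frame, which in this example are background values, so the shift is what ties the mask to the right temperatures.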

3.5. Establishment of the Rectal Temperature Prediction Model

Since thermal imaging technology is based on the radiative transfer relationship between the infrared emission intensity of an object and its surface temperature, its measurements are inevitably influenced by environmental conditions [28]. Specifically, ambient temperature and wind speed affect the surface heat exchange between the animal and its surroundings, while humidity and light intensity influence the transmission of infrared radiation and image quality. Therefore, based on the average eye socket temperature extracted from the thermal matrix, we further incorporated environmental parameters (including ambient temperature, relative humidity, wind speed, and light intensity) as additional input features into the random forest model to minimize environmental interferences. By learning the relationships between environmental variations and eye socket temperature, the model compensates for measurement deviations caused by fluctuating environmental conditions, thereby improving the accuracy and stability of body temperature prediction.

To further quantify the relationships between each environmental parameter and eye socket temperature, a correlation analysis was conducted using Statistical Package for the Social Sciences (SPSS) software (v26.0). Pearson’s correlation coefficients were calculated for each pair of variables to assess their linear associations [29]. This analysis provides additional statistical evidence of how environmental factors may influence eye socket temperature measurements and their potential contribution to the random forest prediction model.
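Pearson's coefficient is simple enough to compute by hand, which can help when checking statistical software output. The sketch below uses made-up ambient and eye socket temperatures that are exactly linear, so the coefficient is 1 by construction; it does not reproduce any result from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Illustrative values only: eye temperature rising linearly with ambient temperature.
ambient_c = [25.0, 27.0, 29.0, 31.0]
eye_c = [35.0, 35.4, 35.8, 36.2]
r_linear = pearson_r(ambient_c, eye_c)
r_inverse = pearson_r([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```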

3.5.1. Random Forest Model

Given that rectal temperature prediction is a regression problem, we employ random forest (RF) as the predictive model. RF is a supervised ensemble learning method whose basic unit is the decision tree, a simple predictive model that partitions the input data space into output regions [30]. The prediction for an output region of a decision tree is the average value of the response variable of the training samples that fall within that region. RF is a forest composed of multiple decision trees. It generates multiple training subsets through random sampling of the training data and constructs each decision tree independently. It then aggregates the tree predictions, by voting for classification or by averaging for regression, to generate the final outcome. Taking advantage of the diversity among the decision trees, which reduces the variance of the combined prediction, RF demonstrates remarkable robustness and generalizability [31]. A basic schematic of the RF model for rectal temperature prediction is shown in Figure 9.
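The two ideas behind RF regression, bootstrap sampling and averaging over trees, can be shown with a toy ensemble. This is not a full random forest: each tree here is a depth-1 split (a stump) on one randomly chosen feature, whereas a real RF grows deeper trees and samples features at every split. The feature order (eye, ambient, humidity, wind, light) and every number below are invented for illustration.

```python
import random


def fit_stump(rows, targets, feature):
    """Best single-threshold split on one feature, chosen by squared error."""
    best = None
    values = sorted(set(r[feature] for r in rows))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [t for r, t in zip(rows, targets) if r[feature] <= thr]
        right = [t for r, t in zip(rows, targets) if r[feature] > thr]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((t - left_mean) ** 2 for t in left) + sum((t - right_mean) ** 2 for t in right)
        if best is None or sse < best[0]:
            best = (sse, thr, left_mean, right_mean)
    if best is None:  # feature is constant in this sample: predict the mean
        mean = sum(targets) / len(targets)
        return (feature, float("inf"), mean, mean)
    return (feature, best[1], best[2], best[3])


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Fit stumps on bootstrap resamples, each using one random feature."""
    rng = random.Random(seed)
    n, n_features = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(n_features)
        forest.append(fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feature))
    return forest


def predict(forest, row):
    """Average the per-tree leaf means (regression aggregation)."""
    preds = [lm if row[f] <= thr else rm for f, thr, lm, rm in forest]
    return sum(preds) / len(preds)


# Made-up samples: (eye degC, ambient degC, humidity %, wind m/s, light lux).
rows = [
    (35.1, 28.0, 62.0, 0.5, 150.0), (35.6, 29.5, 60.0, 1.2, 180.0),
    (36.0, 31.0, 58.0, 0.8, 220.0), (36.4, 32.5, 55.0, 2.0, 260.0),
    (35.3, 27.0, 65.0, 1.5, 140.0), (36.8, 33.0, 54.0, 0.3, 300.0),
    (35.9, 30.0, 59.0, 1.0, 200.0), (36.2, 31.5, 57.0, 1.8, 240.0),
]
targets = [38.2, 38.4, 38.7, 38.9, 38.3, 39.2, 38.6, 38.8]  # made-up rectal degC

forest = fit_forest(rows, targets)
estimate = predict(forest, (36.1, 31.0, 58.0, 1.0, 220.0))
```

Because every leaf value is a mean of training targets, the averaged estimate always stays within the range of rectal temperatures seen during training, one reason tree ensembles behave conservatively on unusual inputs.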

3.5.2. Model Evaluation Metrics

After establishing the predictive model, its performance was evaluated using the mean square error (MSE), mean absolute error (MAE), and coefficient of determination (R²). MSE represents the mean squared difference between predicted and actual values; MAE measures the average absolute error of the model predictions; and R² reflects the degree of fit between the predicted and actual values, with values closer to 1 indicating a better fit.

$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left(T_{i} - \hat{T}_{i}\right)^{2} \tag{7}$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} \left|T_{i} - \hat{T}_{i}\right| \tag{8}$$

$$R^{2} = 1 - \frac{\sum_{i=1}^{n} \left(T_{i} - \hat{T}_{i}\right)^{2}}{\sum_{i=1}^{n} \left(T_{i} - \bar{T}\right)^{2}} \tag{9}$$

In the formulas, $n$ denotes the number of rectal temperatures in the test set; $T_{i}$ represents the true value of the i-th rectal temperature in the test set; $\hat{T}_{i}$ denotes the predicted value of the i-th rectal temperature in the test set; and $\bar{T}$ is the average of the true temperature values in the test set.
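The three metrics can be computed in a few lines. The sketch below follows the definitions in the text (MAE as the mean absolute error) and uses three made-up temperature pairs so the results are easy to verify by hand.

```python
def regression_metrics(true_values, predicted_values):
    """Return (MSE, MAE, R^2) for paired true and predicted values."""
    n = len(true_values)
    errors = [t - p for t, p in zip(true_values, predicted_values)]
    mse = sum(e ** 2 for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(true_values) / n
    ss_res = sum(e ** 2 for e in errors)                        # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in true_values)    # total sum of squares
    return mse, mae, 1 - ss_res / ss_tot


# Illustrative values only (degC).
true_rectal = [38.0, 38.5, 39.0]
pred_rectal = [38.1, 38.5, 38.8]
mse, mae, r2 = regression_metrics(true_rectal, pred_rectal)
# Errors are -0.1, 0.0, 0.2, so MSE = 0.05 / 3, MAE = 0.1, R^2 = 1 - 0.05 / 0.5 = 0.9
```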

4. Results

4.1. Cow Head Object Detection Results

The test set was fed into the trained cow head detection model to evaluate its performance in locating cow head regions. For thermal infrared images with a resolution of 384 × 288 pixels, the model achieved an $AP_{50}$ of 100%, an $AP_{75}$ of 99.01%, and an $mAP_{50\text{-}95}$ of 98.35% for the cow head region, demonstrating high accuracy and stability in target localization.

4.2. Eye Socket Recognition Results

The eye socket area was further segmented among the detected cow head targets. The overall detection results of the cow head and eye socket are shown in Figure 10.

4.2.1. Segmentation Results of Different Models

To evaluate the segmentation performance of the proposed model, comparative experiments were conducted between the improved model and other commonly used mainstream segmentation models, including Mask R-CNN and the YOLO series. All models were tested on the same dataset, and the experimental results are shown in Table 2.

As shown in Table 2, the OH-YOLO model achieves AP₅₀, AP₇₅, and mAP_50–95 values of 99.5%, 98.99%, and 86.59%, respectively. It maintains high accuracy across different IoU thresholds: it outperforms Mask R-CNN on all three metrics and YOLOv7 on AP₇₅ and mAP_50–95, and it matches YOLOv8 on AP₅₀ and AP₇₅ with an mAP_50–95 only 0.26 percentage points lower (86.59% versus 86.85%). In terms of model complexity, OH-YOLO has 2.178 million parameters, representing reductions of approximately 95.0%, 94.2%, and 33.1% compared to Mask R-CNN, YOLOv7, and YOLOv8, respectively. Its computational load is 9.7 GFLOPs, representing reductions of 92.8%, 93.2%, and 19.2% compared to Mask R-CNN, YOLOv7, and YOLOv8, respectively. The model size is 4.36 MB, which is 330.7 MB, 285.6 MB, and 2.13 MB smaller than Mask R-CNN, YOLOv7, and YOLOv8, respectively, significantly lowering storage requirements. Regarding inference speed, OH-YOLO processes each image in 2.57 ms, which is approximately 77.6% less time than Mask R-CNN, 53.4% less than YOLOv7, and 9.5% less than YOLOv8. It therefore improves real-time performance while maintaining high accuracy. In summary, the improved OH-YOLO offers clear accuracy advantages over Mask R-CNN and YOLOv7, keeps accuracy on par with YOLOv8, and is superior to all three comparison models in computational complexity, model size, and inference efficiency.
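The relative reductions quoted in this comparison can be rechecked directly from the Table 2 values. The short sketch below does so; the dictionary layout and function names are illustrative choices, while the numbers are copied from Table 2.

```python
# Values copied from Table 2:
# (parameters in M, GFLOPs, model size in MB, inference time in ms)
TABLE_2 = {
    "Mask R-CNN": (43.93, 133.93, 335.06, 11.45),
    "YOLOv7": (37.85, 141.90, 289.96, 5.52),
    "YOLOv8": (3.26, 12.00, 6.49, 2.84),
    "OH-YOLO": (2.18, 9.70, 4.36, 2.57),
}


def relative_reduction(baseline, value):
    """Percentage by which `value` is smaller than `baseline`."""
    return 100.0 * (baseline - value) / baseline


def compare_to(baseline_name, model_name="OH-YOLO"):
    """Relative reductions (%) of each Table 2 column, rounded to one decimal."""
    base, model = TABLE_2[baseline_name], TABLE_2[model_name]
    return tuple(round(relative_reduction(b, m), 1) for b, m in zip(base, model))


for name in ("Mask R-CNN", "YOLOv7", "YOLOv8"):
    params, gflops, size, time_ms = compare_to(name)
    print(f"{name}: params -{params}%, GFLOPs -{gflops}%, "
          f"size -{size}%, inference time -{time_ms}%")
```

The inference figures are reductions in per-image time; a 9.5% shorter time relative to YOLOv8 corresponds to a throughput increase of roughly 10.5% (2.84 / 2.57 ≈ 1.105).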

To visually demonstrate the segmentation results of each model on actual images, images randomly selected from the test set were used to test the models, the results of which are shown in Figure 11. It can be observed that the Mask-RCNN model performed poorly in this task, showing frequent missed and false detections. In contrast, the YOLO series achieved a higher segmentation accuracy and stability. YOLOv7 and YOLOv8 produced clearer eye-socket boundaries, whereas the proposed OH-YOLO maintained a comparable accuracy with improved robustness under minor head movements and lighting variations. As supported by Table 2, the lightweight OH-YOLO architecture sustains high-quality segmentation while offering faster inference and greater suitability for edge deployment.

4.2.2. Eye Socket Model Ablation Test

Ablation tests were conducted on the dataset to investigate how each added module affects eye socket segmentation performance. This study extends the YOLOv8 architecture by integrating the OREPA and HS-FPN modules, targeted enhancements designed to improve the accuracy and robustness of eye socket segmentation. The performance comparison between the new model and the original model is shown in Table 3.

In Table 3, “✗” indicates that a module is not used and “✓” indicates that it is used. The original model (Experiment 1) exhibits high segmentation performance, with AP₅₀, AP₇₅, and mAP_50–95 values of 99.5%, 98.99%, and 86.85%, respectively. However, its parameter count, computational complexity, and model size are comparatively large, which poses challenges for deployment on resource-constrained devices. After replacing the convolutional module of the original backbone network with the OREPA module (Experiment 2), the model maintained detection accuracy while reducing the computational complexity by 9.2% and shortening the inference time from 2.84 ms to 2.60 ms, although the parameter count and model size increased slightly (from 3.26 M to 3.49 M and from 6.49 MB to 7.00 MB). When the original PANet in the neck was replaced with the HS-FPN module (Experiment 3), the model maintained stable detection performance while reducing the parameters by 28.2% and the model size by 25.1%, and slightly shortening the inference time. With both the OREPA and HS-FPN modules integrated (Experiment 4), the fused model maintained the high accuracy of the original model while reducing the parameters, floating-point operations, and model size by 33.1%, 19.2%, and 32.8%, respectively. The improved model also reduced the inference time by 9.5%.

To further illustrate the performance variation across different ablation settings, the precision (P), mAP₅₀, and mAP_50–95 metrics were visualized, as shown in Figure 12. As observed, these indicators remain consistently high across all experiments, indicating that the proposed structural modifications do not compromise segmentation accuracy. The highly overlapping curves further demonstrate that the integration of the OREPA and HS-FPN modules enables the model to maintain high precision while achieving improved computational efficiency and structural compactness. Taken together with Table 3, the results of the ablation experiments verify the reliability of the proposed lightweight optimization strategy, achieving a balanced trade-off between structural simplification and segmentation quality.

4.3. Eye Socket Temperature Extraction Results

In this study, the average eye temperature of each cow was extracted using a method combining a temperature matrix derived from thermal infrared images with eye region segmentation and localization. The results showed that the infrared thermography system could stably identify the eye region of dairy cows and output reliable temperature information. The extracted eye temperature values exhibited noticeable inter-individual variation. Figure 13 presents the infrared eye temperature and rectal temperature of 28 dairy cows.

Overall, the eye temperature was lower than the rectal temperature; however, both measurements displayed a generally consistent variation pattern among individuals, indicating that cows with higher eye temperatures tended to have higher rectal temperatures as well. These results demonstrate that the proposed infrared thermography-based eye temperature extraction method can effectively reflect both individual differences and overall variation trends in body temperature, providing a useful reference for subsequent temperature monitoring in dairy cows.
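The extraction step described above, combining the per-pixel temperature matrix with the segmented eye region, can be sketched as a masked average. This is an assumption-laden illustration: it takes a simple mean over all masked pixels, and the function name, the list-of-lists layout, and the sample values are hypothetical. How the two eyes of one cow are combined is not specified here.

```python
def mean_masked_temperature(temperature_matrix, mask):
    """Average temperature over pixels where the mask is truthy.

    temperature_matrix: 2D list of per-pixel temperatures (deg C).
    mask: 2D list of 0/1 values with the same shape (1 = eye socket pixel).
    Returns None when the mask selects no pixels.
    """
    if len(temperature_matrix) != len(mask):
        raise ValueError("matrix and mask must have the same number of rows")
    total, count = 0.0, 0
    for temp_row, mask_row in zip(temperature_matrix, mask):
        if len(temp_row) != len(mask_row):
            raise ValueError("matrix and mask rows must have the same length")
        for temp, inside in zip(temp_row, mask_row):
            if inside:
                total += temp
                count += 1
    return total / count if count else None


# Tiny hypothetical 3 x 4 patch; real frames are 384 x 288 pixels.
temps = [
    [30.1, 30.4, 30.2, 29.9],
    [30.3, 36.8, 37.0, 30.0],
    [30.2, 36.9, 37.1, 30.1],
]
eye_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
eye_temp = mean_masked_temperature(temps, eye_mask)
print(f"mean eye socket temperature: {eye_temp:.2f} deg C")
```

Returning `None` for an empty mask lets a caller skip frames in which no eye socket was segmented instead of propagating a misleading value.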

4.4. Rectal Temperature Prediction Results

A random forest model was constructed using the following inputs: the average temperature in the eye socket region, environmental temperature and humidity, wind speed, and light intensity data. The recorded rectal temperature was utilized as the model’s output variable. To achieve optimal predictive performance for the random forest model, we employed a trial-and-error approach to determine the model’s best hyperparameter settings. The number of decision trees (Ntrees) was tested over a range of 1 to 400, and the minimum leaf size (Nleafs) was evaluated across the values [5, 10, 20, 50]. Figure 14 shows the mean squared error between the predicted and actual rectal temperature for different Ntrees and Nleafs values.

The results suggest that models with smaller minimum leaf sizes and more trees exhibit lower prediction errors; at the same time, model complexity and computational time must be taken into account. The combination of a small Nleafs value and a large Ntrees value incurs significant computational time and may lead to overfitting. When the Ntrees value exceeds 200, the prediction accuracy does not improve significantly. Balancing accuracy against cost, the model performs best when the number of decision trees is set to 200 and the minimum number of observations per leaf node is set to 5; therefore, the final model structure consists of 200 decision trees with a minimum of 5 observations per leaf node.
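The trade-off described here, accepting the smallest forest whose error is practically indistinguishable from the best one, can be written as a simple selection rule. The sketch below is one possible formalization and not the authors' procedure, which was a trial-and-error reading of Figure 14. The function names, the tolerance, and all MSE values in `example_grid` are hypothetical and were chosen only so that the rule can be demonstrated.

```python
def pick_forest_size(mse_by_ntrees, tolerance=0.001):
    """Smallest number of trees whose MSE is within `tolerance` of the best MSE."""
    best = min(mse_by_ntrees.values())
    for ntrees in sorted(mse_by_ntrees):
        if mse_by_ntrees[ntrees] - best <= tolerance:
            return ntrees  # always reached: the best entry satisfies the test


def pick_configuration(mse_grid, tolerance=0.001):
    """mse_grid maps minimum leaf size -> {number of trees -> validation MSE}.

    Picks the leaf size with the lowest achievable MSE, then the smallest
    adequate forest for that leaf size. Returns (n_trees, min_leaf_size).
    """
    best_leaf = min(mse_grid, key=lambda leaf: min(mse_grid[leaf].values()))
    return pick_forest_size(mse_grid[best_leaf], tolerance), best_leaf


# Hypothetical validation MSE values, NOT read from Figure 14.
example_grid = {
    5: {50: 0.250, 100: 0.220, 200: 0.2005, 300: 0.2002, 400: 0.2000},
    10: {50: 0.270, 100: 0.245, 200: 0.2300, 300: 0.2298, 400: 0.2296},
    20: {50: 0.300, 100: 0.280, 200: 0.2700, 300: 0.2698, 400: 0.2697},
    50: {50: 0.350, 100: 0.335, 200: 0.3300, 300: 0.3299, 400: 0.3298},
}
n_trees, min_leaf = pick_configuration(example_grid)
print(f"selected: {n_trees} trees, minimum leaf size {min_leaf}")
```

With these made-up values the rule selects 200 trees and a minimum leaf size of 5, because going from 200 to 400 trees improves the error by less than the tolerance.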

A random forest model using this structure was trained on the constructed dataset to obtain a temperature prediction model, which was then applied to analyze the test dataset and evaluate its performance. Subsequently, the predictions were compared with the actual rectal temperatures, with the results shown in Figure 15, which clearly demonstrates that the predicted temperature values align closely with the actual temperature readings. The established rectal temperature prediction model achieved an MSE of 0.117, MAE of 0.058, and R² of 0.852. These metrics indicate that the model possesses a strong fitting capability and excellent generalization performance for rectal temperature prediction tasks, enabling high-precision non-contact estimation of rectal temperature under complex environmental conditions.

4.5. Comparison of Different Prediction Algorithms

To validate the reliability and stability of the random forest method in establishing rectal temperature prediction models, we selected Multi-Layer Perceptron (MLP), Artificial Neural Network (ANN), decision trees (DTs), and the Adaptive Boosting algorithm (AdaBoost) as comparative models for performance analysis. Figure 16 presents a comparison of the performance of different models using three evaluation metrics: the mean squared error (MSE), mean absolute error (MAE), and coefficient of determination (R²). As shown in the figure, the random forest-based model achieves the lowest MSE and MAE values and the highest R² value among all of the compared models. This result demonstrates that the random forest method provides superior accuracy, stronger fitting capability, and greater stability in rectal temperature estimation, delivering the most reliable predictive performance in this study.

5. Discussion

5.1. Analysis of Eye Socket Segmentation Results

From the comparative analysis of different models, the YOLO series demonstrated markedly superior accuracy and computational efficiency in eye socket segmentation compared to Mask-RCNN. Owing to its two-stage detection and segmentation architecture, Mask-RCNN is less capable of accurately delineating small thermal targets with fine structural boundaries, such as cow eye sockets. In contrast, the one-stage YOLO framework unifies detection and segmentation into a single process, enabling a favorable balance between precision and real-time performance. Among these models, YOLOv8 served as a robust baseline exhibiting high segmentation accuracy, while the optimized OH-YOLO further enhanced structural compactness and inference efficiency without compromising accuracy, making it more suitable for deployment in edge-computing or resource-limited livestock environments.

In the ablation experiments, each incorporated module contributed targeted functional improvements. The OREPA module effectively reduced redundant convolutional computations and enhanced backbone feature extraction efficiency, thereby producing cleaner and more discriminative feature representations for subsequent layers. The HS-FPN module strengthened multi-scale feature fusion, improving the model’s robustness to variations in head orientation and eye socket size. When integrated, these modules formed the OH-YOLO architecture, which achieved a refined balance between segmentation precision and computational efficiency. Performance indicators such as AP₅₀ and AP₇₅ exhibited only marginal changes across experiments, which is primarily attributed to the characteristics of the task: eye socket regions in thermal infrared images present high contrast, clearly defined boundaries, and minimal background interference. Furthermore, head cropping and pseudo-color remapping in preprocessing enhanced edge distinctness, leading to a saturation effect in high-accuracy metrics.

Overall, the results of the ablation and comparative analyses confirm that the proposed optimization strategy significantly improves computational efficiency and structural robustness while maintaining segmentation accuracy. The reductions in parameter count, computational complexity, and inference time collectively demonstrate the practical potential of the proposed approach for intelligent livestock management and real-world edge deployment.

5.2. Impact of Environmental Factors

A Pearson correlation analysis was conducted to evaluate the influence of ambient temperature, relative humidity, wind speed, and light intensity on eye socket temperature. The results indicated that ambient temperature and wind speed were the primary environmental factors affecting eye socket temperature, showing statistically significant correlations, whereas the effects of light intensity and humidity were relatively minor. Detailed correlation coefficients and significance levels are presented in Table 4.
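For reference, the Pearson coefficient used in this analysis is the covariance of two variables divided by the product of their standard deviations. A minimal sketch follows; the function name and the paired readings are hypothetical, and the sketch does not compute the p-values reported in Table 4.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up paired readings (deg C), for illustration only.
ambient = [18.0, 20.0, 22.0, 24.0, 26.0]
eye = [35.9, 36.3, 36.2, 36.8, 36.9]
r = pearson_r(ambient, eye)
print(f"r = {r:.3f}")
```

A positive r, as reported for ambient temperature, means the two quantities tend to rise together; a negative r, as reported for wind speed, means one tends to fall as the other rises.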

These findings are consistent with the physical principles of thermal infrared thermography. In animals, higher ambient temperatures increase the surface temperature, which could have elevated the measured eye socket temperature in this study. In contrast, stronger wind speeds accelerate convective heat exchange between the animal’s body surface and the surrounding air, which could have decreased the eye socket temperature in this study. Light intensity may slightly influence thermal measurements through localized reflection or surface heating, but this effect was limited under the experimental conditions. The relative humidity showed no significant effect on infrared radiation transmission within the tested range and thus had a negligible impact on the temperature measurement.

In constructing the random forest model for body temperature prediction, ambient temperature, wind speed, humidity, and light intensity were incorporated as additional input features. By learning the relationships between eye socket temperature and these environmental parameters, the model effectively compensated for temperature deviations induced by environmental fluctuations. Consequently, the model maintained a high prediction accuracy and stability even under varying thermal and illumination conditions.

The rectal temperature prediction approach presented in this paper was compared with other prediction models that combine thermal imaging with environmental factor compensation, as detailed in Table 5, in which “—” indicates no numerical value. F. K. Wang et al. [32] implemented a multi-sensor architecture with signal processing to correct for certain environmental influences, yet region of interest selection relied on manual localization. This approach not only increases the labor required but also introduces potential human error, which may compromise model stability. In contrast, the present study utilizes an improved OH-YOLO model for automatic eye socket detection, ensuring precise localization. By incorporating a more comprehensive set of environmental factors, the robustness and predictive accuracy of the model are substantially enhanced. A. K. Balhara et al. [33] employed a regression model to predict body temperature by integrating average eye temperature with ambient temperature. V. M. Pacheco et al. [34] and R. V. de Sousa et al. [35] applied ANNs to model multiple body regions or integrate environmental factors, achieving some level of thermal stress assessment; however, regional localization and temperature extraction still required manual operation, and the computational complexity limited the efficiency of large-scale monitoring. By contrast, the present study combines deep learning-based automatic eye socket localization with a random forest model, resulting in improved prediction accuracy and stability while enabling high-throughput monitoring and minimizing manual intervention.

In all the aforementioned studies, the eye socket region was chosen as the primary temperature measurement area because it provides the most reliable indication of rectal temperature. The region is particularly suitable for non-contact temperature measurement due to its dense capillary network and proximity to the brain, making the ocular surface temperature highly representative of systemic thermal status [8]. While maintaining the core principle of temperature measurement, consistent with previous studies, this work achieves a significant improvement in predictive performance through automated eye socket detection, comprehensive environmental factor integration, and lightweight modeling, substantially reducing both human effort and computational burden.

5.3. Study Limitations

Cows exhibit autonomous behaviors, such as head swinging, turning, and feeding, which can cause variations in the eye region captured by thermal infrared sensors. These natural movements, combined with the strict quality requirements applied during dataset construction, result in uneven data availability across individuals. Such imbalances may limit the model’s ability to fully capture individual-specific temperature patterns, particularly for cows with fewer samples. Despite this limitation, the proposed method maintained a relatively stable predictive performance across most individuals, indicating a degree of robustness to data sparsity and feature variability induced by natural motion. Nevertheless, addressing these challenges is essential for further enhancing the reliability and generalizability of non-contact rectal temperature estimation under dynamic and complex farm conditions.

5.4. Further Study

Building on the identified limitations, future research could implement tracking cameras combined with real-time image display and processing platforms to enable continuous and adaptive monitoring of the eye region. Extending the study from single- to multi-cow scenarios will also be a priority, as cows naturally congregate, creating complex interactions and variable postures that challenge current detection algorithms. Integrating individual identification methods, such as ear tags or facial recognition, with spatially aware target detection algorithms would allow accurate labeling and continuous thermal monitoring of multiple cows simultaneously. Such advancements would enhance the model’s adaptability to dynamic behaviors and improve the practicality of automated thermal monitoring systems, ultimately contributing to both animal welfare and operational efficiency in commercial dairy farming.

6. Conclusions

This study proposed an approach that integrates thermal imaging technology with environmental parameters to predict the body temperature of cows. The detection model located the cow head with an mAP_50–95 of 98.35%, and the lightweight OH-YOLO model segmented the eye socket region with an mAP_50–95 of 86.59% while reducing the parameter count and model size by more than 30% relative to YOLOv8. By combining the extracted eye temperature with environmental factors, the random forest prediction model achieved a mean absolute error of 0.058 °C and an R² of 0.852. The research method proposed in this paper enables rapid and accurate prediction of dairy cow body temperature. It not only provides technical support for health monitoring and inspection management in smart farms but also offers a feasible pathway for applying lightweight detection models using intelligent agricultural equipment.

Author Contributions

Conceptualization, S.G. and K.Z.; methodology, S.G. and K.Z.; software, S.G., M.G. and Y.N.; validation, Y.N. and W.R.; formal analysis, S.G., Y.C. and Q.L.; investigation, K.Z.; resources, S.G., Q.L. and Y.N.; data curation, Y.C., S.G., M.G. and W.R.; writing—original draft preparation, S.G.; writing—review and editing, Y.C., Q.L. and K.Z.; visualization, M.G., Q.L., Y.C. and W.R.; supervision, K.Z.; project administration, K.Z.; funding acquisition, K.Z. All authors have read and agreed to the published version of the manuscript.

Funding

This work was funded by the National Key R&D Plan Key projects (Grant No. 2023YFD2000702), University Science and Technology Innovation Talent Project of Henan Province (Grant No. 24HASTIT052), and Zhongyuan Young Top Talents Program for Scientific and Technological Innovation.

Institutional Review Board Statement

This study was approved by the Laboratory Animal Ethics Committee of Henan University of Science and Technology on 5 July 2025 (Issue No.: HAUST-025-C0705031). All experimental procedures involving animals were conducted in accordance with the institutional and national guidelines for the care and use of animals.

Data Availability Statement

The data sets presented in this article are not readily available because the data are part of an ongoing study. Requests to access the datasets should be directed to the corresponding author.

Acknowledgments

The authors acknowledge Luoyang Shengsheng Farm for facilitation of data acquisition and permission for data use.

Conflicts of Interest

The authors declare no conflicts of interest.

References

[1] Zhenjiang, C.; Jialiang, C.; Hongbo, Y.; Cheng, M. Application and research progress of infrared thermography in temperature measurement of livestock and poultry animals: A review. Comput. Electron. Agric. 2023, 205, 107586.
[2] Pinto, S.; Hoffmann, G.; Ammon, C.; Amon, T. Critical THI thresholds based on the physiological parameters of lactating dairy cows. J. Therm. Biol. 2020, 88, 102523.
[3] Molina-Benavides, R.A.; Perilla-Duque, S.; Campos-Gaona, R.; Sánchez-Guerrero, H.; Rivera-Palacios, J.C.; Muñoz-Borja, L.A.; Jiménez-Rodas, D. Effect of climate on thermal response in cows of different racial groups in lower tropic. Rev. MVZ Cordoba 2023, 28, e2921.
[4] Godyn, D.; Herbut, P.; Angrecka, S. Measurements of peripheral and deep body temperature in cattle—A review. J. Therm. Biol. 2019, 79, 42–49.
[5] Bewley, J.M.; Schutz, M.M. Recent studies using a reticular bolus system for monitoring dairy cattle core body temperature. In Proceedings of the First North American Conference on Precision Dairy Management (NAPDM), Toronto, ON, Canada, 2–5 March 2010; pp. 218–219.
[6] Yong, C.; Fuping, Z.; Xin, C.; Hongxiang, K.; Xiaoli, C.; Yongqiang, L.; Dong, W. Cow Surface Temperature Measurement and Correlation with Rectal Temperature. Acta Vet. Zootech. Sin. 2015, 46, 2199–2205.
[7] Jincheng, H.; Xian, Z.; Suqing, L.; Qianfu, G. Effects of ambient temperature and relative humidity and measurement site on the cow’s body temperature measured by infrared thermography. J. Zhejiang Univ. (Agric. Life Sci.) 2020, 46, 500–508.
[8] Idris, M.; Uddin, J.; Sullivan, M.; McNeill, D.M.; Phillips, C.J.C. Non-Invasive Physiological Indicators of Heat Stress in Cattle. Animals 2021, 11, 71.
[9] Tianyu, L.; Ruirui, Z.; Hui, Z.; Linhuan, Z.; Gang, X.; Tongchuan, Y.; Weijia, W. Advancements in Intelligent Monitoring Technologies for Behavioral, Physiological, and Biomarker Analysis in Cattle Health: A Review. Agriculture 2025, 16, 39.
[10] Reuter, R.R.; Carroll, J.A.; Hulbert, L.E.; Dailey, J.W.; Galyean, M.L. Technical note: Development of a self-contained, indwelling rectal temperature probe for cattle research. J. Anim. Sci. 2010, 88, 3291–3295.
[11] He, D.; Liu, C.; Xiong, H. Design and experiment of implantable sensor and real-time detection system for temperature monitory of cow. Trans. Chin. Soc. Agric. Mach. 2018, 49, 195–202.
[12] Lee, Y.; Bok, J.D.; Lee, H.J.; Lee, H.G.; Kim, D.; Lee, I.; Kang, S.K.; Choi, Y.J. Body Temperature Monitoring Using Subcutaneously Implanted Thermo-loggers from Holstein Steers. Asian-Australas. J. Anim. Sci. 2016, 29, 299–306.
[13] Huţu, I.; Ionescu, F.; Cimponeriu, A.; Chilinţan, M. RFID technology used for identification and temperature monitoring of cattle. Lucr. Științ. Med. Vet. 2009, 42, 44–50.
[14] Hongxiang, K.; Yiqiang, Z.; Kang, R.; Xiaoli, C.; Yongqiang, L.; Dong, W. Automated measurement of cattle surface temperature and its correlation with rectal temperature. PLoS ONE 2017, 12, e0175377.
[15] Zhang, Z.; Zhang, H.; Liu, T. Study on body temperature detection of pig based on infrared technology: A review. Artif. Intell. Agric. 2019, 1, 14–26.
[16] Mingzhou, L.; Ju, H.; Chao, C.; Okinda, C.; Mingxia, S.; Longshen, L.; Wen, Y.; Norton, T.; Berckmans, D. An automatic ear base temperature extraction method for top view piglet thermal image. Comput. Electron. Agric. 2018, 155, 339–347.
[17] He, D.; Song, Z. Automatic detection of dairy cow’s eye temperature based on thermal infrared imaging technology and skeleton tree model. Trans. Chin. Soc. Agric. Mach. 2021, 52, 243–250.
[18] Gloster, J.; Ebert, K.; Gubbins, S.; Bashiruddin, J.; Paton, D.J. Normal variation in thermal radiated temperature in cattle: Implications for foot-and-mouth disease detection. BMC Vet. Res. 2011, 7, 73.
[19] Church, J.S.; Hegadoren, P.R.; Paetkau, M.J.; Miller, C.C.; Regev-Shoshani, G.; Schaefer, A.L.; Schwartzkopf-Genswein, K.S. Influence of environmental factors on infrared eye temperature measurements in cattle. Res. Vet. Sci. 2014, 96, 220–226.
[20] Bewley, J.M.; Einstein, M.E.; Grott, M.W.; Schutz, M.M. Comparison of Reticular and Rectal Core Body Temperatures in Lactating Dairy Cows. J. Dairy Sci. 2008, 91, 4661–4672.
[21] Tan, C.; Sun, F.; Kong, T.; Zhang, W.; Yang, C.; Liu, C. A survey on deep transfer learning. In Proceedings of the International Conference on Artificial Neural Networks, Rhodes, Greece, 4–7 October 2018; pp. 270–279.
[22] Lu, Y.; Zhang, L.; Xie, W. YOLO-compact: An efficient YOLO network for single category real-time object detection. In Proceedings of the 2020 Chinese Control and Decision Conference (CCDC), Hefei, China, 22–24 August 2020; pp. 1931–1936.
[23] Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 27–30 June 2016; pp. 779–788.
[24] Jocher, G.; Chaurasia, A.; Qiu, J. Yolo by Ultralytics (Version 8.0.0). Available online: https://github.com/ultralytics/ultralytics (accessed on 10 November 2025).
[25] Hu, M.; Feng, J.; Hua, J.; Lai, B.; Huang, J.; Gong, X.; Hua, X.-S. Online convolutional re-parameterization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 558–567.
[26] Shi, Z.; Hu, J.; Ren, J.; Ye, H.; Yuan, X.; Ouyang, Y.; He, J.; Ji, B.; Guo, J. HS-FPN: High frequency and spatial perception FPN for tiny object detection. In Proceedings of the AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 25 February–4 March 2025; pp. 6896–6904.
[27] Yifei, C.; Chenyan, Z.; Ben, C.; Yiyun, H.; Yifei, S.; Changmao, W.; Xianjun, F.; Yuxing, D.; Feiwei, Q.; Peng, Y.; et al. Accurate leukocyte detection based on deformable-DETR and multi-level feature fusion for aiding diagnosis of blood diseases. Comput. Biol. Med. 2024, 170, 107917.
[28] Hou, F.J.; Zhang, Y.; Zhou, Y.; Zhang, M.; Lv, B.; Wu, J.Q. Review on Infrared Imaging Technology. Sustainability 2022, 14, 11161.
[29] Sedgwick, P. Correlation versus linear regression. BMJ 2013, 346, f2686.
[30] Chengcheng, S.; Jiawen, H.; Tianshu, Z.; Jingsong, L. Accurate Core Body Temperature Prediction for Infrared Thermography Considering Ambient Temperature and Personal Features. IEEE J. Biomed. Health Inform. 2025, 29, 5016–5027.
[31] Gorczyca, M.T.; Milan, H.F.M.; Maia, A.S.C.; Gebremedhin, K.G. Machine learning algorithms to predict core, skin, and hair-coat temperatures of piglets. Comput. Electron. Agric. 2018, 151, 286–294.
[32] Fu-Kang, W.; Ju-Yin, S.; Pin-Hsun, J.; Ya-Chi, S.; Yu-Chieh, W. Non-Invasive Cattle Body Temperature Measurement Using Infrared Thermography and Auxiliary Sensors. Sensors 2021, 21, 2425.
[33] Balhara, A.K.; Jan, M.H.; Hooda, E.; Kumar, K.; Ghanghas, A.; Sangwan, S.; Balhara, S.; Phulia, S.K.; Yadav, S.; Boora, A.; et al. Prediction of core body temperature using infra-red thermography in buffaloes. Ital. J. Anim. Sci. 2024, 23, 834–841.
[34] Pacheco, V.M.; de Sousa, R.V.; Rodrigues, A.V.D.; Sardinha, E.J.D.; Martello, L.S. Thermal imaging combined with predictive machine learning based model for the development of thermal stress level classifiers. Livest. Sci. 2020, 241, 104244.
[35] De Sousa, R.V.; Rodrigues, A.V.D.; de Abreu, M.G.; Tabile, R.A.; Martello, L.S. Predictive model based on artificial neural network for assessing beef cattle thermal stress using weather and physiological variables. Comput. Electron. Agric. 2018, 144, 37–43.

Figure 1. Data acquisition. (a) shows a diagram of the system setup, (b) shows the onsite thermal infrared image and environmental parameter acquisition setup, and (c) shows a rectal temperature measurement being taken.

Figure 2. Flowchart of thermal infrared image preprocessing.

Figure 3. Cow head region temperature thermal imaging remapping. (a) shows the image before remapping and (b) shows the image after remapping.

Figure 4. Example of annotated image. (a) shows head region annotation and (b) shows eye socket region annotations.

Figure 5. Overall workflow of the proposed rectal temperature prediction method.

Figure 6. Improved OH-YOLO network architecture.

Figure 7. OREPA module for convolution enhancement. (a) shows the prototype block, (b) shows the linearization block, and (c) shows the training block.

Figure 8. HS-FPN module for neck enhancement.

Figure 9. Schematic of rectal temperature prediction model.

Figure 10. Example of overall detection results.

Figure 11. Eye socket segmentation results of different models.

Figure 12. Model performance comparison graph. (a) represents the comparison curve of accuracy rates, (b) represents the comparison curve of mAP₅₀, and (c) represents the comparison curve of mAP_50–95.

Figure 13. Measured infrared eye and rectal temperatures for individual cows.

Figure 14. Mean squared error of prediction models under different architectures.

Figure 15. Comparison of rectal and predicted temperature values.

Figure 16. Comparison of the performance of different models.

Table 1. Basic sensor parameters.

Sensor	Measurement Range	Precision	Communication Interface
MAG32-IRT	−20 to 150 °C	±1.5%	Network port
MB016-Temperature and humidity sensor	Temperature: −20 to 80 °C	±0.3 °C	IIC
MB016-Temperature and humidity sensor	Humidity: 0–100% RH	±2% RH	IIC
BH1750-Light sensor	0–65,535 Lux	±1 Lux	IIC
WS3054-Wind speed sensor	0–60 m/s	±3%	RS485
Thermometer	32–42.99 °C	±0.1 °C	/

Table 2. Segmentation results of different models.

Model	AP₅₀ (%)	AP₇₅ (%)	mAP_50–95 (%)	Parameters (M)	GFLOPs	Model Size (MB)	Inference Time (ms)
Mask R-CNN	49.50	49.41	41.90	43.93	133.93	335.06	11.45
YOLOv7	99.50	84.70	86.20	37.85	141.90	289.96	5.52
YOLOv8	99.50	98.99	86.85	3.26	12.00	6.49	2.84
OH-YOLO	99.50	98.99	86.59	2.18	9.70	4.36	2.57

Table 3. Ablation test results.

Experiments	OREPA	HS-FPN	AP₅₀ (%)	AP₇₅ (%)	mAP_50–95 (%)	Parameters (M)	GFLOPs	Model Size (MB)	Inference Time (ms)
1	✗	✗	99.50	98.99	86.85	3.26	12.00	6.49	2.84
2	✓	✗	99.50	99.00	86.50	3.49	10.90	7.00	2.60
3	✗	✓	99.50	99.00	86.57	2.34	10.70	4.86	2.78
4	✓	✓	99.50	98.99	86.59	2.18	9.70	4.36	2.57

Table 4. Pearson’s correlation analysis between eye socket temperature and environmental factors.

Environmental Factor	Pearson’s Correlation (r)	p-Value	Significance Level
Ambient Temperature	0.34	<0.001	Highly significant (p ≤ 0.001)
Relative Humidity	0.01	0.76	Not significant (p > 0.05)
Wind Speed	−0.28	<0.001	Highly significant (p ≤ 0.001)
Light Intensity	0.06	0.03	Significant (0.01 < p ≤ 0.05)

Table 5. Comparison of related studies.

Authors	Method	Temperature Measurement Area	Extraction Method of Temperature Measurement Areas	Environmental Factor	Accuracy (R²)
F. K. Wang et al. [32]	IRT + multi-sensor system + signal processing	Eye socket	Manual localization via RGB and IRT overlay	Wind speed, ambient temperature, humidity	—
A. K. Balhara et al. [33]	Regression model + IRT	Eye socket	Manual selection	Ambient temperature	0.52
V. M. Pacheco et al. [34]	Artificial neural network (ANN) + IRT + meteorological factors	Multiple body regions: forehead, eye socket, rib, flank	Manual selection of optimal region	Temperature, humidity, seasonal variation	0.71
R. V. de Sousa et al. [35]	ANN + IRT + meteorological factors	Eye socket	Manual selection of optimal surface temperature	Temperature, humidity	0.72
Ours	Improved OH-YOLO model + IRT + random forest	Eye socket	Automatic detection and extraction of temperature matrix	Ambient temperature, humidity, wind speed, light intensity	0.85
