Article

Integration of YOLOv9 Segmentation and Monocular Depth Estimation in Thermal Imaging for Prediction of Estrus in Sows Based on Pixel Intensity Analysis

by Iyad Almadani *, Aaron L. Robinson and Mohammed Abuhussein
Electrical and Computer Engineering, University of Memphis, Memphis, TN 38111, USA
* Author to whom correspondence should be addressed.
Digital 2025, 5(2), 22; https://doi.org/10.3390/digital5020022
Submission received: 11 April 2025 / Revised: 21 May 2025 / Accepted: 4 June 2025 / Published: 13 June 2025

Abstract

Many researchers focus on improving reproductive health in sows and ensuring successful breeding by accurately identifying the optimal time of ovulation through estrus detection. One promising non-contact technique involves using computer vision to analyze temperature variations in thermal images of the sow’s vulva. However, variations in camera distance during dataset collection can significantly affect the accuracy of this method, as different distances alter the resolution of the region of interest, causing pixel intensity values to represent varying areas and temperatures. This inconsistency hinders the detection of the subtle temperature differences required to distinguish between estrus and non-estrus states. Moreover, failure to maintain a consistent camera distance, along with external factors such as atmospheric conditions and improper calibration, can distort temperature readings, further compromising data accuracy and reliability. Furthermore, without addressing distance variations, the model’s generalizability diminishes, increasing the likelihood of false positives and negatives and ultimately reducing the effectiveness of estrus detection. In our previously proposed methodology for estrus detection in sows, we utilized YOLOv8 for segmentation and keypoint detection, while monocular depth estimation was used for camera calibration. This calibration helps establish a functional relationship between the measurements in the image (such as distances between labia, the clitoris-to-perineum distance, and vulva perimeter) and the depth distance to the camera, enabling accurate adjustments and calibration for our analysis. Estrus classification is performed by comparing new data points with reference datasets using a three-nearest-neighbor voting system. In this paper, we aim to enhance our previous method by incorporating the mean pixel intensity of the region of interest as an additional factor. We propose a detailed four-step methodology coupled with two stages of evaluation. First, we carefully annotate masks around the vulva to calculate its perimeter precisely. Leveraging the advantages of deep learning, we train a model on these annotated images, enabling segmentation using the cutting-edge YOLOv9 algorithm. This segmentation enables the detection of the sow’s vulva, allowing for analysis of its shape and facilitating the calculation of the mean pixel intensity in the region. Crucially, we use monocular depth estimation from the previous method, establishing a functional link between pixel intensity and the distance to the camera, ensuring accuracy in our analysis. We then introduce a classification approach that differentiates between estrus and non-estrus regions based on the mean pixel intensity of the vulva. This classification method involves calculating Euclidean distances between new data points and reference points from two datasets: one for “estrus” and the other for “non-estrus”. The classification process identifies the five closest neighbors from the datasets and applies a majority voting system to determine the label. A new point is classified as “estrus” if the majority of its nearest neighbors are labeled as estrus; otherwise, it is classified as “non-estrus”. This automated approach offers a robust solution for accurate estrus detection. To validate our method, we propose two evaluation stages: first, a quantitative analysis comparing the performance of our new YOLOv9 segmentation model with the older U-Net and YOLOv8 models. 
Second, we assess the classification process by constructing a confusion matrix and comparing the results of our previous method, which used the three nearest points, with those of our new model, which uses the five nearest points. This comparison allows us to evaluate the improvements in accuracy and performance achieved with the updated model. The automation of this vital process holds the potential to revolutionize reproductive health management in agriculture, boosting breeding success rates. Through thorough evaluation and experimentation, our research highlights the transformative power of computer vision, pushing forward more advanced practices in the field.

1. Introduction

Detecting estrus in sows based on vulva temperature is the second most critical approach after analyzing vulva swelling [1,2]. The temperature rise during estrus is caused by hormonal shifts that lead to increased blood flow to the reproductive organs, resulting in redness and warmth in the vulva area, signaling the sow’s readiness for mating [3]. This temperature rise is caused by elevated estrogen levels, which trigger vasodilation (the widening of blood vessels) in the reproductive region [4], leading to swelling, reddening, and warming of the vulva. These subtle temperature shifts can be captured by thermal imaging; however, one of the primary challenges is the narrow temperature difference (typically around 1.5 °C) between estrus and non-estrus states. According to [5], Vulva Surface Temperature (VST) in gilts peaked at 35.6 °C ± 1.6 °C during estrus, followed by a significant decrease to 33.9 °C ± 1.7 °C approximately 8 h before ovulation, with the most notable temperature drop occurring between 32 and 8 h prior. Sows showed a comparable pattern, with a peak VST of 36.1 °C ± 1.3 °C observed 32 to 24 h before ovulation, which then declined to 34.6 °C ± 1.6 °C around 16 h before ovulation. This trend was consistent, as most gilts (19 out of 25) and sows (23 out of 27) displayed the characteristic peak in VST followed by a decline, with nearly all animals showing an initial temperature rise. The same study established a standardized measurement distance of 0.61 m to maintain consistency in temperature readings. This standardization is crucial because, in thermal imaging, pixel intensity values represent temperature, with each pixel corresponding to a specific temperature based on the camera’s calibration. As the distance of the camera from the object increases, image resolution and temperature accuracy decrease. This occurs because fewer pixels cover the target area, resulting in a loss of thermal detail and a phenomenon known as the “distance effect”, which reduces pixel intensity values and, thus, temperature accuracy [6]. Despite controlling for distance, the narrow temperature variation between estrus and non-estrus states remains challenging to solve. To address these challenges, the proposed method begins with YOLOv9 segmentation to accurately localize the vulva. YOLOv9 is selected for its advanced real-time object detection capabilities, which significantly surpass those of YOLOv8. Key improvements in YOLOv9, such as its enhanced backbone architecture, superior feature extraction, and optimized training process, contribute to higher accuracy and faster inference. Furthermore, YOLOv9’s ability to handle complex scenes and smaller objects makes it particularly well-suited for precision agriculture and animal monitoring applications, including estrus detection in sows [7]. After segmenting the vulva region, the study applies mean pixel intensity calculations within the segmented area to address narrow thermal reading differences, thereby enhancing the differentiation between estrus and non-estrus states. Additionally, by adjusting for the distance of the vulva from the camera, we aim to minimize measurement inconsistencies. This approach enhances the reliability of estrus identification, ultimately reducing the likelihood of false positives and false negatives and contributing to a more accurate and humane livestock management system.
This paper provides a comprehensive analysis of the theoretical foundations supporting the employed methods. In Section 4, we offer a detailed description of our proposed system, highlighting its complexities and the reasoning behind our approach. Section 5 presents a thorough overview of the experimental results, showcasing the performance and effectiveness of our methodology. This section also assesses our research outcomes through two evaluation stages: first, comparing the performance of our YOLOv9 model to the previous YOLOv8 and U-Net models [1,8]; and second, assessing the classification model with a confusion matrix, comparing the results of our previous three-nearest-point method with those of our new five-nearest-point model. Finally, Section 6 summarizes our findings and discusses potential directions for future development and enhancement of this work.

2. Background

Thermal imaging, or thermography, is a fascinating technology that has made a significant impact in various fields by allowing us to see heat in ways our eyes never could. The basic idea is that anything warmer than absolute zero emits infrared radiation, which is invisible to the naked eye but can be detected by thermal cameras. These cameras turn the heat emitted by objects into visible images, or thermograms, where different colors represent different temperatures. This ability to capture and measure temperature variations with such accuracy has led to widespread applications across industry [9], medicine [10], and even wildlife monitoring. In the field of veterinary science and animal husbandry, thermal imaging has proven to be a non-invasive and effective tool for monitoring animal health. By visualizing temperature variations across an animal’s body surface, it is possible to detect signs of injury, infection, or physiological stress. For instance, thermal cameras can identify painful areas in animals by detecting localized increases in skin temperature. Beyond general health monitoring, thermal cameras can effectively capture minor temperature variations on the surface of the animal’s body, making it feasible to monitor reproductive readiness [11,12]. Its precision in detecting physiological changes, such as temperature shifts associated with estrus, makes it a valuable, non-intrusive method for reproductive assessment [13].
As part of the essential feature extraction process required for model training [14], deep learning has further revolutionized the use of thermal imaging by enabling researchers to combine thermal data with sophisticated image processing [15] and data preprocessing techniques [16], which facilitate the extraction and analysis of complex heat patterns. Specifically, object detection and segmentation, driven by convolutional neural networks (CNNs) [8,17,18,19,20,21,22,23], play a crucial role in various imaging applications. These techniques have significantly improved the reliability of thermal imaging by enabling the precise capture of subtle temperature variations, enhancing the accuracy and usefulness of thermal assessments. Additionally, the exploration of 3D reconstruction [24,25] has opened new avenues for more comprehensive assessments. All these integrations promise more efficient, humane, and science-driven methods for animal care and breeding, paving the way for significant advancements in livestock management and animal welfare [26,27].
The use of deep learning and thermal imaging has significantly advanced the estimation of estrus in sows. Previous research has demonstrated that vulvar skin temperature increases at the onset of estrus and is a reliable indicator for predicting optimal insemination times [5]. Techniques such as object detection and image processing algorithms, including improved models like FD-YOLOV5s, have been used to automatically detect and extract these temperature variations from thermal images [11]. These automated methods address inefficiencies and variability in manual extraction, improving both speed and accuracy in estrus detection. However, there are limitations to these methods. The narrow temperature differences between estrus and non-estrus states (typically around 1.5 °C) make it challenging to differentiate accurately. Additionally, variations in the distance between the camera and the sow’s vulva can lead to inconsistent temperature readings. Environmental factors, such as ambient temperature and camera angle, can further affect measurement reliability. Our study addresses these issues by implementing normalization techniques for pixel intensity values and controlling the vulva–camera distance, enhancing the robustness and precision of estrus detection.

3. Related Work

To fully understand the methodology and objectives of this study, it is essential to reference the foundational work presented in our previous two papers.
In our first paper, Advanced Swine Management: Infrared Imaging for Precise Localization of Reproductive Organs in Livestock Monitoring [8], we focused on developing a robust image segmentation method to localize the sow’s vulva using U-Net. By segmenting the vulva from infrared images, we demonstrated a foundational approach to automating estrus detection through computer vision. The model achieved an Intersection over Union (IoU) score of 0.58, surpassing alternative methods such as SVM with Gabor and YOLOv3.
Building on this foundation, our second paper, YOLOv8-Based Estimation of Estrus in Sows Through Reproductive Organ Swelling Analysis Using a Single Camera [1], introduced a more advanced methodology leveraging YOLOv8 for segmentation and keypoint detection. This study integrated monocular depth estimation to calibrate camera distance, ensuring accurate geometric measurements of the vulva. Additionally, a three-nearest-neighbor classification approach was used to distinguish estrus states based on pixel width, pixel length, and vulva perimeter measurements. This work addressed critical challenges in estrus detection, including variability caused by inconsistent camera distances, and demonstrated significant improvements in detection accuracy.
These prior studies laid the groundwork for the current research by establishing key methods for vulva segmentation and estrus classification. In this paper, we aim to enhance these methodologies by incorporating mean pixel intensity as an additional factor, leveraging thermal imaging and YOLOv9 segmentation for more precise estrus detection. Readers are encouraged to consult the earlier works for a deeper understanding of the foundational techniques and challenges addressed in this study.

4. Method Overview

The dataset used for this study was collected by the Sivananthan Laboratory, a multidisciplinary research group renowned for its contributions to infrared imaging and sensor technology. Based at the University of Illinois at Chicago, the lab combines expertise in materials science, computer vision, and real-world sensor deployment. For this study, the Sivananthan lab conducted an extensive data collection process on a large commercial farm housing numerous sows, ensuring a representative sample of real-world environmental and behavioral conditions. To capture natural variation in lighting, shadows, and sow activity, images were taken three times daily—at 10:00 a.m., 2:00 p.m., and 6:00 p.m.—over a continuous four-month period. This systematic approach contributed to a diverse and rich dataset, supporting robust data augmentation techniques and enhancing the generalizability of the proposed system.
Our proposed methodology for estrus detection in sows is organized into four essential steps to improve the accuracy and reliability of detecting subtle temperature changes in thermal images. The first step involves carefully annotating masks around the vulva region, enabling precise perimeter calculations to define the shape of the region of interest. In the second step, we leverage deep learning to train a model on these annotated images, allowing for robust segmentation using the advanced YOLOv9 algorithm. This segmentation is critical for isolating the vulva region and calculating the mean pixel intensity, which is essential for detecting temperature variations.
The third step incorporates monocular depth estimation, building on prior calibration to establish a reliable link between pixel intensity and camera distance, ensuring consistent measurements regardless of the camera's position. Minor camera-angle variations occurred during capture, but these were carefully controlled to prevent the vulva's size or appearance from being affected, preserving accurate estrus detection even as the camera distance changed.
Finally, the fourth step introduces a classification approach that differentiates estrus from non-estrus states based on mean pixel intensity. This classification uses Euclidean distance measurements and a five-nearest-neighbor voting system to assign labels, with the majority of neighbors determining whether a data point is classified as “estrus” or “non-estrus”. This streamlined four-step approach enhances the precision of estrus detection, with an overview of each step provided in Figure 1.

4.1. Image Annotation

The first step in training any segmentation model is to annotate and segment the region of interest, which, in this case, is the sow’s vulva. This process is carried out utilizing an online tool with manual annotation to accurately segment the vulva area. To streamline the manual annotation process in thermal images, preprocessing techniques can enhance the visibility of the Region of Interest (ROI), making it easier to segment accurately. By applying contrast stretching and histogram equalization, we can adjust the image’s dynamic range, which is particularly useful for 16-bit thermal images, where pixel intensities represent temperature variations. These techniques bring subtle differences in intensity to the forefront, helping to reveal the shape and boundaries of the ROI. Furthermore, adaptive thresholding and edge detection can outline prominent regions, highlighting boundaries that serve as visual guides during annotation, as shown in Figure 2. This preprocessing approach improves the clarity of the ROI and reduces the difficulty of manual segmentation, ensuring that annotated regions align precisely with the intended thermal features.
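To make this preprocessing concrete, the sketch below shows one way to implement these enhancement steps for a 16-bit, single-channel thermal image using OpenCV and NumPy. The percentile bounds, threshold block size, and Canny parameters are illustrative choices, not values specified in this study.

```python
import cv2
import numpy as np

def enhance_thermal_for_annotation(path: str):
    """Enhance a 16-bit thermal image so the ROI is easier to annotate."""
    img16 = cv2.imread(path, cv2.IMREAD_UNCHANGED)  # keep the 16-bit depth

    # Contrast stretching: map the occupied intensity range to 8 bits.
    lo, hi = np.percentile(img16, (1, 99))
    stretched = np.clip((img16.astype(np.float32) - lo) / (hi - lo), 0, 1)
    img8 = (stretched * 255).astype(np.uint8)

    # Histogram equalization to spread subtle temperature differences.
    equalized = cv2.equalizeHist(img8)

    # Adaptive thresholding and edge detection as visual annotation guides.
    thresh = cv2.adaptiveThreshold(equalized, 255,
                                   cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY, blockSize=31, C=2)
    edges = cv2.Canny(equalized, 50, 150)

    return equalized, thresh, edges
```

The enhanced image, threshold map, and edge map can be overlaid in the annotation tool so that the annotator traces a boundary that follows the true thermal contour rather than a faint raw-intensity gradient.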

4.2. Pixel Intensity Calculation

To segment sow vulva regions from annotated thermal images, we employed the YOLOv9 architecture, which is well-suited for high-accuracy object detection and segmentation tasks. During training, YOLOv9 learns to identify and delineate the vulva area by iteratively refining its predictions against the labeled ground truth. Once the model is trained, it can be used to segment the vulva region in new thermal images by applying the learned weights.
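For readers who wish to reproduce the training and inference steps, the sketch below shows one common route using the Ultralytics package, which ships YOLOv9 segmentation weights. The checkpoint name, dataset YAML, image path, and hyperparameters are placeholders, not the study's actual configuration.

```python
from ultralytics import YOLO

# Load pretrained YOLOv9 segmentation weights (placeholder checkpoint name).
model = YOLO("yolov9c-seg.pt")

# Fine-tune on the annotated thermal dataset described in Section 4.1.
# "vulva_seg.yaml" is a hypothetical dataset descriptor listing the
# train/val image folders and the single "vulva" class.
model.train(data="vulva_seg.yaml", epochs=100, imgsz=640)

# Segment a new thermal image and extract the predicted binary ROI mask.
results = model.predict("sow_thermal_frame.png")
mask = results[0].masks.data[0].cpu().numpy() > 0.5  # H x W boolean mask
```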
Figure 3 presents an example thermal image of a sow’s vulva, along with its corresponding pixel intensity distribution histogram. After segmenting the region of interest (ROI) using YOLOv9, we compute the mean pixel intensity within that area to quantify temperature characteristics or other pixel-based thermal features. Let $R$ denote the set of pixel intensities within the segmented ROI, with each pixel value $I_p$ reflecting thermal data captured by the imaging sensor. The mean pixel intensity $\mu$ is calculated as follows:

$$\mu = \frac{1}{|R|} \sum_{p \in R} I_p$$

In this equation, $|R|$ is the total number of pixels in the ROI, and $\sum_{p \in R} I_p$ is the cumulative sum of their intensities. This average intensity serves as a key quantitative feature for evaluating the thermal profile of the vulva, which is later used to support estrus estimation.
The pixel intensity distribution histogram provides additional visual insight into how temperature is distributed across the segmented ROI. While the thermal image highlights areas of heat, the histogram illustrates how frequently each pixel intensity occurs, revealing overall thermal spread patterns such as hotspots or a uniform temperature distribution. These visualizations emphasize that the temperature difference between estrus and non-estrus states is typically small (around 1.5 °C) and may not be visually obvious in the image alone. Although the probability density function (PDF) and histogram are not used directly in the analysis, they serve as useful illustrations that motivate the use of a more robust metric—mean pixel intensity—for capturing subtle temperature variations.
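As a concrete illustration of the computation above, the following minimal sketch computes $\mu$ from a raw thermal frame and a binary ROI mask. In practice, the mask would be the YOLOv9 prediction resized to the image resolution; the function and variable names here are ours, not from the study's code.

```python
import numpy as np

def mean_roi_intensity(thermal: np.ndarray, mask: np.ndarray) -> float:
    """Compute the mean pixel intensity mu over the segmented ROI.

    thermal : 2-D array of raw 16-bit intensities I_p.
    mask    : boolean array of the same shape, True inside the vulva
              mask predicted by the segmentation model.
    """
    roi = thermal[mask]        # the set R of ROI pixel intensities
    return float(roi.mean())   # mu = (1/|R|) * sum_{p in R} I_p
```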

4.3. Calibrating the Camera Using Monocular Depth Estimation

The dataset consisted of images of sow vulvas taken at varying distances, with each image providing pixel-based measurements of the perimeter. All images were captured using the same camera, ensuring consistent focal length, sensor width, and pixel resolution across the dataset. A study [28] reported that the average vulva perimeter for non-estrus sows is 16.06 cm, while for sows in estrus, it is 17.43 cm. Using these averages as a reference, we converted the pixel perimeter values into physical dimensions by determining the physical size of each pixel. This was achieved using the sensor width and image resolution in the following equation:
$$\text{Pixel Size} = \frac{\text{Sensor Width}}{\text{Image Width in Pixels}}$$
This equation reflects the physical width of a single pixel on the camera sensor, enabling the conversion of image-based measurements into real-world units. The physical perimeter of the vulva in each image was then calculated as follows:
$$p = p_p \times \text{Pixel Size}$$
where $p_p$ is the perimeter in pixels and $p$ is the perimeter in millimeters or centimeters. To estimate the distance from the camera (depth), as shown in Figure 4, we applied the pinhole camera model:
$$d_p = \frac{P \times f}{p}$$
where
  • $p_p$ represents the perimeter in pixels;
  • $p$ is the calculated physical perimeter;
  • $P$ is the actual perimeter measurement from reference data;
  • $f$ is the camera’s focal length (in millimeters);
  • $d_p$ is the estimated depth or distance from the camera.
Accurately estimating the camera’s distance from the vulva is essential, as changes in distance affect the apparent size of the vulva in the image. Such variations can result in inaccurate estrus classification, either falsely identifying estrus or missing it altogether. By reliably estimating and accounting for camera distance, we enhance the precision of estrus detection and ensure consistent, robust analysis across the dataset.
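The full calibration chain can be expressed compactly, as in the sketch below. The camera constants and pixel perimeter in the example call are hypothetical; only the 16.06 cm (160.6 mm) non-estrus reference perimeter is taken from the cited study [28].

```python
def estimate_depth_mm(perimeter_px: float,
                      reference_perimeter_mm: float,
                      focal_length_mm: float,
                      sensor_width_mm: float,
                      image_width_px: int) -> float:
    """Estimate camera-to-vulva distance d_p from the pinhole model.

    All physical quantities are in millimeters. The camera constants
    below are illustrative assumptions, not the study's calibration.
    """
    pixel_size = sensor_width_mm / image_width_px        # width of one pixel
    p = perimeter_px * pixel_size                        # perimeter on sensor (mm)
    return reference_perimeter_mm * focal_length_mm / p  # d_p = P * f / p

# Illustrative call with made-up camera parameters and the non-estrus
# reference perimeter of 160.6 mm cited in the text:
d = estimate_depth_mm(perimeter_px=280, reference_perimeter_mm=160.6,
                      focal_length_mm=13.0, sensor_width_mm=10.9,
                      image_width_px=640)
print(f"estimated depth: {d:.1f} mm")  # ~438 mm, i.e., ~44 cm
```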
Figure 4. Single-camera depth estimation.

4.4. Function Discovery

To establish a robust connection between the estimated actual distance, determined through monocular depth estimation, and the mean pixel intensity of the segmented region of the sow’s vulva, it is essential to derive a mathematical function that accurately represents the relationship between these two parameters. This function serves as the foundation for calibrating pixel intensity values to account for variations caused by changes in camera distance, ensuring consistency and reliability in subsequent temperature-based analysis.
Figure 5 illustrates the relationship between the thermal camera’s distance from the sow’s vulva and the mean pixel intensity of the segmented vulva region for both estrus and non-estrus states. At greater distances (beyond 55 cm), the intensity values for estrus and non-estrus states form two distinct curves, enabling clear differentiation between the two conditions. However, as the camera moves closer to the vulva (below 55 cm), the gap between the two curves narrows, and the values start to overlap. This overlap becomes especially pronounced at distances closer than 43 cm, introducing significant ambiguity into the system. Such overlap makes it challenging to classify whether the sow is in an estrus or non-estrus state based on pixel intensity alone, highlighting the critical need to maintain an optimal camera distance for reliable detection.
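The paper does not state the functional form fitted to these curves. As one plausible sketch, a low-order polynomial could be fitted per state to calibration pairs and then evaluated at the depth estimated in Section 4.3; the data points and the quadratic form below are purely illustrative assumptions.

```python
import numpy as np

# Hypothetical calibration pairs (distance in cm, mean ROI intensity),
# collected separately for estrus and non-estrus sows; values are
# made up for illustration.
dist_estrus  = np.array([45, 50, 55, 60, 65, 70])
inten_estrus = np.array([30500, 30000, 29400, 28700, 28100, 27600])
dist_non     = np.array([45, 50, 55, 60, 65, 70])
inten_non    = np.array([29800, 29100, 28200, 27200, 26400, 25700])

# Fit one curve per state (a quadratic is used here only as an example).
coef_estrus = np.polyfit(dist_estrus, inten_estrus, deg=2)
coef_non    = np.polyfit(dist_non, inten_non, deg=2)

def expected_intensity(coef: np.ndarray, distance_cm: float) -> float:
    """Evaluate the fitted distance-to-intensity curve at a given depth."""
    return float(np.polyval(coef, distance_cm))

# At a given estimated depth, a new measurement can be compared against
# the two reference curves before classification.
print(expected_intensity(coef_estrus, 58.0), expected_intensity(coef_non, 58.0))
```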
In thermal imaging, being too close to the vulva introduces challenges that hinder the accuracy of pixel intensity measurements, which are vital for distinguishing between estrus and non-estrus states. At close distances, the thermal sensor may experience saturation effects, where the intensity values are disproportionately high, resulting in unreliable readings. Additionally, thermal cameras rely on capturing an even distribution of emitted infrared radiation, but close proximity can result in uneven measurements due to variations in the sensor’s field of view or localized heat concentrations. These inaccuracies disrupt the consistency of mean pixel intensity values, particularly during estrus, where higher values are expected.

4.5. K-Nearest Neighbors (KNN) Classification

To accurately classify the estrus state of sows, we adopted a K-Nearest Neighbors (KNN) approach enhanced by the integration of mean pixel intensity values and calibrated depth measurements. This method utilizes the Euclidean distance metric to compare new data points with reference datasets representing “estrus” and “non-estrus” states. By identifying the five nearest neighbors in the dataset, the model applies a majority voting system to assign the appropriate label. If the majority of the closest neighbors are labeled as “estrus”, the point is classified as such; otherwise, it is labeled as “non-estrus”.
This classification framework builds upon our previous methodology, where a three-nearest-neighbor voting system was employed. By extending to five neighbors, we aim to improve the robustness of the classification process, mitigating the impact of noise or outliers in the dataset. Figure 6 illustrates the KNN classification process, showing how new data points are classified based on proximity to labeled reference points.
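A minimal sketch of this voting scheme is given below. The feature-vector composition (mean intensity plus depth-calibrated geometric measurements) follows the description above, while the names, example values, and the use of unscaled Euclidean distance are our assumptions.

```python
import numpy as np
from collections import Counter

def classify_estrus(new_point, reference_points, labels, k=5):
    """Classify a new sample by majority vote among its k nearest neighbors.

    new_point        : feature vector for the new observation, e.g.
                       [mean_pixel_intensity, estimated_depth_cm, perimeter_cm].
    reference_points : 2-D array of labeled reference feature vectors.
    labels           : sequence of "estrus"/"non-estrus" labels, one per row.
    """
    refs = np.asarray(reference_points, dtype=float)
    point = np.asarray(new_point, dtype=float)

    # Euclidean distance from the new point to every labeled reference point.
    dists = np.linalg.norm(refs - point, axis=1)

    # Indices of the k closest neighbors (k = 5 in the proposed method,
    # k = 3 in our earlier work).
    nearest = np.argsort(dists)[:k]

    # Majority voting over the neighbors' labels.
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical usage with made-up feature values:
refs = [[30500, 58, 17.4], [29400, 55, 17.5], [27200, 60, 16.0],
        [26400, 65, 16.1], [30000, 50, 17.3], [25700, 70, 15.9]]
labs = ["estrus", "estrus", "non-estrus", "non-estrus", "estrus", "non-estrus"]
print(classify_estrus([29800, 56, 17.2], refs, labs))  # -> "estrus"
```

In practice, features on different scales (e.g., intensity counts versus centimeters) would typically be normalized before computing distances so that no single feature dominates the vote.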

5. Results and Evaluation

In this section, we present the results of the proposed methodology, focusing on segmentation accuracy, camera distance calibration, and estrus classification. Our approach integrates YOLOv9 segmentation, monocular depth estimation, and mean pixel intensity analysis to enhance estrus detection in sows. The results are compared to previous models (U-Net and YOLOv8), showcasing significant advancements in accuracy, robustness, and reliability.
Our evaluation was conducted on 2437 previously unseen images, including 1226 images of sows in estrus and the remainder representing non-estrus states. For classification, we introduced an improved voting system that uses five nearest neighbors, replacing the three-nearest-neighbor approach applied in our earlier work. This adjustment leverages a broader set of data points, improving the accuracy and reliability of the classification process. The enhanced method’s performance was evaluated using a detailed confusion matrix, enabling a direct comparison with results from the previous study. This analysis demonstrated notable improvements in distinguishing between estrus and non-estrus states, reinforcing the effectiveness of the proposed methodology. The results highlight the value of increasing the number of neighbors in the voting system, particularly in achieving higher classification accuracy and consistency.

5.1. Performance Metrics and Comparative Analysis

The proposed YOLOv9 model demonstrates remarkable advancements in segmentation accuracy, outperforming both U-Net and YOLOv8 across multiple performance metrics. As shown in Table 1, YOLOv9 achieved an Intersection over Union (IoU) score of 0.762, surpassing YOLOv8 (0.725) by 3.7 percentage points and significantly outperforming U-Net (0.586). Additionally, YOLOv9 exhibited exceptional performance in precision-based metrics, achieving an mAP50 of 0.992 and maintaining a competitive mAP50-95 score of 0.683, indicating its ability to perform consistently across varying overlap thresholds. This improvement is also visually evident in Figure 7, which compares the ground-truth segmentation with the outputs of YOLOv8 and YOLOv9. YOLOv9 displays closer alignment to the ground truth, capturing subtle, fine-grained details that YOLOv8 and U-Net miss.
The improved performance of YOLOv9 can be attributed to its enhanced ability to detect small objects with high precision, a critical requirement for the intricate task of estrus detection. YOLOv9 incorporates advanced feature extraction layers and a more refined detection head, enabling it to focus on fine-grained details and subtle features that might be overlooked by earlier models. This strength makes it particularly well-suited for segmenting small, complex regions like the vulva, where precision is essential for accurate analysis.

5.2. Evaluation of Estrus vs. Non-Estrus Classification Model Using a Confusion Matrix

The enhanced estrus detection model demonstrated strong performance, achieving a true-positive rate (TPR) of 95.24% and a true-negative rate (TNR) of 96.46%, as illustrated in the confusion matrix (Figure 8). These results reflect improvements of 0.71% and 0.35%, respectively, over the previous model. The enhanced system builds upon earlier work by incorporating mean pixel intensity from the segmented vulva as a key feature, in addition to width, height, and perimeter [1]. Furthermore, the classification approach was strengthened by upgrading from a three-nearest-neighbor (3-NN) to a five-nearest-neighbor (5-NN) voting scheme, resulting in greater reliability and accuracy.
Importantly, these results were achieved using the full dataset, including images captured at distances less than 55 cm from the vulva—distances that previously posed challenges due to overlapping intensity values and classification ambiguity. While the system performed well under these conditions, excluding such close-range images in future analysis could further boost overall accuracy by reducing noise and misclassification risk.
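For reference, the reported rates follow directly from the confusion-matrix cell counts. The sketch below shows the arithmetic with approximate counts back-calculated from the stated totals (1226 estrus, 1211 non-estrus images) and rates; the exact cell counts are those reported in Figure 8.

```python
def rates(tp: int, fn: int, tn: int, fp: int):
    """True-positive and true-negative rates from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # sensitivity: estrus correctly detected
    tnr = tn / (tn + fp)  # specificity: non-estrus correctly rejected
    return tpr, tnr

# Approximate counts: 1226 estrus * 95.24% ~= 1168 true positives;
# 1211 non-estrus * 96.46% ~= 1168 true negatives.
tpr, tnr = rates(tp=1168, fn=58, tn=1168, fp=43)
print(f"TPR = {tpr:.2%}, TNR = {tnr:.2%}")  # close to the reported 95.24% / 96.46%
```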

6. Conclusions and Future Work

The development and implementation of our enhanced methodology for estrus detection in sows mark a substantial advancement in the application of computer vision in agricultural technology. By incorporating pixel intensity alongside traditional vulva measurements—width, height, and perimeter—and upgrading the classification system to utilize a five-nearest-neighbor (5-NN) approach, our method significantly improves the precision and reliability of estrus detection. These advancements contribute to the optimization of reproductive health management and breeding outcomes in sows.
Our refined methodology integrates meticulous image annotation, advanced segmentation using YOLOv9, and a robust evaluation framework. The inclusion of pixel intensity, coupled with superior segmentation models, has strengthened the system’s ability to differentiate between estrus and non-estrus states, as reflected in the high achieved true-positive and true-negative rates.
One significant challenge in estrus detection using thermal imaging is the impact of camera distance on pixel intensity measurements. When the thermal camera is positioned beyond 55 cm from the sow’s vulva, the intensity values for estrus and non-estrus states are distinct, forming separate curves that allow for accurate classification. However, reducing the camera distance below 55 cm causes these curves to converge, leading to overlapping intensity values that complicate the classification process. This issue is particularly pronounced at distances of less than 43 cm, where the overlap becomes substantial. The underlying cause lies in the effects of close proximity on thermal imaging sensors, which can become saturated at short distances, producing excessively high-intensity readings. Additionally, localized heat concentrations and variations in the sensor’s field of view disrupt the uniform capture of infrared radiation, further compromising the consistency of pixel intensity measurements. These factors underscore the importance of maintaining an optimal camera distance to ensure reliable estrus detection.
The effectiveness of our approach was validated through comprehensive evaluations, demonstrating improvements over previous models. The shift from a three-nearest-neighbor (3-NN) to a five-nearest-neighbor (5-NN) classification mechanism reduced misclassification rates, enhancing the robustness of the system. Furthermore, YOLOv9 exhibited superior segmentation performance compared to earlier models, providing the precise spatial measurements essential for reliable estrus detection.
In conclusion, the enhanced methodology significantly advances automated estrus detection, reducing human error and providing a scalable solution for livestock reproductive health management. These advancements pave the way for more efficient agricultural practices, with potential benefits for breeding outcomes and livestock health.

Author Contributions

Conceptualization, I.A.; methodology, I.A.; software, I.A.; validation, I.A.; formal analysis, I.A.; investigation, I.A.; resources, A.L.R.; data curation, I.A. and M.A.; writing—original draft preparation, I.A.; writing—review and editing, I.A. and M.A.; visualization, I.A.; supervision, A.L.R.; project administration, A.L.R.; funding acquisition, A.L.R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Restrictions apply to the availability of these data. The data were obtained from Sivananthan Laboratories and are available from the authors with the permission of Sivananthan Laboratories.

Acknowledgments

The authors thank Sivananthan Laboratories for providing the dataset used in this study.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Almadani, I.; Abuhussein, M.; Robinson, A.L. YOLOv8-Based Estimation of Estrus in Sows Through Reproductive Organ Swelling Analysis Using a Single Camera. Digital 2024, 4, 898–913. [Google Scholar] [CrossRef]
  2. Belstra, B. (IMV International Corporation); Flowers, W. (North Carolina State University); See, T. (North Carolina State University); Singleton, W. (Purdue University). Detection of Estrus or Heat. Pork Information Gateway Factsheet PIG 08-01-01; 31 July 2007. Available online: https://porkgateway.org/resource/estrus-or-heat-detection/ (accessed on 3 June 2025).
  3. Ford, S.; Reynolds, L.; Magness, R. Blood flow to the uterine and ovarian vascular beds of gilts during the estrous cycle or early pregnancy. Biol. Reprod. 1982, 27, 878–885. [Google Scholar] [CrossRef] [PubMed]
  4. Lee, J.H.; Lee, D.H.; Yun, W.; Oh, H.J.; An, J.S.; Kim, Y.G.; Kim, G.M.; Cho, J.H. Quantifiable and feasible estrus detection using the ultrasonic sensor array and digital infrared thermography. J. Anim. Sci. Technol. 2019, 61, 163. [Google Scholar] [CrossRef]
  5. Scolari, S.C.; Clark, S.G.; Knox, R.V.; Tamassia, M.A. Vulvar skin temperature changes significantly during estrus in swine as determined by digital infrared thermography. J. Swine Health Prod. 2011, 19, 151–155. [Google Scholar] [CrossRef]
  6. Minkina, W.; Dudzik, S. Infrared Thermography: Errors and Uncertainties; John Wiley & Sons: Hoboken, NJ, USA, 2009. [Google Scholar]
  7. Wang, C.Y.; Yeh, I.H.; Liao, H.Y.M. Yolov9: Learning what you want to learn using programmable gradient information. arXiv 2024, arXiv:2402.13616. [Google Scholar]
  8. Almadani, I.; Ramos, B.; Abuhussein, M.; Robinson, A.L. Advanced Swine Management: Infrared Imaging for Precise Localization of Reproductive Organs in Livestock Monitoring. Digital 2024, 4, 446–460. [Google Scholar] [CrossRef]
  9. ElMasry, G.; ElGamal, R.; Mandour, N.; Gou, P.; Al-Rejaie, S.; Belin, E.; Rousseau, D. Emerging thermal imaging techniques for seed quality evaluation: Principles and applications. Food Res. Int. 2020, 131, 109025. [Google Scholar] [CrossRef]
  10. Zhang, L.; Guo, H.; Li, Z. Application of medical infrared thermal imaging in the diagnosis of human internal focus. Infrared Phys. Technol. 2019, 101, 127–132. [Google Scholar] [CrossRef]
  11. Zheng, H.; Zhang, H.; Song, S.; Wang, Y.; Liu, T. Automatic detection of sow estrus using a lightweight real-time detector and thermal images. Int. J. Agric. Biol. Eng. 2023, 16, 194–207. [Google Scholar] [CrossRef]
  12. Mun, H.; Ampode, K.; Mahfuz, S.; Chung, I.; Dilawar, M.; Yang, C. Heat detection of gilts using digital infrared thermal imaging camera. Adv. Anim. Vet. Sci. 2022, 10, 2142–2147. [Google Scholar]
  13. Soerensen, D.D.; Pedersen, L.J. Infrared skin temperature measurements for monitoring health in pigs: A review. Acta Vet. Scand. 2015, 57, 1–11. [Google Scholar] [CrossRef] [PubMed]
  14. Hasan, M.N.; Hamdan, S.; Poudel, S.; Vargas, J.; Poudel, K. Prediction of length-of-stay at intensive care unit (icu) using machine learning based on mimic-iii database. In Proceedings of the 2023 IEEE Conference on Artificial Intelligence (CAI), Santa Clara, CA, USA, 5–6 June 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 321–323. [Google Scholar]
  15. Almadani, I.; Abuhussein, M.; Robinson, A.L. Sow localization in thermal images using Gabor filters. In Proceedings of the Future of Information and Communication Conference, San Francisco, CA, USA, 3–4 March 2022; Springer: Berlin/Heidelberg, Germany, 2022; pp. 617–627. [Google Scholar]
  16. Ahmed, M.K. Converting OpenStreetMap (OSM) Data to Functional Road Networks for Downstream Applications. arXiv 2022, arXiv:2211.12996. [Google Scholar]
  17. Moinuddin, K.A.; Havugimana, F.; Al-Fahad, R.; Bidelman, G.M.; Yeasin, M. Unraveling Spatial-Spectral Dynamics of Speech Categorization Speed Using Convolutional Neural Networks. Brain Sci. 2022, 13, 75. [Google Scholar] [CrossRef]
  18. Havugimana, F.; Moinuddin, K.A.; Yeasin, M. Deep Learning Framework for Modeling Cognitive Load from Small and Noisy EEG data. IEEE Trans. Cogn. Dev. Syst. 2022, 16, 1006–1015. [Google Scholar] [CrossRef]
  19. Havugimana, F.; Muhammad, M.B.; Moinudin, K.A.; Yeasin, M. Predicting Cognitive Load using Parameter-optimized CNN from Spatial-Spectral Representation of EEG Recordings. In Proceedings of the 2021 20th IEEE International Conference on Machine Learning and Applications (ICMLA), Pasadena, CA, USA, 13–16 December 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 710–715. [Google Scholar]
  20. Abuhussein, M.; Almadani, I.; Robinson, A.L.; Younis, M. Enhancing Obscured Regions in Thermal Imaging: A Novel GAN-Based Approach for Efficient Occlusion Inpainting. J 2024, 7, 218–235. [Google Scholar] [CrossRef]
  21. Ahmed, M.K.; Yeasin, M. MU-Net: Modified U-Net for Precise Localization and Segmentation of Lumber-Spine Regions from Sagittal Views. TechRxiv 2024. [Google Scholar] [CrossRef]
  22. Chen, Z.; Yang, J.; Feng, Z.; Zhu, H. RailFOD23: A dataset for foreign object detection on railroad transmission lines. Sci. Data 2024, 11, 72. [Google Scholar] [CrossRef]
  23. Chen, Z.; Yang, J.; Li, F.; Feng, Z.; Chen, L.; Jia, L.; Li, P. Foreign Object Detection Method for Railway Catenary Based on a Scarce Image Generation Model and Lightweight Perception Architecture. IEEE Trans. Circuits Syst. Video Technol. 2025, 1-1. [Google Scholar] [CrossRef]
  24. Acampora, L.; De Filippis, F.; Martucci, A.; Sorgi, L. 3D reconstruction of thermal images. In Proceedings of the 26th Aerospace Testing Seminar, Los Angeles, CA, USA, 29–32 March 2011; pp. 263–276. [Google Scholar]
  25. Ahmed, M.K. Measurement and Evaluation of Deep Learning Based 3D Reconstruction. Master’s Thesis, The University of Memphis, Memphis, TN, USA, 2023. [Google Scholar]
  26. Xie, Q.; Wu, M.; Bao, J.; Zheng, P.; Liu, W.; Liu, X.; Yu, H. A deep learning-based detection method for pig body temperature using infrared thermography. Comput. Electron. Agric. 2023, 213, 108200. [Google Scholar] [CrossRef]
  27. Reza, M.N.; Ali, M.R.; Samsuzzaman; Kabir, M.S.N.; Karim, M.R.; Ahmed, S.; Kyoung, H.; Kim, G.; Chung, S.O. Thermal imaging and computer vision technologies for the enhancement of pig husbandry: A review. J. Anim. Sci. Technol. 2024, 66, 31. [Google Scholar] [CrossRef]
  28. De la Cruz-Vigo, P.; Rodriguez-Boñal, A.; Rodriguez-Bonilla, A.; Córdova-Izquierdo, A.; Pérez Garnelo, S.S.; Gómez-Fidalgo, E.; Martín-Lluch, M.; Sánchez-Sánchez, R. Morphometric changes on the vulva from proestrus to oestrus of nulliparous and multiparous HYPERPROLIFIC sows. Reprod. Domest. Anim. 2022, 57, 94–97. [Google Scholar] [CrossRef] [PubMed]
Figure 1. A summary of the structure of our estrus detection system.
Figure 2. Enhanced ROI visibility in thermal images.
Figure 3. Thermal image of a sow vulva with its corresponding pixel intensity distribution histogram.
Figure 5. Camera distance vs. mean pixel intensity for estrus and non-estrus states with error bars and overlap.
Figure 6. KNN classification process.
Figure 7. Demonstration of the ground truth and YOLOv8 and YOLOv9 segmentation results for estrus detection, where the green mask represents the predicted mask.
Figure 8. Confusion matrix for enhanced estrus detection model.
Table 1. Comparison of U-Net, YOLOv8, and YOLOv9 segmentation models.

Metric | U-Net | YOLOv8 | YOLOv9
Intersection over Union (IoU) | 0.586 | 0.725 | 0.762
Mean Average Precision at 50% overlap (mAP50) | 0.652 | 0.804 | 0.992
Mean Average Precision at 50–95% overlap (mAP50-95) | 0.489 | 0.678 | 0.683