Enhanced Non-Invasive Estimation of Pig Body Weight in Growth Stage Based on Computer Vision

de Oliveira, Franck Morais; Cadavid, Verónica González; Saraz, Jairo Alexander Osorio; Vega, Felipe Andrés Obando; Ferraz, Gabriel Araújo e Silva; Ferraz, Patrícia Ferreira Ponciano

doi:10.3390/agriengineering8050165

by Franck Morais de Oliveira ¹, Verónica González Cadavid ²,*, Jairo Alexander Osorio Saraz ³, Felipe Andrés Obando Vega ⁴, Gabriel Araújo e Silva Ferraz ¹ and Patrícia Ferreira Ponciano Ferraz ¹

¹ Department of Agricultural Engineering, School of Engineering, Federal University of Lavras (UFLA), Lavras 37200-900, Brazil

² Grupo de Investigación en Biodiversidad y Genética Molecular (BIOGEM), Departamento de Producción Animal, Facultad de Ciencias Agrarias, Universidad Nacional de Colombia, Sede Medellín, Carrera 65 #59A-110, Medellín 050034, Colombia

³ Departamento de Ingeniería Agrícola y de Alimentos, Facultad de Ciencias Agrarias, Universidad Nacional de Colombia, Sede Medellín, Carrera 65 #59A-110, Medellín 050034, Colombia

⁴ Facultad de Ingeniería y Ciencias Agropecuarias, Institución Universitaria Digital de Antioquia, Carrera 55 #42-90 Int 0101, Medellín 050015, Colombia

* Author to whom correspondence should be addressed.

AgriEngineering 2026, 8(5), 165; https://doi.org/10.3390/agriengineering8050165

Submission received: 28 February 2026 / Revised: 31 March 2026 / Accepted: 21 April 2026 / Published: 28 April 2026

(This article belongs to the Collection Exploring the Application of Artificial Intelligence and Image Processing in Agriculture)

Abstract

Pig weighing is an essential procedure for monitoring growth and animal health; however, conventional methods are often labor-intensive, costly, and potentially stressful. In this context, this study proposes a non-invasive approach for estimating the body weight of pigs during the growing stage based on computer vision and the YOLOv11 algorithm, enabling automatic segmentation and individual identification in multi-animal environments. The study used RGB images of 10 group-housed pigs captured throughout the growing phase, in which automatic dorsal segmentation was combined with individual identification through numerical markings. From the generated binary masks, the segmented dorsal area was extracted and used as a predictor variable in Linear Regression and a Multilayer Perceptron (MLP) Artificial Neural Network. The YOLOv11 model showed consistent performance in the segmentation task, achieving test-set metrics of Precision = 0.849, Recall = 0.886, mAP@0.50 = 0.936, and mAP@0.50–0.95 = 0.819, demonstrating good generalization capability in scenarios with intense animal interaction. In the weight prediction stage, Linear Regression and the MLP achieved high coefficients of determination (R² = 0.96 and 0.95, respectively) with low errors (RMSE = 1.52 kg and 1.63 kg; MAE = 1.20 kg and 1.25 kg), indicating a strong correlation between segmented dorsal area and actual body weight. Class-wise analysis revealed superior performance for classes 7 and 9, with R² values up to 0.98 and RMSE below 1.1 kg, whereas class 8 showed greater error dispersion, associated with higher morphological variability and a smaller number of available samples. These results demonstrate that the direct use of morphometric information extracted from segmented masks in 2D images constitutes a robust, accurate, and low-cost approach for automatic pig body-weight estimation. 
Moreover, this study is among the few addressing this task specifically during the growing stage, highlighting its potential for future deployment in embedded systems and intelligent monitoring platforms for precision pig farming, although further evaluation of computational efficiency and real-time performance is still required.

Keywords:

livestock precision; computer vision; machine learning; image analysis; weight prediction

1. Introduction

Body weight is a key indicator for monitoring pig growth during production [1]. Changes in body weight provide a direct means of assessing pigs’ health and growth status; however, weighing is a labor-intensive operation and can be stressful for both animals and farmers [2]. In more critical situations, moving animals to weighing equipment poses a risk of accidents that may result in injuries to both workers and pigs [3].

Traditional methods for monitoring pig weight and behavior require substantial manual effort and may induce stress in animals, which has driven the search for automated solutions [4]. Although automatic weighing systems are available, they are generally expensive, frequently integrated into feeding stations, and occupy valuable pen space, which limits their large-scale application [2]. In this context, computer vision-based technologies have emerged as promising alternatives for non-invasive body weight estimation, contributing to reduced operational costs and improved animal welfare.

Beyond weight estimation, computer vision has expanded its applications in pig production, enabling behavioral monitoring and animal welfare assessment. These possibilities are illustrated by [5], who proposed a system based on RGB-D cameras and deep neural networks capable of posture recognition, social interaction detection, and real-time activity monitoring in pigs. Such advances demonstrate that the technology is not limited to automated weighing but is evolving toward integrated solutions for individual tracking and behavioral analysis, aligning with modern approaches to automatic identification and recognition discussed below.

Traditional pig identification methods, such as the use of physical markings for visual recognition in computer-based systems [6], are still widely applied in both research and commercial settings. However, with technological advances, these markings have increasingly served as auxiliary tools for subsequent analytical stages, including automatic segmentation and individual tracking. More recent studies, such as [7], demonstrate the transition toward non-invasive approaches based on convolutional neural networks, capable of recognizing individual animals through facial images, thereby reducing reliance on physical markings. This scenario indicates that, although manual marking techniques remain in use, they are progressively being integrated into more automated and intelligent systems, enhancing the monitoring of pig behavior and growth.

Within the broad range of computer vision techniques applied to pig production, image segmentation has emerged as a fundamental step for accurately isolating animal body regions, thereby enabling detailed and automated analyses. Instance segmentation, in particular, represents a robust and precise approach for pig image analysis, allowing individual identification even under group-housing conditions. This technique is essential for extracting indicators related to health and welfare, enabling the estimation of parameters such as body condition score, live weight, and behavioral patterns, which are critical for management practices and decision-making in pig farming [8].

Recent studies, such as [9], emphasize that despite advances in three-dimensional methods, there remains a demand for robust and accessible 2D solutions. In parallel, recent developments in computer vision have demonstrated the potential of deep learning techniques, particularly convolutional neural networks and instance segmentation models, for pig weight estimation from images [10]. Moreover, advances in intelligent monitoring systems integrating computer vision with IoT technologies have further expanded the applicability of these approaches for real-time livestock management [11]. Although these methods have shown promising results, important technical challenges remain. Factors such as image quality variability, animal occlusion, and complex interactions in group-housed environments can negatively affect segmentation accuracy and, consequently, the reliability of weight prediction models.

Furthermore, there is still a scarcity of studies specifically focused on the growing phase of pigs—a critical stage in which continuous monitoring is essential to ensure proper development, animal welfare, and production efficiency. Existing works addressing automatic detection during this phase are limited and, in most cases, rely on more complex approaches or methods that are difficult to implement in practical production environments. Recent review studies further highlight that, despite significant technological advances, important challenges remain in the development of robust, scalable, and cost-effective solutions for real-world livestock monitoring, particularly under group-housed and dynamic conditions [12]. In this context, the present study proposes an accessible and low-cost 2D solution based on computer vision, employing the YOLOv11 model for automatic segmentation in multi-animal environments, followed by weight prediction using Linear Regression and Multilayer Perceptron (MLP) neural networks. This approach contributes to addressing key limitations of existing methods by providing a robust and scalable method for estimating pig body weight during the growing stage under realistic production conditions, while maintaining the potential for future integration into more precise three-dimensional systems.

2. Materials and Methods

2.1. Experiment Location and Data Acquisition

This study was conducted in accordance with the ethical guidelines established by the Animal Ethics Committee (CICUA-057-20) of the National University of Colombia, Medellín campus.

The images used in this study were collected at an experimental farm of the Universidad Nacional de Colombia, Medellín campus, located in the municipality of Rionegro, Antioquia, at an altitude of 2154 m above sea level (6°07′56″ N, 75°27′17″ W). The Agricultural Station (EA) has an average temperature of 16.2 °C, typically ranging between 12 and 18 °C, a relative humidity of 82%, and an average annual precipitation of 2645 mm.

Within the EA, the swine unit housed 10 pigs distributed in two pens, each measuring 2.41 m × 2.41 m and equipped with a 2 m continuous feeder and a nipple drinker. The pens were separated by 1.5 m, and a weighing scale, positioned 0.5 m above ground level, was installed between them to obtain individual body weight measurements three times per week. The collective pen was fitted with plastic slatted flooring. The animals used in this study were weaned piglets in the growing phase, with body weights ranging from approximately 9 to 34 kg.

Videos were recorded using cameras installed 1.5 m above the pen floor. Two ArduCam OV5647 NoIR cameras were used, capturing RGB videos at an acquisition rate of 5 frames per second (FPS) and a resolution of 640 × 480 pixels. Mounted on a support structure above the enclosure, the cameras enabled continuous top-view image acquisition of the pigs throughout their growing phase. Recordings were conducted over a two-month period—December 2021 and January 2022 (7 weeks, 49 days)—continuously between approximately 7:00 a.m. and 5:00 p.m., resulting in images with natural lighting variations. As the proposed approach relies on relative morphometric features extracted from pixel-based dorsal area, and all videos were acquired under a fixed camera setup with consistent height and orientation, the image acquisition process maintained geometric consistency throughout the experimental period.

The analyzed period corresponds to selected intervals within the full monitoring duration rather than the entire continuous timeline. This selection was performed to ensure data quality and variability, prioritizing frames with suitable visibility and reducing redundancy from consecutive images, as well as minimizing issues such as excessive overlap in early stages and reduced dorsal visibility in later stages.

2.2. Obtaining Frames Extracted from Videos

To compose the dataset, a total of 947 videos recorded over 15 days during the intermediate period of the 7-week observation phase were processed. This period was selected because the pigs exhibited behavioral stability and lower weight variation, ensuring greater consistency in the analyses. The videos were segmented on a per-minute basis and captured the animals in their usual environment.

For frame extraction, a Python (version 3.12) script was developed and executed in the Google Colab environment to decompose the videos into static images. The script was configured to select 1 frame out of every 10 available frames in each video, aiming to increase variability among the extracted images and reduce redundancy between consecutive frames.

This procedure resulted in a structured dataset comprising 28,395 images, preserving the original recording characteristics and enabling their use in the subsequent segmentation and pig weight prediction stages.
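The sampling rule described above can be sketched as follows. This is a minimal illustration rather than the script used in the study: the function names, the output file-naming pattern, and the use of OpenCV (`cv2`) are assumptions. The last lines check the reported dataset size under the further assumption that each per-minute video holds 60 s × 5 FPS = 300 frames.

```python
from pathlib import Path


def select_frame_indices(n_frames: int, step: int = 10) -> list[int]:
    """Indices kept when taking 1 frame out of every `step` frames."""
    return list(range(0, n_frames, step))


def extract_frames(video_path: str, out_dir: str, step: int = 10) -> int:
    """Save every `step`-th frame of a video as a JPEG; return the number saved.

    Assumes OpenCV is installed; it is imported lazily so that the helper
    above can be used without it.
    """
    import cv2  # third-party: opencv-python

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    saved = 0
    index = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:
            # Hypothetical naming pattern: <video name>_<frame index>.jpg
            name = f"{Path(video_path).stem}_{index:05d}.jpg"
            cv2.imwrite(str(out / name), frame)
            saved += 1
        index += 1
    cap.release()
    return saved


# Consistency check (assumption: one-minute videos at 5 FPS, 300 frames each)
frames_per_video = len(select_frame_indices(60 * 5, step=10))
expected_total = 947 * frames_per_video
```

Under that assumption the expected total is 947 × 30 = 28,410 frames, within 15 frames of the 28,395 reported, which would be consistent with a few videos being slightly shorter than one minute.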

2.2.1. Image Segmentation

The dataset used in this stage was constructed from a subset of the 28,395 frames extracted from the original database. To standardize the number of samples per day and avoid temporal bias, a limit of 200 images per day was set across the 15 selected days. However, some days contained fewer than 200 valid frames owing to variations in video quality, animal overlap, low illumination and, primarily, a shortage of usable images after automatic filtering. Consequently, the final number of selected images fell short of the theoretical total of 3000.

Images were randomly selected within each day, covering the recording interval from 07:00 a.m. to 05:00 p.m., ensuring diversity in behavioral and spatial conditions. After selection, the resulting dataset of 2401 images was divided nominally into 60% for training, 20% for validation, and 20% for testing, yielding 1524 images for training, 438 for validation, and 439 for testing (approximately 63.5%, 18.2%, and 18.3% of the total, respectively). The discrepancy relative to the expected values (approximately 1800/600/600) stems directly from the actual variation in the number of available images per day and from the fact that the split was performed on the final filtered dataset rather than on the theoretical total of 3000 images.
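The per-day cap can be illustrated with a short sketch. The function name, the seed, and the toy day labels and file names are hypothetical; only the cap of 200 images per day comes from the text.

```python
import random


def sample_per_day(frames_by_day, cap=200, seed=0):
    """Randomly keep at most `cap` frames for each day.

    frames_by_day: dict mapping a day label to a list of frame identifiers.
    Days with fewer than `cap` valid frames keep everything they have.
    """
    rng = random.Random(seed)  # the seed is an illustrative choice
    selected = {}
    for day, frames in sorted(frames_by_day.items()):
        k = min(cap, len(frames))
        selected[day] = rng.sample(frames, k)
    return selected


# Toy example with hypothetical day labels and frame names
toy = {
    "day_01": [f"d01_{i}.jpg" for i in range(350)],
    "day_02": [f"d02_{i}.jpg" for i in range(120)],  # fewer than the cap
}
picked = sample_per_day(toy, cap=200)
counts = {day: len(frames) for day, frames in picked.items()}
total = sum(counts.values())
```

With 15 days and a cap of 200, the ceiling is 3000 images; the 1524 + 438 + 439 = 2401 images reported correspond to an average of roughly 160 valid images per day.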

All images were uploaded to the MakeSense.ai platform, where they were manually annotated using the polygon tool. Ten distinct classes (labeled 1–10) were defined, corresponding to the individual identification of pigs based on the numbers visibly marked on their backs, enabling differentiation of each animal during the segmentation and classification processes. The annotations were designed to delineate the dorsal area of each pig in every image.

After the labeling process, the annotated data were exported in single-file COCO JSON format, ensuring compatibility with models based on the YOLOv11 framework. This procedure ensured proper dataset preparation for the subsequent neural network training stage, providing a standardized and structured dataset in accordance with COCO format specifications.
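Ultralytics training typically reads one text label file per image, with polygon coordinates normalized to the image size, so a COCO JSON export usually needs a conversion step. The text does not describe how this was done; the sketch below shows one possible conversion for a single annotation. It assumes that the COCO `segmentation` entry is a flat list `[x1, y1, x2, y2, ...]` in pixels and that category ids 1 to 10 map to zero-based class indices 0 to 9. The function name is hypothetical.

```python
def coco_polygon_to_yolo_line(polygon, category_id, img_w, img_h):
    """Convert one COCO polygon (flat [x1, y1, x2, y2, ...] in pixels)
    into a YOLO segmentation label line: 'class x1 y1 x2 y2 ...'
    with coordinates normalized to [0, 1]."""
    if len(polygon) < 6 or len(polygon) % 2 != 0:
        raise ValueError("a polygon needs at least 3 (x, y) points")
    class_index = category_id - 1  # assumption: COCO ids 1-10 -> classes 0-9
    coords = []
    for x, y in zip(polygon[0::2], polygon[1::2]):
        coords.append(f"{x / img_w:.6f}")
        coords.append(f"{y / img_h:.6f}")
    return " ".join([str(class_index)] + coords)


# Toy triangle on a 640 x 480 frame (the resolution reported for the videos)
line = coco_polygon_to_yolo_line([320, 240, 480, 240, 320, 360], 7, 640, 480)
```

Each image would then receive a text file containing one such line per annotated pig.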

2.2.2. Criteria for Selecting Annotations for Images

The annotations performed on the images followed guidelines inspired by reference studies, such as [13,14,15], as well as more recent work by [9], which emphasize the importance of anatomical standardization in the top-view perspective to ensure consistent morphometric measurements. These studies recommend excluding the head and tail of pigs as an effective strategy to more accurately represent the dorsal area, which exhibits the highest correlation with body weight. This approach aims to standardize segmentation and prevent undesirable variability caused by body parts with greater mobility or less regular shapes.

In addition, a preliminary screening procedure was applied to ensure data quality. Only image frames in which the dorsal region of the pigs was clearly visible and suitable for reliable segmentation were considered for annotation.

Furthermore, only animals in a standing posture (orthostatic position) were considered suitable for annotation, as recumbent pigs may present visual distortions in the dorsal projection, such as lateral compression or expansion of the contact area, potentially compromising the accuracy of body area measurements and, consequently, the predictive models.

Annotations were also avoided in cases of animal overlap, which may generate incorrectly segmented regions, hinder model training, and reduce performance. Priority was given to pigs positioned more centrally within the camera’s field of view, where dorsal region capture tends to be more complete and subject to less perspective distortion. Animals located near the edges of the video frame often have portions of their body outside the field of view or exhibit more prominent lateral regions, which may lead the model to learn spatial representations inconsistent with true dorsal anatomy. Representative examples of the annotation criteria are presented in Figure 1. These criteria ensured the exclusion of frames with occlusions, incomplete body visibility, or unreliable segmentation, contributing to the consistency of morphometric feature extraction.

2.3. Proposed Model

In this study, the Large segmentation variant of the YOLOv11 architecture (YOLOv11-L-seg), which is widely used for real-time object detection and instance segmentation, was employed. The selection of this model was primarily based on its high representational capacity, which is advantageous for handling complex scenarios involving multiple animals, occlusions, and variations in posture. Additionally, the focus of this study was to maximize segmentation accuracy under these conditions, rather than to optimize computational efficiency, making the Large variant a suitable choice for the proposed application. A comparative evaluation with other model scales was not conducted in this study.

Figure 2 presents a performance comparison among different versions of the YOLO family, showing that YOLOv11 achieves high mAP@0.50–0.95 values on COCO benchmark evaluations while maintaining a favorable trade-off among accuracy, number of parameters, and inference time. These results reinforce its suitability for scenarios requiring efficient and robust processing.

It is important to note that the performance comparison presented in this figure is based on benchmark results obtained on datasets such as COCO, as reported by the Ultralytics framework, and does not correspond to an experimental evaluation conducted within this study. This comparison is included to provide context for the selection of YOLOv11. In this work, no direct comparison with other models was performed, as the objective was to evaluate the applicability of YOLOv11 for dorsal segmentation and subsequent weight estimation under the proposed conditions. Therefore, the results should be interpreted as a validation of the proposed approach rather than a comparative benchmark.

To more clearly illustrate the operation of the proposed inference model, Figure 3 presents the detailed workflow of YOLOv11 applied to the captured images. Starting from the original image of the pig pen, the model simultaneously performs animal detection, individual classification, and dorsal segmentation through binary masks.

These outputs—such as bounding boxes and segmented regions—enable the extraction of essential morphometric information for subsequent analyses, including pixel counting and weight estimation. The illustrated structure highlights the logical sequence of processing steps and demonstrates the model’s ability to automatically identify, localize, and segment each animal within the scene.

YOLOv11 Network Hyperparameters

The training process of the proposed model was conducted on the Google Colab platform using an on-demand NVIDIA T4 GPU. The YOLOv11 architecture in its Large instance segmentation version (YOLOv11-L-seg) was adopted and initialized with pre-trained weights to accelerate convergence and improve generalization capability. The dataset was structured in YAML format, explicitly defining the training, validation, and test partitions.

Model training followed a supervised learning approach with input images resized to a fixed resolution. Data augmentation strategies were applied to increase sample variability and mitigate overfitting, including color-space transformations, geometric distortions, and spatial augmentations. In addition, periodic checkpointing was implemented during training to ensure robustness against interruptions and to preserve intermediate model states.

The main training hyperparameters used in this study are summarized in Table 1.

The optimizer, learning rate scheduler, and early stopping strategy followed the default configuration of the Ultralytics YOLO framework, including a default patience value of 100 epochs.

At the end of the process, the final weights, best-performing models, and evaluation metrics were stored in Google Drive, ensuring experimental reproducibility and traceability of the obtained results.
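A training run of this kind could be set up roughly as follows with the Ultralytics Python package. This is a sketch under stated assumptions, not the authors' script: the YAML file name `pigs.yaml` and the numeric values passed at the bottom are placeholders, since the actual settings are those of Table 1. Only the patience of 100 epochs and the use of pre-trained Large segmentation weights are taken from the text.

```python
def build_train_kwargs(data_yaml, epochs, imgsz, save_period):
    """Collect keyword arguments for an Ultralytics training run.

    epochs, imgsz and save_period must come from the study's Table 1 or the
    user's own setup; no values are assumed here. patience=100 is the
    framework default mentioned in the text.
    """
    return {
        "data": data_yaml,           # YAML listing train/val/test partitions
        "epochs": epochs,
        "imgsz": imgsz,              # fixed input resolution
        "patience": 100,             # early-stopping patience (default)
        "save_period": save_period,  # periodic checkpoint interval, in epochs
    }


def train_segmentation(kwargs, weights="yolo11l-seg.pt"):
    """Fine-tune a pre-trained Large segmentation model (needs `ultralytics`)."""
    from ultralytics import YOLO  # third-party, imported lazily

    model = YOLO(weights)  # load pre-trained weights
    return model.train(**kwargs)


# Placeholder values for illustration only (not the study's settings)
kwargs = build_train_kwargs("pigs.yaml", epochs=50, imgsz=640, save_period=10)
```

Calling `train_segmentation(kwargs)` would start the run; keeping the argument dictionary separate makes it easy to log the exact configuration alongside the saved weights.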

2.4. Application of the Proposed Model for Weight Prediction

After training, the YOLOv11 model was applied to the 25,994 remaining images extracted from the original database (the 28,395 extracted frames minus the 2401 images used for training, validation, and testing). The purpose of this step was to use the trained network to automatically segment the dorsal region of the pigs and perform individual classification based on the dorsal markings visible in the images.

The outputs generated at this stage—segmented binary masks and corresponding numerical labels—were stored for subsequent use in a new phase of the study focused on body mass prediction. This additional step aimed to evaluate the model’s applicability within a continuous processing pipeline and its potential integration into automated solutions in the field of animal science.

2.4.1. Prediction Models

After obtaining the segmented dorsal areas of the pigs, a body weight prediction stage was developed based on the pixel counts extracted from the binary masks. Two distinct models were employed for this purpose: Linear Regression and a Multilayer Perceptron (MLP) neural network. The objective was to evaluate the ability of these methods to capture the relationship between the segmented dorsal area and the animals’ actual body weight.
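The predictor itself is simple to compute: the dorsal area is the number of foreground pixels in a binary mask. A minimal sketch follows, using plain Python lists so that it runs without dependencies; the function name and the toy mask are illustrative.

```python
def dorsal_area_pixels(mask):
    """Count foreground pixels in a binary mask given as rows of 0/1 values.

    With NumPy arrays the same quantity is `int(np.count_nonzero(mask))`.
    """
    return sum(1 for row in mask for value in row if value)


# Toy 4 x 6 mask containing a 2 x 3 block of foreground pixels
toy_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
area = dorsal_area_pixels(toy_mask)
```

Pixel counts are comparable across days only because the cameras were kept at a fixed height and orientation, as noted in Section 2.1.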

The data were consolidated into a single dataset containing, for each individual and day, the area metrics (in pixels), the corresponding body weight manually recorded using a digital scale, and a reference weight based on the Progeny Growth Curve PIC® 337 × Camborough.

To ensure robustness, the dataset was split into training (70%), validation (15%), and testing (15%) subsets using the train_test_split function from Scikit-Learn, with a fixed seed (random_state = 42) so that the random partition is reproducible. The details of each adopted approach are described below.
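A single call to `train_test_split` returns two subsets, so a 70/15/15 partition is commonly obtained with two successive calls (first holding out 30%, then halving the held-out part). The text does not state how the calls were chained. The sketch below only illustrates the resulting partition, using the standard library as a stand-in for Scikit-Learn; the function name and the rounding rule are assumptions.

```python
import random


def split_70_15_15(n_samples, seed=42):
    """Shuffle sample indices and cut them into 70/15/15 subsets.

    Standard-library stand-in for two chained train_test_split calls;
    the exact subset sizes produced by Scikit-Learn may differ by one
    sample because of rounding.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(round(0.70 * n_samples))
    n_val = int(round(0.15 * n_samples))
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_70_15_15(200)
sizes = (len(train_idx), len(val_idx), len(test_idx))
```

Fixing the seed means the same rows land in the same subset on every run, which keeps the reported metrics comparable between the two models.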

Linear Regression

Linear Regression was used as a baseline model due to its simplicity and efficiency in relating numerical variables. The model was trained using the segmented area (pixels) as the independent variable (X) and the actual weight (kg) as the dependent variable (y). The fitted equation has the form:

y = β₀ + β₁X    (1)

where y is the predicted weight, β₀ the intercept and β₁ the coefficient associated with the segmented area. The model was trained exclusively with the training data, and subsequently evaluated on the test set using the metrics R² (coefficient of determination), RMSE (Root Mean Squared Error) and MAE (Mean Absolute Error). This approach allowed establishing a baseline for comparison with more complex models, indicating the degree of linearity of the relationship between segmented area and body weight.
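Equation (1) and the three evaluation metrics can be written out in a few lines. The sketch below uses the closed-form ordinary least squares solution and synthetic numbers that are not taken from the study; the function names are illustrative.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x (closed form)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for paired observations and predictions."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    return r2, rmse, mae


# Synthetic data (not from the study): area in pixels, weight in kg
area = [4000, 5000, 6000, 7000, 8000]
weight = [10.0, 14.5, 20.5, 25.0, 30.0]
b0, b1 = fit_line(area, weight)
pred = [b0 + b1 * a for a in area]
r2, rmse, mae = regression_metrics(weight, pred)
```

In the study the coefficients are fitted on the training subset and the metrics are computed on the test subset; the toy example evaluates on the fitting data only to keep the arithmetic easy to follow.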

Multilayer Perceptron Neural Network

The Multilayer Perceptron (MLP) was used to model possible non-linear relationships between the segmented dorsal area and the body weight of pigs. Before training, the input data were standardized using StandardScaler to improve numerical stability and convergence of the learning process. The adopted architecture consisted of two hidden layers with 64 neurons each and ReLU activation, trained with the Adam optimizer. The output layer consists of a single neuron with linear activation, inherent to the regression configuration of the MLPRegressor, enabling direct prediction of continuous body weight values.

Training was configured for up to 1000 iterations, with the early stopping mechanism enabled, using a validation subset to automatically stop training in case of no improvements, reducing the risk of overfitting. The final performance evaluation was carried out on the test set, previously defined in the data division stage.
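The configuration described above maps onto Scikit-Learn roughly as follows. This is a sketch of one plausible setup, not the authors' code: the function names are hypothetical, and reusing seed 42 for the network is an assumption.

```python
def mlp_settings(seed=42):
    """Keyword arguments mirroring the configuration described in the text."""
    return {
        "hidden_layer_sizes": (64, 64),  # two hidden layers, 64 neurons each
        "activation": "relu",
        "solver": "adam",
        "max_iter": 1000,
        "early_stopping": True,          # holds out part of the training data
        "random_state": seed,            # assumption: same seed as the split
    }


def build_mlp_pipeline(seed=42):
    """StandardScaler followed by MLPRegressor (requires scikit-learn)."""
    from sklearn.neural_network import MLPRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    return make_pipeline(StandardScaler(), MLPRegressor(**mlp_settings(seed)))


settings = mlp_settings()
```

In Scikit-Learn, `early_stopping=True` sets aside a fraction of the training data internally (`validation_fraction`, 0.1 by default); whether the separate 15% validation subset was used for this purpose is not stated in the text. Wrapping the scaler and the regressor in one pipeline ensures the scaling statistics are learned from the training data only.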

The performance of the MLP was evaluated using the same metrics adopted for the Linear Regression model (R², RMSE and MAE), enabling a direct comparison between the approaches. Due to its ability to represent more complex relationships, the MLP constitutes a complementary alternative to linear regression for the automatic prediction of body weight based on the segmented area.

3. Results

Previous studies, such as [13], demonstrated that the projected body area of pigs extracted from top-view images exhibits a high correlation with live weight (r = 0.96), outperforming other variables such as body length or perimeter. In that study, the considered area excluded the head and cervical region and was delineated from the animal’s center of gravity, focusing on the dorsal and posterior regions. This methodological decision aimed to minimize the effects of postural variations, such as head inclination or movement, which could compromise the accuracy of area estimation and, consequently, live weight prediction. In a subsequent publication, Ref. [14] confirmed that body area was the best-performing predictor of weight, including in models based on artificial neural networks, achieving coefficients of determination of R² = 0.9932 for a linear model using manually selected images and R² = 0.9925 for a linear model based on automatically selected images.

In contrast to these studies, in which segmentation and removal of the head and tail were performed using manual or semi-automated algorithms in MATLAB, the present study implemented fully automatic dorsal area segmentation using the YOLOv11 detection and segmentation model, previously trained to recognize and isolate the body region of interest. This approach enhances process efficiency and scalability while maintaining standardization of the segmented area, which likewise excludes the head and tail in accordance with the methodological guidelines established in earlier works.

3.1. Model Training Results

3.1.1. Splitting Data by Class for the Segmentation and Classification Model

Figure 4 shows the distribution of the number of instances per class in the datasets used for training and validating the pig segmentation and classification model. This analysis allows for checking the proportion between classes and identifying possible imbalances in the data, an important factor for the performance and generalization of the model.

The distribution among the ten dorsal numerical classes is not entirely uniform in the training and validation partitions. In the training set, class 5 has the highest number of instances (444), while class 3 has the lowest (200). Similarly, in the validation set, class 5 is predominant (129) and class 3 is the least represented (66). Although this moderate imbalance could potentially affect model performance, particularly in less represented classes, its impact was mitigated through the use of data augmentation techniques (such as rotations, color variations, scaling, shearing, and flipping), which increased data variability and reduced overfitting. It is also worth noting that part of this imbalance may be associated with natural animal behavior during image acquisition, as certain individuals may remain in postures or positions that limit the visibility of the dorsal region, reducing the number of valid annotations. As a result, the model demonstrated good generalization capability, as confirmed by the performance metrics obtained on the test dataset.

3.1.2. Training Performance Metrics

In order to track the model’s performance throughout the training process, losses and metrics were monitored by epoch in both training and validation data. Figure 5 shows the graphs automatically generated by YOLOv11, allowing observation of the evolution of the main metrics associated with detection (bounding boxes) and segmentation (masks), as well as the respective loss functions involved in the network’s learning. The loss function follows the standard formulation implemented in the Ultralytics YOLO framework for instance segmentation tasks, combining bounding box regression, classification, and mask segmentation losses. These graphs are fundamental for evaluating training stability, model convergence, and possible signs of overfitting, in addition to providing a detailed view of the algorithm’s generalization capacity.

A consistent and progressive reduction in losses associated with detection (box_loss), segmentation (seg_loss), classification (cls_loss), and box distribution regression (dfl_loss) tasks is observed in both the training and validation sets, indicating adequate model convergence across epochs. The proximity between the training and validation curves suggests stability in learning and the absence of significant overfitting.

In terms of performance, precision and recall metrics showed rapid growth in the early epochs, followed by stabilization at values above 0.80 for both tasks—detection (B) and segmentation (M). Similarly, the mAP@0.5 and mAP@0.5:0.95 curves demonstrate consistent evolution, exceeding 0.90 and 0.80, respectively, evidencing high accuracy in both locating individuals and precisely delineating segmented masks.
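For readers less familiar with these metrics: a predicted mask counts as a correct match at a given threshold when its intersection over union (IoU) with the ground-truth mask reaches that threshold. mAP@0.5 uses a single threshold of 0.50, while mAP@0.5:0.95 averages over thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below computes the IoU of two toy masks and lists the thresholds at which the pair would match; it does not compute mAP itself, and the function name is illustrative.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two binary masks of equal shape."""
    inter = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return inter / union if union else 0.0


# Toy masks: the ground truth covers 6 pixels, the prediction misses one
truth = [[1, 1, 1],
         [1, 1, 1]]
pred = [[1, 1, 1],
        [1, 1, 0]]
iou = mask_iou(truth, pred)

# IoU thresholds used by mAP@0.5:0.95
thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]
matched_at = [t for t in thresholds if iou >= t]
```

Here the IoU is 5/6, so the prediction matches at the looser thresholds but not at 0.85 and above, which is why mAP@0.5:0.95 is always the stricter of the two figures.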

These results confirm the ability of YOLOv11-L-seg to learn robust discriminative representations from RGB images, showing good generalization in the validation data, even in challenging scenarios with multiple pigs simultaneously present in the same image and with variations in posture and partial occlusion.

3.1.3. Applying the Model to the Test Data

After training was completed, the YOLOv11 model was evaluated using the previously separated test set, composed of images that were not seen during training. This step aimed to analyze the model’s ability to generalize on independent data, simulating real-world application scenarios. The performance metrics obtained, including precision, recall, mAP@0.50 and mAP@0.50:0.95, are presented in Table 2, reflecting the model’s effectiveness in the joint task of segmenting and classifying pigs in images with multiple individuals.

Similar results in terms of accuracy in identifying body dimensions have also been reported in previous studies. For instance, Ref. [16] proposed a method based on geometric algorithms (BSDPE) to extract body area and dorsal length from top-view images of pigs, achieving accuracy above 97%. Although the approach developed by [16] did not rely on neural networks, it highlighted the importance of accurately delineating the dorsal region for morphometric analyses.

Complementarily, Ref. [17] demonstrated that modern computer vision models can maintain high performance in weight prediction when the segmentation stage is properly executed, regardless of the specific architecture employed. Therefore, although more recent approaches have been introduced in the literature, the results obtained in this study indicate that YOLO, a comparatively simple and widely established model, can produce segmentations robust enough for practical applications in real-world production environments.

3.2. Results of Applying the Trained Model to New Data for Weight Prediction

To evaluate the practical applicability of the trained segmentation model, it was applied to an independent subset comprising 25,994 images derived from the same overall monitoring period. These images were not used during the training, validation, or testing stages, ensuring an independent evaluation of the model within the observed conditions. This dataset captures variability in posture, interaction, and body size. The objective was to detect and segment pigs, verifying the presence of at least one valid instance per image, using a confidence threshold of 90%.
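The filtering rule described above can be sketched as follows. This is a minimal illustration, assuming each detection carries a class identifier and a confidence score; the `Detection` container and the function names are hypothetical and are not taken from the study's code.

```python
from dataclasses import dataclass

CONF_THRESHOLD = 0.90  # confidence threshold stated in the text


@dataclass
class Detection:
    class_id: int      # hypothetical field: numeric marking predicted for the pig
    confidence: float  # hypothetical field: model confidence score in [0, 1]


def valid_detections(detections, threshold=CONF_THRESHOLD):
    """Keep only detections whose confidence reaches the threshold."""
    return [d for d in detections if d.confidence >= threshold]


def has_valid_instance(detections, threshold=CONF_THRESHOLD):
    """True when an image contains at least one valid detection."""
    return any(d.confidence >= threshold for d in detections)


# Toy detections for a single image (not data from the study)
image_detections = [Detection(3, 0.95), Detection(7, 0.62), Detection(1, 0.90)]
kept = valid_detections(image_detections)
```

A high threshold such as this one favors clean masks for area extraction at the cost of discarding images in which animals are crowded or partially hidden.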

It is important to note that the model was trained using images in which pigs were in a standing posture and without significant occlusion, as accurate dorsal area extraction requires a clear and unobstructed view of the animal. Therefore, postures such as lying animals or strong overlap between individuals fall outside the intended application domain of the proposed approach and were not explicitly evaluated.

Figure 6 presents, for each capture date, the total number of generated masks and the number of images containing at least one valid detection. The model maintained consistent performance throughout the analyzed period, with a high number of segmented instances, particularly on days with a larger volume of images, such as 29 December and 31 December. Although natural variations across dates were observed—associated with differences in image quantity, animal density, and acquisition conditions—a substantial proportion of images contained at least one valid detection.
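The two quantities plotted per capture date can be tallied as in the sketch below. It assumes one record per image holding the capture date and the confidence scores of its detections; the record layout and date labels are illustrative assumptions.

```python
from collections import defaultdict


def tally_by_date(records, threshold=0.90):
    """records: iterable of (date, [confidence, ...]) pairs, one per image.

    Returns {date: {"masks": n_valid_masks, "images": n_images_with_valid_mask}}.
    """
    summary = defaultdict(lambda: {"masks": 0, "images": 0})
    for date, confidences in records:
        n_valid = sum(1 for c in confidences if c >= threshold)
        summary[date]["masks"] += n_valid
        summary[date]["images"] += 1 if n_valid > 0 else 0
    return dict(summary)


# Toy records (not data from the study)
records = [
    ("29 Dec", [0.97, 0.93, 0.40]),  # two valid masks in one image
    ("29 Dec", [0.55]),              # no valid mask in this image
    ("31 Dec", [0.91]),
]
summary = tally_by_date(records)
```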

These results confirm the robustness and stability of the model under real-world, large-scale inference conditions, reinforcing its suitability for continuous automated monitoring applications, such as morphometric feature extraction and subsequent pig body weight prediction.

During the first few days of the evaluation period, a lower number of images with valid detections was observed, defined as those that presented at least one segmented mask with ≥90% confidence. This behavior can be attributed to the fact that, in the initial stages of growth, piglets tend to remain more grouped together, with frequent social interactions and playful behaviors. This body proximity makes individual segmentation of the animals difficult, reducing the model’s ability to accurately delineate the backs for classification and for the extraction of morphological characteristics. In addition, the high confidence threshold adopted in this study contributes to filtering out detections under these conditions.

Similarly, on the final dates of the analyzed period, a second decline in the number of valid detections is observed. This behavior is possibly associated with the larger physical size of the pigs at this stage, coupled with an increase in sedentary behaviors, such as prolonged periods of rest in lateral or sternal recumbency. Such postures reduce the visibility of the dorsal region and individual markings, compromising the effective segmentation of instances, especially under the strict confidence threshold applied.

On the other hand, the largest number of successful detections was concentrated between the end of December and the beginning of January. This interval coincides with an intermediate phase of animal growth, in which body size already allows for clear identification of dorsal markings, without prolonged resting behavior yet being predominant. Thus, this period constitutes a particularly favorable time window for the application of automatic segmentation, resulting in a high success rate and a greater number of masks generated. These results highlight the potential of the proposed model for applications in real zootechnical environments and reinforce the importance of temporal analyses to guide the implementation and optimization of the system throughout the different phases of the production cycle.

Figure 7 illustrates visual examples of the segmentation performed by the model in different phases of pig development. The columns represent three distinct periods: beginning (animals still small and very active), intermediate (growth and greater morphological definition), and end (large animals, with reduced movements). It is noted that the model’s performance remains adequate throughout the phases, despite the observed morphological and behavioral differences.

Beyond the visualization of individual pig segmentations and classifications, the masks generated by the model are fundamental for quantifying the animals’ body areas. These masks were used to compute the number of segmented white pixels per image through a Python function that converts the mask into binary format and performs pixel counting. Based on these counts, the relative dorsal area per day was estimated, enabling the application of a body mass prediction equation.
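The pixel-counting step can be sketched as below. This is a minimal pure-Python illustration, assuming the mask is a grayscale grid with values from 0 to 255; the function name and the binarization cutoff of 127 are assumptions, since the text does not specify them.

```python
def count_mask_pixels(mask, threshold=127):
    """Binarize a grayscale mask and count its white pixels.

    mask: nested lists of intensity values in 0-255 (rows of pixels).
    threshold: assumed cutoff; values above it are treated as white.
    """
    return sum(1 for row in mask for value in row if value > threshold)


# Toy 3 x 3 mask (not data from the study)
mask = [
    [0, 255, 255],
    [0, 200, 30],
    [0, 0, 255],
]
area_px = count_mask_pixels(mask)
```

With NumPy arrays, the same count can be obtained with `np.count_nonzero(mask > threshold)`. The result is an area in pixels, so it is only comparable across images acquired with the same camera geometry.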

Similar findings were reported by [18], who emphasized the importance of generating binary masks after segmentation to accurately isolate the pig body, eliminate background noise, and facilitate the calculation of morphometric parameters. However, unlike the method proposed by [18], the approach presented in this study employs modern neural networks to perform segmentation automatically, without additional manual steps, making the process more robust and scalable.

Figure 8 illustrates examples of segmented images at different growth stages (small, medium, and large), along with their corresponding binary masks, highlighting the model’s ability to accurately identify the dorsal regions of the pigs.

Ref. [8] developed the PigMS R-CNN model to improve pig segmentation in group-housing environments, specifically addressing situations involving overlap between individuals. Although the focus of [8] was on detection accuracy and the separation of adjacent pigs, the present approach leverages the resulting segmentations for body area computation and subsequent body mass prediction, thereby extending the practical application of instance segmentation techniques.

Figure 9 presents the evolution of the mean segmented area (in pixels) over the monitoring period, considering the ten classes previously defined based on the animals’ body characteristics. For each day, the mean segmented area was calculated for all individuals belonging to each class, using the binary masks automatically generated by the YOLOv11 model. This approach enables continuous and non-invasive monitoring of relative body growth, providing relevant information on animal development throughout the production cycle.
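The daily aggregation can be sketched as a group-by over day and class. The tuple layout and the function name below are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean


def mean_area_by_day_and_class(observations):
    """observations: iterable of (day, class_id, area_px) tuples, one per mask.

    Returns {(day, class_id): mean segmented area in pixels}.
    """
    groups = defaultdict(list)
    for day, class_id, area_px in observations:
        groups[(day, class_id)].append(area_px)
    return {key: mean(values) for key, values in groups.items()}


# Toy observations (not data from the study)
observations = [
    ("day 1", 1, 1000), ("day 1", 1, 1100), ("day 1", 2, 900),
    ("day 2", 1, 1200),
]
daily_means = mean_area_by_day_and_class(observations)
```

Averaging over all masks of a class on a given day dampens the effect of individual frames with poor posture or partial occlusion, although days with few masks remain sensitive to outliers.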

A progressive increase in the mean segmented area, in pixels, can be observed over the monitoring days for all analyzed classes, directly reflecting the body growth of the pigs throughout the production cycle. It is important to emphasize that classes numbered from 1 to 10 correspond exclusively to identifiers assigned to individuals during the segmentation process and therefore do not represent an increasing order of body weight or size. Nevertheless, the consistent upward trend in the segmented areas over time indicates that the model was able to robustly track the animals’ morphological development.

The small fluctuations observed between consecutive days may be attributed to factors inherent to animal behavior and image acquisition conditions, such as intense movement, overlapping between individuals, and unfavorable positioning during image capture, all of which directly affect segmentation quality. These aspects highlight the importance of continuous analyses and time-series evaluation to properly assess the model’s performance throughout the production cycle.

It is worth noting that on 17 January, a reduction in the mean segmented area was recorded compared to adjacent days, particularly for some specific classes. This behavior may be associated with a lower number of available images during this period, increased clustering and interaction among animals, or localized detection and segmentation failures for certain individuals. Since the classes represent only identifiers and not a weight-based hierarchy, it is plausible that occasional segmentation inconsistencies temporarily influenced the estimated average pixel area on that date.

3.3. Results of the Prediction Models

Figure 10 shows the evolution of the average weight of the pigs over the days, comparing the actual values obtained experimentally, the predictions of the Linear Regression model and the MLP Neural Network, as well as the reference curve used to monitor growth. It can be observed that both models consistently followed the trend of daily weight gain, with small variations between them. The standard deviation bars indicate the variability of the predictions for each day, while the reference line serves as an additional parameter to validate the expected behavior of the animals during the analyzed period.

The performance metrics for these models are shown in Table 3.
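For reference, the three error metrics reported in Table 3 can be computed as in the following sketch, using their standard definitions. The function name and the toy values are illustrative and do not come from the study.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Return (R2, RMSE, MAE) for paired observed and predicted weights in kg."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)      # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    return r2, rmse, mae


observed = [30.0, 35.0, 40.0, 45.0]    # toy values, not data from the study
predicted = [31.0, 34.0, 41.0, 44.0]
r2, rmse, mae = regression_metrics(observed, predicted)
```

RMSE and MAE are expressed in the unit of the target (kg here), whereas R² is dimensionless.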

Previous studies, such as that of [3], investigated the estimation of pig body weight from 2D images using fully connected neural networks, relying on the manual extraction of geometric features, such as curvature and deviation, to compensate for postural variations in the animals. Although the authors achieved a coefficient of determination of R² = 0.79, the method exhibited greater variability in the errors and required significant manual intervention during the segmentation process. In contrast, in the present study, the direct use of the segmented area as a predictive variable, applied to both Linear Regression and MLP models, resulted in superior performance, with R² values of 0.96 and 0.95, respectively, as well as reduced average errors (RMSE ≤ 1.63 kg and MAE ≤ 1.25 kg), as shown in Table 3. These findings indicate that the direct use of segmented masks provides a more stable morphometric representation and is less sensitive to noise associated with animal posture and movement.

The temporal analysis of the mean daily weight predictions further reinforces this behavior, as both models consistently track the growth trend observed in the mean real weight throughout the experimental period. It is also observed that Linear Regression achieved slightly superior performance in terms of R² and absolute errors, whereas the MLP exhibited lower relative dispersion of predictions, reflected by a smaller percentage standard deviation (Std = 11.88%). This result suggests that, although the linear model more directly captures the global relationship between segmented area and body weight, the MLP may offer greater stability in the presence of intra-class variability observed in the images.

Additionally, Ref. [17] evaluated different machine learning algorithms for estimating pig body weight from 2D images and identified XGBoost as the best-performing model, achieving an MAE of 3.93 kg. Despite employing a broader set of predictive variables, the models proposed in the present study—based exclusively on the segmented area—achieved mean errors close to 1 kg, even under real farming conditions characterized by high behavioral and environmental variability. These findings indicate that, despite the structural simplicity of the adopted models, morphometric information derived directly from segmented masks constitutes a highly efficient, robust, and competitive predictor for estimating pig body weight.

Although Multilayer Perceptron (MLP) networks are capable of modeling complex nonlinear relationships, the results obtained in this study indicate that Linear Regression performed similarly to the MLP and, in some cases, slightly better. This behavior can be explained by the fact that the relationship between segmented dorsal area and pig body weight exhibited a predominantly linear trend throughout the analyzed period.
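A single-predictor linear model of this kind can be fitted in closed form by ordinary least squares. The sketch below is a generic illustration with toy numbers; it does not reproduce the coefficients or the fitting code of the study, and the function names are assumptions.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_weight(area_px, slope, intercept):
    """Predicted weight (kg) for a segmented dorsal area in pixels."""
    return slope * area_px + intercept


areas_px = [10000, 12000, 14000, 16000]   # toy areas, not data from the study
weights_kg = [30.0, 36.0, 42.0, 48.0]     # toy weights, exactly linear in area
slope, intercept = fit_line(areas_px, weights_kg)
```

Because the model has only two parameters, it can be inspected directly: the slope gives the estimated weight change per additional pixel of dorsal area under the given camera setup.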

In scenarios where there is a strong linear correlation between the predictor variable and the response variable, linear models tend to achieve competitive or superior performance compared to more complex models, particularly when the dataset is limited or structural variability is controlled [19,20,21]. In such situations, the use of more complex models may not yield significant gains in accuracy and may further increase the risk of overfitting.

Thus, the observed results corroborate findings in the literature indicating that, when the underlying relationship between variables is approximately linear, linear methods provide good generalization capability, greater interpretability, and lower computational cost, making them particularly suitable for practical applications in animal monitoring systems based on computer vision [22].

The next stage of the analysis aims to individually evaluate the performance of the Linear Regression and MLP Neural Network models for each of the ten identified pig classes. Figure 11 presents the relationship between the mean observed weight and the predicted weights for each class. The plots allow visualization of the models’ ability to track weight variations specific to each individual, highlighting possible discrepancies within certain weight ranges and demonstrating the level of fit achieved by each approach.

Table 4 presents the performance of the Linear Regression and Multilayer Perceptron (MLP) Artificial Neural Network models in predicting body weight for each of the 10 classes analyzed individually. Overall, both models achieved high coefficients of determination (R²), ranging from 0.91 to 0.98, indicating strong agreement between the observed weight values and the estimated values. The associated errors, expressed by RMSE and MAE, remained low for most classes, demonstrating the good predictive capability of the models even when evaluated on an individual basis.
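A class-wise evaluation of this kind amounts to grouping the residuals by class identifier before computing the errors. The sketch below illustrates the idea with toy numbers; the row layout and function name are assumptions.

```python
from collections import defaultdict
from math import sqrt


def per_class_errors(rows):
    """rows: iterable of (class_id, observed_kg, predicted_kg).

    Returns {class_id: (rmse, mae)} computed separately for each class.
    """
    residuals = defaultdict(list)
    for class_id, observed, predicted in rows:
        residuals[class_id].append(observed - predicted)
    return {
        class_id: (
            sqrt(sum(r * r for r in res) / len(res)),
            sum(abs(r) for r in res) / len(res),
        )
        for class_id, res in residuals.items()
    }


rows = [
    (7, 40.0, 39.0), (7, 50.0, 51.0),   # toy values, not data from the study
    (8, 40.0, 37.0), (8, 50.0, 54.0),
]
errors = per_class_errors(rows)
```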

In particular, Classes 7 and 9 exhibited the lowest RMSE and MAE values in both models, reflecting greater estimation accuracy. In contrast, Class 8 showed inferior performance, with the lowest R² values and the highest prediction errors. Although this class presents a comparable number of samples, its behavior suggests increased estimation difficulty associated with greater variability in the relationship between dorsal area and body weight. Based on the temporal analysis of dorsal area (in pixels), Class 8 tends to exhibit consistently higher and more dispersed area values over time compared to other classes, as reflected in the broader distribution of pixel values throughout the monitoring period, without a proportional improvement in prediction accuracy. This pattern indicates increased variability in the area–weight relationship, particularly at higher weight ranges, where prediction errors become more pronounced. As a result, the mapping between input features and target values is less consistent for this class, rather than indicating a limitation of the model itself. Furthermore, segmentation metrics remained consistently high across all classes, and no direct correspondence was observed between segmentation performance and prediction error, suggesting that the observed variability is more strongly associated with biological and morphometric differences than with segmentation quality. Classes 5 and 10 also presented relatively higher errors compared to the others, although still within an acceptable range, further reinforcing the overall robustness of the proposed approach.

The results obtained for Class 8 also highlight an important aspect of the proposed approach related to the use of 2D dorsal area as a predictor of body weight. Although dorsal area proved to be a strong proxy for animal size and demonstrated high overall performance in this study, under specific conditions it may not fully capture variations associated with body volume, which is more directly related to mass. As a result, differences in body conformation, thickness, and mass distribution may introduce additional variability in the relationship between projected area and actual weight. This behavior becomes more evident in certain individuals or growth stages, as observed for Class 8, where greater dispersion in prediction errors was identified. Nevertheless, the overall results indicate that the approach is robust under the evaluated conditions, and the integration of complementary information, such as depth data, may represent a promising direction for further refinement.
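The gap between projected area and mass can be made explicit with a simple scaling argument. The sketch below assumes, purely for illustration, that an animal grows isometrically (all linear dimensions scale with one characteristic length ℓ) at constant tissue density ρ; the shape constants k_A and k_V are hypothetical and are not quantities estimated in this study.

```latex
\begin{align}
A &= k_A\,\ell^{2}, \qquad V = k_V\,\ell^{3}, \qquad W = \rho\,V \\
W &= \rho\,k_V \left(\frac{A}{k_A}\right)^{3/2} \;\propto\; A^{3/2} \\
W(A) &\approx W(A_0) + \frac{3}{2}\,\frac{W(A_0)}{A_0}\,(A - A_0)
\end{align}
```

Under these assumptions, the first-order expansion around a reference area A_0 is linear in A, which is compatible with a linear fit performing well over a limited growth range. An animal whose thickness or conformation differs from the others has a different ratio k_V / k_A^{3/2} and would therefore deviate from a line fitted to the group.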

Following the presentation of Table 4, which details the performance of the models for each class individually, Figure 12 illustrates the overall relationship between the average segmented area of the pigs (in pixels) and the corresponding average weight obtained from the model predictions. A clear linear trend is observed, showing that the increase in area is directly associated with the weight gain of the animals. The prediction curves obtained by Linear Regression and the MLP Neural Network consistently follow the trend of the real data, with small discrepancies mainly at the upper weight extremes, possibly caused by a smaller number of samples or model saturation. This result reinforces the robustness of the segmented area as a predictor variable for weight, indicating that, even in a global model without separation by classes, it is possible to obtain good accuracy in estimating body weight from images.

Recent advances in computer vision for pig weight estimation show a transition from two-dimensional methods based on image segmentation to three-dimensional systems that extract biometric measurements and predict weight with greater accuracy. Studies such as those by [2,23] present 3D approaches using depth cameras and automated computer vision, achieving high R² values and low error margins by incorporating variables such as animal volume, area, and shape descriptors. More recent hybrid methods, such as that proposed by [24], explore the combination of 2D segmentation with depth sensors to efficiently generate 3D point clouds, reinforcing the trend toward more comprehensive three-dimensional analyses.

The results obtained in this study demonstrate that the proposed 2D approach, combining instance segmentation with Linear Regression and MLP models, achieves performance levels consistent with recent findings in the literature. Previous studies have shown that models based solely on RGB images can provide reliable weight estimation even without the use of 3D sensors [17]. In this context, the present work reinforces that well-structured 2D strategies remain competitive, particularly due to their lower cost and easier implementation in commercial production systems. At the same time, such approaches can serve as a foundation for future hybrid or three-dimensional solutions, following current technological trends in precision livestock farming.

Despite the limited number of animals evaluated in this study, continuous monitoring over time enabled the collection of a large and representative dataset, capturing variations in growth, posture, and interaction among individuals. These results highlight the importance of data diversity, indicating that the variability and quality of the input data are key factors influencing model performance in real production environments. It is important to note that the proposed model was developed and validated primarily using images of pigs in a standing posture, where the dorsal region is clearly visible and suitable for morphometric extraction. As animals grow, behavioral changes may reduce the proportion of standing postures, which can limit the direct applicability of the method under certain conditions. From a practical perspective, this aspect represents a relevant constraint for deployment in commercial systems, where animal posture is not controlled. Therefore, future studies should consider including a larger and more diverse population, as well as exploring the impact of different postures (e.g., lying or overlapping animals) on weight prediction performance. Additionally, the integration of complementary features beyond dorsal area, or the development of posture-specific models, may further improve robustness across varying production scenarios.

While the proposed approach demonstrated high accuracy in both segmentation and weight prediction tasks, computational performance aspects such as inference speed (FPS), memory consumption, and model size were not evaluated, as they were beyond the scope of this study. These factors are critical for real-time deployment in embedded and resource-constrained systems, particularly in practical precision livestock farming scenarios. From an application perspective, evaluating and optimizing these parameters is essential to ensure scalability and operational feasibility. Therefore, future work should focus on benchmarking the model under such conditions and optimizing its architecture to achieve efficient real-time performance. Additionally, a comparative evaluation of different YOLOv11 model scales was not conducted and remains an important direction for future research, aiming to balance predictive accuracy and computational efficiency.

4. Conclusions

The developed models represent an important step toward the development of low-cost tools for pig weight estimation, with potential for future implementation in embedded systems, subject to further evaluation of computational efficiency and real-time performance. This approach enables producers to continuously monitor pig productive performance and adopt timely preventive actions when weight gain is compromised by environmental, physiological, or management-related factors. The ability to automatically and non-invasively estimate body weight under real production conditions has the potential to contribute to improvements in productive efficiency, animal welfare, and decision-making within the production system. Additionally, it is important to highlight that this study is among the few that specifically address automatic body weight estimation of pigs during the growing stage, a critical phase of the production cycle characterized by high behavioral and morphological variability. In this context, the presented results demonstrate that the use of morphometric information derived from segmented 2D image masks constitutes a promising and robust approach within the evaluated conditions. Future studies should focus on expanding the dataset to include a broader range of animal weights and production stages, as well as evaluating computational performance aspects such as inference speed, memory consumption, and model size, in order to support real-time deployment in embedded and resource-constrained environments. Additionally, future work should explore the integration of markerless individual identification approaches, such as facial recognition or coat pattern analysis, to improve the scalability and practical applicability of the system.

Author Contributions

Conceptualization, J.A.O.S., V.G.C., F.A.O.V. and P.F.P.F.; methodology, F.M.d.O., J.A.O.S., V.G.C., F.A.O.V., G.A.e.S.F. and P.F.P.F.; software, F.M.d.O., F.A.O.V. and G.A.e.S.F.; validation, F.M.d.O., J.A.O.S., V.G.C., F.A.O.V., G.A.e.S.F. and P.F.P.F.; formal analysis, F.M.d.O., V.G.C., F.A.O.V., G.A.e.S.F. and P.F.P.F.; investigation, J.A.O.S., V.G.C. and F.A.O.V.; resources, J.A.O.S., V.G.C. and F.A.O.V.; data curation, F.M.d.O., V.G.C., F.A.O.V. and P.F.P.F.; writing—original draft preparation, F.M.d.O., V.G.C., G.A.e.S.F. and P.F.P.F.; writing—review and editing, F.M.d.O., J.A.O.S., V.G.C., F.A.O.V., G.A.e.S.F. and P.F.P.F.; visualization, F.M.d.O., V.G.C. and F.A.O.V.; supervision, J.A.O.S., V.G.C., F.A.O.V. and P.F.P.F.; project administration, J.A.O.S., V.G.C. and F.A.O.V.; funding acquisition, J.A.O.S., V.G.C. and F.A.O.V. All authors have read and agreed to the published version of the manuscript.

Funding

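This research was funded through the call “Convocatoria para el Apoyo a Proyectos de Investigación y Creación Artística en la Universidad Nacional de Colombia-Sede Medellín 2020” (Call for the Support of Research and Artistic Creation Projects at the Universidad Nacional de Colombia, Medellín Campus, 2020).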

Institutional Review Board Statement

The research was conducted under the approval of the Animal Ethics Committee of the National University of Colombia, Medellín campus. Code: CICUA-057-20.

Informed Consent Statement

Not Applicable.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors gratefully acknowledge the Universidad Nacional de Colombia (UNAL), Medellín Campus, for providing the primary facilities and research infrastructure where the main experimental procedures of this study were conducted, as well as for the financial support granted through institutional funding programs and research calls. The authors also thank the Federal University of Lavras and the Graduate Program in Agricultural Engineering (PPGEA), School of Engineering, Federal University of Lavras, for their academic collaboration and institutional support. Finally, the authors acknowledge CAPES for the financial support through a graduate scholarship.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Liu, Z.; Zhang, X.; Ji, B.; Banhazi, T.; Li, C.; Zhao, S. Analysis of diurnal variations in body weight of wean-to-finish pigs. Biosyst. Eng. 2023, 228, 80–87.
2. Kongsro, J. Estimation of pig weight using a Microsoft Kinect prototype imaging system. Comput. Electron. Agric. 2014, 109, 32–35.
3. Jun, K.; Kim, S.J.; Ji, H.W. Estimating pig weights from images without constraint on posture and illumination. Comput. Electron. Agric. 2018, 153, 169–176.
4. Zhou, H.; Dong, J.; Han, S.; Chung, S.; Ali, H.; Kim, S. Weakly supervised learning through box annotations for pig instance segmentation. Sci. Rep. 2025, 15, 19706.
5. Nguyen, A.H.; Holt, J.P.; Knauer, M.T.; Abner, V.A.; Lobaton, E.J.; Young, S.N. Towards rapid weight assessment of finishing pigs using a handheld, mobile RGB-D camera. Biosyst. Eng. 2023, 226, 155–168.
6. Kashiha, M.; Bahr, C.; Ott, S.; Moons, C.P.; Niewold, T.A.; Ödberg, F.O.; Berckmans, D. Automatic identification of marked pigs in a pen using image pattern recognition. Comput. Electron. Agric. 2013, 93, 111–120.
7. Hansen, M.F.; Smith, M.L.; Smith, L.N.; Salter, M.G.; Baxter, E.M.; Farish, M.; Grieve, B. Towards on-farm pig face recognition using convolutional neural networks. Comput. Ind. 2018, 98, 145–152.
8. Tu, S.; Yuan, W.; Liang, Y.; Wang, F.; Wan, H. Automatic detection and segmentation for group-housed pigs based on PigMS R-CNN. Sensors 2021, 21, 3251.
9. Liao, Y.; Qiu, Y.; Liu, B.; Qin, Y.; Wang, Y.; Wu, Z.; Xu, L.; Feng, A. YOLOv8A-SD: A Segmentation-Detection Algorithm for Overlooking Scenes in Pig Farms. Animals 2025, 15, 1000.
10. Liu, J.; Xiao, D.; Liu, Y.; Huang, Y. A pig mass estimation model based on deep learning without constraint. Animals 2023, 13, 1376.
11. Paudel, S.; de Sousa, R.V.; Sharma, S.R.; Brown-Brandl, T. Deep learning models to predict finishing pig weight using point clouds. Animals 2023, 14, 31.
12. Wang, Z.; Li, Q.; Yu, Q.; Qian, W.; Gao, R.; Wang, R.; Wu, T.; Li, X. A review of visual estimation research on live pig weight. Sensors 2024, 24, 7093.
13. Wang, Y.; Yang, W.; Winter, P.; Walker, L.T. Non-contact sensing of hog weights by machine vision. Appl. Eng. Agric. 2006, 22, 577–582.
14. Wang, Y.; Yang, W.; Winter, P.; Walker, L. Walk-through weighing of pigs using machine vision and an artificial neural network. Biosyst. Eng. 2008, 100, 117–125.
15. Wongsriworaphon, A.; Arnonkijpanich, B.; Pathumnakul, S. An approach based on digital image analysis to estimate the live weights of pigs in farm environments. Comput. Electron. Agric. 2015, 115, 26–33.
16. Lu, M.; Norton, T.; Youssef, A.; Radojkovic, N.; Fernández, A.P.; Berckmans, D. Extracting body surface dimensions from top-view images of pigs. Int. J. Agric. Biol. Eng. 2018, 11, 182–191.
17. Chen, Y.; Li, Z.; Yin, L.; Kuang, Y. A Novel Approach of Pig Weight Estimation Using High-Precision Segmentation and 2D Image Feature Extraction. Animals 2025, 15, 2975.
18. Guo, Y.; Zhu, W.; Jiao, P.; Chen, J. Foreground detection of group-housed pigs based on the combination of Mixture of Gaussians using prediction mechanism and threshold segmentation. Biosyst. Eng. 2014, 125, 98–104.
19. Hastie, T.; Tibshirani, R.; Friedman, J.H. The Elements of Statistical Learning: Data Mining, Inference, and Prediction; Springer: New York, NY, USA, 2009; Volume 2. Available online: https://link.springer.com/book/10.1007/978-0-387-84858-7 (accessed on 31 March 2026).
20. Hastie, T.; Tibshirani, R.; Friedman, J. An Introduction to Statistical Learning; Springer: Berlin/Heidelberg, Germany, 2009.
21. James, G.; Witten, D.; Hastie, T.; Tibshirani, R. An Introduction to Statistical Learning: With Applications in R; Springer: New York, NY, USA, 2013; Volume 103.
22. Bishop, C.M.; Nasrabadi, N.M. Pattern Recognition and Machine Learning; Springer: New York, NY, USA, 2006; Volume 4, p. 738.
23. Fernandes, A.F.; Dórea, J.R.; Fitzgerald, R.; Herring, W.; Rosa, G.J. A novel automated system to acquire biometric and morphological measurements and predict body weight of pigs via 3D computer vision. J. Anim. Sci. 2019, 97, 496–508.
24. Wang, S.; Jiang, H.; Qiao, Y.; Jiang, S. A method for obtaining 3D point cloud data by combining 2D image segmentation and depth information of pigs. Animals 2023, 13, 2472.

Figure 1. Examples of images included in and excluded from annotation: (a) pigs in an upright posture, without overlapping and positioned centrally in the camera’s field of view, considered suitable for annotation; (b) pigs lying down or located at the edges of the image, where the dorsal area is not fully visible or is distorted, discarded from the annotation process.

Figure 2. Performance of some YOLO versions. Source: Ultralytics (2025).

Figure 3. Flowchart of the YOLOv11 model application on images.

Figure 4. Distribution of the number of instances per class: (a) training data; (b) validation data. A moderate class imbalance can be observed, with Class 5 presenting the highest number of samples and Class 3 the lowest in both partitions.

Figure 5. Results of the model during training and validation.

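Figure 6. Result of applying the model: total number of generated masks and number of images containing at least one valid detection for each capture date.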

Figure 7. Segmentation and classification of pigs at different growth stages. The first column presents representative examples of small pigs, the second column shows medium-sized pigs, and the third column illustrates large pigs.

Figure 8. Segmentation and binary mask extraction results generated by the YOLOv11 model at different pig growth stages. The figure is organized into two rows and six columns, with two columns dedicated to each growth stage (small, medium, and large). The top row displays the input images with the detections and classifications assigned by the model, while the bottom row presents the corresponding binary masks highlighting the segmented regions of each animal. These masks were used to compute the number of segmented pixels per image and subsequently estimate the pigs’ body areas.
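
The pixel count described in the Figure 8 caption can be illustrated with a minimal sketch. It assumes the binary mask is available as a 2D grid in which any non-zero value marks the segmented animal; the function name `mask_area_pixels` and the small example mask are hypothetical and are not taken from the study. In practice, an array library would normally do this count on full-resolution masks.

```python
def mask_area_pixels(mask):
    """Count the foreground (non-zero) pixels of a 2D binary mask."""
    return sum(1 for row in mask for value in row if value)


# Hypothetical 4 x 5 mask: 1 = segmented dorsal region, 0 = background.
example_mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
]

area = mask_area_pixels(example_mask)
print(f"Segmented area: {area} pixels")
```

The resulting pixel count is the single predictor that the weight models take as input, so its value depends on camera height and image resolution remaining fixed across recordings.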

Figure 9. Evolution of the average segmented area, in pixels, over the days of the experiment.

Figure 10. Mean daily weight prediction across animals.

Figure 11. Model prediction results for individual classes.

Figure 12. Results of model prediction by area in pixels.

Table 1. Main training hyperparameters used for the YOLOv11-L-seg model.

Parameter	Value
Epochs	200
Batch size	16
Image size	640 × 640
Initial learning rate (lr0)	0.001
Optimizer	Default (Ultralytics YOLO)
Learning rate scheduler	Default (Ultralytics YOLO)
HSV-H augmentation	0.010
HSV-S augmentation	0.5
HSV-V augmentation	0.2
Rotation (degrees)	20
Translation	0.2
Scale	0.25
Shear	5
Horizontal flip	0.5
Vertical flip	0.5
Checkpoint saving interval	Every 10 epochs
Early stopping (patience)	100 (default, Ultralytics YOLO)
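
As an illustration, the settings in Table 1 could be passed to the Ultralytics Python API roughly as sketched below. This is a sketch under stated assumptions and not the authors' training script: the mapping of table rows to argument names (for example, horizontal flip to `fliplr` and vertical flip to `flipud`), the weights file name `yolo11l-seg.pt`, and the `data_yaml` path argument are assumptions. The optimizer and scheduler are left unset so that the library defaults apply, as stated in the table.

```python
# Table 1 values expressed as Ultralytics train() keyword arguments.
# The row-to-argument mapping is an assumption made for illustration.
TRAIN_ARGS = {
    "epochs": 200,
    "batch": 16,
    "imgsz": 640,
    "lr0": 0.001,
    "hsv_h": 0.010,
    "hsv_s": 0.5,
    "hsv_v": 0.2,
    "degrees": 20,
    "translate": 0.2,
    "scale": 0.25,
    "shear": 5,
    "fliplr": 0.5,       # horizontal flip probability
    "flipud": 0.5,       # vertical flip probability
    "save_period": 10,   # checkpoint every 10 epochs
    "patience": 100,     # early-stopping patience
}


def train_segmentation(data_yaml, weights="yolo11l-seg.pt"):
    """Start training; needs the `ultralytics` package and a dataset YAML file."""
    from ultralytics import YOLO  # imported lazily so TRAIN_ARGS works without it

    model = YOLO(weights)
    return model.train(data=data_yaml, **TRAIN_ARGS)
```

Calling `train_segmentation("pigs.yaml")` with a hypothetical dataset file would start a run with these settings.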

Table 2. YOLOv11 Model Performance Metrics for Segmentation.

Dataset	Precision	Recall	mAP@0.50	mAP@0.50–0.95
Training	0.949	0.970	0.988	0.885
Test	0.849	0.886	0.936	0.819

Table 3. Prediction results.

Model	R²	RMSE (kg)	MAE (kg)	Std (%)
Linear Regression	0.96	1.52	1.20	13.22
MLP	0.95	1.63	1.25	11.88
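
For reference, the R², RMSE, and MAE columns of Table 3 follow the standard definitions, which can be sketched for a single-predictor linear model as below. The helper names `fit_line` and `regression_metrics` and the area and weight values are hypothetical and serve only to show the calculation; they are not data from the study.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def regression_metrics(y_true, y_pred):
    """Return R², RMSE, and MAE for paired observations and predictions."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }


# Hypothetical dorsal areas (pixels) and scale weights (kg).
areas = [10000, 12000, 14000, 16000]
weights = [30.0, 35.5, 40.5, 46.0]

slope, intercept = fit_line(areas, weights)
predicted = [slope * a + intercept for a in areas]
print(regression_metrics(weights, predicted))
```

RMSE and MAE carry the unit of the target variable (kg here), whereas R² is dimensionless.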

Table 4. Prediction results for individual classes.

Class	Linear Regression R²	Linear Regression RMSE (kg)	Linear Regression MAE (kg)	ANN-MLP R²	ANN-MLP RMSE (kg)	ANN-MLP MAE (kg)
1	0.95	1.61	1.18	0.95	1.69	1.23
2	0.95	1.47	1.08	0.95	1.44	1.14
3	0.95	1.52	1.08	0.96	1.47	1.11
4	0.96	1.44	1.14	0.96	1.37	1.08
5	0.95	1.64	1.40	0.95	1.60	1.40
6	0.98	1.53	1.27	0.95	1.65	1.45
7	0.98	0.96	0.82	0.98	1.01	0.93
8	0.91	2.35	1.69	0.89	2.67	1.86
9	0.97	1.13	0.85	0.96	1.20	0.90
10	0.94	1.79	1.31	0.94	1.79	1.31
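
A per-class breakdown like the one in Table 4 can be obtained by grouping the residuals by animal identifier before computing the error metrics. The sketch below assumes each record holds a class identifier, a measured weight, and a predicted weight; the function name `per_class_errors` and the records are hypothetical. It also reports the sample count per class, since a class with few samples yields less stable error estimates.

```python
import math
from collections import defaultdict


def per_class_errors(records):
    """Group (class_id, true_kg, predicted_kg) records and summarize errors."""
    grouped = defaultdict(list)
    for class_id, true_kg, predicted_kg in records:
        grouped[class_id].append(true_kg - predicted_kg)

    summary = {}
    for class_id, residuals in sorted(grouped.items()):
        n = len(residuals)
        summary[class_id] = {
            "n": n,
            "rmse": math.sqrt(sum(r * r for r in residuals) / n),
            "mae": sum(abs(r) for r in residuals) / n,
        }
    return summary


# Hypothetical records for two animals.
records = [
    (7, 40.0, 41.0),
    (7, 50.0, 49.0),
    (8, 40.0, 43.0),
    (8, 50.0, 49.0),
]

for class_id, stats in per_class_errors(records).items():
    print(class_id, stats)
```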
