Enhanced Prediction of Broiler Shipment Weight Using Vision-Assisted Load Cell Analysis

Lunfei Yang; Juwhan Song

doi:10.3390/agriculture15181947

¹ Graduate School of Artificial Intelligence, Jeonju University, Jeonju-si 55069, Republic of Korea

² Artificial Intelligence Research Center, Jeonju University, Jeonju-si 55069, Republic of Korea

* Author to whom correspondence should be addressed.

Agriculture 2025, 15(18), 1947; https://doi.org/10.3390/agriculture15181947

This article belongs to the Section Artificial Intelligence and Digital Agriculture

Abstract

Accurate prediction of broiler shipment weight is essential for optimizing production planning and meeting market demand. Previous studies have estimated representative daily weight values from load cell data using K-means clustering and kernel density estimation (KDE) and have applied forecasting models such as Prophet, ARIMA, and Gompertz. Among these, the combination of K-means and Prophet demonstrated the best performance. In this study, we propose an enhanced method integrating computer vision with load cell measurements. The YOLOv8n model localizes broilers in images, while a 5-pixel edge region, both inside and outside the weighing platform boundaries, filters invalid weight values. This enables accurate broiler counting on the weighing platform. The instantaneous population mean weight distribution is estimated by dividing the total measured weight by the detected broiler count. The representative daily weight values are then calculated through averaging. Additionally, we compare five outlier processing methods to evaluate their effectiveness in improving prediction accuracy. Experimental results show that our method achieves a prediction error of less than 50 g for broiler shipment weights, which will significantly improve farm operation efficiency and reduce feeding cost losses. This approach has already been deployed in selected farms and is ready for comprehensive implementation.

Keywords: broiler weight prediction; load cell; object detection; outlier handling; smart poultry farming

1. Introduction

The livestock industry in the Republic of Korea faces numerous external challenges, including globalization, a declining agricultural population, and an aging society. These factors are making it increasingly difficult to maintain self-sufficiency in livestock and feed production. Smart livestock farming, by contrast, has the potential to reduce resource waste and labor demands, thereby supporting sustainable and autonomous agricultural practices. In recent years, smart livestock management technologies based on big data and artificial intelligence have significantly improved productivity through advancements in breeding, nutrition, environmental control, and maintenance reduction [].

In the poultry sector, the Republic of Korea has made continuous progress in meat quality and environmental management. However, the domestic self-sufficiency rate remains relatively low, with a heavy reliance on imports. In 2024, the total import volume of chicken breast, wings, and legs reached 184,716 tons, and this upward trend continued from January to April 2025 [].

In Jeollabuk-do, determining the optimal shipment time for broilers has long been a major challenge for poultry farmers. According to standard broiler farming contracts, if the difference between the predicted and actual shipment weight is within ±50 g, farmers receive a bonus of 3 KRW per kilogram. However, if the deviation exceeds this range, a penalty of 6 KRW per kilogram is imposed. Therefore, accurate weight prediction is directly tied to farmers’ income. Additionally, buyer requirements have become increasingly specific—for instance, distributors demand broilers weighing between 1.1 and 1.2 kg, school lunch suppliers require 1.7 kg, and processing companies request 1.9 kg. If the broilers fall outside the desired weight range, they must be sold as cut-up parts, which complicates operations and reduces profitability. As such, effective monitoring and prediction of broiler weight are essential for modern poultry farming. Some large-scale farms have already begun installing load cells and overhead cameras to collect growth data. However, numerous challenges arise during actual implementation. Most notably, load cells often struggle to distinguish between valid and anomalous data, or they apply overly strict standards, resulting in significant noise that negatively impacts the accuracy of predictive models [].

In our previous research, we proposed a prediction framework capable of automatically collecting broiler weight data from smart farms in the Namwon and Wanju regions. To address the challenges posed by manual weighing and data noise, K-means clustering and KDE were applied to optimize the raw data and extract representative daily weight values. These values were then used as inputs for various time series forecasting models, including Prophet, Gompertz, and ARIMA. Among these, the K-means + Prophet model achieved the best evaluation metrics, with an MAE of 79.65, an MAPE of 4.92, and an RMSE of 102.18. The second-best performer was the K-means + double exponential smoothing model, with an MAPE of 9.82 and an RMSE of 207.99 []. Our study revealed that the extraction of representative daily weight values is the primary factor affecting prediction accuracy across the growth models. When identical daily weight values were used, the Prophet predictor performed best, followed in descending order by the double exponential smoothing, ARIMA, and Gompertz models.

The K-means and KDE methods exhibit a fundamental limitation: when weight sample data from individual broilers are insufficient, two critical issues arise. First, the inability to form effective density distributions results in a complete loss of weight distribution characteristics for single broilers. Second, the limited density sample size causes the extracted age-representative weight values to significantly deviate from the true range, leading to systematic overestimation or underestimation phenomena.

Building on this research framework and aiming to obtain stable age-representative weight values, this study proposes an enhanced shipment weight prediction method that integrates computer vision with sensor measurements. The approach employs the YOLOv8n object detection algorithm to identify and count the broilers on the load cell in real time. To improve detection accuracy, an edge/center region filtering strategy is applied to exclude images that may introduce noise or errors, using a fixed edge region width of 5 pixels both inside and outside the boundary of the weighing platform. The total weight recorded by the load cell is then divided by the detected number of broilers to estimate a more robust average weight per broiler. We employ six methods to calculate representative weight values from the acquired data: direct mean computation, and five outlier filtering approaches (IQR + Z-score, DBSCAN, Isolation Forest, One-Class SVM, and Mahalanobis distance), each followed by mean calculation. These representative weight values are then assembled into a time series and fed into the Prophet model for final shipment weight prediction. Experimental results demonstrate that this method significantly improves prediction accuracy, which is primarily attributed to the highly stable calculation of age-representative weight values. The edge/center region filtering strategy excluded a large volume of noisy weight measurements. Comparative analysis of the five outlier filtering methods further indicates that outlier processing moderately enhances accuracy, confirming the stability and robustness of the method.

The proposed method is straightforward to deploy and integrates with existing farm operations without disrupting routine feeding environments. The age-specific representative weight values are highly stable, with weight distributions accurately reflecting real-time growth trends of broiler populations. The method therefore provides a more reliable decision support tool for transportation scheduling in smart poultry farms, demonstrating strong potential for practical application.

2. Background

2.1. Overview of Broiler Weight Prediction Methods

Predicting broiler weight is essential for optimizing production and determining shipment schedules. Traditional studies primarily employed nonlinear growth models such as the Gompertz model, the logistic model, and the von Bertalanffy model to fit broiler growth curves under various conditions [,,,,,,,]. These models generally offer high fitting accuracy, although their performance may vary depending on age and breed. Later research introduced dynamic neural networks that incorporate environmental data such as feed intake, humidity, and temperature, thereby improving prediction accuracy but also increasing data collection costs [,,]. Although these methods have achieved success, most rely on ideal or manually optimized data, which limits their applicability in real-world scenarios.

Monitoring and collecting broiler weight information can help farmers understand the growth status and trends of their flocks, allowing them to adjust feeding strategies accordingly. De Wet et al. [] observed 50 broilers raised under commercial conditions to compare traditional manual weighing with automatic weighing systems. They used nonlinear regression to analyze the relationship between body weight and target surface pixel count, as well as between body weight and target contour pixel count. Mortensen et al. [] used 3D computer vision technology combined with neural networks to predict broiler weight. The prediction error ranged from 10 to 100 g in the early stage of broiler growth and from 50 to 250 g in the later stage. Amraei et al. [] also employed 3D computer vision technology along with neural networks for broiler weight prediction. In the study, digital image processing techniques were used to extract features such as area, perimeter, convex area, major and minor axes, and eccentricity from broiler images, and a neural network was trained for weight prediction. This method achieved a prediction error of less than 50 g. Liu et al. [] used a depth camera and an electronic scale to perform individual weighing of broilers. After segmenting the broiler target regions, they applied KDE for adaptive gender classification. This method achieved a gender classification accuracy of 99.7% and an individual sampling rate of 77.32%. However, the study required 70 h of manual effort to annotate the dataset. Although numerous research achievements have been made in the field of livestock weight prediction, these methods still face various practical challenges, such as differences in farming environments and high implementation costs [].

In large-scale broiler farms where approximately 30,000 chickens are raised simultaneously, most existing research methods for monitoring broiler weight growth cannot be directly deployed in such farming environments. To further improve the development of smart poultry farms and achieve a more automated breeding system, we installed automated weighing sensors within the farm to collect broiler weight data in real time.

Our research found that single-broiler weight measurements contain significant noise, and that the many invalid weight values hinder the analysis of data density, causing discrepancies between the obtained age-representative weight values and the true values. To address this issue, assigning a broiler count to each weight value becomes essential. We installed cameras above the weighing sensors to capture weight data and visual information of the weighing platform simultaneously. By using object detection algorithms to assign broiler counts to weight values, we obtained more comprehensive population weight distributions, which contribute significantly to shipment weight prediction and to monitoring flock growth during the breeding period. Object detection for broiler identification and localization is therefore a key component of the method.

2.2. Object Detection for Broiler Counting on the Weighing Platform

With the advancement of computer vision technology, image analysis methods based on object detection have been progressively applied to poultry quantity monitoring. In 2020, Guo et al. [] employed traditional image processing methods (including color space classification, Otsu binarization, and clustering algorithms for static/moving object identification) to analyze broiler floor distribution, establishing foundational techniques for real-time monitoring tools to assess broiler behavior and spatial patterns in commercial facilities. Geffen et al. [] utilized Faster R-CNN, a convolutional neural network (CNN)-based object detection algorithm, to automatically count caged laying hens. In 2022, Siriani et al. [] achieved a 99.9% detection accuracy for chickens in low-quality videos using the YOLOv4 model. In 2024, Cruz et al. [] adopted the YOLOv8 object detection model for precise chicken counting, while a comparative analysis with earlier models, including YOLOv5, highlighted YOLOv8's superior accuracy and robustness [], demonstrating its strong practical potential. The YOLO series of object detection algorithms, particularly YOLOv8, has emerged as one of the most performant and practical technologies for poultry quantity recognition, providing a reliable image-based foundation for broiler weight estimation.

However, existing studies remain limited as most methods focus solely on quantity counting without directly correlating detection results with weight data. To address this, our study proposes integrating object detection algorithms to identify broiler quantities on weighing sensor platforms and real-time matching with sensor data. This approach significantly enhances the monitoring accuracy of broiler weight growth by resolving the disconnection between quantity and weight data in conventional methods, thereby providing more reliable technical support for precision farming.

2.3. Outlier Handling

After determining the number of broilers on the weighing sensor platform through object detection algorithms, the weight data from the load cells can be utilized to estimate the average body weight per unit time. However, in actual production environments, automatically collected weight data are often affected by factors such as overlapping broilers, abnormal postures, feed residues, feces, or feather interference. These factors introduce significant outliers and noise into the sensor data. To ensure the accuracy and stability of the predictive model training, systematic outlier detection must be performed on the raw data prior to modeling.

2.3.1. IQR and Z-Score

Among common statistical methods, IQR (interquartile range) and Z-score (standard deviation method) [] are widely used for outlier detection in univariate data. The IQR method determines outliers by calculating the distance between the first quartile and third quartile, making it suitable for data with clear median trends. The Z-score method calculates sample deviations based on the mean and standard deviation, making it applicable to normally distributed data. Both methods are computationally simple and efficient, but their effectiveness is limited when dealing with high-dimensional or asymmetrically distributed data.
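As a minimal sketch of how the two rules can be chained on univariate per-broiler weights, the following standard-library Python applies the IQR fences first and the Z-score test second. The function name, the order of the two steps, and the thresholds (1.5 × IQR and |z| ≤ 3) are illustrative assumptions, not settings reported in this study.

```python
import statistics

def iqr_zscore_filter(values, iqr_k=1.5, z_max=3.0):
    """Keep values inside the IQR fences, then inside |z| <= z_max.

    iqr_k and z_max are conventional defaults chosen for illustration.
    """
    values = list(values)
    if len(values) < 4:
        # Too few samples for meaningful quartiles; return unchanged.
        return values
    # Quartiles via the default (exclusive) method of statistics.quantiles.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - iqr_k * iqr, q3 + iqr_k * iqr
    kept = [v for v in values if low <= v <= high]
    if len(kept) < 2:
        return kept
    mu = statistics.fmean(kept)
    sd = statistics.pstdev(kept)
    if sd == 0:
        return kept
    # Z-score test on the values that survived the IQR fences.
    return [v for v in kept if abs(v - mu) / sd <= z_max]

if __name__ == "__main__":
    weights_g = [1000, 1010, 990, 1005, 995, 1020, 980, 2400, 15]
    print(iqr_zscore_filter(weights_g))
```

In this toy sample the two extreme readings (2400 g and 15 g) fall outside the IQR fences, so the mean of the retained values is no longer pulled away from the bulk of the data.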

2.3.2. DBSCAN

DBSCAN (Density-Based Spatial Clustering of Applications with Noise), originally proposed by Ester et al., can divide clustering regions based on density and automatically identify non-clustered “outliers”, making it widely used in outlier detection tasks []. The algorithm determines whether a point belongs to a high-density region by defining two parameters: the neighborhood radius $ε$ (epsilon) and the minimum number of points MinPts in the neighborhood. First, for any point $p$ in the dataset, its $ε$-neighborhood is defined as in Equation (1). Here, $D$ represents the sample dataset, and $dist(p, q)$ denotes the distance between points $p$ and $q$, typically calculated using Euclidean distance. If the $ε$-neighborhood of a point $p$ contains at least MinPts points (the minimum point threshold), then $p$ is defined as a core point, i.e., $|N_{ε}(p)| \geq MinPts$. Furthermore, if a point $q$ lies within the $ε$-neighborhood of a core point $p$, then $q$ is considered directly density-reachable from $p$ (that is, $q \in N_{ε}(p)$, provided $p$ is a core point).

$N_{ε}(p) = \{q \in D \mid dist(p, q) \leq ε\}$

(1)

During algorithm execution, all core points and their density-reachable points are clustered together. Points that are neither core points nor covered by any core point (that is, points that are not density-reachable) are labeled as “noise points” or outliers, as formally defined in Equation (2). Therefore, DBSCAN achieves automatic outlier detection while performing clustering tasks through its density-based rejection mechanism.

$q \notin \bigcup_{p \in CorePoints} N_{ε}(p)$

(2)
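The noise criterion of Equations (1) and (2) can be illustrated with a short standard-library sketch for one-dimensional weight values. It only marks noise points and does not assign cluster labels, so it is not a full DBSCAN implementation; the function name and the example values of $ε$ and MinPts are assumptions made for illustration.

```python
def dbscan_noise_mask(values, eps, min_pts):
    """Return a list of booleans, True where a value is noise per Eq. (2).

    Brute-force O(n^2) neighborhood search, intended for small samples only.
    """
    n = len(values)
    # Equation (1): eps-neighborhood of each point (the point itself included).
    neighborhoods = [
        [j for j in range(n) if abs(values[i] - values[j]) <= eps]
        for i in range(n)
    ]
    # Core points have at least min_pts points in their eps-neighborhood.
    is_core = [len(nb) >= min_pts for nb in neighborhoods]
    # A point is covered if it lies in the neighborhood of any core point.
    covered = [False] * n
    for i in range(n):
        if is_core[i]:
            for j in neighborhoods[i]:
                covered[j] = True
    # Equation (2): points covered by no core point are noise.
    return [not c for c in covered]

if __name__ == "__main__":
    weights_g = [1000, 1005, 1010, 1015, 1020, 1500, 200]
    mask = dbscan_noise_mask(weights_g, eps=20, min_pts=3)
    print([w for w, noisy in zip(weights_g, mask) if noisy])  # -> [1500, 200]
```

For datasets with millions of readings, a library implementation with spatial indexing would be the more practical choice than this quadratic search.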

2.3.3. Isolation Forest

Isolation Forest is an unsupervised anomaly detection method based on the concept of “isolation”. Its core principle is that anomalous samples, being sparsely distributed in the data space, can be isolated more quickly through random partitioning, while normal samples typically reside in dense regions and require more splits to be isolated [].

The steps for anomaly detection with Isolation Forest are:

1. Construct $t$ binary trees (Isolation Trees, iTrees) randomly from the training dataset;
2. For a test sample $x$, input it into all iTrees and record the path length $h_{i}(x)$ from the root node to complete isolation (reaching a leaf node) in each tree. Then, calculate the average path length across all trees (Equation (3)):

$\bar{h}(x) = \frac{1}{t} \sum_{i=1}^{t} h_{i}(x)$

(3)
3. The anomaly score is calculated according to Equation (4), where $c(m)$ represents the theoretical expected path length for a sample size of $m$ and serves as a normalization factor for path lengths. It is defined in Equation (5), where $H(i) \approx \ln(i) + γ$ is the approximation of the $i$-th harmonic number and $γ \approx 0.5772$ is the Euler–Mascheroni constant:

$s(x, m) = 2^{-\frac{\bar{h}(x)}{c(m)}}$

(4)

$c(m) = \begin{cases} 2H(m-1) - \frac{2(m-1)}{m}, & m > 2 \\ 1, & m = 2 \\ 0, & m \leq 1 \end{cases}$

(5)
4. If $s(x, m) \approx 1$, the sample $x$ is highly likely to be an anomaly. If $s(x, m) < 0.5$, the sample $x$ is generally considered normal. When all samples in the dataset yield scores close to 0.5, no significant anomalies exist in the dataset.

The Isolation Forest method leverages an intuitive combination of random tree structures and path lengths, offering significant advantages including high computational efficiency and strong scalability with data size.
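The scoring part of the procedure (Equations (3) to (5)) can be sketched in a few lines of standard-library Python. Tree construction is omitted; the path lengths passed to `anomaly_score` are assumed to have been measured on already-built iTrees, and the function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649  # Euler-Mascheroni constant

def harmonic_approx(i):
    """Approximate the i-th harmonic number as ln(i) + gamma."""
    return math.log(i) + EULER_GAMMA

def c(m):
    """Normalization factor of Equation (5) for a sample size m."""
    if m > 2:
        return 2.0 * harmonic_approx(m - 1) - 2.0 * (m - 1) / m
    if m == 2:
        return 1.0
    return 0.0

def anomaly_score(path_lengths, m):
    """Equations (3) and (4): average path length mapped to a score in (0, 1]."""
    norm = c(m)
    if norm == 0.0:
        raise ValueError("sample size m must be at least 2")
    h_bar = sum(path_lengths) / len(path_lengths)  # Equation (3)
    return 2.0 ** (-h_bar / norm)                  # Equation (4)

if __name__ == "__main__":
    m = 256
    print(anomaly_score([2, 3, 2], m))     # short paths: score well above 0.5
    print(anomaly_score([14, 15, 16], m))  # long paths: score below 0.5
```

When the average path length equals $c(m)$, the score is exactly 0.5, which is why scores clustered around 0.5 indicate no clear anomalies.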

2.3.4. One-Class Support Vector Machines

One-Class SVM (One-Class Support Vector Machines) is an unsupervised learning method commonly used for anomaly detection, proposed by Schölkopf et al. in 1999 []. Its core concept involves identifying an optimal hyperplane in a high-dimensional feature space that separates the majority of samples from the origin, thereby detecting anomalies that deviate from the primary data distribution. The model training process is achieved by solving a convex quadratic programming problem (Equations (6) and (7)).

$\min_{w, ξ_{i}, ρ} \left( \frac{1}{2} \|w\|^{2} + \frac{1}{νn} \sum_{i=1}^{n} ξ_{i} - ρ \right)$

(6)

subject to $w \cdot Φ(x_{i}) \geq ρ - ξ_{i}, \quad ξ_{i} \geq 0$

(7)

Here, $Φ(x_{i})$ denotes the feature mapping associated with the kernel function, which maps input samples to a high-dimensional feature space. The slack variable $ξ_{i}$ permits some samples to fall on the wrong side of the hyperplane, while the parameter $ν \in (0, 1]$ controls the trade-off between model complexity and anomaly tolerance. The hyperplane offset $ρ$ determines the decision boundary position. After training, the One-Class SVM’s discriminant function is given by Equation (8).

$f(x) = sgn\left( \sum_{i=1}^{n} α_{i} k(x_{i}, x) - ρ \right)$

(8)

subject to $0 \leq α_{i} \leq \frac{1}{νn}, \quad \sum_{i=1}^{n} α_{i} = 1$

(9)

where $α_{i}$ represents the Lagrange multipliers obtained by solving the dual optimization problem, subject to the constraints in Equation (9). A sample $x$ is classified as anomalous when the decision function yields $f(x) = -1$ and as normal when $f(x) = +1$. Commonly used kernel functions include the Radial Basis Function (RBF) kernel (Equation (10)). Here, $σ$ denotes the bandwidth parameter of the kernel function, controlling the model’s sensitivity to data variations. A smaller $σ$ makes the model sensitive to local differences, while a larger $σ$ emphasizes global trends. Furthermore, Schölkopf et al. [] demonstrated that the hyperparameter $ν$ in One-Class SVM has clear statistical significance: it represents both the theoretical upper bound for the proportion of anomalies and the theoretical lower bound for the proportion of support vectors. Due to its effective handling of high-dimensional data, One-Class SVM has found widespread practical applications in sensor data analysis, image recognition, industrial data cleaning, and anomaly detection tasks while maintaining strong generalization performance.

$k(x_{i}, x_{j}) = \exp\left( -\frac{\|x_{i} - x_{j}\|^{2}}{2σ^{2}} \right)$

(10)
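To make Equations (8) and (10) concrete, the sketch below evaluates the RBF kernel and the decision function for a model whose multipliers $α_{i}$ and offset $ρ$ are already known. Solving the quadratic program is not shown. The support vectors, multipliers, offset, and bandwidth in the example are hypothetical values chosen for illustration, and treating $sgn(0)$ as $+1$ is likewise an assumption.

```python
import math

def rbf_kernel(x_i, x_j, sigma):
    """Equation (10): RBF kernel with bandwidth sigma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def decision(x, support_vectors, alphas, rho, sigma):
    """Equation (8): +1 for normal samples, -1 for anomalous samples."""
    score = sum(
        alpha * rbf_kernel(sv, x, sigma)
        for sv, alpha in zip(support_vectors, alphas)
    )
    # A score exactly on the boundary is treated as normal (assumption).
    return 1 if score - rho >= 0 else -1

if __name__ == "__main__":
    # Hypothetical trained model on one feature (per-broiler weight in g).
    support_vectors = [(1000.0,), (1020.0,), (980.0,)]
    alphas = [0.4, 0.3, 0.3]  # sums to 1, as required by Equation (9)
    rho, sigma = 0.5, 30.0
    print(decision((1005.0,), support_vectors, alphas, rho, sigma))  # -> 1
    print(decision((1500.0,), support_vectors, alphas, rho, sigma))  # -> -1
```

In practice the multipliers and offset would come from a solver such as the `OneClassSVM` estimator in scikit-learn, whose `gamma` parameter corresponds to $1 / (2σ^{2})$ in the notation of Equation (10).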

2.3.5. Mahalanobis

The Mahalanobis distance [] is a multivariate outlier detection method that accounts for correlations between data features, originally proposed by P. C. Mahalanobis in 1936. Unlike conventional Euclidean distance, the Mahalanobis distance effectively identifies outliers in multidimensional spaces with correlated features by incorporating the data’s covariance matrix. For a given sample point x, its Mahalanobis distance from the dataset’s centroid (mean vector) μ is defined by Equation (11).

$D_{M}(x) = \sqrt{(x - μ)^{T} Σ^{-1} (x - μ)}$

(11)

Here, $x$ represents the feature vector of the sample being tested, $μ$ denotes the mean vector of the dataset, $Σ$ is the covariance matrix of the dataset, and $Σ^{-1}$ represents the inverse of the covariance matrix. In anomaly detection tasks, the Mahalanobis distance is typically employed to measure how significantly a sample deviates from the center of the overall data distribution. When a sample’s Mahalanobis distance substantially exceeds the average level of other samples, it can be identified as a potential outlier. For livestock sensor data (such as broiler weight, body size characteristics, or other high-dimensional features where strong correlations may exist), the Mahalanobis distance serves as a robust and effective anomaly detection method. It helps identify outliers or noisy data that may occur during data collection, thereby enhancing the prediction accuracy and reliability of models.
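Equation (11) can be illustrated for the two-feature case, where the inverse covariance matrix has a closed form. The sketch below is restricted to two features for brevity; which features are used in practice is not specified here, so the inputs are generic, and the function names are illustrative.

```python
import math

def mean_vector(rows):
    """Mean of each of the two features."""
    n = len(rows)
    return (sum(r[0] for r in rows) / n, sum(r[1] for r in rows) / n)

def covariance_2x2(rows, mu):
    """Sample covariance entries (s_xx, s_xy, s_yy) of two features."""
    n = len(rows)
    s_xx = sum((r[0] - mu[0]) ** 2 for r in rows) / (n - 1)
    s_yy = sum((r[1] - mu[1]) ** 2 for r in rows) / (n - 1)
    s_xy = sum((r[0] - mu[0]) * (r[1] - mu[1]) for r in rows) / (n - 1)
    return (s_xx, s_xy, s_yy)

def mahalanobis_2d(x, mu, cov):
    """Equation (11) for two features, using the closed-form 2x2 inverse."""
    s_xx, s_xy, s_yy = cov
    det = s_xx * s_yy - s_xy ** 2
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx, dy = x[0] - mu[0], x[1] - mu[1]
    # (x - mu)^T Sigma^{-1} (x - mu), Sigma^{-1} = [[s_yy, -s_xy], [-s_xy, s_xx]] / det
    d2 = (s_yy * dx * dx - 2.0 * s_xy * dx * dy + s_xx * dy * dy) / det
    return math.sqrt(d2)

if __name__ == "__main__":
    cov = (1.0, 0.5, 1.0)  # unit variances, positive correlation
    print(mahalanobis_2d((1.0, 1.0), (0.0, 0.0), cov))   # along the correlation
    print(mahalanobis_2d((1.0, -1.0), (0.0, 0.0), cov))  # against it: 2.0
```

The two example points are equally far from the mean in Euclidean terms, yet the point that contradicts the positive correlation receives the larger Mahalanobis distance, which is the property the method relies on.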

3. Materials and Methods

3.1. Data Collection

This study utilizes a dataset collected from broiler farms operated by a poultry company in Jeollabuk-do, Republic of Korea. The farm specifications are illustrated in Figure 1. Each farm measures 70–80 m in length and 19 m in width, with sidewall heights of 4 m and roof heights of 6.5 m. The facilities have a maximum capacity of approximately 30,000 Cobb500 broilers per rearing cycle. The raw dataset comprises at least three data categories: weight measurements, temporal data, and image data. For comprehensive monitoring, we deployed IoT-enabled weighing sensor devices (Emotion Co., Ltd., Jeonju, Jeollabuk-do, Republic of Korea) across three strategic locations in the broiler resting areas—front, middle, and rear sections—to capture both weight metrics and broiler count data.

Figure 1. Smart poultry farming facility.

Additionally, we installed overhead cameras above each weighing sensor to acquire visual data. Figure 2 displays the camera positioning and captured footage. When the automated weighing device detects value fluctuations, it indicates that broilers have either stepped onto or departed from the weighing platform. For locally recorded data, we implemented a filtering protocol that eliminates measurements below 10 g or exceeding 2500 g. The validated data are transmitted hourly to cloud databases via Integrated Gateway (IoT G/W) devices (e.g., Advantech MIC-710 series) for subsequent analysis.
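The local range filter described above amounts to a simple bounds check. The following sketch states it explicitly; the constant and function names are illustrative, and treating the two limits as inclusive is an assumption.

```python
MIN_VALID_G = 10    # readings below this are discarded
MAX_VALID_G = 2500  # readings above this are discarded

def filter_readings(weights_g):
    """Keep load cell readings within [MIN_VALID_G, MAX_VALID_G] grams."""
    return [w for w in weights_g if MIN_VALID_G <= w <= MAX_VALID_G]

if __name__ == "__main__":
    print(filter_readings([5, 10, 1200, 2500, 2501]))  # -> [10, 1200, 2500]
```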

Figure 2. (a) IoT-enabled weighing sensor devices; (b) synchronized device-captured footage (red arrows denote the camera field-of-view directions).

The algorithm was applied to five datasets collected from broiler farm KF0081 between November 2023 and January 2025, with detailed dataset specifications provided in Table 1. Each dataset contains per-second weight measurements and image data recorded continuously from the initial rearing stage through to the shipping date. Across the approximately 29–35-day rearing cycles, the datasets contained between 2,030,085 and 2,745,771 images. Variations in dataset size stemmed primarily from differences in broiler growth rates that affected rearing durations, along with data gaps caused by equipment malfunctions and recording adjustments due to canceled or modified shipment schedules. The saved images follow a naming convention consisting of farm name + houseID + scaleID + year-month-day + hour-minute-second + weight (the load cell value in grams). The images are stored in JPEG format as 320 × 240-pixel RGB images.

Table 1. List of the datasets used.

Based on the originally collected image data, we further processed and annotated the images to construct a high-quality dataset suitable for object detection model training. As shown in Figure 3, during the 0–35-day rearing period, we randomly sampled images from the complete set at 4-day intervals, totaling 810 images. These images were manually annotated using the LabelImg (v1.8.6) tool, with the annotation category limited to “broiler” to ensure target consistency and dataset focus. Among these, 630 images were used for training, while 180 images (90 each for validation and testing) were allocated for model performance evaluation. To improve detection accuracy, the image samples covered broilers at different ages, postures, quantities, and occlusion conditions, aiming to reproduce the image diversity encountered in real farming environments as comprehensively as possible. Figure 3a displays the total number of annotated targets at each age, while Table 2 provides the corresponding number of images for each age group.

Figure 3. (a) Composition of the object detection dataset; (b) number of images in training, validation, and test sets.

Table 2. Image count by age group.

The primary objective of employing object detection algorithms is to accurately count the broilers on the weighing sensor platform. While the metrics generated during model training reflect how precisely broilers are localized in images, they cannot substitute for an evaluation of the counting itself. To assess the feasibility of broiler counting beyond these training metrics, this study performed manual verification on processed images from the 2023_1117_KF0081_01-Img_Data dataset (farm KF0081, house ID 01, rearing began on 17 November 2023; 2,191,039 images whose filenames contain timestamps and weight sensor readings). The verification involved randomly selecting 180 detected images per day (5040 images in total) across different age groups, with each image manually inspected to validate the model’s broiler count on the load cell platform. The evaluation results were classified into three categories: (1) detected count matching the actual number (true positive), (2) discrepancy between detected and actual count (false positive), and (3) severely blurred or occluded images in which accurate counting could not be verified (human uncertainty). This assessment complements the quantitative metrics by demonstrating the model’s robustness under varying rearing ages, stocking densities, and lighting conditions in real-world applications.

This study proposes an edge/center region filtering strategy that effectively eliminates a substantial portion of invalid weight interference values. To validate the effectiveness of the edge/center region strategy in improving weight data quality, this study selected 14,115 images generated between 21 May 2024 03:59:58 and 21 May 2024 07:59:59 from the 2024_0502_KF0081_02-Img_Data dataset as experimental samples. In this experiment, three different edge/center region boundary configurations were applied to detect broilers in the images, and the average weights of broilers located in edge regions versus center regions were calculated. By comparing the weight distribution differences between these two regions, we verified whether edge regions would introduce significant bias to the final average weight estimation, thereby evaluating the efficacy and rationale of this regional filtering strategy for data cleaning and representative sample extraction.

3.2. Algorithm Composition and Design

The broiler shipment weight prediction algorithm proposed in this study consists of three main steps: detecting the number of broilers on the load cell (using the YOLOv8n model to locate broilers in the images, with an edge region of 5 pixels inside and outside the edges of the weighing sensor platform to filter out invalid weight values); determining representative average weight values (mean values after various outlier processing); and predicting shipment weight (when the daily age weight representative value exceeds 1000 g, the Prophet model is immediately used to predict future growth trends). The collected data were grouped by age (in days) for processing the weight measurements and image data recorded per second.

An object detection approach was employed to identify broilers in the images and assign a corresponding headcount label to each weight measurement. To enhance data reliability, a center/edge region discrimination strategy was implemented: weight data associated with images in which broilers were detected in the edge regions of the load cell were discarded as potentially compromised, and only data from images showing broilers exclusively in the central region were retained.

After obtaining the weight distribution, more robust average weight values were derived by dividing each weight measurement by its corresponding broiler count. Six distinct methods were then applied to calculate the daily average weights:

Raw Mean: Direct calculation without any preprocessing;
IQR + Z-score;
DBSCAN;
Isolation Forest;
One-Class SVM;
Mahalanobis.

The mean was computed following outlier removal. These procedures generated time-series data of daily average weights, which were subsequently fed into the Prophet model for shipment weight prediction. The predictive performance of each method was comparatively analyzed. The complete algorithmic workflow is illustrated in Figure 4.
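The per-broiler averaging and the daily representative value can be sketched as follows. The input layout (pairs of total weight and detected count) and the function names are assumptions for illustration; `outlier_filter` stands for any one of the five filtering methods listed above and is left as a pluggable callable.

```python
from statistics import fmean

def per_bird_weights(samples):
    """Divide each total weight (g) by its detected broiler count.

    samples: iterable of (total_weight_g, broiler_count) pairs.
    Readings with a count of zero are skipped.
    """
    return [weight / count for weight, count in samples if count > 0]

def daily_representative(samples, outlier_filter=None):
    """Mean per-broiler weight for one day, optionally after outlier removal."""
    values = per_bird_weights(samples)
    if outlier_filter is not None:
        values = outlier_filter(values)
    return fmean(values) if values else None

if __name__ == "__main__":
    day_samples = [(1000, 1), (2040, 2), (3030, 3), (0, 0)]
    print(daily_representative(day_samples))  # -> 1010.0
```

Computing one such value per day of age yields the time series that is passed to the forecasting model.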

Figure 4. Algorithm design and implementation.

For the broiler counting stage, this study constructed an object detection model using an image dataset collected from the start of rearing to shipment, in order to estimate the number of broilers appearing on the load cell in each frame. Regarding model selection, priority was given to the YOLOv8n model from the YOLOv8 detection framework, which combines lightweight characteristics with high accuracy. This model is the smallest variant in the YOLOv8 series in terms of parameters and computation, making it suitable for edge device deployment and real-time detection. Because each dataset exceeds 2 million images, inference speed is a more pressing requirement in this research than marginal gains in detection accuracy. The image composition is simple, containing only broilers and the load cell platform, so even this most lightweight variant, with the fewest parameters and the fastest inference speed, achieved very high detection accuracy. The input image size was set to 320 × 320 pixels. The model was trained for 200 epochs with a batch size of 32 images per iteration. The random seed was fixed at 1, while other parameters followed Ultralytics YOLOv8’s default settings. Training was conducted in a PyTorch (v2.0.0+cu117) environment. Detailed specifications of the experimental equipment are presented in Table 3.

Table 3. Experimental environment.

The object detection model trained through the aforementioned steps enables real-time analysis of images captured every second. In actual farm environments, load cells are installed on the floor area inside the poultry house to record the instantaneous body weights of broilers. However, broilers that stand non-vertically on the weighing platform or only partially lean against its edge regions produce abnormal weight measurements, thereby introducing bias into the overall weight distribution. Such errors are difficult to eliminate through conventional methods in large-scale automated monitoring systems, so a spatial region judgment mechanism based on image analysis is needed to filter unreliable data.

To address this issue, this study proposes a discrimination method based on a central region and a peripheral region. The method distinguishes reliable from unreliable weight measurements according to the position of the center point of each detection bounding box produced by the object detection model. Samples whose bounding box center points lie within the central region are considered valid, while those whose center points fall in the peripheral region are classified as uncertain, and their corresponding weight data are discarded.

To validate the effectiveness of this regional division strategy, this study designed three edge/center region configuration methods (Figure 5) and conducted comparative experiments using the same dataset, as detailed below:

Figure 5. (a) Experiment A (basic configuration); (b) Experiment B (center contraction); (c) Experiment C (edge expansion).

Experiment A (Basic Configuration):

A 10-pixel-wide edge region was created by extending 5 pixels inward and outward from the weighing platform boundary, with the interior area designated as the central region;

Experiment B (Center Contraction):

Building upon Experiment A, the central region was further reduced by 5 pixels inward, thereby expanding the edge region to 15 pixels in width. This configuration enables more stringent elimination of edge interference in the central region;

Experiment C (Edge Expansion):

Based on Experiment A, the edge region was extended outward by an additional 5 pixels, similarly achieving a 15-pixel width for the edge region. This setup evaluates the stability impact under more rigorous edge region settings.
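The three configurations can be expressed as inward and outward pixel offsets from the platform boundary. The sketch below is illustrative only. It assumes a rectangular platform given as `(x_min, y_min, x_max, y_max)` in pixel coordinates, treats detections beyond the edge band as irrelevant, and keeps a frame only when no detection center lies in the edge band and at least one lies in the central region. The names `CONFIGS`, `classify_point`, and `valid_count` are hypothetical.

```python
# (inward px, outward px) of the edge band relative to the platform boundary
CONFIGS = {
    "A": (5, 5),    # basic configuration: 10 px edge band
    "B": (10, 5),   # center contraction: 15 px edge band
    "C": (5, 10),   # edge expansion: 15 px edge band
}


def classify_point(cx, cy, platform, config="A"):
    """Label a bounding-box center as 'center', 'edge', or 'outside'."""
    inward, outward = CONFIGS[config]
    x0, y0, x1, y1 = platform
    if x0 + inward <= cx <= x1 - inward and y0 + inward <= cy <= y1 - inward:
        return "center"
    if x0 - outward <= cx <= x1 + outward and y0 - outward <= cy <= y1 + outward:
        return "edge"
    return "outside"  # assumption: detections off the platform are ignored


def valid_count(boxes, platform, config="A"):
    """Return the central-region broiler count, or None to discard the frame.

    boxes: list of (x_min, y_min, x_max, y_max) detection boxes.
    """
    labels = [
        classify_point((bx0 + bx1) / 2, (by0 + by1) / 2, platform, config)
        for bx0, by0, bx1, by1 in boxes
    ]
    if "edge" in labels or "center" not in labels:
        return None
    return labels.count("center")
```

With a platform of `(100, 100, 220, 220)`, a detection centered at `(107, 160)` counts as central under configuration A but falls in the edge band under configuration B, which is the stricter behavior the center contraction is meant to test.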

During the process of collecting broiler weight data using load cells, sensor data anomalies frequently occur due to the complexity of actual farming environments. Therefore, in the broiler shipment weight prediction framework proposed in this study, outlier cleaning has been implemented as a critical preprocessing step. Its objective is to eliminate abnormal weight measurements caused by non-growth factors such as equipment errors and behavioral interference, thereby enhancing the stability and representativeness of average weight estimations. This process provides more authentic and continuous broiler weight change curves for time series modeling. This study incorporates five classical anomaly detection methods suitable for unsupervised scenarios.

This study first employs the IQR (interquartile range) method for preliminary data cleaning. The specific approach involves using a 15 min time window to locally model the distribution of weight measurements within each window. For each window, we calculate the first quartile ($Q_1$) and third quartile ($Q_3$) of the data, then compute the interquartile range $IQR = Q_3 - Q_1$. All data points below $Q_1 - 1.5 \times IQR$ or above $Q_3 + 1.5 \times IQR$ are identified as outliers and removed from the sample set. Building upon the IQR cleaning, we further apply the Z-score method within the same 15 min sliding windows to eliminate any remaining extreme values. This additional step calculates the mean ($\mu$) and standard deviation ($\sigma$) of the remaining data, with any values satisfying $|x - \mu| > 3\sigma$ (i.e., $Z\text{-score} > 3.0$) being flagged as anomalies.
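A minimal sketch of this two-stage cleaning is given below. Several details are assumptions rather than the authors' choices: samples are `(timestamp_s, weight_g)` pairs, the 15 min windows are taken as non-overlapping bins, quartiles use the "inclusive" interpolation of Python's `statistics.quantiles`, the population standard deviation is used, and windows with fewer than four samples are left unfiltered.

```python
from statistics import mean, pstdev, quantiles

WINDOW_S = 15 * 60  # 15 min window length in seconds


def clean_window(values):
    """Apply the IQR rule, then the 3-sigma rule, to one window of weights."""
    if len(values) < 4:
        return list(values)  # assumption: too few samples to estimate quartiles
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    kept = [v for v in values if low <= v <= high]
    mu, sigma = mean(kept), pstdev(kept)
    if sigma == 0:
        return kept  # all remaining values identical, nothing to flag
    return [v for v in kept if abs(v - mu) <= 3 * sigma]


def iqr_zscore_filter(samples):
    """samples: list of (timestamp_s, weight_g); returns the retained weights."""
    windows = {}
    for timestamp_s, weight_g in samples:
        windows.setdefault(int(timestamp_s // WINDOW_S), []).append(weight_g)
    return [w for key in sorted(windows) for w in clean_window(windows[key])]
```

Because the Z-score stage runs on data that have already passed the IQR fences, it typically removes few additional points; its role is to catch residual extremes in windows with a wide interquartile range.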

In DBSCAN-based outlier detection, to avoid bias from manual parameter setting, this study adopts the elbow point detection method proposed by Satopaa et al. []. The method identifies the optimal eps value by analyzing the k-nearest neighbor distance plot, which has been effectively applied in multiple outlier detection and clustering studies []. The specific procedure is as follows: First, compute the distance between each sample point and its k-th nearest neighbor (set to 5 in this study) to construct a k-distance graph. Then, use the elbow method (KneeLocator) to determine the optimal inflection point as the final eps value. After performing DBSCAN clustering with this eps and min_samples = 5, noise points are excluded, while samples from the main clusters are retained. The mean weight of these samples is calculated as the representative broiler weight value.
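The procedure can be sketched with scikit-learn as shown below. This is an illustration under stated assumptions, not the authors' code. In place of KneeLocator, the knee is approximated by the point of the normalized, sorted k-distance curve that lies farthest below the diagonal, which is a simplified stand-in for the Kneedle idea. All non-noise samples are retained, and the function names are hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors


def knee_eps(weights, k=5):
    """Estimate eps from the sorted k-th nearest neighbor distances."""
    X = np.asarray(weights, dtype=float).reshape(-1, 1)
    # kneighbors() without arguments excludes each point from its own neighbors
    dist, _ = NearestNeighbors(n_neighbors=k).fit(X).kneighbors()
    kdist = np.sort(dist[:, -1])
    span = kdist[-1] - kdist[0]
    if span == 0:
        return float(kdist[0])
    x = np.linspace(0.0, 1.0, len(kdist))
    y = (kdist - kdist[0]) / span
    # simplified knee: largest gap between the diagonal and the convex curve
    return float(kdist[int(np.argmax(x - y))])


def dbscan_mean(weights, k=5, min_samples=5):
    """Return (mean of non-noise weights, retained weights)."""
    X = np.asarray(weights, dtype=float).reshape(-1, 1)
    eps = max(knee_eps(weights, k), 1e-6)  # DBSCAN requires eps > 0
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
    kept = X[labels != -1, 0]  # label -1 marks noise points
    return float(kept.mean()), kept
```

Integer-valued weight readings with many duplicates can drive the k-distance to zero, which is why a small positive floor on eps is applied here; a production pipeline would likely need a more careful treatment.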

For outlier detection using the Isolation Forest algorithm, this study configured the number of estimators (n_estimators) as 100 and set the contamination rate to 0.03, indicating that approximately 3% of the total samples were expected to be outliers. A fixed random seed (random_state = 42) was established. After model training, each record was assigned an anomaly score and corresponding label, with samples labeled −1 identified as outliers and subsequently removed from further processing. Only normal samples labeled 1 were retained, and their average weight values within each time window (age in days) were calculated to serve as representative broiler weight measurements for the respective periods.
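With the stated settings, the filtering step can be sketched as follows. The hyperparameters come from the text; the single-feature input (weight only) and the function name `isolation_forest_mean` are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def isolation_forest_mean(weights):
    """Return (mean of retained weights, retained weights) for one age in days."""
    X = np.asarray(weights, dtype=float).reshape(-1, 1)
    model = IsolationForest(n_estimators=100, contamination=0.03, random_state=42)
    labels = model.fit_predict(X)  # 1 = normal, -1 = outlier
    kept = X[labels == 1, 0]
    return float(kept.mean()), kept
```

Note that a fixed contamination rate flags roughly 3% of the samples in every batch, whether or not that many genuine anomalies are present.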

In the One-Class SVM method, we first standardized the original weight data using StandardScaler (scikit-learn v1.1.2) to achieve zero-mean and unit-variance distribution, thereby eliminating the influence of feature scales on the model. We then employed the Radial Basis Function (RBF) as the kernel function, with parameter nu set to 0.03 to limit the proportion of anomalous samples to no more than 3%. The gamma parameter was configured as ‘auto’, allowing automatic estimation of the kernel width based on the number of features. After model training, samples identified as anomalies (labeled −1) were removed from the dataset, retaining only normal samples for subsequent statistical analysis of average broiler weights.
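A corresponding sketch for the One-Class SVM step is shown below, using the settings given in the text. The single-feature input and the function name `ocsvm_mean` are assumptions, and the filtering is applied to one batch of weights at a time.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM


def ocsvm_mean(weights):
    """Return (mean of retained weights, retained weights)."""
    X = np.asarray(weights, dtype=float).reshape(-1, 1)
    X_scaled = StandardScaler().fit_transform(X)  # zero mean, unit variance
    model = OneClassSVM(kernel="rbf", nu=0.03, gamma="auto")
    labels = model.fit_predict(X_scaled)  # 1 = normal, -1 = anomaly
    kept = X[labels == 1, 0]  # report the mean in the original units (g)
    return float(kept.mean()), kept
```

With a single feature, `gamma="auto"` evaluates to 1, so the kernel width is effectively set by the standardization step.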

In the Mahalanobis distance method, for each sample point, we calculate its Mahalanobis distance from the overall mean. Using the chi-square distribution critical value (with 2 degrees of freedom) at a 0.975 confidence level as the threshold, we determine whether the sample is an outlier. Samples exceeding this threshold distance are identified as anomalies and removed, retaining only normal samples for subsequent average weight calculations and time series construction.
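The thresholding can be sketched as follows. For 2 degrees of freedom the chi-square quantile has the closed form $-2\ln(1 - p)$, so no statistics library is needed. The text does not state which two features are used, so the two-column input here is a hypothetical example, and the critical value is compared against the squared Mahalanobis distance, which is the quantity that follows a chi-square distribution under normality.

```python
import math

import numpy as np

# Chi-square critical value, 2 degrees of freedom, p = 0.975 (about 7.378)
CHI2_2DOF_0975 = -2.0 * math.log(1.0 - 0.975)


def mahalanobis_filter(X):
    """Keep rows whose squared Mahalanobis distance is within the threshold.

    X: array of shape (n_samples, 2); the column meaning is an assumption.
    """
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.pinv(np.cov(X, rowvar=False))
    # squared distance d_i^2 = diff_i^T * inv_cov * diff_i for every row i
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return X[d2 <= CHI2_2DOF_0975]
```

Because the mean and covariance are estimated from data that still contain the outliers, extreme values inflate the covariance and can partially mask one another; robust covariance estimators are a common remedy, although the text does not indicate that one was used.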

The shipment weight prediction method follows the research framework established in reference []. The daily average weights, processed through different outlier detection methods, are constructed into time series data and used as input for the Prophet model. When the average weight first exceeds 1000 g at a certain age, it is considered to have entered the predictable phase. Therefore, this study uses the time series after the weight reaches 1000 g as the prediction interval, employing the Prophet model to forecast broiler shipment weights. The prediction results from various outlier treatment methods are obtained and comparatively analyzed.
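One way to read the 1000 g rule is as a split of the daily series into a fitting history and a forecast horizon. The sketch below shows only that split; fitting Prophet on the history is not shown. The function name `split_at_onset`, the `(age_day, mean_weight_g)` layout, and the `shipment_age` argument are assumptions made for illustration.

```python
def split_at_onset(daily_weights, shipment_age, threshold_g=1000.0):
    """Split a daily series at the first age whose mean weight exceeds the threshold.

    daily_weights: list of (age_day, mean_weight_g), sorted by age.
    Returns (onset_age, history, horizon_ages); onset_age is None when the
    threshold has not yet been exceeded.
    """
    onset = next((age for age, w in daily_weights if w > threshold_g), None)
    if onset is None:
        return None, list(daily_weights), []
    history = [(age, w) for age, w in daily_weights if age <= onset]
    horizon = list(range(onset + 1, shipment_age + 1))
    return onset, history, horizon


# Illustrative values only, not measurements from the study.
series = [(18, 850.0), (19, 930.0), (20, 1012.0)]
onset, history, horizon = split_at_onset(series, shipment_age=28)
```

The forecasting model would then be fitted on `history` and asked to predict the ages in `horizon`, the last of which corresponds to the shipment day.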

3.3. Performance Evaluation of the Algorithm

This study aims to achieve effective prediction of broiler shipment weights through the proposed algorithm. To validate the prediction performance, we conducted a comprehensive comparison of the predictive models generated by six representative processing methods, with particular focus on evaluating how different outlier detection strategies impact final prediction accuracy. Using five broiler farming datasets as test subjects, we compared the Prophet model’s output predictions against actual average shipment weight labels to calculate prediction error percentages. For evaluation metrics, we employed three standard measurements: Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Root Mean Squared Error (RMSE), providing a complete assessment of each method’s prediction precision and robustness.
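For reference, the three metrics can be computed as in the sketch below, where each pair consists of the actual average shipment weight of one dataset and the corresponding predicted value. The function names and the sample numbers are illustrative and are not results from this study.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error, in the unit of the inputs (g)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error, in the unit of the inputs (g)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical shipment weights (g) for two flocks.
actual = [1500.0, 1700.0]
predicted = [1450.0, 1750.0]
```

MAE and RMSE coincide when every dataset has the same absolute error; RMSE exceeds MAE as the errors become more uneven, which is why reporting both indicates how consistent a method is across datasets.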

4. Results

4.1. Experiments in Object Detection Evaluation

To comprehensively evaluate the performance of the trained YOLOv8n model on the broiler image dataset, this study adopted standard evaluation metrics and presents Precision–Recall (P-R) curves and F1-Confidence curves (Figure 6). The P-R curve visually demonstrates the model’s precision variations at different recall rates, reflecting its stability across various detection difficulty scenarios. Results show that the YOLOv8n model achieved a mean average precision (mAP@0.5) of 0.986 at an IoU threshold of 0.5, indicating exceptionally high detection accuracy. The F1 score–confidence curve illustrates the model’s balanced performance between precision and recall across different confidence thresholds. Experimental results demonstrate that when the confidence threshold is set to 0.606, the model achieves an F1 score of 0.96, representing the optimal region of the curve. This indicates that at this threshold, the detection boxes maintain both low false positive rates and high recall rates, making it suitable as the lower confidence bound for practical deployment.

Figure 6. Performance metrics of the YOLOv8n model during training.

To further validate the practical application effectiveness of the object detection model, this study conducted manual verification on the detection results from the previously sampled images, with statistical analysis performed on recognition accuracy across different age stages. The image recognition results were categorized into three classes: true positive, false positive, and human uncertainty. The image verification results are shown in Figure 7, with true positives represented by blue points, false positives by orange points, and human uncertainty by green points. The scatter plot was generated according to the following criteria: only images without broilers detected in the peripheral region and with at least one broiler identified in the central region were retained. For these qualified images, the total weight recorded by the load cell was divided by the number of broilers detected in the central region to calculate the average body weight per broiler, which was then used to create the age-versus-weight scatter plot. The verification results of the 5040 sample images (180 randomly selected per day) were analyzed across three age groups. During the 0–10-day period, the average recognition accuracy was 95.0% (true positive), with a 3.6% false positive rate and 1.4% human uncertainty cases. The 10–20-day period showed improved performance with 98.1% true positive accuracy, a 1.2% false positive rate, and 0.7% human uncertainty. The model demonstrated optimal stability in the 20–28-day period, achieving 99.7% true positive accuracy while maintaining merely a 0.1% false positive rate and a 0.2% human uncertainty rate. These results clearly indicate that the model’s counting performance improves significantly as broilers grow larger, particularly showing exceptional accuracy after 20 days of age, with near-perfect recognition capability during the final rearing phase.

Figure 7. Results of manual validation for object detection-based broiler counting on weighing platforms.

Furthermore, analysis of the weight distribution patterns among sampled categories reveals distinct differences in scatter plot trends. Correct detection samples (blue points) predominantly cluster along a stable growth curve, demonstrating a consistent upward trajectory with increasing age that accurately reflects normal broiler weight progression. This distribution pattern confirms the object detection model’s strong stability and capacity to characterize weight development trends. In contrast, incorrect detection samples (orange points) exhibit more dispersed distributions across all growth stages, frequently deviating from the primary growth curve. Notable anomalies include unrealistically high weight readings during early stages and abnormally low measurements in mid-to-late phases, indicating persistent misjudgment risks when processing edge cases or severely occluded images.

4.2. Evaluation Experiments of Edge/Center Region Strategy

To further enhance the accuracy of individual broiler weight estimation in images, this study introduced an edge/center region identification strategy and conducted a systematic evaluation on actual datasets. Figure 8 presents weight distribution characteristics under different region configurations and detection scenarios, where A, B, and C represent distinct edge/center region division strategies: A denotes the baseline configuration (10-pixel edge region width), B indicates inward contraction of the central region based on A (expanding edge region to 15-pixel width), and C represents outward extension of the edge region from A (also achieving 15-pixel edge width).

Figure 8. (a) Weight distribution of a single broiler on the load cell platform under the three edge/center region division strategies. Red scatter points represent the weight distribution when the broiler appears in the edge region, while green scatter points indicate the weight distribution when the broiler is located in the center region. (b) Two-broiler weight distributions: green points indicate both broilers in the center region; red points indicate at least one broiler in the edge region. (The letters A, B, and C in the title represent the experimental configurations of basic configuration, center contraction, and edge expansion, respectively).

Figure 8 displays the temporal distribution of detected individual broiler weights (unit: g), where differently colored points represent distinct regional classification conditions. Green points indicate images where broilers were detected exclusively in the central region with no presence in edge areas—these samples are considered relatively reliable representative data. Red points correspond to images where at least one broiler was detected in edge regions, while the total count (sum of broilers in both central and edge regions) matches the labeled classification (“1 broiler” or “2 broilers”). All points represent estimated “individual broiler weights” (total weight divided by broiler count), demonstrating weight fluctuations under different configurations and detection conditions. The following observations are evident across different configurations:

Green points exhibit more concentrated distributions and stable weight ranges, demonstrating superior representativeness and predictability.

Red points show greater dispersion and higher variability, indicating edge region detection is more vulnerable to occlusions, posture variations, or weighing platform edge effects, resulting in data deviations.

Increasing broiler counts (from 1 to 2) correlate with elevated distribution dispersion, where red points progressively diverge from green clusters, suggesting that concurrent multi-broiler presence with edge region occurrences may magnify individual weight estimation errors.

These verification results collectively support prioritizing central region identification for representative value selection, while confirming edge regions’ potential to introduce measurement noise that compromises subsequent weight prediction accuracy.

Figure 9a demonstrates the system’s performance under standard detection conditions, where it accurately matches weight values with broiler counts while effectively filtering interference values (falling between single- and double-broiler weights) caused by broilers appearing in edge regions. Figure 9b illustrates the impact of motion-blurred images resulting from wing flapping on detection accuracy. The system maintains effective detection when individual broilers appear blurred but exhibits detection errors when two adjacent broilers flap their wings simultaneously.

Figure 9. (a) Typical object detection image with clear broiler visibility; (b) image capture artifact caused by broiler wing movement (motion blur).

Theoretically, dynamic adjustment of margin regions aligns better with practical scenarios compared to using fixed-edge filtering pixels. However, we must consider the distribution patterns of broiler weight data. In early growth stages, broilers have lighter weights, causing the weight distributions for one, two, and three broilers to cluster closely together, resulting in blurred and difficult-to-distinguish boundaries. As broilers grow, the weight distributions between one and two broilers gradually separate, becoming easily distinguishable through density estimation. Consequently, stricter filtering is required in early stages to accentuate the boundaries between weight distributions of one versus two broilers, while more lenient filtering can be applied in later stages. From a morphological perspective, the edge filtering region should increase proportionally with broiler body size. From a data distribution standpoint, this region should decrease as broilers grow. For practical deployment, we adopted a fixed edge filtering size based on mid-growth broiler dimensions. This approach enables stricter filtering of early-stage data while progressively relaxing data filtration over time. Notably, for broilers exceeding 1000 g in average weight (prediction phase), this filtering consideration was intentionally excluded.

4.3. Experiments in the Shipping Weight Prediction Step

Based on Configuration A of the edge/center region discrimination strategy, we detected broiler counts on the weighing platform and categorized corresponding weight values. As illustrated in Figure 10, key observations emerge: during early rearing (days 4–14), images with single broilers on the platform were scarce. Conversely, images showing 2–4 broilers simultaneously occurred more frequently, resulting in most early weight recordings being derived from multi-broiler combinations. This pattern indicates that smaller broiler size and denser activity in early stages promote concurrent platform occupancy, limiting acquisition of adequate single-broiler images for representative weight estimation. As broilers aged, single-occupancy rates increased, with weight distributions progressively converging toward physiologically plausible ranges.

Figure 10. (a) Distribution of raw weight data; (b) weight distribution categorized by detected broiler counts; (c) distribution of individual broiler weights obtained by dividing total weight by broiler count.

Figure 11a shows the weight distribution of one to four broilers on the weighing platform. Figure 11b presents the standard deviation of average weights for different broiler counts. By applying object detection algorithms to identify the number of broilers on the weighing platform, we obtained reasonable and stable broiler weight distributions. In Figure 11a, weight data for single broilers aged 6–9 days are relatively scarce, showing slightly higher average weights compared to adjacent age groups. However, in Figure 11b, when considering the average weights of multiple broilers aged 6–9 days, they demonstrate a perfect growth trend when combined with adjacent age groups, compensating for the insufficiency of single-broiler weight data. Furthermore, after day 12, a maximum of three broilers were present simultaneously on the weighing platform, which was reduced to a maximum of two broilers after day 18. Subsequently, occurrences of two broilers appearing together gradually decreased. This reduction occurs because the increasing body size of growing broilers progressively limits the platform’s capacity. This phenomenon may lead to decreased diversity in weight data samples during later growth stages, resulting in gradually increasing standard deviations of average weights across different broiler counts.

Figure 11. (a) The weight distribution when 1–4 broilers are present on the load cell platform is shown, with the black horizontal line indicating the mean weight value. (b) Growth trend of average broiler weight (total weight divided by number of broilers). Subplot titles indicate the standard deviation (STD) of the mean weight at different broiler counts. (c) Population mean weight (weight/broiler count): red lines = averages; green lines = KDE peaks (methods in []).

Figure 11c displays the weight distribution normalized by broiler count, presenting the population growth trend. As age increases, the weight distribution shows gradually increasing dispersion. Compared to the weight distribution of individual broilers, the average weight values calculated by integrating different broiler counts demonstrate a much more stable growth pattern. The density peaks obtained through KDE indicate that these weight values occur at higher frequencies within the population, forming effective density distributions that are representative of typical weights. The presence of multiple density peaks reflects the individual variations among broilers in the population. As growth and development progress, the size differences between broilers become increasingly apparent.

After obtaining the individual broiler weight distribution, this study employed six methods to extract daily representative weight values. Using day 7 as an example (shown in Figure 12), the Raw Mean approach (without any outlier removal) contained numerous upper-bound outliers, yielding an average weight of 170 g, with fewer but still present lower-bound anomalies. The IQR + Z-score method demonstrated the most aggressive outlier elimination, substantially removing extreme values while showing signs of over-cleaning (average weight: 168 g). Isolation Forest and Mahalanobis distance methods exhibited more balanced boundary sample retention, being slightly more conservative than IQR + Z-score (average weights: 168 g and 167 g, respectively). One-Class SVM performed comparably but retained more marginal samples (average weight: 169 g).

Figure 12. Comparative effectiveness of different outlier detection methods at 7 days of age. Green points represent valid weight values, gray points indicate outliers, and the blue dashed line denotes the mean value after outlier removal. In the DBSCAN method, green points correspond to valid weight values, while all others are classified as outliers.

DBSCAN displayed optimal cleaning results visually. Its density-based clustering effectively identified and flagged isolated values while preserving the primary distribution area, ultimately producing an average weight (170 g) identical to the raw data. These comparative results reveal significant differences in outlier detection sensitivity among methods, with DBSCAN demonstrating particularly strong robustness and representative value estimation capability for this dataset.

Figure 13 presents the daily representative broiler weights extracted by six different methods. It can be observed that although these methods vary in outlier detection intensity, the resulting representative weight data demonstrate consistent overall trends, with only minor differences of a few grams in average weights across age groups. These results not only validate the stability of each method in extracting representative values but also indirectly confirm the effectiveness of the preliminary object detection method for identifying broiler counts on the weighing platform.

Figure 13. Representative daily weights by age for dataset 2023_1117_KF0081_01-Img_Data.

Figure 14 demonstrates the application effects of two previously studied broiler weight representation extraction methods—K-means clustering and KDE—across different age stages. The results reveal that during days 4–14, the scarcity of single-broiler occurrences on the weighing platform makes it difficult to effectively extract density-based aggregation regions from individual weight distributions. Consequently, both methods frequently fail to identify clear representative values during this period. For instance, the K-means approach could not form valid clusters from days 11–14, forcing researchers to estimate missing age values through growth curve interpolation. This limitation highlights the instability of relying solely on weight-value density for representative extraction during early rearing phases. In contrast, our proposed strategy integrates object detection results with edge/center region identification, enabling relatively stable value estimation even when early-stage single-broiler weight data are scarce.

Figure 14. Daily representative broiler weights obtained by density-based methods (K-means and KDE) in previous studies versus the proposed method in this research.

Table 4 presents detailed prediction results comparing the K-means and KDE methods from previous studies with the methods proposed in this research across five datasets. On the 2023_1117_KF0081_01-Img_Data dataset, both K-means (1.5% error) and KDE (3.53% error) performed well. However, on the subsequent datasets from 2024, our proposed methods yielded substantially lower prediction errors than K-means and KDE.

Table 4. Predictor experiment results.

Notably, in the 2024_0105 and 2024_0502_KF0081_02-Img_Data datasets, the IQR + Z-score outlier treatment method showed increased error rates compared to the untreated Raw Mean approach. This suggests that the IQR + Z-score method may be overly aggressive in filtering outliers, potentially eliminating valid but unusual broiler weight measurements that should be retained. The improved prediction accuracy observed with appropriate outlier treatment demonstrates that maintaining an optimal balance between filtering true anomalies and preserving naturally occurring weight variations can enhance overall shipping weight prediction performance.

Table 5 presents the experimental results. Evaluated using three metrics (MAE, MAPE, and RMSE), the K-means and KDE methods exhibit significantly higher errors (MAE: 94.94 g and 117.36 g, respectively), highlighting their sensitivity to early-stage data scarcity and the presence of outliers. In contrast, the Raw Mean and IQR + Z-score approaches show substantially reduced errors, suggesting improved robustness. The four advanced methods (DBSCAN, Isolation Forest, One-Class SVM, and Mahalanobis distance) demonstrate stable performance across all evaluation metrics, with the Mahalanobis distance method achieving the best overall results: MAE of 40.62 g, MAPE of 2.35%, and RMSE of 47.26 g. The proposed Mahalanobis method achieves a MAPE 2.87% lower than that of K-means, while DBSCAN, Isolation Forest, and Mahalanobis all maintain MAE and RMSE below 50 g.

Table 5. Performance comparison of outlier detection methods.

5. Discussion

The proposed method causes no disruption to farmers’ existing rearing environments, as its implementation requires no modifications to current farming workflows regarding image capture or load cell equipment configuration, demonstrating excellent deployability. The integration of image recognition with weight sensor data significantly enhances the reliability of representative weight values, maintaining stable judgment criteria even when individual broiler weight data are sparse or contain interference factors. Unlike traditional methods limited to shipment weight prediction, our approach generates real-time representative average weight values for broiler populations throughout the entire rearing cycle, providing farm managers with continuous weight monitoring capabilities to support more precise feeding control and anomaly detection.

Although the Mahalanobis method achieved the best prediction accuracy, the DBSCAN approach demonstrates superior robustness and interpretability in outlier identification and boundary value processing from the perspective of growth monitoring throughout the rearing process. Therefore, when both practical application scenarios and precision are considered, the DBSCAN method holds greater value for practical implementation and wider adoption. Although DBSCAN is classified as a density-based clustering algorithm, it identifies anomalies through neighborhood rules (direct density-reachability and density-reachability) rather than through probability models or distance from a global center. This rule-based characteristic makes it suitable for our situation. During the breeding process, weight records of abnormally large or small broilers are preserved. When a normally sized broiler and a small-sized broiler appear on the weight sensor simultaneously, their combined weight typically falls within the 1.5 × broiler weight distribution interval and is labeled as two broilers. Other methods would filter these values as outliers, whereas DBSCAN retains them if they form meaningful density clusters. At the edges of the data distribution, outliers are usually transient and non-repeating, cannot form stable densities, and are therefore automatically excluded. Persistent abnormal weights (e.g., abnormally large or small broilers) that establish density clusters are preserved as valid data points. Our goal is to calculate the overall average weight, and these abnormally sized broilers, as commercial products, should have their weights retained as valid values.

Meanwhile, this study still presents aspects worthy of further exploration and optimization. First, the timing of prediction initiation warrants discussion. The current study defaults to initiating predictions when the average weight reaches 1000 g, based on industry standards where the target shipment weight typically equals 1500 g. However, some actual shipment weights in our dataset exceeded 1700 g, substantially extending the prediction window and consequently increasing error accumulation. Given that weight fluctuations near shipment contribute more significantly to final weights, future studies could dynamically adjust prediction starting points according to contract-specific target weights to enhance model applicability and accuracy. Second, the edge region configuration in object detection requires refinement. While edge regions currently serve to exclude unreliable detection data when counting individual broilers, their fixed-width setting disregards actual broiler size progression. Specifically, early-stage broilers’ smaller sizes may lead to over-exclusion with fixed edges, whereas later stages may require broader edges to effectively filter edge misdetections. Two potential dynamic edge strategies merit investigation: (1) implementing nonlinearly increasing functions (e.g., parameterized sigmoid) to automatically adjust edge width according to growth patterns or (2) setting edge proportions based on detected bounding box areas for adaptive adjustment. Validating and implementing these optimizations will require extensive empirical work, offering fruitful directions for future research.
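The two candidate strategies can be illustrated with the sketch below. It is purely hypothetical: neither function was implemented or validated in this study, and every parameter value (minimum and maximum width, midpoint age, steepness, and box ratio) is a placeholder that would need empirical tuning.

```python
import math


def edge_width_from_age(age_days, w_min=5.0, w_max=15.0, midpoint=14.0, steepness=0.3):
    """Strategy (1): edge width (px) grows with age along a parameterized sigmoid."""
    return w_min + (w_max - w_min) / (1.0 + math.exp(-steepness * (age_days - midpoint)))


def edge_width_from_boxes(boxes, ratio=0.1):
    """Strategy (2): edge width (px) as a fraction of the typical box side length.

    boxes: list of (x_min, y_min, x_max, y_max) detection boxes in one frame;
    the side length is taken as the square root of the mean box area.
    """
    if not boxes:
        raise ValueError("at least one bounding box is required")
    areas = [(x1 - x0) * (y1 - y0) for x0, y0, x1, y1 in boxes]
    return ratio * math.sqrt(sum(areas) / len(areas))
```

The age-based form needs no per-frame computation but assumes a typical growth curve, whereas the box-based form adapts to the flock actually observed at the cost of depending on detection quality.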

6. Conclusions

This study proposes a method for estimating daily representative broiler weights by integrating image-based object detection with load cell data. The approach effectively addresses the instability of density-based methods such as K-means and KDE during the early rearing stages, where single-broiler weight data are often sparse. By incorporating object detection to identify the number of broilers on the weighing platform and applying multiple outlier detection algorithms (IQR + Z-score, DBSCAN, Isolation Forest, One-Class SVM, and Mahalanobis distance), the proposed method significantly improves the stability and accuracy of representative weight extraction throughout the production cycle. Experimental results indicate that all six methods produced consistent overall weight trends, with only minor differences in representative values. Among them, the Mahalanobis method achieved the best performance in terms of MAE, MAPE, and RMSE. However, during the mid-rearing period, the DBSCAN method provided a more accurate and robust representation of the group’s average weight, making it more suitable for real-time monitoring applications. The methods based on DBSCAN, Isolation Forest, and Mahalanobis distance all achieved MAE and RMSE values below 50 g, which fall within the acceptable error range for the average shipment weight specified in standard broiler farming contracts.

For future improvement, we identify two promising directions: (1) setting the prediction onset dynamically according to the target shipment weight, so that errors do not accumulate over an extended prediction window, and (2) adapting the edge exclusion strategy in detection to broiler growth patterns. These enhancements would further strengthen the model’s applicability in diverse real-world farming conditions.
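As a rough illustration of the first direction, the sketch below derives a prediction onset from a contract-specific target weight. Only the reference pair (onset at 1000 g for a 1500 g target) comes from this study; the two scaling rules (`"ratio"` and `"offset"`), the function names, and the sample series are hypothetical, and which rule works better is an open empirical question.

```python
def prediction_onset_weight(target_g, ref_target_g=1500.0, ref_onset_g=1000.0, mode="ratio"):
    """Return the average weight (g) at which forecasting would start.

    "ratio" keeps the onset at the same fraction of the target as the
    reference setting; "offset" keeps the same absolute gap to the target.
    Both rules are assumptions for illustration.
    """
    if mode == "ratio":
        return target_g * ref_onset_g / ref_target_g
    if mode == "offset":
        return target_g - (ref_target_g - ref_onset_g)
    raise ValueError(f"unknown mode: {mode!r}")


def first_prediction_day(daily_weights, onset_g):
    """Index of the first day whose representative weight reaches onset_g, else None."""
    for day, weight in enumerate(daily_weights):
        if weight >= onset_g:
            return day
    return None
```

For a hypothetical 1800 g contract, the ratio rule starts forecasting at 1200 g and the offset rule at 1300 g; either choice shortens the prediction window relative to a fixed 1000 g onset.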

Author Contributions

The author L.Y. collected and organized the data, developed the idea of solving the problem addressed in this paper, and analyzed the data. He also designed the methodology and wrote the first draft of this paper. The author J.S. conceptualized this study and designed the methodology. He also managed the project and reviewed and revised the first draft of this paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Korea Institute of Planning and Evaluation for Technology in Food, Agriculture and Forestry (IPET) and the Korea Smart Farm R&D Foundation (KosFarm) through the Smart Farm Innovation Technology Development Program, funded by the Ministry of Agriculture, Food and Rural Affairs (MAFRA), the Ministry of Science and ICT (MSIT), and the Rural Development Administration (RDA), grant number RS-2025-02216818.

Institutional Review Board Statement

Not applicable.

Data Availability Statement

The dataset generated and analyzed in this paper cannot be publicly released due to the proprietary rights of Emotion Co., Ltd., Jeonju, Jeollabuk-do, Republic of Korea. However, it is available from the corresponding author upon reasonable request.

Acknowledgments

During the preparation of this manuscript, the authors used ChatGPT (free version) and DeepSeek (free version) for the purposes of translation and used Google Translate for confirmation. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Smart poultry farming facility.

Figure 2. (a) IoT-enabled weighing sensor devices; (b) synchronized device-captured footage. Red arrows denote the camera field-of-view directions.

Figure 3. (a) Composition of the object detection dataset; (b) number of images in training, validation, and test sets.

Figure 4. Algorithm design and implementation.

Figure 5. (a) Experiment A (basic configuration); (b) Experiment B (center contraction); (c) Experiment C (edge expansion).

Figure 6. Performance metrics of the YOLOv8n model during training.

Figure 7. Results of manual validation for object detection-based broiler counting on weighing platforms.

Figure 8. (a) Weight distribution of a single broiler on the load cell platform under the three edge/center region division strategies. Red points show the weights recorded when the broiler is in the edge region; green points show those recorded when it is in the center region. (b) Weight distribution for two broilers: green = both in the center region; red = at least one in the edge region. (The letters A, B, and C in the titles denote the basic configuration, center contraction, and edge expansion, respectively.)

Figure 9. (a) Typical object detection image with clear broiler visibility; (b) image capture artifact caused by broiler wing movement (motion blur).

Figure 10. (a) Distribution of raw weight data; (b) weight distribution categorized by detected broiler counts; (c) distribution of individual broiler weights obtained by dividing total weight by broiler count.

Figure 11. (a) Weight distribution when 1–4 broilers are present on the load cell platform; the black horizontal line indicates the mean weight. (b) Growth trend of the average broiler weight (total weight divided by number of broilers). Subplot titles indicate the standard deviation (STD) of the mean weight at different broiler counts. (c) Population mean weight (weight/broiler count): red lines = averages; green lines = KDE peaks (method names in square brackets).

Figure 12. Comparative effectiveness of different outlier detection methods at 7 days of age. Green points represent valid weight values, gray points indicate outliers, and the blue dashed line denotes the mean value after outlier removal. In the DBSCAN method, green points correspond to valid weight values, while all others are classified as outliers.

Figure 13. Representative daily weights by age for dataset 2023_1117_KF0081_01-Img_Data.

Figure 14. Daily representative broiler weights obtained by density-based methods (K-means and KDE) in previous studies versus the proposed method in this research.

Table 1. List of the datasets used.

File Name	Start Date	Delivery Date	Avg. Weight (g)	Images
2023_1117_KF0081_01-Img_Data	17 November 2023 9:13	21 December 2023 7:59	1703.5	2,191,039
2024_0105_KF0081_01-Img_Data	5 January 2024 11:30	8 February 2024 3:59	1810	2,745,771
2024_0502_KF0081_01-Img_Data	2 May 2024 10:09	6 June 2024 23:45	1910	2,589,510
2024_0502_KF0081_02-Img_Data	2 May 2024 10:09	6 June 2024 23:45	1910	2,461,297
2024_1226_KF0081_01-Img_Data	26 December 2024 11:04	24 January 2025 19:59	1366	2,030,085

Table 2. Image count by age group.

Age (Days)	0	4	8	12	16	20	24	28	32
train	70	70	70	70	70	70	70	70	70
val	10	10	10	10	10	10	10	10	10
test	10	10	10	10	10	10	10	10	10

Table 3. Experimental environment.

Equipment	Model	Manufacturer (Location)
Processor (CPU)	Intel Xeon Gold 5218R	Intel Corporation (Santa Clara, CA, USA)
Processor (GPU)	NVIDIA A100 PCIe 80 GB	NVIDIA Corporation (Santa Clara, CA, USA)
RAM	256 GB	Samsung Electronics (Suwon, Republic of Korea)
SSD	6 TB	Samsung Electronics (Suwon, Republic of Korea)
OS	Ubuntu 20.04 LTS	Canonical Ltd. (London, UK)

Table 4. Predictor experiment results.

Data	Value	K-Means	KDE	Raw Mean	IQR + Z-Score	DBSCAN	Isolation Forest	One-Class SVM	Mahalanobis	Avg. Weight (g)
2023_1117_KF0081_01-Img_Data	Predicted (g)	1677.9	1643.3	1589.2	1598.4	1623.5	1616.5	1600.6	1615.6	1703.5
2023_1117_KF0081_01-Img_Data	Error (%)	1.50	3.53	6.71	6.17	4.70	5.11	6.04	5.16	1703.5
2024_0105_KF0081_01-Img_Data	Predicted (g)	1755.4	1692.8	1773.9	1768.0	1774.7	1782.1	1776.0	1776.9	1810
2024_0105_KF0081_01-Img_Data	Error (%)	3.02	6.48	1.99	2.32	1.95	1.54	1.88	1.83	1810
2024_0502_KF0081_01-Img_Data	Predicted (g)	1574.3	1609.2	1856.0	1859.3	1867.5	1867.2	1867.6	1874.9	1910
2024_0502_KF0081_01-Img_Data	Error (%)	17.58	15.75	2.83	2.65	2.23	2.24	2.22	1.84	1910
2024_0502_KF0081_02-Img_Data	Predicted (g)	1902.5	1929.3	1931.2	1938.4	1945.2	1944.4	1933.2	1935.7	1910
2024_0502_KF0081_02-Img_Data	Error (%)	0.39	1.01	1.11	1.49	1.84	1.80	1.21	1.35	1910
2024_1226_KF0081_01-Img_Data	Predicted (g)	1314.7	1276.7	1339.0	1342.2	1339.2	1341.9	1335.8	1344.7	1366
2024_1226_KF0081_01-Img_Data	Error (%)	3.76	6.54	1.98	1.74	1.96	1.76	2.21	1.56	1366

Table 5. Performance comparison of outlier detection methods.

Method	MAE (g)	MAPE (%)	RMSE (g)	STD (g)	MEAN (g)
K-Means	94.94	5.25%	154.28	136.0	94.9
KDE	117.36	6.66%	152.44	118.4	109.6
Raw Mean	50.52	2.92%	60.77	49.1	42.0
IQR + Z-score	50.00	2.87%	57.89	48.2	38.6
DBSCAN	43.96	2.54%	47.77	41.7	29.9
Isolation Forest	43.24	2.49%	48.87	43.6	29.5
One-Class SVM	46.54	2.71%	54.76	44.9	37.3
Mahalanobis	40.62	2.35%	47.26	40.5	30.3
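As a worked check on how Table 5 relates to Table 4, the following Python sketch recomputes the error metrics for the Mahalanobis method from the five predicted and actual shipment weights in Table 4. The function name `error_metrics` is introduced here for illustration. The MEAN and STD columns are not defined in the table itself; they appear to correspond to the mean and the sample standard deviation of the signed error (actual minus predicted), which is the interpretation assumed in the code.

```python
import math

# Actual average shipment weights (g) and Mahalanobis-based predictions (g)
# for the five datasets, copied from Table 4.
actual = [1703.5, 1810.0, 1910.0, 1910.0, 1366.0]
predicted = [1615.6, 1776.9, 1874.9, 1935.7, 1344.7]


def error_metrics(actual, predicted):
    """Compute MAE, MAPE, RMSE, and the mean and sample STD of the signed error."""
    errors = [a - p for a, p in zip(actual, predicted)]  # signed error (g)
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e) / a for e, a in zip(errors, actual)) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    bias = sum(errors) / n
    # Sample standard deviation (n - 1 in the denominator); an assumption.
    std = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    return {"MAE": mae, "MAPE": mape, "RMSE": rmse, "MEAN": bias, "STD": std}


metrics = error_metrics(actual, predicted)
print({name: round(value, 2) for name, value in metrics.items()})
```

Under these assumptions, the computed values agree with the Mahalanobis row of Table 5 to within rounding; the same function can be applied to any other method's column in Table 4.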

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
