Peer-Review Record

Machine Learning Performance Analysis for Bagging System Improvement: Key Factors, Model Optimization, and Loss Reduction in the Fertilizer Industry

AgriEngineering 2025, 7(6), 187; https://doi.org/10.3390/agriengineering7060187
by Ari Primantara 1,*, Udisubakti Ciptomulyono 2 and Berlian Al Kindhi 3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 14 April 2025 / Revised: 28 May 2025 / Accepted: 4 June 2025 / Published: 11 June 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper, "Machine Learning Performance Analysis for Bagging System Improvement in the Fertilizer Industry," investigates the use of four machine learning models to predict and minimize fertilizer bagging weight inconsistencies at PT Petrokimia Gresik. The study highlights random forest regression as the best-performing model, achieving significant improvements in predictive accuracy, and proposes an IoT-integrated Smart Bagging System. This is interesting research that could be applied in other research areas. In general, this is a very well-written paper, but there are still some major issues that need to be addressed before it can be considered for publication.

The first concern is the very small dataset for machine learning: there is a risk of overfitting, which limits the generalisability of the results. Is it possible to enlarge the dataset, for example by generating additional samples?

Only the random forest's built-in feature importance is used. More robust methods such as SHAP values or permutation importance should be applied. Please make a comparison, or at least describe what the differences would be.

A discussion of operational risks, system maintenance variability, and downtime costs during implementation should be added, since the cost-benefit analysis assumes simple straight-line depreciation.

Could this model be easily applied to other datasets?

The paper occasionally repeats itself, particularly when describing the methodology and background. Some sections could be more concise.

There are also a few language mistakes in the paper; I therefore recommend proofreading by a native speaker.

Author Response

Comment 1: The first concern is the very small dataset for machine learning: there is a risk of overfitting, which limits the generalisability of the results. Is it possible to enlarge the dataset, for example by generating additional samples?

Response 1:

To address concerns about dataset size and overfitting, we expanded the dataset from 100 to 1,000 real-time field samples, collected directly during active production using systematic sampling. No synthetic data were used. This expansion enhances the model’s generalizability, ensuring that the results accurately reflect actual operational conditions.
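For illustration, the systematic sampling scheme described in this response (selecting every k-th record from the production stream) could be sketched as follows; the record stream and counts are placeholders, not the study's data:

```python
# Illustrative sketch (not the authors' code): systematic sampling selects
# every k-th record from a logged production stream.
def systematic_sample(records, n_samples):
    """Pick n_samples records at a fixed interval k = len(records) // n_samples."""
    k = max(1, len(records) // n_samples)
    return records[::k][:n_samples]

stream = list(range(10_000))          # stand-in for logged bagging records
sample = systematic_sample(stream, 1000)
print(len(sample), sample[:3])        # → 1000 [0, 10, 20]
```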

Comment 2: Only the random forest's built-in feature importance is used. More robust methods such as SHAP values or permutation importance should be applied. Please make a comparison, or at least describe what the differences would be.

Response 2:

Thank you for the suggestion. A comparison between built-in feature importance, SHAP values, and permutation importance has been added in lines 463–477. This section discusses the differences in interpretation and robustness among the methods. While the built-in feature importance provides a quick overview, SHAP and permutation importance offer deeper insights into feature contributions and interactions, enhancing the model’s interpretability.
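As a rough illustration of the comparison described in this response, the snippet below contrasts a random forest's built-in (impurity-based) importance with scikit-learn's permutation importance on synthetic data; it is not the paper's code or dataset, and SHAP (which requires the separate `shap` package) is omitted here:

```python
# Synthetic-data sketch: built-in impurity importance vs. permutation importance.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

X, y = make_regression(n_samples=300, n_features=5, n_informative=3,
                       random_state=0)
rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

builtin = rf.feature_importances_                    # impurity-based, sums to 1
perm = permutation_importance(rf, X, y, n_repeats=10,
                              random_state=0).importances_mean

for i, (b, p) in enumerate(zip(builtin, perm)):
    print(f"feature {i}: built-in={b:.3f}  permutation={p:.3f}")
```

Permutation importance measures the drop in score when a feature's values are shuffled, so it is less biased toward high-cardinality features than the impurity-based measure.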

Comment 3: A discussion of operational risks, system maintenance variability, and downtime costs during implementation should be added, since the cost-benefit analysis assumes simple straight-line depreciation.

Response 3:

We have added a paragraph at the end of Section 3.8 (lines 699–708) to acknowledge potential operational risks such as downtime and maintenance variability. While the current analysis assumes straight-line depreciation, we now highlight this limitation and suggest that future studies consider more robust approaches such as sensitivity analysis and life cycle cost analysis.

Comment 4: Could this model be easily applied to other datasets?

Response 4:

Yes, this has been addressed in subsection 3.9 “Model Generalization and Transferability” (lines 709–740), where we discuss the model's potential to be applied to other datasets. While it can be transferred with proper adjustments, its effectiveness depends on the similarity of data characteristics and context.

Comment 5: The paper occasionally repeats itself, particularly when describing the methodology and background. Some sections could be more concise.

Response 5: 

We have revised the manuscript to remove redundant content and improve clarity. The sensor placement and sensor descriptions were streamlined, and the sampling techniques section was shortened to avoid repetition while keeping the key information intact.

Comment 6: There are also a few language mistakes in the paper; I therefore recommend proofreading by a native speaker.

Response 6:

Thank you for the suggestion. We have carefully revised the manuscript and addressed several language issues with the assistance of Grammarly. We believe the overall clarity and grammar have now improved significantly; however, we remain open to further proofreading if needed.

Author Response File: Author Response.docx

Reviewer 2 Report

Comments and Suggestions for Authors

The manuscript presents a study on enhancing the precision of a fertilizer bagging system using machine learning (ML) models integrated within an IoT-based Smart Bagging System (SBS). The authors collected 100 real-time data samples from a urea bagging machine, capturing sensor-based parameters such as clamping time, humidity, and air pressure, and evaluated the predictive performance of four ML algorithms—Artificial Neural Network (ANN), Random Forest Regression (RFR), Linear Regression (LR), and Support Vector Regression (SVR)—to forecast bag weight. Among these, RFR achieved the highest accuracy (R² = 0.9638), and the SBS enabled real-time feedback control, significantly reducing underweight occurrences by 95%, thus aligning with sustainable development goals (SDGs) related to industry and resource efficiency. The study claims a potential economic saving of approximately 30.6 billion IDR annually for the case company. Here are my concerns:

  1. The Introduction briefly states that four algorithms (ANN, RFR, LR, SVR) “were selected for their complementary characteristics…” (lines 119–131) but does not clearly justify why these methods are optimal for this specific bagging-system problem. What unique features of fertilizer bagging (e.g. nonlinearity, sensor noise) make Random Forest or SVR preferable over other regression or ensemble methods?
    Suggestion: Add a paragraph explicitly mapping each algorithm’s strengths (e.g. RFR’s robustness to outliers, ANN’s capacity for nonlinear interactions) to the physical and statistical characteristics of the bagging data.
  2. The manuscript lacks a clear statement of the gap in the existing fertilizer-bagging literature. The brief review (lines 93–102) focuses on robotics in food packaging (e.g. [16–19]) rather than data-driven prediction in fertilizers.
    Suggestion: Articulate the specific absence of machine-learning studies in fertilizer bagging.

  3. There is no explicit hypothesis (e.g. “We hypothesize that RFR will outperform LR due to…”) or theoretical model linking sensor inputs to weight deviations.
    Suggestion: Introduce a simple conceptual diagram or equation relating sensor variables (humidity, clamping time, etc.) to weight error, setting up the rationale for ML application.

  4. Currently, the Introduction runs continuously from contextual background through literature review into methodological overview without clear breaks.
    Suggestion:

    • Motivation: Briefly outline the operational and economic impact (e.g. 37% nonconformance, 30.6 billion IDR loss), and the need for predictive control. In “Motivation,” explicitly state the gap: “While Industry 4.0 emphasizes ML in manufacturing, fertilizer bagging remains under-studied.” End the “Motivation” subsection by pinpointing this precise niche, priming the reader for your novel contributions.

    • Contributions: Numbered bullets stating: (1) comparative ML evaluation, (2) IoT-driven Smart Bagging System design, (3) economic impact analysis, (4) alignment with SDG 9 & 12.

  5. The description (Section 2.2) says 100 samples from a single shift on 24 Sept 2024 at 30 s intervals, but does not justify whether this is statistically sufficient for ML training, nor mention train/test split proportions beyond cross-validation.
    Suggestion: Detail how the 100 samples were partitioned (e.g. 80/20 train/test) and provide rationale for sample size relative to model complexity.
  6. No sensitivity or ablation analysis is presented. For example, how does performance degrade if you omit “clamping time” or “air pressure”?
    Suggestion: Include either (a) a one-factor-at-a-time sensitivity plot, or (b) an ablation table showing R² drops when key features are removed.

  7. All performance claims rely on point estimates (R², MAE, RMSE). There is no confidence interval or significance test (e.g. Wilcoxon signed-rank) to show that RFR’s 0.9638 R² is significantly better than ANN’s 0.9277.
    Suggestion: Perform a paired statistical test across CV folds (Wilcoxon signed-rank) and report p-values or 95% CIs for each metric.

  8. The paper does not report training or inference times, which are critical for real-time IoT implementation.
    Suggestion: Add a small section (or table) comparing per-bag prediction latency for each model on your hardware.

  9. There is no direct comparison to any existing predictive model on a similar bagging problem.
    Suggestion: Identify at least one benchmark from manufacturing (e.g. [10,23–27]) and, if possible, run it on your dataset or discuss why such methods aren’t directly comparable.

  10. The title abbreviates “Random Forest Regression (RFR)” but never uses “BS” for Bagging System elsewhere.
    Suggestion: Spell out all acronyms at first use (e.g. “Bagging System (BS)”), or eliminate unused acronyms in the title.

  11. The Conclusion (Section 5) is overly repetitive of Results and future work, and its “95% reduction” claim is presented without caveat.

    Suggestion:

    • Summarize only key insights (best model, most influential features, economic impact).

    • Explicitly discuss limitations (small sample size, controlled vs. field conditions) and managerial applications (e.g. how plant managers could integrate the Smart Bagging System into existing SCADA dashboards).

  12. No explicit limitations are discussed.
    Suggestion: Add a “Limitations” paragraph: data representativeness, sensor drift, need for longer-term field validation.

  13. The Abstract is wordy and reads like a mini-introduction; it should focus on problem statement, method, key results, and main takeaway.
    Suggestion: Trim to 200–250 words. Remove background details (e.g. SDG alignment can be one sentence).
  14. Several paragraphs repeat the same points (e.g. future work restated twice).
    Suggestion: Consolidate repeated ideas—especially in Sections 3.4 and 4.

  15. Table numbering is inconsistent (Table 1 appears twice)—the model configuration table is “Table 1,” then the performance results are also “Table 1.”
    Suggestion: Renumber tables sequentially and ensure each has a clear, descriptive caption.

    Additional Critical Comments
  16. With only 100 samples from a single shift, models (especially ANN) risk overfitting. Consider expanding data collection across multiple shifts and seasons to capture variability.

  17. The Smart Bagging System is never piloted in the field. A small validation run demonstrating live weight corrections would greatly strengthen the claims.

  18. Since SDG 12 and resource efficiency are cited, briefly discuss any environmental impacts of additional sensors or data-center energy use.

I hope these comments help the authors substantially strengthen their manuscript. I look forward to a thorough revision that addresses these points in depth.

Comments on the Quality of English Language

There are occasional run-on sentences (e.g. p. 3, “Furthermore, machine learning offers significant advantages…”) and minor typos (“an Auto-Readjusting System” spelled inconsistently).
Suggestion: Engage a professional technical editor or use a tool like Grammarly for a final polish.

Author Response

Comment 1: The Introduction briefly states that four algorithms (ANN, RFR, LR, SVR) “were selected for their complementary characteristics…” (lines 119–131) but does not clearly justify why these methods are optimal for this specific bagging-system problem. What unique features of fertilizer bagging (e.g. nonlinearity, sensor noise) make Random Forest or SVR preferable over other regression or ensemble methods?
Suggestion: Add a paragraph explicitly mapping each algorithm’s strengths (e.g. RFR’s robustness to outliers, ANN’s capacity for nonlinear interactions) to the physical and statistical characteristics of the bagging data.

Response 1: 

This has been addressed in lines 128–143, where each algorithm’s strengths are mapped to the specific characteristics of the bagging system. RFR’s robustness to noise, SVR’s suitability for high-dimensional data, ANN’s ability to capture nonlinear patterns, and LR as a baseline are all justified in relation to the data’s physical and statistical properties.

Comment 2: The manuscript lacks a clear statement of the gap in existing fertilizer–bagging literature. The brief review (lines 93–102) focuses on robotics in food packaging (e.g. [16–19]) rather than data-driven prediction in fertilizers.
Suggestion: Articulate the specific absence of machine-learning studies in fertilizer bagging.

Response 2:

Addressed in lines 105–113. The revised text clarifies the gap by highlighting the absence of machine learning applications in fertilizer bagging, unlike prior studies focused on robotics. This study fills that gap with a data-driven approach using real sensor data.

Comment 3: There is no explicit hypothesis (e.g. “We hypothesize that RFR will outperform LR due to…”) or theoretical model linking sensor inputs to weight deviations.
Suggestion: Introduce a simple conceptual diagram or equation relating sensor variables (humidity, clamping time, etc.) to weight error, setting up the rationale for ML application.

Response 3:

This concern has been addressed as follows:

  • Hypothesis (lines 150–157):
    A clear hypothesis has been added, stating that Random Forest Regression (RFR) is expected to outperform other models (ANN, SVR, LR) due to its robustness against noise, outliers, and nonlinear relationships in the fertilizer bagging data.

  • Theoretical Model (lines 73–88):
    A theoretical explanation is provided to link each sensor input (e.g., temperature, humidity, clamping time, pressure) to its potential impact on weight deviation. This establishes a conceptual foundation for applying machine learning to predict weight variations in the bagging system.

Comment 4: Currently, the Introduction runs continuously from contextual background through literature review into methodological overview without clear breaks.
Suggestion:

Motivation: Briefly outline the operational and economic impact (e.g. 37% nonconformance, 30.6 billion IDR loss), and the need for predictive control. In “Motivation,” explicitly state the gap: “While Industry 4.0 emphasizes ML in manufacturing, fertilizer bagging remains under-studied.” End the “Motivation” subsection by pinpointing this precise niche, priming the reader for your novel contributions.

Contributions: Numbered bullets stating: (1) comparative ML evaluation, (2) IoT-driven Smart Bagging System design, (3) economic impact analysis, (4) alignment with SDG 9 & 12.

Response 4: 

The Introduction is now divided into clear subsections: 1.1 Motivation, 1.2 Literature Review and Gap, 1.3 Hypotheses, 1.4 Contributions, and 1.5 Previous Study and Research Positioning, to enhance structure and clarity.

Comment 5: The description (Section 2.2) says 100 samples from a single shift on 24 Sept 2024 at 30 s intervals, but does not justify whether this is statistically sufficient for ML training, nor mention train/test split proportions beyond cross-validation.
Suggestion: Detail how the 100 samples were partitioned (e.g. 80/20 train/test) and provide rationale for sample size relative to model complexity.

Response 5: 

This has been addressed in lines 255–259. The study now uses 1000 samples collected via systematic random sampling across multiple shifts to ensure representativeness. A 10-fold cross-validation strategy was applied to enhance statistical robustness given the dataset size and model complexity.
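The 10-fold cross-validation protocol described in this response could be set up along these lines; the data below are a synthetic stand-in for the 1,000 field samples, and the nine-feature shape mirrors the paper's sensor variables:

```python
# Sketch of a 10-fold CV evaluation on synthetic stand-in data.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold, cross_val_score

X, y = make_regression(n_samples=1000, n_features=9, noise=5.0, random_state=0)
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(RandomForestRegressor(n_estimators=50, random_state=0),
                         X, y, cv=cv, scoring="r2")
print(f"mean R² = {scores.mean():.3f} ± {scores.std():.3f}")
```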

Comment 6: No sensitivity or ablation analysis is presented. For example, how does performance degrade if you omit “clamping time” or “air pressure”?
Suggestion: Include either (a) a one-factor-at-a-time sensitivity plot, or (b) an ablation table showing R² drops when key features are removed.

Response 6:

This has been addressed in lines 478–506 (Section 3.5.1. Ablation Analysis). An ablation analysis was conducted by removing one feature at a time and measuring the drop in R² (ΔR²). Results are summarized in Table 8, showing that while the model remains stable overall, features like clamping time and air pressure are retained due to their technical relevance and high importance scores.
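A minimal sketch of the one-feature-at-a-time ablation described here, using synthetic data and hypothetical feature names (the real analysis uses the paper's nine sensor variables and Table 8 reports the actual ΔR² values):

```python
# Illustrative ablation: drop one feature at a time, retrain, record the R² drop.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=500, n_features=4, random_state=0)
names = ["clamping_time", "air_pressure", "humidity", "temperature"]  # hypothetical

def mean_r2(features):
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    return cross_val_score(model, features, y, cv=5, scoring="r2").mean()

baseline = mean_r2(X)
for i, name in enumerate(names):
    r2 = mean_r2(np.delete(X, i, axis=1))   # drop one column, retrain
    print(f"without {name}: ΔR² = {baseline - r2:+.4f}")
```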

Comment 7: All performance claims rely on point estimates (R², MAE, RMSE). There is no confidence interval or significance test (e.g. Wilcoxon signed-rank) to show that RFR’s 0.9638 R² is significantly better than ANN’s 0.9277.
Suggestion: Perform a paired statistical test across CV folds (Wilcoxon signed-rank) and report p-values or 95% CIs for each metric.

Response 7:

This has been addressed in lines 408–419, Subsection 3.4.1 Wilcoxon Test. A Wilcoxon signed-rank test was performed on R² values across 10-fold CV, with results in Table 4 showing that RFR significantly outperforms ANN, SVR, and LR (p < 0.05). This confirms the statistical significance of RFR’s superior performance.
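The paired Wilcoxon signed-rank test across CV folds can be illustrated as below; the per-fold R² values are invented for demonstration and are not the paper's results:

```python
# Paired, non-parametric comparison of two models' per-fold R² (illustrative values).
from scipy.stats import wilcoxon

r2_rfr = [0.965, 0.962, 0.968, 0.960, 0.966, 0.963, 0.961, 0.967, 0.964, 0.962]
r2_ann = [0.934, 0.930, 0.935, 0.926, 0.931, 0.927, 0.924, 0.929, 0.925, 0.922]

stat, p = wilcoxon(r2_rfr, r2_ann)   # tests paired differences across folds
print(f"W = {stat}, p = {p:.4f}")    # p < 0.05 → significant difference
```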

Comment 8: The paper does not report training or inference times, which are critical for real-time IoT implementation.
Suggestion: Add a small section (or table) comparing per-bag prediction latency for each model on your hardware.

Response 8:

This has been addressed in lines 420–434 (Section 3.4.2). Training time and per-bag inference latency for all models were measured and reported in Table 5. All models, including RFR (12.04 ms), demonstrated sub-20 ms latency, confirming their suitability for real-time IoT deployment.
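Per-bag inference latency of the kind reported in Table 5 can be measured along these lines; the data and model are synthetic stand-ins, and absolute numbers depend entirely on hardware:

```python
# Timing sketch: average single-sample prediction latency of a trained model.
import time
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=1000, n_features=9, random_state=0)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

one_bag = X[:1]                      # a single incoming sensor reading
n = 200
t0 = time.perf_counter()
for _ in range(n):
    model.predict(one_bag)
latency_ms = (time.perf_counter() - t0) / n * 1000
print(f"mean per-bag latency: {latency_ms:.2f} ms")
```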

Comment 9: There is no direct comparison to any existing predictive model on a similar bagging problem.
Suggestion: Identify at least one benchmark from manufacturing (e.g. [10,23–27]) and, if possible, run it on your dataset or discuss why such methods aren’t directly comparable.

Response 9:

Thank you for the suggestion. This has been addressed in lines 336–353 (Section 3.3 Benchmark Comparison with Existing Manufacturing Models). A relevant benchmark from injection molding [37] was identified, and a conceptual comparison is provided. While direct testing on our dataset is not feasible due to differing objectives and data types, we highlight methodological similarities and show how our approach extends prior work with real-time integration and quality control for fertilizer bagging.

Comment 10: The title abbreviates “Random Forest Regression (RFR)” but never uses “BS” for Bagging System elsewhere.
Suggestion: Spell out all acronyms at first use (e.g. “Bagging System (BS)”), or eliminate unused acronyms in the title.

Response 10: 

After careful review, we confirm that the term “Bagging System” was not abbreviated as “BS” anywhere in the manuscript. To ensure consistency and clarity, we have chosen to use the full term “Bagging System” throughout the paper without abbreviation. Accordingly, no unused acronyms appear in the title or body of the manuscript.

Furthermore, all other abbreviations—such as Artificial Neural Network (ANN), Random Forest Regression (RFR), Support Vector Regression (SVR), Linear Regression (LR), and Internet of Things (IoT)—have been properly defined at first mention and used consistently throughout the text to enhance readability and maintain academic standards.

Comment 11: The Conclusion (Section 5) is overly repetitive of the Results and future work, and its “95% reduction” claim is presented without caveat.

Suggestion:

Summarize only key insights (best model, most influential features, economic impact).

Response 11:

The Conclusion has been revised in Section 5 to avoid repetition and focus on key findings. The updated version highlights the best-performing model (RFR), key influential features, and economic impact. It now includes clear limitations (controlled conditions, sample size) and discusses managerial relevance, particularly how the Smart Bagging System can be integrated into SCADA for practical use.

Comment 12: No explicit limitations are discussed.
Suggestion: Add a “Limitations” paragraph: data representativeness, sensor drift, need for longer-term field validation.

Response 12:

Thank you for the suggestion. A new Section 6. Limitations has been added. This section discusses key constraints, including limited data representativeness, potential sensor drift, and the need for longer-term field validation to assess system robustness in real-world operations.

Comment 13: The Abstract is wordy and reads like a mini-introduction; it should focus on problem statement, method, key results, and main takeaway.
Suggestion: Trim to 200–250 words. Remove background details (e.g. SDG alignment can be one sentence).

Response 13:

Thank you for the feedback. The Abstract has been revised and shortened. It now focuses on the problem statement, methodology, key findings, and main contributions. Background details, including the SDG alignment, have been reduced to a single sentence to maintain conciseness and clarity.

Comment 14: Several paragraphs repeat the same points (e.g. future work restated twice).
Suggestion: Consolidate repeated ideas—especially in Sections 3.4 and 4.

Response 14: 

We have carefully revised the manuscript to eliminate repetitive content and improve clarity.

Comment 15: Table numbering is inconsistent (Table 1 appears twice)—the model configuration table is “Table 1,” then the performance results are also “Table 1.”
Suggestion: Renumber tables sequentially and ensure each has a clear, descriptive caption.

Response 15:

We have corrected the table numbering throughout the manuscript to ensure consistency. Each table now follows a sequential order, and all captions have been reviewed and updated to provide clear and descriptive information.
Additional Critical Comments

Comment 16: With only 100 samples from a single shift, models (especially ANN) risk overfitting. Consider expanding data collection across multiple shifts and seasons to capture variability.

Response 16:

In response to this concern, we have expanded the dataset from the initial 100 samples to a total of 1,000 samples collected across multiple shifts over several days of operation. This expanded dataset captures a broader range of machine conditions, environmental variations, and operational dynamics, thereby reducing the risk of overfitting—especially for more complex models such as ANN. The additional data also improve the robustness and generalizability of the machine learning models presented in this study.

Comment 17: The Smart Bagging System is never piloted in the field. A small validation run demonstrating live weight corrections would greatly strengthen the claims.

Response 17: 

We would like to clarify that the IoT-based sensor system has already been installed and integrated into the actual Urea Bagging Unit at PT Petrokimia Gresik. The current study focused on model development and simulation-based validation under controlled conditions, as a preliminary step before field testing. We fully agree that a live validation of automatic weight correction would enhance the practical strength of our findings. As such, a limited-scale field trial is already being planned as part of the next phase of implementation. This trial will test the real-time response of the Smart Bagging System and evaluate its performance under varying operating conditions. These plans have been mentioned in Section 6 (Limitations) to acknowledge the gap and outline our future direction.

Comment 18: Since SDG 12 and resource efficiency are cited, briefly discuss any environmental impacts of additional sensors or data-center energy use.

Response 18:

Thank you for the suggestion. This has been addressed in lines 816–820 within the Limitations section. The environmental impact of added sensors and computation is acknowledged, with justification that the minimal energy use is outweighed by the significant reduction in material waste, supporting SDG 12 on responsible consumption and production.

Author Response File: Author Response.docx

Reviewer 3 Report

Comments and Suggestions for Authors

Dear Authors,

The authors proposed ML-based product weight prediction. The following comments should be considered to improve the paper:

  • Lines 18–19: “The dataset used consisted of nine sensor variables collected in real-time.” It is better to state the variable types here.
  • Lines 98–102: “This uses cameras and adaptive motion planning, but does not yet involve machine learning…” If the existing technologies, process optimization techniques, and parametric predictive models of packaging systems meet the required performance, why is it necessary to adopt ML-based predictive models for decision-making? A deeper review of these improvements, including more recent research contributions, is required to better express the paper's challenges.
  • Section 3.1. Tools/algorithms used for the preprocessing step (Completeness Verification, Consistency Checks, and Outlier Detection) should be specified.
  • The data collected during the specified duration appear to follow relationships and laws that can be modeled mathematically, so there would be no need for ML-based product weight prediction (there is no information about data values in this data collection section). The collected data need an insights presentation.
  • “Artificial Neural Network (ANN), Support Vector Regression (SVR), Linear Regression (LR), and Random Forest Regression (RFR)” is repeated several times in the manuscript. Also, use abbreviations consistently (e.g., IoT after its initial definition).
  • The manuscript includes two descriptions of the ML algorithms, in Section 3.1 and the Introduction. It is better to merge them into one algorithm-selection paragraph.
  • For values in Table 1, use the same decimal place employed in the manuscript text.
  • Equation 4 is duplicated in the manuscript.
  • Lines 384–385: “Furthermore, the model was evaluated on unseen test data to confirm generalization performance.” How were these data collected? What are the results of this evaluation?
  • In the manuscript, there are several repetitions of the same idea that decrease the manuscript's quality, such as:
  • Lines 399–403 and lines 417–422: the same idea about clamping time and air pressure is repeated twice.
  • Lines 483–486: “This reflects a 3.4% improvement compared to the untuned model's RMSE of 0.05, as calculated using Equation 4. Applying hyperparameter tuning on RFR reduced the RMSE value from 0.05 (without tuning) to 0.0483, equivalent to a decrease of about 3.4%.” The same idea about the RMSE reduction is duplicated in two successive phrases.
  • Given the parametric relationship (power = force × distance/time), and since the distance and cross-sectional area are constants (same machine components), clamping time and air pressure are linearly correlated if the power of the clamping system is constant. It is therefore expected that if one of the measured variables makes an important contribution, the second follows the same importance level. This point should be discussed earlier, e.g., in the sensor selection section for data collection.
  • Figure 4 is unclear and requires enhancements.
  • In line 449, “After a specified time, the clamp releases the bag, and the bucket valve closes,” the clamping time is supposed to be predefined and constant. However, in line 454, “In this study, the clamping time ranged between 3.23 and 4.96 seconds,” meaning that the clamping time is defined by a range. Also, the air pressure varied from 7.35 to 11.39 bar. Is this pressure variation due to functional requirements (initial system design) or to pneumatic system defects (performance)? A further description of the bagging process is required to avoid any ambiguity.
  • Although Figure 4 is unclear, the authors discuss the complexity of the relationship between clamping time and air pressure only after showing the results of the ML-based predictive models. The previous comments show that this complexity must be discussed before deciding to use a predictive model, for better understanding and methodological consistency. This discussion must include the other remaining parameters.
  • The layout of Table 3 should be reviewed.
  • Several phrases need English and structure improvements such as Lines 547-549 and Line 568.
  • Section 3.4 is limited to presenting a preliminary design idea of a monitoring framework for further implementation. Review the section title, which implies an actual implementation and tests. At the least, a framework flowchart should be presented. Is the proposed monitoring framework limited to pretrained data (offline), or will it incorporate newly collected data to enhance training (online)? For the second option, will the algorithm runtime impact the time-effectiveness of the monitoring system?

Regards

Author Response

Comment 1: Lines 18–19: “The dataset used consisted of nine sensor variables collected in real-time.” It is better to state the variable types here.

Response 1:

This has been clarified in line 18, where we specify that the dataset consists of nine numeric sensor variables, addressing the concern about variable type.

Comment 2: Lines 98–102: “This uses cameras and adaptive motion planning, but does not yet involve machine learning…” If the existing technologies, process optimization techniques, and parametric predictive models of packaging systems meet the required performance, why is it necessary to adopt ML-based predictive models for decision-making? A deeper review of these improvements, including more recent research contributions, is required to better express the paper's challenges.

Response 2:

This has been addressed in lines 98–104. We have expanded the explanation to clarify that while rule-based and parametric models support automation, they lack adaptability to nonlinear, multivariate sensor data. The added text highlights ML’s advantages in predictive accuracy and dynamic decision-making, supported by recent studies comparing ML-based systems to conventional packaging methods.

Comment 3: Section 3.1. Tools/algorithms used for the preprocessing step (Completeness Verification, Consistency Checks, and Outlier Detection) should be specified.

Response 3:

This has been addressed in lines 305–310. The preprocessing tools used—such as WEKA’s “Remove with Missing Values” filter and IQR-based outlier visualization—are now specified to clarify how completeness, consistency, and outliers were handled prior to modeling.
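The IQR rule mentioned in this response can be sketched in a few lines. This is a minimal illustration, not WEKA's implementation, and the sample bag weights below are hypothetical, not the study's data:

```python
# Flag values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] as outliers (the
# standard interquartile-range rule used for boxplot-style screening).
import statistics

def iqr_outliers(values, k=1.5):
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

weights = [49.8, 49.9, 50.0, 50.1, 50.2, 55.0]  # hypothetical bag weights (kg)
print(iqr_outliers(weights))  # the 55.0 kg reading is flagged
```

With `k=1.5` the bounds here are roughly 49.55–50.55 kg, so only the 55.0 kg reading is screened out.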

Comment 4: The collected data during the specified duration appears to follow relationships and laws that can be mathematically modeled and no need for product weight predictions-based ML (there is no information about data values in this data collection section). The collected data needs an insights presentation.

Response 4:

This has been addressed in lines 311–321. Descriptive statistics for all sensor variables and the target (product weight) have been added in Table 2. The presented insights demonstrate variability and multivariate relationships that are not easily captured by simple mathematical models, thereby justifying the use of machine learning for accurate prediction.

Comment 5: “Artificial Neural Network (ANN), Support Vector Regression (SVR), Linear Regression (LR), and Random Forest Regression (RFR)” is repeated several times in the manuscript. Also, use abbreviations consistently (e.g., IoT after initial definition).

Response 5:

We have revised the manuscript to ensure consistency in the use of abbreviations. Full terms such as “Artificial Neural Network (ANN)” and “Internet of Things (IoT)” are now defined only at their first mention, and abbreviated forms are used consistently thereafter. In addition, to reduce redundancy, repeated mentions of the full list of machine learning models have been replaced with “the four models” or similar expressions where appropriate. These changes improve clarity and streamline the presentation of the manuscript.

Comment 6: The manuscript includes two descriptions of ML algorithms in Section 3.1 and Introduction. It is better to merge them in one algorithm selection paragraph.

Response 6:

In response to your suggestion, we have removed the description of machine learning algorithms from Section 3.1 to eliminate redundancy. The explanation in the Introduction has also been revised and condensed into a single, clear paragraph that presents the rationale behind the model selection. This revision ensures that the information is concise yet informative for the reader.

Comment 7: For values in Table 1, use the same decimal place employed in the manuscript text.

Response 7: 

Table 1 has been updated to Table 3 (line 373) due to the addition of new content and an expanded dataset (from 100 to 1000 samples). Metric values have been recalculated and adjusted for decimal consistency. Despite these updates, RFR consistently remains the most accurate and reliable model across all evaluation metrics.

Comment 8: Equation 4 is duplicated in the manuscript.

Response 8:

One of the duplicate Equation 4 entries has been removed to avoid redundancy. The equation now appears only once in the manuscript.

Comment 9: Lines 384-385 “Furthermore, the model was evaluated on unseen test data to confirm generalization performance” How these data are collected? What are the results of this evaluation?

Response 9:

This has been clarified in lines 402–405. The “unseen data” refers to the 10% test subset in each fold of the 10-fold cross-validation, which was not used during training. Model performance on these subsets is reflected in the reported average metrics, confirming its generalization capability.
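The evaluation protocol this response describes can be sketched in plain Python: in 10-fold cross-validation, each fold's held-out ~10% slice is "unseen" by the model trained on the other 90%, and every sample is tested exactly once. The indices below are synthetic, not the study's data:

```python
import random

def kfold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists: each of the k folds serves once
    as the held-out ~10% test set, the remaining folds as training data."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

# sanity check: every sample appears in exactly one test set
seen = []
for train, test in kfold_indices(1000):
    assert len(train) + len(test) == 1000
    seen.extend(test)
assert sorted(seen) == list(range(1000))
```

Averaging a metric such as RMSE over the ten held-out folds gives the generalization estimate the response refers to.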

Comment 10: In the manuscript, there are several repetitions of the same idea decreasing the manuscript quality, such as:

Lines 399-403 and Lines 417-422: the same idea about clamping time and air pressure is repeated twice.

Lines 483-486: “This reflects a 3.4% improvement compared to the untuned model's RMSE of 0.05, as calculated using Equation 4. Applying hyperparameter tuning on RFR reduced the RMSE value from 0.05 (without tuning) to 0.0483, equivalent to a decrease of about 3.4%.” The same idea about RMSE reduction is duplicated in two successive phrases.

Response 10:

The duplicate explanation has been removed. The two paragraphs have been merged and revised into a single, concise version to avoid redundancy and improve the clarity and flow of the manuscript.
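The 3.4% figure quoted in Comment 10 checks out as a relative RMSE reduction (the manuscript attributes this ratio to its Equation 4):

```python
# Relative RMSE reduction from hyperparameter tuning, using the values
# quoted in the comment: untuned RFR vs. tuned RFR.
rmse_untuned = 0.05
rmse_tuned = 0.0483

reduction = (rmse_untuned - rmse_tuned) / rmse_untuned
print(f"RMSE reduced by {reduction:.1%}")  # prints "RMSE reduced by 3.4%"
```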

Comment 11: Given the parametric relationship (power = force × distance / time), and since distance and cross-sectional area are constants (same machine components), clamping time and air pressure are linearly correlated if the power of the clamping system is constant. It is expected that if one of the measured variables has an important contribution, the other follows the same importance level. This verdict should be discussed earlier, e.g., in the sensor selection section for data collection.

Response 11:

This has been addressed in lines 223–232. The mechanical power equation acknowledges the relationship between clamping time and air pressure. However, in practical industrial settings, these variables are independently adjusted and measured to capture distinct aspects of pneumatic dynamics. Their separate inclusion is supported by prior studies and justified during sensor selection.
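The empirical check implied by Comment 11 is a Pearson correlation: if clamping time and air pressure were strictly linearly coupled, |r| would be near 1, whereas independently adjusted variables show |r| well below 1. A stdlib-only sketch, using hypothetical readings within the ranges quoted in the manuscript (not the study's actual measurements):

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient between two sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

clamp_time = [3.23, 3.80, 4.10, 4.50, 4.96]   # seconds (hypothetical)
pressure   = [7.35, 9.90, 8.20, 11.39, 9.10]  # bar (hypothetical)

r = pearson_r(clamp_time, pressure)
```

Running such a check on the collected sensor data during sensor selection would make the independence claim in the response directly verifiable.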

Comment 12: Figure 4 is unclear and requires enhancements.

Response 12:

We have sharpened Figure 4 to improve its clarity.

Comment 13: In line 449, “After a specified time, the clamp releases the bag, and the bucket valve closes.”, the clamping time is supposed to be predefined and constant. But in line 454, “In this study, the clamping time ranged between 3.23 and 4.96 seconds,” meaning that the clamping time is defined by a range. Also, the air pressure varied from 7.35 to 11.39 bar. Is this pressure variation due to functional requirements (initial system design) or pneumatic system defects (performance)? A further description of the bagging process is required to avoid any ambiguity.

Response 13:

This has been addressed in lines 520–535. Although clamping time is predefined in the system, actual values vary due to operator adjustments in response to real-time deviations. Similarly, air pressure variation results from both functional design settings and performance-related fluctuations in the pneumatic system. These operational dynamics reflect the semi-automated nature of the current system and justify the need for predictive control.

Comment 14: Despite the Figure 4 is unclear, the authors discussed the complexity of the relationship between clamping time and air pressure after showing the results of ML-based predictive models. The previous comments prove that this complexity must be discussed before deciding to use a predictive model, for better understanding and methodology consistency. This discussion must include the other remaining parameters.

Response 14:

Clamping time and air pressure were selected for detailed discussion because they were identified as key variables based on attribute importance analysis. The focus on these two parameters ensures practical relevance and aligns with the study’s goal to support real-time control. Figure 4 was used as an illustrative tool to visualize their relationship with fertilizer weight, highlighting the non-linear and independent effects. While the discussion centers on these variables, all nine parameters were included in the model and evaluated accordingly.

Comment 15: The layout of Table 3 should be reviewed.

Response 15:

The layout of Table 3 has been revised to enhance readability and clarity of presentation.

Comment 16: Several phrases need English and structure improvements, such as in Lines 547-549 and Line 568.

Response 16:

Thank you for pointing this out. The phrasing and sentence structure in lines 547–549 and 568 have been revised for improved clarity and grammatical accuracy.

Comment 17: Section 3.4 is limited to presenting a preliminary design idea of a monitoring framework for further implementation. Review the section title, which implies an actual implementation and tests. At least, a framework flowchart should be presented. Is the proposed monitoring framework limited to pretrained data (offline) or will it include newly collected data to enhance the training (online)? For the second option, will the algorithm runtime impact the time-effectiveness of the monitoring system?

Response 17:

While the section is titled "IoT and Real-Time Monitoring Implementation in Bagging System," the text now explicitly states that the current work represents a partial implementation. The architecture has been developed and tested with offline-trained models, and an online learning mechanism is planned for future integration. A conceptual workflow diagram (Figure 6) has also been added to illustrate the system design and operational logic.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The paper could be accepted.

Author Response

Thank you for your valuable input throughout the review process. We appreciate your feedback, which helped improve our manuscript, and we are delighted that it has now been accepted.

Reviewer 2 Report

Comments and Suggestions for Authors

All of my concerns have been adequately addressed, and the manuscript is now suitable for publication.

Author Response

Thank you for your valuable input throughout the review process. We appreciate your feedback, which helped improve our manuscript, and we are delighted that it has now been accepted.

Reviewer 3 Report

Comments and Suggestions for Authors

Dear Authors,

I acknowledge and appreciate the significant efforts made in revising the manuscript. The quality of the paper has been substantially improved. However, I would like to highlight that some illustrations still require further refinement to enhance their clarity and readability.

Regards,

Author Response

Thank you for your valuable comments. Following your advice, we have refined Figures 4 and 5 to improve their clarity and readability. We appreciate your feedback, which has greatly strengthened the paper, and we are delighted that it has now been accepted.
