Abstract
Accurate prediction of shrimp body weight is critical for optimizing harvest timing, feed management, and stocking density in intensive aquaculture. Prior studies emphasize environmental factors, whereas operational management variables, particularly harvesting metrics, remain understudied. This study quantified the predictive importance of harvesting-related variables using 5 years of industrial-scale operational data from 12 ponds (5479 records after cleaning, a 34.94% retention rate). We trained seven machine learning models and assessed feature importance with three independent methods: consensus importance ranking, SHAP explainability analysis, and Pearson correlation. Operational variables (SHAP importance: days of culture 2.833, stocking density 1.871, cumulative feed 1.510) ranked well above environmental variables (temperature 0.123, pH 0.065, dissolved oxygen 0.077); even the weakest operational variable scored more than 12 times the strongest environmental one. Partial harvest frequency showed bimodal clustering, suggesting two distinct viable operational strategies. The Weighted Ensemble model achieved the highest predictive performance (R² = 0.829, RMSE = 4.23 g, MAE = 3.12 g). In a stability analysis using 10-fold GroupKFold cross-validation, the Artificial Neural Network (ANN) had the narrowest confidence bounds (width 0.708 g; coefficient of variation 27.7%), making it the most consistent of the models evaluated. To our knowledge, this is the first study to systematically analyze the importance of harvesting variables using SHAP explainability. The findings suggest that, in well-managed systems, optimizing operational management may yield greater returns than marginal investments in environmental control.
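The size of the gap between the two variable groups can be checked directly from the SHAP values reported above. The following is a minimal sketch, not the study's analysis code: the dictionary keys, the function name `importance_gap`, and the grouping into "operational" and "environmental" are illustrative assumptions, and the values are treated as comparable importance scores on a common scale (for example, mean absolute SHAP), which the abstract does not state explicitly.

```python
# SHAP importance values copied from the abstract.
# Key names and the two-group split are illustrative assumptions.
operational = {
    "days_of_culture": 2.833,
    "stocking_density": 1.871,
    "cumulative_feed": 1.510,
}
environmental = {
    "temperature": 0.123,
    "pH": 0.065,
    "dissolved_oxygen": 0.077,
}


def importance_gap(high, low):
    """Compare two groups of importance scores.

    Returns a tuple of:
      - the ratio of summed importances (high / low), and
      - the ratio of the weakest score in `high` to the strongest in `low`.
    """
    total_ratio = sum(high.values()) / sum(low.values())
    worst_case_ratio = min(high.values()) / max(low.values())
    return total_ratio, worst_case_ratio


total_ratio, worst_case_ratio = importance_gap(operational, environmental)
print(f"summed operational / summed environmental: {total_ratio:.1f}x")
print(f"weakest operational / strongest environmental: {worst_case_ratio:.1f}x")
```

Under these assumptions, the summed operational importance is roughly 23 times the summed environmental importance, and cumulative feed alone is roughly 12 times temperature. These ratios describe each variable's contribution to model predictions, not a causal effect on shrimp growth.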