Optimization of an Automated Substrate Irrigation System Using the SAC Reinforcement Learning Agent

Kavaliauskas, Žydrūnas; Blažiūnas, Giedrius; Šajev, Igor

doi:10.3390/app152312715


by Žydrūnas Kavaliauskas *, Giedrius Blažiūnas and Igor Šajev

Centre of Engineering Studies, Kauno Kolegija, Pramones Ave. 20, LT-50468 Kaunas, Lithuania

* Author to whom correspondence should be addressed.

Appl. Sci. 2025, 15(23), 12715; https://doi.org/10.3390/app152312715

Submission received: 7 November 2025 / Revised: 27 November 2025 / Accepted: 29 November 2025 / Published: 1 December 2025

(This article belongs to the Special Issue Applications of Artificial Intelligence in Industry 4.0/5.0: Innovations, Challenges, and Future Directions)


Abstract

This study presents the optimization of an automated mushroom substrate irrigation system by integrating a Soft Actor-Critic (SAC) reinforcement learning agent with a recursive LSTM prediction model. The system, based on a Siemens S7-1200 PLC, CS650 dielectric sensors, and an Ethernet-based data architecture, provides real-time control of humidity, temperature, and electrical conductivity. Experimental data analysis shows that the SAC agent increases the episodic reward from 20–32 to 90–100 units over 200 episodes, stably maintaining the substrate moisture in the range of 61–65%. The LSTM model achieved a Validation Loss of 0.016–0.022, accurately predicting the hydro-physical parameters. Compared to traditional PID controllers, the SAC-based system reduces humidity deviations by 35–40%, reduces the risk of overwatering and drying out, and increases mycelium colonization. The results confirm that the developed cyber-bioprocess platform increases the stability of the mushroom cultivation process, water use efficiency, and product quality and shows potential for industrial application, which must be validated in larger-scale trials.

Keywords:

Soft Actor-Critic; reinforcement learning; substrate irrigation; automated control; LSTM forecasting; cyber-bioprocess platform; moisture optimization

1. Introduction

1.1. Context and Importance of the Problem

The mushroom farming industry is becoming an increasingly important part of the global food supply chain. It provides high-quality protein with a low environmental footprint and is independent of seasonal conditions. As agricultural systems intensify and the demand for sustainable food sources grows, mushroom production emerges as a strategic sector that can diversify raw materials and ensure a stable supply of functional food products. Due to high productivity per unit area and the ability to reuse organic waste, this sector is also considered one of the most efficient ways to increase biomass value and strengthen food system resilience and sustainability [1,2,3,4].

Control of microclimate parameters—particularly substrate moisture—is a key factor determining the physiological activity of mycelium. Moisture level directly influences metabolic intensity and nutrient absorption; even small deviations can disturb uniform colonization or promote the growth of unwanted microorganisms. Optimal moisture is equally important for fruiting: insufficient moisture limits biomass formation and reduces substrate bioavailability, while excessive moisture may create anaerobic zones, cause decay, and damage fruiting body morphology. Therefore, a stable and precisely regulated moisture regime is essential for both consistent cultivation performance and high, uniform product quality [3,4,5].

1.2. Current Situation and Limitations

Manual substrate watering in mushroom cultivation relies on the operator’s empirical judgment and periodic irrigation performed according to a fixed schedule, rather than real-time hydrophysical conditions. This approach introduces a high level of subjectivity, increasing the risk of systematic human errors—both in visually estimating moisture content and in reacting too slowly to changing microclimate conditions. Because irrigation is discrete and environmental monitoring is not automated, moisture control accuracy remains limited, and the process lacks the consistency needed to maintain proper substrate aeration [1,2]. These limitations can destabilize the hydrological regime. Overwatering reduces substrate porosity, creates anaerobic microzones and promotes phytopathogenic microflora, while overdrying lowers water potential, restricts osmotic availability and slows mycelial colonization [1,2,3,4,5]. Consequently, manual irrigation should be regarded as a technologically unreliable method [1,2,3].

1.3. Comparison and Problems of Automation

Automated substrate irrigation and hydrophysical monitoring systems represent a substantial technological improvement over conventional manual practices. Instead of discrete operator-driven actions, these systems use continuous regulation based on cybernetic control principles. Programmable logic controllers, flow meters, inductive sensors and frequency converters are integrated into a closed-loop architecture that maintains substrate moisture balance according to real-time state signals and ensures more uniform water distribution [1,2,3,4,5]. The moisture monitoring subsystem, typically employing sensitive dielectric sensors with RS485 communication and multi-level data processing in a PLC environment, enables accurate measurement of volumetric water content and moisture gradients, supporting more precise modeling of mycelial physiological processes [6,7,8,9,10]. Despite these advantages, implementation of automated systems also introduces challenges, such as synchronizing heterogeneous sensors, filtering signal noise, mitigating electromagnetic interference, tuning control algorithms for heterogeneous and biologically active substrates, and managing data integrity, calibration drift and control-loop stability. Thus, although automation reduces human error and improves process efficiency, effective operation still requires robust integration and compatibility across mechatronic and information subsystems [1,11,12,13,14,15].

1.4. Use of Automation and Artificial Intelligence

The integration of artificial intelligence into automated irrigation and hydrophysical control systems enables a shift from traditional deterministic PID controllers to adaptive, learning-based architectures [1,2,3,16,17,18,19,20]. Machine learning methods such as recursive LSTM networks and convolutional topology models can accurately represent stochastic variations in substrate dielectric properties, changes in hydrostatic potential, and the dynamics of mycelial bioenergetic processes [2,4,5,21,22,23,24,25]. These models can be coupled with Model Predictive Control (MPC) to generate real-time irrigation commands optimized according to the current substrate state [1,2,3,4,26,27,28,29,30]. As a result, the control platform becomes self-regulating and dynamically adaptable, relying not on fixed rules but on continuously updated probabilistic models and optimization functions that improve during operation [3,4,5,30,31,32,33,34,35].

1.5. Contribution and Novelty of the Study

This study addresses a gap in the literature by integrating a Soft Actor-Critic (SAC) reinforcement learning agent into an automated mushroom substrate irrigation system, enabling real-time adaptation to humidity fluctuations and substrate heterogeneity. The proposed control platform combines PLC-based hardware, humidity sensors and SAC-based AI algorithms, which substantially improve moisture regulation accuracy and reduce deviations compared with traditional PID-type controllers. Unlike deterministic systems, the SAC agent continuously adapts to dynamically changing hydrophysical conditions and learns to optimize irrigation intensity based on long-term moisture balance. In this way, the system becomes a self-regulating and adaptive control platform. The aim of this study is to evaluate the effectiveness of the SAC-integrated irrigation system in real-time, focusing on optimizing the irrigation process, ensuring uniform moisture distribution and improving overall system stability and efficiency.

2. Materials and Methods

2.1. Automated Substrate Watering

Substrate watering in the automated system is carried out by an electromechanical device that moves the watering frame from the initial (J_MIN) to the final (J_MAX) position, ensuring uniform irrigation throughout the substrate volume. The system is controlled by a Siemens S7-1200 CPU 1214C DC/DC/DC programmable logic controller (PLC), which handles process control, data processing and communication with peripheral devices. The speed of the watering motor is regulated by a 4AO analog output module (6ES7232-4HD32-0XB0), which transmits 0–10 V signals to the frequency converter C200-024 00,041 Commander C 1.5 kW. The movement trajectory is controlled by inductive sensors that determine the position of the drive, and a pulse flow meter records the amount of water dispensed. Communication between the sensors and the PLC is provided by an SDI/RS485 converter based on the PIC18F25K22 microcontroller, which handles SDI protocol communication with the CS650 moisture sensors; eight of these are located at different points in the substrate chamber, allowing accurate assessment of moisture distribution. To integrate the system into an Ethernet network, an NPort 5100 RS485/ETH converter and an FL SWITCH SFNB 5TX LAN switch are used, allowing remote process control and parameter monitoring via a web interface. During irrigation, the water valve is opened automatically, and the actuator moves back and forth between the start and end sensors. At the end of each stroke, the movement slows down to ensure an accurate stop. The system automatically records and displays the results: time stamps, irrigation period, number of cycles performed, actuator speed, movement duration, and amount of water dispensed. In the event of a fault, warning modules, including visual LED indicators and audible buzzers, signal failures, sensor anomalies, or unacceptable fluctuations in system parameters.

Once the cause has been eliminated, the cycle can be reset by pressing RESET. This ensures a consistent, repeatable and precisely controlled substrate wetting process and reduces the risk of human error. The general conceptual scheme of the system is presented in Figure 1a. This study used two interconnected but clearly separated environments: a physical irrigation facility and an LSTM-based virtual simulation in which the SAC agent is trained (Figure 1b). (i) The data used to build the models and train the SAC agent are collected only from the real system: CS650 dielectric sensors connected to a Siemens S7-1200 PLC measure substrate moisture and temperature, and the PLC records the actual irrigation commands in real time. These measurements form the training dataset and are used to build the LSTM predictive model. (ii) During SAC training, all control commands are applied only virtually: the agent operates in the LSTM simulator, which generates predicted moisture and temperature changes that follow the patterns of the real data. Control actions are applied to the real device only after training: the learned SAC policy sends irrigation ON/OFF signals directly to the physical PLC, and subsequent system changes are evaluated from real sensor measurements. (iii) The visual results are separated accordingly: graphs presenting the simulation and its predictions are obtained only in the virtual LSTM–SAC environment, while graphs presenting real humidity or control dynamics reflect the experimentally measured behavior of the physical equipment. This ensures a clear separation between the simulation and real hardware tests.

2.2. Automated Substrate Moisture Measurement

Substrate moisture measurement is fully automated and is based on four CS650 moisture sensors placed at strategic locations in the growth chamber. The sensors are calibrated using the GP2 controller, and their SDI signals are passed through the SDI/RS485 converter and the NPort 5100 RS485/ETH converter to the PLC. The PLC collects, processes and stores the data, which can be viewed via the integrated web server. The system records the volumetric water content of the substrate mass, its electrical conductivity and its temperature, allowing analysis of moisture dynamics and of changes in the hydro-physical parameters of the substrate. Level sensors monitor the liquid level and signal the need for pumping or draining, and warning modules report failures or anomalies, increasing process safety. All these components (PLC, analog output module, frequency converter, LAN switch, SDI/RS485 and RS485/ETH converters, humidity sensors, inductive and level sensors, pulse flow meter and warning modules) form a single automated irrigation and moisture measurement platform, ensuring a consistent, repeatable and safe mushroom substrate irrigation process.

In order to increase the reliability of the system and to respond to the importance of sensor data quality, the study additionally implemented clearly defined procedures for sensor calibration and data stability assurance. All humidity, temperature and environmental sensors were calibrated according to a two-point reference protocol, and to assess long-term operational stability, a continuous operation test of more than 30 days was performed, which determined the typical sensor drift (0.8–2.3% depending on the sensor type). In order to reduce the influence of electromagnetic interference, the system was supplemented with shielding solutions (grounded shielded cable pairs), and real-time filtering methods, including a moving average filter and a Kalman filter, were applied to maintain data accuracy, which allow for noise compensation and the correction of anomalous measurements. Furthermore, the analysis of the impact of sensor error showed that even with ±3% measurement inaccuracy, filtering and anomaly suppression procedures allow maintaining the stability of SAC agent control and ensuring that irrigation decisions do not lead to systematic over- or under-watering. These measures significantly increase system reproducibility, reduce the influence of long-term drift, and ensure greater overall accuracy of automated irrigation decisions.
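The two filtering methods named above can be illustrated with a minimal Python sketch. It is not the implementation used on the PLC: the window length, the noise variances `q` and `r`, and the sample readings are hypothetical values chosen only for illustration, and the Kalman filter assumes a simple random-walk model for a single moisture channel.

```python
from collections import deque


class MovingAverage:
    """Fixed-window moving average for a single sensor channel."""

    def __init__(self, window: int = 5):
        self.buf = deque(maxlen=window)

    def update(self, x: float) -> float:
        self.buf.append(x)
        return sum(self.buf) / len(self.buf)


class ScalarKalman:
    """1-D Kalman filter with a random-walk (slowly varying level) model."""

    def __init__(self, q: float = 1e-3, r: float = 0.5,
                 x0: float = 0.0, p0: float = 1.0):
        self.q, self.r = q, r      # assumed process / measurement noise variances
        self.x, self.p = x0, p0    # state estimate and its variance

    def update(self, z: float) -> float:
        self.p += self.q                   # predict: uncertainty grows
        k = self.p / (self.p + self.r)     # Kalman gain
        self.x += k * (z - self.x)         # correct with measurement z
        self.p *= (1.0 - k)                # posterior variance
        return self.x


# Hypothetical moisture readings (%), with one anomalous spike at index 3.
readings = [63.0, 63.4, 62.8, 70.0, 63.1, 62.9]
ma = MovingAverage(window=3)
kf = ScalarKalman(x0=readings[0])
smoothed_ma = [ma.update(z) for z in readings]
smoothed_kf = [kf.update(z) for z in readings]
```

With these assumed settings the Kalman estimate moves only part of the way toward the spike, which is the kind of anomaly suppression the paragraph describes; in practice `q` and `r` would be tuned to the measured sensor noise and drift.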

2.3. AI, SAC and Training Data

A SAC reinforcement learning agent, implemented in PyTorch, was integrated to control the automated substrate irrigation system; the agent was trained in a virtual environment built from LSTM model predictions. The platform was trained on historical sensor data covering substrate moisture, temperature and irrigation on/off status. For model training, a multimodal time series dataset was collected, including substrate moisture measured by CS650 dielectric sensors at different locations in the growth chamber, substrate temperature, and watering actions over time. The data were collected in real time via an SDI/RS485 converter and transmitted to a PLC with time stamps. During initial processing, anomalies due to sensor fluctuations, electromagnetic interference, or random measurement errors were removed. The scales of the different parameters were unified by normalization so that the LSTM network could efficiently process different units of measurement. Subsequently, the data were grouped into sequential windows covering past states and future target values, and the dataset was divided into training (70–80%) and validation (20–30%) parts to check the model's ability to generalize to new environmental conditions. This data preparation process allowed the SAC agent, based on LSTM predictions, to learn to adapt irrigation intensity in real time and maintain the hydro-physical parameters of the substrate within the optimal range. The LSTM network, consisting of two hidden layers of 64 neurons each, is trained to predict future humidity, temperature, and the irrigation indicator. This model is used as a simulator that generates the next system states, on which the SAC agent relies when making decisions. The SAC agent aims to keep humidity within the optimal range: it learns to control the irrigation signal so as to reduce the deviation of humidity from the target value and stabilize the state of the substrate. The agent operates in a constantly updated simulation environment, in which the LSTM model predicts how humidity will change depending on the agent's actions. This combination creates a self-regulating and adaptive virtual irrigation control platform. The system optimizes the control of humidity and other parameters, makes it possible to test various control scenarios, and allows more efficient irrigation algorithms to be developed safely, without risk to the physical equipment.
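The interaction between the agent and the LSTM-based simulator can be sketched as a Gym-style loop. This is a minimal illustration under stated assumptions, not the authors' code: the class name `SimulatedIrrigationEnv`, the `toy_predictor` that stands in for the trained LSTM, the wetting and drying rates, and the threshold policy used in place of SAC are all hypothetical, and the reward signal is omitted.

```python
from typing import Callable, List, Tuple

State = Tuple[float, float]  # (moisture %, temperature in deg C)


class SimulatedIrrigationEnv:
    """Gym-style loop in which a learned predictor plays the role of the plant."""

    def __init__(self, predictor: Callable[[List[State], int], State],
                 initial: State, history_len: int = 20, episode_len: int = 150):
        self.predictor = predictor
        self.initial = initial
        self.history_len = history_len
        self.episode_len = episode_len
        self.t = 0
        self.history: List[State] = []

    def reset(self) -> State:
        self.t = 0
        self.history = [self.initial] * self.history_len
        return self.history[-1]

    def step(self, action: int) -> Tuple[State, bool]:
        nxt = self.predictor(self.history, action)   # predicted next state
        self.history = self.history[1:] + [nxt]      # slide the input window
        self.t += 1
        return nxt, self.t >= self.episode_len


def toy_predictor(history: List[State], action: int) -> State:
    """Hypothetical stand-in for the LSTM: dries slowly, wets when irrigated."""
    moisture, temp = history[-1]
    return (moisture + (0.8 if action else -0.2), temp)


env = SimulatedIrrigationEnv(toy_predictor, initial=(63.0, 22.0), episode_len=5)
state, done = env.reset(), False
trajectory = []
while not done:
    action = 1 if state[0] < 63.0 else 0   # placeholder policy, not SAC
    state, done = env.step(action)
    trajectory.append(state[0])
```

Keeping the predictor behind a plain callable means the same loop could later be driven by the trained network, so that no command reaches the physical PLC during training.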

3. Results and Discussion

The LSTM model in this study was constructed as a recurrent neural architecture with two hidden layers (64 neurons in each layer) and was trained on fixed-length sequences of past states: the input sequence length was 20 time steps and the prediction horizon was 1 time step ahead, so the model could accurately predict short-term changes in substrate moisture and temperature, which are critical for the SAC agent's decisions. The input feature set consisted of: (i) measured substrate moisture, (ii) substrate temperature, (iii) electrical conductivity (EC), (iv) the historical irrigation controller state signal ("watering on/off"), and (v) a normalized diurnal cycle index, which allows the model to estimate thermophysical diurnal fluctuations. All features were normalized to the interval [0, 1] using min–max normalization to keep gradients stable during training and reduce disproportions between feature scales. The data were divided according to the real-time process duration: 70–80% for training (about 140–160 real-time hours) and 20–30% for validation (40–60 h), preserving the chronological order of the time series. The network was trained for 200 epochs using the Adam optimizer with a learning rate of 3 × 10⁻⁴ and dropout regularization (0.2) to limit overfitting; the mean squared error (MSE) was used as the loss function. The resulting Validation Loss was ~0.016–0.022, indicating good generalization. LSTM was chosen because the dynamics of substrate moisture are characterized by pronounced temporal structure, inertia and dependence on several historical states, and traditional models do not capture such nonlinear and multimodal relationships: ARIMA cannot effectively model unstructured non-stationary signals, Random Forest has no internal memory of time dependencies, and a simple MLP processes sequence data poorly without additional vectorization blocks.
In preliminary tests against a simple MLP and Random Forest, LSTM showed a clear advantage: the MLP achieved an RMSE ≈ 2.4% for moisture predictions and Random Forest ≈ 2.0%, while the LSTM RMSE decreased to ≈1.14% and R² increased to 0.964. LSTM was therefore chosen as the most accurate and stable modeling solution, able to provide a reliable virtual environment for training the SAC agent and optimizing the real irrigation process.
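The data preparation described above (min–max scaling, windows of 20 past steps with a one-step-ahead target, and a chronological split) can be sketched in plain Python. The helper names, the synthetic series and the 75% split point are assumptions made for illustration; only two of the five features are shown.

```python
from typing import List, Sequence, Tuple


def min_max_scale(column: Sequence[float]) -> List[float]:
    """Scale one feature column to [0, 1]."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0          # avoid division by zero for flat signals
    return [(v - lo) / span for v in column]


def make_windows(rows: List[List[float]], target_col: int,
                 seq_len: int = 20, horizon: int = 1
                 ) -> Tuple[List[List[List[float]]], List[float]]:
    """Pair each window of seq_len rows with the target value horizon steps later."""
    X, y = [], []
    for start in range(len(rows) - seq_len - horizon + 1):
        X.append(rows[start:start + seq_len])
        y.append(rows[start + seq_len + horizon - 1][target_col])
    return X, y


def chronological_split(X, y, train_frac: float = 0.75):
    """Split without shuffling so validation data lie after the training data."""
    cut = int(len(X) * train_frac)
    return (X[:cut], y[:cut]), (X[cut:], y[cut:])


# Synthetic stand-in data: 100 time steps of moisture (%) and watering on/off.
moisture = [60.0 + 0.05 * i for i in range(100)]
watering = [float(i % 10 == 0) for i in range(100)]
# For brevity the scaling uses the whole series; fitting the min/max on the
# training part only would avoid leaking validation statistics.
columns = [min_max_scale(moisture), min_max_scale(watering)]
rows = [list(r) for r in zip(*columns)]

X, y = make_windows(rows, target_col=0)
(train_X, train_y), (val_X, val_y) = chronological_split(X, y)
```

Splitting by position rather than at random keeps the validation windows strictly later in time, which matches the stated aim of preserving the order of the time series.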

In the developed control architecture, the SAC agent operates using a clearly defined state vector and action space. The state vector consists of the main hydro-physical parameters: current substrate moisture, the moisture history of the last few time steps, substrate temperature, simulated diurnal cycle influences (hour or normalized daily index) and the most recent irrigation actions, which allow the agent to account for inertial substrate responses. All variables were normalized to the interval [0, 1] to reduce disproportions between parameter scales and ensure stable neural network training. The action space is binary: at each step the agent selects "irrigate" (1) or "do not irrigate" (0). This discrete action is directly linked to the execution command of the physical system: action "1" at the PLC level activates the relay and sends a pulse to start the water valve and motor drive, while action "0" stops the water supply and leaves the system in a passive state. This representation maintains a consistent connection between the RL policy and the actual equipment control logic, ensuring a safe and deterministic interpretation of the agent's actions in a real irrigation system.
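A minimal sketch of how such a state vector and action mapping might look is given below. The `Observation` fields, the physical ranges used for scaling, and the command dictionary keys are hypothetical; the actual PLC tag names and normalization bounds are not given in the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Observation:
    moisture_history: List[float]   # % moisture, oldest first, last = current
    temperature_c: float            # substrate temperature in deg C
    hour_of_day: float              # 0 <= hour < 24
    last_actions: List[int]         # recent 0/1 irrigation decisions


def scale(value: float, lo: float, hi: float) -> float:
    """Min-max scale to [0, 1], clipping values outside the assumed range."""
    return min(1.0, max(0.0, (value - lo) / (hi - lo)))


def build_state(obs: Observation,
                m_range: Tuple[float, float] = (0.0, 100.0),   # assumed bounds
                t_range: Tuple[float, float] = (0.0, 40.0)     # assumed bounds
                ) -> List[float]:
    state = [scale(m, *m_range) for m in obs.moisture_history]
    state.append(scale(obs.temperature_c, *t_range))
    state.append((obs.hour_of_day % 24.0) / 24.0)   # normalized daily index
    state.extend(float(a) for a in obs.last_actions)
    return state


def action_to_command(action: int) -> Dict[str, bool]:
    """Translate the binary action into hypothetical PLC output flags."""
    if action not in (0, 1):
        raise ValueError("action must be 0 or 1")
    on = action == 1
    return {"valve_open": on, "frame_motor_run": on}


obs = Observation(moisture_history=[62.5, 62.8, 63.0], temperature_c=22.0,
                  hour_of_day=12.0, last_actions=[0, 1])
state = build_state(obs)
command = action_to_command(1)
```

Rejecting any action other than 0 or 1 before it reaches the controller is one simple way to keep the interpretation of the policy output deterministic.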

The reward function for the SAC agent is formulated to simultaneously promote target moisture maintenance and water conservation, while incorporating safety penalties for extreme states. At each time step t, the agent receives a reward R_t, calculated by the expression:

$$R_t = -\alpha\,(m_t - m^{*})^{2} - \beta\,u_t - \gamma\,I\left[m_t < m_{\min} \lor m_t > m_{\max}\right]$$

The following variables are used in this expression: m_t is the current substrate moisture; m* is the target moisture (63%); m_min = 61% and m_max = 65% are the allowable moisture limits; u_t ∈ {0,1} is the irrigation action (0 for "do not irrigate", 1 for "irrigate"); α is the squared error weight; β is the water consumption penalty coefficient; γ is the safety penalty coefficient; and I[·] is the indicator function, which assigns an additional penalty when the moisture value leaves the physiologically optimal range. The reward function formed in this way encourages the agent to minimize the moisture deviation from the target, reduce episodes of excessive irrigation, and avoid both overdrying and overwatering states. In the SAC architecture, both actor and critic networks are implemented as multilayer perceptrons with two hidden layers of 128 neurons each, using ReLU activation functions in the hidden layers and a linear activation in the output layer. The Adam algorithm is used for optimization, with a learning rate of 3 × 10⁻⁴ and a discount factor of 0.99 (also denoted γ in the RL literature, and distinct from the safety penalty coefficient γ above) to prioritize long-term moisture stabilization. Soft updating with τ = 0.005 is used to synchronize the parameters of the target network. The replay buffer holds up to 10⁵ transitions, from which random mini-batches of 64 transitions are sampled, and training is performed for 200 episodes of about 150–200 time steps each. This training process allows the agent to learn typical moisture dynamics and achieve stable policy convergence.
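The reward expression and the soft target update can be written out directly. In the sketch below the target and limits (63%, 61%, 65%) and τ = 0.005 come from the text, whereas the weights `alpha`, `beta` and `gamma_safety` are placeholder values, since the coefficients actually used are not reported.

```python
from typing import List


def reward(m_t: float, u_t: int,
           alpha: float = 1.0,          # assumed squared-error weight
           beta: float = 0.1,           # assumed water-use penalty
           gamma_safety: float = 5.0,   # assumed safety penalty
           m_target: float = 63.0, m_min: float = 61.0, m_max: float = 65.0
           ) -> float:
    """R_t = -alpha*(m_t - m*)^2 - beta*u_t - gamma*I[m_t outside limits]."""
    out_of_range = 1.0 if (m_t < m_min or m_t > m_max) else 0.0
    return (-alpha * (m_t - m_target) ** 2
            - beta * u_t
            - gamma_safety * out_of_range)


def soft_update(target: List[float], online: List[float],
                tau: float = 0.005) -> List[float]:
    """Polyak averaging: target <- tau * online + (1 - tau) * target."""
    return [tau * w + (1.0 - tau) * w_t for w, w_t in zip(online, target)]


r_on_target = reward(63.0, 0)    # no deviation, no watering
r_in_band = reward(64.0, 1)      # inside the band, watering penalized
r_too_dry = reward(60.0, 1)      # below m_min, safety penalty applies
new_target = soft_update(target=[0.0, 1.0], online=[1.0, 1.0])
```

With these placeholder weights the safety term dominates once moisture leaves the 61–65% band, which is the intended ordering of priorities; the relative sizes of the three coefficients determine how strongly water saving is traded against tracking accuracy.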

The obtained results allow us to evaluate the effectiveness and adaptive capabilities of the automated substrate irrigation system integrated with the SAC reinforcement learning agent in real-time conditions. The data presented in this study include fluctuations in substrate moisture, temperature, as well as the learning progress of the SAC agent and recommended irrigation actions. The analysis allows us not only to evaluate the system’s ability to maintain optimal hydro-physical conditions for mushroom growth, but also to discuss the advantages of the reinforcement learning method compared to traditional PID-type or manual control methods [1,2,3,4,5]. The visualizations of the results and their discussion below help us understand how the cyber-bioprocess platform contributes to process optimization. The research data are presented for a weekly period, dividing the week into hours. This allowed us to better identify various emerging effects and trends related to parameter changes.

The obtained substrate moisture forecast results reveal (Figure 2) that the LSTM–SAC-based automated substrate irrigation system ensures consistent maintenance of hydro-physical parameters in the optimal range (about 61–65% moisture), which is physiologically favorable for the metabolic activity of the mycelium and the morphological stability of the fruiting bodies. During the initial forecast period (0–5 h), a small decrease in substrate moisture is recorded, reflecting the natural effect of water potential gradients and diffusion processes in the substrate with minimal active irrigation. From 6 to 18 h, the moisture increases observed correlate with adaptive irrigation initiated by the SAC agent, which, based on the forecasts of the LSTM model and real-time sensor data, adjusts the hydrological intervention in such a way as to maintain the homogenization of substrate water activity [1,2,3]. The fluctuations in the middle forecast period (19–100 h) demonstrate the dynamic ability of the agent to compensate for both stochastic microclimate perturbations and heterogeneous substrate gradients. During this period, the moisture curves maintain a consistent amplitude, reflecting a stable substrate aeration and water balance, avoiding the formation of anaerobic microzones or excessive hydration that could disrupt the kinetics of mycelium colonization. During the end forecast period (101–167 h), a subtle moisture consolidation was observed around the 63–65% interval, which testifies to the stability and predictive efficiency of the adaptive control algorithm. These results confirm that the integration between the LSTM predicting hydro-physical parameters and the SAC agent adapting irrigation in real-time ensures self-regulating maintenance of substrate moisture in the optimal range.

In addition to laboratory experiments, initial tests of the system were also conducted in industrial mushroom cultivation chambers, which were exposed to realistic microclimate fluctuations and higher biological variability. These practical tests showed that the control algorithm is able to maintain substrate moisture in a similar range as in the laboratory, and moisture fluctuations were reduced by approximately 20–30% compared to conventional manual watering.

The obtained forecast data reflect the dynamic thermophysical state of the substrate during the forecast period, where the temperature varies from 12.2 °C to 25.5 °C, maintaining a physiologically appropriate range for mycelial metabolic activity and fruiting body formation (Figure 3). The initial forecast hours (0–5 h) show a gradual increase in temperature, corresponding to the accumulation of thermal potential in the substrate mass and the beginning of the night–day cycle. Temperature maxima (about 25 °C) are recorded in the late daytime period (16–18 h), reflecting the optimal combination of microclimate conditions in the substrate thermal balancing process. Temperature fluctuations in the middle of the forecast (19–100 h) reflect both natural day–night temperature cycles and adaptive corrections by the SAC agent to maintain the substrate hydrothermal parameters in the optimal range. During this period, the system keeps temperature gradients within a tolerable range, avoiding both thermal stress for the mycelium and excessively slow metabolism due to low temperatures. The end of the forecast period (101–167 h) showed stable temperature maintenance in the range of 22–25 °C, with subtle fluctuations reflecting both the agent's predictive intervention and the heterogeneity of the thermophysical properties of the substrate. Temperature forecasts allow the SAC agent to adjust watering scenarios more effectively in response to natural day–night cycles, thus ensuring optimal substrate hydration and water balance [2,3,4,5,6].

Analyzing the presented LSTM training results (Figure 4), it can be seen that the model starts learning quickly—already during the first 2–3 epochs, both training and validation losses significantly decrease from the initial 0.21 to approximately 0.045, which indicates a rapid optimization of the initial weights and the ability to learn the main data patterns. Furthermore, a slowing decrease in losses is observed with certain fluctuations, which usually occur due to mini-batch randomness and possible seasonality in time. During training, Train Loss usually decreases to the 0.007–0.010 range, and Validation Loss remains stable in the range of approximately 0.016–0.022, which indicates that the model is not over-fitted to the training data (overfitting is not significant), but there is a slightly larger difference between training and validation losses, which is natural due to data variation. In some cases, short-term increases in Train Loss are visible (e.g., at epochs 52, 61, 77, 93, 105, 117, 145, 185), which may be related to gradient fluctuations or local minimization problems in LSTM networks—such phenomena are common in deep time series models [1,2,3,4,5,35]. The overall consistent decrease in loss and stable validation loss level indicate that the LSTM model has successfully learned to predict humidity, temperature and irrigation actions and can be reliable for use as a simulation model for the SAC agent to control the irrigation system. It can also be observed that small drops in Train Loss to very small values (about 0.008–0.010) indicate that the model adapts very well to the sequence structure, and the stability of Validation Loss confirms proper generalization to new data.

After evaluating the predictions of the LSTM model for substrate moisture and temperature parameters, the calculated main regression accuracy indicators showed high model reliability and good generalization. For substrate moisture predictions, MAE = 0.72%, RMSE = 1.14%, MAPE = 1.98%, and the coefficient of determination R² reached 0.964, which indicates a very accurate reproduction of short-term moisture changes and a small deviation of the predictions from the actual values. The accuracy of temperature predictions also remained high: MAE = 0.18 °C, RMSE = 0.31 °C, MAPE = 1.22%, R² = 0.981, allowing for reliable modeling of substrate thermal trends. These results confirm that the LSTM model successfully captures the dynamics of hydro-physical parameters and can be effectively used as a virtual environment for the SAC agent to optimize real-time moisture regulation.
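For reference, the four accuracy indicators reported above follow the standard definitions, which can be computed as in the sketch below. The sample values are invented solely to exercise the function and are unrelated to the study's measurements.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float],
                       y_pred: Sequence[float]) -> Dict[str, float]:
    """Return MAE, RMSE, MAPE (in %) and the coefficient of determination R^2."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE assumes no true value is zero (reasonable for moisture in %).
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


measured = [62.0, 63.0, 64.0, 63.0]     # hypothetical moisture values (%)
predicted = [62.5, 63.0, 63.5, 63.0]
metrics = regression_metrics(measured, predicted)
```

Because MAE and RMSE are expressed in the units of the measured quantity, a moisture RMSE of 1.14% means roughly one percentage point of moisture, which is small relative to the 4-point width of the 61–65% target band.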

The SAC agent was trained using the predictions of the LSTM model as environmental observations (state), and the agent’s actions (action) as decisions to turn on or off irrigation. Two main indicators were monitored during training: the total episodic reward (Reward) and the entropy regulator coefficient (Alpha), which controls the randomness of the action policy (Figure 5). The reward in this SAC training process is defined as the decrease in moisture error: the difference between the moisture error before and after the action is calculated, and the resulting improvement is multiplied by 2, thus providing a positive reward if the system approaches the target value of 55%. In addition, bonuses are given for moisture stability (when the error is less than 10%), positive additions if watering improves the situation when the moisture is too low, and small penalties if the actions worsen the situation (e.g., overwatering or underwatering within critical limits). Due to this structure, the total reward range in practice varies from about −3 to +6 per step, and the total rewards for one episode usually fall in the range of about −50 to +150, depending on how effectively the agent stabilizes moisture. Analyzing the sequence of episodes, it can be seen that in the initial learning stage (episodes 1–20), the agent’s reward fluctuated between 20 and 32, reflecting the initial phases of experimentation when the agent just started to explore different irrigation strategies. Over time (episodes 20–100), the total reward steadily increased, usually fluctuating between 50 and 60 units. This indicates that the agent learned to control substrate moisture more effectively, reducing deviations from the reference target value used during training (initially set to 55% in the LSTM-based virtual environment and subsequently aligned with the physiologically optimal 61–65% interval applied in real hardware tests). 
In later episodes (episodes 100–200), a further increase in reward was observed, sometimes exceeding 90–100 units, demonstrating the agent’s ability to maximize humidity stabilization under dynamically changing conditions. This trend indicates that the agent has successfully mastered an adaptive control strategy that not only maintains the target humidity but also ensures stability under changing environmental conditions [1].

The Alpha coefficient was initially high (~0.10), which indicates high entropy of the action policy and active experimentation with various irrigation strategies. Over time, Alpha decreased (to ~0.02–0.03 in episodes 160–200), indicating that the agent gradually moved from exploration toward near-deterministic action selection (exploitation) and consistent moisture maintenance. This decrease in entropy coincides with the increase in reward, which suggests that the agent optimized the policy by reducing unnecessary variability. The learning process was consistent and stable, as shown by the dynamics of both the total reward and the Alpha coefficient: no sudden drops or extreme fluctuations were observed. The agent adapted effectively to the various substrate moisture and temperature conditions predicted by the LSTM. These results indicate that the SAC method is suitable for adaptive and efficient substrate moisture management: high entropy during the initial phase allowed the agent to try different strategies, and the lower Alpha value in later episodes signals stable and consistent action selection. The overall increase in reward and stability indicates that the SAC agent can be applied to automated irrigation systems, providing both adaptation to changing environmental conditions and consistent moisture regulation [1,2,3].
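For reference, the role of Alpha can be read from the standard maximum-entropy objective on which SAC is based. The expressions below follow the general SAC formulation and are not formulas given in this study; the discount factor \gamma, the target entropy \bar{\mathcal{H}}, and the use of automatic temperature tuning are assumptions about the implementation.

```latex
% Maximum-entropy objective: reward plus alpha-weighted policy entropy
J(\pi) = \mathbb{E}_{\pi}\left[ \sum_{t} \gamma^{t} \Big( r(s_t, a_t)
         + \alpha \, \mathcal{H}\big(\pi(\cdot \mid s_t)\big) \Big) \right]

% Temperature objective, minimized with respect to alpha under automatic tuning
J(\alpha) = \mathbb{E}_{a_t \sim \pi}\left[ -\alpha \left( \log \pi(a_t \mid s_t)
            + \bar{\mathcal{H}} \right) \right]
```

A large Alpha weights the entropy term heavily and keeps the policy close to random. As Alpha falls toward ~0.02–0.03, the reward term dominates and action selection becomes nearly deterministic, which is consistent with the trend in Figure 5.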

The irrigation control sequence generated by the SAC agent shows the adaptive control behavior characteristic of the Soft Actor-Critic methodology (Figure 6). The agent implements an optimized stochastic policy in which actions (irrigation device activation “1” or deactivation “0”) are selected based on both the evaluation of the critics (Critic) and entropy regulation (Alpha), ensuring an appropriate balance between exploration and exploitation. The analyzed control sequences reflect an adaptive feedback loop: the agent activates the irrigation device only when the predicted substrate moisture deviates from the optimal range (about 61–65%) and deactivates it when the moisture approaches the target value. The alternation of on/off sequences reflects the agent’s ability to implement “soft policy” decisions that maintain a stable moisture level and avoid overly aggressive or chaotic irrigation cycles. In addition, the agent responds dynamically to predicted environmental conditions, such as substrate moisture and temperature, keeping the deviation from the target value minimal. The overall control profile shows that the SAC agent effectively adjusts the irrigation frequency and intensity, balancing exploration and exploitation, maximizing water use efficiency, and maintaining system stability under changing environmental conditions. This confirms the suitability of the SAC method for adaptive and automated substrate moisture regulation that integrates forecasting models (LSTM) and real-time decision making. Statistical analysis comparing the automated substrate irrigation system, which integrates the SAC reinforcement learning agent, with traditional automated systems without the SAC agent (reviewed in the scientific literature) revealed significant differences in adaptation, efficiency, and stability [1,2,3,4,5].
In systems controlled by the SAC agent, the substrate moisture is stably maintained in the range of 61–65%, while in traditional systems, moisture fluctuations reached ±10–12% of the target value range. LSTM predictions used for SAC agent training allowed for consistent compensation of stochastic moisture fluctuations, whereas traditional automatic systems usually respond only to real-time measurements and therefore cannot ensure the same accuracy and homogenization throughout the substrate volume. Statistical analysis shows that the integration of the SAC agent significantly improves the stability of substrate irrigation and reduces water waste (by up to 10–20%) and deviations from optimal hydro-physical parameters, which directly contributes to more uniform mycelium colonization and greater consistency of product quality. Although the irrigation sequence in Figure 6 is shown separately, it directly reflects the operation of the closed-loop control: irrigation pulses appear when the predicted humidity approaches the lower limit, after which the humidity value rises and stabilizes. Therefore, even without a combined graph, the relationship between irrigation actions and the humidity trajectory is clearly visible.

Table 1 demonstrates a clear performance advantage of the SAC-based controller over the tuned PID controller in both moisture regulation accuracy and dynamic response characteristics. The SAC policy reduced the mean absolute deviation from the target moisture level from 2.85% to 1.12%, corresponding to a 60.7% improvement, and achieved a similarly substantial reduction in RMSE (56.7%). The lower standard deviation (0.98% compared with 2.14% for PID) indicates that SAC maintained the moisture level more consistently with fewer fluctuations. In terms of transient behavior, the SAC controller significantly minimized maximum overshoot (1.9% vs. 5.8%), reducing excessive wetting by 67.2%. Dynamic recovery was also markedly improved: the settling time after a disturbance decreased from 46 min under PID control to only 21 min with SAC, while the average response time more than halved (8.7 s vs. 19.3 s). Together, these results confirm that the SAC controller not only provides more precise and stable moisture regulation but also responds faster and more efficiently to disturbances, demonstrating a substantially superior overall control performance compared with the traditional PID approach.
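The improvement percentages in Table 1 follow from the relative reduction (PID − SAC) / PID. The short sketch below recomputes them from the tabulated values; the function name `improvement_pct` and the dictionary layout are illustrative choices, not part of the original study.

```python
def improvement_pct(baseline, controlled):
    """Relative reduction of `controlled` with respect to `baseline`, in percent."""
    return round((baseline - controlled) / baseline * 100.0, 1)


# (PID, SAC) pairs taken from Table 1.
TABLE_1 = {
    "Mean Absolute Deviation, %": (2.85, 1.12),
    "RMSE, %": (3.42, 1.48),
    "Standard Deviation, %": (2.14, 0.98),
    "Maximum Overshoot, %": (5.8, 1.9),
    "Settling Time After Disturbance, min": (46, 21),
    "Average Response Time, s": (19.3, 8.7),
}

for metric, (pid, sac) in TABLE_1.items():
    print(f"{metric}: {improvement_pct(pid, sac)}%")
```

The recomputed values agree with the Improvement (%) column of Table 1 to one decimal place.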

Table 2 shows that the SAC controller also provides clear advantages in terms of water-use efficiency compared with the PID controller. Over the seven-day evaluation period, the SAC-based system consumed 13.9 L of water, which represents a 20.1% reduction relative to the 17.4 L used by the PID controller. This improvement is further reflected in the decreased number of irrigation cycles, with SAC requiring only 47 cycles versus 62 under PID control—a reduction of 24.2%. Additionally, the SAC controller shortened the average pump runtime per cycle from 11.2 s to 9.3 s, indicating a more targeted and efficient water delivery strategy. Overall, these results demonstrate that the SAC policy not only maintains more stable moisture conditions but does so while significantly reducing total water consumption and irrigation system workload, highlighting its effectiveness in resource-efficient substrate management.
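Combining the cycle counts and average runtimes in Table 2 gives the total pump-on time over the seven days, a quantity that is not reported in the source and is derived here only as an illustration of the reduced system workload. The variable and function names are hypothetical.

```python
def total_runtime_s(cycles, avg_runtime_s):
    """Total pump-on time as number of cycles times mean runtime per cycle."""
    return cycles * avg_runtime_s


# Values from Table 2 (seven-day evaluation period).
pid_total = total_runtime_s(62, 11.2)   # PID: 62 cycles, 11.2 s per cycle
sac_total = total_runtime_s(47, 9.3)    # SAC: 47 cycles, 9.3 s per cycle

# Relative reduction in total pump-on time (derived, not reported in the source).
reduction_pct = (pid_total - sac_total) / pid_total * 100.0

print(f"PID total: {pid_total:.1f} s, SAC total: {sac_total:.1f} s, "
      f"reduction: {reduction_pct:.1f}%")
```

Under these figures the pump runs for roughly 694 s under PID control and roughly 437 s under SAC control, a reduction of about 37% in pump-on time.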

The moisture time-series plot presented in Figure 7 clearly highlights the difference in stability between the PID and SAC controllers over a 24 h period. In this figure, the PID curve exhibited pronounced fluctuations, ranging from 58.9% to 67.3%, frequently overshooting and undershooting the 63% target moisture level. These oscillations indicate a less stable control response and a higher sensitivity to disturbances or substrate variability. In contrast, the SAC curve remained tightly clustered around the target, varying only between 62.0% and 63.8% throughout the entire monitoring interval. This substantially narrower variability band demonstrates that the SAC controller maintains moisture levels with much greater precision and temporal consistency. Overall, Figure 7 illustrates that SAC effectively dampens deviations, minimizes oscillations, and provides a significantly more stable moisture-regulation profile compared with the traditional PID controller.

To evaluate the significance of the performance differences between the SAC-based controller and the tuned PID controller, a statistical analysis was conducted using data collected over seven consecutive days for both controllers under identical environmental conditions. For each day, mean absolute deviation (MAD) and total water consumption were calculated, and the resulting seven-day datasets were compared using a two-sample t-test assuming unequal variances. The SAC controller achieved a substantially lower MAD (1.12% ± 0.28) than the PID controller (2.85% ± 0.41), and the difference was statistically significant (t(12) = 8.43, p < 0.001), indicating superior moisture regulation accuracy. Similarly, the SAC controller demonstrated significantly reduced water usage (13.9 ± 0.9 L) compared with PID (17.4 ± 1.1 L), with the difference again being statistically significant (t(12) = 6.02, p < 0.001). These results confirm that the improvements observed in control precision and resource efficiency are not due to random variation but reflect a consistent and statistically reliable advantage of the SAC policy over the traditional PID approach.
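A two-sample t-test with unequal variances (Welch's test) can be sketched from summary statistics as follows. This is an illustration only: it assumes that the reported ± values are sample standard deviations over n = 7 daily values per controller, and because the daily data are not published, the output need not coincide with the statistics reported above. The function name `welch_t` is hypothetical.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard errors of the two sample means.
    se2_a = sd_a ** 2 / n_a
    se2_b = sd_b ** 2 / n_b

    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)

    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# MAD comparison: PID (2.85 +/- 0.41) vs. SAC (1.12 +/- 0.28), assumed n = 7 each.
t_mad, df_mad = welch_t(2.85, 0.41, 7, 1.12, 0.28, 7)

# Water use comparison: PID (17.4 +/- 1.1 L) vs. SAC (13.9 +/- 0.9 L).
t_water, df_water = welch_t(17.4, 1.1, 7, 13.9, 0.9, 7)

print(f"MAD: t = {t_mad:.2f}, df = {df_mad:.1f}")
print(f"Water: t = {t_water:.2f}, df = {df_water:.1f}")
```

With raw daily values available, the same test is provided by `scipy.stats.ttest_ind` with `equal_var=False`.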

To reduce the need for large datasets (100–200 h) and GPU resources, several lighter versions of the SAC–LSTM system, aimed at small and medium-sized enterprises, were evaluated. First, model pruning reduced the number of parameters of both the LSTM and SAC neural networks by 30–60%, significantly lowering the computational load and enabling the model to be trained in a conventional CPU environment without dedicated GPU hardware. Second, with transfer learning, the LSTM model can be trained once on a large-scale farm and then only lightly fine-tuned on smaller farms using 5–10% of the data, which removes the need for large real-time observation databases. Third, a simplified version of SAC (Light-SAC), using fewer neurons (32–64 instead of 128), a reduced replay buffer, and smaller batch sizes, can be implemented even without a GPU and maintains sufficient control accuracy on small-scale farms. In parallel, alternative, cheaper methodologies were evaluated: hybrid DDPG–PID control, in which PID ensures basic stability and DDPG corrects only small errors, allowing the system to run even on simple PLCs without large computational resources; a GRU + Q-learning method, which reduces the number of network parameters by ~30% and does not require a complex SAC architecture; and combinations of rule-based control with a small ML module, which are especially suitable for small farms.
To assess the applicability of these methods, a cost–benefit analysis was prepared for different farm sizes: for small-scale (≤10 m²) farms, for which 10–20 h of data and only PLC equipment are sufficient, the DDPG–PID method is most suitable, providing 5–8% water savings; for medium-scale (10–50 m²) farms with a CPU computer and 40–80 h of data, the optimal solution is Light-SAC with GRU, providing 15–20% greater stability; and for large farms (≥50 m²) with GPU resources and 120–200 h of data, the full SAC–LSTM architecture remains the most effective, capable of increasing performance by 30–40%. This analysis makes the methods directly comparable and shows that the proposed system can be adapted to the different technological and economic capacities of farms.
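The farm-size guidance above can be expressed as a simple lookup. The sketch below is illustrative: the class and function names are hypothetical, and because the stated size classes share their boundary values (10 m² and 50 m²), it assumes that exactly 10 m² counts as small and exactly 50 m² counts as large.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recommendation:
    method: str
    data_hours: tuple      # (minimum, maximum) hours of training data
    hardware: str
    reported_benefit: str


def recommend(area_m2):
    """Map cultivation area to the configuration suggested in the text."""
    if area_m2 <= 0:
        raise ValueError("area_m2 must be positive")
    if area_m2 <= 10:   # small-scale farms (assumption: 10 m² counts as small)
        return Recommendation("hybrid DDPG-PID", (10, 20),
                              "PLC only", "5-8% water savings")
    if area_m2 < 50:    # medium-scale farms
        return Recommendation("Light-SAC with GRU", (40, 80),
                              "CPU computer", "15-20% greater stability")
    # Large farms (assumption: 50 m² counts as large).
    return Recommendation("full SAC-LSTM", (120, 200),
                          "GPU resources", "30-40% performance increase")


for area in (8, 30, 120):
    rec = recommend(area)
    print(f"{area} m2 -> {rec.method} ({rec.hardware}, "
          f"{rec.data_hours[0]}-{rec.data_hours[1]} h of data)")
```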

In addition to the comparison with traditional PID controllers, the proposed SAC–LSTM irrigation system was evaluated in the broader context of state-of-the-art AI-based moisture regulation methods, including IoT–fuzzy logic hybrids, Deep Q-Network (DQN) agents, and Proximal Policy Optimization (PPO) controllers. Recent studies show that fuzzy-logic IoT systems provide low computational overhead but lack adaptability and cannot anticipate moisture dynamics, while DQN and PPO improve control precision yet remain sensitive to noise, discrete action limitations (DQN), and high training costs (PPO). Compared with these approaches, the SAC–LSTM architecture achieved substantially higher stability and accuracy: its mean moisture deviation (1.12%) and standard deviation (0.98%) were lower than those typically reported for fuzzy logic (2.4–3.1%), DQN (2.1–2.8%), and PPO (1.7–2.3%) systems, and its water-use efficiency improvement (20.1%) exceeded the typical gains of alternative RL controllers. These advantages stem from SAC’s entropy-regularized stochastic policy, which ensures a superior balance between exploration and exploitation under noisy biological conditions, and from the LSTM prediction model, which enables anticipatory rather than reactive irrigation decisions; this combination is not present in fuzzy, DQN, or PPO controllers. Together, this comparison demonstrates that the SAC–LSTM system provides a more stable, adaptive, and resource-efficient solution than current AI-driven irrigation methods, reinforcing its suitability for real-time substrate moisture control in industrial mushroom cultivation.

Comparing irrigation systems operating only according to automatic control algorithms and systems with an integrated SAC agent, significant differences in terms of adaptation, efficiency and stability are evident. Traditional automated systems usually operate according to fixed thresholds or simple rule sets, responding to directly measured moisture, but are unable to adequately adapt to changing environmental conditions, forecasted moisture fluctuations or temperature variations. Due to these limitations, such systems often lead to over- or under-irrigation, suboptimal water use and larger deviations from the desired substrate moisture value, as they do not have the ability to “learn” from previous reactions and adjust the dynamic parameters of the strategy. Meanwhile, SAC agent-controlled systems operate according to adaptive “reinforcement learning” logic, where the agent optimizes actions to maintain moisture close to the target value and effectively respond to predicted environmental changes. The ability of the SAC agent to balance between exploitation and exploration, adapt to fluctuations, and consistently regulate irrigation intensity ensures greater system stability, water use efficiency, and lower risk of both over-drying and over-irrigation, making such systems inherently superior to traditional automatic irrigation systems.

4. Conclusions

4.1. Main Technical Results

The study confirmed that the integration of a Soft Actor-Critic (SAC) reinforcement learning agent with a recursive LSTM prediction model significantly increases the efficiency and stability of an automated mushroom substrate irrigation system. The SAC agent, trained in a virtual LSTM-based environment and deployed on a PLC-controlled physical system, maintained substrate moisture consistently within the optimal 61–65% range and temperature within 22–25 °C. The LSTM model achieved low prediction error (Validation Loss ~0.016–0.022), while the SAC agent demonstrated stable learning progress, increasing episodic rewards from 20–32 to 90–100 across 200 episodes, and transitioning from exploratory behavior (Alpha ~0.10) to stable policy execution (Alpha ~0.02–0.03). Compared with traditional automated controllers, the SAC-based system reduced moisture deviations by 35–40%, eliminated overwatering and drying episodes, and improved moisture uniformity throughout the substrate volume.

4.2. Practical Application

The developed cyber-bioprocess platform can be directly applied in industrial mushroom cultivation environments that rely on PLC-based automation and real-time hydrophysical monitoring. Because the system uses standard Siemens S7-1200 hardware, CS650 sensors, and Ethernet communication infrastructure, it can be integrated into existing cultivation chambers without major modifications. Remote monitoring via a web interface and automated anomaly detection reduce operator workload, improve process repeatability, and increase product quality stability. The same methodological approach—LSTM forecasting combined with SAC-based adaptive control—can be extended to other substrates and controlled agricultural systems.

4.3. Limitations

Despite its effectiveness, the system has several limitations. Its performance depends on accurate moisture sensor calibration, and measurement drift or electromagnetic interference may affect the SAC agent’s decisions. The training process requires sufficiently large datasets (100–200 h of real-time observations) and computing resources for LSTM–SAC optimization. In addition, the experiments were conducted in a controlled laboratory setting; industrial-scale production may introduce uncontrolled factors such as substrate contamination, variation in mycelium strain behavior, and microclimate instability.

4.4. Recommendations

Future work should focus on extending the LSTM–SAC framework to larger and more heterogeneous cultivation chambers, applying transfer learning between different substrate types, and validating long-term performance under industrial operating conditions. Additional research should also explore alternative reinforcement learning architectures and forecasting models that could further improve adaptability and resource efficiency. This would strengthen the general applicability of learning-based irrigation control in modern controlled-environment agriculture.

Author Contributions

Methodology, Ž.K. and G.B.; Software, Ž.K. and G.B.; Formal analysis, G.B. and I.Š.; Writing—original draft, Ž.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

1. Zhang, M.; Huang, Y.; Xie, D.; Huang, R.; Zeng, G.; Liu, X.; Deng, H.; Wang, H.; Lin, Z. Machine learning constructs color features to accelerate development of long-term continuous water quality monitoring. J. Hazard. Mater. 2024, 461, 132612.
2. Irwanto, F.; Hasan, U.; Lays, S.; Croix, N.; Mukanyiligira, D.; Sibomana, L.; Ahmad, T. IoT and fuzzy logic integration for improved substrate environment management in mushroom cultivation. Smart Agric. Technol. 2024, 7, 100427.
3. Barauskas, R.; Kriščiūnas, A.; Čalnerytė, D.; Pilipavičius, P.; Fyleris, T.; Daniulaitis, V.; Mikalauskis, R. Approach of AI-Based Automatic Climate Control in White Button Mushroom Growing Hall. Agriculture 2022, 12, 1921.
4. Sujatanagarjuna, A.; Kia, S.; Briechle, F.; Leiding, B. MushR: A Smart, Automated, and Scalable Indoor Harvesting System for Gourmet Mushrooms. Agriculture 2023, 13, 1533.
5. Nguyen, H.H.; Shin, D.Y.; Jung, W.S.; Kim, T.Y.; Lee, D.H. An Integrated IoT Sensor-Camera System toward Leveraging Edge Computing for Smart Greenhouse Mushroom Cultivation. Agriculture 2024, 14, 489.
6. Ali, A.; Hussain, T.; Zahid, A. Smart Irrigation Technologies and Prospects for Enhancing Water Use Efficiency for Sustainable Agriculture. AgriEngineering 2025, 7, 106.
7. Padmanabha, M.; Streif, S. Design and Validation of a Low Cost Programmable Controlled Environment for Study and Production of Plants, Mushroom, and Insect Larvae. Appl. Sci. 2019, 9, 5166.
8. Koirala, B.; Zakeri, A.; Kang, J.; Kafle, A.; Balan, V.; Merchant, F.A.; Benhaddou, D.; Zhu, W. Robotic Button Mushroom Harvesting Systems: A Review of Design, Mechanism, and Future Directions. Appl. Sci. 2024, 14, 9229.
9. Chong, J.L.; Chew, K.W.; Peter, A.P.; Ting, H.Y.; Show, P.L. Internet of Things (IoT)-Based Environmental Monitoring and Control System for Home-Based Mushroom Cultivation. Biosensors 2023, 13, 98.
10. Moysiadis, V.; Kokkonis, G.; Bibi, S.; Moscholios, I.; Maropoulos, N.; Sarigiannidis, P. Monitoring Mushroom Growth with Machine Learning. Agriculture 2023, 13, 223.
11. Wang, Y.; Yang, L.; Chen, H.; Hussain, A.; Ma, C.; Al-gabri, M. Mushroom-YOLO: A Deep Learning Algorithm for Mushroom Growth Recognition Based on Improved YOLOv5 in Agriculture 4.0. In Proceedings of the 2022 IEEE 20th International Conference on Industrial Informatics (INDIN), Perth, WA, Australia, 25–28 July 2022; IEEE: New York, NY, USA, 2022; pp. 239–244.
12. Broussard, W. How to Grow Mushrooms in Buckets & Containers. 2023. Available online: https://northspore.com/blogs/the-black-trumpet/growing-mushrooms-in-buckets-containers (accessed on 30 June 2023).
13. Shields, T. Grow Mushrooms Easy in a 5 Gallon Bucket - Freshcap. 2023. Available online: https://learn.freshcap.com/growing/bucket-grow/ (accessed on 30 June 2023).
14. GreenDelta. OpenLCA - Github Repository. 2023. Available online: https://github.com/GreenDelta/olca-app (accessed on 30 June 2023).
15. AGRIBALYSE Program. Agribalyse Dataset 3.0. 2022. Available online: https://nexus.openlca.org/database/Agribalyse (accessed on 30 June 2023).
16. Mansouri, Y.; Babar, M.A. A Review of Edge Computing: Features and Resource Virtualization. J. Parallel Distrib. Comput. 2021, 150, 155–183.
17. Köksal, Ö.; Tekinerdogan, B. Architecture Design Approach for IoT-Based Farm Management Information Systems. Precis. Agric. 2019, 20, 926–958.
18. Murakami, E.; Saraiva, A.M.; Ribeiro, L.C.M.; Cugnasca, C.E.; Hirakawa, A.R.; Correa, P.L.P. An Infrastructure for the Development of Distributed Service-Oriented Information Systems for Precision Agriculture. Comput. Electron. Agric. 2007, 58, 37–48.
19. Strobel, G. Farming in the Era of Internet of Things: An Information System Architecture for Smart Farming. In Proceedings of the 15th International Conference on Wirtschaftsinformatik, Potsdam, Germany, 8–11 March 2020; pp. 208–223.
20. Fountas, S.; Carli, G.; Sørensen, C.G.; Tsiropoulos, Z.; Cavalaris, C.; Vatsanidou, A.; Liakos, B.; Canavari, M.; Wiebensohn, J.; Tisserye, B. Farm Management Information Systems: Current Situation and Future Perspectives. Comput. Electron. Agric. 2015, 115, 40–50.
21. Shamshiri, R.R.; Balasundram, S.K.; Rad, A.K.; Sultan, M.; Hameed, I.A. An Overview of Soil Moisture and Salinity Sensors for Digital Agriculture Applications; IntechOpen: Rijeka, Croatia, 2022.
22. Comegna, A.; Hassan, S.B.M.; Coppola, A. Development and Application of an IoT-Based System for Soil Water Status Monitoring in a Soil Profile. Sensors 2024, 24, 2725.
23. Lloret, J.; Sendra, S.; Garcia, L.; Jimenez, J.M. A Wireless Sensor Network Deployment for Soil Moisture Monitoring in Precision Agriculture. Sensors 2021, 21, 7243.
24. Samreen, T.; Ahmad, M.; Baig, M.T.; Kanwal, S.; Nazir, M.Z.; Sidra-Tul-Muntaha. Remote Sensing in Precision Agriculture for Irrigation Management. Environ. Sci. Proc. 2022, 23, 31.
25. Ozdogan, M.; Yang, Y.; Allez, G.; Cervantes, C. Remote Sensing of Irrigated Agriculture: Opportunities and Challenges. Remote Sens. 2010, 2, 2274–2304.
26. Poudel, U.; Stephen, H.; Ahmad, S. Evaluating Irrigation Performance and Water Productivity Using EEFlux ET and NDVI. Sustainability 2021, 13, 7967.
27. Hu, X.; Pan, Z.; Lv, S. Picking Path Optimization of Agaricus bisporus Picking Robot. Math. Probl. Eng. 2019, 2019, 8973153.
28. Tabatabaei, S. Dorna for Picking Mushrooms. Available online: https://dorna.ai/case-study/dorna-for-picking-mushrooms/ (accessed on 27 September 2024).
29. Recchia, A.; Strelkova, D.; Urbanic, J.; Kim, E.; Anwar, A.; Murugan, A.S. A Prototype Pick and Place Solution for Harvesting White Button Mushrooms Using a Collaborative Robot. Robot. Rep. 2023, 1, 67–81.
30. Zhong, M.; Han, R.; Liu, Y.; Huang, B.; Chai, X.; Liu, Y. Development, Integration, and Field Evaluation of an Autonomous Agaricus bisporus Picking Robot. Comput. Electron. Agric. 2024, 220, 108871.
31. Rajendran, V.; Debnath, B.; Mghames, S.; Mandil, W.; Parsa, S.; Parsons, S.; Amir, G.-E. Towards Autonomous Selective Harvesting: A Review of Robot Perception, Robot Design, Motion Planning and Control. J. Field Robot. 2023, 41, 2247–2279.
32. Zhao, Y.; Gong, L.; Liu, C.; Huang, Y. Dual-Arm Robot Design and Testing for Harvesting Tomato in Greenhouse. IFAC-PapersOnLine 2016, 49, 161–165.
33. Retsinas, G.; Efthymiou, N.; Anagnostopoulou, D.; Maragos, P. Mushroom Detection and Three Dimensional Pose Estimation from Multi-View Point Clouds. Sensors 2023, 23, 3576.
34. Rahman, H.; Faruq, M.O.; Abdul, H.; Bin, T.; Rahman, W.; Hossain, M.M.; Hasan, M.; Islam, S.; Moinuddin, M.; Islam, M.T.; et al. IoT Enabled Mushroom Farm Automation with Machine Learning to Classify Toxic Mushrooms in Bangladesh. J. Agric. Food Res. 2022, 7, 100267.
35. Subedi, A.; Luitel, A.; Baskota, M.; Acharya, T.D. IoT Based Monitoring System for White Button Mushroom Farming. Proceedings 2020, 42, 46.

Figure 1. General concept of the system: (a) general diagram of physical components; (b) general system operation diagram.

Figure 2. Predictions of substrate moisture changes over time.

Figure 3. Substrate temperature forecast over time.

Figure 4. PyTorch model training (200 epochs with a batch size of 16).

Figure 5. SAC agent learning results: reward and entropy coefficient (Alpha) per episode.

Figure 6. SAC agent optimized management suggestions for substrate irrigation system.

Figure 7. 24 h moisture variation under PID and SAC control.

Table 1. Moisture control accuracy and dynamic performance.

Metric	PID	SAC	Improvement (%)
Mean Absolute Deviation (MAD), %	2.85	1.12	60.7%
RMSE, %	3.42	1.48	56.7%
Standard Deviation, %	2.14	0.98	54.2%
Maximum Overshoot, %	5.8	1.9	67.2%
Settling Time After Disturbance (min)	46	21	54.3%
Average Response Time (s)	19.3	8.7	54.9%

Table 2. Water consumption over 7 days.

Metric	PID	SAC	Improvement (%)
Total Water Used (L)	17.4	13.9	20.1%
Number of Irrigation Cycles	62	47	24.2%
Average Pump Runtime per Cycle (s)	11.2	9.3	17.0%

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

