2. Materials and Methods
This section describes the overall research procedure for predicting and visualizing virtual sensor data in a digital twin-based smart swine barn environment. The study aims to implement a digital twin framework that integrates the Physical World and the Virtual World using real barn data, while designing and validating a hybrid prediction model that reflects spatio-temporal correlations.
2.1. Data Collection and Preprocessing
The experimental barn was a single-story finisher pig house located in Jeollanam-do, Korea. The internal floor area measured approximately 9 m in width and 25 m in length. A total of 30 fattening pigs (Sus scrofa domesticus) were housed in the facility under a breeding-to-farrowing management system. The interior was operated as two environmental zones corresponding to Zones 1 and 2, and the animals were group-housed within these zones throughout the monitoring period.
Environmental data collected from Zones 1 and 2 of a smart swine barn located in Jeollanam-do, Korea, were used for analysis. As summarized in Table 1, each zone was equipped with sensors measuring temperature (°C), humidity (%), carbon dioxide (CO2, ppm), and ammonia (NH3, ppm). Measurements were taken at approximately 10-min intervals. The data collection period spanned from 1 January to 31 August 2025, resulting in a total of approximately 34,900 recorded entries.
Some missing and anomalous values were found in the collected data. Missing values, which accounted for approximately 1.5% of the dataset, were primarily caused by sensor communication errors or temporary power instability. Linear interpolation was applied to preserve the continuity of the time-series data and compensate for missing entries. Extreme values (e.g., temperatures below −50 °C or humidity levels of 0% or 120%) were identified as sensor malfunctions and removed from the dataset.
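The preprocessing described above can be illustrated with a minimal sketch. It assumes that out-of-range readings are first marked as missing and that interior gaps are then filled by linear interpolation; this ordering, the function names `mask_out_of_range` and `interpolate_linear`, the upper temperature limit of 60 °C, and the sample readings are all hypothetical and are not taken from the study's actual pipeline.

```python
from typing import List, Optional


def mask_out_of_range(values: List[Optional[float]], low: float, high: float) -> List[Optional[float]]:
    """Replace readings outside [low, high] with None so they are treated as missing."""
    return [v if v is not None and low <= v <= high else None for v in values]


def interpolate_linear(values: List[Optional[float]]) -> List[Optional[float]]:
    """Fill interior gaps (None) by linear interpolation between the nearest valid neighbours.

    Leading and trailing gaps stay None because they have a neighbour on one side only.
    """
    filled = list(values)
    valid = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(valid, valid[1:]):
        gap = right - left
        for i in range(left + 1, right):
            weight = (i - left) / gap
            filled[i] = values[left] + weight * (values[right] - values[left])
    return filled


# Hypothetical 10-min temperature readings (°C): one dropped sample and one sensor fault.
raw_temperature = [21.0, None, 22.0, -99.0, 23.0]

# ASSUMPTION: the valid range (-50 to 60 °C) is illustrative only.
clean_temperature = interpolate_linear(mask_out_of_range(raw_temperature, low=-50.0, high=60.0))
print(clean_temperature)
```

Treating a faulty reading as a gap keeps the regular 10-min grid intact, which is convenient for the windowed time-series models used later.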
The final refined dataset contained 34,992 data points after preprocessing. These were used for defining virtual sensors and training the prediction model. The processed dataset also functioned as input for virtual sensor generation, performance verification, and construction of the digital twin visualization environment.
The environmental variables were monitored using commercially available sensors widely adopted in livestock production facilities. Temperature and relative humidity were measured using the SHT3x-DIS digital sensor (Sensirion AG, Stäfa, Switzerland), which provides factory-calibrated and temperature-compensated outputs with high accuracy. CO2 concentrations were recorded using the MH-Z14A NDIR sensor (Winsen Electronics, Zhengzhou, China), offering a measurement range up to 5000 ppm with integrated temperature compensation to ensure stability under barn conditions. NH3 concentrations were monitored using the ME3NH3 (Winsen Electronics, Zhengzhou, China) electrochemical sensor, which is suitable for detecting low-level ammonia emissions typically observed in pig housing environments. The specifications of these sensors, including measurement ranges and accuracies, are summarized in Table 2. Prior to installation, all sensors were calibrated according to manufacturer guidelines, and additional stability checks were conducted during the monitoring period to maintain reliable performance in the field.
The barn was divided into two environmental zones (Zone 1 and Zone 2) along the longitudinal direction of the facility. This zoning followed the existing management layout of the finisher pens rather than an arbitrary spatial split.
In the experimental finisher pig house, the animals were group-housed in fixed pens, and their movement was confined within each pen. As a result, animal behavior mainly influenced the internal environment through gradual changes in metabolic heat and gas emission rather than through abrupt relocation between zones. The environmental sensors were installed near the ceiling along the central axis of the barn, so that they measured well-mixed air conditions representing each zone rather than local point fluctuations near the animals. Therefore, short-term movements of individual pigs were not found to cause distinct spikes in the recorded temperature, humidity, CO2, or NH3 data at the 10 min sampling interval; instead, animal activity contributed to slow temporal variations that were naturally captured by the hybrid prediction model.
2.2. Defining and Modeling Virtual Sensors
Because the two zones exhibit gradual rather than abrupt environmental transitions, the midpoint between Zone 1 and Zone 2 represents the intermediate region where air mixing occurs along the central ventilation path. Placing the virtual sensor at this location allowed the model to capture the spatial gradient between the two zones while avoiding local noise generated near pen-level activity. This placement provides a physically meaningful target point for interpolating spatial information and enhances the reliability of spatio-temporal prediction.
Figure 4 and Figure 5 illustrate the virtual sensor placement and conceptual diagram used in this study. Sensors were installed in Zones 1 and 2 within the barn, and a virtual sensor was defined at the midpoint between the two zones. Because it is often impractical to densely install sensors in real barn environments, the virtual sensor was introduced to compensate for blackout zones and enhance the overall precision of spatial environmental monitoring.
The virtual sensor was designed to measure the same four variables as the physical sensors: temperature (°C), humidity (%), CO2 (ppm), and NH3 (ppm). The virtual sensor values were generated using a hybrid approach that combines spatial interpolation with time-series prediction.
First, the value at the central position between Zones 1 and 2 was estimated using the measured data from both zones, as expressed in Equation (1). Spatial interpolation was performed using either linear interpolation or IDW. The virtual sensor value $V(t)$ at time $t$ was calculated as the distance-weighted average of the measured values from the two adjacent zones, as follows:

$$V(t) = \frac{S_1(t)/d_1 + S_2(t)/d_2}{1/d_1 + 1/d_2} \tag{1}$$

Here, $S_1(t)$ and $S_2(t)$ denote the sensor values in Zones 1 and 2, respectively, and $d_1$ and $d_2$ represent the distances from the virtual sensor to each zone.
As the virtual sensor was placed at the midpoint, $d_1 = d_2$, and Equation (1) can be simplified into a simple average as in Equation (2):

$$V(t) = \frac{S_1(t) + S_2(t)}{2} \tag{2}$$
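As a numerical illustration of the distance-weighted average and its midpoint simplification, the following sketch implements inverse distance weighting for an arbitrary number of sensors. The function name `idw_estimate`, the `power` parameter (set to 1 here), the distances, and the CO2 readings are assumptions introduced for illustration only.

```python
from typing import Sequence


def idw_estimate(values: Sequence[float], distances: Sequence[float], power: float = 1.0) -> float:
    """Inverse-distance-weighted average of sensor readings at a target point."""
    if len(values) != len(distances) or not values:
        raise ValueError("values and distances must be non-empty and of equal length")
    if any(d <= 0 for d in distances):
        raise ValueError("distances must be positive")
    weights = [1.0 / d ** power for d in distances]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical CO2 readings (ppm) in Zones 1 and 2 at one time step.
zone1_co2, zone2_co2 = 1200.0, 1400.0

# Equal distances reduce the estimate to the simple average of the two zones.
midpoint = idw_estimate([zone1_co2, zone2_co2], distances=[6.25, 6.25])

# A target closer to Zone 1 is pulled toward the Zone 1 reading.
off_centre = idw_estimate([zone1_co2, zone2_co2], distances=[5.0, 15.0])
print(midpoint, off_centre)
```

With equal distances the weights cancel, which is why the midpoint placement makes the spatial step independent of the chosen power.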
The internal operations of the LSTM network used in this study are defined in Equations (3a)–(3f). At each time step $t$, the LSTM computes four gate values and updates its cell and hidden states. The input gate $i_t$ regulates how much new information enters the memory cell (3a), while the forget gate $f_t$ determines how much of the previous cell state $c_{t-1}$ is retained (3b). A candidate cell state $\tilde{c}_t$ is generated using the hyperbolic tangent activation function (3c), and the final cell state $c_t$ is updated by combining the retained memory and new candidate information (3d). The output gate $o_t$ controls how much of the cell state contributes to the hidden state (3e), and the final hidden state $h_t$ is obtained by modulating the activated cell state (3f).

$$i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i) \tag{3a}$$

$$f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f) \tag{3b}$$

$$\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c) \tag{3c}$$

$$c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \tag{3d}$$

$$o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o) \tag{3e}$$

$$h_t = o_t \odot \tanh(c_t) \tag{3f}$$

Here, $x_t$ denotes the input at time $t$, $h_{t-1}$ is the previous hidden state, and $c_t$ is the memory cell state. $W$ and $U$ represent learnable weight matrices, $b$ are the bias terms, $\sigma$ is the sigmoid activation function, $\tanh$ is the hyperbolic tangent function, and $\odot$ denotes element-wise multiplication. This formulation enables the LSTM to capture long-term dependencies and nonlinear temporal patterns essential for environmental prediction tasks.
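To make the gate computations concrete, a single LSTM time step in the standard formulation can be sketched in plain Python. The sketch assumes separate input weights `W` and recurrent weights `U` for each gate; the function names, the parameter layout, and the zero-valued weights in the example are hypothetical and do not reproduce the trained network used in this study.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def affine(W: Matrix, x: Vector, U: Matrix, h: Vector, b: Vector) -> Vector:
    """Compute W x + U h + b, one output row at a time."""
    return [
        sum(w * xi for w, xi in zip(W_row, x)) + sum(u * hi for u, hi in zip(U_row, h)) + bias
        for W_row, U_row, bias in zip(W, U, b)
    ]


def lstm_step(x: Vector, h_prev: Vector, c_prev: Vector,
              params: Dict[str, Tuple[Matrix, Matrix, Vector]]) -> Tuple[Vector, Vector]:
    """One LSTM step; params maps 'i', 'f', 'c', 'o' to a (W, U, b) triple."""

    def gate(name: str, activation: Callable[[float], float]) -> Vector:
        W, U, b = params[name]
        return [activation(z) for z in affine(W, x, U, h_prev, b)]

    i_t = gate("i", sigmoid)        # input gate
    f_t = gate("f", sigmoid)        # forget gate
    c_tilde = gate("c", math.tanh)  # candidate cell state
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    o_t = gate("o", sigmoid)        # output gate
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Hypothetical setup: one input feature, two hidden units, all weights and biases zero.
zero = ([[0.0], [0.0]], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
params = {name: zero for name in ("i", "f", "c", "o")}

h_t, c_t = lstm_step(x=[0.7], h_prev=[0.0, 0.0], c_prev=[1.0, -2.0], params=params)
print(h_t, c_t)
```

With zero weights every gate equals 0.5 and the candidate state is 0, so the cell state is simply halved; this makes the update rule easy to check by hand.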
While simple interpolation techniques are computationally efficient and intuitive, they fail to capture the temporal dynamics of environmental variations within barns. Conversely, time-series models such as LSTM are powerful in learning nonlinear patterns and long-term dependencies but do not directly incorporate spatial relationships, thereby overlooking the spatial structure of sensor placement.
To address these limitations, this study employed a hybrid approach that integrates spatial interpolation with time-series prediction.
The interpolated value $V(t)$ from the spatial step functioned as the input to the LSTM model, which then produced the final predicted value $\hat{V}(t+1)$ as expressed in Equation (4):

$$\hat{V}(t+1) = f_{\mathrm{LSTM}}\big(V(t-k+1), \ldots, V(t);\ \theta\big) \tag{4}$$

Here, $\theta$ denotes the learnable parameters of the LSTM model, and $k$ represents the input sequence length. The hybrid model first estimated the virtual sensor’s initial value through spatial interpolation between Zones 1 and 2, and then applied the LSTM model to learn temporal patterns and correct residual errors. This process allows the model to simultaneously consider spatial correlations and temporal continuity, thereby improving prediction accuracy and stability compared with single-method approaches.
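The data flow of the hybrid approach, in which spatially interpolated values are arranged into fixed-length sequences for the temporal model, can be sketched as follows. The sketch assumes midpoint averaging for the spatial step and a one-step-ahead prediction target; the helper `make_windows`, the window length of 3, and the sample temperatures are hypothetical, and the LSTM itself is omitted.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], k: int) -> Tuple[List[List[float]], List[float]]:
    """Split a series into length-k input windows and one-step-ahead targets."""
    if k < 1 or k >= len(series):
        raise ValueError("k must satisfy 1 <= k < len(series)")
    inputs = [list(series[t - k:t]) for t in range(k, len(series))]
    targets = [series[t] for t in range(k, len(series))]
    return inputs, targets


# Hypothetical zone temperatures (°C) at consecutive 10-min steps.
zone1 = [20.0, 20.4, 20.8, 21.0, 21.4]
zone2 = [22.0, 22.0, 22.4, 23.0, 23.0]

# Spatial step: midpoint average of the two zones.
interpolated = [(a + b) / 2 for a, b in zip(zone1, zone2)]

# Temporal step input: each window of k interpolated values is paired with the next value.
# ASSUMPTION: the one-step-ahead target is an illustrative choice.
windows, targets = make_windows(interpolated, k=3)
print(windows, targets)
```

Each window would be passed to the LSTM as one input sequence, and the paired target is the value the network is trained to reproduce.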
Notably, unlike prior studies that focused solely on comparing model performance, this study verified the results through integration with a WebGL-based digital twin visualization platform. This approach enabled intuitive observation of how the virtual sensor values were represented within the actual environment, providing a practical foundation for smart barn operators to assess data reliability in real time and support data-driven control decisions, which is a key distinction from existing studies.
2.3. Digital Twin Visualization Environment
In this study, a WebGL-based web environment was developed to visualize the indoor conditions of the smart swine barn as a digital twin. The visualization system was constructed using the Three.js (version r152) and Plotly.js (version 2.27.0) libraries and comprises four main components: three-dimensional spatial modeling, dynamic updates of sensor data, visual changes according to risk-level assessment, and time-series graph representation.
Figure 6 illustrates the virtual environment and sensor dashboard of the smart farm digital twin.
To simplify the spatial configuration of the barn, a rectangular container structure with a length of 25 m, width of 9 m, and height of 5 m was modeled. The top (ceiling) and front faces were rendered transparent to allow visual observation of the internal sensor nodes, while the remaining surfaces were shaded in gray tones to provide a realistic spatial impression of the barn.
Sensor nodes were positioned at three locations (Zone 1, the virtual sensor, and Zone 2), and each position was designed to include four sensors measuring temperature, humidity, carbon dioxide, and ammonia. The nodes were represented as spheres, and each was labeled using CSS2DRenderer with tags such as “Zone1-T” and “Virtual-CO2” so that the sensor type and location could be intuitively identified.
Data were loaded from a pre-collected JSON file (swine_barns_data_full.json) and updated frame by frame in chronological order. Sensor values were dynamically color-coded in real time according to the environmental criteria defined in this study.
The criteria for normal conditions were set as follows: temperature within the comfort range of 18–21 °C (or 28–30 °C during summer); humidity between 50% and 70%; CO2 concentration below 3000 ppm; and ammonia concentration below 20 ppm. Sensor nodes were displayed in green under normal conditions, yellow for warning states, and red for danger states, enabling intuitive identification of the risk level in each zone.
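The mapping from sensor values to node colors can be sketched as a small classification function. The normal ranges and colors below follow the criteria stated above, but the boundary between the warning and danger states is not specified in the text; the sketch therefore assumes, purely for illustration, that a value within 10% beyond a violated limit is a warning and anything further is a danger. The names `NORMAL_RANGES`, `WARNING_MARGIN`, and `risk_level` are hypothetical.

```python
from typing import Dict, Optional, Tuple

Range = Tuple[Optional[float], Optional[float]]

# Normal ranges (low, high); None means no limit on that side.
# Boundaries are treated as inclusive here, which is a simplifying assumption.
NORMAL_RANGES: Dict[str, Range] = {
    "temperature": (18.0, 21.0),          # °C, non-summer
    "temperature_summer": (28.0, 30.0),   # °C, summer
    "humidity": (50.0, 70.0),             # %
    "co2": (None, 3000.0),                # ppm
    "nh3": (None, 20.0),                  # ppm
}

NODE_COLORS = {"normal": "green", "warning": "yellow", "danger": "red"}

# ASSUMPTION: the warning band is 10% beyond the violated limit; not taken from the study.
WARNING_MARGIN = 0.10


def risk_level(variable: str, value: float, margin: float = WARNING_MARGIN) -> str:
    """Classify a reading as 'normal', 'warning', or 'danger'."""
    low, high = NORMAL_RANGES[variable]
    if low is not None and value < low:
        limit = low
    elif high is not None and value > high:
        limit = high
    else:
        return "normal"
    excess = abs(value - limit) / limit  # relative distance beyond the limit
    return "warning" if excess <= margin else "danger"


for name, reading in [("co2", 2500.0), ("co2", 3200.0), ("nh3", 30.0)]:
    level = risk_level(name, reading)
    print(name, reading, level, NODE_COLORS[level])
```

In the browser-based viewer, the returned level would select the sphere color for the corresponding node.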
Using Plotly.js, independent time-series graphs were generated for temperature, humidity, carbon dioxide, and ammonia. Each graph simultaneously displays the measured values from Zone 1, the virtual sensor, and Zone 2, allowing comparative analysis between predicted results and actual sensor readings. This allows observation of both spatial visualization and temporal variation patterns, enabling quantitative assessment of how environmental changes affect livestock growth.
The proposed digital twin visualization environment integrates spatial configuration and temporal changes to comprehensively represent environmental conditions and risk levels within the barn. By combining risk-level visualization of sensor data with graph-based time-series analysis, the system allows managers to intuitively detect abnormal conditions in specific zones and at specific points in time.
Beyond providing a graphical representation of the swine barn layout, the WebGL-based digital twin environment offers practical benefits for livestock management. Because the visualization operates in a standard web browser without requiring dedicated software, producers and facility managers can access real-time environmental conditions from any device. The ability to render spatially continuous distributions of temperature, humidity, CO2, and NH3 enables users to intuitively identify risk areas that may not be apparent from raw sensor values alone. In addition, the platform can be extended to integrate control logic or alert functions, allowing the visualization to serve as a front-end interface for real-time decision-making and operational adjustments. These practical features enhance the usability and applicability of the proposed system in routine swine barn management.
2.4. Experimental Design and Evaluation Methods
This study designed a systematic experiment to verify the performance of the virtual sensor-based data generation method and evaluate the applicability of the digital twin visualization environment. The experiments consisted of three scenarios.
First, the normal condition scenario aimed to verify the prediction accuracy of the model under stable environmental conditions by predicting the virtual sensor values at the central zone based on the physical sensor data from Zones 1 and 2.
Second, the sensor loss scenario simulated real-world sensor failure by intentionally removing data from either Zone 1 or Zone 2. This scenario was designed to evaluate the capability of the virtual sensor to compensate for missing data when sensor malfunctions occur within the barn.
Third, the abrupt environmental change scenario assumed situations where temperature, humidity, CO2, and NH3 concentrations change rapidly within the barn, to verify the model’s ability to track nonlinear and sudden environmental variations.
The dataset used for the experiments consisted of 34,992 records collected from Zones 1 and 2 between 1 January and 31 August 2025. Each record contained four environmental variables: temperature (°C), humidity (%), CO2 concentration (ppm), and NH3 concentration (ppm).
Considering the temporal characteristics of the data, it was divided into training data (January–June 2025, 26,244 records, 75%) and validation data (July–August 2025, 8748 records, 25%). This time-based split reflects a practical scenario in which accumulated past data are used to predict future conditions in real barn operations.
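A time-ordered split of this kind can be sketched in a few lines. The sketch cuts the chronologically sorted records at a fixed fraction, which reproduces the reported record counts; the function name `chronological_split` and the integer stand-ins for records are hypothetical, and a cut at a calendar date would be an equally valid way to implement the same idea.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(records: Sequence[T], train_fraction: float = 0.75) -> Tuple[List[T], List[T]]:
    """Split time-ordered records into an earlier training part and a later validation part."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    cut = int(len(records) * train_fraction)
    return list(records[:cut]), list(records[cut:])


# Stand-in for the 34,992 time-ordered records (the index plays the role of the timestamp).
records = list(range(34_992))

train, validation = chronological_split(records)
print(len(train), len(validation))
```

No shuffling is applied, so every validation record is later in time than every training record and no future information leaks into training.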
For comparison, three baseline methods were implemented: simple mean interpolation, linear regression, and pure LSTM modeling. Simple mean interpolation represents the most basic spatial estimation method, while linear regression captures linear relationships among variables using a statistical approach. Pure LSTM focuses on learning temporal patterns and models time-based changes but does not incorporate spatial relationships. In contrast, the proposed hybrid interpolation–LSTM model combines spatial interpolation with temporal pattern learning, with the aim of achieving both accuracy and robustness.
To quantitatively evaluate performance, the mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R2) were used as evaluation metrics. MAE represents the average absolute difference between predicted and actual values, RMSE measures the square root of the mean of squared errors and is sensitive to large deviations, and R2 indicates how well the model explains the variance of the observed data.
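The three metrics can be written out directly from their definitions. The following sketch uses only the Python standard library; the function names and the sample values are illustrative and are not taken from the experimental data.

```python
import math
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r2(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical temperatures (°C): observed values and model predictions.
actual = [20.0, 21.0, 22.0, 23.0]
predicted = [20.5, 21.0, 21.5, 23.0]

print(mae(actual, predicted), rmse(actual, predicted), r2(actual, predicted))
```

In this example the RMSE (about 0.354) exceeds the MAE (0.25) because squaring gives the two 0.5 °C errors more weight than the two exact predictions.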
3. Results
3.1. AI Model Evaluation Results
To verify the prediction performance of the virtual sensor, three models were applied: the spatial interpolation-based IDW model, time-series LSTM model, and hybrid model combining IDW and LSTM. Model performance was evaluated using RMSE, MAE, and R2 metrics.
Table 3 presents the prediction accuracy for each variable. For temperature (T_virtual), both the LSTM and hybrid models showed a high level of agreement with the actual observed values (R2 = 0.970 and 0.959, respectively), with RMSE values of 0.438 and 0.515. For humidity (RH_virtual), the LSTM achieved the best results (RMSE = 1.805, R2 = 0.978), while the hybrid model demonstrated comparable performance (RMSE = 1.887, R2 = 0.976). For CO2 prediction, the hybrid model achieved slightly higher accuracy (RMSE = 22.733, R2 = 0.981) than the LSTM (RMSE = 23.348, R2 = 0.980). For NH3, the hybrid model also outperformed the LSTM, with RMSE = 0.340 and R2 = 0.964, compared with RMSE = 0.392 and R2 = 0.952 for the LSTM.
Figure 7 presents a visual comparison between actual and predicted values. Across all variables, the predicted values closely followed the actual patterns, and the hybrid model in particular demonstrated stable tracking during periods of rapid fluctuations in CO2 and NH3 concentrations. This result experimentally confirms that the hybrid approach incorporates both spatial contexts and temporal patterns.
Figure 8 compares the actual temperature values at the virtual sensor location—calculated from the measured data of Zones 1 and 2—with the results of the three prediction models (IDW, LSTM, and Hybrid).
The IDW method, which estimates values through simple distance-based interpolation, failed to capture the overall temporal trend and showed excessive deviations in intermediate sections. This occurred because IDW does not account for temporal dynamics, resulting in poor representation of short-term temperature rises or drops.
In contrast, the LSTM model reproduced the general shape of the actual temperature variation curve by learning temporal continuity and nonlinear fluctuation patterns. However, prediction lag occurred in certain sections, and the model could not account for the spatial influence of sensor locations.
The proposed hybrid model (IDW + LSTM) combined the strengths of both methods using the spatially interpolated initial values as inputs for LSTM-based temporal correction. The differences between the actual and predicted values were minimal, and the prediction curve consistently reproduced the actual temperature variation trends across the entire period.
Across the different scenarios in Figure 8, the performance differences among the three models can be interpreted based on their underlying computational characteristics. IDW relies solely on geometric distance; therefore, it responds immediately to changes in neighboring sensor values but cannot reflect temporal continuity, making it more sensitive to short-term fluctuations or inconsistencies caused by sensor drift. In contrast, the LSTM model smooths abrupt variations by learning temporal dependencies, which improves stability but can delay the model’s response to sudden environmental changes. The hybrid model integrates both spatial and temporal components, enabling it to adjust to abrupt changes while still maintaining continuity over time. This combined behavior explains why the hybrid framework shows the most balanced performance under both gradual and rapid environmental variations.
In quantitative evaluation, the hybrid model achieved an RMSE of 0.515 and R2 of 0.959, representing approximately a 3.4% improvement in accuracy over the standalone LSTM model. This result indicates that a model incorporating both spatial and temporal factors can estimate environmental changes inside the barn with greater precision.
3.2. Scenario-Specific Result Interpretation
Table 4 summarizes the performance comparison among the algorithms. The IDW method produced results quickly through simple computation, but its inability to reflect temporal patterns limited its applicability in real-world environments. In contrast, the LSTM model demonstrated strong performance in learning time-series patterns, while the hybrid model, which simultaneously considered spatial distribution and temporal dynamics, produced the most stable results for certain variables.
First, IDW interpolation offered the advantage of being simple and intuitive, and it provided reasonable estimates when the spatial gradient between Zones 1 and 2 was relatively smooth. However, because IDW relies solely on geometric distance, it cannot represent temporal dependencies or delayed responses to environmental changes. As a result, its performance degraded under more complex and dynamically changing barn conditions, where both spatial and temporal variability jointly influenced the environment.
Second, the LSTM-based time-series model captured nonlinear temporal patterns and short-term fluctuations using historical sequences from each sensor location. This approach was effective in modeling gradual changes and recurrent patterns over time. Nevertheless, because the LSTM operated independently on each sensor without explicit spatial structure, it tended to produce spatially averaged behavior and could not fully reflect environmental differences between zones or at unseen intermediate locations.
Third, the hybrid model (IDW + LSTM) integrated the strengths of both approaches by using IDW-based spatial estimates as inputs to the LSTM. This allowed the model to preserve spatial relationships between zones while simultaneously learning temporal dynamics, resulting in the most stable and accurate predictions at the virtual sensor location. The benefit of the hybrid framework was particularly evident under scenarios with both spatial heterogeneity and temporal variability, whereas in more spatially homogeneous conditions its advantage over the standalone LSTM model was smaller but still comparable in accuracy.
4. Discussion
Previous studies have emphasized that environmental monitoring in swine barn buildings is often hindered by spatial heterogeneity, limited sensor placement, and sensor degradation over time [39]. These issues lead to incomplete environmental representations and reduce the reliability of data-driven management systems. The present study addresses these technical challenges by integrating spatial interpolation with temporal prediction to construct a virtual sensing framework capable of compensating for blackout zones and mitigating the effects of sensor drift. Similar virtual sensor and digital twin approaches have been reported in greenhouse monitoring and environmental control applications [40,41], but research specific to swine barns remains limited. Our findings extend these prior works by demonstrating that a hybrid spatial–temporal model can more accurately reconstruct unmeasured environmental variables in a swine barn setting characterized by rapid airflow variations and temperature gradients.
The experimental results demonstrate that the proposed hybrid interpolation–LSTM model provides more accurate estimations for variables characterized by strong spatial heterogeneity, particularly CO2 and NH3. These gases typically exhibit localized concentration gradients within swine barn environments due to animal activity, manure emission points, ventilation airflow, and structural constraints. Purely spatial methods such as IDW cannot fully represent these complex dispersion patterns, while LSTM alone lacks spatial context and therefore produces temporally consistent but spatially generalized predictions. By combining both approaches, the hybrid model leverages spatial priors from IDW and subsequently refines them through temporal learning, enabling it to better represent abrupt fluctuations and asymmetric spatial profiles. This integrated mechanism explains the superior performance observed for spatially sensitive variables.
The structural differences between the standalone LSTM model and the hybrid approach further highlight the importance of incorporating spatial information into the prediction process. While LSTM effectively captures nonlinear temporal dependencies across long sequences, it relies solely on historical patterns and cannot account for spatial variability between different zones. As a result, its predictions tend to converge toward averaged temporal behaviors, particularly when environmental conditions vary sharply over space. In contrast, the hybrid model conditions the temporal prediction on an initial spatial estimate, allowing the LSTM to learn deviations from spatial baselines rather than reconstructing the full spatio-temporal pattern from scratch. This leads to improved stability and responsiveness, especially under dynamic CO2 and NH3 conditions.
Several considerations must be addressed when applying the proposed model to real swine barn environments. The performance of spatial interpolation is influenced by the physical layout of the barn, the geometry of ventilation paths, and the distribution of animals; thus, model recalibration may be required when facility configurations change. In addition, the reliability of LSTM-based predictions depends on the availability of sufficient high-quality historical data. Noise introduced by sensor drift, communication delays, or intermittent failures could reduce predictive accuracy unless properly preprocessed. Furthermore, real-time implementation demands low-latency data processing and stable network infrastructure to ensure seamless integration with the digital twin interface. Addressing these practical constraints will be essential for scaling the proposed approach to commercial barn operations.
Compared with previous studies, the proposed method offers a more integrated framework that combines spatial interpolation, time-series modeling, and digital twin visualization. Earlier works have typically focused on either predictive modeling or digital twin design, but few have demonstrated a fully operational system capable of producing virtual sensor outputs while simultaneously supporting real-time monitoring. By validating the hybrid prediction model within an interactive 3D environment, this study bridges the gap between theoretical modeling and real-world applicability, providing a practical foundation for future digital twin-based swine barn management systems.
From an operational perspective, the use of virtual sensors can provide meaningful economic benefits in swine barns. In commercial pig houses, NH3 sensors—typically electrochemical devices with a service life of 8–12 months—are among the most frequently replaced components due to exposure to high humidity, dust, and corrosive gases. The ME3NH3 sensors used in this study generally cost between USD 60 and 120 per unit, and typical barn configurations require multiple sensors to monitor different zones. Replacing these sensors annually can therefore represent a recurring expense. By contrast, virtual sensors can estimate environmental conditions at additional locations without the need for new hardware. For a facility equipped with two physical gas sensors, the ability to substitute one or more of these with virtual sensors can reduce annual hardware replacement and maintenance costs by approximately 30–40%, depending on the facility layout and the number of installed devices. These results highlight the practical economic value of integrating virtual sensing into continuous monitoring systems for pig housing environments.
The WebGL-based visualization further strengthens the practical utility of the framework by enabling intuitive identification of environmental anomalies and facilitating remote monitoring through a browser-based interface. Such features allow the digital twin to function not only as a prediction tool but also as a decision-support component within daily swine barn facility management.
5. Conclusions
This study introduced a digital twin system and virtual sensor technique to address the limitations of sensor deployment and the high maintenance costs encountered in smart swine barns. In environments where it is difficult to install physical sensors at all locations due to physical constraints and economic burdens, virtual sensors can function as an effective alternative to complement blackout zones and accurately estimate environmental data throughout the entire barn.
From a methodological perspective, this study adopted a hybrid approach that combines spatial interpolation with LSTM-based time-series prediction. Specifically, real sensor data from Zones 1 and 2 were used to define a virtual sensor in the central zone. The initial estimates were obtained through spatial interpolation and then refined using the LSTM model to incorporate temporal patterns. Furthermore, a WebGL-based digital twin visualization environment was developed to provide an integrated platform for intuitively verifying the predicted results of virtual sensor data.
Experimental results demonstrated that the virtual sensor achieved high prediction performance across all variables—temperature, humidity, CO2, and NH3—with coefficients of determination (R2) exceeding 0.95. In particular, the hybrid model produced more stable results for variables with significant spatial heterogeneity, such as CO2 and NH3, by simultaneously capturing spatial and temporal characteristics. These findings experimentally confirm the practical effectiveness of integrating digital twin and virtual sensor technologies for smart barn environmental management.
By compensating for sensor blackout zones through virtual sensing, this study enabled the acquisition of high-resolution environmental data and the establishment of a reliable predictive model. The digital twin-based visualization environment also allowed operators to intuitively interpret data and assess real-time environmental risks. Additionally, the virtual sensor technology reduces installation and maintenance costs, enabling small- and medium-sized farms to adopt smart barn systems at a lower cost. This advancement can contribute to improving animal welfare, enhancing productivity, and ultimately stabilizing farm income. Moreover, the proposed approach aligns with national policies on digital transformation and low-carbon swine barn management and can support government initiatives to expand smart barn deployment and promote carbon neutrality. As an ICT-integrated swine barn management model, it holds potential to serve as a foundational technology for sustainable agricultural and swine barn innovation at the national level.
The applicability of the proposed framework can be extended to larger or structurally diverse swine barns, as the hybrid model does not rely on a fixed barn geometry. However, facilities with multiple ventilation inlets, multi-level layouts, or highly heterogeneous environmental zones may require additional sensor layers or zoning adjustments to ensure accurate spatial representation. While the present study demonstrates the feasibility of virtual sensing within a two-zone pig barn, future work should examine scalability in larger commercial settings to verify performance under more complex airflow patterns and environmental distributions.
Future research should aim to further enhance predictive performance by incorporating advanced spatio-temporal fusion models, such as Transformers or graph neural networks, and by validating the proposed framework across swine barns with different structural layouts or environmental characteristics. Extending the methodology to environments that exhibit vertical stratification, including multi-level barns or greenhouse systems, may also provide valuable insights into its broader applicability. In addition, integrating WebGL-based visualization with real-time control systems would support the development of more comprehensive and intelligent swine barn environment management platforms.