This study develops a digital twin-enabled, IoT-driven predictive modeling framework designed to enhance the resilience of lead-time prediction across sourcing partners in complex adaptive supply chains. These systems are characterized by high variability, dynamic interactions, and uncertainty arising from both technological processes and human decision-making. In such environments, lead times across vendors frequently exhibit asymmetric and heavy-tailed distributions as a result of disruptions, heterogeneous performance, and evolving operational conditions, thereby limiting the effectiveness of traditional deterministic or static probabilistic models.
Access to empirical datasets that concurrently represent historical supplier performance, operational disruptions, and high-frequency IoT data streams remains limited as a result of confidentiality constraints and varying levels of digital maturity across supply chain actors. To overcome this limitation while preserving analytical robustness, this study employs semi-synthetic datasets enhanced through generative techniques, designed to replicate realistic supply chain variability under controlled experimental conditions. These datasets incorporate stochastic lead-time behavior, dynamic disruption patterns, and simulated IoT data streams that reflect real-time system observations.
The proposed framework integrates three key methodological components within a system-oriented modeling environment. First, advanced statistical modeling using quantile regression and EVT is employed to capture both central tendencies and extreme disruptions in supplier lead times. This enables a more comprehensive representation of uncertainty, particularly in environments characterized by heavy-tailed and non-stationary behavior. Second, a digital twin-enabled DES environment is developed to model system-level interactions, feedback loops, and disruption propagation across supply chain components. This allows the evaluation of predictive performance under varying operational scenarios and supports scenario-based analysis. Third, the framework incorporates AI-driven adaptive learning mechanisms, combining TFT for multi-horizon time-series prediction with RL for dynamic decision-making.
Each component is included for a distinct methodological reason rather than as an arbitrary aggregation of techniques. Quantile regression was selected because supplier lead times in complex adaptive supply chains often exhibit heterogeneous and asymmetric conditional variability that cannot be adequately captured by mean-based models alone. EVT was incorporated to explicitly model rare but high-impact delays, which dominate risk in heavy-tailed, disruption-prone environments. The digital twin-enabled DES layer was chosen because lead-time behavior is shaped not only by statistical variability but also by system-level interactions, feedback loops, and disruption propagation across interconnected supply chain components. TFT was adopted to capture complex temporal dependencies and multi-horizon forecasting patterns across multivariate sequential inputs, while RL was introduced to support adaptive correction based on the evolving operational state observed through real-time IoT data. Finally, online learning and particle filtering were included to keep the framework responsive and robust under evolving and non-stationary operating conditions.
Together, these components form a unified, layered architecture that integrates statistical rigor, real-time data acquisition, simulation-based evaluation, and adaptive intelligence. Each layer addresses a specific limitation of static forecasting under dynamic uncertainty, and the framework as a whole supports predictive resilience and system-level decision-making.
3.1. Statistical Modeling of Supplier Lead Time and Sensitivity Analysis
This subsection presents the statistical modeling and sensitivity analysis procedures used in the proposed framework to evaluate how lead-time uncertainty propagates under varying operational conditions in complex adaptive supply chains. Supplier lead time is represented as a stochastic variable to reflect the variability and uncertainty inherent in such systems. To address the limitations of fixed parametric models, this study employs a hybrid quantile regression-EVT framework, which captures both central tendencies and extreme deviations in lead-time behavior.
Quantile regression estimates the conditional distribution of lead time across different quantiles, enabling the modeling of heterogeneous and asymmetric variability. In the present study, quantile regression is estimated at five quantile levels selected to provide balanced coverage of lower-tail, central, and upper-tail lead-time behavior, thereby capturing routine variability as well as asymmetric escalation patterns under disruption. The $\tau$-th conditional quantile is defined as:

$Q_Y(\tau \mid \mathbf{x}) = \mathbf{x}^{\top} \boldsymbol{\beta}(\tau),$

where $\mathbf{x}$ represents explanatory variables derived from operational and IoT data, and $\boldsymbol{\beta}(\tau)$ is the quantile-specific parameter vector. Model parameters are obtained by minimizing the quantile loss function:

$\hat{\boldsymbol{\beta}}(\tau) = \arg\min_{\boldsymbol{\beta}} \sum_{i=1}^{n} \rho_\tau\!\left(y_i - \mathbf{x}_i^{\top}\boldsymbol{\beta}\right),$

where $\rho_\tau(u) = u\,(\tau - \mathbb{1}\{u < 0\})$ is the asymmetric check loss function, allowing different penalties for overestimation and underestimation. The two lower quantiles support characterization of relatively stable operating conditions, the median quantile ($\tau = 0.5$) provides a robust central reference, and the two upper quantiles are used to capture disruption-sensitive and delay-amplified lead-time behavior.
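As a minimal numerical illustration (not the study's estimation code), the check loss above can be verified directly: minimizing the total check loss over a constant predictor recovers the $\tau$-th sample quantile. The lead-time sample and quantile level below are hypothetical stand-ins.

```python
import numpy as np

def check_loss(u, tau):
    """Asymmetric quantile (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0).astype(float))

rng = np.random.default_rng(0)
# Hypothetical right-skewed lead-time sample (days), standing in for supplier data.
lead_times = rng.lognormal(mean=2.0, sigma=0.5, size=5000)

tau = 0.75
# Minimizing the summed check loss over a constant c yields the tau-th sample
# quantile, so a coarse grid search should land near np.quantile(lead_times, tau).
grid = np.linspace(lead_times.min(), lead_times.max(), 2000)
losses = [check_loss(lead_times - c, tau).sum() for c in grid]
c_star = grid[int(np.argmin(losses))]

print(c_star, np.quantile(lead_times, tau))
```

This property is what makes the check loss suitable for estimating conditional quantiles rather than conditional means.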
To model rare but high-impact disruptions, EVT is applied to exceedances above a threshold $u$ using the Generalized Pareto Distribution (GPD):

$G_{\xi,\sigma}(y) = 1 - \left(1 + \frac{\xi y}{\sigma}\right)^{-1/\xi}, \quad y > 0,$

where $\xi$ and $\sigma$ denote the shape and scale parameters, respectively. The expected magnitude of extreme exceedances (mean excess) is:

$\mathbb{E}\left[\,Y - u \mid Y > u\,\right] = \frac{\sigma}{1 - \xi}, \quad \xi < 1.$
In the present study, the EVT threshold is selected using a high-quantile thresholding rule based on the empirical lead-time distribution. Specifically, $u$ is set at the 90th percentile of the simulated lead-time samples within each regime, ensuring that EVT is applied only to the upper-tail observations corresponding to rare and disruption-driven delays. This rule was chosen to provide a stable balance between retaining a sufficient number of exceedances for reliable GPD estimation and isolating the extreme tail behavior most relevant to resilience analysis. The adequacy of the selected threshold was further checked using mean residual life behavior and parameter stability across nearby upper-tail quantiles, and the 90th-percentile threshold was retained because it provided consistent tail-shape estimates across simulation regimes. GPD parameters were then estimated from the exceedance samples using maximum likelihood estimation.
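The thresholding-and-fitting step can be sketched with SciPy's GPD implementation. This is an illustrative reconstruction, not the study's code: the lead-time sample is a hypothetical log-normal draw, and the location parameter is fixed at zero because exceedances are measured above the threshold.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(1)
# Stand-in lead-time sample; in the study these come from the simulated regimes.
lead_times = rng.lognormal(mean=2.0, sigma=0.6, size=20000)

# 90th-percentile thresholding rule: EVT is fitted to exceedances only.
u = np.quantile(lead_times, 0.90)
exceedances = lead_times[lead_times > u] - u

# Maximum likelihood GPD fit with location fixed at 0 (exceedances start at 0).
xi, loc, sigma = genpareto.fit(exceedances, floc=0)

# Mean excess implied by the fitted GPD (valid for xi < 1).
mean_excess = sigma / (1.0 - xi)
print(xi, sigma, mean_excess)
```

In practice the fitted mean excess can be compared against the empirical mean of the exceedances as a quick adequacy check, complementing the mean residual life diagnostics described above.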
Within the methodology, sensitivity analysis is then performed by perturbing key distributional characteristics and evaluating their effects within the digital twin-enabled simulation environment, following variance-based sensitivity analysis principles as implemented in tools such as the sensobol package [52]. This procedure is used to examine how uncertainty propagates through system-level interactions and influences predictive performance under different variability regimes. In the present study, however, the sensitivity analysis is intentionally focused on three variables, namely $\mu$, $\sigma$, and $\delta(t)$, because these represent the principal scenario-level sources of variation examined in the proposed framework: baseline lead-time level, uncertainty intensity, and IoT-driven adaptive correction. Together, these variables capture the core statistical, stochastic, and adaptive dimensions of the simulation design and therefore provide a targeted system-level view of how predictive error responds to changing operating conditions. Other parameters associated with the TFT architecture, the RL learning process, particle filtering, online updating, and the DES configuration were not included in the formal sensitivity analysis because the objective of this component was not to perform a full global decomposition across all model and hyperparameter settings, but rather to examine predictive responsiveness to the most influential operating-condition drivers of uncertainty. A fully comprehensive sensitivity analysis across all model, simulation, and learning parameters would substantially increase methodological and computational complexity and lies beyond the scope of the present simulation-based proof-of-concept study. Accordingly, the reported sensitivity analysis should be interpreted as focused rather than exhaustive, and broader global sensitivity analysis across additional parameters is identified as an important direction for future research.
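A simplified one-at-a-time perturbation of the three drivers can illustrate the intent of this analysis (the study itself follows variance-based principles; the toy simulator, parameter values, and perturbation size below are all hypothetical).

```python
import numpy as np

def simulate_mae(mu, sigma, delta_width, n=20000, seed=2):
    """Toy stand-in for a digital-twin run: generate adjusted lead times and
    score a naive static median predictor with mean absolute error."""
    rng = np.random.default_rng(seed)
    base = rng.lognormal(mean=mu, sigma=sigma, size=n)
    delta = rng.uniform(-delta_width, delta_width, size=n)  # bounded IoT deviation
    adjusted = base * (1.0 + delta)
    prediction = np.median(adjusted)                        # static baseline forecast
    return np.abs(adjusted - prediction).mean()

baseline = dict(mu=2.0, sigma=0.4, delta_width=0.1)
mae0 = simulate_mae(**baseline)

# One-at-a-time perturbation: +25% on each driver, holding the others fixed.
effects = {}
for name in baseline:
    pert = dict(baseline)
    pert[name] = baseline[name] * 1.25
    effects[name] = simulate_mae(**pert) - mae0

print(mae0, effects)
```

A variance-based (Sobol) design would additionally decompose interaction effects among the three drivers, which is why the study cites sensobol rather than a one-at-a-time scheme.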
Finally, real-time IoT data streams are integrated to enable dynamic recalibration of predictions. Coupled with online learning and particle filtering, the model continuously updates its parameters in response to evolving conditions, enhancing predictive resilience under both routine variability and extreme disruptions.
3.2. RL-Based Adaptive Control Using IoT Data Streams
To incorporate real-time operational dynamics within complex adaptive supply chains, an RL-based adaptive control mechanism is introduced to dynamically update lead-time predictions using IoT data streams. Instead of relying on a static adjustment factor, the proposed approach models system deviations as outputs of an adaptive policy learned through RL. Let $L_{\text{base}}$ denote the baseline lead time and $\delta(t)$ represent the adaptive correction derived from IoT observations. The adjusted lead time is defined as:

$L_{\text{adj}}(t) = L_{\text{base}} \cdot \bigl(1 + \delta(t)\bigr),$

where $\delta(t)$ is no longer a predefined stochastic variable but a control signal generated by an RL policy based on real-time system states. This formulation enables continuous adaptation to operational disturbances such as machine downtime, congestion, and logistics delays.
In the proposed formulation, the RL environment is embedded within the digital twin-enabled DES framework. At each decision step $t$, the agent observes a state vector $s_t$ composed of current IoT-derived signals, predicted lead-time statistics, and operational context variables. Specifically, $s_t$ includes machine availability status, congestion level, the current lead-time estimate, recent forecast error, and the previous adjustment factor. The action $a_t$ corresponds to selecting an adaptive correction to the lead-time prediction, implemented either directly as an updated $\delta(t)$ value or indirectly through the adjustment weights applied to sensor-derived deviations. The reward $r_t$ is defined to encourage accurate and stable prediction while penalizing excessive corrective oscillation, and is expressed as a negative weighted combination of prediction error and adjustment magnitude. This design enables the RL agent to learn policies that improve predictive responsiveness without introducing unstable control behavior.
For the RL implementation used in this study, policy learning is performed using a value-based Q-learning scheme with discrete action bins defined over the admissible range of $\delta(t)$. The adjustment interval is discretized into 8 candidate actions, allowing the agent to choose bounded lead-time corrections at each decision step. The state-action value function is updated using a learning rate of 0.1 and a discount factor $\gamma$. The discount factor is intentionally set to a relatively high value because supply chain disturbances often generate effects that extend beyond the current decision step. Thus, the RL agent must consider not only immediate prediction-error reduction but also the longer-term consequences of corrective actions on future lead-time behavior and predictive stability. In this framework, the chosen $\gamma$ provides a balance between short-term responsiveness and long-horizon adaptation, ensuring that the learned policy remains sensitive to current disruptions while still accounting for delayed operational effects within the digital twin environment. Action selection during training follows an $\varepsilon$-greedy exploration strategy, with $\varepsilon$ initialized at 0.20 and decayed linearly to 0.01 over training. Training is conducted for 500 episodes, each corresponding to a simulated supply chain trajectory in the digital twin environment. Convergence is monitored through moving-average reward stabilization, and training is terminated early if the average episode reward does not improve for 25 consecutive episodes.
Within the simulation environment, each episode proceeds by initializing the digital twin state, sampling IoT-linked operational conditions, applying the selected RL action, updating the adjusted lead-time prediction, and computing the resulting reward from forecast performance and control smoothness. The environment then transitions to the next state through DES-based event updates, including congestion buildup, machine downtime, and supplier-delay propagation. In this way, RL training is grounded in system-level simulation rather than isolated signal adjustment, allowing the learned policy to reflect feedback-rich operational dynamics.
The feedback loop in the proposed framework operates as a closed adaptive cycle linking sensing, prediction, decision adjustment, and system-state update within the digital twin environment. At each decision step, IoT-derived operational signals and current lead-time estimates are used to form the state vector observed by the RL agent. Based on this state, the agent selects an action that determines the corrective adjustment or the associated sensor-weight configuration. This adjustment is then applied to update the predicted lead time, after which the digital twin simulates the resulting operational transition, including disruption propagation, congestion effects, and machine-state changes. The updated system response generates a new forecast error and reward signal, which are fed back to the RL agent to update the policy and to construct the next state. In this way, the output of one decision step becomes part of the input to the next step, allowing the framework to learn from evolving operational consequences rather than from static one-step corrections alone.
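The Q-learning scheme described above can be sketched in tabular form. This is a heavily simplified stand-in for the digital twin environment: the state discretization, the admissible action range, the toy transition dynamics, and the reward weights are all illustrative assumptions, not the study's configuration.

```python
import numpy as np

rng = np.random.default_rng(3)

# 8 discrete correction actions over an assumed admissible range for delta(t).
actions = np.linspace(-0.2, 0.5, 8)
n_states = 5            # toy discretization of the observed disruption level
Q = np.zeros((n_states, len(actions)))

alpha, gamma = 0.1, 0.95          # learning rate and discount factor (gamma assumed)
eps, eps_end, episodes = 0.20, 0.01, 500

for ep in range(episodes):
    eps_t = eps + (eps_end - eps) * ep / (episodes - 1)    # linear epsilon decay
    s = rng.integers(n_states)
    for _ in range(20):                                    # steps per episode (toy)
        a = rng.integers(len(actions)) if rng.random() < eps_t else int(np.argmax(Q[s]))
        # Toy dynamics: the "true" proportional delay grows with disruption level.
        true_dev = 0.1 * s
        error = abs(true_dev - actions[a])
        reward = -(error + 0.1 * abs(actions[a]))          # accuracy + smoothness penalty
        s_next = min(n_states - 1, max(0, s + int(rng.integers(-1, 2))))
        Q[s, a] += alpha * (reward + gamma * np.max(Q[s_next]) - Q[s, a])
        s = s_next

print(np.round(Q, 2))
```

In the full framework, the state transition would come from DES-based event propagation (congestion buildup, downtime, supplier-delay cascades) rather than the random walk used here.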
3.2.1. Statistical and Data-Driven Definition of the IoT Adjustment Factor
To facilitate reproducibility and systematic experimentation, the IoT adjustment factor is initially modeled as a constrained stochastic parameter:

$\delta(t) \sim \mathcal{U}(\delta_{\min}, \delta_{\max}),$

where $\mathcal{U}(\delta_{\min}, \delta_{\max})$ denotes a uniform distribution over a bounded interval, representing bounded operational variability. This formulation serves as a baseline approximation of IoT-driven perturbations in the absence of structured sensor data. However, to align with data-driven and adaptive modeling principles, the framework extends this formulation by defining $\delta(t)$ as a function of real-time IoT observations:

$\delta(t) = f\bigl(\mathbf{x}_{\text{IoT}}(t);\, \boldsymbol{\theta}_t\bigr),$

where $\mathbf{x}_{\text{IoT}}(t)$ denotes a vector of IoT measurements, $\boldsymbol{\theta}_t$ represents time-dependent model parameters, and $f(\cdot)$ is a nonlinear mapping function learned through TFT and/or RL models.

To capture dynamic uncertainty and evolving system conditions, the parameters $\boldsymbol{\theta}_t$ are updated recursively using online learning mechanisms:

$\boldsymbol{\theta}_{t+1} = \boldsymbol{\theta}_t - \eta\, \nabla_{\boldsymbol{\theta}} \mathcal{L}_t(\boldsymbol{\theta}_t),$

where $\eta$ is the learning rate and $\mathcal{L}_t$ denotes the loss function at time $t$. In addition, particle filtering techniques are employed to estimate latent system states and refine $\delta(t)$ under non-stationary conditions, ensuring robustness in dynamic environments.
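The recursive update above reduces, for a linear $\delta(t)$ model under squared loss, to the classical LMS rule. The following sketch illustrates this with hypothetical IoT features and a hidden "true" parameter vector used only for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(4)
theta = np.zeros(3)     # parameters of a linear delta(t) = theta . x_iot(t) map
eta = 0.05              # learning rate (illustrative value)

true_theta = np.array([0.3, -0.1, 0.2])   # hidden "ground truth" for the demo
for t in range(2000):
    x = rng.normal(size=3)                            # IoT feature vector at time t
    delta_obs = true_theta @ x + 0.01 * rng.normal()  # observed deviation
    # Squared-error loss L_t = 0.5 * (theta.x - delta_obs)^2; its gradient:
    grad = (theta @ x - delta_obs) * x
    theta -= eta * grad                               # theta_{t+1} = theta_t - eta * grad

print(np.round(theta, 3))
```

Because the update uses only the current observation, the same rule keeps tracking the parameters if `true_theta` drifts over time, which is the non-stationary setting the framework targets.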
3.2.2. Practical Implementation in Real-World IoT Systems
In real-world implementations, the adjustment factor $\delta(t)$ can be directly computed from IoT measurements as a normalized deviation from reference operational values:

$\delta(t) = \frac{x(t) - x_{\text{ref}}}{x_{\text{ref}}},$

where $x(t)$ represents real-time sensor observations and $x_{\text{ref}}$ denotes expected or baseline values. This formulation translates raw IoT signals into proportional lead-time adjustments, enabling real-time responsiveness.
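The computation is a one-liner; the sketch below also clips the result to an admissible band, since raw sensor deviations can exceed operationally plausible limits (the bounds and the cycle-time example are illustrative, not values from the study).

```python
import numpy as np

def iot_adjustment(x_t, x_ref, lo=-0.2, hi=0.5):
    """Normalized deviation of a sensor reading from its baseline, clipped to
    an assumed admissible band (bounds here are illustrative only)."""
    return float(np.clip((x_t - x_ref) / x_ref, lo, hi))

# Example: a cycle-time sensor reads 66 min against a 60 min baseline.
print(iot_adjustment(66.0, 60.0))   # a 10% proportional delay
```

Clipping enforces the physical-plausibility constraint discussed later for the aggregated adjustment factor.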
In advanced settings, multiple IoT signals can be fused using weighted aggregation, anomaly detection techniques, or state estimation methods such as the Exponentially Weighted Moving Average (EWMA) or Kalman filtering, providing a more robust estimate of system deviation [53,54]. Furthermore, when integrated with RL, the adjustment process evolves from passive estimation to adaptive decision-making, where optimal responses to disruptions are learned through interaction with the environment.
Within the digital twin-enabled DES framework, these IoT-driven adjustments are continuously synchronized with the virtual system, enabling real-time scenario evaluation, predictive analysis, and system-level coordination. This integration supports the modeling of complex socio-technical interactions, where physical processes and digital intelligence co-evolve.
In this study, $\delta(t)$ is instantiated using bounded stochastic distributions to approximate realistic IoT-driven deviations, while the data-driven and online-learning formulations above define the adaptive extension that enables future real-world deployment. This layered formulation ensures consistency between controlled simulation experiments and scalable, data-driven implementations in digitally enabled supply chain systems.
3.2.3. Systematic Sensor-Based Formulation
To illustrate the impact of IoT-driven adjustments within complex adaptive supply chains, a systematic transformation from sensor data to δ(t) is established using multiple IoT data streams that capture real-time operational conditions.
Two representative sensor streams are considered:
Machine availability sensor $s_1(t)$: a binary state indicator reflecting normal operation ($s_1(t) = 1$) or downtime ($s_1(t) = 0$). Downtime events lead to sudden increases in lead times associated with processing activities.
Congestion sensor $s_2(t)$: a continuous-valued signal reflecting queue length or transportation delays, capturing gradual system degradation resulting from workload accumulation.
Each sensor measurement is standardized relative to its baseline (planned) value $\bar{s}_i$, resulting in proportional deviations:

$\Delta_i(t) = \frac{s_i(t) - \bar{s}_i}{\bar{s}_i}.$

The normalized deviations are combined through a weighted aggregation:

$\delta_{\text{raw}}(t) = \sum_i w_i(t)\, \Delta_i(t),$

where $w_i(t)$ are time-varying weights reflecting the relative importance of each sensor stream. Unlike static formulations, the weights are dynamically updated using RL to adapt to changing system conditions:

$w_i(t+1) = w_i(t) + \eta\, \nabla_{w_i} R(t),$

where $\eta$ is the learning rate and $R(t)$ is a reward function defined based on prediction accuracy or system performance (e.g., minimizing MAE or RMSE). This enables the model to learn optimal sensor contributions over time, transforming the adjustment mechanism into an adaptive decision process.
To capture temporal variations and attenuate high-frequency noise, an exponentially weighted moving average (EWMA) filter is employed:

$\delta(t) = \lambda\, \delta_{\text{raw}}(t) + (1 - \lambda)\, \delta(t-1),$

where $\lambda \in (0, 1]$ regulates the sensitivity to new observations.
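The EWMA recursion can be sketched directly; the smoothing constant, noise level, and step change below are illustrative choices, not values from the study.

```python
import numpy as np

def ewma(raw, lam=0.3, init=0.0):
    """EWMA filter: delta(t) = lam * raw(t) + (1 - lam) * delta(t-1)."""
    out, prev = [], init
    for r in raw:
        prev = lam * r + (1.0 - lam) * prev
        out.append(prev)
    return np.array(out)

# Noisy raw deviations with a 0.2 step change at t = 50 (a simulated disruption).
rng = np.random.default_rng(5)
raw = np.where(np.arange(100) < 50, 0.0, 0.2) + 0.05 * rng.normal(size=100)
smooth = ewma(raw)
print(smooth[-1])
```

A larger `lam` tracks disruptions faster at the cost of passing through more sensor noise, which is exactly the trade-off the text attributes to this parameter.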
To further enhance predictive capability, the adjustment factor can be directly learned from IoT data streams using a TFT:

$\delta(t) = f_{\text{TFT}}\bigl(\mathbf{X}(t);\, \boldsymbol{\Theta}\bigr),$

where $\mathbf{X}(t)$ represents multivariate IoT inputs and $f_{\text{TFT}}$ is a deep learning model capturing temporal dependencies and feature interactions. This formulation enables multi-horizon prediction and improves robustness in non-linear and non-stationary environments.
Additionally, to handle latent uncertainty and evolving system states, a particle filtering-based estimation of the adjustment factor can be employed:

$\hat{\delta}(t) = \sum_{j=1}^{N} w_j(t)\, \delta_j(t),$

where $\delta_j(t)$ are particle-based estimates and $w_j(t)$ are their associated weights. This probabilistic formulation supports robust state estimation under uncertainty and complements the learning-based components.
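A bootstrap-style particle filter for a static latent deviation illustrates the weighted-sum estimate above. The observation model, noise level, particle count, and jitter are all hypothetical simplifications of the framework's latent-state estimation.

```python
import numpy as np

rng = np.random.default_rng(6)
n_particles = 500
particles = rng.uniform(-0.2, 0.5, size=n_particles)   # candidate delta values
weights = np.full(n_particles, 1.0 / n_particles)

true_delta = 0.15                                      # hidden deviation (demo only)
for _ in range(30):                                    # 30 noisy observations
    y = true_delta + 0.05 * rng.normal()
    # Reweight particles by the Gaussian likelihood of the observation.
    weights *= np.exp(-0.5 * ((y - particles) / 0.05) ** 2)
    weights /= weights.sum()
    # Resample in proportion to the weights, with small jitter against degeneracy.
    idx = rng.choice(n_particles, size=n_particles, p=weights)
    particles = particles[idx] + 0.005 * rng.normal(size=n_particles)
    weights = np.full(n_particles, 1.0 / n_particles)

delta_hat = float(np.sum(weights * particles))         # weighted particle estimate
print(delta_hat)
```

In the full framework, the latent state would additionally evolve between observations (a transition model), which the same resample-reweight cycle accommodates naturally.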
The resulting adjustment factor is constrained within operationally realistic bounds to ensure physical plausibility and consistency with industrial settings.
Within the digital twin-enabled DES framework, these components operate cohesively to model system-level interactions, feedback loops, and disruption propagation. This enables continuous synchronization between physical and virtual supply chain states, supporting predictive analytics and adaptive decision-making.
In this study, Equations (9)–(16) provide a structured and interpretable baseline. This layered formulation ensures both experimental control and scalability to real-world implementations, contributing to the development of resilient, sustainable, and intelligent supply chain systems operating as complex adaptive socio-technical environments.
3.4. Digital Twin-Enabled Adaptive Framework for Lead-Time Prediction
The proposed framework is formulated as a digital twin-enabled, multi-layer structure in which each methodological component is selected to address a specific aspect of lead-time uncertainty, system dynamics, and adaptive decision-making in complex adaptive supply chains.
First, the statistical modeling layer establishes a probabilistic foundation for representing lead-time uncertainty. Instead of relying solely on parametric assumptions, this layer integrates quantile regression and EVT to capture both central tendencies and extreme disruptions in supplier lead times. This enables a more flexible representation of heavy-tailed and non-stationary behaviors observed in real-world supply chains.
Second, the IoT-driven adaptation layer introduces real-time adaptability by integrating operational inputs captured from distributed sensing infrastructures (e.g., machine status, logistics disruptions, congestion levels). These inputs are converted into a dynamic deviation factor $\delta(t)$, as defined in Section 3.2, enabling continuous alignment between physical system states and their digital representation within the digital twin environment.
Third, the AI-driven learning layer enhances predictive capability through the integration of TFT models, RL, and online adaptive mechanisms. The TFT captures temporal dependencies and multi-horizon forecasting patterns, while RL enables adaptive decision-making by learning optimal responses to disruptions. In parallel, particle filtering and online learning mechanisms continuously update model parameters, ensuring robustness under evolving system conditions.
This layered architecture is embedded within a digital twin-based DES framework, enabling system-level modeling of interactions, feedback loops, and disruption propagation across the supply chain.
Within this architecture, the feedback loop operates through continuous interaction among sensing, prediction, adaptive correction, and simulated system evolution. First, operational inputs derived from IoT-linked signals are used to characterize the current system state. Second, the statistical and AI-driven layers generate baseline and uncertainty-aware lead-time predictions. Third, the RL-based adaptation layer uses the current state and predictive information to determine an updated corrective action through $\delta(t)$. This adjustment is then applied within the digital twin environment, where DES-based event propagation simulates the resulting operational consequences, including congestion, delay accumulation, and machine-state transitions. The updated system response is subsequently fed back into the next decision cycle through revised state variables, forecast error, and reward information. In this way, the framework functions as a closed adaptive loop in which prediction influences system response, and system response in turn informs subsequent prediction and control.
The proposed framework combines statistical modeling, digital twin-enabled simulation, and AI-driven adaptive learning to support resilient lead-time prediction. Quantile regression and EVT provide a baseline representation of lead-time uncertainty, while IoT data streams enable real-time system awareness within the digital twin environment. Building on this foundation, TFT models capture complex temporal dependencies, and RL supports adaptive decision-making under dynamic conditions. Online learning and particle filtering further ensure continuous model updating and robustness under non-stationary environments.
Together, these capabilities provide a system-oriented and uncertainty-aware predictive framework for simulation-based evaluation of resilient and adaptive lead-time prediction in complex adaptive supply chains.
3.4.1. Semi-Synthetic Data-Generating Process
To support controlled experimentation while preserving realistic supply chain variability, this study employs a semi-synthetic data-generating process designed to emulate supplier lead-time behavior under both routine and disruption-prone operating conditions. The reference lead-time variable is generated from a log-normal distribution, selected because supplier lead times are non-negative and commonly exhibit right-skewed and multiplicative variability in practice. Specifically, the baseline lead time is generated as

$L_{\text{base}} \sim \text{LogNormal}(\mu, \sigma^2),$

where $\mu$ controls the central tendency of supplier lead times and $\sigma$ controls dispersion and uncertainty severity. These parameter combinations define multiple operating regimes ranging from stable to highly disruption-prone conditions.

To represent IoT-observed operational variability, the baseline lead time is dynamically adjusted using an IoT-driven deviation factor $\delta(t)$, producing

$L_{\text{adj}}(t) = L_{\text{base}} \cdot \bigl(1 + \delta(t)\bigr).$

Two scenario families are considered. In Scenario A, representing routine variability, $\delta(t)$ is drawn from a narrow bounded interval centered near zero. In Scenario B, representing disruption-intensive conditions, $\delta(t)$ is drawn from a wider bounded interval skewed toward positive deviations. This formulation allows the generated data to reflect both negative and positive operational deviations while preserving bounded and interpretable perturbations.
To emulate disruption effects beyond routine variability, rare delay events are injected into a subset of generated samples through positive tail perturbations consistent with heavy-tailed behavior. These disruptions are represented through EVT-consistent exceedance behavior and are propagated within the digital twin environment through simulated events such as congestion buildup, machine downtime, and supplier delay cascades. As a result, the final semi-synthetic dataset captures baseline stochastic variability, IoT-driven operational fluctuations, and disruption-induced tail behavior within a unified framework.
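The three ingredients of the data-generating process can be sketched together. The regime parameters, scenario bounds, disruption rate, and Pareto tail index below are illustrative stand-ins, not the study's calibrated values.

```python
import numpy as np

def generate_lead_times(mu, sigma, n=10000, scenario="A", seed=7):
    """Semi-synthetic lead-time sketch: log-normal baseline, bounded IoT-driven
    deviation, and rare heavy-tailed disruption injections (illustrative values)."""
    rng = np.random.default_rng(seed)
    base = rng.lognormal(mean=mu, sigma=sigma, size=n)
    # Scenario A: narrow routine deviations; Scenario B: wider, positively skewed.
    lo, hi = (-0.05, 0.05) if scenario == "A" else (-0.10, 0.30)
    delta = rng.uniform(lo, hi, size=n)
    adjusted = base * (1.0 + delta)
    # Inject rare disruption delays into ~2% of samples via Pareto-type tails,
    # mimicking EVT-consistent exceedance behavior in the upper tail.
    mask = rng.random(n) < 0.02
    adjusted[mask] += rng.pareto(2.5, size=mask.sum()) * np.median(base)
    return adjusted

stable = generate_lead_times(mu=2.0, sigma=0.3)
disrupted = generate_lead_times(mu=2.0, sigma=0.3, scenario="B")
print(stable.mean(), disrupted.mean())
```

Fixing the seed per configuration, as the study does, makes each regime's 10,000-sample draw reproducible across experiments.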
For each $(\mu, \sigma)$ configuration and each variability scenario, 10,000 lead-time realizations are generated. These samples are then partitioned into model-development and evaluation subsets for training, validation, and testing. In addition to the main log-normal reference DGP, robustness is further examined using alternative out-of-distribution DGPs, including a mixed log-normal-Gamma distribution, a heavy-tailed Weibull distribution, and a bimodal mixture distribution, as described in Section 3.6. All stochastic sampling procedures are initialized using fixed random seeds to ensure reproducibility.
3.4.2. GAN-Based Data Augmentation and Validation
To enhance the diversity and realism of the semi-synthetic dataset while preserving controlled experimental conditions, the baseline data generated through the procedure described in Section 3.4.1 were further enriched using a GAN-based augmentation strategy. The purpose of this step was not to replace the underlying statistical data-generating process, but to expand the range of plausible lead-time patterns and IoT-linked operational variations available for model development and evaluation. In this way, the final dataset retained analytical traceability while incorporating additional heterogeneity consistent with realistic supply chain behavior.
The GAN was trained on the semi-synthetic samples produced from the reference log-normal data-generating process after incorporation of IoT-driven deviations and disruption effects. The training input therefore consisted of multivariate records including baseline lead time, adjusted lead time, variability regime indicators, scenario labels, and representative IoT-derived features associated with machine availability, congestion, and logistics conditions. This configuration enabled the GAN to learn joint dependencies between lead-time realizations and operational context variables, thereby generating augmented samples that remained consistent with the broader dynamics of the proposed digital twin environment.
For implementation, a standard fully connected GAN architecture was adopted, composed of a generator and a discriminator trained in adversarial fashion. The generator received latent noise vectors of dimension 32, sampled from a standard normal distribution, and transformed them into synthetic tabular records matching the dimensionality of the training data. The discriminator received either real or generated samples and learned to distinguish between them. Both networks were trained using mini-batch stochastic optimization for 300 epochs with a batch size of 64. The Adam optimizer was used for both the generator and discriminator with a learning rate of 0.0002 and momentum parameters β1 = 0.5 and β2 = 0.999. To support reproducibility, the GAN training process used fixed random seeds, and model training was conducted under fixed hyperparameter settings across all augmentation runs.
The generator network consisted of three dense hidden layers with 128, 256, and 128 units, respectively, each followed by ReLU activation. The output layer used a linear activation for continuous variables and bounded transformation where needed to preserve physically meaningful ranges. The discriminator employed a mirrored dense architecture with hidden layers of 256, 128, and 64 units, using LeakyReLU activation with a negative slope of 0.2, followed by a sigmoid output layer for binary real-versus-generated classification. During training, the adversarial objective encouraged the generator to produce samples that matched the marginal and joint statistical structure of the original semi-synthetic dataset, while the discriminator enforced realism through continuous classification feedback. The augmented samples were then filtered using operational plausibility constraints to ensure consistency with valid lead-time ranges and bounded IoT perturbation behavior.
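The stated architecture and training configuration can be sketched in PyTorch. This is a structural sketch only: the tabular feature count, the placeholder "real" batch, and the single adversarial step shown are assumptions for illustration, not the study's training pipeline.

```python
import torch
import torch.nn as nn

LATENT, DATA_DIM = 32, 8   # DATA_DIM stands in for the tabular feature count

# Generator: 128-256-128 dense stack with ReLU, linear output (per the text).
generator = nn.Sequential(
    nn.Linear(LATENT, 128), nn.ReLU(),
    nn.Linear(128, 256), nn.ReLU(),
    nn.Linear(256, 128), nn.ReLU(),
    nn.Linear(128, DATA_DIM),
)

# Discriminator: mirrored 256-128-64 stack, LeakyReLU(0.2), sigmoid output.
discriminator = nn.Sequential(
    nn.Linear(DATA_DIM, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 128), nn.LeakyReLU(0.2),
    nn.Linear(128, 64), nn.LeakyReLU(0.2),
    nn.Linear(64, 1), nn.Sigmoid(),
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCELoss()

# One illustrative adversarial step on a random "real" batch of size 64.
torch.manual_seed(0)
real = torch.randn(64, DATA_DIM)      # placeholder for semi-synthetic records
z = torch.randn(64, LATENT)
fake = generator(z)

# Discriminator update: push real toward label 1, generated toward label 0.
d_loss = bce(discriminator(real), torch.ones(64, 1)) + \
         bce(discriminator(fake.detach()), torch.zeros(64, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Generator update: make the discriminator label fakes as "real".
g_loss = bce(discriminator(fake), torch.ones(64, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
print(float(d_loss), float(g_loss))
```

The `detach()` in the discriminator step prevents the generator from receiving gradients through the discriminator's own update, which is the standard alternating-training pattern for tabular GANs of this kind.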
The GAN-based augmentation process was validated using both statistical and distributional checks. First, summary statistics of the generated samples, including mean, standard deviation, skewness, and quantile structure, were compared with those of the original semi-synthetic dataset across all major scenario regimes. Second, empirical distributions of generated and original samples were compared using the two-sample Kolmogorov–Smirnov test and the Wasserstein distance, enabling assessment of distributional similarity. Third, the generated samples were checked for consistency with scenario-specific variability bounds, particularly with respect to the operational deviation factor $\delta(t)$, to ensure that the augmented dataset did not introduce implausible observations outside the assumed simulation design. Only samples satisfying these validity checks were retained for downstream use.
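The distributional checks can be sketched with SciPy. The two samples below are synthetic stand-ins drawn from the same law, and the acceptance thresholds are illustrative; the study's actual cutoffs are not specified here.

```python
import numpy as np
from scipy.stats import ks_2samp, wasserstein_distance

rng = np.random.default_rng(8)
# Stand-ins for the original semi-synthetic and GAN-augmented lead-time samples.
original = rng.lognormal(2.0, 0.4, size=5000)
augmented = rng.lognormal(2.0, 0.4, size=5000)   # drawn from the same law here

stat, p_value = ks_2samp(original, augmented)    # two-sample KS test
w_dist = wasserstein_distance(original, augmented)

# Acceptance-rule sketch: retain augmented batches that are distributionally
# close to the original data (thresholds illustrative).
accept = (p_value > 0.01) and (w_dist < 0.5 * original.std())
print(stat, p_value, w_dist, accept)
```

The KS test is sensitive to any difference in distribution shape, while the Wasserstein distance quantifies how far apart the distributions are in the units of the data, so the two checks are complementary.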
For model-development purposes, the original semi-synthetic dataset was divided into 80% training, 10% validation, and 10% testing subsets before augmentation. The GAN was trained using the training subset only, while the validation subset was used to monitor convergence and guard against overfitting. After training, the generator was used to produce augmented samples equal to 50% of the original training-set size, and these accepted synthetic records were merged with the original training data to form the final model-development dataset. The test subset remained untouched and was used exclusively for final evaluation. This augmentation step improved data diversity, increased exposure to heterogeneous operational conditions, and reduced the risk of overfitting to narrowly parameterized simulation outputs. Accordingly, the GAN-based augmentation layer served as a data enrichment mechanism that complemented, rather than replaced, the underlying probabilistic and simulation-based design of the study.
3.4.3. Workflow and Architecture of the Proposed Framework
The proposed framework follows a structured workflow that integrates data generation, predictive modeling, adaptive correction, digital twin simulation, and feedback-based updating within a unified architecture. The workflow begins with input generation, which includes historical lead-time observations, scenario-specific operating conditions, and IoT-linked operational signals such as machine availability and congestion indicators. These inputs are combined with the semi-synthetic and GAN-augmented data pipeline described in Section 3.4.1 and Section 3.4.2 to construct the model-development dataset.
The second stage consists of the statistical uncertainty modeling block, where quantile regression and EVT are used to estimate baseline lead-time behavior, asymmetric variability, and tail-risk characteristics. This block provides the initial probabilistic representation of supplier lead-time uncertainty and serves as the analytical foundation for subsequent adaptive modeling.
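As a minimal sketch of this block, empirical quantiles can stand in for the fitted quantile regression, and a generalized Pareto distribution (GPD) can be fitted to threshold exceedances by method-of-moments. All parameter values (the log-normal sample, the 95% threshold, and the moment-based GPD fit in place of maximum likelihood) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
lead_times = rng.lognormal(2.0, 0.5, size=10000)   # assumed heavy-tailed lead times

# Central and upper quantiles of the lead-time distribution
q50, q90 = np.quantile(lead_times, [0.5, 0.9])

# EVT tail: fit a GPD to exceedances over a high threshold using
# method-of-moments estimators (a simple stand-in for MLE)
threshold = np.quantile(lead_times, 0.95)
excess = lead_times[lead_times > threshold] - threshold
m, v = excess.mean(), excess.var()
xi = 0.5 * (1.0 - m**2 / v)           # GPD shape parameter
beta = 0.5 * m * (m**2 / v + 1.0)     # GPD scale parameter

def tail_quantile(p):
    # Extreme quantile implied by the fitted GPD beyond the threshold
    pu = np.mean(lead_times > threshold)          # exceedance probability
    return threshold + beta / xi * (((1 - p) / pu) ** (-xi) - 1.0)

q99 = tail_quantile(0.99)   # tail-risk estimate beyond the observed bulk
```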
The third stage corresponds to the IoT-RL adaptation block. In this stage, real-time operational signals are processed to characterize the current system state, and the RL agent selects an adaptive correction action through the adjustment factor or its associated sensor-weight configuration. This block introduces context-aware and feedback-driven adaptation to the baseline prediction.
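The selection of an adaptive correction action can be illustrated with tabular Q-learning over a small discretized state space. The states, the candidate adjustment factors, the reward definition (negative absolute correction error), and all hyperparameters are hypothetical; the study's actual RL formulation operates inside the digital twin.

```python
import numpy as np

rng = np.random.default_rng(4)

# Discretized operational states (e.g., congestion level) and candidate
# multiplicative adjustment factors for the baseline forecast (assumed values)
N_STATES, ACTIONS = 3, np.array([0.9, 1.0, 1.1, 1.2])
Q = np.zeros((N_STATES, len(ACTIONS)))
alpha, gamma, eps = 0.1, 0.9, 0.1

def reward(state, factor):
    # Stand-in environment: higher congestion states need larger upward
    # corrections; reward is the negative absolute correction error
    true_factor = 1.0 + 0.1 * state
    return -abs(factor - true_factor) + rng.normal(0, 0.01)

for episode in range(5000):
    s = rng.integers(N_STATES)
    a = rng.integers(len(ACTIONS)) if rng.random() < eps else int(np.argmax(Q[s]))
    r = reward(s, ACTIONS[a])
    s_next = rng.integers(N_STATES)
    # Standard Q-learning temporal-difference update
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])

# Greedy policy: larger adjustment factors for more congested states
policy = ACTIONS[np.argmax(Q, axis=1)]
```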
The fourth stage is the AI-driven predictive learning block, in which the TFT model processes multivariate temporal inputs to produce uncertainty-aware lead-time forecasts. Online learning and particle filtering further refine these predictions by continuously updating model parameters and latent state estimates under non-stationary conditions.
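The particle-filtering component of this stage can be sketched as a bootstrap filter tracking a drifting latent mean lead time. The state dynamics, noise levels, and resampling trigger (effective sample size below N/2) are illustrative assumptions; the TFT forecasting model itself is too large to reproduce here.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 2000                                   # number of particles

# Latent state: slowly drifting "true" mean lead time (assumed dynamics)
true_state, obs_noise, proc_noise = 20.0, 1.0, 0.1
particles = rng.normal(20.0, 5.0, N)
weights = np.ones(N) / N

estimates = []
for t in range(50):
    true_state += rng.normal(0, proc_noise)        # system evolves
    y = true_state + rng.normal(0, obs_noise)      # noisy observation arrives
    particles += rng.normal(0, proc_noise, N)      # propagate particles
    # Reweight by the Gaussian observation likelihood
    weights *= np.exp(-0.5 * ((y - particles) / obs_noise) ** 2)
    weights /= weights.sum()
    # Resample when the effective sample size collapses
    if 1.0 / np.sum(weights**2) < N / 2:
        idx = rng.choice(N, size=N, p=weights)
        particles, weights = particles[idx], np.ones(N) / N
    estimates.append(np.sum(weights * particles))  # filtered state estimate

final_error = abs(estimates[-1] - true_state)
```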
The fifth stage is the digital twin-enabled simulation block, where the adjusted predictions are embedded into the DES environment. This block simulates system-level interactions, disruption propagation, queue dynamics, supplier-delay effects, and operational transitions under varying scenarios. The resulting outputs include updated lead-time predictions, forecast errors, resilience metrics, and comparative performance indicators such as MAE and RMSE.
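A minimal DES kernel for this block can be built on a priority queue of timestamped events. The event types (arrival, downtime, repair, completion) mirror the examples in the text, but the timings, the single-machine layout, and the fixed processing time are illustrative assumptions.

```python
import heapq

# Minimal discrete-event simulation of one supplier machine with a downtime
# window; timings and topology are illustrative, not the study's full model
PROCESS_TIME = 2.0
events = [(t, "arrival", i) for i, t in enumerate([0.0, 3.0, 6.0, 9.0, 12.0])]
events += [(5.0, "down", None), (8.0, "up", None)]
heapq.heapify(events)   # min-heap ordered by event time

queue, busy_until, machine_up, lead_times = [], 0.0, True, {}

def try_start(now):
    # Start the next queued order if the machine is idle and operational
    global busy_until
    if queue and machine_up and now >= busy_until:
        order = queue.pop(0)
        busy_until = now + PROCESS_TIME
        heapq.heappush(events, (busy_until, "done", order))

while events:
    now, etype, payload = heapq.heappop(events)
    if etype == "arrival":
        queue.append((payload, now))       # record order id and arrival time
        try_start(now)
    elif etype == "down":
        machine_up = False                 # machine downtime begins
    elif etype == "up":
        machine_up = True                  # repair completed
        try_start(now)
    elif etype == "done":
        order_id, arrived = payload
        lead_times[order_id] = now - arrived   # realized lead time
        try_start(now)
```

Orders arriving during the downtime window accumulate in the queue, so their realized lead times reflect disruption propagation (here, orders 2 and 3 wait for the repair event before being served).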
Finally, the framework incorporates a closed feedback loop in which system outputs, forecast errors, and simulated operational responses are fed back into the RL and predictive layers for subsequent updating. In this way, the architecture functions as an adaptive cycle rather than a one-pass forecasting pipeline. The overall framework therefore links inputs, statistical modeling, adaptive control, AI-based prediction, digital twin simulation, outputs, and feedback updating within a coherent system-oriented workflow.
The integrated nature of this architecture necessarily involves methodological trade-offs. Its principal advantage is that each component addresses a distinct limitation of static lead-time prediction: quantile regression-EVT improves uncertainty and tail-risk representation, the IoT-RL layer supports adaptive correction, the TFT-based learning layer captures temporal dependencies under non-stationary conditions, and the digital twin environment enables system-level evaluation of disruption propagation and feedback effects. However, the combination of these components also increases model complexity, calibration effort, data dependency, and implementation difficulty relative to simpler forecasting approaches. In particular, the full framework requires richer data streams, tighter synchronization between analytical and operational layers, and greater technical effort for maintenance, interpretation, and deployment. Accordingly, the present study should be understood as a simulation-based proof of concept designed to evaluate the potential value of layered integration under complex operating conditions, rather than as a claim that all supply chain settings require the full methodological stack. In lower-complexity or lower-data environments, simpler statistical or IoT-adaptive variants may be more practical, whereas the full AI-enhanced framework is most justified in disruption-prone and digitally mature settings where the added predictive capability can outweigh the additional complexity.
3.4.4. Digital Twin Architecture, Data Flow, and Synchronization Mechanism
To clarify the operational logic of the proposed digital twin framework, Figure 1 presents the overall architecture and information flow among its major components. The framework is organized into five interacting layers: (i) the input and data layer, (ii) the data synchronization and state-construction layer, (iii) the predictive modeling layer, (iv) the adaptive decision layer, and (v) the digital twin simulation layer. Together, these layers form a closed-loop architecture in which sensing, prediction, simulation, and corrective adaptation are continuously linked.
The input and data layer contains the streams used to represent the observed or simulated supply chain state. These inputs include historical supplier lead-time records, scenario-specific operating conditions, and IoT-linked operational signals such as machine availability, congestion status, and disruption indicators. In the present study, these streams are represented using semi-synthetic data and simulated IoT-linked signals; however, the same architecture is intended to accommodate real sensor and event data in future deployment-oriented implementations.
The second layer performs data synchronization and state construction. At each decision epoch, incoming operational signals and lead-time observations are time-aligned, preprocessed, and transformed into a consistent system-state representation. This synchronization step ensures that heterogeneous inputs from different sources are integrated into a common state vector before prediction and control are performed. The synchronized state includes the current operational condition, recent lead-time history, the current forecast, recent forecast error, and relevant adaptive variables. In this way, the framework explicitly links data updating to state updating, which is a defining feature of the digital twin logic used in this study.
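Time alignment onto a common state vector can be sketched as follows. The field layout, window length, and last-observation-carried-forward alignment rule are illustrative assumptions about how heterogeneous streams might be synchronized at a decision epoch.

```python
import numpy as np

# Epoch grid at which decisions are made (assumed unit spacing)
epochs = np.arange(0.0, 10.0, 1.0)

# Raw streams arrive on their own irregular timestamps
sensor_t = np.array([0.2, 1.1, 2.4, 3.9, 5.0, 6.6, 8.1, 9.5])
sensor_v = np.array([0.9, 0.8, 0.85, 0.4, 0.3, 0.7, 0.95, 0.9])  # availability
lead_obs = 18.0 + 0.5 * epochs                                   # lead-time history

def state_at(t, forecast, prev_error, window=3):
    # Align each stream to epoch t: last observation carried forward for the
    # sensor stream, a trailing window for the lead-time history
    avail = sensor_v[sensor_t <= t][-1] if np.any(sensor_t <= t) else np.nan
    recent = lead_obs[epochs <= t][-window:]
    # Common state vector: [availability, recent lead times, forecast, error]
    return np.concatenate([[avail], recent, [forecast, prev_error]])

s = state_at(t=5.0, forecast=21.0, prev_error=0.8)
```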
The third layer is the predictive modeling layer. Here, the quantile regression-EVT model provides the baseline probabilistic representation of supplier lead-time uncertainty, while the TFT-based model produces uncertainty-aware temporal forecasts using synchronized multivariate inputs. Online learning and particle filtering continuously refine parameter estimates and latent state representations as new information becomes available. This layer therefore supports both baseline uncertainty characterization and adaptive predictive updating.
The fourth layer is the adaptive decision layer. Based on the synchronized system state and predictive outputs, the RL component determines the corrective adjustment through the adaptive factor or its associated sensor-weight configuration. This stage transforms prediction into decision-aware adjustment by selecting context-sensitive actions that respond to evolving operating conditions. The adaptive output is then passed to the digital twin environment for execution and evaluation.
The fifth layer is the digital twin simulation layer, implemented through DES. In this layer, the adjusted lead-time prediction is embedded into the virtual representation of the supply chain, where operational events such as congestion buildup, machine downtime, and supplier-delay propagation are simulated. The digital twin updates the virtual system state in response to the selected adaptive action and produces simulated outputs including revised lead-time realizations, forecast error, resilience indicators, and reward information.
The feedback loop operates as follows. After each simulation cycle, the resulting system response is returned to the predictive and adaptive layers in the form of updated state variables, realized prediction error, and reward signals. These outputs are then used to update the RL policy, refine the online learning process, and support particle-filter-based state correction. Accordingly, the output of one cycle becomes part of the input to the next cycle. This repeated synchronization of sensing, prediction, simulation, and adaptation is what allows the framework to function as a digital twin rather than as a static simulation model.
Thus, the proposed digital twin is defined in this study not simply as a virtual model of supply chain operations, but as a feedback-based and state-updating simulation environment in which data flows, predictive models, and adaptive control mechanisms are continuously synchronized at each decision epoch.
3.5. Digital Twin-Based Simulation Design
To assess the robustness and adaptive capabilities of the proposed framework, a digital twin-enabled DES environment is developed. This environment replicates supplier behavior and operational dynamics under varying conditions, enabling controlled experimentation and system-level analysis.
The simulation design is built on the semi-synthetic data-generating process described in Section 3.4.1 and spans realistic procurement conditions in complex adaptive supply chains. Lead-time behavior is evaluated across multiple parameter regimes to reflect diverse operational scenarios.
The variability parameter represents three increasing uncertainty regimes: stable suppliers, moderate variability, and high uncertainty with disruption-prone conditions.
IoT-driven deviations are modeled using two operational scenarios: Scenario A, representing routine variability, and Scenario B, representing disruption-intensive conditions.
Within the digital twin environment, simulation events (e.g., machine downtime, congestion buildup, supplier delays) are processed using DES logic, while stochastic sampling is used internally to generate variability, ensuring both realism and statistical rigor.
Each parameter configuration is evaluated using 10,000 simulated lead-time realizations, ensuring robustness and convergence of results.
For each scenario, predictions are generated using a multi-layer framework comprising: a statistical baseline based on quantile regression and EVT, an IoT-enabled adaptation layer within a digital twin environment, and an AI-driven learning layer integrating TFT, RL, and online learning. Model performance is evaluated using the mean absolute error (MAE) and root mean squared error (RMSE):

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|L_i - \hat{L}_i\right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(L_i - \hat{L}_i\right)^2},$$

where $L_i$ is the examined lead time and $\hat{L}_i$ is the forecast value.
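The MAE and RMSE metrics amount to one line each in numpy; the actual and forecast values below are illustrative placeholders.

```python
import numpy as np

# Illustrative examined lead times and corresponding forecasts
actual = np.array([20.0, 22.0, 19.0, 25.0, 21.0])
forecast = np.array([19.0, 23.0, 18.5, 24.0, 21.5])

mae = np.mean(np.abs(actual - forecast))                 # mean absolute error
rmse = np.sqrt(np.mean((actual - forecast) ** 2))        # root mean squared error
```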
To ensure reproducibility, all stochastic processes are initialized using fixed random seeds.
Calibration and Validation Workflow
To ensure methodological transparency and reproducibility, the study follows a structured calibration and validation workflow spanning data preparation, model development, calibration, testing, and robustness analysis. The full dataset generated through the semi-synthetic and GAN-augmented pipeline is first partitioned into 70% training, 15% validation, and 15% testing subsets. The training subset is used for model fitting, the validation subset is used for hyperparameter tuning and convergence monitoring, and the test subset is reserved exclusively for final performance evaluation.
For the baseline statistical model, calibration consists of estimating the quantile regression coefficients at the predefined quantile levels and fitting the EVT tail model using exceedances above the selected threshold. For the IoT-adaptive model, calibration includes tuning the bounded deviation structure and optimizing the RL-based adjustment mechanism within the digital twin environment. For the AI-enhanced model, calibration includes training the TFT architecture on the training subset, selecting hyperparameters using validation loss, and monitoring convergence through early stopping. In parallel, the RL component is calibrated through repeated simulation episodes, where policy updates are guided by reward stabilization and validation-based performance monitoring.
Following calibration, model validation is performed in three stages. First, in-sample validation is conducted on the validation subset to assess convergence stability and to compare alternative parameter settings. Second, out-of-sample testing is performed on the held-out test subset using MAE and RMSE as the primary predictive performance metrics. Third, scenario-based validation is conducted across both Scenario A and Scenario B to assess whether calibrated models maintain stable performance under moderate and disruption-intensive conditions.
To further strengthen methodological validity, the workflow also includes distributional validation and robustness testing. The calibrated models are evaluated not only on samples generated from the reference log-normal DGP, but also on alternative DGPs, including mixed log-normal-Gamma, heavy-tailed Weibull, and bimodal mixture distributions. This enables assessment of both within-distribution robustness and out-of-distribution generalization without re-tuning the trained models. In addition, a focused calibration experiment is conducted to compare baseline, IoT-adaptive, and AI-enhanced predictive outputs under synchronized digital twin conditions, thereby examining the consistency between IoT-driven empirical behavior and model-based predictions.
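Sampling from the four alternative DGPs named above is straightforward with numpy's generator API; the specific parameter values and mixture weights below are illustrative assumptions, not the study's settings.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 10000

# Alternative data-generating processes for robustness testing; all
# parameters are illustrative, chosen only to produce plausible lead times
dgps = {
    "lognormal": rng.lognormal(2.0, 0.4, N),                 # reference DGP
    "lognormal_gamma": np.where(rng.random(N) < 0.7,
                                rng.lognormal(2.0, 0.4, N),
                                rng.gamma(2.0, 4.0, N)),     # mixed log-normal-Gamma
    "weibull": 10.0 * rng.weibull(0.8, N),                   # heavy-tailed Weibull
    "bimodal": np.where(rng.random(N) < 0.5,
                        rng.normal(10.0, 1.0, N),
                        rng.normal(25.0, 2.0, N)),           # bimodal mixture
}

# Summary used for out-of-distribution comparison: mean and 95th percentile
summaries = {k: (v.mean(), np.quantile(v, 0.95)) for k, v in dgps.items()}
```

Evaluating the already-trained models on each of these samples, without re-tuning, is what separates within-distribution robustness from out-of-distribution generalization in the workflow described above.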
Overall, this calibration and validation workflow is designed to ensure that the reported results reflect not only model fit under idealized conditions, but also convergence quality, generalization performance, and robustness under heterogeneous and non-stationary operating regimes. This structured process strengthens the reproducibility of the study and clarifies how the proposed framework is systematically evaluated from data generation through final validation.
Despite the structured calibration, robustness testing, and out-of-distribution evaluation used in this study, the exclusive reliance on semi-synthetic data remains an important limitation for external validity. The current design was adopted because datasets that jointly provide historical supplier lead times, disruption records, and sufficiently granular IoT-linked operational signals are difficult to access in public form and are often restricted in industrial settings. Accordingly, the present methodology is intended to provide controlled proof-of-concept validation rather than direct empirical verification of deployment performance. The use of alternative data-generating processes, scenario-based testing, and calibration experiments strengthens internal validity and distributional robustness, but it does not substitute for validation on real-world or publicly available supply chain datasets. Future work should therefore test the proposed framework using industrial case data or suitable public benchmarks in order to assess transferability, calibration behavior, and predictive performance under operational conditions.