Article

Personalized Blood Glucose Prediction Using Physiology-Informed Machine Learning

1 Centre for e-Health, Department of Information and Communication Technologies, University of Agder, 4879 Grimstad, Norway
2 Centre for Artificial Intelligence Research (CAIR), Department of Information and Communication Technologies, University of Agder, 4879 Grimstad, Norway
3 Centre for Artificial Intelligence and Innovation, Pingla Institute Ltd., Sydney, NSW 2032, Australia
* Author to whom correspondence should be addressed.
Mach. Learn. Knowl. Extr. 2026, 8(4), 96; https://doi.org/10.3390/make8040096
Submission received: 24 February 2026 / Revised: 26 March 2026 / Accepted: 2 April 2026 / Published: 10 April 2026

Abstract

Data-driven approaches to blood glucose predictive modeling face significant challenges due to the inherent variability in biological systems. While these methods efficiently capture statistical patterns through automated processes, they often lack the biological interpretability necessary to link model behavior with underlying physiological mechanisms. In contrast, physiological models offer accurate mechanistic representations but require complex parameterization and specialized domain expertise. In this work, we present an approach for predicting blood glucose levels (BGLs) leveraging the concept of physiology-informed neural networks (PINNs). This approach addresses the challenge of BGL prediction by incorporating the parameters of insulin and meal dynamics within the architecture of a predictive network. It employs a two-stage learning approach for modeling physiology and predicting BGLs: the neural network is pretrained to approximate the solutions of the physiological dynamics, and the output of this pretrained model, representing the insulin and glucose concentration states, is then fed as input into a predictive model. This enables simultaneous optimization of predictive accuracy and physiological parameter estimation, offering advantages over traditional modeling approaches in terms of personalized prediction and interpretability. The results highlight the model’s ability to estimate physiological parameters while maintaining strong predictive performance that aligns with the underlying physiological principles. This framework offers significant potential for personalized predictive modeling where precise and efficient understanding of individual metabolism is essential.

Graphical Abstract

1. Introduction

Diabetes is a chronic condition that requires careful management to maintain blood glucose levels (BGLs) within a safe range [1]. Several factors, including diet, physical activity, insulin administration, stress, and illness, influence glucose fluctuations, making regulation challenging [2]. Therefore, self-care, adherence to lifestyle recommendations, and timely blood glucose monitoring play a crucial role in effective diabetes management [3,4]. To facilitate glucose monitoring and minimize the risk of both short- and long-term complications, continuous glucose monitoring (CGM) systems have become increasingly prevalent [5]. These systems measure glucose levels in the interstitial fluid beneath the skin, estimating plasma glucose concentrations with a high sampling rate. The vast amount of data generated by CGM can be leveraged in both physiological and data-driven models for blood glucose prediction, each offering distinct advantages that aid early intervention and complication prevention [6]. Accurate and timely predictions enable proactive decision-making to mitigate the risks of hyperglycemia and hypoglycemia while optimizing dietary choices, exercise routines, and treatment plans [7,8].
Physiological models mathematically describe glucose metabolism and kinetics. However, they require detailed knowledge of an individual’s physiological processes and prior configuration of numerous parameters [9,10]. Estimating and fine-tuning these parameters is often complex, error-prone, and time-consuming due to limited observed data. In contrast, data-driven models rely solely on self-monitored historical data, requiring minimal knowledge of glucose metabolism [10,11,12]. These models, often referred to as black-box approaches, have demonstrated superior performance over physiological models. However, their lack of physiological grounding results in less generalizable predictions, making interpretation difficult. Another challenge with these models is their dependence on large amounts of labeled data, which is often scarce in clinical settings [9].
A variety of machine learning (ML) architectures have been explored for blood glucose prediction, ranging from traditional regression-based algorithms such as autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA), random forest (RF), extreme gradient boosting (XGBoost), and support vector regression (SVR) to more advanced models [9,10,11,12,13]. These range from basic feedforward neural networks (FNNs) to deep learning architectures such as recurrent neural networks (RNNs), convolutional neural networks (CNNs), temporal CNNs (TCNs), and attention-based networks. Owing to their high predictive accuracy, deep learning methods have rapidly emerged as powerful tools for blood glucose prediction. Among these, long short-term memory (LSTM) networks are widely used in different time series applications [14,15,16,17] due to their architecture, which includes memory cells and input, forget, and output gates. These components dynamically regulate information flow, preserving critical patterns and insights across extended sequences, making LSTMs particularly effective for time series predictions like BGLs [12,13].
Attention mechanisms, originally designed for natural language processing and computer vision tasks [18], have also been applied to blood glucose prediction [19]. Transformers, in particular, have demonstrated significant success in handling sequential data. In the context of BGL prediction, studies [20,21,22] have utilized attention-based recurrent networks to enhance predictive performance. Additionally, other studies [7,23] have implemented and evaluated the effectiveness of self-attention networks in predicting BGLs.
Even though CNNs have not been extensively used for time series forecasting or BGL prediction, there are instances where they have been used, either independently in some studies [7,24] or for feature extraction in conjunction with other models [2,14]. These applications have demonstrated that CNNs are suitable for BGL prediction and other time series forecasting tasks. Given their ability to address long-term dependencies within time series data, TCNs are also particularly useful for predicting BGLs. BGLs are responsive to factors that manifest hours or even days prior, necessitating models capable of learning such extended dependencies, a capability TCNs possess [23]. Only a few studies have considered TCNs for BGL prediction [11,25], although these networks have been successfully applied to other time series forecasting tasks.
A potential limitation in existing data-driven BGL predictive models is their ability to accurately represent insulin and meal dynamics. When bolus insulin is administered, its effect on BGLs is not immediate; rather, it follows a time-dependent pattern, entering the bloodstream, peaking at a certain point, and gradually declining [26]. A simplistic model that relies solely on insulin dose and administration time might not fully capture this gradual influence on BGLs. Instead, a more detailed approach that considers fluctuations in insulin concentrations over time could offer a clearer representation of glucose responses. Likewise, meal-related glucose appearance rates could play a more significant role than just the total meal amount and timing. Incorporating insulin kinetics, meal-related glucose appearance rates, and other physiological parameters could enable predictive models to better reflect real-world glucose dynamics.
Several studies have developed models that integrate physiological models of blood glucose dynamics with data-driven predictive models [1,27,28,29,30,31,32]. These studies have demonstrated performance improvements through the integration of various physiological knowledge. For example, Ref. [1] enhanced the performance of the SVR model by incorporating features such as the rate of glucose appearance from meal intake, plasma insulin levels, and cumulative glucose appearance over a specific period. These features, obtained from meal and insulin models of blood glucose dynamics, serve as inputs, providing a straightforward way to incorporate domain knowledge into learning models. Similarly, Ref. [31] included plasma insulin and glucose appearance rate from meals as input features through insulin kinetics and meal models. Additional features such as insulin on board, carbohydrates on board, glucose appearance rate from carbs, rate of glucose appearance in the blood from the gut, and activity on board were employed in [27,28,29,32] to refine machine learning models. Ref. [30] proposed a hybrid method that sequentially combines predictions from both mathematical and machine learning models, where the machine learning model predicts residuals based on phenotypic features, which are then subtracted from the predictions made by the mathematical model. The results demonstrated that personalized physiological models consistently outperformed data-driven and hybrid model approaches. Here, personalized physiological models may inherently capture individualized critical processes and features that data-driven models struggle to individualize or interpret effectively. While these studies have demonstrated performance enhancements through the integration of various physiological insights specifically as static input features, they have yet to explore integration at a personalized level and lack interpretability in this context, a limitation that this study aims to address. 
Furthermore, they have not systematically analyzed the potential benefits of such integration, representing an additional gap that this study seeks to fill.
Hybrid modeling approaches combining physiological ordinary differential equation (ODE) knowledge with neural networks have emerged as a promising direction for glucose–insulin dynamics modeling. Ref. [33] proposed systems biology-informed neural networks (SBINNs), a general methodological framework for inferring hidden dynamics and unknown ODE parameters from sparse noisy observations by embedding ODE residuals into the neural network loss. Applied to three benchmark biological systems including an ultradian glucose–insulin model, the study demonstrates that unobserved states and unknown parameters can be recovered from minimal observations and that the model can additionally infer hidden inputs such as unknown meal content. However, the approach is validated exclusively on synthetic data generated from known models and does not explicitly address patient-specific personalization, making it primarily a methodological contribution rather than a clinically validated glucose management tool. The authors of [34] introduce a biology-informed recurrent neural network (BI-RNN) that employs a gated recurrent unit (GRU) trained with a three-component loss function enforcing data fidelity, ODE-based state consistency, and auxiliary constraints such as non-negativity. While the model predicts glucose dynamics, its primary goal is system identification within a model predictive control framework, enabling the reconstruction of unmeasured physiological states such as insulin-on-board and rate of glucose appearance. Physiological parameters are pre-identified and kept fixed, contributing to the loss formulation as constraints rather than being learned by the network. The authors of [35] apply physics-informed neural networks to estimate physiological parameters of the Bergman minimal model, recovering hidden insulin dynamics from glucose-only intravenous glucose tolerance test (IVGTT) data via ODE residual minimization and biologically informed parameter bounds. 
Most parameters are estimated with reasonable accuracy, though some exhibit notable bias. The approach is validated on simulations from a single subject with multiple noise realizations and is not evaluated on real-world or free-living data.
Ref. [36] proposes hybrid graph sparsification (HGS), a method for automatically pruning redundant latent states and interactions from hybrid neural ODEs that combine mechanistic physiological models with neural networks, particularly useful in data-scarce healthcare settings. Blood glucose forecasting in Type 1 diabetes (T1D) patients is used as an application to demonstrate the method’s effectiveness. However, the method focuses on structural sparsification rather than parameter estimation and does not explicitly address patient-specific personalization. Ref. [37] focuses on glucose simulation rather than pure prediction, proposing physiologically constrained neural network digital twins. Their approach constructs a neural state-space model aligned with a system of biological ODEs, where each equation is approximated by a dedicated neural network restricted to the same inputs as the original equation. The model is trained to both accurately simulate glucose dynamics and maintain physiologically consistent internal states. Rather than estimating explicit physiological parameters, the neural network weights implicitly encode the system’s dynamics. Personalization is achieved by augmenting the population model with an additional individual-specific network that learns residual dynamics from each person’s data.
A physiology-informed glucose–insulin neural network (PIGNN) is proposed in [38], a method for 30 min ahead glucose prediction that integrates a five-state ODE into an LSTM via physiology-inspired input windows and an ODE-consistency loss penalty. The model incorporates a physiological ODE as prior knowledge but does not explicitly estimate patient-specific physiological parameters; personalization is achieved primarily through data-driven training rather than system identification. The work in [39] proposes a blood glucose prediction method targeting T1D management by constructing a dual-channel physiology-informed neural network (PINN) architecture constrained by the Bergman Minimal Model. The framework combines a data-driven LSTM/GRU module with a physiology-informed module, integrating physiological knowledge both through architectural design and by incorporating a differential-equation residual penalty into the loss function. The approach shows consistent improvements in prediction accuracy. It is among the few reviewed works that simultaneously perform glucose prediction while enforcing Bergman-model consistency, although physiological parameters are not explicitly identified or reported and are only indirectly constrained during training.
Collectively, these works suggest that physiology-informed hybrid approaches can improve performance, particularly in data-scarce settings, while enabling more physiologically consistent and interpretable outputs and supporting richer functionalities such as prediction, simulation, control-oriented state reconstruction, and parameter inference. While Ref. [33] shows that joint parameter and hidden state recovery is theoretically possible from minimal observations, subsequent applied works do not fully realize this potential. Most either treat parameters as fixed constants in the loss function and focus solely on prediction or estimate parameters only under controlled synthetic conditions. Even where latent physiological states are predicted, as in [39], individual parameter estimation remains unrealized. The present work addresses these limitations and extends the state of the art in three meaningful directions. First, it moves beyond existing approaches that either embed physiological parameters as fixed constraints to enforce physiological plausibility in the loss function or use them merely as static input features by making key parameters trainable and subject-specific, enabling individual-level parameter estimation from real-time data. Second, the proposed Neural ODE (NODE)-based approximation of insulin and meal kinetics offers greater flexibility than fixed-parameter ODE solvers, enabling adaptation to inter-individual variability through a two-stage transfer learning strategy that pre-trains on population-level data to capture general physiological dynamics and subsequently fine-tunes on individual patient BGL data. Third, the sensitivity and counterfactual analyses provide a structured framework for clinical interpretability that most competing methods, including MTL-LSTM, SAN, and GAN-based data-driven approaches, entirely lack.
Together, these contributions address the gap that no existing framework simultaneously achieves: interpretable parameter estimation from free-living CGM data, robust individual-level personalization, and counterfactual analysis.
To overcome the problems with state-of-the-art methods, we propose a novel methodology combining physiological and data-driven models to leverage their complementary strengths. In particular, we leverage NODE [40] to approximate the evolution of insulin and glucose dynamics. These dynamics play a crucial role in blood glucose regulation and are thus integrated into the predictive model. The result is a synergistic framework that harmonizes data and physiological principles, enabling learning from both empirical observations and established physiological models. The proposed physiology-informed blood glucose prediction network (PIBGN) simultaneously predicts BGLs and identifies individualized physiological parameters using data-driven insights, facilitating the development of personalized predictive models. The system follows a structured two-stage transfer learning approach. In the first stage, it approximates the evolution of insulin and meal dynamics. The second stage refines this by learning to identify or infer the optimal physiological parameters that best correspond to the predicted glucose levels, ensuring an accurate and meaningful representation of the observed data. The main contributions of this work are summarized as follows:
  • Unlike prior methods that treat physiological parameters as static inputs or restrict physiological modeling to fixed training constraints, the proposed approach embeds trainable, subject-specific parameters and physiology into the network architecture, enabling interpretable estimation of physiological parameters from free-living CGM data.
  • A NODE-based approximation of insulin and meal kinetics is proposed that combines the structural knowledge of physiological models with the flexibility of data-driven learning. A two-stage transfer learning strategy is introduced, first pre-training on population-level data to capture general physiological dynamics, then fine-tuning on individual patient data by integrating the resulting representations into a BGL predictive network, allowing the model to adapt to inter-individual variability.
  • The framework incorporates sensitivity and counterfactual analyses, demonstrating how changes in insulin dosing and carbohydrate intake affect predicted glucose trajectories, offering both predictive accuracy and physiological explainability.
The remainder of this article is organized as follows. The proposed physiology-informed BGL prediction method is described in Section 2. We discuss the experimental results and performance analysis in Section 3. The limitations of the proposed model are discussed in Section 4. Finally, this article is concluded in Section 5.

2. Physiology-Informed BGL Prediction Method

The proposed PIBGN for predicting BGL is designed in two distinct stages, as shown in Figure 1. In the first stage, all physiological parameters are frozen at their literature-derived values (see Section 2.1.3), and NODEs are trained solely to approximate the underlying physiological dynamics by solving the governing differential equations for insulin and meal kinetics, separately. Keeping the parameters fixed at this stage ensures that the network first establishes a physiologically consistent solution to the differential equations, without the added complexity of simultaneous parameter estimation destabilizing the learning process. Once the NODEs have converged, the second stage begins by unfreezing all trainable physiological parameters of NODEs and connecting their outputs as inputs to the downstream BGL predictive model. Both models are then trained jointly end-to-end, allowing the predictive model to learn from the physiologically informed representations while simultaneously refining the physiological parameters away from their initial literature values through backpropagation. This transfer learning strategy ensures that parameter refinement is guided by meaningful gradients from an already well-initialized network, rather than being estimated from a randomly initialized state, leading to more stable and physiologically grounded convergence.

2.1. Approximating Solution of Insulin and Meal Absorption Kinetics

2.1.1. Insulin Absorption Kinetics

The subcutaneous insulin absorption model [41,42] is utilized in this study, where the single-compartment model of insulin absorption kinetics in plasma is described as follows:
dQ_{sc1}(t)/dt = -(k_{a1} + k_d) Q_{sc1}(t) + U(t - τ)
dQ_{sc2}(t)/dt = -k_{a2} Q_{sc2}(t) + k_d Q_{sc1}(t)
dQ_p(t)/dt = -k_e Q_p(t) + k_{a1} Q_{sc1}(t) + k_{a2} Q_{sc2}(t)
I_p(t) = Q_p(t)/V_I
where U(t) is the rate of insulin infusion, and Q_{sc1}(t) and Q_{sc2}(t) denote the non-monomeric and monomeric states of infused insulin in the subcutaneous tissue, respectively. τ represents a time delay in the appearance of insulin in subcutaneous tissue after infusion. Q_p(t) corresponds to the insulin amount in plasma. The constant rates of insulin absorption from the two subcutaneous compartments into plasma are denoted by k_{a1} and k_{a2}, respectively. k_d represents the rate of insulin dissociation into monomers, and k_e denotes the fractional clearance rate of insulin in plasma. Additionally, V_I is the volume of insulin distribution in plasma.
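For illustration, the three-compartment system above can be integrated with a simple forward-Euler scheme; the paper instead approximates the solution with NODEs, and the `VI` value below is an illustrative placeholder, not taken from the text.

```python
def insulin_ode(q, u, p):
    """Right-hand side of the subcutaneous insulin absorption model.
    q = (Qsc1, Qsc2, Qp); u is the (delayed) insulin infusion rate."""
    qsc1, qsc2, qp = q
    dqsc1 = -(p["ka1"] + p["kd"]) * qsc1 + u
    dqsc2 = -p["ka2"] * qsc2 + p["kd"] * qsc1
    dqp = -p["ke"] * qp + p["ka1"] * qsc1 + p["ka2"] * qsc2
    return (dqsc1, dqsc2, dqp)

def simulate(u_series, p, dt=1.0):
    """Forward-Euler integration (dt in minutes); returns plasma insulin Ip."""
    q = (0.0, 0.0, 0.0)
    ip = []
    for u in u_series:
        dq = insulin_ode(q, u, p)
        q = tuple(qi + dt * dqi for qi, dqi in zip(q, dq))
        ip.append(q[2] / p["VI"])
    return ip
```

With a single bolus at t = 0, the simulated plasma insulin rises, peaks, and then declines, reproducing the delayed, time-dependent insulin action described in the text.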

2.1.2. Meal Absorption Kinetics

The appearance of postprandial glucose in the bloodstream is modeled using the oral glucose absorption model [41,43], which represents the different stages of a meal, from solid to liquid and ultimately to a form that enters the bloodstream, as described by:
dQ_{sto1}(t)/dt = -k_{max} Q_{sto1}(t) + D δ(t)
dQ_{sto2}(t)/dt = -k_{emt} Q_{sto2}(t) + k_{max} Q_{sto1}(t)
dQ_{gut}(t)/dt = -k_{abs} Q_{gut}(t) + k_{emt} Q_{sto2}(t)
R_a(t) = f k_{abs} Q_{gut}(t) / BW
where D represents the amount of carbohydrate or glucose intake, δ(t) is the impulse function, and Q_{sto1}(t) and Q_{sto2}(t) denote the glucose content in the stomach in solid and liquid form, respectively. Q_{gut}(t) represents the glucose present in the intestine. The parameters k_{max}, k_{abs}, and k_{emt} correspond to the rates of meal grinding, intestinal glucose absorption, and gastric emptying, respectively. Additionally, f represents the fraction of glucose absorbed into plasma, and BW denotes body weight. The gastric emptying rate varies depending on the total glucose content in the stomach:
Q_{sto}(t) = Q_{sto1}(t) + Q_{sto2}(t)
which is further estimated by:
k_{emt}(t) = k_{min} + ((k_{max} - k_{min})/2) {tanh[α(Q_{sto}(t) - cD)] - tanh[β(Q_{sto}(t) - dD)] + 2}
where
α = 5/[2D(1 - c)],  β = 5/(2Dd).
The parameters c and d define the inflection points of the curve that characterizes the relationship between k_{emt} and Q_{sto}. Meanwhile, k_{min} and k_{max} represent the lower and upper limits of the stomach emptying rate, respectively.
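As a sketch, the gastric emptying rate can be computed directly from these definitions. The inflection-point fractions `c` and `d` below are illustrative placeholders, since the text does not report their values.

```python
import math

def k_emt(q_sto, D, kmin=0.006, kmax=0.054, c=0.82, d=0.01):
    """Gastric emptying rate as a function of total stomach glucose content.
    c and d (inflection-point fractions) are illustrative, not from the paper."""
    alpha = 5.0 / (2.0 * D * (1.0 - c))
    beta = 5.0 / (2.0 * D * d)
    return kmin + (kmax - kmin) / 2.0 * (
        math.tanh(alpha * (q_sto - c * D))
        - math.tanh(beta * (q_sto - d * D)) + 2.0
    )
```

The function reproduces the expected shape: emptying is near k_{max} when the stomach is full (or nearly empty) and drops toward k_{min} for intermediate fill levels.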

2.1.3. Approximating Solutions Using Neural Networks

The differential equations governing the insulin and meal models discussed earlier can be represented in the general form of:
dQ_p(t)/dt = F(Q_p(t)).
In this equation, Q_p(t) is the state variable that evolves over time, t is the independent variable, and F(Q_p(t)) defines the system’s dynamics, parameterized by p. Such equations are typically solved using numerical methods, such as Euler’s method or the Runge–Kutta method. To identify the optimal parameter p that generates a solution Q_p(t) best matching the observed data Q(t_j), the model is optimized by minimizing the following objective function:
arg min_p (1/M) Σ_{j=1}^{M} ‖Q_p(t_j) - Q(t_j)‖².
Thus, in the context of our insulin and meal models, we obtain:
Q_p^i = [Q_{sc1}, Q_{sc2}, Q_p]^T
F_p^i = [-(k_{a1} + k_d) Q_{sc1}(t) + U(t - τ),  -k_{a2} Q_{sc2}(t) + k_d Q_{sc1}(t),  -k_e Q_p(t) + k_{a1} Q_{sc1}(t) + k_{a2} Q_{sc2}(t)]^T
and
Q_p^m = [Q_{sto1}, Q_{sto2}, Q_{gut}]^T
F_p^m = [-k_{max} Q_{sto1}(t) + D δ(t),  -k_{emt} Q_{sto2}(t) + k_{max} Q_{sto1}(t),  -k_{abs} Q_{gut}(t) + k_{emt} Q_{sto2}(t)]^T,
where p_i = (k_{a1}, k_d, k_e, k_{a2}) and p_m = (k_{abs}, k_{min}, k_{max}) are the parameters of the insulin and meal models, respectively.
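The parameter-fitting objective above, the mean squared deviation between simulated states Q_p(t_j) and observations Q(t_j), can be evaluated with a minimal pure-Python sketch (in practice one would wrap this in an optimizer over p):

```python
def fit_objective(q_sim, q_obs):
    """Mean squared deviation between simulated state vectors Q_p(t_j)
    and observed vectors Q(t_j); the quantity minimized over parameters p."""
    assert len(q_sim) == len(q_obs), "trajectories must share time points"
    total = 0.0
    for qs, qo in zip(q_sim, q_obs):
        # squared Euclidean distance between state vectors at time t_j
        total += sum((a - b) ** 2 for a, b in zip(qs, qo))
    return total / len(q_sim)
```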
Alternatively, recent advancements have explored the use of neural networks as a powerful tool for approximating solutions to differential equations by directly learning the system’s dynamics [40,44,45]. In this framework, a neural network is trained to predict the time derivative dQ_p^N(t)/dt, which is then employed to estimate the evolution of the state variable Q_p^N(t). The training process involves minimizing the discrepancy between the network’s predicted gradients dQ_p^N(t)/dt and the true gradients dQ_p^P(t)/dt, thereby ensuring that the model’s solution faithfully approximates the underlying system behavior.
Drawing from this principle, the neural network for the insulin model (IODE) predicts dQ_{p_i}^N(t)/dt using insulin doses as input, while the neural network for the meal model (MODE) predicts dQ_{p_m}^N(t)/dt using the carbohydrate amount as input. The physiological parameters associated with these models are categorized into fixed and trainable sets based on their variability across individuals. Parameters that are well-established at the population level are held fixed, namely the dimeric insulin absorption rate k_{a1} = 1.34 × 10^{-4} min^{-1} and the diffusion/transition rate k_d = k_{a2} + 0.0155 min^{-1} [41,46]. In contrast, parameters recognized in the literature as subject-specific and critically influential in blood glucose regulation are designated as trainable, specifically the insulin elimination rate (k_e) and monomeric insulin absorption rate (k_{a2}) of the insulin model, and the intestinal absorption rate (k_{abs}) and minimum and maximum gastric emptying rates (k_{min}, k_{max}) of the meal model. During the pretraining of IODE and MODE, these trainable parameters are held fixed at literature-established values (k_e = 0.112 min^{-1}, k_{a2} = 0.0136 min^{-1}, k_{abs} = 0.071 min^{-1}, k_{min} = 0.006 min^{-1}, k_{max} = 0.054 min^{-1}) [43,46]. They are then allowed to adapt during end-to-end training of the predictive model, enabling the model to capture individual physiological differences and facilitate personalized, physiologically interpretable glucose prediction.
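As a minimal sketch, the fixed/trainable split and the literature initial values quoted above can be encoded as plain dictionaries; the `FIXED` and `TRAINABLE_INIT` names are illustrative, not from the paper.

```python
# Subject-trainable parameters, initialized at literature values (units: 1/min).
TRAINABLE_INIT = {
    "ke": 0.112,    # insulin elimination rate
    "ka2": 0.0136,  # monomeric insulin absorption rate
    "kabs": 0.071,  # intestinal absorption rate
    "kmin": 0.006,  # minimum gastric emptying rate
    "kmax": 0.054,  # maximum gastric emptying rate
}

# Population-level parameters held fixed throughout training.
FIXED = {
    "ka1": 1.34e-4,                        # dimeric insulin absorption rate
    "kd": TRAINABLE_INIT["ka2"] + 0.0155,  # kd = ka2 + 0.0155 (per the text)
}
```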
Since the physiological equations are complex and challenging for a neural network to learn, blended gradients are utilized instead of using the predicted gradients directly for the evolution of the state variables. The blended gradients are obtained by combining the gradients predicted by the neural network with the physiological gradients, keeping the predictions within physiologically plausible bounds. This blending ensures the model respects physiological laws while allowing the neural network to capture additional dynamics. Using the insulin kinetics model as an example, the complete training procedure is illustrated in Figure 2, highlighting how blended gradients are utilized, how the state variables evolve, and how the model is optimized. The blended gradient is expressed as:
dQ_p^B(t)/dt = α dQ_p^N(t)/dt + (1 - α) dQ_p^P(t)/dt,
where α is the weight controlling the relative influence of the neural and physiological predictions on the final estimate, dQ_p^B(t)/dt represents the blended gradient, dQ_p^N(t)/dt is the gradient predicted by the neural network, and dQ_p^P(t)/dt denotes the physiology-based gradient.
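The blend is a simple convex combination applied elementwise to the state gradients, which can be sketched as:

```python
def blend_gradients(d_neural, d_phys, alpha):
    """Convex combination of neural and physiological gradient vectors:
    dQ^B/dt = alpha * dQ^N/dt + (1 - alpha) * dQ^P/dt."""
    return tuple(alpha * n + (1.0 - alpha) * p
                 for n, p in zip(d_neural, d_phys))
```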
The value of α changes dynamically during training. Initially, α is set to a small value so that the prediction relies more on the physiological gradients while still incorporating the neural predictions. After the model has learned a good approximation of the system dynamics (upon convergence), α is gradually increased. α is scheduled using a mechanism similar to ReduceLROnPlateau, but applied in reverse: instead of decreasing when training stagnates, α is increased by a step size to the next stage once convergence is detected at the current level. In practice, the scheduler monitors the validation loss at each iteration, and convergence is assumed when no significant improvement is observed for a specified number of consecutive epochs, controlled by a patience of 10 epochs determined through grid search. The step size of 0.1 is similarly selected via grid search over the range [0.1, 0.2]. The initial value α = 0.4 is determined empirically by evaluating values in the range [0.1, 0.9]: lower values (<0.4) resulted in very slow convergence, while higher values (>0.6) prevented the model from learning effectively. Mid-range values yielded the best trade-off, with α = 0.4 emerging as optimal in terms of both convergence speed and learning quality. The neural network is optimized to minimize the loss function given in Equation (14), which compares the predicted gradients to the gradients derived from the physiological model, formulated based on the states Q_p^B(t) evolved with blended gradients:
arg min (1/M) Σ_{j=1}^{M} ‖dQ_p^N(t_j)/dt - F_p(Q_p^B(t_j))‖²
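The reverse-plateau α schedule described above (initial α = 0.4, step 0.1, patience 10) might be sketched as follows; the `min_delta` and `alpha_max` settings are assumptions, not reported in the paper.

```python
class AlphaScheduler:
    """Reverse-plateau schedule for the blending weight alpha: increase alpha
    by `step` once the validation loss stops improving for `patience` epochs."""

    def __init__(self, alpha=0.4, step=0.1, patience=10,
                 alpha_max=1.0, min_delta=1e-4):
        self.alpha, self.step, self.patience = alpha, step, patience
        self.alpha_max, self.min_delta = alpha_max, min_delta
        self.best, self.stale = float("inf"), 0

    def update(self, val_loss):
        if val_loss < self.best - self.min_delta:
            # significant improvement: reset the plateau counter
            self.best, self.stale = val_loss, 0
        else:
            self.stale += 1
            if self.stale >= self.patience:
                # plateau detected: advance alpha to the next stage
                self.alpha = min(self.alpha + self.step, self.alpha_max)
                self.stale = 0
        return self.alpha
```

This mirrors ReduceLROnPlateau's plateau detection but increases the monitored quantity instead of decreasing the learning rate.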

2.2. BGL Predictive Model

The Bergman minimal model, known as the simplest model of glucose–insulin dynamics, was developed in the 1980s [47]. It includes the glucose and insulin dynamic equations given by Equations (15) and (16), which help in understanding how different factors interact to influence blood glucose levels over time. For instance, it describes how quickly glucose is absorbed from food and how effectively insulin is utilized for glucose uptake by the body.
dG(t)/dt = -p_1 (G(t) - G_b) - X(t) G(t) + R_a(t)/V_G
dX(t)/dt = p_3 (I_p(t) - I_b) - p_2 X(t)
where G(t) is plasma glucose and X(t) is insulin action in a remote compartment. G_b and I_b are the basal levels of glucose and insulin in plasma. V_G is the volume of glucose distribution, p_1 is the fractional glucose effectiveness, p_2 is the rate constant describing the dynamics of insulin action, and p_3 scales the effect of plasma insulin on insulin action. I_p(t) is the plasma insulin concentration obtained after insulin infusion, as expressed in Equation (4). Similarly, R_a(t) is the postprandial glucose rate of appearance in plasma, as expressed in Equation (8). This model describes how glucose and insulin regulate blood sugar and how they interact through a feedback loop to maintain stability. To replicate this physiology, the proposed BGL predictive model takes into account the insulin dynamics following an insulin dose and the plasma glucose response after meal intake. The insulin and glucose concentrations generated by the IODE and MODE models are incorporated as inputs to the predictive network, along with historical data. When training the network, all trainable parameters of IODE and MODE remain unfrozen, with lower learning rates for stable learning.
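A single forward-Euler step of the minimal model can illustrate its feedback structure; the parameter values in the example are illustrative placeholders, not estimates from this study.

```python
def bergman_step(G, X, Ip, Ra, p, dt=1.0):
    """One forward-Euler step of the Bergman minimal model (Eqs. 15-16).
    G: plasma glucose; X: remote insulin action; Ip: plasma insulin;
    Ra: rate of glucose appearance from meals."""
    dG = -p["p1"] * (G - p["Gb"]) - X * G + Ra / p["VG"]
    dX = p["p3"] * (Ip - p["Ib"]) - p["p2"] * X
    return G + dt * dG, X + dt * dX
```

At basal conditions (G = G_b, X = 0, I_p = I_b, R_a = 0) the system is at equilibrium, while a positive meal-driven R_a pushes glucose upward, matching the feedback behavior described in the text.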

2.2.1. Loss Function

To ensure the optimized parameters and predicted outputs remain physiologically meaningful, the model incorporates physiological equations as constraints during optimization. Thus, the model is trained to satisfy both the available data and the physiological constraints imposed by the mathematical model. The loss function consists of two components.
Data Loss
Data loss quantifies how well the network fits the BGL data, for which the Huber loss [48] is employed, as expressed in (17). The Huber loss is applied to min–max normalized BGL data over the range [40, 400] mg/dL, with the transition parameter δ = 0.1 adopted from our previous study [26]. On the original glucose scale, this corresponds to approximately 0.1 × (400 − 40) = 36 mg/dL, defining the boundary between the quadratic and linear regimes of the loss function.
$$\mathrm{Loss}_G(W, b) = \frac{1}{M} \sum_{j=1}^{M} L_\delta\left(y, \hat{y}\right)$$
where
$$L_\delta(y, \hat{y}) = \begin{cases} \frac{1}{2}\left(y - \hat{y}\right)^2 & \text{if } |y - \hat{y}| \le \delta \\ \delta\left(|y - \hat{y}| - \frac{1}{2}\delta\right) & \text{otherwise} \end{cases}$$
$$y = G_{\mathrm{true}}(t_j), \qquad \hat{y} = G_{\mathrm{pred}}(t_j)$$
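A minimal NumPy sketch of this piecewise loss on the normalized scale (δ = 0.1, as above):

```python
import numpy as np

def huber_loss(y, y_hat, delta=0.1):
    """Huber loss averaged over samples: quadratic for residuals up to
    delta, linear beyond it (Eq. (17))."""
    r = np.abs(y - y_hat)
    per_sample = np.where(r <= delta,
                          0.5 * r**2,                 # quadratic regime
                          delta * (r - 0.5 * delta))  # linear regime
    return float(np.mean(per_sample))
```

A residual of 0.05 (below δ) is penalized quadratically, while a residual of 0.3 falls in the linear regime, limiting the influence of large glucose excursions.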
Residual Loss
The residual loss acts as a physiological constraint that ensures the network output is physiologically plausible, as described below:
$$\mathrm{MSE}_{F_p}(W, b) = \frac{1}{M} \sum_{j=1}^{M} \left\| R(t_j) \right\|^2$$
where R ( t ) is the residual, estimated as:
$$R(t) = \frac{dQ_p^N(t)}{dt} - F_p\!\left(Q_p^N(t)\right)$$
Thus, the total loss function is given by:
$$L(W, b, p) = \mathrm{Loss}_G(W, b) + \omega \cdot \mathrm{MSE}_{F_p}(W, b)$$
where ω is a weighting factor that balances the importance of fitting the data versus satisfying the physics.
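Putting the two components together, the total objective can be sketched as follows (ω = 1, the grid-searched value reported in Section 2.4; `dq_pred` and `f_phys` stand in for the network's state derivatives and the physiological model's right-hand side):

```python
import numpy as np

def total_loss(y, y_hat, dq_pred, f_phys, omega=1.0, delta=0.1):
    """Total loss = Huber data loss + omega * residual MSE, where the
    residual is the mismatch between the network's state derivatives
    and the physiological model's right-hand side."""
    r = np.abs(y - y_hat)
    data_loss = np.mean(np.where(r <= delta,
                                 0.5 * r**2,
                                 delta * (r - 0.5 * delta)))
    residual_mse = np.mean((dq_pred - f_phys) ** 2)
    return float(data_loss + omega * residual_mse)
```

With ω = 1, a perfect data fit still incurs a penalty whenever the learned dynamics deviate from the physiological equations, which is what keeps the optimized parameters physiologically meaningful.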

2.3. Dataset

This study utilizes the OhioT1DM dataset [49], available upon request from [50], which comprises eight weeks of continuous monitoring data collected from twelve individuals diagnosed with Type 1 diabetes (seven males and five females), ranging in age from 20 to 80 years, as shown in Table 1. The dataset contains 166,532 blood glucose measurements recorded at 5 min intervals using the Medtronic Enlite CGM sensors. In addition to CGM data (BGLs), the dataset includes comprehensive insulin delivery records including both bolus and basal doses, administered via Medtronic 530G insulin pumps. Supplementary contextual information, such as self-reported meal intake and physical activity, was gathered through smartphone applications and wearable fitness trackers. For this study, BGLs, insulin, and meal intake data are utilized.
For model development and evaluation, the dataset provides predefined training and testing splits for each participant, which are adopted as is. The training set is further partitioned into 80% for training and 20% for validation. Model training is performed using the training subset, and final evaluations are conducted on the held-out test data. To ensure data quality and consistency, several preprocessing steps are applied. All time-series signals are resampled to a uniform 5 min interval. Segments containing missing values with gaps exceeding 30 consecutive minutes are excluded to preserve the temporal integrity of the data. Missing values in the training set are imputed using linear interpolation, while the test set remains unaltered to maintain the integrity of performance evaluation. Finally, all features are standardized using min–max normalization.
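The preprocessing steps can be sketched with pandas as below, assuming a DataFrame with a datetime index and a `bgl` column. The thresholds follow the text (5 min grid, gaps longer than 30 consecutive minutes excluded, linear interpolation for training data, min–max normalization over [40, 400] mg/dL); the function and column names are illustrative.

```python
import pandas as pd

def preprocess(df):
    """Resample to a 5 min grid, drop gaps longer than 30 min (6 steps),
    linearly interpolate remaining short gaps (training data only), and
    min-max normalize BGL over [40, 400] mg/dL."""
    df = df.resample("5min").mean()          # uniform 5 min interval
    gap = df["bgl"].isna()
    # Length of each consecutive run of missing / present values.
    run = gap.groupby((gap != gap.shift()).cumsum()).transform("size")
    df = df[~(gap & (run > 6))].copy()       # exclude gaps > 30 min
    df["bgl"] = df["bgl"].interpolate(method="linear")
    df["bgl_norm"] = (df["bgl"] - 40.0) / (400.0 - 40.0)
    return df
```

For the test split, the interpolation step would be skipped so that performance evaluation is not affected by imputed values.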

2.4. Hyperparameter Optimization and Model Training

Hyperparameter tuning for the IODE, MODE, and predictive models is performed using grid search, with candidate parameters detailed in Table 2. The optimal configuration is determined by identifying parameter sets that minimize the respective loss functions, thereby maximizing model performance. Following optimization, each model is trained on the designated training dataset and subsequently evaluated using separate test data to assess generalization capabilities. The IODE and MODE models are provided with preprocessed insulin and meal time series as inputs, while their outputs, combined with blood glucose measurements, are used as inputs for the predictive model. A batch size of 64 is used for both ODEs and the predictive model. Based on our previous study [20], a 24 h sliding window of historical data is utilized to generate predictions 30 and 60 min in advance. The data loss, combined with the residual loss as previously described, is employed as the cost function. The weighting factor ω in the loss function is treated as a hyperparameter and optimized through grid search, ultimately yielding a final value of 1, assigning equal weight to the data and residual losses. The Adam optimizer [51], well suited for non-stationary blood glucose data, is used to minimize the loss, with parameter updates based on batch-averaged losses per epoch. The best-performing model configuration is retained for subsequent evaluation on unseen test data.
Training incorporates optimization parameters including a learning rate decay of 0.1, decay patience of 10, and early stopping patience of 30. All performance assessments are conducted using entirely separate test datasets. Model efficacy is evaluated by comparing the predicted blood glucose levels G ^ t + p at time horizon p minutes with the corresponding ground truth measurement G t + p . The entire development pipeline, including model implementation, hyperparameter optimization, training, and evaluation, is executed in JupyterLab (v4.1.1) using Python (v3.10.12) and PyTorch (v2.1.2+cu121).

2.5. Performance Evaluation Criteria

2.5.1. Qualitative Assessment

To assess the efficacy of the IODE and MODE models, their outputs are benchmarked against simulated reference data derived from forward simulations of the corresponding physiological insulin and meal models. The models are trained exclusively on data from one subject, and this trained architecture is subsequently employed to generate data for all other subjects, eliminating the need for subject-specific training protocols. For qualitative assessment, the comparison between predicted and simulated outputs is visualized through time-series plots. For quantitative evaluation, the mean squared error (MSE), defined as the average of the squared differences between predicted and simulated outputs, is used to assess model performance. Strong agreement between predicted and simulated outputs suggests that the latent dynamics encapsulated by the ODE-based models accurately reflect expected physiological behavior, thereby validating the underlying modeling approach.

2.5.2. Quantitative Assessment

To evaluate the performance of the predictive model, two regression-based metrics are used to assess the agreement between predicted and reference blood glucose levels. These metrics serve as standard benchmarks in the evaluation of BGL predictive models.
Root Mean Square Error (RMSE)
RMSE measures the standard deviation of the prediction errors, reflecting how closely predictions match the reference values. Lower RMSE values indicate higher accuracy.
$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(G_{t+p} - \hat{G}_{t+p}\right)^2}$$
Here, G t + p and G ^ t + p are the reference and predicted blood glucose levels, and N is the number of test samples.
Mean Absolute Error (MAE)
MAE computes the average absolute difference between predicted and actual values. It treats all errors equally and provides an intuitive measure of overall accuracy.
$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left|G_{t+p} - \hat{G}_{t+p}\right|$$
Lower MAE values reflect better predictive performance, while higher values suggest greater deviation from the ground truth.
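Both metrics can be computed directly from paired reference and predicted series, e.g.:

```python
import numpy as np

def rmse(g_true, g_pred):
    """Root mean square error between reference and predicted BGLs."""
    g_true, g_pred = np.asarray(g_true, float), np.asarray(g_pred, float)
    return float(np.sqrt(np.mean((g_true - g_pred) ** 2)))

def mae(g_true, g_pred):
    """Mean absolute error between reference and predicted BGLs."""
    g_true, g_pred = np.asarray(g_true, float), np.asarray(g_pred, float)
    return float(np.mean(np.abs(g_true - g_pred)))
```

Because RMSE squares residuals before averaging, it weights large errors more heavily than MAE; comparing the two therefore hints at whether errors are evenly distributed or dominated by a few large deviations.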

2.5.3. Clinical Assessment

Regression metrics offer general insight but may overlook clinically critical errors. To address this, Clarke error grid (CEG) analysis [52] is employed for a more clinically relevant assessment. The CEG visualizes the relationship between predicted and reference BGLs using a scatterplot divided into five zones, each reflecting the potential clinical impact of the prediction error:
  • Region A: Predictions within 20% of reference values (accurate).
  • Region B: Outside 20% but clinically acceptable.
  • Region C: May lead to unnecessary treatment.
  • Region D: Potentially dangerous failure to detect critical events.
  • Region E: Errors leading to harmful opposite treatments.
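As a sketch, the Zone A criterion, the only zone with a simple closed form, is commonly implemented as below; the full five-zone classification involves additional piecewise boundaries and is omitted here.

```python
def in_zone_a(ref, pred):
    """Clarke Zone A: prediction within 20% of the reference value, or
    both reference and prediction in the hypoglycemic range (< 70 mg/dL).
    Simplified sketch; zones B-E require further piecewise rules."""
    return (ref < 70 and pred < 70) or abs(pred - ref) <= 0.2 * ref
```

The reported A + B percentages are then the fraction of test points falling in the two clinically acceptable zones.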

2.5.4. Sensitivity Analysis

To validate the robustness of the estimated parameters, sensitivity analysis is conducted by training the model under varying constraint weights and learning rates and assessing how these variations affect parameter stability and model outcomes.

2.5.5. Counterfactual Analysis

To enhance interpretability, counterfactual analysis is employed to examine how changes in input features affect model predictions, highlighting key influencing factors [53]. By generating alternative scenarios (“what if” questions), it reveals causal relationships between inputs and model outputs, explaining which factors most significantly influence predictions [54]. This provides insights into the model’s decision-making process and actionable feedback for end users.
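Operationally, a counterfactual probe reduces to re-running a fitted predictor under perturbed inputs and recording how its output changes. A generic sketch is given below; the feature names and the toy linear predictor in the test are purely illustrative, not the actual model.

```python
def counterfactual_sweep(predict, base_inputs, scenarios):
    """Evaluate a predictor under 'what if' input perturbations.

    predict:     callable mapping a dict of input features to a BGL value.
    base_inputs: the factual (baseline) input configuration.
    scenarios:   list of dicts, each overriding selected features.
    Returns a dict keyed by the applied changes."""
    results = {}
    for changes in scenarios:
        inputs = {**base_inputs, **changes}          # apply the override
        results[tuple(sorted(changes.items()))] = predict(inputs)
    return results
```

Comparing the swept predictions against the baseline reveals which features the model's output depends on most strongly, which is exactly the analysis reported in Section 3.5.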

3. Experimental Results and Discussion

3.1. Qualitative Assessment

Figure 3 and Figure 4 depict the evolution of the Q p ( t ) and Q g u t ( t ) outputs over time obtained from the IODE and MODE models, comparing model output (blue line) against ground truth values (orange line). Spanning approximately 600 time steps, the plots show close agreement between predicted and actual values throughout the time series, with only minor deviations, primarily at peak points, where the models sometimes slightly overestimate and at other times underestimate the ground truth. This close alignment demonstrates the models’ strong capability to capture the physiology of insulin and glucose dynamics, both the overall trend and the nuanced fluctuations across temporal phases. These findings are further supported by the results in Table 3, where both models yield consistently low MSE values across all subjects, reinforcing their capacity to faithfully represent the corresponding physiological dynamics.
Additionally, Figure 5 illustrates the dynamic evolution of the gastric emptying rate, k e m t ( t ) , as a function of the amount of glucose present in the stomach, denoted by Q s t o ( t ) . This behavior is quantitatively described by Equation (10), which models the nonlinear regulation of gastric emptying in response to the ingested meal size. As indicated by the equation, the emptying rate k e m t reaches its maximum value k m a x when the stomach is full, corresponding to the peak glucose load. Following this peak, the rate gradually decreases toward the minimum value k m i n as the stomach empties over time. This sigmoidal transition reflects the physiological regulation of gastric emptying as digestion progresses. The visual comparison of predicted and reference k e m t in Figure 5 highlights this progression, illustrating the dynamic adaptation of the rate over time in response to the remaining glucose content. The predicted k e m t value closely follows the trajectory of the reference k e m t , with the primary difference being a lower minimum value k m i n in the predicted profile compared to the reference.
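Nonlinear gastric emptying rates of this kind are commonly expressed in the Dalla Man tanh form, which the sketch below follows; whether Equation (10) uses exactly this parameterization is an assumption here, and all numeric defaults are illustrative rather than the values fitted in this study (`D` is the ingested carbohydrate dose in grams).

```python
import numpy as np

def k_empt(q_sto, D, k_max=0.056, k_min=0.008, b=0.82, c=0.01):
    """Gastric emptying rate as a function of stomach glucose content
    Q_sto (Dalla Man-style tanh formulation). The rate is bounded
    between k_min and k_max, reaching k_max near the peak load."""
    alpha = 5.0 / (2.0 * D * (1.0 - b))   # steepness near the full stomach
    beta = 5.0 / (2.0 * D * c)            # steepness near the empty stomach
    return k_min + (k_max - k_min) / 2.0 * (
        np.tanh(alpha * (q_sto - b * D))
        - np.tanh(beta * (q_sto - c * D)) + 2.0
    )
```

The two tanh terms produce the smooth, bounded transition described above: the rate sits near `k_max` at the peak glucose load and dips toward `k_min` as the stomach partially empties.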
Similarly, Figure 6 illustrates the complex interplay between gastric emptying and glucose metabolism. When a meal is consumed (black X marks), the gastric emptying rate (orange line) determines how quickly food moves into the intestine, directly influencing intestinal glucose concentration (purple line). We can see that when gastric emptying temporarily slows around time step 150, glucose accumulates in the intestine before being absorbed, creating a distinct spike in the intestinal glucose curve. This is followed by gradual increases in blood glucose levels (blue line) and corresponding insulin responses (green line) due to insulin dose administration (gray dots). The physiological significance is profound: gastric emptying acts as a rate-limiting step in postprandial glucose appearance, functioning as a natural mechanism to modulate glucose absorption and prevent extreme blood sugar spikes. The model accurately captures the clinically observed relationships between digestion timing and glucose regulation, showing how variations in emptying rates directly affect the timing and magnitude of blood glucose fluctuations. This relationship also helps explain why gastric function is an important consideration in understanding glucose metabolism and blood glucose patterns.

3.2. Quantitative and Clinical Assessment

Table 4 presents the performance of the BGL predictive model across two prediction horizons, 30 min and 60 min, evaluated using RMSE, MAE, and Clarke Error Grid (CEG) metrics. RMSE, which captures the average deviation between predicted and actual values, indicates a moderate error range of 13.37 to 25.50 for the 30 min horizon, with the lowest error observed for subject 570 and the highest for subject 584. For the 60 min horizon, RMSE values increase, ranging from 24.75 (subject 552) to 43.55 (subject 567), reflecting reduced predictive accuracy over longer intervals. MAE follows a similar trend, increasing with the prediction horizon, which is consistent with the expected degradation in model accuracy over time. Despite this, the model demonstrates strong reliability, with CEG A + B percentages remaining high, between 97.21% and 99.89% for 30 min and above 93% for 60 min, indicating most predictions fall within clinically acceptable boundaries. Additionally, CEG C+D+E values remain low, suggesting minimal critical prediction errors. Overall, Table 4 confirms that while prediction accuracy declines slightly with longer horizons, the model remains robust and clinically acceptable across both intervals, as further supported by Figure 7 and Figure 8.

3.3. State-of-the-Art Comparison

Table 5 and Table 6 compare the performance of the proposed approach against state-of-the-art models across 30 and 60 min prediction horizons, respectively. As shown in Table 5, for the 30 min horizon, MTL-LSTM achieves the lowest RMSE for most patients; however, the proposed approach consistently ranks second and even outperforms all models in some cases (e.g., patients 570 and 596). This highlights its high prediction accuracy, often rivaling or surpassing models like MTL-LSTM and SAN, while clearly outperforming traditional baselines (Regression, GAN) and standard deep models (LSTM, Feed-Forward, Dilated RNN). Overall, Table 5 demonstrates that the proposed approach is a strong competitor at the 30 min horizon, delivering accuracy close to or better than the best-performing models. For the 60 min horizon, Table 6 shows that the proposed approach performs even better, achieving the lowest RMSE for patients 552, 570, and 596, with strong results elsewhere. Notably, MTL-LSTM’s performance drops significantly at this horizon, whereas SAN performs comparatively better. While models like MTL-LSTM and SAN may dominate in specific scenarios, they tend to fluctuate across horizons. In contrast, while our approach shows some variability at the 60 min horizon, it maintains competitive performance across both horizons overall, avoiding the significant performance drops seen in models like MTL-LSTM at longer prediction intervals. Furthermore, it offers more interpretable and physiologically grounded predictions, a quality often lacking in previous methods.

3.4. Parameter Estimation and Sensitivity Analysis

The estimated parameter values across different prediction horizons are summarized in Table 7. Since models trained for different horizons start with distinct initializations, the convergence of their parameter estimates to similar values (less than one percentage change between 30 min and 60 min predictions) demonstrates consistent physiological representations regardless of forecast timeframe. To further investigate parameter stability, we trained the model using various learning rates (0.01, 0.001, and 0.0001). As shown in Figure 9, the parameter values consistently converge to the same point, regardless of the learning rate. This convergence suggests that the parameters tend to possess well-constrained and physiologically meaningful values that the optimization process can reliably identify across varying initializations and learning rates, indicating that the available data and model constraints help anchor them to consistent estimates. As a result, the refined parameters exhibit physiological plausibility and internal consistency, supporting their potential use in physiological interpretation and predictive tasks.
Analysis of inter-patient variability across all model parameters reveals low to moderate variability. ke shows low inter-patient variability (CV = 11.55%); ka2 exhibits moderate variability (CV = 26.01%), with one patient (patient 544: 0.0305) showing notably reduced absorption; kabs displays moderate variability (CV = 19.33%); kmax shows moderate variability (CV = 22.72%); and kmin exhibits the highest variability (CV = 37.53%), with patient 559 (0.0104) showing extremely slow minimal emptying and patient 588 (0.1129) showing elevated rates. Based on mixed-effects meal and simulator studies [60,61,62], where variability is encoded through parameter distributions, the observed ke and kabs show slightly lower variability than the reported ranges of 20–50% [60,61,62]; the other parameters are broadly consistent with reported ranges. This slight compression likely reflects our limited sample size (n = 12). Unlike black-box models, our interpretable parameters enable clinical actionability. Although our model achieves a higher RMSE of 16.08 mg/dL compared to 13.27 mg/dL for state-of-the-art deep learning (Table 5), this modest 2.81 mg/dL trade-off is clinically justifiable: clinicians can identify that patient 559’s hyperglycemia stems from slow gastric emptying (kmin = 0.0104) and adjust insulin timing accordingly, an insight impossible with black-box models.
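The CV values above follow the usual definition; whether the study uses the sample or population standard deviation is not stated, so the sample form (ddof = 1) below is our assumption.

```python
import numpy as np

def coefficient_of_variation(values):
    """Inter-patient coefficient of variation in percent:
    CV = 100 * std / mean (sample std, ddof=1, assumed here)."""
    v = np.asarray(values, dtype=float)
    return float(100.0 * v.std(ddof=1) / v.mean())
```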
The sensitivity analyses examining variations in individual parameters are presented in Figure 10, Figure 11 and Figure 12, demonstrating physiologically plausible response patterns. The model’s predictive output exhibits a direct relationship with the insulin elimination rate from the body ( k e ) and an inverse relationship with the insulin absorption rate into plasma ( k a 2 ). Despite variations in the insulin clearance rate ( k e ) ranging from 0.16 to 1, the predicted glucose levels remain relatively consistent, whereas even small changes in k a 2 produce visible fluctuations, indicating that the model’s predicted outcomes are more sensitive to variations in k a 2 than to those in k e . Additionally, Figure 12 illustrates the predicted blood glucose levels over 100 time steps under varying intestinal absorption rates ( k a b s ranging from 0.1 to 1), benchmarked against ground truth data (black line). Notably, despite a tenfold variation in k a b s , the predicted glucose trajectories remain remarkably consistent, indicating that the model output is relatively insensitive to changes in this parameter. Among the tested values, intermediate absorption rates, particularly k a b s = 0.640 (red), consistently produce slightly elevated glucose levels compared to both lower and higher k a b s values, a subtle non-monotonic behavior. These results suggest that lower k a b s values reduce intestinal glucose absorption, resulting in a prolonged but moderate elevation in blood glucose, while higher k a b s values accelerate glucose appearance in the bloodstream, producing sharper peaks followed by faster returns to baseline.

3.5. Counterfactual Analysis

Figure 13 depicts a detailed interpretability analysis of the model’s predictive behavior by illustrating how its blood glucose level (BGL) estimates respond to variations in key input features, namely, meal intake and bolus insulin dosage. For this experiment, two randomly selected patients, 570 and 575, are considered. Beginning from the original input configuration of (meal, bolus) = (0, 0), a series of counterfactual scenarios such as (0, 5), (30, 5), and (40, 5) are evaluated to simulate hypothetical treatment choices. The x-axis in the figure represents these alternative input conditions, while the y-axis displays the corresponding BGL predicted by the model. Under baseline conditions, the model predicts BGLs of 136 mg/dL for patient 570 and 102 mg/dL for patient 575. Adding 5 units of bolus insulin without carbohydrates minimally reduces patient 570’s BGL to 134 mg/dL, while patient 575 shows a substantial decrease to 87 mg/dL, demonstrating insulin’s glucose-lowering effect. When meal intake increases to 30–40 g with 5 units of bolus, BGLs rise to 135–136 mg/dL for patient 570 and 95–99 mg/dL for patient 575, reflecting carbohydrate impact. For 100 g meal intake, patient 575 exhibits markedly greater glucose elevation (102–131 mg/dL) than patient 570 (136–140 mg/dL). Overall, patient 575 shows significantly higher variability across all scenarios, attributable to differences in individual physiological parameters, highlighting the model’s ability to capture patient-specific metabolic responses. The higher insulin clearance rate k e of patient 575 (0.6880) compared to patient 570 (0.5981), as shown in Table 7, indicates faster insulin clearance resulting in elevated BGL predictions, suggesting that the model accounts for reduced insulin efficacy due to shorter insulin activity. Significant early insulin activity is also seen in the graph, owing to rapid insulin absorption associated with a higher insulin absorption rate k a 2 .
Additionally, under identical meal and bolus conditions, individuals with slower gastric emptying exhibited a delayed and lower glycemic response; this is reflected in reduced predicted BGLs for 570, as shown in Figure 14.
In the context of postprandial glucose dynamics, gastric emptying plays a critical role in determining the rate at which ingested glucose appears in the bloodstream [63]. Analysis of these two patients’ gastric emptying profiles following varying meal sizes, as illustrated in Figure 14, revealed that patient 575 maintained a relatively high emptying rate even after a large glucose load (100 g), as indicated by the orange dotted line, whereas patient 570 exhibited a marked suppression in gastric emptying under the same condition (blue dotted line). As a result, patient 575 is likely to experience a more rapid and pronounced postprandial rise in blood glucose due to faster delivery of glucose to the intestine and subsequent absorption. In contrast, patient 570’s slower gastric emptying suggests a more gradual glucose appearance, potentially leading to a flatter glycemic response. Overall, the model not only produces accurate numerical predictions but also encodes physiologically coherent relationships between input features and blood glucose levels. It captures the complex interplay between insulin action, carbohydrate absorption, and metabolic parameters, thereby explaining its internal decision-making process and enhancing trust in its predictions.

3.6. Ablation Study

Table 8 presents an ablation study comparing two model configurations, the baseline BGL predictive network (BGN) and the proposed PIBGN, across 12 subjects at 30 and 60 min prediction horizons, where lower RMSE values indicate better performance. The baseline BGN consists of only the predictive network with BGL as a single input, without the NODEs. The results reveal a consistent yet nuanced pattern: at the 30 min horizon, PIBGN outperforms BGN for all 12 subjects, with improvements ranging from modest (0.34 for subject 570) to substantial (3.98 for subject 575). At the 60 min horizon, the advantage of PIBGN becomes less uniform: PIBGN still outperforms BGN in 8 of 12 subjects (544, 552, 559, 575, 584, 588, 591, 596), often by a large margin, while BGN performs slightly better in 4 subjects (540, 563, 567, 570), though the differences are minimal. Overall, Table 8 confirms that the inclusion of physiological components in PIBGN consistently improves prediction accuracy at the 30 min horizon, with results remaining generally competitive at the 60 min horizon, albeit with some variability.
Taken together, the state-of-the-art comparison and ablation study demonstrate that PIBGN achieves consistently competitive performance at the 30 min prediction horizon (Table 5). Performance at the 60 min horizon is variable (Table 6), with PIBGN showing advantages over MTL-LSTM for some subjects but not others. For a small number of subjects, the baseline BGN model achieves superior performance at this longer horizon (Table 8). This suggests that the advantages of physiological integration are more pronounced at shorter prediction horizons, while longer-horizon predictions are mixed. While predictive accuracy at longer horizons is limited, PIBGN predictions remain physiologically plausible and interpretable, which offer value in clinical settings where mechanistic consistency is relevant alongside numerical accuracy.

4. Limitations and Future Directions

Several limitations of the present study should be acknowledged. As the primary research question focuses on the value of integrating physiological knowledge into a deep learning framework, rather than replacing it entirely with physiological modeling alone, the baseline deep learning model is considered the most appropriate comparator. As such, a standalone physiological forward simulation baseline is not included in the current study. Beyond this, although the ablation study confirms that embedding physiological knowledge into the framework enhances predictive performance, the individual contribution of the transfer learning component remains unquantified. Both omissions are acknowledged as limitations and represent concrete directions for future investigation, as they would more fully disentangle the respective contributions of the physiological model structure, the deep learning architecture, and the transfer learning component.
The present study leverages the OhioT1DM dataset as an initial investigation within a well-characterized cohort of 12 Type 1 diabetes subjects, yielding meaningful insights into parameter identifiability and inter-patient variability. Expanding the validation framework to encompass larger and more diverse cohorts, alongside expert clinical evaluation, represents a natural and important direction for future research to further strengthen the generalizability of the findings to the broader Type 1 diabetes population. It is further worth noting that the sensitivity analysis revealed limited influence of certain parameters, particularly kabs, on predictive output. This observation may reflect inherent characteristics of the model formulation under current assumptions, potential parameter interactions, or the scope of patient diversity within the cohort, each representing an informative avenue for further investigation. Additionally, the current framework leverages CGM readings, insulin doses, and meal intake as primary inputs, offering a focused and clinically accessible configuration. The integration of additional contextual variables available within the dataset presents a promising opportunity to further enhance predictive performance in subsequent work.

5. Conclusions

In this study, we present a novel physiology-informed blood glucose predictive model that leverages neural ODEs to learn insulin and meal dynamics and feed them into the predictive model. The proposed model delivers accurate and physiologically interpretable predictions of blood glucose levels, validated through multiple validation methodologies. The ODE results demonstrate the model’s strong capability to represent the underlying physiology of insulin and glucose dynamics, accurately reproducing clinically observed relationships between digestion timing and glucose regulation. The findings also show that variations in gastric emptying rates directly affect both the timing and amplitude of blood glucose fluctuations.
Furthermore, analytical and clinical evaluations confirm the model’s robust and clinically acceptable predictive performance. Parameter identifiability analysis suggests that the model tends to converge to well-constrained parameter values across different initializations and learning rates, providing preliminary evidence of solution stability. While certain parameters contribute less directly to predictive accuracy, sensitivity analysis reveals that model outputs remain responsive to variations across these parameters, affirming their physiological relevance. The model’s interpretability is further substantiated through counterfactual analysis, which demonstrates its capacity not only to generate precise numerical forecasts but also to capture meaningful and explainable relationships between insulin, meals, and BGLs, a capability that supports clinical decision-making and is notably absent in most state-of-the-art methods. Collectively, these analyses provide preliminary evidence that the adjusted parameters preserve physiological interpretability, yielding plausible and clinically coherent outputs. While the model achieved marginally higher RMSE compared to some state-of-the-art deep learning methods, the physiological interpretability, mechanistic consistency, personalization, and clinical transparency it offers represent meaningful advantages over purely data-driven approaches, qualities that justify this modest accuracy trade-off in real-world medical applications, where physiological consistency and explainability are as important as numerical accuracy.

Author Contributions

Conceptualization, S.G.; methodology, S.G.; software, S.G. and T.C.; validation, S.G., T.C., M.G. and C.W.O.; formal analysis, S.G. and C.W.O.; investigation, S.G. and T.C.; resources, S.G.; data curation, S.G.; writing—original draft preparation, S.G.; writing—review and editing, S.G., T.C., M.G. and C.W.O.; visualization, T.C., M.G. and C.W.O.; supervision, T.C., M.G. and C.W.O.; project administration, C.W.O. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The Ohio datasets utilized in this study are publicly available upon request via the following link: https://ohio.qualtrics.com/jfe/form/SV_02QtWEVm7ARIKIl (accessed on 19 September 2023).

Conflicts of Interest

This work was conducted while C.W. Omlin was affiliated with the University of Agder. The author is currently affiliated with Pingla Institute, Australia. Pingla Institute had no role in the design, execution, interpretation, or writing of this study. S.G. is also employed by University of Oslo (UiO), where she works on neural network methods for engineering applications. The work in UiO is unrelated to the present study.

Abbreviations

The following abbreviations are used in this manuscript:
BGL: Blood glucose level
PINN: Physiology-informed neural network
CGM: Continuous glucose monitoring
LSTM: Long short-term memory
CNN: Convolutional neural networks
TCN: Temporal CNN
NODE: Neural ordinary differential equations
MODE: Meal neural ordinary differential equations
IODE: Insulin neural ordinary differential equations
PIBGN: Physiology-informed blood glucose prediction network
RMSE: Root mean square error
MAE: Mean absolute error
CEG: Clarke error grid

References

  1. Georga, E.I.; Protopappas, V.C.; Ardigo, D.; Marina, M.; Zavaroni, I.; Polyzos, D.; Fotiadis, D.I. Multivariate prediction of subcutaneous glucose concentration in type 1 diabetes patients based on support vector regression. IEEE J. Biomed. Health Inform. 2012, 17, 71–81. [Google Scholar] [CrossRef]
  2. Li, K.; Daniels, J.; Liu, C.; Herrero, P.; Georgiou, P. Convolutional recurrent neural networks for glucose prediction. IEEE J. Biomed. Health Inform. 2019, 24, 603–613. [Google Scholar] [CrossRef]
  3. Zhu, T.; Li, K.; Herrero, P.; Georgiou, P. Personalized blood glucose prediction for type 1 diabetes using evidential deep learning and meta-learning. IEEE Trans. Biomed. Eng. 2022, 70, 193–204. [Google Scholar] [CrossRef]
  4. Zhu, T.; Li, K.; Herrero, P.; Georgiou, P. Deep learning for diabetes: A systematic review. IEEE J. Biomed. Health Inform. 2020, 25, 2744–2757. [Google Scholar] [CrossRef] [PubMed]
  5. Aliberti, A.; Pupillo, I.; Terna, S.; Macii, E.; Di Cataldo, S.; Patti, E.; Acquaviva, A. A multi-patient data-driven approach to blood glucose prediction. IEEE Access 2019, 7, 69311–69325. [Google Scholar] [CrossRef]
  6. Zhu, T.; Li, K.; Herrero, P.; Georgiou, P. Glugan: Generating personalized glucose time series using generative adversarial networks. IEEE J. Biomed. Health Inform. 2023, 27, 5122–5133. [Google Scholar] [CrossRef]
  7. Deng, Y.; Lu, L.; Aponte, L.; Angelidi, A.M.; Novak, V.; Karniadakis, G.E.; Mantzoros, C.S. Deep transfer learning and data augmentation improve glucose levels prediction in type 2 diabetes patients. NPJ Digit. Med. 2021, 4, 109. [Google Scholar] [CrossRef]
  8. Kovatchev, B. Automated closed-loop control of diabetes: The artificial pancreas. Bioelectron. Med. 2018, 4, 14. [Google Scholar] [CrossRef]
  9. Oviedo, S.; Vehí, J.; Calm, R.; Armengol, J. A review of personalized blood glucose prediction strategies for T1DM patients. Int. J. Numer. Methods Biomed. Eng. 2017, 33, e2833. [Google Scholar] [CrossRef] [PubMed]
  10. Woldaregay, A.Z.; Årsand, E.; Walderhaug, S.; Albers, D.; Mamykina, L.; Botsis, T.; Hartvigsen, G. Data-driven modeling and prediction of blood glucose dynamics: Machine learning applications in type 1 diabetes. Artif. Intell. Med. 2019, 98, 109–134. [Google Scholar] [CrossRef] [PubMed]
  11. Xie, J.; Wang, Q. Benchmarking machine learning algorithms on blood glucose prediction for type I diabetes in comparison with classical time-series models. IEEE Trans. Biomed. Eng. 2020, 67, 3101–3124. [Google Scholar] [CrossRef] [PubMed]
  12. Afsaneh, E.; Sharifdini, A.; Ghazzaghi, H.; Ghobadi, M.Z. Recent applications of machine learning and deep learning models in the prediction, diagnosis, and management of diabetes: A comprehensive review. Diabetol. Metab. Syndr. 2022, 14, 196. [Google Scholar] [CrossRef]
  13. Liu, K.; Li, L.; Ma, Y.; Jiang, J.; Liu, Z.; Ye, Z.; Liu, S.; Pu, C.; Chen, C.; Wan, Y.; et al. Machine learning models for blood glucose level prediction in patients with diabetes mellitus: Systematic review and network meta-analysis. JMIR Med. Inform. 2023, 11, e47833. [Google Scholar] [CrossRef]
  14. Lara-Benítez, P.; Carranza-García, M.; Riquelme, J.C. An experimental review on deep learning architectures for time series forecasting. Int. J. Neural Syst. 2021, 31, 2130001. [Google Scholar] [CrossRef]
  15. Sezer, O.B.; Gudelek, M.U.; Ozbayoglu, A.M. Financial time series forecasting with deep learning: A systematic literature review: 2005–2019. Appl. Soft Comput. 2020, 90, 106181. [Google Scholar] [CrossRef]
  16. Lim, B.; Zohren, S. Time-series forecasting with deep learning: A survey. Philos. Trans. R. Soc. A 2021, 379, 20200209. [Google Scholar] [CrossRef]
  17. Gasparin, A.; Lukovic, S.; Alippi, C. Deep learning for time series forecasting: The electric load case. CAAI Trans. Intell. Technol. 2022, 7, 1–25. [Google Scholar] [CrossRef]
  18. Wolf, T.; Debut, L.; Sanh, V.; Chaumond, J.; Delangue, C.; Moi, A.; Cistac, P.; Rault, T.; Louf, R.; Funtowicz, M.; et al. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Online, 16–20 November 2020; pp. 38–45. [Google Scholar]
  19. Lee, S.M.; Kim, D.Y.; Woo, J. Glucose transformer: Forecasting glucose level and events of hyperglycemia and hypoglycemia. IEEE J. Biomed. Health Inform. 2023, 27, 1600–1611. [Google Scholar] [CrossRef]
  20. Cui, R.; Hettiarachchi, C.; Nolan, C.J.; Daskalaki, E.; Suominen, H. Personalised Short-Term Glucose Prediction via Recurrent Self-Attention Network. In Proceedings of the 2021 IEEE 34th International Symposium on Computer-Based Medical Systems (CBMS), Aveiro, Portugal, 7–9 June 2021; pp. 154–159. [Google Scholar] [CrossRef]
  21. Armandpour, M.; Kidd, B.; Du, Y.; Huang, J.Z. Deep personalized glucose level forecasting using attention-based recurrent neural networks. In Proceedings of the 2021 IEEE International Joint Conference on Neural Networks (IJCNN), Shenzhen, China, 18–22 July 2021; pp. 1–8. [Google Scholar]
  22. Mirshekarian, S.; Shen, H.; Bunescu, R.; Marling, C. LSTMs and neural attention models for blood glucose prediction: Comparative experiments on real and synthetic data. In Proceedings of the 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Berlin, Germany, 23–27 July 2019; pp. 706–712. [Google Scholar]
  23. Ghimire, S.; Celik, T.; Gerdes, M.; Omlin, C.W. Deep learning for blood glucose level prediction: How well do models generalize across different data sets? PLoS ONE 2024, 19, e0310801. [Google Scholar] [CrossRef] [PubMed]
  24. Seo, W.; Park, S.W.; Kim, N.; Jin, S.M.; Park, S.M. A personalized blood glucose level prediction model with a fine-tuning strategy: A proof-of-concept study. Comput. Methods Programs Biomed. 2021, 211, 106424. [Google Scholar] [CrossRef]
  25. Bhargav, S.; Kaushik, S.; Dutt, V. Temporal Convolutional Networks Involving Multi-Patient Approach for Blood Glucose Level Predictions. In Proceedings of the 2021 IEEE International Conference on Computational Performance Evaluation (ComPE), Shillong, India, 1–3 December 2021; pp. 288–294. [Google Scholar]
  26. Ghimire, S.; Çelik, T.; Gerdes, M.; Omlin, C.W. Physiology-Guided Blood Glucose Predictive Model Using Minimal Blood Glucose Dynamics. In Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2025), Porto, Portugal, 20–22 February 2025; pp. 984–990. [Google Scholar]
  27. Bertachi, A.; Biagi, L.; Contreras, I.; Luo, N.; Vehí, J. Prediction of blood glucose levels and nocturnal hypoglycemia using physiological models and artificial neural networks. In Proceedings of the 3rd International Workshop on Knowledge Discovery in Healthcare Data Co-Located with the 27th International Joint Conference on Artificial Intelligence and the 23rd European Conference on Artificial Intelligence (IJCAI-ECAI 2018), Stockholm, Sweden, 13–19 July 2018; pp. 85–90. [Google Scholar]
  28. Contreras, I.; Bertachi, A.; Biagi, L.; Vehí, J.; Oviedo, S. Using grammatical evolution to generate short-term blood glucose prediction models. In Proceedings of the 3rd International Workshop on Knowledge Discovery in Healthcare Data Co-Located with the 27th International Joint Conference on Artificial Intelligence and the 23rd European Conference on Artificial Intelligence (IJCAI-ECAI 2018), Stockholm, Sweden, 13–19 July 2018; pp. 91–96. [Google Scholar]
  29. Contreras, I.; Oviedo, S.; Vettoretti, M.; Visentin, R.; Vehí, J. Personalized blood glucose prediction: A hybrid approach using grammatical evolution and physiological models. PLoS ONE 2017, 12, e0187754. [Google Scholar] [CrossRef]
  30. Erdos, B.; van Sloun, B.; Goossens, G.H.; O’Donovan, S.D.; de Galan, B.E.; van Greevenbroek, M.M.; Stehouwer, C.D.; Schram, M.T.; Blaak, E.E.; Adriaens, M.E.; et al. Quantifying postprandial glucose responses using a hybrid modeling approach: Combining mechanistic and data-driven models in The Maastricht Study. PLoS ONE 2023, 18, e0285820. [Google Scholar] [CrossRef]
  31. Mougiakakou, S.G.; Prountzou, A.; Iliopoulou, D.; Nikita, K.S.; Vazeou, A.; Bartsocas, C.S. Neural network based glucose-insulin metabolism models for children with type 1 diabetes. In Proceedings of the 2006 International Conference of the IEEE Engineering in Medicine and Biology Society, New York, NY, USA, 31 August–3 September 2006; pp. 3545–3548. [Google Scholar]
  32. Sun, X.; Rashid, M.M.; Sevil, M.; Hobbs, N.; Brandt, R.; Askari, M.R.; Shahidehpour, A.; Cinar, A. Prediction of Blood Glucose Levels for People with Type 1 Diabetes using Latent-Variable-based Model. In Proceedings of the 5th International Workshop on Knowledge Discovery in Healthcare Data Co-Located with 24th European Conference on Artificial Intelligence, KDH@ECAI 2020, Santiago de Compostela, Spain, 29–30 August 2020; pp. 115–119. [Google Scholar]
  33. Yazdani, A.; Lu, L.; Raissi, M.; Karniadakis, G.E. Systems biology informed deep learning for inferring parameters and hidden dynamics. PLoS Comput. Biol. 2020, 16, e1007575. [Google Scholar] [CrossRef]
  34. De Carli, S.; Licini, N.; Previtali, D.; Previdi, F.; Ferramosca, A. Integrating biological-informed recurrent neural networks for glucose-insulin dynamics modeling. IFAC-PapersOnLine 2025, 59, 91–96. [Google Scholar] [CrossRef]
  35. Multerer, L.; Acquistapace, M.; Forgione, M.; Azzimonti, L. Physics-Informed Neural Networks for Hidden Insulin Dynamics Estimation from Glucose Data. In Proceedings of the International Conference on Artificial Intelligence in Medicine; Springer: Berlin/Heidelberg, Germany, 2025; pp. 283–287. [Google Scholar]
  36. Zou, B.J.; Tian, L. Automatic and Structure-Aware Sparsification of Hybrid Neural ODEs with Application to Glucose Prediction. In Proceedings of the Fourteenth International Conference on Learning Representations, Rio de Janeiro, Brazil, 23–27 April 2026. [Google Scholar]
  37. Roquemen-Echeverri, V.; Kushner, T.; Jacobs, P.G.; Mosquera-Lopez, C. A Physiologically-Constrained Neural Network Digital Twin Framework for Replicating Glucose Dynamics in Type 1 Diabetes. arXiv 2025, arXiv:2508.05705. [Google Scholar] [CrossRef]
  38. Wang, W.; Pei, R.; Li, D.; Liu, S.; Geng, Y.; Wang, S. A physics-informed glucose-insulin neural network model for glucose prediction. Tsinghua Sci. Technol. 2025, 31, 9010140. [Google Scholar] [CrossRef]
  39. Wang, M.; Zhang, H.; Song, R. Blood glucose prediction algorithm via pinn under bergman’s minimal model constraints. In Proceedings of the 2025 IEEE 14th Data Driven Control and Learning Systems (DDCLS), Wuxi, China, 9–11 May 2025; pp. 1797–1802. [Google Scholar]
  40. Chen, R.T.; Rubanova, Y.; Bettencourt, J.; Duvenaud, D.K. Neural ordinary differential equations. Adv. Neural Inf. Process. Syst. 2018, 31. [Google Scholar]
  41. Faggionato, E.; Schiavon, M.; Ekhlaspour, L.; Buckingham, B.A.; Dalla Man, C. The minimally-invasive oral glucose minimal model: Estimation of gastric retention, glucose rate of appearance, and insulin sensitivity from type 1 diabetes data collected in real-life conditions. IEEE Trans. Biomed. Eng. 2023, 71, 977–986. [Google Scholar] [CrossRef] [PubMed]
  42. Schiavon, M.; Dalla Man, C.; Cobelli, C. Modeling subcutaneous absorption of fast-acting insulin in type 1 diabetes. IEEE Trans. Biomed. Eng. 2017, 65, 2079–2086. [Google Scholar] [CrossRef]
  43. Dalla Man, C.; Camilleri, M.; Cobelli, C. A system model of oral glucose absorption: Validation on gold standard data. IEEE Trans. Biomed. Eng. 2006, 53, 2472–2478. [Google Scholar] [CrossRef] [PubMed]
  44. Su, X.; Ji, W.; An, J.; Ren, Z.; Deng, S.; Law, C.K. Kinetics parameter optimization via neural ordinary differential equations. arXiv 2022, arXiv:2209.01862. [Google Scholar] [CrossRef]
  45. Raissi, M.; Perdikaris, P.; Karniadakis, G.E. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. J. Comput. Phys. 2019, 378, 686–707. [Google Scholar] [CrossRef]
  46. Faggionato, E.; Schiavon, M.; Dalla Man, C. Modeling between-subject variability in subcutaneous absorption of a fast-acting insulin analogue by a nonlinear mixed effects approach. Metabolites 2021, 11, 235. [Google Scholar] [CrossRef]
  47. Bergman, R.N. Toward physiological understanding of glucose tolerance: Minimal-model approach. Diabetes 1989, 38, 1512–1527. [Google Scholar] [CrossRef]
  48. Tong, H. Functional linear regression with Huber loss. J. Complex. 2023, 74, 101696. [Google Scholar] [CrossRef]
  49. Marling, C.; Bunescu, R. The OhioT1DM dataset for blood glucose level prediction: Update 2020. CEUR Workshop Proc. 2020, 2675, 71. [Google Scholar]
  50. Ohio University. Requesting a Data Use Agreement for the OhioT1DM Dataset. Available online: https://ohio.qualtrics.com/jfe/form/SV_02QtWEVm7ARIKIl (accessed on 13 March 2026).
  51. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  52. Clarke, W.L. The original Clarke error grid analysis (EGA). Diabetes Technol. Ther. 2005, 7, 776–779. [Google Scholar] [CrossRef]
  53. Wachter, S.; Mittelstadt, B.; Russell, C. Counterfactual explanations without opening the black box: Automated decisions and the GDPR. Harv. J. Law Technol. 2017, 31, 841. [Google Scholar] [CrossRef]
  54. Mothilal, R.K.; Sharma, A.; Tan, C. Explaining machine learning classifiers through diverse counterfactual explanations. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, Barcelona, Spain, 27–30 January 2020; pp. 607–617. [Google Scholar]
  55. Martinsson, J.; Schliep, A.; Eliasson, B.; Meijner, C.; Persson, S.; Mogren, O. Automatic blood glucose prediction with confidence using recurrent neural networks. In Proceedings of the 3rd International Workshop on Knowledge Discovery in Healthcare Data, KDH@IJCAI-ECAI 2018, Stockholm, Sweden, 13–19 July 2018; pp. 64–68. [Google Scholar]
  56. Zhu, T.; Li, K.; Chen, J.; Herrero, P.; Georgiou, P. Dilated recurrent neural networks for glucose forecasting in type 1 diabetes. J. Healthc. Inform. Res. 2020, 4, 308–324. [Google Scholar] [CrossRef] [PubMed]
  57. Nemat, H.; Khadem, H.; Elliott, J.; Benaissa, M. Data fusion of activity and CGM for predicting blood glucose levels. In Proceedings of the 5th International Workshop on Knowledge Discovery in Healthcare Data, Santiago de Compostela, Spain, 29–30 August 2020; CEUR Workshop Proceedings. Volume 2675, pp. 120–124. [Google Scholar]
  58. Zhu, T.; Yao, X.; Li, K.; Herrero, P.; Georgiou, P. Blood glucose prediction for type 1 diabetes using generative adversarial networks. CEUR Workshop Proc. 2020, 2675, 90–94. [Google Scholar]
  59. Shuvo, M.M.H.; Islam, S.K. Deep Multitask Learning by Stacked Long Short-Term Memory for Predicting Personalized Blood Glucose Concentration. IEEE J. Biomed. Health Inform. 2023, 27, 1612–1623. [Google Scholar] [CrossRef]
  60. Wilinska, M.E.; Chassin, L.J.; Acerini, C.L.; Allen, J.M.; Dunger, D.B.; Hovorka, R. Simulation environment to evaluate closed-loop insulin delivery systems in type 1 diabetes. J. Diabetes Sci. Technol. 2010, 4, 132–144. [Google Scholar] [CrossRef]
  61. Visentin, R.; Dalla Man, C.; Cobelli, C. One-day Bayesian cloning of type 1 diabetes subjects: Toward a single-day UVA/Padova type 1 diabetes simulator. IEEE Trans. Biomed. Eng. 2016, 63, 2416–2424. [Google Scholar] [CrossRef] [PubMed]
  62. Alskär, O.; Bagger, J.I.; Røge, R.M.; Knop, F.K.; Karlsson, M.O.; Vilsbøll, T.; Kjellsson, M.C. Semimechanistic model describing gastric emptying and glucose absorption in healthy subjects and patients with type 2 diabetes. J. Clin. Pharmacol. 2016, 56, 340–348. [Google Scholar] [CrossRef] [PubMed]
  63. Jalleh, R.J.; Jones, K.L.; Rayner, C.K.; Marathe, C.S.; Wu, T.; Horowitz, M. Normal and disordered gastric emptying in diabetes: Recent insights into (patho) physiology, management and impact on glycaemic control. Diabetologia 2022, 65, 1981–1993. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Schematic diagram of the multi-stage training of the proposed physiology-informed BGL predictive network with parameter freezing: Stage 1 freezes parameters θ_k1 while training θ_k2 for output prediction; Stage 2 unfreezes both θ_k1 and θ_k2 and trains end to end with the BGL predictive network. Here, k ∈ {I, M} indicates the parameters of the IODE and MODE models. The framework integrates multiple validation methods (predictive performance, clinical validation, comparative analysis, sensitivity analysis, explainability/interpretability analysis), allowing visualization and interpretability to ensure reliable prediction and parameter estimation.
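The staged schedule described in the Figure 1 caption can be sketched in miniature: Stage 1 updates only the predictive parameters θ_k2 while the physiological parameters θ_k1 stay frozen, and Stage 2 releases both groups for joint training. The pure-Python sketch below is illustrative only; the update rule, learning rate, gradient values, and the names `sgd_step`, `theta_k1`, and `theta_k2` are assumptions, not the authors' implementation.

```python
# Illustrative sketch of two-stage training with parameter freezing
# (hypothetical values; not the paper's actual training code).

def sgd_step(params, grads, frozen, lr=0.1):
    """Apply one gradient step, skipping any parameter listed in `frozen`."""
    return {name: (value if name in frozen else value - lr * grads[name])
            for name, value in params.items()}

params = {"theta_k1": 1.0, "theta_k2": 1.0}  # physiological / predictive groups
grads = {"theta_k1": 0.5, "theta_k2": 0.5}   # stand-in gradients

# Stage 1: the physiological parameters theta_k1 are frozen.
params = sgd_step(params, grads, frozen={"theta_k1"})
assert params["theta_k1"] == 1.0  # unchanged while frozen

# Stage 2: unfreeze both groups and train end to end.
params = sgd_step(params, grads, frozen=set())
```

In a deep learning framework, the same effect is typically achieved by toggling gradient tracking per parameter group rather than masking updates by hand.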
Figure 2. Overall training process of the insulin kinetics model using neural ordinary differential equation.
Figure 3. Comparison of the temporal evolution of the plasma insulin amount (ground truth Q_p(t)), obtained from simulations of the insulin mathematical model, with that predicted by the IODE model (predicted Q_p(t)), to evaluate the IODE's accuracy in capturing physiological insulin kinetics.
Figure 4. Comparison of the temporal evolution of glucose present in the intestine (ground truth Q_gut(t)), obtained from simulations of the meal mathematical model, with that predicted by the MODE model (predicted Q_gut(t)), to evaluate the MODE's accuracy in capturing meal absorption kinetics.
Figure 5. Comparison of the temporal evolution of the gastric emptying rate (reference k_emt(t)), obtained from simulations of the meal mathematical model, with that predicted by the MODE model (predicted k_emt(t)).
Figure 6. Graph illustrating the complex interplay between gastric emptying and glucose metabolism over time. The plot shows BGLs (blue line), gastric emptying rate (orange line), intestinal glucose concentration (purple dashed line), plasma insulin levels (green line), meal timing (black X) and insulin doses (gray dots). The plot demonstrates how variations in gastric emptying rate or insulin amount modulate glucose appearance in the intestine and subsequent blood glucose responses.
Figure 7. BG trajectory plot comparing predicted and ground truth values over a two-day period at a 30 min prediction horizon, with glycemic zones indicated and extreme glycemic events highlighted by red and purple circles.
Figure 8. Clarke error grid plot for subject 540 at the 30 min prediction horizon, illustrating the clinical accuracy of predicted blood glucose values relative to reference measurements. The grid divides predictions into Zones A–E, ranging from clinically accurate or acceptable (A–B) to increasingly inaccurate and potentially harmful (C–E).
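To illustrate how a Clarke error grid like Figure 8 is scored, the sketch below encodes a commonly used simplified test for the clinically accurate Zone A: the prediction deviates from the reference by at most 20%, or both readings are hypoglycemic (below 70 mg/dL). This is a hedged approximation, the full grid's Zone B-E boundaries are more intricate, and the sample pairs are invented, not study data.

```python
def in_zone_a(ref, pred):
    """Simplified Clarke error grid Zone A test: prediction within 20% of
    the reference, or both readings below 70 mg/dL. The full grid (Zones
    A-E) uses additional piecewise boundaries."""
    return abs(pred - ref) <= 0.2 * ref or (ref < 70 and pred < 70)

# Invented (reference, predicted) pairs in mg/dL.
pairs = [(100, 110), (180, 150), (60, 55), (200, 120)]
zone_a_share = sum(in_zone_a(r, p) for r, p in pairs) / len(pairs)
```

Aggregating `zone_a_share` over all test predictions gives the Zone A percentage reported per subject in Table 4.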
Figure 9. Visualization of the evolution of parameters k_e and k_a2, demonstrating the convergence pattern with different initial learning rates (0.01 to 0.0001).
Figure 10. Sensitivity analysis of parameter k_e (insulin elimination rate) describing the relationship between k_e and predicted blood glucose levels.
Figure 11. Sensitivity analysis of parameter k_a2 (insulin absorption rate) describing the relationship between k_a2 and predicted blood glucose levels.
Figure 12. Sensitivity analysis of parameter k_abs (intestinal absorption rate of glucose) describing the relationship between k_abs and predicted blood glucose levels.
Figure 13. Visualization of predicted BGL values for counterfactual scenarios with varying meals and boluses for two patients, 570 and 575, represented by blue and orange lines, respectively. The red dashed line indicates the current observed value of BGL.
Figure 14. Visualization of gastric emptying rates over time (pre- and post-meal) for patients 570 (blue lines) and 575 (orange lines) with varying meal sizes (0 to 100) and insulin amounts (0 to 10).
Table 1. Overview of sample composition across years, genders, and age groups.

Year | Gender | Age (yrs) | PID | Train Sample Sizes | Test Sample Sizes
2018 | Female | 40–60 | 591 | 10,847 | 2760
2018 | Female | 40–60 | 588 | 12,640 | 2791
2018 | Female | 40–60 | 575 | 11,866 | 2590
2018 | Female | 40–60 | 559 | 10,796 | 2514
2018 | Male | 40–60 | 570 | 10,982 | 2745
2018 | Male | 40–60 | 563 | 12,124 | 2570
2020 | Male | 40–60 | 544 | 10,623 | 2704
2020 | Male | 40–60 | 584 | 12,150 | 2653
2020 | Male | 60–80 | 596 | 10,877 | 2731
2020 | Male | 20–40 | 540 | 11,947 | 2884
2020 | Male | 20–40 | 552 | 9080 | 2352
2020 | Female | 20–40 | 567 | 10,858 | 2377
Table 2. Summary of hyperparameter tuning results and optimal model configuration.

Model | Hyperparameter | Range | Optimal Value
MODE | Number of layers / sub-models | 1–6 | 4/3
MODE | Number of neurons | 16, 32, 64, 128, 256 | 128, 64
MODE | Dropout | 0.1–1 | 0
MODE | Learning rate | 0.01–0.00001 | 0.0001
MODE | α, initial value | 0.1–0.9 | 0.4
MODE | α, convergence patience | 5–20 | 10
MODE | α, step size | 0.1–0.2 | 0.1
IODE | Number of layers / sub-models | 1–6 | 4–5/4
IODE | Number of neurons | 16, 32, 64, 128, 256 | 128, 64
IODE | Dropout | 0.1–1 | 0
IODE | Learning rate | 0.01–0.00001 | 0.0001
IODE | α, initial value | 0.1–0.9 | 0.4
IODE | α, convergence patience | 5–20 | 10
IODE | α, step size | 0.1–0.2 | 0.1
Predictive | Number of hidden units | 32, 64, 128, 192, 256 | 128 × 3
Predictive | Dropout | 0.1–0.5 | 0
Predictive | Learning rate | 0.01–0.00001 | 0.0001
Predictive | Batch size | 16–1024 | 64
Predictive | Weighting factor ω | 0.001–1 | 1
Table 3. Summary of prediction error (mean squared error (MSE)) for the insulin model (IODE) and meal model (MODE) evaluated across all individual patients.

Model | 540 | 544 | 552 | 567 | 584 | 596 | 559 | 563 | 570 | 575 | 588 | 591
IODE Model | 0.0130 | 0.046 | 0.016 | 0.046 | 0.026 | 0.0033 | 0.0059 | 0.0146 | 0.091 | 0.012 | 0.0064 | 0.0034
MODE Model | 0.0048 | 0.020 | 0.0083 | 0.011 | 0.011 | 0.0038 | 0.0038 | 0.024 | 0.014 | 0.012 | 0.0026 | 0.0048
Table 4. Performance evaluation of the BGL predictive model across 30 and 60 min prediction horizons using RMSE, MAE, and CEG metrics.

PH (min) | Metric | 540 | 544 | 552 | 559 | 563 | 567 | 570 | 575 | 584 | 588 | 591 | 596
30 | RMSE | 20.56 | 14.65 | 15.99 | 20.15 | 18.13 | 24.68 | 15.56 | 21.52 | 25.50 | 17.22 | 20.97 | 16.46
30 | MAE | 15.52 | 11.81 | 11.49 | 12.86 | 12.75 | 15.16 | 10.55 | 13.73 | 16.35 | 12.35 | 14.83 | 11.41
30 | CEG A+B (%) | 98.18 | 99.66 | 99.14 | 99.31 | 99.44 | 98.38 | 99.85 | 98.71 | 98.52 | 99.89 | 97.21 | 98.41
30 | CEG C+D+E (%) | 1.81 | 0.33 | 0.85 | 0.68 | 0.55 | 1.61 | 0.14 | 1.28 | 1.48 | 0.10 | 2.78 | 1.58
60 | RMSE | 39.76 | 25.64 | 24.84 | 33.87 | 32.17 | 33.7 | 22.31 | 34.59 | 31.96 | 29.28 | 29.77 | 23.83
60 | MAE | 29.19 | 18.48 | 18.81 | 23.63 | 22.74 | 26.19 | 15.65 | 24.22 | 21.59 | 20.23 | 21.46 | 17.01
60 | CEG A+B (%) | 93.60 | 99.58 | 93.72 | 95.64 | 97.98 | 94.13 | 99.40 | 94.83 | 96.76 | 99.02 | 94.23 | 99.43
60 | CEG C+D+E (%) | 6.39 | 0.41 | 6.27 | 4.35 | 2.01 | 5.86 | 0.59 | 5.16 | 3.23 | 0.98 | 5.76 | 0.56
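The analytical metrics in Table 4 follow their standard definitions; a minimal sketch, computed on invented sample values (not study data):

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Invented reference vs. predicted BGLs in mg/dL.
ref = [110.0, 150.0, 90.0, 200.0]
pred = [120.0, 140.0, 95.0, 180.0]
```

RMSE penalizes large excursions more heavily than MAE, which is why the two columns diverge most for subjects with occasional large errors.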
Table 5. RMSE-based performance comparison of the proposed PIBGN against state-of-the-art methods for the 30 min prediction horizon.

Studies | 540 | 544 | 552 | 567 | 584 | 596 | 559 | 563 | 570 | 575 | 588 | 591
LSTM [55] | – | – | – | – | – | – | 19.5 | 19.0 | 16.5 | 24.2 | 19.2 | 22.0
Feed-Forward [27] | – | – | – | – | – | – | 18.83 | 19.43 | 15.88 | 22.86 | 17.84 | 21.12
Dilated RNN [56] | – | – | – | – | – | – | 18.6 | 18.0 | 15.3 | 22.7 | 17.6 | 21.1
Regression [57] | 20.98 | 17.66 | 16.30 | 20.52 | 21.62 | 17.45 | – | – | – | – | – | –
GAN [58] | 20.14 | 16.28 | 16.08 | 20.00 | 20.91 | 16.63 | – | – | – | – | – | –
SAN [20] | 19.49 | 15.75 | 15.68 | 18.96 | 19.52 | 15.94 | 17.57 | 18.34 | 14.84 | 21.69 | 16.02 | 20.08
MTL-LSTM [59] | 17.35 | 14.07 | 13.17 | 16.89 | 14.89 | 20.17 | 13.27 | 18.52 | 20.56 | 13.77 | 17.01 | 13
Our PIBGN | 20.56 | 14.65 | 14.87 | 18.63 | 25.11 | 15.23 | 16.08 | 18.13 | 13.37 | 21.41 | 17.3 | 19.01

Green, blue, and orange indicate the best, second-best, and third-best scores, respectively.
Table 6. RMSE-based performance comparison of the proposed PIBGN against state-of-the-art methods for the 60 min prediction horizon.

Studies | 540 | 544 | 552 | 567 | 584 | 596 | 559 | 563 | 570 | 575 | 588 | 591
LSTM [55] | – | – | – | – | – | – | 34.4 | 29.9 | 28.6 | 37.3 | 33.1 | 36
Feed-Forward [27] | – | – | – | – | – | – | 32.52 | 31.33 | 27.48 | 35.28 | 30.12 | 33.6
Regression [57] | 39.05 | 30.42 | 29.38 | 36.52 | 37.01 | 28.92 | – | – | – | – | – | –
GAN [58] | 38.54 | 27.64 | 29.03 | 35.65 | 34.31 | 28.1 | – | – | – | – | – | –
SAN [20] | 32.79 | 25.23 | 25.71 | 30.87 | 31.2 | 25.14 | 29.56 | 28.37 | 25.04 | 32.34 | 26.73 | 29.56
MTL-LSTM [59] | 36.13 | 27.25 | 27.71 | 33.11 | 27.84 | 37.69 | 27.62 | 32.93 | 36.64 | 27.19 | 31.35 | 25.26
Our PIBGN | 39.77 | 25.64 | 24.75 | 33.7 | 31.96 | 23.83 | 28.16 | 32.51 | 22.31 | 34.59 | 29.28 | 29.77

Green, blue, and orange indicate the best, second-best, and third-best scores, respectively.
Table 7. Summary of estimated model parameters per patient for both 30 and 60 min prediction horizons.

PH (min) | Parameter | 540 | 544 | 552 | 559 | 563 | 567 | 570 | 575 | 584 | 588 | 591 | 596 | SD | CV%
30 | k_e | 0.6487 | 0.6956 | 0.6244 | 0.6913 | 0.5727 | 0.4293 | 0.5981 | 0.6880 | 0.6353 | 0.7006 | 0.6664 | 0.6674 | 0.0733 | 11.55
30 | k_a2 | 0.0668 | 0.0305 | 0.0517 | 0.0643 | 0.0593 | 0.0241 | 0.0505 | 0.0568 | 0.0517 | 0.0417 | 0.0635 | 0.0626 | 0.0135 | 26.01
30 | k_abs | 0.1065 | 0.1308 | 0.1520 | 0.1190 | 0.1311 | 0.1311 | 0.1159 | 0.1647 | 0.0980 | 0.1721 | 0.1681 | 0.1811 | 0.0269 | 19.33
30 | k_max | 0.2060 | 0.1589 | 0.1746 | 0.1158 | 0.0975 | 0.1855 | 0.128 | 0.1215 | 0.1779 | 0.1419 | 0.1267 | 0.1026 | 0.0329 | 22.74
30 | k_min | 0.0464 | 0.0814 | 0.0914 | 0.0104 | 0.0846 | 0.0754 | 0.0737 | 0.0898 | 0.046 | 0.1129 | 0.0857 | 0.0873 | 0.0277 | 37.53
60 | k_e | 0.6488 | 0.6956 | 0.625 | 0.6913 | 0.5727 | 0.4293 | 0.5980 | 0.6880 | 0.6353 | 0.7006 | 0.6664 | 0.6674 | 0.0733 | 11.54
60 | k_a2 | 0.0668 | 0.0301 | 0.0511 | 0.0643 | 0.0593 | 0.0241 | 0.0508 | 0.0568 | 0.0517 | 0.0417 | 0.0635 | 0.0626 | 0.0135 | 26.01
60 | k_abs | 0.1064 | 0.1307 | 0.1519 | 0.1190 | 0.1311 | 0.1311 | 0.1159 | 0.1647 | 0.0975 | 0.1721 | 0.1681 | 0.1811 | 0.0269 | 19.34
60 | k_max | 0.2060 | 0.1589 | 0.1746 | 0.1158 | 0.0975 | 0.1850 | 0.1285 | 0.1215 | 0.179 | 0.1418 | 0.1272 | 0.103 | 0.0329 | 22.71
60 | k_min | 0.0463 | 0.0813 | 0.0917 | 0.0104 | 0.0845 | 0.0754 | 0.0734 | 0.0899 | 0.046 | 0.1129 | 0.0857 | 0.0873 | 0.0277 | 37.58
Table 8. Ablation study on the baseline BGL predictive network (BGN) with no physiological inputs and PIBGN with physiological inputs for 30 and 60 min prediction horizons. Best scores are shown in bold.

Model | 540 | 544 | 552 | 567 | 584 | 596 | 559 | 563 | 570 | 575 | 588 | 591
BGN, 30 min | 21.86 | 16.92 | 15.66 | 19.28 | 26.23 | 16.21 | 17.08 | 20.4 | 13.71 | 25.39 | 17.98 | 19.48
PIBGN, 30 min | 20.56 | 14.65 | 14.87 | 18.63 | 25.11 | 15.23 | 16.08 | 18.13 | 13.37 | 21.41 | 17.3 | 19.01
BGN, 60 min | 39.08 | 29.55 | 29.29 | 33.17 | 35.01 | 26.38 | 30.6 | 31.46 | 21.87 | 35.4 | 29.37 | 30.51
PIBGN, 60 min | 39.77 | 25.64 | 24.75 | 33.7 | 31.96 | 23.83 | 28.16 | 32.51 | 22.31 | 34.59 | 29.28 | 29.77
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
