Noninvasive In Vivo Estimation of Blood-Glucose Concentration by Monte Carlo Simulation

Continuous monitoring of blood-glucose concentrations is essential for both diabetic and nondiabetic patients to plan a healthy lifestyle. Noninvasive in vivo blood-glucose measurements help reduce the pain of piercing human fingertips to collect blood. To facilitate noninvasive measurements, this work proposes a Monte Carlo photon simulation-based model to estimate blood-glucose concentration via photoplethysmography (PPG) on the fingertip. A heterogeneous finger model was exposed to light at 660 nm and 940 nm in the reflectance mode of PPG via Monte Carlo photon propagation. The bio-optical properties of the finger model were also deduced to design the photon simulation model for the finger layers. The intensities of the detected photons after simulation with the model were used to estimate the blood-glucose concentrations using a supervised machine-learning model, XGBoost. The XGBoost model was trained with synthetic data obtained from the Monte Carlo simulations and tested with both synthetic and real data (n = 35). For testing with synthetic data, the Pearson correlation coefficient (Pearson’s r) of the model was found to be 0.91, and the coefficient of determination (R2) was found to be 0.83. On the other hand, for tests with real data, the Pearson’s r of the model was 0.85, and R2 was 0.68. Error grid analysis and Bland–Altman analysis were also performed to confirm the accuracy. The results presented herein provide the necessary steps for noninvasive in vivo blood-glucose concentration estimation.


Introduction
Glucose is a type of sugar that serves as the primary source of energy in the body. Thus, blood glucose refers to the sugar carried by the bloodstream to all the cells in the body for supplying energy. This glucose is derived by consuming food and drink, and the body also releases stored glucose from the liver and muscles. The human body controls blood-glucose concentrations to maintain adequate levels to supply energy uniformly throughout the day. While the amount of blood glucose available should be enough to fuel cells, excessive amounts can strain the circulatory system. Diabetes mellitus is a serious disease that is directly related to the amount of glucose present in the blood. Internationally, blood-glucose levels are specified in terms of the molar concentration and measured in either millimoles per liter or milligrams per deciliter. Excessive amounts of glucose (>140 mg/dL) in the blood causes hyperglycemia, whereas low amounts of glucose (<71 mg/dL) causes hypoglycemia [1].
Continuous blood-glucose measurement is essential as it allows diabetics and normal people to plan a healthy lifestyle. However, invasive and continuous blood-glucose measurements may cause pain and burden to individuals and deter continuous monitoring [2]. Therefore, noninvasive in vivo blood-glucose measurements can overcome the above limitations and prevent pain. However, noninvasive blood-glucose evaluation frameworks for self-monitoring have plenty of room for improvement and are still far from suitable for at-home use.
Photoplethysmography (PPG) is the primary optical method used to distinguish changes in blood volume in the peripheral circulation; it is a cost-effective and noninvasive strategy to estimate biological parameters from the skin surface. PPG is a promising technique for early detection of various atherosclerotic pathologies [3], and the signals require only a few electronic components for recording: a light source that illuminates the skin and a photodetector (PD) that is placed on the same or opposite side to the light source for receiving the light. Depending upon the light source and PD placement, the PPG signals can be divided into two types: reflection and transmission. If the light source and PD are placed on the same side, it is regarded as the reflection mode; if the light source and PD are placed on the opposite side, it is regarded as the transmission mode [4]. The PPG signal consists of AC and DC components. The AC components change with each heartbeat, whereas the DC components, which are also called slow-varying baselines, are composed of low-frequency fluctuations [5]. Over the past few decades, PPG signals have been used for various clinical evaluations, such as monitoring blood oxygenation, blood pressure, and heart rate variability. Studies on measurements of arterial blood oxygenation (SpO 2 ) [6] and blood pressure [7] using PPG signals have been reported. Recently, wearable devices have become promising methods of collecting PPG signals to evaluate health information. In [8], the authors successfully developed a wearable printed circuit board (PCB) to collect reflection-type PPG signals and to measure heart rate and SpO 2 . Sen Gupta et al. [9] developed a PPG data acquisition device to record both transmission-and reflection-type signals, and a set of features related to blood glucose levels were extracted from the signals to estimate blood-glucose concentration using their machine-learning algorithm. Furthermore, other health-related parameters such as glycated hemoglobin (HbA1c) can be measured from the PPG signals. In [10], the authors developed gray-box models to estimate HbA1c using digital volume pulse waveforms, which are also called fingertip PPG signals. Monte-Moreno et al. [11] used various physiological parameters, such as heart rate, vascular compliance, blood viscosity, and respiratory frequency, to analyze PPG waveforms such that these signals could later be used for estimating blood glucose; they used support vector machine (SVM), random forest, linear regression, and a neural network classifier for blood-glucose level classifications and reported that the random forest approach produced the best results with an R 2 value of 0.9.
When a photon is ejected from a light source, it is reflected by the tissue components and returns to the PD or is transmitted through the tissue and reaches the PD placed on the opposite side of the photon source. The PPG signal follows the transmission or reflection mode and detects changes in the blood volume. In our experiments, we examined photon propagation through a finger model via Monte Carlo (MC) simulation by considering blood as a static component and studied the changes in blood volume.
MC simulation is a computational technique that involves random sampling of an actual amount. It is an adaptable technique for simulating photon propagation in biological tissues. The simulation depends on the random walks that photons make as they travel through the tissue, which is chosen by sampling the probability distributions for step size and angular deflection per scattering event. Within the common strategy of MC modeling, light transport in tissues is simulated by following a random walk process that each photon packet undergoes within the tissue model [12]. For each dispatched photon packet, an initial weight is allotted before entering the tissue model. The absorption coefficient µ a (cm −1 ) and scattering coefficient µ s (cm −1 ) are used to depict the probability of absorption and scattering, respectively, for a unit path length [13]. The anisotropy factor g, which is characterized as the normal cosine of the scattering angle, determines the probability distribution of the scattering angles for first-order approximation. Moreover, the refractive index n change between any two regions in the tissue model or at the air-tissue interface determines the angle of refraction. After traveling within a given medium, a fraction of the photon packet exits from the same side of the tissue model; this fraction is calculated as the part of the incident light that is scored as the received light intensity (weight). Interestingly, the negligible portion of the photon packet weight that travels through the medium and exits on the opposite side of the model is scored as the transmittance [14]. In this study, MC simulations were used to infer photon transport within the finger tissue model, and the amount of photons that reach the PD was analyzed to deduce the relationship between the received light intensity and blood-glucose concentration.
In recent years, light transport in a tissue medium has been used for in vivo estimation of health-related parameters, such as blood pressure, blood-glucose concentration, and oxygen saturation in the blood [15]. MC simulations are considered the gold standard for photon migration in the tissue model to noninvasively estimate the health parameters [16]. Noninvasive blood-glucose estimation is a new research trend, and only a few works have reported results with the MC method. Liu et al. [17] introduced a backpropagation MC (BpMC) approach to retrieve the bio-optical properties of multilayered tissues from transmitted and reflected light signals. These properties were used to estimate the bloodglucose concentrations through two types of models: BpMC-DEE and BpMC-CNN. In [18], frequency-modulated continuous wave (FMCW) LIDAR technology was proposed to estimate blood-glucose concentrations, and MC simulations were performed to investigate the feasibility of the method; this approach mainly comprised a near-infrared tunable semiconductor laser and an integrated detector. The glucose concentration was deduced from the slope of the FMCW signal spectrum by analyzing the relationship between the signal intensity and light transit time depending on beat frequency. Enejder et al. [19] reported the use of Raman spectroscopy for quantitative, noninvasive blood-glucose measurements; they tested 17 healthy human subjects and collected 461 sets of Raman spectra transcutaneously along with glucose reference values. Further, a partial least-squares calibration and leave-one-out cross-validation were used for each subject. They reported an R 2 value of 0.83 ± 0.1. Fluorescence-based glucose sensors were introduced in [20] for in vivo blood-glucose estimation by new receptor systems for glucose recognition and utilization of transduction schemes. According to a mathematical model in this strategy, the acquired optical signals were applied to evaluate glucose concentrations, and the assessments were performed on raw optical signals. However, a simple mathematical model cannot consider the logical relationships between optical signals and physiological parameters; this often creates obstacles for the proposed method to be implemented in clinical scenarios.
In this paper, we propose an MC photon simulation-based model for estimating blood-glucose concentrations through PPG in a finger model. The finger model was used with light at wavelengths of 660 nm and 940 nm, as well as the corresponding bio-optical properties of the finger layers and calculated intensity of the received light. Thereafter, a supervised machine-learning algorithm was used to estimate the bloodglucose concentration using the calculated light intensities and SpO 2 values. We collected PPG data from 35 volunteers for light at both wavelengths, along with the blood-glucose concentrations and SpO 2 values as a reference to evaluate the proposed model.
The remainder of this paper is organized as follows. In Section 2, the methodology of this study, including the finger model, deduction of the bio-optical properties, MC photon simulation model, data acquisition procedure, and machine-learning model for estimating blood-glucose concentration are presented. In Section 3, the bio-optical properties of the finger layers, MC photon simulation results, and machine-learning model results are presented. Finally, in Sections 4 and 5, the discussion and conclusion of the study are presented, respectively.

Methodology
The workflow diagram for the proposed approach is shown in Figure 1 and discussed sequentially.

Proposed Finger Model
For photon transport by MC simulations, a heterogeneous finger model was considered. In the finger model, the spatial distribution of blood and chromophores vary with depth. However, the anatomical constituents of the skin, including skin cell composition, chromophore content, and blood concentration, are generally constant and can be identified. This permits approximation of the skin as a multilayered medium. Before dividing the skin into the subdermal layers, four general layers are considered for the finger model, as shown in Figure 2a. The skin layer, as shown in Figure 2b is divided into six subdermal layers, namely, stratum corneum, epidermis, papillary dermis, upper blood net dermis, reticular dermis, and deep blood net dermis. The first layer of the skin, which is approximately 20 µm thick, is known as the stratum corneum. The second layer is the epidermis and is approximately 80 µm thick. The epidermis contains primarily living cells, a fraction of dehydrated cells, laden cells with keratohyalin granules, columnar cells, melanin dust, small melanin granules, and melanosomes, which are not specifically perfused into the blood [21]. The other four dermal layers with varying volumes of blood are the papillary dermis (150 µm thick), upper blood net dermis (80 µm thick), reticular dermis (1500 µm thick), and deep blood net dermis (100 µm thick). Fat, muscle, and bone are structured beneath the skin layer; the fat layer is 0.55 mm thick, and blood components are absent in this layer. Table 1 summarizes the thicknesses of the layers, including the subdivided skin layers considered in this work.

Proposed Finger Model
For photon transport by MC simulations, a heterogeneous finger model was considered. In the finger model, the spatial distribution of blood and chromophores vary with depth. However, the anatomical constituents of the skin, including skin cell composition, chromophore content, and blood concentration, are generally constant and can be identified. This permits approximation of the skin as a multilayered medium. Before dividing the skin into the subdermal layers, four general layers are considered for the finger model, as shown in Figure 2a. The skin layer, as shown in Figure 2b is divided into six subdermal layers, namely, stratum corneum, epidermis, papillary dermis, upper blood net dermis, reticular dermis, and deep blood net dermis. The first layer of the skin, which is approximately 20 µm thick, is known as the stratum corneum. The second layer is the epidermis and is approximately 80 µm thick. The epidermis contains primarily living cells, a fraction of dehydrated cells, laden cells with keratohyalin granules, columnar cells, melanin dust, small melanin granules, and melanosomes, which are not specifically perfused into the blood [21]. The other four dermal layers with varying volumes of blood are the papillary dermis (150 µm thick), upper blood net dermis (80 µm thick), reticular dermis (1500 µm thick), and deep blood net dermis (100 µm thick). Fat, muscle, and bone are structured beneath the skin layer; the fat layer is 0.55 mm thick, and blood components are absent in this layer. Table 1 summarizes the thicknesses of the layers, including the subdivided skin layers considered in this work.

Proposed Finger Model
For photon transport by MC simulations, a heterogeneous finger model was considered. In the finger model, the spatial distribution of blood and chromophores vary with depth. However, the anatomical constituents of the skin, including skin cell composition, chromophore content, and blood concentration, are generally constant and can be identified. This permits approximation of the skin as a multilayered medium. Before dividing the skin into the subdermal layers, four general layers are considered for the finger model, as shown in Figure 2a. The skin layer, as shown in Figure 2b is divided into six subdermal layers, namely, stratum corneum, epidermis, papillary dermis, upper blood net dermis, reticular dermis, and deep blood net dermis. The first layer of the skin, which is approximately 20 µm thick, is known as the stratum corneum. The second layer is the epidermis and is approximately 80 µm thick. The epidermis contains primarily living cells, a fraction of dehydrated cells, laden cells with keratohyalin granules, columnar cells, melanin dust, small melanin granules, and melanosomes, which are not specifically perfused into the blood [21]. The other four dermal layers with varying volumes of blood are the papillary dermis (150 µm thick), upper blood net dermis (80 µm thick), reticular dermis (1500 µm thick), and deep blood net dermis (100 µm thick). Fat, muscle, and bone are structured beneath the skin layer; the fat layer is 0.55 mm thick, and blood components are absent in this layer. Table 1 summarizes the thicknesses of the layers, including the subdivided skin layers considered in this work.

Bio-Optical Properties of the Finger Model
When light enters a tissue, photons are either absorbed in the media, scattered from the surface, or scattered within the tissue medium. Therefore, the bio-optical properties of the proposed finger model layers are needed for the MC photon simulations and to determine the photon intensities. Accounting for the blood concentration in the tissue is another vital step in measuring the optical properties of the model. The blood is concentrated in a layer measuring 0.05-0.1 mm; thus, the dermis must be divided into four layers in addition to the stratum corneum and epidermis [22]. Generally, the absorption coefficient µ a , scattering coefficient µ s , anisotropy g, and refractive index n are considered as the optical properties of tissues.
The general equation of the absorption coefficient µ a for a heterogeneous biological tissue can be represented as Equation (1).
where V i is the volume fraction of the i-th skin layer, and k is the total number of layers; a (λ) is the baseline absorption coefficient and can be expressed as Equation (2).
Given µ (0) a (λ), the absorption coefficient of the i-th dermal sublayers at a wavelength λ can be written as where V Art and V Ven are the arterial and venous blood-volume fractions, respectively; µ a Art , µ a Ven , and µ a wat represent the absorption coefficients of arterial blood, venous blood, and water, respectively. Only melanin and water were considered to calculate the absorption coefficients of the stratum corneum and epidermis because these layers do not contain any blood cells. The absorption coefficient of the epidermis can be expressed as follows [23]: where V m is the melanin volume fraction, which is considered as 10%, µ am (λ) is the absorption coefficient of melanin at wavelength λ, V w is the water volume fraction in the epidermis, and µ a wat (λ) is the absorption coefficient of water at wavelength λ. The equation for calculating the absorption coefficient of the stratum corneum is adopted from [24] and expressed as The absorption coefficients of arterial and venous blood can be calculated from oxygen saturation [25,26]. In the finger model, the four subdivided dermal layers contain blood, and hence, glucose. Therefore, glucose is present in the arterial and venous blood of these four dermal layers, whose absorption coefficients can be expressed as follows: where µ a HbO and µ a HHb represent the absorption coefficients of oxyhemoglobin and deoxyhemoglobin, respectively; SaO 2 and SvO 2 are the arterial and venous oxygen saturation values, respectively; g is the molar absorption coefficient (L·mol −1 ·cm −1 ), and c g is the molar concentration (mol·L −1 ) of glucose in the arterial and venous blood of the dermal layers. To calculate the absorption coefficients of the venous and arterial blood, SvO 2 is considered to be 10% lower than SaO 2 [27]. The baseline blood volume fraction (V b ) and water volume fraction (V wat ) of the proposed finger layers for calculating the bio-optical properties are obtained from [28][29][30] and listed in Table 2. The absorption coefficients of water, oxyhemoglobin, and deoxyhemoglobin are adopted from other studies [31][32][33] and listed in Table 3. As we used two wavelengths of light (660 nm and 940 nm) in the simulation model, the absorption coefficients of water, oxyhemoglobin, and deoxyhemoglobin are listed for these two wavelengths. The molar absorption coefficient ( g ) of glucose for 660 nm and 940 nm are 0.0002 and 0.001 L·mol −1 ·cm −1 , respectively [34].

Monte Carlo Simulation Model for Photon Propagation
Photon propagation with MC simulations is an adaptable yet thorough method of handling photon transport simulations. In our experiment, the MC simulation was used to deduce light propagation in the finger model. The voxel-based MC algorithm [35] was considered for accurate and efficient photon transport modeling. The 3D multilayered volume, where a single integer was assigned to each voxel to indicate the index of the layer, was designed for two light wavelengths (660 nm and 940 nm). The bio-optical properties of the layers considered for the finger model for 660 nm and 940 nm were assigned to each layer of the volume separately for photon propagation in the model. Once a photon package enters the heterogeneous finger model, it is randomly scattered or absorbed by the layers. The remaining photons that reach the detector are collected, and their intensity is analyzed to estimate the blood-glucose concentration. The flowchart of the MC simulation for light propagation in the heterogeneous finger model is shown in Figure 3. volume, where a single integer was assigned to each voxel to indicate the index of the layer, was designed for two light wavelengths (660 nm and 940 nm). The bio-optical properties of the layers considered for the finger model for 660 nm and 940 nm were assigned to each layer of the volume separately for photon propagation in the model. Once a photon package enters the heterogeneous finger model, it is randomly scattered or absorbed by the layers. The remaining photons that reach the detector are collected, and their intensity is analyzed to estimate the blood-glucose concentration. The flowchart of the MC simulation for light propagation in the heterogeneous finger model is shown in Figure 3. Initially, a photon packet was launched into the proposed finger model, and the initial weight was considered as 1 (one). The step size (l) was calculated by random sampling of the probability of photon scattering [36]. In this experiment, after the photon entered the finger model, we evaluated whether the photon contacted the surface of the finger. Once the photon hits the surface of the finger, it is reflected, and its variables are updated. The position vector, direction vector, and weight of the photon were considered as the photon variables and were updated after reflection. The intensities of the remaining photons were recorded after detection by the photon detector. Absorption and scattering were expected to have occurred in the case of free photon propagation and that the photon packet was oriented through randomly generated deflections and azimuth angles. The Henyey-Greenstein phase function [37] was used to calculate the scattering angles θ while generating the azimuth angles randomly in the range of 0 to 2 π. The cosine of the scattering angle can be expressed as Initially, a photon packet was launched into the proposed finger model, and the initial weight was considered as 1 (one). The step size (l) was calculated by random sampling of the probability of photon scattering [36]. In this experiment, after the photon entered the finger model, we evaluated whether the photon contacted the surface of the finger. Once the photon hits the surface of the finger, it is reflected, and its variables are updated. The position vector, direction vector, and weight of the photon were considered as the photon variables and were updated after reflection. The intensities of the remaining photons were recorded after detection by the photon detector. Absorption and scattering were expected to have occurred in the case of free photon propagation and that the photon packet was oriented through randomly generated deflections and azimuth angles. The Henyey-Greenstein phase function [37] was used to calculate the scattering angles θ while generating the azimuth angles randomly in the range of 0 to 2 π. The cosine of the scattering angle can be expressed as where g is known as the scattering anisotropy; g = 0 indicates isotropic scattering, whereas g = 1 indicates scattering that is primarily in the forward direction. The step size (l) depends on a random number ξ (0 < ξ < 1), absorption coefficient (µ a ), and scattering coefficient (µ s ) and can be expressed as Equation (10).
Our model was examined in reflection mode. The optical source and detector were placed 0.4 mm apart, and the intensities of the detected photons were recorded, which is referred to as the mean weight of the detected photon packets. The relationship among penetration depth, optical path, and source-detector separation in the reflectance geometry were scrutinized using this model. The mean of the total simulated path length of the photon packets from the source to the detector was used to calculate the mean optical path. This simulation was repeated until the desired number of photon packets were detected to estimate the blood-glucose concentration from the photon intensity.

Machine-Learning Model and Blood-Glucose Concentration Estimation
As shown in Figure 4, after calculating the photon intensity from MC simulations at 660 nm and 940 nm, the values were used as one of the inputs to a supervised machinelearning regression model. A powerful approach for the supervised regression model, XGBoost, was used in this study. Along with the light intensity, the SpO 2 value was also used as the input to the regression model, with blood-glucose concentration as the target of the model. The ranges of SpO 2 and glucose values used in the regression model are listed in Table 4.
where is known as the scattering anisotropy; = 0 indicates isotropic scattering, whereas = 1 indicates scattering that is primarily in the forward direction. The step size (l) depends on a random number (0 < < 1), absorption coefficient ( ), and scattering coefficient ( ) and can be expressed as Equation (10).
Our model was examined in reflection mode. The optical source and detector were placed 0.4 mm apart, and the intensities of the detected photons were recorded, which is referred to as the mean weight of the detected photon packets. The relationship among penetration depth, optical path, and source-detector separation in the reflectance geometry were scrutinized using this model. The mean of the total simulated path length of the photon packets from the source to the detector was used to calculate the mean optical path. This simulation was repeated until the desired number of photon packets were detected to estimate the blood-glucose concentration from the photon intensity.

Machine-Learning Model and Blood-Glucose Concentration Estimation
As shown in Figure 4, after calculating the photon intensity from MC simulations at 660 nm and 940 nm, the values were used as one of the inputs to a supervised machinelearning regression model. A powerful approach for the supervised regression model, XGBoost, was used in this study. Along with the light intensity, the SpO2 value was also used as the input to the regression model, with blood-glucose concentration as the target of the model. The ranges of SpO2 and glucose values used in the regression model are listed in Table 4.    After simulation, approximately 1581 data samples were obtained for the stated range of glucose concentrations and SpO 2 , and these samples were considered as the synthetic data for training the machine-learning model.
XGBoost parameters used for the regression include learning rate = 0.3, maximum depth = 6, and number of estimators = 100. Five evaluation matrices were used to check the accuracy of the glucose concentration estimations, namely, mean-squared error (MSE), mean absolute error (MAE), root mean-squared error (RMSE), coefficient of determination (R 2 ), and Pearson correlation coefficient (Pearson's r). Clark's error grid analysis (EGA) [38] and the Bland-Altman plot were used to visualize the estimated values and corresponding estimation errors, respectively.

Data Acquisition
For collecting PPG signals from the subjects, a hardware system was proposed. The ESP32-PICO-V3 [39] was used as a processing unit for controlling the entire system. This microcontroller has an on-chip RF communication system so that the data can be transmitted to a remote server without any external communication module. A surface-mounted device (SMD) module, SFH 7050 [40], was used for collecting the reflection mode PPG signal. This module contains three different light emitting diodes: green, red, and infrared and a photodetector (PD). This photodetector can respond in the spectral range of sensitivity of 400 nm to 1100 nm. A bio-sensing analog front end (AFE), AFE 4404 [41], was used to control the LEDs and the PD. The three transmitter pins (Tx1, Tx2, Tx3) of AFE 4404 were designated for the three LEDs. In our study, we considered the PPG signals that were collected with the red and infrared LED. Figure 5 illustrates the block diagram of our proposed hardware for data acquisition.
Sensors 2021, 21, x FOR PEER REVIEW 9 of 18 [38] and the Bland-Altman plot were used to visualize the estimated values and corresponding estimation errors, respectively.

Data Acquisition
For collecting PPG signals from the subjects, a hardware system was proposed. The ESP32-PICO-V3 [39] was used as a processing unit for controlling the entire system. This microcontroller has an on-chip RF communication system so that the data can be transmitted to a remote server without any external communication module. A surfacemounted device (SMD) module, SFH 7050 [40], was used for collecting the reflection mode PPG signal. This module contains three different light emitting diodes: green, red, and infrared and a photodetector (PD). This photodetector can respond in the spectral range of sensitivity of 400 nm to 1100 nm. A bio-sensing analog front end (AFE), AFE 4404 [41], was used to control the LEDs and the PD. The three transmitter pins (Tx1, Tx2, Tx3) of AFE 4404 were designated for the three LEDs. In our study, we considered the PPG signals that were collected with the red and infrared LED. Figure 5 illustrates the block diagram of our proposed hardware for data acquisition. Thirty-five subjects voluntarily provided data for this study. From each subject, we collected 240 s of PPG data in the reflection mode. The MC simulation model was designed for 660 nm and 940 nm light, considering the PPG data of red and infrared wavelengths. Along with the PPG signals of the subjects, we also measured their blood-glucose concentration using the Caresens II Plus [42] and their SpO2 using another clinical device [43]. Figure 6 illustrates the histograms of the measured blood-glucose concentrations and SpO2 values of the subjects.  Thirty-five subjects voluntarily provided data for this study. From each subject, we collected 240 s of PPG data in the reflection mode. The MC simulation model was designed for 660 nm and 940 nm light, considering the PPG data of red and infrared wavelengths. Along with the PPG signals of the subjects, we also measured their blood-glucose concentration using the Caresens II Plus [42] and their SpO 2 using another clinical device [43]. Figure 6 illustrates the histograms of the measured blood-glucose concentrations and SpO 2 values of the subjects.  [38] and the Bland-Altman plot were used to visualize the estimated values and corresponding estimation errors, respectively.

Data Acquisition
For collecting PPG signals from the subjects, a hardware system was proposed. The ESP32-PICO-V3 [39] was used as a processing unit for controlling the entire system. This microcontroller has an on-chip RF communication system so that the data can be transmitted to a remote server without any external communication module. A surfacemounted device (SMD) module, SFH 7050 [40], was used for collecting the reflection mode PPG signal. This module contains three different light emitting diodes: green, red, and infrared and a photodetector (PD). This photodetector can respond in the spectral range of sensitivity of 400 nm to 1100 nm. A bio-sensing analog front end (AFE), AFE 4404 [41], was used to control the LEDs and the PD. The three transmitter pins (Tx1, Tx2, Tx3) of AFE 4404 were designated for the three LEDs. In our study, we considered the PPG signals that were collected with the red and infrared LED. Figure 5 illustrates the block diagram of our proposed hardware for data acquisition. Thirty-five subjects voluntarily provided data for this study. From each subject, we collected 240 s of PPG data in the reflection mode. The MC simulation model was designed for 660 nm and 940 nm light, considering the PPG data of red and infrared wavelengths. Along with the PPG signals of the subjects, we also measured their blood-glucose concentration using the Caresens II Plus [42] and their SpO2 using another clinical device [43]. Figure 6 illustrates the histograms of the measured blood-glucose concentrations and SpO2 values of the subjects.  After collecting the PPG data from the subjects, we calibrated these data with the synthetic data from the MC simulations. Our collected PPG data had a 22-bit resolution and for the red and infrared wavelengths, the data were stored in the range of 399,000 and 425,000, respectively, using the AFE 4404. We used another XGBoost regressor model for calibrations, where the calibrated PPG data were used in the XGBoost model trained with the synthetic data to estimate the blood-glucose concentrations. Figure 7 depicts the calibration stage and testing steps for estimating blood-glucose concentration. After collecting the PPG data from the subjects, we calibrated these data with the synthetic data from the MC simulations. Our collected PPG data had a 22-bit resolution and for the red and infrared wavelengths, the data were stored in the range of 399,000 and 425,000, respectively, using the AFE 4404. We used another XGBoost regressor model for calibrations, where the calibrated PPG data were used in the XGBoost model trained with the synthetic data to estimate the blood-glucose concentrations. Figure 7 depicts the calibration stage and testing steps for estimating blood-glucose concentration.

Bio-Optical Properties
The bio-optical properties, i.e., the absorption coefficient, scattering coefficient, anisotropy, and refractive index of the layers of the proposed finger model are necessary to design the MC simulation model for the experiments. The absorption coefficients of the skin layers were calculated using Equations (1)- (8) and Tables 1-3 by adopting information from [13,22,27,41]. Figure 8 shows the absorption coefficients of the skin layers in the wavelength range of 500 nm to 1000 nm. Other optical properties of the finger model layer, i.e., scattering coefficient , anisotropy g, and refractive index n were adopted from the literature [29,44]. The absorption coefficients of fat, muscle, and bone were also adopted from this literature. As we simulated photons in two wavelengths (660 nm and 940 nm) by MC simulation to estimate blood-glucose concentrations, Table 5 represents the optical properties of the proposed finger model layers for these two wavelengths.

Bio-Optical Properties
The bio-optical properties, i.e., the absorption coefficient, scattering coefficient, anisotropy, and refractive index of the layers of the proposed finger model are necessary to design the MC simulation model for the experiments. The absorption coefficients of the skin layers were calculated using Equations (1)- (8) and Tables 1-3 by adopting information from [13,22,27,41]. Figure 8 shows the absorption coefficients of the skin layers in the wavelength range of 500 nm to 1000 nm. After collecting the PPG data from the subjects, we calibrated these data with the synthetic data from the MC simulations. Our collected PPG data had a 22-bit resolution and for the red and infrared wavelengths, the data were stored in the range of 399,000 and 425,000, respectively, using the AFE 4404. We used another XGBoost regressor model for calibrations, where the calibrated PPG data were used in the XGBoost model trained with the synthetic data to estimate the blood-glucose concentrations. Figure 7 depicts the calibration stage and testing steps for estimating blood-glucose concentration.

Bio-Optical Properties
The bio-optical properties, i.e., the absorption coefficient, scattering coefficient, anisotropy, and refractive index of the layers of the proposed finger model are necessary to design the MC simulation model for the experiments. The absorption coefficients of the skin layers were calculated using Equations (1)- (8) and Tables 1-3 by adopting information from [13,22,27,41]. Figure 8 shows the absorption coefficients of the skin layers in the wavelength range of 500 nm to 1000 nm. Other optical properties of the finger model layer, i.e., scattering coefficient , anisotropy g, and refractive index n were adopted from the literature [29,44]. The absorption coefficients of fat, muscle, and bone were also adopted from this literature. As we simulated photons in two wavelengths (660 nm and 940 nm) by MC simulation to estimate blood-glucose concentrations, Table 5 represents the optical properties of the proposed finger model layers for these two wavelengths. Other optical properties of the finger model layer, i.e., scattering coefficient µ s , anisotropy g, and refractive index n were adopted from the literature [29,44]. The absorption coefficients of fat, muscle, and bone were also adopted from this literature. As we simulated photons in two wavelengths (660 nm and 940 nm) by MC simulation to estimate bloodglucose concentrations, Table 5 represents the optical properties of the proposed finger model layers for these two wavelengths.

Monte Carlo Simulation Results
The voxel-based MC simulation model, described in Section 2.3, was designed for photon propagation in the proposed finger model, and the scattering events occurring within the finger model for the two wavelengths were recorded. In Figures 9 and 10, the photon fluence within the heterogeneous finger model is shown for the bio-optical properties at 660 nm and 940 nm. The MC photon migration was performed in a 5.02-mmthick slab layered in nine parts, and the photon source and detector were placed 4 mm apart. A 64-bit operating system (Windows OS) with 20 GB RAM and an Intel Core i7 CPU was used for the simulations. The designed finger model was contained in 6 × 6 × 6 mm 3 voxels and required about 5 h to complete the MC simulation. The color bars in the figures illustrate the intensities of the fluence rates throughout the voxel-based finger model.

Model Evaluation with Synthetic Data
The following results were obtained after blood-glucose concentration estimations using the synthetic data obtained from the MC simulations; the fitted plot after XGBoost regression is shown in Figure 11a. The orange line in the figure indicates a perfect fit. Most

Model Evaluation with Synthetic Data
The following results were obtained after blood-glucose concentration estimations using the synthetic data obtained from the MC simulations; the fitted plot after XGBoost regression is shown in Figure 11a. The orange line in the figure indicates a perfect fit. Most of the predicted values are observed to be scattered around the perfect line. Figure 11b illustrates the error grid analysis (EGA), with Zone A (clinically accurate data) containing most of the samples, and Zone B (data outside 20% of the reference but would not lead to inappropriate treatment), and Zone D (indicating potential failure in detecting hyperglycemia or hypoglycemia) containing a negligible number of samples. There are no samples in Zones C and E. Table 6 lists the zonal accuracies of the EGA plots. The unit of estimated blood-glucose concentrations is mol/L but is converted to mg/dL for the EGA plot using Equation (11).

Model Evaluation with Synthetic Data
The following results were obtained after blood-glucose concentration estimations using the synthetic data obtained from the MC simulations; the fitted plot after XGBoost regression is shown in Figure 11a. The orange line in the figure indicates a perfect fit. Most of the predicted values are observed to be scattered around the perfect line. Figure 11b illustrates the error grid analysis (EGA), with Zone A (clinically accurate data) containing most of the samples, and Zone B (data outside 20% of the reference but would not lead to inappropriate treatment), and Zone D (indicating potential failure in detecting hyperglycemia or hypoglycemia) containing a negligible number of samples. There are no samples in Zones C and E. Table 6 lists the zonal accuracies of the EGA plots. The unit of estimated blood-glucose concentrations is mol/L but is converted to mg/dL for the EGA plot using Equation (11).   The Bland-Altman analysis shown in Figure 12 indicates that the estimation model provides a bias of −0.09 ± 10.53 and that the limits of agreement (95%, 1.96 SD) range from −20.72 to 20.54. The values of the evaluation metrics are listed in Table 7.  The Bland-Altman analysis shown in Figure 12 indicates that the estimation model provides a bias of −0.09 ± 10.53 and that the limits of agreement (95%, 1.96 SD) range from −20.72 to 20.54. The values of the evaluation metrics are listed in Table 7.  The following results were obtained after blood-glucose concentration estimations using the calibrated PPG data, and the fitted line after XGBoost regression is shown in Figure 13a. The orange line indicates a perfect fit. Figure 13b illustrates the EGA, with Zone A (clinically accurate data) containing most of the samples, and Zone B (data outside of 20% of the reference but would not lead to inappropriate treatment) containing the remaining samples. There are no samples in Zones C, D, and E. Table 8 lists the zonal accuracies of the EGA plot.

Model Evaluation with Real Data
The following results were obtained after blood-glucose concentration estimations using the calibrated PPG data, and the fitted line after XGBoost regression is shown in Figure 13a. The orange line indicates a perfect fit. Figure 13b illustrates the EGA, with Zone A (clinically accurate data) containing most of the samples, and Zone B (data outside of 20% of the reference but would not lead to inappropriate treatment) containing the remaining samples. There are no samples in Zones C, D, and E. Table 8 lists the zonal accuracies of the EGA plot.  The Bland-Altman analysis shown in Figure 14 indicates that the estimation model provides a bias of 3.82 ± 15.63 and that the limits of agreement (95%, 1.96 SD) range from −26.82 to 34.46. The values of the evaluation metrics are listed in Table 9.  The Bland-Altman analysis shown in Figure 14 indicates that the estimation model provides a bias of 3.82 ± 15.63 and that the limits of agreement (95%, 1.96 SD) range from −26.82 to 34.46. The values of the evaluation metrics are listed in Table 9.  The Bland-Altman analysis shown in Figure 14 indicates that the estimation model provides a bias of 3.82 ± 15.63 and that the limits of agreement (95%, 1.96 SD) range from −26.82 to 34.46. The values of the evaluation metrics are listed in Table 9.  We also used our calibrated PPG data for the models presented in [9] for comparisons with the proposed MC simulation model. Table 10 presents the evaluation metrics of this comparison.  We also used our calibrated PPG data for the models presented in [9] for comparisons with the proposed MC simulation model. Table 10 presents the evaluation metrics of this comparison. In [9], the authors used 17 discriminant features from the collected PPG signals. After segmenting the individual PPG signals for about 3 s, they segmented the signals for red and infrared wavelengths for use with the feature extraction modules of their models. The features are a mixture of PPG-based physiological features, signal-oriented characteristics, and physical parameters, such as zero-crossing rate (ZCR), autocorrelation (ACR), Kaiser-Teager energy (KTE), power spectral density (PSD), autoregressive coefficients (ARC), blood oxygen saturation (SpO 2 ), and body mass index (BMI). Further, they considered input features from the spectral analysis of the PPG signals, such as kurtosis (kurt) and skewness (skew) of the frequency distribution. Alongside these features, they calculated the heart rate (HR) and breath rate (BR) to validate the PPG signals. For each frame f of the PPG signal s, the feature vector is expressed as Equation (12) In the case of our model, for each PPG signal s, the feature vector can be expressed as Equation (13) X F = S SpO2 (13) In Table 10, the Pearson's r values represent the performances of the models, the number of features denotes the features used in the model, and the number of estimators represents the number of estimator trees in the models. The run time column in Table 10 represents the average execution time for estimating the blood-glucose concentration for one set of inputs.
From Table 10, it can be observed that our model outperforms the random forest model of [9] for all metrics. The Pearson's r of the XGBoost model [9] is slightly better than ours. However, the XGBoost model of [9] uses more features than ours to estimate blood-glucose concentrations; this renders the model [9] more complex and it requires more time to estimate the blood-glucose concentrations with comparable accuracy. We used only SpO 2 as the feature to estimate blood-glucose concentration, which reduces the model complexity and renders the model independent of different features.

Discussion
A Monte Carlo (MC) photon simulation-based model for estimating blood-glucose concentration via PPG on the fingertip is presented in this paper. The MC method was chosen for the photon simulations in the finger model because of its flexibility in computing optical interactions with biological tissues. A heterogeneous finger model with skin, fat, muscle, and bone layers was designed to propagate photons. To facilitate photon simulations in the model, the skin layer was divided into six sublayers. Two of the sublayers do not contain any blood and the remaining sublayers have blood flow. Biooptical properties such as absorption coefficient, scattering coefficient, anisotropy, and refractive index of these sublayers were obtained at both 660 nm and 940 nm to design the finger model for photon simulations.
After Monte Carlo photon propagation in the finger model, the detected photon intensities were recorded for use in the machine-learning model. The XGBoost regressor was used to analyze the relationships among the parameters. For the inputs to the machine-learning model, we used the detected photon intensities, which were considered as synthetic data, along with specific SpO 2 and glucose values. Four evaluation metrics, as well as the EGA plot and Bland-Altman analysis were considered to evaluate the MC-based model performance. We also collected PPG data from 35 volunteers with red and infrared light to evaluate the proposed model. By analyzing the blood-glucose estimation results, the proposed model was shown to have better performance than a few comparison models from the literature. The Pearson correlation coefficient (Pearson's r) and R 2 values of the synthetic data were 0.91 and 0.83, respectively. For the collected PPG data, the Pearson's r and R 2 values were 0.85 and 0.68, respectively. Moreover, the EGA showed clinically accurate results.
Previous studies in the field of noninvasive estimation of blood-glucose concentrations [17][18][19][20] has focused on deriving mathematical relationships between optical signals and physiological parameters, which often creates obstacles in estimating the desired parameters accurately. Moreover, PPG-based physiological features, signal-oriented characteristics, and physical parameters such as zero-crossing rate, autocorrelation, body mass index (BMI) often fail to estimate the blood-glucose concentration more accurately. Therefore, to reduce the possibility of inaccurate estimation of blood glucose concentration, the proposed finger model has been considered as heterogeneous, which was not considered in previous studies, and the bio-optical properties have also been deduced. The heterogenous finger model proposed in our study provides the scope for optimizing the model errors, i.e., the more precise design of the model and more accurate results. As the earlier studies in this field focused on deriving a mathematical model for estimating the blood glucose concentration, there is little scope for optimizing the model error as well as model complexity. Furthermore, in the case of a mathematical model, the computational time for estimating blood glucose concentration is longer than that of our proposed model. The Monte Carlo (MC) photon simulation-based model provides an estimation model that requires less time to estimate the desired parameters, thus reducing the computational time of the system.
In our study, heterogeneous finger models are considered for each layer thickness, but the thickness can vary from person to person. Therefore, future study may include variations in layer thickness and collecting more diverse datasets for both hypoglycemic and hyperglycemic subjects. Despite some limitations of this study, the proposed method is suitable for research purposes.

Conclusions
In this work, we proposed a Monte Carlo (MC) photon simulation-based model for noninvasive in vivo estimations of blood-glucose concentrations. The proposed approach comprises a finger model for propagating photons with the bio-optical properties of each of the layers at two wavelengths. After photon propagation by MC simulation, we used the detected photon intensities to estimate the blood-glucose concentrations with a supervised machine-learning model. The results were visualized using error grid analysis (EGA) plots and Bland-Altman analyses. Compared with other commonly used noninvasive blood-glucose estimation models, our proposed model was shown to be less complex but more accurate for estimating blood-glucose concentration.  Institutional Review Board Statement: Our research methodology complies with the ethical regulations of the Institutional Review Board (IRB), Kookmin University, Seoul, Korea. This research was conducted in accordance with the guidelines provided by the IRB, Kookmin University. We have also obtained informed consent from all the participants for utilizing the data obtained from them for academic research purposes.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
We have created our own dataset for this study. Since further research are on processing, we cannot publish the dataset right now.