Toward Non-Invasive Estimation of Blood Glucose Concentration: A Comparative Performance

The present study comprises a comparison of the Mel Frequency Cepstral Coefficients (MFCC), Principal Component Analysis (PCA) and Independent Component Analysis (ICA) as feature extraction methods using ten different regression algorithms (AdaBoost, Bayesian Ridge, Decision Tree, Elastic Net, k-NN, Linear Regression, MLP, Random Forest, Ridge Regression and Support Vector Regression) to quantify the blood glucose concentration. A total of 122 participants, both healthy and diagnosed with type 2 diabetes, were invited to be part of this study. The entire set of participants was divided into two partitions: a training subset of 72 participants, which was intended for model selection, and a validation subset comprising the remaining 50 participants, to test the selected model. A 3D-printed chamber for providing a light-controlled environment and a low-cost microcontroller unit were used to acquire optical measurements. The MFCC, PCA and ICA were calculated by an open-hardware computing platform. The glucose levels estimated by the system were compared to actual glucose concentrations measured by venipuncture in a laboratory test, using the mean absolute error, the mean absolute percentage error and the Clarke error grid for this purpose. The best results were obtained for MFCC with AdaBoost and Random Forest (MAE = 11.6 for both).


Introduction
Diabetes is the most probable cause of one in ten deaths among people aged 20-59 years, and it is catalogued as a national emergency in Mexico, according to the World Health Organization (WHO) [1,2]. Diabetes is a metabolic disorder in which the body is not capable of regulating blood glucose levels. Insulin is a hormone that is essential for regulating the glucose concentration in the blood and for the way the body converts glucose into energy. Two main types of the disease are distinguished: type 1 diabetes (T1D) takes place when the pancreas produces very little insulin or none at all, whereas type 2 diabetes (T2D) comes about when the body cannot effectively use the insulin it produces [1,3,4].
The adverse health effects of diabetes are well documented. To mention only a few, people with T2D are at high mortality risk from cardiovascular illnesses such as coronary heart disease or heart failure [5,6]; other consequences include diabetes-associated cognitive decline and dementia [7], kidney alterations [8], and vision impairment in the form of diabetic retinopathy [9], among others [10][11][12]. Recently, diabetes has been significantly associated with mortality from COVID-19 and with a doubling of the risk of developing more severe COVID-19, as compared to non-diabetics [13][14][15].
Blood glucose measurements can be performed by methods such as the glycated haemoglobin A1c (HbA1c) test or commercial glucometers for self-monitoring; however, the HbA1c laboratory test is considered the gold standard [16]. Both methods, the HbA1c test and glucometers, are invasive, i.e., they need to collect a blood sample by venipuncture or a pinprick, respectively, to carry out the process [17]. This precondition produces distress and discomfort in patients, thus hindering proper monitoring for disease control and treatment. Although there are rapid tests as alternatives to venipuncture for monitoring, they also require a pinprick to collect a blood sample from the finger of patients, a situation that urges the experimental development of non-invasive sensors and systems for blood glucose estimation. To alleviate this predicament, minimally invasive glucometers (MIGs) have emerged as an alternative. Nowadays, some MIGs are commercially available, such as the Gluco Track (Integrity Applications Ltd., Ashdod, Israel), which uses the earlobe as the medium for measuring glucose by applying thermal and ultrasonic technology, with the inconvenience that it needs individual calibration [18], and the FreeStyle Libre (Abbott Diabetes Care, Inc., Alameda, USA), designed to estimate glycemia in adults [19]. Some other MIGs remain in progress, e.g., the SugarBEAT (Nemaura Medical Inc., Loughborough, UK), which employs its own transmitter device synchronized to a disposable patch attached to the user's skin, the GlucoWise (MedWise Ltd., London, UK), which makes use of power radio waves transmitted through the earlobe [19], or smart contact lenses (Google, LLC., Mountain View, USA), which use tear fluid for measuring the glucose concentration [20].
Previously reported studies are commonly based on optical and spectroscopy technologies due to their ease and ability to analyze samples without the need of any prior manipulation [21,22]. Some of the preferred techniques for developing new MIGs investigated in the state-of-the-art employ near and mid-infrared spectroscopy [21][22][23][24][25][26], Raman spectroscopy [21,26], and infra-red spectroscopy based on Fourier transform [21]. Additionally, photoplethysmography (PPG) in the mid-infrared and first overtone regions of glucose absorbance [27][28][29] has been widely used for non-invasive glucose measurements, but suffers from drawbacks such as the high cost of the light sources, scattering due to fatty tissue, strong absorption due to water, and the high cost of the experimental setup. Most of the aforementioned studies use high-priced laboratory equipment or software that needs specialized personnel, hindering the development of reliable devices at an affordable price.
Besides signal acquisition techniques, feature extraction methods play an important role. Principal component analysis (PCA) and independent component analysis (ICA) are among the most widely used methods. In the speech recognition area, the Mel frequency cepstral coefficients (MFCC) method is one of the most important. These techniques are typically used to reduce the dimensionality of data before processing them with a pattern recognition algorithm. Numerous studies have reported good performances when using these algorithms, e.g., for EEG, ECG and biomedical applications [30][31][32][33]; nevertheless, only a few have addressed the MFCC for glucose monitoring purposes [34,35], and PCA and ICA have not yet been investigated in this context.
The aim of this work is to compare the performance of PCA, ICA and MFCC as feature extraction methods in a range of different regression algorithms (AdaBoost, Bayesian Ridge, Decision Tree, KNN, MLP, SVR, Random Forest, Ridge, Elastic Net, Linear Regression) included in the scikit-learn library for the Python programming language. The dataset intended for this purpose was acquired using a cost-effective setup. After the regression comparison, all algorithms were trained and applied to the external validation test with the aim to explore their performance for the further development of a daily-use device.

Materials and Methods
The present work encompasses a non-invasive methodology for estimating the blood glucose concentration by analyzing the light absorption response of a finger when a beam of light is pointed at it. To this end, a laser beam is directed at the fingertip of the user while an LDR (light dependent resistor) sensor captures the light transmitted through the finger. The working principle of the proposal is the Beer-Lambert law, which provides a formulation for calculating how much of a material is present in a sample from its light absorbance. In other words, the quantity of a material present in a sample is proportional to its light absorption and, consequently, the intensity of transmitted light will decrease as the concentration of the material in a medium increases [36]. Although related past works utilize a variety of body tissues (such as forearm, earlobe or cheek) for measuring light absorbance, we selected the fingertip to take advantage of its high capillary concentration [37]. Furthermore, similar to some of the related works [21,22,24,25,26,28,36], we hypothesized that variations in transmitted light may correspond to variations in glucose concentration in the blood. The glucose levels estimated by the system were compared to actual glucose concentrations measured by venipuncture in a laboratory test, using the mean absolute error, the mean absolute percentage error and the Clarke error grid for this purpose. A block diagram of the proposal is depicted in Figure 1.
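For reference, the Beer-Lambert law is conventionally written as shown below. The symbols are the usual textbook ones and are not taken from this manuscript: I_0 is the incident light intensity, I the transmitted intensity, ε the molar absorptivity of the absorbing material, ℓ the optical path length and c the concentration.

```latex
A \;=\; \log_{10}\!\left(\frac{I_0}{I}\right) \;=\; \varepsilon\,\ell\,c
\qquad\Longrightarrow\qquad
I \;=\; I_0\,10^{-\varepsilon\,\ell\,c}
```

Under this idealized relation, a higher concentration c yields a lower transmitted intensity I for a fixed path length. In living tissue, scattering and other absorbers are also present, so the relation should be read as a guiding principle rather than an exact model of the fingertip.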

Participants
A total of 122 participants, both healthy and diagnosed with type 2 diabetes (T2D), averaging 36.7 ± 14.2 years of age within a range of 21-76 years, were invited to be part of this study. All procedures in this study were performed in accordance with the provisions of the Declaration of Helsinki. All participants signed an informed consent form and were made aware of the research objectives. Participants with a range of different skin tones were preferentially selected for capturing the common variations in the sample, thus making the dataset representative of our region.
The entire set of participants was divided into two partitions. The first, a training subset of 72 participants (~60%), was intended for model selection, whereas the second validation subset comprised the remaining 50 participants (~40%) and was intended to test the selected model with unknown data. Further details of the participants are described in Table 1. There, the average ± the standard deviation and the (min, max) range regarding the age and glucose concentration of participants are described. The percentages of female and male participants and for those diagnosed with T2D are also reported.

Data Collection
The system implementation considers a primary stage for data collection. In order to collect the data, a 650 nm wavelength laser beam was pointed at the user's fingertip. An LDR sensor, placed below the finger, measured the light transmitted through the finger. Because the LDR is highly sensitive to light changes, both components, the laser and the LDR, were enclosed in a 3D-printed chamber to provide a light-controlled environment. A low-cost microcontroller unit (MCU) acquired the LDR signal at a sampling frequency of 1 kHz for 6 s, thus obtaining 6000 values arranged in a vector. Because variations in the data can be induced by placing the finger on the sensor and removing it, we considered only the 4000 values in the middle and discarded the first 1000 and the last 1000 positions of the vector. This signal was transmitted to a Raspberry Pi (RPi) board, responsible for the feature extraction and regression procedure for glucose estimation.
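The trimming step described above can be sketched as follows. This is a minimal illustration assuming the recording is already available as a one-dimensional array; the function name `trim_recording` and the synthetic input are hypothetical and not part of the acquisition firmware.

```python
import numpy as np

SAMPLE_RATE_HZ = 1000  # sampling frequency stated in the text
DURATION_S = 6         # acquisition time stated in the text
TRIM = 1000            # samples discarded at each end, as stated in the text


def trim_recording(raw):
    """Return the central 4000 samples of a 6000-sample LDR recording."""
    raw = np.asarray(raw, dtype=float)
    expected = SAMPLE_RATE_HZ * DURATION_S
    if raw.shape != (expected,):
        raise ValueError(f"expected {expected} samples, got shape {raw.shape}")
    return raw[TRIM:expected - TRIM]


# Synthetic stand-in for one recording (not real sensor data).
raw_signal = np.arange(6000, dtype=float)
central = trim_recording(raw_signal)
```

Checking the input length up front makes a truncated or over-long transfer from the MCU fail loudly instead of silently shifting the retained segment.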

Feature Extraction
Feature extraction techniques are intended for transforming a complex signal to representative variables to be used by prediction algorithms. In this manuscript, three methods for performing feature extraction were addressed: Mel frequency cepstral coefficients (MFCC) [38], principal components analysis (PCA) [39] and independent component analysis (ICA) [40]. Prior to applying the feature extraction algorithms, signal data were filtered using a Hamming window of size 40.
The MFCC method is one of the most used feature extraction techniques, commonly applied in acoustics and speech recognition [41]. In this method, the signal is divided into overlapping frames of n samples. A fast Fourier transform (FFT) is applied to each frame, resulting in a representation of the signal in the frequency domain. The resulting spectrum is filtered by the Mel filter banks. Finally, the cepstral coefficients are calculated by computing the discrete cosine transform (DCT) of the logarithm of the Mel filter-bank outputs. The MFCC function included in the python_speech_features library returns the first thirteen DCT coefficients, which constitute the MFCC.
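The MFCC steps listed above (framing, FFT, Mel filter bank, logarithm, DCT) can be sketched with NumPy alone. This is not the python_speech_features implementation: the frame length, frame step, FFT size, number of filters, the per-frame Hamming window and the averaging of coefficients over frames are all assumptions made for illustration, since the text does not specify them.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the Mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i, k] = (right - k) / (right - center)
    return fbank


def mfcc(signal, sample_rate=1000, frame_len=256, frame_step=128,
         n_fft=512, n_filters=26, n_cep=13):
    """Return an array of shape (n_frames, n_cep) with cepstral coefficients."""
    signal = np.asarray(signal, dtype=float)
    n_frames = 1 + (len(signal) - frame_len) // frame_step
    idx = np.arange(frame_len)[None, :] + frame_step * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)          # overlapping, windowed frames
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(np.maximum(energies, np.finfo(float).eps))
    # Unnormalized DCT-II basis; keep only the first n_cep coefficients.
    n = np.arange(n_filters)
    k = np.arange(n_cep)[:, None]
    dct_basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_filters))
    return log_energies @ dct_basis.T


# Synthetic 4 s signal at 1 kHz (not real sensor data).
rng = np.random.default_rng(0)
t = np.arange(4000) / 1000.0
signal = np.sin(2 * np.pi * 1.2 * t) + 0.05 * rng.standard_normal(4000)
coeffs = mfcc(signal)
features = coeffs.mean(axis=0)  # assumption: one 13-value vector per recording
```

The per-frame coefficients are averaged here only to obtain a fixed-length vector per participant; other aggregations would be equally plausible.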
PCA is commonly used as an exploratory tool for data analysis; it is a decomposition technique that employs singular value decomposition (SVD) with the aim to project data into a lower dimensional space. The principal components represent the orthogonal projections for which the variance of data is maximum, that is, the directions in the feature space along which the original data are highly variable [42].
The ICA algorithm is typically applied in signal processing for separating superimposed signals rather than for dimensionality reduction. It is supposed to find the projections that decompose a multivariate signal into sources that are statistically independent [43].
For both PCA and ICA, the first three components were selected.
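A minimal sketch of extracting three PCA and three ICA components with scikit-learn follows. The text does not detail how the Hamming window of size 40 is applied, so the sketch assumes a smoothing convolution with a normalized window; the helper `hamming_smooth` and the synthetic signals are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA


def hamming_smooth(signal, size=40):
    """Smooth a 1-D signal by convolving it with a normalized Hamming window."""
    window = np.hamming(size)
    return np.convolve(signal, window / window.sum(), mode="same")


# Synthetic stand-in: one row per participant, one column per sample.
rng = np.random.default_rng(0)
n_participants, n_samples = 30, 500
signals = rng.standard_normal((n_participants, n_samples)).cumsum(axis=1)
smoothed = np.apply_along_axis(hamming_smooth, 1, signals)

# Keep the first three components of each decomposition, as in the text.
pca_features = PCA(n_components=3).fit_transform(smoothed)
ica_features = FastICA(n_components=3, random_state=0, max_iter=1000).fit_transform(smoothed)
```

When a separate validation subset is used, the decomposition would normally be fitted on the training signals only and then applied to the validation signals with `transform`, so that no information leaks from the held-out data.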

Regression Models
For this study, ten regression models were fit and tested. In order to obtain an enhanced result, a grid search for hyperparameter tuning was performed on the training subset using a 5-fold cross-validation scheme for each algorithm; parameters yielding the least MAE were taken into consideration. All regression algorithms were implemented with the Python programming language using the scikit-learn library. Below, a description of the regression models is presented.
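The grid search with 5-fold cross-validation and MAE as the selection criterion can be sketched with scikit-learn as follows. The parameter grid, the choice of Random Forest as the example estimator and the synthetic data are assumptions for illustration; the grids actually searched are not listed in this section.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for 72 training participants with 13 features each.
rng = np.random.default_rng(0)
X_train = rng.standard_normal((72, 13))
y_train = 100.0 + 10.0 * X_train[:, 0] + rng.standard_normal(72)

# Hypothetical grid; scikit-learn maximizes scores, hence the negated MAE.
param_grid = {"n_estimators": [10, 50], "max_depth": [2, None]}
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="neg_mean_absolute_error",
    cv=5,
)
search.fit(X_train, y_train)

best_params = search.best_params_
best_mae = -search.best_score_  # mean MAE over the five folds
```

After fitting, `search.best_estimator_` holds the model refitted on the whole training subset with the selected parameters, which is the object one would then apply to the validation subset.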

AdaBoost
AdaBoost regressor is one of the most used boosting algorithms. It is a method that combines a number of weak algorithms, fitting them on the training data by adjusting the weights of corresponding instances according to the measured output errors. This process is repeatedly performed until a predefined error condition is met [44].

Bayesian Ridge
This estimates a probabilistic regression model in which the priors for the parameters are given by a spherical Gaussian. This model may include regularization parameters in the estimation procedure and is performed iteratively by trying to maximize the log-likelihood of the instances [45].

Decision Tree
A decision tree regressor recursively partitions the feature space with binary splits, each chosen to minimize an error criterion (such as the squared error) within the resulting subsets, and predicts the mean target value of the training instances that fall into each leaf. As the depth of the tree increases, the regressor is capable of including more details in the model, at the cost of a higher risk of overfitting [46].

Elastic Net
This model consists of a linear regression that combines L1 and L2 regularization while fitting the coefficients to the training data. Owing to the L1 penalty, the elastic net can yield a sparse model, while the L2 penalty makes it convenient when correlated features are present in the data [47,48].

k-Nearest Neighbors Regressor (k-NNr)
This is a well-known instance-based model. It works in two main steps: first, computing a similarity measure (commonly the Euclidean distance) between the training instances and the instance to be predicted; then, averaging the outputs of the k closest observations to give the predicted value. The choice of k is of great importance: a very large value may yield high error rates, whereas a very small value may lead to overfitting [49].
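The two steps described above, distance computation followed by averaging the k nearest outputs, can be sketched in a few lines of NumPy. The function `knn_predict` and the toy data are hypothetical and only illustrate the mechanism; the study itself relies on the scikit-learn implementation.

```python
import numpy as np


def knn_predict(X_train, y_train, x_query, k=3):
    """Predict by averaging the targets of the k nearest training instances."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    # Euclidean distance from the query to every training instance.
    distances = np.linalg.norm(X_train - np.asarray(x_query, dtype=float), axis=1)
    nearest = np.argsort(distances)[:k]
    return float(y_train[nearest].mean())


# Toy data: one feature, four training instances.
X_demo = [[0.0], [1.0], [2.0], [10.0]]
y_demo = [90.0, 100.0, 110.0, 200.0]
prediction = knn_predict(X_demo, y_demo, [1.1], k=3)       # averages 90, 100, 110
prediction_all = knn_predict(X_demo, y_demo, [1.1], k=4)   # also includes the distant 200
```

With k = 3 the prediction is 100.0, whereas with k = 4 the distant instance pulls it up to 125.0, which illustrates how an overly large k degrades the estimate.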

Linear Regression
This constitutes the simplest regression model, and it is also one of the preferred regression techniques due to its capability for capturing data behavior and its ease of implementation. In the present study, an Ordinary Least Squares (OLS) linear regression was implemented using the gradient descent method for error minimization and a mean squared error as a cost function [42].

Multilayer Perceptron Regressor (MLPr)
This type of neural network employs the backpropagation learning algorithm for training a multi-layer perceptron (MLP). Unlike an MLP used for classification, this implementation uses a linear (identity) activation function in the output layer, so that it delivers continuous values as an output [50].

Random Forest
This model fits a variety of decision trees considering random subsamples with replacement from the dataset in such a way that subsamples are always the same size as the original input. After that, it averages the results for overfitting control and helping to reduce the predictive error [51].

Ridge Regression
This model is similar to LASSO in that both are based on OLS linear regression and support multivariate regression. However, ridge regression overcomes some problems of OLS by adding a coefficient regularization penalty given by the L2-norm. As a result, ridge coefficients minimize a penalized residual sum of squares rather than the residual sum of squares alone [42,47].
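The effect of the L2 penalty can be illustrated with the closed-form ridge solution. This is a sketch under stated assumptions: the intercept is left unpenalized by centering the data, the penalty strength `alpha` and the synthetic data are hypothetical, and the function `ridge_fit` is not the scikit-learn solver.

```python
import numpy as np


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge coefficients with an unpenalized intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc.
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept


# Synthetic data with two strongly correlated features.
rng = np.random.default_rng(0)
x0 = rng.standard_normal(50)
X_demo = np.column_stack([x0, x0 + 0.01 * rng.standard_normal(50), rng.standard_normal(50)])
y_demo = 3.0 * x0 + rng.standard_normal(50)

coef_ols, _ = ridge_fit(X_demo, y_demo, alpha=0.0)      # alpha = 0 reduces to OLS
coef_ridge, _ = ridge_fit(X_demo, y_demo, alpha=10.0)   # penalized solution
```

Setting `alpha=0.0` recovers ordinary least squares, so comparing the two coefficient vectors shows the shrinkage that the penalty introduces when features are correlated.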

Support Vector Regression (SVR)
This algorithm comes from statistical learning theory. In its simplest form, it employs a linear kernel for delivering the regression values and a loss parameter denoting the maximum permitted error of predictions with respect to the reference output values. It can also be extended to non-linear predictions by using the kernel trick, in which data are implicitly mapped to a higher-dimensional space [52,53].

Algorithm Performance
Regression analysis was performed for model selection. To measure the performance of the algorithms, the experiments were run considering the following aspects:
• Evaluation metrics. The mean absolute error (MAE), the mean absolute percentage error (MAPE) and the Clarke error grid were considered. MAE, computed as in Equation (1), is the average absolute difference between the estimated values and the laboratory test values; MAPE expresses the same difference as a percentage of the laboratory value (see Equation (2)). The Clarke error grid is a plot of the reference glucose values versus the algorithm outcomes divided into five zones, and the quality of the results depends on the zones in which the points fall. Zones A and B stand for accurate or acceptable estimation, zone C is commonly associated with unnecessary treatments, and zones D and E are representative of potentially dangerous mistreatment caused by confusing hyperglycemia and hypoglycemia.
• Cross-validation. Data from all participants were divided into two mutually exclusive partitions. The k-fold cross-validation scheme was applied to 60% of the data, and the remaining 40% were reserved for testing the model once the algorithms were fitted. In this study, a five-fold cross-validation scheme was preferred: the data are divided into five subsets, commonly referred to as folds, and a procedure is repeated in which, at the i-th step, the i-th fold is taken as the test subset while the remaining four folds are used to train the regressor [42]. This procedure is graphically outlined in Figure 2.
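The evaluation metrics can be sketched as follows, using their conventional definitions. Only zone A of the Clarke error grid is computed, under the usual criterion that an estimate lies within 20% of the reference or that both values are below 70 mg/dL; the remaining zones require the full grid geometry and are omitted. The function names, the mg/dL unit and the example values are assumptions made for illustration and are not data from this study.

```python
import numpy as np


def mean_absolute_error(reference, estimate):
    """Average absolute difference between estimates and reference values."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(np.mean(np.abs(estimate - reference)))


def mean_absolute_percentage_error(reference, estimate):
    """Average absolute difference expressed as a percentage of the reference."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    return float(100.0 * np.mean(np.abs((estimate - reference) / reference)))


def clarke_zone_a_percentage(reference, estimate):
    """Percentage of points in zone A (values assumed to be in mg/dL)."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    within_20 = np.abs(estimate - reference) <= 0.2 * reference
    both_low = (reference < 70) & (estimate < 70)
    return float(100.0 * np.mean(within_20 | both_low))


# Made-up example values (not measurements from the study).
reference = [100.0, 150.0, 60.0, 200.0]
estimate = [110.0, 125.0, 65.0, 150.0]

mae = mean_absolute_error(reference, estimate)
mape = mean_absolute_percentage_error(reference, estimate)
zone_a = clarke_zone_a_percentage(reference, estimate)
```

For these four points the absolute errors are 10, 25, 5 and 50, giving an MAE of 22.5 and a MAPE of 15.0; three of the four points satisfy the zone A criterion.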

Results
As mentioned before, all the regression algorithms used 5-fold cross-validation for hyperparameter grid searching during model selection; the outcomes are presented in Table 2, and the results of the grid search process are depicted in Figure 3. The selected parameters were used for fitting the regression algorithms on the training subset. Subsequently, the behavior of the algorithms was measured on the validation subset using the mean absolute error (MAE), the mean absolute percentage error (MAPE) and the Clarke error grid.
Three feature extraction procedures were tested. The first experiment consisted of fitting the models using the dataset built by computing the MFCC from the acquired signal; all 13 MFCC were taken. The second and third experiments comprised the calculation of the PCA and ICA from the Hamming-filtered signals. For both PCA and ICA, the first three components were considered for feature construction. The performance results in terms of MAE, MAPE and the Clarke error grid regions for the MFCC, PCA and ICA can be consulted in Table 3, Table 4 and Table 5, respectively.
Table 4. Results on the test subset using the first three components of PCA feature extraction.
Overall, the best MAE scores were 11.62, 13.98 and 16.73, obtained by AdaBoost with the MFCC, Random Forest with PCA and k-Nearest Neighbors with ICA, respectively. Similarly, the best MAPE values achieved were 10.21 and 10.42 for Random Forest with MFCC and PCA, respectively; with ICA, the best MAPE was 12.87, obtained by k-Nearest Neighbors.

[Columns of Tables 3-5: Regression Model | MAE | MAPE | Clarke Error Grid Region (%) (A-B-C-D-E)]
It is worth mentioning that PCA and ICA seem to favor regressors built upon a linear regression, such as Bayesian Ridge, Elastic Net, Ridge Regression and Linear Regression itself. In contrast, with the MFCC extractor the linear regressors seem to be penalized. This behavior is reflected in the regions of the Clarke error grid, in which AdaBoost and Random Forest achieved more than 90% of points in region A. Regarding PCA, AdaBoost, Random Forest and SVR had the best results, with more than 80% of points in region A. With ICA, the best regressors were AdaBoost and SVR, both with 82% of the points in region A.
Although the family of linear regressors obtained reasonable MAE and MAPE results with PCA, their results regarding the Clarke error grid are not the best: Bayesian Ridge, Elastic Net and Linear Regression obtained 59% and 41% of the points in regions A and B, respectively, with PCA.

Discussion
The results presented in this study show clinically acceptable prediction errors, as established by the Clarke grid analysis and the MAE and MAPE metrics, and they are competitive in comparison to previous related works. The authors in [34] acquired a long photoplethysmography signal and divided it to generate a larger number of samples. They extracted the MFCC and used them as input vectors for classification algorithms obtaining up to 90% of qualitative analysis, i.e., they identified subjects for hypoglycemia and normal or high glucose concentrations rather than obtaining the glucose concentration value; they also reported a correlation value of 0.88 in the Clarke error grid. With respect to [54], the authors took a picture of the subject's fingertip, and after processing the image they calculated descriptors that were the input values for fitting an ordinary least squares linear regression. The predicted values were then compared with two reference tests: a commercial glucometer and a standard laboratory test, reporting root mean squared errors of 15.94 and 9.81, respectively. Similarly, the authors from [55] presented a sophisticated device that incorporates different sensors for data acquisition such as an image sensor as well as diverse monochromatic light sources in the range from blue to infrared, and even a conventional invasive glucometer module. They reported MAPE values of 11.2%, 11.6% and 12.7% after conducting experiments in comparison to three commercial glucometers of different brands.
One of the main goals of our proposal is to provide a simple and effective way to monitor blood glucose concentration non-invasively. The main differences in relation to state-of-the-art works are that some involve the use of complex hardware systems, whereas in some others, the reported results do not provide an estimated glucose concentration value, offering a qualitative evaluation instead.

Conclusions
The potential use and comparison of the MFCC, ICA and PCA algorithms as feature extractors have been explored in this work. Here, each of them was applied along with ten well-known regression models in order to estimate the glucose concentration in blood in a non-invasive way. The obtained results and the experimental low-cost setup for data acquisition raise the idea of continuing the development of more affordable devices for glycemia monitoring.
Although the MFCC method has been applied in a range of diverse healthcare applications before, its use as a feature extractor in the estimation of blood glucose concentration had not been reported until now.
The present study constitutes a baseline for the exploration of different sensors with regard to the MFCC features towards the non-invasive estimation of blood glucose concentration. After analyzing the obtained results with the light-dependent resistor used here, we can hypothesize that the use of different kinds of sensors, such as photoplethysmography, could be one of the future directions for exploring the MFCC method as a feature extractor due to its inherent way of obtaining a variety of frequencies. Despite the existence of diversity in the skin color of participants, this topic was not considered for this study; nevertheless, it would be of paramount interest to be included in future research with a larger sample of subjects.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the authors.