This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

The availability of continuous glucose monitoring (CGM) sensors allows the development of new strategies for the treatment of diabetes. In particular, from an on-line perspective, CGM sensors can become “smart” by providing them with algorithms able to generate alerts when glucose concentration is predicted to exceed the normal range thresholds. To do so, at least four important aspects have to be considered and dealt with on-line. First, the CGM data must be accurately calibrated. Then, CGM data need to be filtered in order to enhance their signal-to-noise ratio (SNR). Thirdly, predictions of future glucose concentration should be generated with suitable modeling methodologies. Finally, alerts should be generated so as to minimize the risk of raising false alerts and of missing true events. For each of these four challenges, several techniques, with various degrees of sophistication, have been proposed in the literature and are critically reviewed in this paper.

The knowledge of glucose concentration in blood is a key aspect in the quantitative understanding of the glucose-insulin system and in the diagnosis and treatment of diabetes. The use of signal processing techniques on glucose data started some decades ago, when glucose time-series in a given individual could be obtained in laboratories from blood samples drawn at a sufficiently high rate. In particular, an important body of literature of the 1980s and 1990s employed not only linear (e.g., correlation and spectrum analysis, peak detection), but also nonlinear (e.g., approximate entropy) methods to investigate oscillations present in glucose (and insulin) time-series obtained, during hospital monitoring, by drawing blood samples every 10–15 min for up to 48 h [

New scenarios in diabetes treatment have opened up in the last ten years, when minimally invasive continuous glucose monitoring (CGM) sensors, able to monitor glucose concentration continuously for several days, entered clinical research [

Most of the commercial minimally invasive CGM systems, e.g., the CGMS® (Medtronic Minimed Inc., Northridge, CA, USA), the GlucoDay® (Menarini Diagnostics, Florence, Italy), the FreeStyle Navigator® (Abbott Diabetes Care, Alameda, CA, USA), and the SEVEN® Plus (DexCom Inc., San Diego, CA, USA), measure glucose concentration in the interstitial fluid (IG) rather than directly in blood (BG).

The existence of BG-to-IG kinetics, however, cannot explain some of the discrepancies which are evident along the y-axis, e.g., in the interval between 18–25 h. This difference is more likely due to a change in the performance of the CGM sensor after its initial calibration.

The fact that CGM profiles can be affected by calibration problems can be critical in several applications, e.g., alert generation systems and the artificial pancreas. For this reason, real-time “recalibration” of CGM data is desirable, where by recalibration we mean a step in which the sensor output (in mg/dL) is processed by an algorithm (e.g., external to the device) in order to improve its accuracy. After recalibration, the difference between BG and CGM samples should be due to BG-to-IG kinetics only.

Several studies in the literature tried to cope with the recalibration problem, mostly retrospectively. A detailed study has been presented by the DirecNet Study Group [

A more recent recalibration procedure, conceived for off-line application, is that presented by King

None of the above recalibration algorithms explicitly considers the distortion introduced by BG-to-IG kinetics or the possibly time-varying behavior of sensor performance (the sensor gain is estimated only once for the entire monitoring period). A first comprehensive description of the CGM measurement process is due to Knobbe and Buckingham [

Under these assumptions, a nonlinear state-space dynamic model can be obtained, where the unknown IG(k) and α(k) represent two of the unknown states, which can be estimated by an extended Kalman filter (EKF). An example on real data (collected in a type 1 diabetic subject) is shown in
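The mechanics of such a joint state estimation can be sketched as follows. This is a minimal illustration, not the implementation of Knobbe and Buckingham: the state model, noise variances, and initial values are assumptions chosen for the example. Both IG and the sensor gain α are modeled as random walks, and the nonlinear measurement y(k) = α(k)·IG(k) + noise is linearized at each step, as required by the EKF.

```python
# Sketch of an EKF jointly estimating interstitial glucose IG(k) and a
# slowly drifting sensor gain alpha(k). All numerical values (process and
# measurement noise variances, initial state) are illustrative assumptions.

def ekf_recalibrate(y, q_ig=1.0, q_alpha=1e-5, r=4.0, ig0=100.0, a0=1.0):
    x = [ig0, a0]                      # state: [IG, alpha]
    P = [[100.0, 0.0], [0.0, 0.01]]    # state covariance
    est = []
    for yk in y:
        # prediction: random-walk states, covariance grows by Q
        P[0][0] += q_ig
        P[1][1] += q_alpha
        # linearized measurement: H = d(alpha*IG)/d[IG, alpha] = [alpha, IG]
        H = [x[1], x[0]]
        PHt = [P[0][0]*H[0] + P[0][1]*H[1], P[1][0]*H[0] + P[1][1]*H[1]]
        S = H[0]*PHt[0] + H[1]*PHt[1] + r
        K = [PHt[0] / S, PHt[1] / S]
        innov = yk - x[1] * x[0]
        x = [x[0] + K[0]*innov, x[1] + K[1]*innov]
        # covariance update: P = (I - K H) P
        P = [[(1 - K[0]*H[0])*P[0][0] - K[0]*H[1]*P[1][0],
              (1 - K[0]*H[0])*P[0][1] - K[0]*H[1]*P[1][1]],
             [-K[1]*H[0]*P[0][0] + (1 - K[1]*H[1])*P[1][0],
              -K[1]*H[0]*P[0][1] + (1 - K[1]*H[1])*P[1][1]]]
        est.append((x[0], x[1]))
    return est
```

For instance, a constant sensor reading of 120 mg/dL drives the product of the two estimated states toward 120; how the filter splits it between IG and α depends on the assumed noise variances.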


In addition to calibration errors, the CGM signal is also corrupted by a random noise component. The noise typically dominates the true signal at high frequency and is usually considered to be additive:

y_{k} = u_{k} + v_{k}

where u_{k} is the actual, but unknown, glucose level at time t_{k} and v_{k} is the random measurement error. The amount of noise corrupting the true glucose level depends on the sensor technology. For instance, some CGM profiles are more “stable” than others, the variance of {v_{k}} being lower in the first case (top panel). Moreover, a given sensor technology may behave differently in different subjects, e.g., the variance of {v_{k}} may vary among subjects (compare the noise affecting the FreeStyle Navigator® time-series of

Given the expected spectral characteristics of the noise, (causal) low-pass filtering is the natural candidate for its reduction. However, since the spectra of signal and noise overlap, it is not possible to remove the noise v_{k} from the measured signal y_{k} without also distorting the true signal u_{k}. In particular, distortion results in a delay affecting the estimate û_{k} with respect to the true u_{k}: the heavier the filtering, the larger the delay. It is easily appreciated that having a consistently delayed, even if less noisy, CGM signal could severely limit its use in practice, e.g., for the generation of timely hypo-alerts. A clinically relevant issue is thus the establishment of a suitable compromise between the regularity of û_{k} and its delay with respect to the true u_{k}. Understanding how denoising is done inside commercial CGM devices is often difficult, but evidence inferred from the patent literature indicates that nonlinear pre-filtering and moving-average (MA) filters are very often used [

Nonlinear pre-filtering techniques, such as signal clipping, median filtering, and hard-bounding, are used to deal with, e.g., spurious spikes which may occasionally occur during the monitoring (see, e.g., the artifacts in the bottom panel of
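Two of these nonlinear pre-filtering operations can be sketched as follows. The window length and the bounds are illustrative assumptions, not values taken from any specific device: a moving median suppresses isolated spikes, and hard-bounding clips the signal to a physiologically plausible range.

```python
# Sketch of nonlinear pre-filtering: moving-median spike suppression
# followed by hard-bounding. Window length and bounds are illustrative.

def median_prefilter(y, window=3):
    half = window // 2
    out = []
    for k in range(len(y)):
        lo, hi = max(0, k - half), min(len(y), k + half + 1)
        out.append(sorted(y[lo:hi])[(hi - lo) // 2])  # median of the window
    return out

def hard_bound(y, lo=40.0, hi=400.0):
    # clip each sample to the assumed plausible glucose range [lo, hi]
    return [min(max(v, lo), hi) for v in y]
```

Applied to a series containing an isolated artifact, e.g., [100, 100, 300, 100, 100], the moving median removes the spike while leaving the surrounding samples untouched.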

As far as MA filtering is concerned, having fixed the so-called order M of the filter, the estimate of the signal at the k-th sampling time is a weighted sum of the last M measured samples:

û_{k} = (c_{1}y_{k} + c_{2}y_{k−1} + … + c_{M}y_{k−M+1}) / (c_{1} + c_{2} + … + c_{M})

where c_{1}, c_{2}, …, c_{M} are positive real numbers. The order M and the weights {c_{k}} are crucial parameters to be tuned in the filter design. For instance, one choice is to make all the {c_{k}} equal. Another frequent choice is to let c_{i} = μ^{i}, where μ is a positive real number (between 0 and 1) which governs how quickly the weights decay, so that more recent samples receive larger weight. Tuning the filter requires balancing two risks: with oversmoothing, the estimate û_{k} cannot track fast changes present in the true sequence u_{k} and/or a significant delay is introduced. On the other hand, undersmoothing may leave the filtered profile hardly usable for making decisions such as whether or not an alert should be generated. As an example, consider the application of an MA filter (with all the c_{i}'s equal to 1) to the signals of
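The MA filter above can be sketched directly. This is a generic causal implementation with the exponential weights c_{i} = μ^{i}; the default values of M and μ are illustrative tuning choices, to be traded off against the smoothing/delay compromise discussed above.

```python
# Causal moving-average (MA) filter of order M with exponential weights
# c_i = mu**i, normalized to sum to one. M and mu are illustrative defaults.

def ma_filter(y, M=5, mu=0.65):
    w = [mu ** i for i in range(1, M + 1)]   # weight on y[k], y[k-1], ...
    s = sum(w)
    c = [wi / s for wi in w]
    out = []
    for k in range(len(y)):
        acc, norm = 0.0, 0.0
        for i in range(min(M, k + 1)):       # fewer taps at the series start
            acc += c[i] * y[k - i]
            norm += c[i]
        out.append(acc / norm)               # re-normalize the partial window
    return out
```

By construction the filter is unbiased for a constant input: a flat glucose trace passes through unchanged, while high-frequency noise is attenuated at the price of a delay on fast excursions.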

It is clear that any optimization made on the order and weights of an MA filter cannot be directly transferred from one sensor to another. For instance, CGM time-series observed with different sampling rates must be processed by filters with different parameters (even when they exhibit the same SNR, as happens in some portions of the recordings of

Some methods which can be used for denoising CGM signals with approaches more sophisticated than MA can be found in several works. In one such approach, the measurement noise v_{k} is described as a white noise with zero mean and unknown variance σ^{2} (depending on the individual time-series and, in general, time-varying). The unknown signal u_{k} is modeled as the realization of a stochastic process obtained by the cascade of a certain number of integrators driven by a zero-mean white noise process with (unknown) variance λ^{2} (these models were already mentioned in Section 2). This is a commonly-used, simple, but versatile way to give an a priori second-order probabilistic description of a smooth time-series. For instance, when a single integrator is considered, the random-walk model is used:

u_{k} = u_{k−1} + w_{k}

where {w_{k}} is a sequence of zero-mean white noise samples with variance λ^{2}. Assuming a Gaussian setting, the model states that, given u_{k−1}, u_{k} will lie with probability 99.7% in the range u_{k−1} ± 3λ, irrespective of the past history of {u_{k}}. Here, λ^{2} is unknown and can be estimated, individually for each time-series, from the data {y_{k}} together with σ^{2}. This is possible using CGM data of a burn-in interval, by using a statistically-based smoothing criterion having a maximum likelihood interpretation. Once λ^{2} and σ^{2} have been estimated, the problem of extracting u_{k} from y_{k} (after the burn-in interval) can be numerically solved, in a computationally efficient manner, by resorting to the equations of the causal Kalman filter (KF). Therefore, a key feature of this method is that it can individualize the filter parameters λ^{2} and σ^{2}, and hence the amount of smoothing, according to the SNR of the specific CGM signal.
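The causal KF for the random-walk prior reduces to a few scalar recursions, sketched below. In the method described above, λ^{2} and σ^{2} would be estimated from the burn-in interval; in this minimal sketch they are simply passed in as parameters, and the initial state and variance are illustrative assumptions.

```python
# Sketch of causal Kalman filtering with the random-walk prior
# u_k = u_{k-1} + w_k. lam2 and sig2 play the roles of lambda^2 and
# sigma^2; in the method discussed above they would be estimated from
# a burn-in interval, here they are given.

def kf_denoise(y, lam2, sig2, u0=None, p0=1e3):
    u = y[0] if u0 is None else u0   # state estimate
    p = p0                           # state variance
    out = []
    for yk in y:
        p += lam2                    # prediction: random walk adds lam2
        k = p / (p + sig2)           # Kalman gain
        u += k * (yk - u)            # measurement update
        p *= (1 - k)
        out.append(u)
    return out
```

A small lam2/sig2 ratio yields strong smoothing (small gain), a large ratio yields an estimate that closely follows the measurements: the ratio encodes exactly the individualized smoothing amount discussed in the text.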

The variance of the noise component in these two time-series is clearly different. The application of the denoising procedure provides estimates of σ^{2} equal, respectively from top to bottom, to 21.4 and 1.8 mg^{2}/dL^{2}, in line with the intuition that the first time-series is noisier than the second one, and suggesting that a suitable amount of smoothing is introduced by the filtering method (for the sake of completeness, the corresponding estimates of λ^{2} are 0.14 and 0.06 mg^{2}/dL^{2}, respectively). Interestingly, considering the top-panel time-series, the time-lag introduced by Kalman filtering in

In Facchinetti et al., it was then demonstrated that σ^{2} (and λ^{2}) could be determined continuously on a sliding window [

A natural on-line application of CGM sensors is the prevention of hypo-/hyperglycemic events. Only a few years after the appearance of CGM sensors on the market, some methods were proposed to generate alerts when the current trend of the glucose concentration profile suggested that hypoglycemia was likely to occur within a short time. Such techniques are often termed projection methods. For instance, in Choleau
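The projection idea can be sketched as follows. This is a generic illustration in the spirit of such methods, not the algorithm of Choleau et al.: the window length, hypoglycemic threshold, and projection horizon are all assumed values. The recent trend is estimated by a least-squares line through the last few samples, and an alert is raised if the extrapolated line crosses the threshold within the horizon.

```python
# Sketch of a projection method for hypo-alert generation. Sampling
# period (min), window length, threshold (mg/dL) and horizon (min)
# are illustrative assumptions.

def projected_hypo_alert(y, ts=5.0, n=4, threshold=70.0, horizon=30.0):
    if len(y) < n:
        return False
    t = [i * ts for i in range(n)]
    g = y[-n:]
    tm, gm = sum(t) / n, sum(g) / n
    # least-squares slope of the last n samples (mg/dL per min)
    slope = sum((ti - tm) * (gi - gm) for ti, gi in zip(t, g)) / \
            sum((ti - tm) ** 2 for ti in t)
    if slope >= 0:
        return False                      # glucose not falling: no alert
    time_to_cross = (threshold - g[-1]) / slope
    return 0 <= time_to_cross <= horizon  # crossing predicted within horizon
```

For example, a profile falling at 1 mg/dL/min from 85 mg/dL is projected to cross 70 mg/dL in 15 min, which is within a 30-min horizon, so an alert is raised; a rising profile never triggers one.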

An improvement can be obtained by generating hypo-/hyper-alerts on the basis of ahead-of-time prediction of glucose concentration, which can be computed from past CGM data and suitable time-series models. The possibility of making a short-term prediction of glucose concentration exploiting its past history was originally suggested in Bremer and Gough. Simple linear models, e.g., a first-order polynomial or a first-order autoregressive (AR) model, can be employed, where {w_{i}} is a random white noise process with zero mean and variance equal to σ^{2}. For both models, at each sampling time t_{n}, a new value of the model parameters is estimated by fitting the past data u_{n}, u_{n−1}, u_{n−2}, ... by weighted linear least squares. Once the model parameters have been estimated, prediction can be computed ahead in time for a given prediction horizon PH (expressed as a number of sampling steps q such that qT_{s} = PH, where T_{s} is the sensor sampling period). For instance, in the case of

The necessity of having time-varying model parameters calls for the use of exponential weights in the least-squares fit: μ^{k} is the weight of the sample taken k instants before the current sampling time, with μ, taken in the range (0,1), acting as forgetting factor. If a forgetting factor is not used (which is equivalent to letting μ = 1), glucose samples collected tens of hours, if not days, before the current sampling time would influence the prediction, with a possible deterioration of the algorithm's capability to promptly track changes in the signal, in particular those due to perturbations, e.g., meals. From an algorithmic point of view, recursive least squares (RLS) implementations are possible in order to estimate the unknown model parameters
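The combination of a first-order AR model, exponential forgetting, and multi-step-ahead prediction can be sketched as follows. This is a deliberately minimal illustration, not the full RLS machinery: for simplicity the single AR coefficient is fitted directly to the samples (which implicitly assumes a detrended series), the weighted sums are accumulated recursively with forgetting factor μ, and the one-step model is iterated q times to predict PH = q·T_s ahead.

```python
# Sketch of first-order AR prediction with exponentially weighted
# (forgetting-factor) least squares. mu and q are illustrative.

def ar1_predict(y, mu=0.9, q=6):
    num = den = 0.0
    preds = []
    for k in range(1, len(y)):
        # recursive accumulation of exponentially weighted sums:
        # old terms are discounted by mu at every step
        num = mu * num + y[k] * y[k - 1]
        den = mu * den + y[k - 1] ** 2
        a = num / den if den else 0.0     # weighted LS estimate of AR(1) coeff.
        preds.append((a ** q) * y[k])     # iterate one-step model q times
    return preds
```

On a geometrically decaying series y_k = 100·0.9^k the fitted coefficient is exactly 0.9, so the q-step-ahead prediction coincides with the true future value; on real CGM data the forgetting factor lets the coefficient track local trends.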

Assessing how useful such predicted profiles can be for the prevention of hypo-/hyperglycemic events is difficult, but crucial. A straightforward index which could be considered, especially for comparing the relative performance of different models (e.g., polynomial

A useful index is the energy of the second-order differences of the predicted profile (ESOD), which reflects the presence of spurious oscillations in the predicted profile (oscillations are obviously undesirable, since they can facilitate the generation of false hypo-/hyper-alerts). Another index is the delay of the predicted profile with respect to the original curve. In fact, the difference between PH and this delay represents a measure of the “gain in time” for alert generation obtained thanks to the use of predicted, instead of measured, CGM time-series. Notably, a delay of the predicted profile comparable to PH (or larger) would make the predicted profile useless in practice. All the above indexes are useful in assessing CGM time-series prediction algorithms. Of note, they all depend on the chosen PH and forgetting factor μ. For instance, as is well visible in

The problem of assessing the performance of a prediction algorithm has been presented by making reference to a specific prediction model, but it is completely general. As a matter of fact, how to optimally design a prediction algorithm for CGM data, e.g., model structure, order, prediction horizon, forgetting factor, is still an open issue. The work by Zanderigo

Another work using a low-order linear time-series modeling to predict CGM is that of Eren-Oruklu

In the approaches above, CGM time-series are described by a model with fixed structure and minimum complexity, but with time-varying parameters which, at each sampling time, are re-adjusted on the basis of the newly collected glucose sample. Since the model has to describe the time-series only “locally”, its complexity can be kept modest, a crucial aspect for using prediction algorithms in real time. Reifman et al. adopted a different strategy, based on a high-order AR model, where u_{n} is the glucose sample collected at time t_{n} and w_{n} is a random white noise process. The model was first fitted, in each subject, in a burn-in interval and then used within the prediction algorithm for the rest of the time-series. A price to be paid for this choice lies in the increased model complexity which, in turn, requires the use of a rather long burn-in interval (about 2,000 samples, nearly 36 h). Moreover, given the high number of AR parameters to be estimated, the time-series model is overly sensitive to noise. Indeed, a “regularization constraint” was placed on the AR coefficients in order to decrease their sensitivity to the data. In Reifman
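One common way to realize such a regularization constraint is a ridge penalty on the AR coefficients, sketched below. This is an illustration of the general idea, not the specific constraint used by Reifman et al.; the model order is kept small and the penalty weight gamma is an assumed tuning parameter.

```python
# Sketch of fitting a higher-order AR model with a ridge penalty on the
# coefficients, which tempers their sensitivity to noise. The order and
# the penalty weight gamma are illustrative.

def fit_ar_ridge(y, order=3, gamma=1.0):
    n = order
    # build the normal equations (X'X + gamma*I) a = X'b
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for k in range(n, len(y)):
        past = [y[k - 1 - i] for i in range(n)]
        for i in range(n):
            b[i] += past[i] * y[k]
            for j in range(n):
                A[i][j] += past[i] * past[j]
    for i in range(n):
        A[i][i] += gamma                 # ridge term shrinks the estimates
    # Gaussian elimination with partial pivoting
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    a = [0.0] * n
    for i in range(n - 1, -1, -1):
        a[i] = (b[i] - sum(A[i][j] * a[j] for j in range(i + 1, n))) / A[i][i]
    return a
```

With a small penalty the fitted coefficients reproduce a noise-free autoregressive series almost exactly; as gamma grows, the coefficients shrink and the model becomes less sensitive to noise in the data, at the price of some bias.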

In Palerm and Bequette, given a linear state-space model

x_{k+1} = Φx_{k} + v_{k}
y_{k} = Cx_{k} + w_{k}

where x_{k} is the state vector at time k, y_{k} is the measurement vector, Φ and C are the state transition and measurement matrices, and v_{k} and w_{k} are state and measurement noise vectors, the Kalman methodology is used to predict the glucose level after a given PH. Three different PH values were tested,
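A minimal sketch of Kalman-based prediction is given below, using a double-integrator state model (states: glucose level and its rate of change). The matrices and noise variances are illustrative assumptions, not those of Palerm and Bequette: after filtering the available samples, the prediction PH = q steps ahead is obtained by applying the state transition matrix q times to the current state estimate.

```python
# Sketch of Kalman prediction with Phi = [[1, 1], [0, 1]] (level + trend)
# and H = [1, 0]. Process and measurement noise variances are illustrative.

def kf_predict_ahead(y, q_steps, qvar=0.1, rvar=16.0):
    g, d = y[0], 0.0                      # state estimate: [level, trend]
    P = [[100.0, 0.0], [0.0, 1.0]]        # state covariance
    for yk in y:
        # time update: level integrates trend, P = Phi P Phi' + Q
        g, d = g + d, d
        P = [[P[0][0] + P[1][0] + P[0][1] + P[1][1], P[0][1] + P[1][1]],
             [P[1][0] + P[1][1], P[1][1] + qvar]]
        # measurement update with H = [1, 0]
        S = P[0][0] + rvar
        K = [P[0][0] / S, P[1][0] / S]
        innov = yk - g
        g, d = g + K[0] * innov, d + K[1] * innov
        P = [[(1 - K[0]) * P[0][0], (1 - K[0]) * P[0][1]],
             [P[1][0] - K[1] * P[0][0], P[1][1] - K[1] * P[0][1]]]
    return g + q_steps * d                # Phi applied q times to [g, d]
```

On a steadily rising profile the trend state converges to the true slope, so the q-step prediction extrapolates the ramp; a larger measurement variance rvar slows the trend estimate down but makes it less noise-sensitive.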

Recently, an artificial neural network model (NNM) has also been applied in the prediction of glucose concentrations using CGM data. The use of NNM for glucose prediction is appealing because it could facilitate the exploitation of information on meals and insulin administrations. In Pappada

A critical problem is the generation of alerts, see Heise

Generating alerts accurately is difficult, because CGM data are often inaccurate (

CGM sensors allow the development of new strategies for the treatment of diabetes. In this contribution, we have considered four specific issues which are crucial for a “smart” real-time application of CGM sensors, in both open-loop (alert generators) and closed-loop (artificial pancreas) systems: (re)calibration for enhancing the accuracy of CGM signals, filtering for the enhancement of the SNR, ahead-of-time prediction, and generation of hypo/hyper-alerts. The main achievements of the literature, and also some open issues, have been discussed.

Some of the algorithms for calibration, denoising, prediction and alert generation discussed in this paper have been deposited by the University of Padova [

Representative type 1 diabetic subject. Top: BG references (stars)

Representative type 1 diabetic subject (data taken from Kovatchev

Two representative type 1 diabetic CGM time series. Top: FreeStyle Navigator® time series (1 min sampling), taken from Kovatchev

Same data as in

Application of the Kalman filtering method of Facchinetti

FreeStyle Navigator® time-series in a type 1 diabetic subject taken from Kovatchev

Same data as in

Risk of generating a false hypo-alert from the original CGM profile (blue line) at time 19.2 mitigated by employing an on-line filtered profile (green line) together with its confidence interval (shaded area).