Overcoming Individual Discrepancies, a Learning Model for Non-Invasive Blood Glucose Measurement

Liu, Weijie; Huang, Anpeng; Wan, Ping

doi:10.3390/app9010192

Open AccessArticle

Overcoming Individual Discrepancies, a Learning Model for Non-Invasive Blood Glucose Measurement

by

Weijie Liu

¹

,

Anpeng Huang

² and

Ping Wan

^1,3,4,*

¹

School of Software and Microelectronics, Peking University, Beijing 100871, China

²

National Institute of Health Data Science, Peking University, Beijing 100871, China

³

National Engineering Research Center for Software Engineering, Peking University, Beijing 100871, China

⁴

Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing 100871, China

^*

Author to whom correspondence should be addressed.

Appl. Sci. 2019, 9(1), 192; https://doi.org/10.3390/app9010192

Submission received: 28 November 2018 / Accepted: 29 December 2018 / Published: 7 January 2019

(This article belongs to the Section Mechanical Engineering)

Download

Browse Figures

Versions Notes

Abstract

Non-invasive Glucose Measurement (NGM) technology makes great sense for the blood glucose management of patients with hyperglycemia or hypoglycemia. Individual Discrepancies (IDs), e.g., skin thickness and color, not only block the development of NGM, but also become the reason why NGM cannot be widely used. To solve this problem, our solution is designing an individual customized NGM model that can measure these discrepancies through multi-wavelength and tune parameters for glucose estimating. In this paper, an NGM prototype is designed, and a learning model for glucose estimating with automatically parameters tuning based on Independent Component Analysis (ICA) and Random Forest (RF) is presented. The clinic trial proves that the correlation coefficient between estimation and reference Blood Glucose Concentration (BGC) can reach 0.5 after merely 10 times of learning, and rise to 0.8 after about 60 times of learning.

Keywords:

non-invasive; blood glucose; diabetes; Independent Component Analysis (ICA); random forest

Graphical Abstract

1. Introduction

Diabetes mellitus is a metabolic disorder of multiple aetiologies characterized by chronic hyperglycaemia with disturbances of carbohydrate, fat, and protein metabolism resulting from defects in insulin secretion, insulin action, or both [1]. The original damage caused by diabetes mellitus is due to all kinds of diabetic complications, such as diabetic nephropathy, which is one of the most harmful. The World Health Organization (WHO) estimates that more than 180 million people worldwide have diabetes and this number is likely to be more than 350 million by 2030 [2].

Blood Glucose Measurement technology is necessary for people whose blood glucose is unbalanced to control their glucose level. According to recent researches, continuous glucose monitors (CGM) can greatly improve control of blood glucose [3]. James C. Boyd has pointed out that a blood glucose measurement with an error of less than 10% every 5 min can significantly reduce the harm caused by hyperglycemia or hypoglycemia [4]. However, most of the blood glucose meters currently available on the market are invasive, which needs to puncture the fingers to collect blood as a measurement sample. This measurement method will cause pain to the patient and is particularly not convenient for young children, which can lead to inefficient self-monitoring and is not suitable for continuous glucose monitoring.

In order to reduce the trauma to patients during blood glucose measurement, and increase the frequency of glucose monitoring, some medical device companies began to study mini-invasive or non-invasive blood glucose meters. In recent years, some devices have even obtained Communate Europpene (CE) or U.S. Food and Drug Administration (FDA) certification. The most famous product is Abbott

F r e e S t y l e

™

L i b r e

[5,6], a mini-invasive blood glucose meter attached to the patient’s arm, which was approved by FDA in September 2017. Although

F r e e S t y l e

™ can provide 24-h continuous blood glucose monitoring, it still needs to use a micro-needle to puncture the skin, and long-term wear can easily lead to wound infection. Not only that,

F r e e S t y l e

™ needs to be replaced every 14 days, making the measurement cost too large. In addition, this mini-invasive glucose meter measures the glucose concentrations of the subcutaneous interstitial fluid, instead of blood. Numerous studies have pointed out that there is a time lag between the glucose concentration in the blood and that in the interstitial fluid [7,8], which leads to a lower accuracy in blood glucose fluctuations or hypoglycemia, making it difficult to promptly alert. For these reasons, the research of non-invasive glucose measurement (NGM) technology has become one of the most popular topic in the medical treatment and the sensor field.

The history of NGM research has been more than 30 years, but the methods of NGM are still not systemic. Fortunately, many technologies have been applied to make the measurement more accurate. The most common ways of NGM are optical, thermodynamic, and other chemical or physical methods [9]. Although there is still no method whose accuracy can satisfy the requirements of clinic application, progresses on some important aspects have been made, e.g., the relation between glucose concentration and Raman spectroscopy acquired from physical tissue has been proved in [10]. The major techniques of optical methods are based on spectrum analysis of subcutaneous structures in the near or mid infrared (0.8–2.5

μ

m), in which the absorbance of the glucose bonds (

C - H, O - H

) is strong [11]. With the development of artificial intelligence, there are many researches that combines physical and physiological data with machine learning algorithm to predict glucose levels [12]. In Table 1 and Section 2, we will detail and analyze the mini-invasive and non-invasive blood glucose meters currently available on the market.

Overcoming Individual Discrepancies (IDs) is one of the key technical issues in NGM by spectrum analysis methods. In the light propagation path in skin tissue, there are many individual factors that can affect the final spectrum and glucose measurement results, e.g., the thickness and color of skin tissue, cuticle, epidermis, dermis and subcutaneous tissue, muscle, bone, etc. [13]. In non-invasive optical measurement of other blood components, e.g., blood oxygen, hemoglobin, etc., some methods have been proposed which are able to eliminate the interference of IDs. The dynamic spectroscopy method is able to extract the information of blood’s components, removing background and noise at the same time [14]. Double-sampling is used to improve signal-to-noise ratio of dynamic spectroscopy [15]. In the field of NGM, different measurement methods need different ways to overcome IDs, which needs further research.

In this paper, a learning NGM model that can overcome IDs, to a certain extent, is presented. This model is based on human earlobe light absorbances with five different wavelengths and a glucose estimating algorithm with automatically parameters tuning for individuals. The earlobe was chosen due to the absence of bone tissues, and also because of its relatively small thickness [16]. We choose three wavelengths (

730, 850

, and 930 nm) in the near infrared wavelength range whose absorbance mainly depends on the amount of glucose in blood to measure blood glucose level. In order to overcome the IDs, green (490 nm) and red (660 nm) light is used to estimate the tissue thickness and blood volume separately to compensate the measurements [12]. Based on the linear relation between absorbances of the five lights and blood components, we use Independent Component Analysis (ICA) [17] to extract independent components from transmittances data, and use them as new features data. Glucose concentration is obtained with Random Forest (RF) [18], and test its performance with a reference invasive glucose meter in terms of measurement accuracy. ICA is a method for separating complex signals to independent signals by matrix decomposition which is frequently used in signal processing [17]. RF is a machine learning model, which can be used for black-box modeling in the case of mechanism is not clear [18]. Prototype test result shows the effectiveness of this approach by Clarke Error Grid Analysis (EGA) [19].

The rest of this paper is structured as follows: In Section 2, some products of blood glucose meters are discussed. Section 3 focuses on analyzing the measurement theory that supports the NGM through absorbance spectrum. Section 4 describes the circuit of the prototype for getting light absorbance data. Section 5 presents the glucose estimation model based on ICA and RF. Section 6 shows the results of the measurement accuracy with EGA, and discusses the convergence speed of our proposal. Finally, this paper is concluded in Section 7.

2. Related Works

In this section, we will list the mini-invasive and non-invasive blood glucose meters currently certified by CE or FDA, and introduce them in detail, including their measurement principles and advantages and disadvantages. An overall comparison is shown in Table 1. It is worth mentioning that since

F r e e S t y l e

™

L i b r e

,

S y m p h o n y

™, and

G 5 / G 6

™ measures the interstitial glucose concentrations rather than the blood glucose concentration, some scholars do not consider these devices to be blood glucose meters. In this paper, considering that the original intention of these devices is to monitor the blood glucose levels by monitoring the interstitial glucose concentrations, we classify them as mini-invasive blood glucose meters, but the readers should know that they are not true blood glucose meters.

2.1. Mini-Invasive

F r e e S t y l e

™

L i b r e

: This is a wearable continuous glucose monitoring device from Abbott, and consists of a coin-sized disposable circular sensor that is attached to the upper arm by a miniature needle of

5.0

mm long and

0.4

mm wide and a small adhesive tape. The user can obtain his/her blood glucose level through a handheld device. It should be noted that this product measures the glucose concentration of the interstitial fluid, which exists a time lag compared to blood glucose level, which means that a finger prick test using a blood glucose meter is required during times of rapidly changing glucose levels when interstitial fluid glucose levels may not accurately reflect blood glucose levels, or if hypoglycemia or impending hypoglycemia is reported but the symptoms do not match the system readings [6]. In addition,

F r e e S t y l e

™

L i b r e

needs to be replaced every 14 days, to avoid wound infection.

S y m p h o n y

™: It was developed by Echo Therapeutics, Inc., as a mini-invasive continuous glucose monitoring device. A special skin preparation device is used to permeate the skin before placing the sensor. The device abrades the skin, removing about

0.015

mm of the outer layer of skin using adaptive micro-abrasion technology. This process takes 10–20 s, and removal of the outer layer of skin for allowing the measuring of a number of physiological properties, including interstitial glucose levels. All the measured values are transmitted wirelessly to a remote monitor that is equipped with alarm alerts if the glucose level is outside the normal range. Like

F r e e S t y l e

™

L i b r e

, since

S y m p h o n y

™ measures the interstitial glucose levels, it also does not accurately reflect blood glucose levels.

G 5 / G 6

™: Dexcom

G 5 / G 6

™ is a patch device, and composed of three parts: A sensor that measures glucose in the fluid under the skin, a processor that is embedded on the sensor and transmits the data to the receiver every five minutes, and a receiver that displays the blood glucose level to the user [20]. The measurement principle of this device is similar to

F r e e S t y l e

™ and

S y m p h o n y

™, and is indicated by the FDA for use as both a standalone CGM and for integration into automated insulin dosing (AID) systems [21].

G l u c o W a t c h

™: It is a watch-type glucose meter that contacts the human skin through an electrode, which generates a micro current to the skin. The charged ions and glucose in the subcutaneous interstitial fluid can migrate to the skin surface under the action of the electric field, and the glucose concentration is measured by glucose oxidase. This device has high requirements on the physiological condition of human body. Factors, e.g., skin sweating, environmental temperature, electrostatic interference, etc. will affect the measurement results. Due to its low accuracy,

G l u c o W a t c h

™ has been revoked by the FDA and is required to recall products that have already been sold.

2.2. Non-Invasive

G o o g l e

c o n t a c t

l e n s

: This is a smart contact lens project developed by Google on 16 January 2014. This project aims to assist people with diabetes by constantly measuring the glucose levels in their tears. Tears will flow into the monitor through a small holes in the lens, which containing a sensor that measures the glucose level in the tears. The users obtain their glucose levels by observing the change of the color of the lens. Unfortunately, Google announced the termination of the project on 16 November 2018, due to the large measurement error and difficult sample collection [22].

T e n s o r T i p

™

C o G

: It is a learning non-invasive glucose meter, developed by Cnoga Medical Ltd, that requires more than 100 invasive and more than 50 non-invasive data calibrations before use. The basis for measuring blood glucose is the absorption spectrum of the finger.

HG1-c: MediSensors is San Jose, California, USA based company that developed compact continuous NGM device i.e.,

H G 1 - c

, which is based on Raman spectroscopy [23].

G l u c o T r a c k^{T M}

: It consists of two parts, the handheld monitor and the ear-clip sensor, and integrates three measurement techniques: Ultrasound, electricity, and thermal. Ultrasound technique is used to measure the velocity of sound waves passing through the earlobe, and the velocity is affected by the concentration of blood glucose in the capillaries of the earlobe; electrical technique is used for measuring the change in conductivity of the tissue; and the heat transfer features of the tissue is obtained by the thermal technique.

S u g a r T r a c^{T M}

: This device is based on infrared spectrum. It is inserted in the ear canal, and emitting different wavelengths of light to the eardrum, analyzing the reflected signals to calculate glucose levels.

3. Theoretical Analysis

In the above, some blood glucose meters are analyzed, from which we can find that some mini-invasive glucose meters, despite obtaining the FDA certification, still have many problems. As for non-invasive blood glucose meters (NGM), none of them received FDA certification. One of the key reasons is that individual discrepancies (IDs) affect the measurement signal, making accuracy low. Next, we will introduce our research work on NGM technology, i.e., learning model for overcoming IDs. In this section, an earlobe model for blood glucose concentration estimation is established. Based on this model, the wavelengths selection rules are introduced.

3.1. Earlobe Model

The measuring principle of BGC in this paper is based on the Beer-Lambert law [24], which relates the light attenuation to the properties of the material through. The Beer-Lambert law can be described in

a (λ) = \sum_{i = 1}^{N} a_{i} = \sum_{i = 1}^{N} ε_{i} (λ) \int_{0}^{l} c_{i} (z) d z

(1)

a (λ) = l g \frac{I_{0}}{I}

(2)

where

a (λ)

is the absorbance of light at wavelength of

λ

.

ε_{i} (λ)

is the molar attenuation coefficient for

λ

wavelength light of the attenuating species i in the material sample.

c_{i}

is the amount concentration of the attenuating species i in the material sample. l is the path length of the beam of light through the material sample.

I_{0}

and I represent the intensity of incident and output light.

Taking the earlobe as a material sample, Equation (3) can be simplified as

a (λ) = \sum_{i = 1}^{N} ε_{i} (λ) c_{i} l^{^{'}}

(3)

where

l^{^{'}}

is the equivalent thickness of the earlobe, which varies from person to person.

There are many attenuating species in the earlobe, e.g., glucose, water, fat, hemoglobin, etc. These species can be represented by symbol i in Equation (3), and let 1 represent glucose. It is remarkable that the concentrations of these species are dissimilar from everyone and change with time. These individual discrepancies should be regarded as unknown variables. According to linearity algebra, multi unknowns can not be solved by one equation. To solve the

c_{i} (i = 1, 2, . . . N)

, we need at least N equations, which means that absorbance of N different wavelengths of light have to be measured to set up a system of linear equation as

\{\begin{matrix} a (λ_{1}) = ε_{1} (λ_{1}) c_{1} l^{^{'}} + ε_{2} (λ_{1}) c_{2} l^{^{'}} + . . . + ε_{N} (λ_{1}) c_{N} l^{^{'}} \\ a (λ_{2}) = ε_{1} (λ_{2}) c_{1} l^{^{'}} + ε_{2} (λ_{2}) c_{2} l^{^{'}} + . . . + ε_{N} (λ_{2}) c_{N} l^{^{'}} \\ . . . \\ a (λ_{N}) = ε_{1} (λ_{N}) c_{1} l^{^{'}} + ε_{2} (λ_{N}) c_{2} l^{^{'}} + . . . + ε_{N} (λ_{N}) c_{N} l^{^{'}} \end{matrix}

(4)

which can be established in the form of matrix

A = Θ \cdot C

(5)

where A and C are absorbance vector

{[a (λ_{i})]}_{N \times 1}

and concentrations vector

{[c_{i}]}_{N \times 1}

.

Θ

is a coefficient matrix

{[ε_{j} (λ_{i}) l^{^{'}}]}_{N \times N}

.

In theory, as long as absorbance vector A has been measured, the blood glucose concentration

c_{1}

can be worked out by Cramer’s Rule as

c_{1} = \frac{d e t (Θ_{1})}{d e t (Θ)}

(6)

where

Θ_{1}

is the matrix formed by replacing the 1-th column of

Θ

by the absorbance vector A.

There is still an essential question that is how to determine the coefficient matrix

Θ

which varies from person to person. We will solve this question by ICA and RF in Section 5.

3.2. Wavelength Selection

The choice of wavelength is very important for NGM, and inappropriate wavelength combinations will make Equation (5) an Ill-Conditioned Equation (ICE) [25] whose solution is extremely unstable. The ICE problem will cause the solution of Equation (5) an huge error under a tiny interference in vector A or matrix

Θ

. Unfortunately, this interference often occurs in NGM, reducing the robustness of the measurement.

In order to improve the accuracy and robustness of NGM, ICE should be avoided as much as possible. The solution is making the

k (Θ)

in Equation (7) as large as possible [25].

k (Θ) = ∥\begin{matrix} Θ^{- 1} \end{matrix}∥ \cdot ∥\begin{matrix} Θ \end{matrix}∥

(7)

k (Θ)

will be large enough, if the wavelength selection follows the following principle:

ε_{j} (λ_{i})

in

Θ

should be large as possible if

i = j

, otherwise it should be small. In other word, wavelength selection should ensure that the absorbance of components i is large at the wavelength of

λ_{i}

and should be small at other other wavelengths.

Figure 1 shows the spectrum of glucose from 500 nm to 1000 nm, in which there is an absorption peak at 930 nm. In order to eliminate the interference of other components, the spectrum is obtained from aqueous glucose solution and the water background is subtracted. Other components of the skin tissue have a lower absorbance in the near-infrared region (780–2526 nm) [11]. Therefore, the wavelength

λ_{0}

used to measure glucose is selected as 930 nm. The change of blood volume, oxyhemoglobin (Oxy-Hb) and deoxyhemoglobin (Deoxy-Hb) can be measured by dual-wavelength of 850 nm and 660 nm [26]. The attenuation degree of water is greater and decreased at the wavelength of 730 nm and 490 nm respectively. Meanwhile, fat is the main component part that effects the tissue light attenuation at 490 nm, which can be used to measure the thickness of earlobe. The five different wavelengths (490 nm, 660 nm, 730 nm, 850 nm, and 930 nm) are selected for blood glucose estimation.

4. Circuit of Blood Glucose Sensor

Earlight is the blood glucose sensor used in this paper for collecting absorbance data of different wavelengths. In this section, we focus on the circuit design of the Earlight, which is composed of three parts, i.e., light emitter, light receptor, and controller, as shown in Figure 2a.

The light emitter is used to emit incident light of different wavelengths, and contains five different wavelengths of LED light source (490 nm, 660 nm, 730 nm, 850 nm, and 930 nm), and a power dissipation of 120 mW. These LEDs are connected by using shared anode, Figure 2b.

The light receptor is used to receive transmitted light and measure its intensity. Light receptor includes two parts, current to voltage convertor (I-V converter) and filter amplifier circuit (F-A circuit), Figure 2c. In I-V converter, the silicon photodiode (PD) whose bandwidth spans from 300 nm to 1100 nm is able to produce a current whose intensity is proportional to the received light intensity, and the current signal is transformed to a voltage signal (

V o u t_{1}

) by MAX471 (Maxim, San Jose, San Jose, CA, USA), which is a current sensing IC presented by MAXIM company. In F-A circuit, the

V o u t_{1}

is filtered and amplified to

V_{r e t u r n}

that can be loaded into an analog to digital converter (ADC) via a low-pass first order filter and a proportional amplifier.

The controller is applied to control the LEDs and calculate measurement result. The controller is a SCM STM32f103 (STMicroelectronics, Geneva, Switzerland) minimum system with a small liquid-crystal display. There is a 12-bit ADC in this system, that can be used to measure the voltage signal produced by the light receptor and calculate absorbance of light according to Equation (2). There are two reasons for using a 12-bit ADC: 1. The 12-bit ADC provides 4096-level resolution, and we found that the acquired voltage signal varies from 0 to 3.6 V with an accuracy of 0.001 V, which needs a resolution of

(3.6 - 0) / 0.001 = 3600

. So a 4096 resolution of a 12-bit ADC is sufficient. 2. The STM32f103 microcontroller we use has a 12-bit ADC, so there is no need to add an external ADC, which will reduce the cost.

The demonstration of using the prototype of blood glucose sensor is showed in Figure 3. Using this prototype, the data of absorbance can be collected, and the algorithm to estimate the blood glucose concentration will be presented in Section 5.

5. Algorithm of NGM

This section will introduce the algorithm to build an Independent Feature Converter (IFC) and a Glucose Estimator (GE) that can be used to calculate the Blood Glucose Concentration (BGC) from the absorbance data collected by the sensor presented in Section 4. The process for parameter adjustment of IFC and GE is called training process, which needs a set of training sample data set that include six attributes, five absorbance attributes and one reference BGC attribute. The schematic diagram of this algorithm is showed in Figure 4.

5.1. Independent Feature Converter

IFC is used to transform absorbance vector A to independent components concentration vector

C^{*}

. According to Equation (5), any component concentration can be nearly calculated by multiplication of the inverse of coefficient matrix

Θ^{- 1}

with absorbance vector A. The key is determining the coefficient matrix

Θ

that is hard to measure and, to some extent, varies from person to person. Although

Θ

cannot be determined accurately, the approximate matrix can be figured out by Independent Component Analysis (ICA) [17] based on the training data set.

ICA is an effective liner-transformation algorithm in multivariate and linear algebra where a matrix is factorized into two matrices, with the property that can extract independent components from a complex data matrix [17]. These independent components make the resulting matrices easier to inspect. In this paper, the two resulting matrices represent independent feature matrix F and coefficient matrix

Θ

respectively. The algorithm of ICA in this paper is called fast fixed-point algorithm [27], which uses maximum entropy to determine factorized results.

In the ICA, there is a basic premise that all the component concentrations are independent. This premise is not very strong, thus making some errors in BGC estimating, but the premise is very useful and the error can be compensated by the following GE.

First, we get an absorbance data matrix T from the training dataset, as

T = [\begin{matrix} A_{1}^{T} \\ A_{2}^{T} \\ ⋮ \\ A_{M}^{T} \end{matrix}] = [\begin{matrix} a_{1} (λ_{1}) & a_{1} (λ_{2}) & \dots & a_{1} (λ_{N}) \\ a_{2} (λ_{1}) & a_{2} (λ_{2}) & \dots & a_{2} (λ_{N}) \\ ⋮ & ⋮ & ⋱ & ⋮ \\ a_{M} (λ_{1}) & a_{M} (λ_{2}) & \dots & a_{M} (λ_{N}) \end{matrix}],

(8)

where

A_{i}^{T}

means the transpose of the absorbance vector for the i-th record in training dataset. Based on Equation (5), the matrix T can be factorized to F and

Θ

T_{M \times N} = F_{M \times N} \cdot Θ_{N \times N},

(9)

where

F = [\begin{matrix} C_{1}^{T} \\ C_{2}^{T} \\ ⋮ \\ C_{M}^{T} \end{matrix}] = [\begin{matrix} c_{1} (1) & c_{2} (1) & \dots & c_{1} (1) \\ c_{1} (2) & c_{2} (2) & \dots & c_{2} (2) \\ ⋮ & ⋮ & ⋱ & ⋮ \\ c_{1} (M) & c_{2} (M) & \dots & c_{N} (M) \end{matrix}] .

(10)

C_{i}^{T}

means the transpose of concentration vector, and

c_{j} (i)

is the j-th component concentration of the i-th record. F is called independent feature matrix which is immeasurable. Matrix

Θ

is the same coefficient matrix in Equation (5).

Since the physical significance of the decomposition of matrix T has been elucidated, let us start discussing how to decompose the matrix T. The principle of decomposition is to make matrix F contain as much information as possible. Considering A and C as random variables, matrix F contains the most information when random variable C obeys a Gaussian distribution according to the information theory. But [17] has pointed out that the signal analyzed by ICA cannot be subject to a Gaussian distribution and has proposed an alternative—the sigmoid function. Let C subject to a distribution whose probability density function (PDF) is the first order derivative of sigmoid function, as

f_{c} (C) = s i g m o i d^{^{'}} (C) = \frac{e^{C}}{{(1 + e^{C})}^{2}} .

(11)

where

f_{c} (C)

is the PDF of vector C and

s i g m o i d^{^{'}} (C)

is first order derivative of the Gaussian distribution function.

Combining Equations (5) and (11), the PDF of vector is

f_{A} (A) = f_{c} (Θ^{- 1} A) | Θ^{- 1} | = s i g m o i d^{^{'}} (Θ^{- 1} A) | Θ^{- 1} |,

(12)

and then the coefficient matrix

Θ

is estimated by means of maximum likelihood estimation (MLE). The likelihood function is

L (Θ) = l o g \prod_{i = 1}^{M} f_{A} (A_{i}) = \sum_{i = 1}^{M} (\sum_{j}^{N} l o g (s i g m o i d^{^{'}} (θ_{j} A_{i})) + l o g | Θ^{- 1} |) .

(13)

where

θ_{j}

is the j-th row of

Θ^{- 1}

. Using Adam method [28], the approximation of maximum point

Θ

can be obtained. Adam is an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. Adam is well suited for optimization problems that are large in terms of data parameters. The iterative formulas are

\{\begin{matrix} m^{(t + 1)} = β_{1} m^{(t)} + (1 - β_{1}) \cdot ▿ L (Θ^{(t)}) \\ v^{(t + 1)} = β_{2} v^{(t)} + (1 - β_{2}) \cdot ▿ L {(Θ^{(t)})}^{2} \\ m^{^{'}} = m^{(t + 1)} / (1 - β_{1}) \\ v^{^{'}} = v^{(t + 1)} / (1 - β_{2}) \\ Θ^{(t + 1)} = Θ^{(t)} - α \cdot (m^{^{'}} / \sqrt{v^{^{'}}}) \end{matrix}

(14)

where

▿ L (Θ)

is the gradient of

L (Θ)

and t is the number of iterations. m and v are the first and second momentum, which are initialized at zero, and

β_{1}

and

β_{2}

are the momentum parameters ranging from 0 to 1.

α

is iteration step which is known as learning rate.

m^{^{'}}

and

v^{^{'}}

are bias corrections that are only relevant in first few iterations when t is small. The bias correction compensates for the fact that m and v are initialized at zero and need some time to “warm up”.

Calculating

Θ

by directly using all of T data requires much computing and memory resource. In practice, we randomly select a mini subset of T to calculate the gradient of

L (Θ)

at each iteration. This strategy is called mini-batch strategy, and the length of mini subset is called batch size. The pseudo-code of ICA based on Adam and mini-batch is in Algorithm 1.

Algorithm 1 ICA based on Adam and Mini-batch

Require:: T, $α$ , $β_{1}$ , $β_{2}$ , $b a t c h_s i z e$ and $m a x_i t e r a t i o n$
Ensure:: $Θ$
1:: Initialize the matrix $Θ$ randomly.
2:: $m \leftarrow 0$ , $v \leftarrow 0$
3:: $L^{^{'}} \leftarrow 0$ , $Θ^{^{'}} \leftarrow Θ$
4:: for $i = 0 \to m a x_i t e r a t i o n$ do
5:: $T^{^{'}} \leftarrow$ Randomly select a subset of size $b a t c h_s i z e$ from T
6:: $▿ L (Θ^{^{'}}) \leftarrow$ Calculate the gradient of $L (Θ^{^{'}})$
7:: $m \leftarrow β_{1} m + (1 - β_{1}) \cdot ▿ L (Θ^{^{'}})$
8:: $v \leftarrow β_{2} v + (1 - β_{2}) \cdot ▿ L {(Θ^{^{'}})}^{2}$
9:: $m^{^{'}} \leftarrow m^{(t + 1)} / (1 - β_{1})$
10:: $v^{^{'}} \leftarrow v^{(t + 1)} / (1 - β_{2})$
11:: $Θ^{^{'}} \leftarrow Θ^{^{'}} - α \cdot (m^{^{'}} / \sqrt{v^{^{'}}})$
12:: if $L (Θ^{^{'}}) > L^{^{'}}$ then
13:: $L^{^{'}} \leftarrow L (Θ^{^{'}})$
14:: $Θ \leftarrow Θ^{^{'}}$
15:: end if
16:: end for
17:: return $Θ$

The IFC based on ICA, referred to by use of matrix

Θ^{- 1}

, has been built, and the absorbance vector A can be transformed to independent components concentration vector C by Equation (15)

C^{*} = I F C (A) = Θ^{- 1} \cdot A .

(15)

5.2. Glucose Estimator

The function of GE is to estimate BGC by reducing the errors and differences between the independent component concentration vector

C^{*}

and the true components concentration vector C. These errors and differences are cased by the weak premise made in ICF, inaccurate measuring and any other unexpected factors. The input and output of GE are

C^{*}

and BGC respectively. Because of some unknown factors, the transform process becomes a black-box process that can only be modeled by a machine learning model, such as Random Forest (RF), Adaboost, support vectors machine (SVM), decision tree (DT), etc. Plus, Deep Learning would also seem to be a good choice, but that requires thousands of training data, which is hard to get in our project at present.

By comparison (see Table 2), we use random forest (RF) to model the transform process in GE in this paper. Random forest is an ensemble learning method for classification and regression by constructing a multitude of decision trees [18]. As other supervised model, RF needs a training data set to adjust parameters, and then RF can be used to estimate BGM from

C^{*}

. The estimating value of RF is mean estimating value of the individual trees. The training data set contains a set of

C^{*}

vectors that are transformed from data matrix T by IFC and a set of reference BGC data acquired by invasive glucose meter.

There are two reasons why we choose RF to model GE. The first reason is that ICF is based on maximizing entropy of

C^{*}

, which has the same optimization goal with RF. Secondly, comparing with other machine learning models in Table 2, its accuracy is proved. Through cross-validation, the number and depth of RF is set to 200 and 2, and the aggregating method is averaging the predictions of all tress in forests. Let Equation (16) represent GE.

B G C = G E (C^{*})

(16)

In summary, the IFC and GE have been built and represented by Equations (15) and (16), which can estimate BGC from absorbance vector A acquired via the prototype designed in Section 4.

6. Experiment and Results

After producing the prototype, a series of clinical trials have been performed to verify the accuracy of the non-invasive glucose meter. This section is divided into two parts, which are data acquisition and experimental results. The former introduces how to gather experimental data from subjects, while the latter introduces how to prove the accuracy of the non-invasive glucose meter through the data gathered from subjects.

6.1. Data Acquisition

The purpose of data acquisition is to obtain a training dataset and a testing dataset, for ease of training the algorithm through these datasets and testing the accuracy of blood glucose estimation. For data gathering, each record includes nine attributes, of which the first one is the record’s ID, and the second to the sixth gathering from our prototype are absorbance data of earlobe at each wavelength of 490 nm, 660 nm, 730 nm, 850 nm, and 930 nm. The seventh is blood glucose concentration for reference which is collected by a standard invasive glucose meter. The eighth attribute is the name of subject, and the ninth and tenth are date and time of the collection.

The data acquisition experiment is performed during September of the year 2017 from 1st to 15st. There are six yellow race adult subjects participating in the experiment, among them we have five diabetic patients and one healthy individual. The characteristics of each subjects are presented in Table 3. The prototype designed in this paper and a

S A N N U O^{T M}

invasive glucose meter are adopted as data acquire equipment. We obtain blood glucose data from each subject five times a day, which happened at morning on an empty stomach, 2 h after breakfast, right before lunch, 2 h after lunch, and right before dinner. During the data acquisition process, due to personal reasons of some subjects, such as physical examination, outing, etc., blood glucose measurement cannot be performed at the specified time, resulting in some data missing. In addition, we have excluded some abnormal data due to hardware reasons, such as signal loss, signal anomalies, etc. Finally, 286 valid records are received after eliminating unmeasurable or invalid data. Among these records, 84 records that are rather complete come from a subject on focus tracking who is labeled as subject #1, while the remaining 202 records come from the other five subjects who are labeled as subject #2 to subject #6.

6.2. Experiment Results

This paper has carried on four test experiments to verify the performance of NGM with the data acquired in Section 6.1. The first experiment called accuracy experiment is designed for accuracy testing. The second experiment called convergence experiment is used to determine the convergence speed of the NGM algorithm and the minimum number of samples for training. Through the third transplant experiment, we try to test whether this NGM can be directly transplanted to use for another person. The gauge of the performance of NGM are Clarke Error Grid Analysis (EGA) and the correlation coefficient for estimation and reference BGC formulated as

C o r r = \frac{\sum \hat{c} c - \frac{\sum \hat{c} \sum c}{n}}{\sqrt{(\sum {\hat{c}}^{2} - \frac{{(\sum \hat{c})}^{2}}{n}) (\sum c^{2} - \frac{{(\sum c)}^{2}}{n})}} .

(17)

where

C o r r

is the correlation coefficient,

\hat{c}

and c are estimation and reference BGC vector respectively, and n is the length of BGC vector.

EGA is designed to quantify clinical accuracy of blood glucose estimates generated by meters as compared to a reference value [19]. The grid breaks down a scatterplot of a reference glucose meter and an evaluated glucose meter into five regions. The meanings of each region are listed in Table 4.

6.2.1. Accuracy Experiment

We randomly select 80% samples from the dataset acquired from subject #1 as training dataset, and the other 20% is used as testing dataset. There are three steps in the accuracy experiment. Step1: Training the algorithm through the training dataset. Step2: Estimating BGC value through the obtained algorithm. Step3: Calculate the correlation coefficient for estimation and reference of BGC, and then draw the Clarke Error Grid. After this experiment, the correlation coefficient

C o r r

of Random Forest with ICA equals to 0.819, and the Clarke Error Grid is shown as Figure 5b, which places 88.2% of points in regions A or B, 11.8% in region C, and no point in regions D or E. For comparison, the results of the Random Forest without ICA are shown in the Figure 5a. It can be concluded that ICA helps to improve the accuracy of Random Forest.

6.2.2. Convergence Experiment

We are particularly interested in finding how the size of training dataset affects the algorithm performance, because the size is related to how soon a good model can be built for a patient once he/she starts using our Earlight device. For this purpose, we construct a series of training datasets of different sizes, and use these training datasets to train the algorithm while recording the correlation coefficient in testing dataset. The relationship between the correlation coefficient and the number of samples in the training dataset is shown in Figure 6.

From Figure 6, we find that the correlation coefficient reaches 0.5 rapidly where the number of samples is merely 10, and finally tend to a steady value of 0.80 where the number of samples is close to 60. This experiment result proves the convergence of the algorithm. In other words, the NGM system in this paper can partially complete the parameters adjustment after 10 times of training by a standard blood glucose meter, and reaches to a high accuracy estimation state after 60 times of training. Similarly, as a comparison, we also did a convergence experiment on RF wihout ICA, and the experimental results are shown with dotted line in the Figure 6. As can be seen, without the help of ICA, the convergence speed of RF is significantly reduced.

6.2.3. Comparative Experiment

To select regression method for glucose estimating is a critical but difficult work, which can only be accomplished contrasts of various methods through experiments. Before we chose the RF as glucose estimator (GE), a comparative test of RF [18], Adaboost [30], kNN [31], SVM [33], and Decision Tree (DT) [32] for glucose estimating has been conducted. This experiment is designed for two causes which are to compare performances of different regression methods and to analyze the performance of ICA.

The comparative experiment consists of two parts. The first part is about the selection of the optimal parameters for each regression method. Since the performance of regression method depends on the parameters directly, to make sure the experimental results can represent its peak performance, it is of great importance to select a best group of parameters for regression method before conducting the comparative test. In this paper, we use 5-fold cross-validation to select best parameters for each regression method [34]. First, select a group of parameters for cross-validation. Secondly, use cross-validation to test the accuracy of this group of parameters. In 5-fold cross-validation, the original training data set is randomly partitioned into 5 equal sized subsamples. Of the 5 subsamples, a single subsample is retained as the validation data for testing the model, and the remaining 4 subsamples are used as training data. The cross-validation process is then repeated 5 times, with each of the 5 subsamples used exactly once as the validation data. The 5 results from the folds can then be averaged to produce a single accuracy (correlation coefficient,

C o r r

) that can be used as the accuracy of this group of parameters. Thirdly, try some other groups of parameters and test their accuracy. Finally, select the group of parameters that has the highest accuracy as the best group of parameters to conduct the next part of the experiment. Column 3 in Table 2 has listed the best parameters settings of different regression methods. All these methods are realized by the scikit-learn [29], which is a Python machine learning program package.

The second part of the comparative experiments, like accuracy and convergence experiments, is designed to compare the accuracy of different regression methods with or without ICA. First, randomly select 80% of samples from the dataset acquired from subject #1 as training dataset, and the other 20% of samples is used as testing dataset. Secondly, training various regression methods with the best parameters acquired from the first part of this experiment. Thirdly, calculate the accuracy (

C o r r

) as the same way in accuracy experiment. Finally, draw the convergence curves of different models. In Table 2, column 4 has listed the best

C o r r

of different regression methods, and their required number of samples in the training set that can make

C o r r

reach to 0.5 and the best are listed in column 5 and 6 respectively. It can be noted that using RF with ICA model, it takes 60 times to train the algorithm to achieve an accuracy of

C o r r = 0.819

, which means that each patient should prick himself/herself 60 times to inform the algorithm with his/her 60 blood concentration levels before performing non-invasive measurements. Still, this is twice as fast as

T e n s o r T i p^{T M}

’s 100 times of training.

As is presented in Table 2, it is safe to say that RF shows a better performance than other regression methods. Although RF does not seem to have a fast convergence speed as Adaboost do, it does show a better convergence stability and a higher accuracy than Adaboost can provide. Since kNN and DT both put up a much lower accuracy comparing with RF, they can be ruled out of consideration. For glucose estimating, SVM represents a quite terrible performance, of which the possible reason is that we have not found out a proper kernel function.

As can be seen from Table 2, ICA has some positive effect on the accuracy and convergence speed of three regression methods which are RF, Adaboost, and DT. The reason of this phenomenon is that ICA is to be equivalent to introduce prior knowledge of blood glucose measurement into regression methods. Thanks to ICA, convergence speed of these algorithms have been improved, which make sure a higher accuracy with less amount of training samples. This kind of improvement is of great importance to NGM. In addition, all the three regression methods of RF, Adaboost, and DT are based on maximum entropy which ICA also sees as the objective of decomposition, therefore ICA is helpful to enhance the accuracy of these three regression methods. As for kNN, ICA does not do any good for improving its convergence speed or accuracy as imagined, on the contrary, it brings a negative effect. It is because that kNN is a regression method based on Euclidean distance, while ICA decomposition simply transfer from one space to another, which not only does no help but also brings in more uncertainties when it comes to distance calculation.

In conclusion, RF [18] combined with ICA shows the best performance, therefore this kind of combination has been chosen for the NGM in this paper.

6.2.4. Transplant Experiment

In the transplant experiment, we try to use the algorithm trained through the dataset acquired from subject #1 to estimate the BGC of other five subjects. The correlation coefficient turns out to be 0.153, 0.021, −0.172, 0.017, and −0.221. These results are not as ideal as expected which proves that the individually-tailored algorithm cannot be transplant to others for use. Meanwhile, this experiment result indicates the great influence of individual discrepancies on optical glucose measurement. To overcome the effect of individual differences, it is necessary to have a personal customized NGM, but that is almost never the case. Although the NGM proposed in this paper cannot be transplant to others for use directly, it can be transplant after a brief learning process to readjust parameters.

6.3. Comparision with other NGMs

NGM is a hotspot in the field of glucose detecting nowadays. So far, a large amount of NGM methods have been proposed, among them some have achieved a pretty high accuracy, however most of these methods are lab-only with unsolved issues of transition and popularization. The limitations of these methods themselves lead to this awkward situation, since all the methods require support from large amount of sophisticated instruments at high expenses with complex operation procedures and strict experimental conditions. A comparison on input, accuracy, hardware cost and condition requirement of the methods proposed in this paper and three other recently published methods is listed in Table 5.

Different measurement methods have different levels of accuracy. Among them, some have a lower accuracy but a higher practicability. The quality of a method is not only judged by accuracy, instead more indicators should be taken into consideration when it comes to a comprehensive evaluation. As demonstrated by these experiments, the method proposed in this paper not only has a preferable accuracy, but also has the advantages of low cost and good robustness. In a word, our method shows an excellent comprehensive performance.

7. Conclusions and Future Works

In this paper, a non-invasive blood glucose meter based on multi-wavelength absorbance is studied. To deal with the individual discrepancies (IDs), a learning algorithm has been proposed, along with a device prototype. The experiments demonstrate that the correlation coefficient between measuring results of the prototype and an invasive glucose meter can reach 0.5 after merely 10 times of parameters correction, not to mention that after 60 times of correction the correlation coefficient rises to 0.80, which means that if you measure blood glucose 7 times a day, it only takes about 8 days to perform invasive calibrations, and then you can perform non-invasive measurement. This kind of glucose meter is able to reduce the effect of IDs in optical measurement method. In other words, in order to eliminate the impact of IDs, the model parameters of this type of learning glucose meter are individually customized. This type of glucose meter also has a shortcoming, i.e., it cannot be used interchangeably, since the parameters of a meter are adjusted through an individual training dataset. Compared with other NGM methods which require high-precision instruments, this learning non-invasive glucose meter has a lower hardware cost at only about $50, which will lead to a higher market prospect. In summary, the main contributions of this paper are as follows:

We have conducted a survey on the FDA or CE certified mini-invasive or non-invasive glucose meters currently available on the market;
For non-invasive blood glucose modeling, we propose a learning model framework, including an independent feature converter (IFC) and a glucose estimator (GE);
Based on our model, we designed a prototype of a non-invasive blood glucose meter, and collected real-world data with it;
Through experimental verification, we found that the combination of ICA and RF has the highest accuracy for estimating blood glucose concentration, and ICA helps to speed up the convergence of the model;
We confirmed the existence of individual discrepancies in NGM through the transplant experiment, and verified that the glucose meters based on a learning model cannot be used interchangeably.

This paper is only a preliminary/exploratory study of non-invasive blood glucose measurement techniques due to the efficiency and performance of our device and algorithm were evaluated on only 6 subjects. In the future, we will improve our equipment design and conduct clinical trails with more subjects and longer durations. The technologies of NGM has been developing rapidly in recent years, but most of them are in the lab-only stage, and there is still a long way to go for clinical use.

Author Contributions

Conceptualization, W.L. and A.H.; Methodology, W.L.; Hardware and software, W.L. and P.W.; Resources, A.H. and P.W.; Data curation, W.L. and A.H.; Writing—original draft preparation, W.L.; Writing—review and editing, A.H. and P.W.; Visualization, W.L.; Project administration, P.W.; Funding acquisition, P.W.

Funding

This research was funded by Peking University Medical Cross Research Seed Fund.

Acknowledgments

Our special thanks go to all the subjects who participated in the data acquisition.

Conflicts of Interest

The authors declare no conflict of interest.

References

Alberti, K.G.M.M.; Zimmet, P.Z. Definition, diagnosis and classification of diabetes mellitus and its complications. Part 1: Diagnosis and classification of diabetes mellitus. Provisional report of a WHO Consultation. Diabet. Med. 2015, 15, 539–553. [Google Scholar] [CrossRef]
WHO. Diabetes, Fact Sheet no. 312. Available online: http://www.who.int/mediacentre/factsheets/fs312/en/index.html (accessed on 6 January 2019).
Kovatchev, B.; Anderson, S.; Heinemann, L.; Clarke, W. Comparison of the numerical and clinical accuracy of four continuous glucose monitors. Diabetes Care 2008, 31, 1160. [Google Scholar] [CrossRef] [PubMed]
Boyd, J.C.; Bruns, D.E. Effects of measurement frequency on analytical quality required for glucose measurements in intensive care units: assessments by simulation models. Clin. Chem. 2014, 60, 644. [Google Scholar] [CrossRef] [PubMed]
Bailey, T.; Bode, B.W.; Christiansen, M.P.; Klaff, L.J.; Alva, S. The Performance and Usability of a Factory-Calibrated Flash Glucose Monitoring System. Diabetes Technol. Ther. 2015, 17, 787–794. [Google Scholar] [CrossRef] [PubMed]
Inc, A.D.C. Freestyle Libre Flash Glucose Monitoring System. Available online: https://freestylediabetes.co.uk/freestyle-libre (accessed on 6 January 2019).
Wientjes, K.J.; Schoonen, A.J. Determination of time delay between blood and interstitial adipose tissue glucose concentration change by microdialysis in healthy volunteers. Int. J. Artif. Organs 2001, 24, 884. [Google Scholar] [CrossRef] [PubMed]
Keenan, D.B.; Mastrototaro, J.J.; Weinzimer, S.A.; Steil, G.M. Interstitial fluid glucose time-lag correction for real-time continuous glucose monitoring. Biomed. Signal Process. Control 2013, 8, 81–89. [Google Scholar] [CrossRef]
Vashist, S.K. Non-invasive glucose monitoring technology in diabetes management: A review. Anal. Chim. Acta 2012, 750, 16–27. [Google Scholar] [CrossRef] [PubMed]
Hunter, M.; Enejder, A.; Scecina, T.; Feld, M.; Shih, W.C. Raman Spectroscopy for Non-Invasive Glucose Measurements. J. Biomed. Opt. 2005, 10, 031114. [Google Scholar]
Heise, H.M. Technology for non-invasive monitoring of glucose. In Proceedings of the International Conference of the IEEE Bridging Disciplines for Biomedicine, Amsterdam, The Netherlands, 31 October–3 November 1996; Volume 5, pp. 2159–2161. [Google Scholar]
Heise, H.M.; Marbach, R.; Koschinsky, T. Noninvasive Blood Glucose Sensors Based on Near-Infrared Spectroscopy. Artif. Organs 1994, 18, 439–447. [Google Scholar] [CrossRef]
Jiang, J.; Xu, K. Monte Carlo simulation on the effect of dermal thickness variances on noninvasive blood glucose sensing. Proc. Spie 2013, 8580, 85801C. [Google Scholar]
Li, G.; Wang, Y.; Li, Q.X.; Li, X.X.; Lin, L.; Liu, Y.L. Theoretical study on improving noninvasive measurement accuracy of blood component by dynamic spectrum method. J. Infrared Millim. Waves 2006, 25, 345–348. [Google Scholar]
Li, G.; Zhou, M.; Lin, L. Double-sampling to improve signal-to-noise ratio (SNR) of dynamic spectrum (DS) in full spectral range. Opt. Quantum Electron. 2014, 46, 691–698. [Google Scholar] [CrossRef]
Saptari, V.A. A Spectroscopic System for Near Infrared Glucose Measurement. Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge, MA, USA, 2004. [Google Scholar]
Bell, A.J.; Sejnowski, T.J. An Information-Maximization Approach to Blind Separation And Blind Deconvolution; Bradford Company: Holland, MI, USA, 1999; pp. 1129–1159. [Google Scholar]
Breiman, L. Random Forest. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
Clarke, W.L. The original Clarke Error Grid Analysis (EGA). Diabetes Technol. Ther. 2005, 7, 776–779. [Google Scholar] [CrossRef] [PubMed]
Dexcom, I. Dexcom G6 CGM System. Available online: https://www.dexcom.com/en-GB/uk-dexcom-g6-cgm-system (accessed on 6 January 2019).
Food, U.; Administration, D. Dexcom G6 Continuous Glucose Monitoring System. Available online: https://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfpmn/denovo.cfm?ID=DEN170088 (accessed on 6 January 2019).
Otis, B. Update on our Smart Lens Program with Alcon. Available online: https://blog.verily.com/2018/11/update-on-our-smart-lens-program-with.html (accessed on 6 January 2019).
Andreas, C.; Talary, M.S.; Martin, M.; Francois, D.; Jelena, K.; Marc, D.; Lutz, H.; Stahel, W.A. Non-invasive glucose monitoring in patients with Type 1 diabetes: A Multisensor system combining sensors for dielectric and optical characterisation of skin. Biosens. Bioelectron. 2009, 24, 2778–2784. [Google Scholar]
Swinehart, D.F. The Beer-Lambert Law. J. Chem. Educ. 1962, 39, 333. [Google Scholar] [CrossRef]
Ojalvo, I.U.; Ting, T. Interpretation and improved solution approach for ill-conditioned linear equations. AIAA J. 1990, 28, 1976–1979. [Google Scholar] [CrossRef]
Yoo, S.K. Photoplethysmography (PPG) Device and the Method Thereof. U.S. Patent 7,336,982, 26 February 2008. [Google Scholar]
Hyvarinen, E.B.A.A. A fast fixed-point algorithms for independent component analysis of complex valued signals. Int. J. Neural Syst. 2000, 10, 1–8. [Google Scholar]
Kingma, D.P.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv, 2014; arXiv:1412.6980. [Google Scholar]
Kramer, O. Scikit-Learn; Springer International Publishing: Berlin, Germany, 2016. [Google Scholar]
Hastie, T.; Rosset, S.; Zhu, J.; Zou, H. Multi-class AdaBoost. Stat. Interface 2009, 2, 349–360. [Google Scholar] [CrossRef]
Peterson, L. K-nearest neighbor. Scholarpedia 2009, 4, 1883. [Google Scholar] [CrossRef]
Quinlan, J.R. Induction on decision tree. Mach. Learn. 1986, 1, 81–106. [Google Scholar] [CrossRef]
Adankon, M.M.; Cheriet, M. Support Vector Machine. Comput. Sci. 2002, 1, 1–28. [Google Scholar]
Grimm, K.J.; Mazza, G.L.; Davoudzadeh, P. Model Selection in Finite Mixture Models: A k-Fold Cross-Validation Approach. Struct. Equ. Model. A Multidiscip. J. 2017, 24, 1–11. [Google Scholar] [CrossRef]
Ramasahayam, S.; Koppuravuri, S.H.; Arora, L.; Chowdhury, S.R. Noninvasive blood glucose sensing using near infra-red spectroscopy and artificial neural networks based on inverse delayed function model of neuron. J. Med. Syst. 2015, 39, 1–15. [Google Scholar] [CrossRef] [PubMed]
Geng, Z.; Tang, F.; Ding, Y.; Li, S.; Wang, X. Noninvasive Continuous Glucose Monitoring Using a Multisensor-Based Glucometer and Time Series Analysis. Sci. Rep. 2017, 7, 12650. [Google Scholar] [CrossRef] [PubMed]
Pappada, S.M.; Cameron, B.D.; Rosman, P.M.; Bourey, R.E.; Papadimos, T.J.; Olorunto, W.; Borst, M.J. Neural network-based real-time prediction of glucose in patients with insulin-dependent diabetes. Diabetes Technol. Ther. 2011, 13, 135–141. [Google Scholar] [CrossRef] [PubMed]
Jintao, X.; Liming, Y.; Yufei, L.; Chunyan, L.; Han, C. Noninvasive and fast measurement of blood glucose in vivo by near infrared (NIR) spectroscopy. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2017, 179, 250–254. [Google Scholar] [CrossRef]

Figure 1. The absorption spectrum of aqueous glucose solution, deducted water background, measured by UV-3600 Plus (Shimadzu, Kyoto, Japan, 2018) ultraviolet-near infrared (UV-NIR) spectrometer, which has an absorption peak at about 930 nm.

Figure 2. Structure and circuit design of the Earlight device: (a) Structural design; (b) Light emitter circuit; (c) Circuit of I-V Converter; (d) Filtering and amplifying circuit. VCC: Volt Current Condenser, GND: Ground, SHND: Shut down end, SIGN: Signal end, RS-: Resistance end.

Figure 3. The demonstration of using the prototype of blood glucose sensor, which is used to clamp the earlobe from the upper and lower sides to allow the incident light to penetrate the tissue and the transmitted light to be received by the light receptor.

Figure 4. The Schematic Diagram of the Non-invasive Glucose Measurement (NGM) algorithm consisting of the Independent Feature Converter (IFC) and Glucose Estimator (GE).

Figure 5. Clarke Error Grid Analysis for our proposed NGM system ((a) is RF without ICA, and (b) is RF with ICA), in which the x-axis is the reference Blood Glucose Concentration (BGC) and the y-axis is the estimated BGC. In (b), 64.7% of points fall in region A, while in (a) there is only 47.1%.

Figure 6. Convergence Experiment results, in which the correlation coefficient varies by the number of samples in the training set.

Table 1. Comparison and analysis of mini-invasive and non-invasive blood glucose meters. CE:Communate Europpene, FDA: U.S. Food and Drug Administration.

Type	Device	Target site	Principle	Progress	Problems
Mini-invasive	Abbott $F r e e S t y l e$ ™ $L i b r e$	Arm subcutaneous tissue fluid	Glucose oxidase	CE and FDA certified	1. Skin damage and prone to infection; 2. Time lag in glucose changing; 3. Poor accuracy in hypoglycemia; 4. $F r e e S t y l e$ ™ $L i b r e$ needs to be replaced every 14 days; 5. $G 5 / G 6$ ™ needs to be calibrated every 12 h;
	Echo Therapeutic $S y m p h o n y$ ™	Abdominal subcutaneous interstitial fluid	Glucose oxidase	CE certified
	Dexcom $G 5 / G 6$ ™	Abdominal subcutaneous interstitial fluid	Glucose oxidase	FDA (expanded use)
	Mendosa $G l u c o W a t c h$ ™	Wrist subcutaneous interstitial fluid	Reverse iontopho-resis	FDA revokes certification, and requires recall of products.	1. High requirements on the physiological and environmental condition; 2. Poor accuracy; 3. Damage to the skin;
Non-invasive	Google Smart contact lens	Tears	Unknown	Termination	1. Measurement error is large; 2. Difference between left and right tears; 3. Difficult sample collections;
	GNOGA $T e n s o r T i p$ ™ CoG	Finger	Absorption spectrum	CE certified	1. Low accuracy; 2. Need more than 100 invasive data for calibration before non-invasive measurements;
	C8 MediSensors HG1-c	Abdominal skin	Raman spectrum	CE certified	1. Sensitive to environmental noise; 2. The measurement principle is not clear; 3. Susceptible to individual differences and physiological background changes; 4. Some optical components are expensive; 5. Low accuracy; 6. Need a lot of calibration data;
	Integrity Applications $G l u c o T r a c k$ ™	Earlobe	Ultrasound, electricity, thermal	CE certified
	LifeTrac $S u g a r T r a c$ ™	Ear canal	Infrared spectrum	None

Table 2. Comparative experiment results of various regression methods.

Glucose Estimator	with ICA?	Best Parameters (In Scikit-Learn [29])	Best $Corr$	Number of Training Samples
Glucose Estimator	with ICA?	Best Parameters (In Scikit-Learn [29])	Best $Corr$	$Corr = 0.5$	Best $Corr$
Random Forest [18]	Yes	n_estimator = 200	0.819	10	60
Random Forest [18]	No	max_depth = 2	0.523	20	41
Adaboost [30]	Yes	base_estimator = DT	0.695	8	52
Adaboost [30]	No	n_estimator = 20	0.565	7	49
kNN [31]	Yes	n_neighbors = 3	0.461	$I n f$	49
kNN [31]	No	n_neighbors = 3	0.642	23	43
Decision Tree [32]	Yes	max_depth = 20	0.538	11	44
Decision Tree [32]	No	max_depth = 20	0.407	$I n f$	46
SVM [33]	Yes	C = 0.5, Kernel = rdf	0.119	$I n f$
SVM [33]	No	tol = 0.001	0.154	$I n f$

I n f

indicates that the method does not converge.

Table 3. The characteristics of each subject.

ID	Gender	Age	Weight (kg)	Height (cm)	BMI	Diabetes	Skin Tone	Number of Data
#1	Male	56	76	166	27.5	II	Yellowish	84
#2	Female	43	54	160	21.0	II	White	47
#3	Male	31	104	186	30.1	II	Yellowish	38
#4	Male	36	68	172	22.9	I	Dull	41
#5	Female	65	53	154	22.3	II	Yellowish	23
#6	Male	26	61	183	18.2	Health	Yellowish	53

Table 4. The meaning of each region in Clarke Error Grid.

Region	Meaning
A	Those values within 20% of the reference sensor.
B	Those points that are outside of 20% but would not lead to inappropriate treatment.
C	Those points leading to unnecessary treatment.
D	Those points indicating a potentially dangerous failure to detect hypoglycemia or hyperglycemia.
E	Those points that would confuse treatment of hypoglycemia for hyperglycemia and vice versa.

Table 5. Comparison of different Non-invasive Glucose Measurement (NGM) methods.

Methods	Input	Accuracy	Hardware Cost	Condition Requirement
This paper	light absorbance	$C o r r = 0.819$	$< $ 50$	open condition
Ramasahayam, S., et al. [35]	spectroscopy	mean error = 1.02 mmol/L	$> $ 1000$	open condition
Geng, Z., et al. [36]	impedance spectroscopy, optical properties, temperature and humidity.	$C o r r = 0.831$	$> $ 2000$	laboratory environment
Pappada, S. M., et al. [37]	heart rate, blood oxygen, blood pressure	mean error = 2.43 mmol/L	unknown	laboratory environment
Jintao, X., et al. [38]	spectroscopy	$C o r r = 0.570$	$> $ 1000$	laboratory environment

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Liu, W.; Huang, A.; Wan, P. Overcoming Individual Discrepancies, a Learning Model for Non-Invasive Blood Glucose Measurement. Appl. Sci. 2019, 9, 192. https://doi.org/10.3390/app9010192

AMA Style

Liu W, Huang A, Wan P. Overcoming Individual Discrepancies, a Learning Model for Non-Invasive Blood Glucose Measurement. Applied Sciences. 2019; 9(1):192. https://doi.org/10.3390/app9010192

Chicago/Turabian Style

Liu, Weijie, Anpeng Huang, and Ping Wan. 2019. "Overcoming Individual Discrepancies, a Learning Model for Non-Invasive Blood Glucose Measurement" Applied Sciences 9, no. 1: 192. https://doi.org/10.3390/app9010192

APA Style

Liu, W., Huang, A., & Wan, P. (2019). Overcoming Individual Discrepancies, a Learning Model for Non-Invasive Blood Glucose Measurement. Applied Sciences, 9(1), 192. https://doi.org/10.3390/app9010192

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Overcoming Individual Discrepancies, a Learning Model for Non-Invasive Blood Glucose Measurement

Abstract

1. Introduction

2. Related Works

2.1. Mini-Invasive

2.2. Non-Invasive

3. Theoretical Analysis

3.1. Earlobe Model

3.2. Wavelength Selection

4. Circuit of Blood Glucose Sensor

5. Algorithm of NGM

5.1. Independent Feature Converter

5.2. Glucose Estimator

6. Experiment and Results

6.1. Data Acquisition

6.2. Experiment Results

6.2.1. Accuracy Experiment

6.2.2. Convergence Experiment

6.2.3. Comparative Experiment

6.2.4. Transplant Experiment

6.3. Comparision with other NGMs

7. Conclusions and Future Works

Author Contributions

Funding

Acknowledgments

Conflicts of Interest

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI