A Method to Detect Type 1 Diabetes Based on Physical Activity Measurements Using a Mobile Device

Featured Application: Non-invasive method of type 1 diabetes detection based on physical activity measurement.

Abstract: Type 1 diabetes is a chronic disease marked by high blood glucose levels, called hyperglycemia. Diagnosis of diabetes typically requires one or more blood tests. The aim of this paper is to discuss a non-invasive method of type 1 diabetes detection, based on physical activity measurement. We solved a binary classification problem using a variety of computational intelligence methods, including non-linear classification algorithms, which were applied and comparatively assessed. Prediction of disease presence among children and adolescents was evaluated using performance measures, such as accuracy, sensitivity, specificity, precision, the goodness index, and AUC. The most satisfying results were obtained when using the random forest method. The primary parameters in disease detection were weekly step count and the weekly number of vigorous activity minutes. The dependence between the weekly number of steps and type 1 diabetes presence was established after an analysis of the data using classification and clustering algorithms. The findings indicate that type 1 diabetes can potentially be detected using physical activity measurement. This is important given the non-invasiveness and flexibility of the detection method, which can be applied at any time and anywhere. The proposed technique can be implemented on a mobile device.


Introduction
Diabetes mellitus is a group of metabolic diseases that is characterized by hyperglycemia and results from defects in insulin action, insulin secretion, or both [1]. Elevated blood glucose connected with this disease can cause dysfunction and failure of various organs, which are the effects of long-term diabetes. Currently, according to the WHO and American Diabetes Association classification (ADA), there are four types of diabetes: type 1, type 2, other specific types of diabetes, and gestational diabetes [2,3].
Type 1 diabetes causes the patient's blood glucose to become too high. This happens when the body cannot produce enough insulin, which controls blood glucose. Patients need daily injections of insulin to keep blood glucose levels under control. It is one of the leading health problems in Poland and Europe for people of all ages. It causes constant damage to health and contributes to premature death [4,5]. According to International Diabetes Federation (IDF) estimates, the incidence of type 1 diabetes among children and adolescents under the age of 15 years is increasing in many countries (Figure 1) [6]. Type 1 diabetes is described as the most prevalent metabolic disease and the third most common irreversible chronic disease in childhood, especially below 15 years of age [7]. Despite great progress in medicine, diabetes is an incurable disease, and it is an extraordinary burden on patients and their families. Due to its chronic, progressive, and incurable nature, it greatly affects adolescents, in particular their self-esteem, educational opportunities, and lifestyle. Children and adolescents with type 1 diabetes must face many problems related to treatment restrictions.
Figure 1. Estimated number of children and adolescents <20 years with type 1 diabetes by IDF region, 2017 [6].
Measurement of blood sugar is the basic test most often ordered by doctors to detect carbohydrate tolerance disorders and also to diagnose and monitor the treatment of diabetes. Blood is drawn for testing on an empty stomach, and then after a meal or after administration of a glucose solution. Serious barriers in the treatment of diabetes among children are problems with painful injections or blood tests, shame about diabetes, arguing with parents about the plan for diabetes control, and compliance. Particularly troublesome are activities related to measuring the level of glucose in the blood, injecting insulin, exercising, controlling the carbohydrate exchanges in the diet, wearing a diabetic or information bracelet, carrying sweets for hypoglycemia, and eating snacks [8].
An additional problem is the fact that the symptoms of diabetes are often ambiguous. They may be confused with or attributed to other diseases. Diabetes can only be unequivocally diagnosed when a glucose load test is performed. A late diagnosis of diabetes in childhood can lead to serious changes, such as destruction of blood vessels, visual disturbances, and problems with the nervous system and kidneys. Very serious diabetes that has gone unrecognized for a long time may endanger children's lives; therefore, extraordinary vigilance should be maintained while observing children, in order to react in time to the first signals of the disease [9,10].
While analyzing the information above, the question arises whether it is possible to diagnose diabetes without performing blood tests. The present work aims to diagnose type 1 diabetes among children based on their physical activity. Selected classification algorithms are compared to obtain the most satisfying results. The promising results encourage developing an application using computational intelligence methods.

Available Methods of Assessing Physical Activity
Physical activity results in an increase in energy expenditure above resting levels. The rate of energy expenditure is directly linked to the intensity of the activity [11]. Physical activity can be classified according to the Borg scale, ranging from sedentary, light, moderate, to vigorous activities [12].
Currently, there are many methods that allow determining the parameters of physical activity with high accuracy. These include all monitors like pedometers and accelerometers that have motion sensors and are worn on the body of the subject to perform various motion measurements, e.g., step count, the duration of physical activity, and its intensity [13].

Pedometers and Accelerometers in Physical Activity Measuring
The simplest and most popular devices for activity measurement are pedometers, which record the number of steps. Thanks to the ability to display the result on a regular basis, they are considered a motivating tool to perform more physical activity in everyday life. However, measurements by pedometers in scientific research have many limitations. The devices provide information on the frequency of movement, but they do not determine the intensity of physical exercise. Pedometer step counts are also less accurate at slow speeds (<60 m/min); therefore, they may be inappropriate for older adults, and the result may not be reliable. Pedometer readings can also vary according to where the pedometer is mounted. A further weakness is the possibility of falsifying and inflating results by intentionally shaking the device or through shocks caused by driving a car, neither of which indicates that the subject was more active [14].
Currently, the most accurate motion sensors used to assess physical activity are accelerometers. These devices detect the acceleration of body movement, making it possible to reliably measure the intensity and duration of physical activity, as well as the number of steps taken and sedentary time [15]. These motion parameters are read by a piezoelectric sensor, and the analog signal is converted into a digital one in the range of 0.1-3.6 Hz. Thanks to this, very accurate monitoring of physical activity is possible. An example of a commonly-used accelerometer is ActiGraph.

ActiGraph Activity Monitor
ActiGraph has been used in large-scale field studies and has become the de facto standard device for objective physical activity monitoring [16]. It is particularly recommended for examination of children and adolescents because it allows for detection of acceleration in three planes of motion, which provides more accurate analysis of the movement relative to pedometers. This is especially important in the case of children's examination because the device records all forms of physical activity, such as doing push-ups or climbing. Many publications describe the advantages of using accelerometers in scientific research, such as objectivity, non-invasiveness, and accuracy, while maintaining the comfort of the user [15].
Published applications of ActiGraph include exploring differences in daily physical activity profiles between individuals with mild Alzheimer's disease and a control group [17]. Features that can be derived from the accelerometer have also been used to recognize the presence and severity of motor fluctuations in patients with Parkinson's disease [18]. The device has also been used to measure physical activity in order to evaluate the effectiveness of surgical and therapy-based interventions in children with cerebral palsy, and to derive diurnal rest-activity patterns from actigraphy in adolescents and analyze their associations with adiposity measures and cardiometabolic risk factors [19,20].
However, ActiGraph activity monitors have limited memory and battery capacity to store raw signal data and are additionally quite expensive. One of the current models, ActiGraph wGT3X-BT, currently sells for 225 USD [21]. The costs of devices may vary if bought separately, as compared to bulk orders.
Due to memory limitations, information about movement is read by the accelerometer in the form of the number of pulses (named counts), which are added up in the designated time unit [22]. A count is a unit aimed to be proportional to the average overall acceleration of the human body in a specified period of time. The sum of the received counts is converted into the intensity of physical activity, categorized as sedentary behaviors, light physical activity (LPA), moderate PA (MPA), and vigorous PA (VPA).
There are commonly-used regression equations named as cut points for the ActiGraph accelerometers in predicting energy expenditure (EE) in children and adolescents [23]. The cut points are derived as a part of published research aimed at quantifying activity levels using ActiGraph products. All cut point sets are scaled to 60-s epochs.
In this study, the parameters of physical activity are calculated according to the Freedson Children (2005) model. Definitions of the cut point levels for this model are given in Table 1.

Methods to Compare New and Traditional Accelerometer Data
There are many publications describing how to convert a raw accelerometer signal into the output data of the ActiGraph [16,24,25]. Such data can be obtained using a common smartphone, which is equipped with an accelerometer and a pedometer. Reproducing the conversion to counts will allow performing tests in an inexpensive and easy way, with results comparable to those obtained using the ActiGraph activity monitor.
The research literature describes that counts are calculated as the area under the filtered and rectified (non-negative) curve. The ratio between raw acceleration signal and counts is likely to be brand specific [16]. The experiment described in the literature showed that a third-order Butterworth filter resulted in the highest correlation between ActiGraph counts and unscaled raw accelerometer counts (r = 0.975, p < 0.01) [24].
The complete method of converting raw accelerometer data to the ActiGraph output signal is presented below as steps. First, it is necessary to gather 60 s of raw accelerometer readings and calculate the Euclidean norm of the three axis values in order to create one signal from three axes. Second, this signal should be processed using a third-order Butterworth filter. Next, the area under the filtered and rectified signal should be calculated. Then, the result should be labeled by type of activity (i.e., sedentary, vigorous, etc.) using predefined cut points, and the counter of the selected label should be incremented. All steps should be repeated until enough data are collected.
This method allows for consistency with traditional physical activity measurements so that it is possible to make a historical analysis and comparisons.
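The conversion steps described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the thresholds in `HYPOTHETICAL_CUT_POINTS` are placeholders and not the Freedson Children (2005) values, and the third-order Butterworth filter is left as a pluggable `filter_fn` argument (it could be supplied by a signal-processing library) rather than implemented here.

```python
import math

# Placeholder (upper_bound, label) pairs; NOT the Freedson Children (2005) cut points.
HYPOTHETICAL_CUT_POINTS = [
    (1.0, "sedentary"),
    (2.0, "light"),
    (3.0, "moderate"),
    (math.inf, "vigorous"),
]

def vector_magnitude(samples):
    """Combine (x, y, z) readings into one signal using the Euclidean norm."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in samples]

def epoch_count(signal, sample_rate_hz, filter_fn=None):
    """Approximate the area under the filtered, rectified signal (rectangle rule)."""
    filtered = filter_fn(signal) if filter_fn else list(signal)
    dt = 1.0 / sample_rate_hz
    return sum(abs(v) for v in filtered) * dt

def label_epoch(count, cut_points):
    """Return the label of the first interval whose upper bound exceeds the count."""
    for upper, label in cut_points:
        if count < upper:
            return label
    return cut_points[-1][1]

def tally_epochs(epochs, sample_rate_hz, cut_points, filter_fn=None):
    """Label every epoch and increment the counter of the selected label."""
    totals = {label: 0 for _, label in cut_points}
    for samples in epochs:
        count = epoch_count(vector_magnitude(samples), sample_rate_hz, filter_fn)
        totals[label_epoch(count, cut_points)] += 1
    return totals
```

With 60-s epochs, each entry of the returned dictionary is the number of minutes spent at that activity level. The scaling between raw acceleration and counts is brand specific, so real cut points would have to be calibrated against ActiGraph output.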

Data Source
The dataset was collected from a group of schoolchildren between the ages of 6 and 18 who were under the care of the diabetic clinic for children at Rzeszow State Hospital in Poland in 2016 by E. Czeczek-Lewandowska as a part of her Ph.D. thesis research [8]. The dataset was divided into two groups based on the results of HbA1c glycated hemoglobin tests for diabetes, which were read from the patients' medical records provided by the diabetic clinic with parental consent. The analysis included the last two results from the maximum period of one year prior to the study, on the basis of which the arithmetic mean was calculated.
The 451 children who took part in the research were screened against the inclusion and exclusion criteria. The eligibility criteria were: age between 6 and 18, type 1 diabetes diagnosed a minimum of one year prior to the examination, HbA1c values determined at least twice in the year prior to the start of the study, informing parents about the study and obtaining the child's consent, the required physical activity record length (excluding night hours and activities performed in contact with water), and training the parent and child in using the accelerometer. Children who did not meet the inclusion criteria, were diagnosed with type 2 diabetes or other metabolic disorders, had current complications in the course of diabetes, or became sick during the study period were excluded from the study. Additional exclusion criteria were exceptionally bad weather conditions, a period of holidays, and a holiday break during the study period. Finally, the study group consisted of 215 children with type 1 diabetes and 115 healthy children from a control group. Nine parameters for each child were collected and are listed below.
1. General and BMI parameters: The weight and height of the body were obtained using a Radwag WPT 60/150 OW electronic scale during a three-stage measurement. The level of physical activity was assessed with a hip-worn ActiGraph wGT3X-BT activity monitor used by the children 12 h a day for a week, excluding night time and activities performed in contact with water, i.e., bath, swimming pool. The parameters of physical activity were calculated according to the Freedson Children (2005) method.

Classification Methods
Classification systems have an important role in decision-making tasks by categorizing the available information based on some criteria [26]. The purpose of this research was to assess the relative efficacy of some well-known classification methods. We have considered classification techniques that are based on statistical and AI techniques. A brief review of the relevant classification methods is presented in this section.

Support Vector Machine
Support vector machine (SVM) is a classification algorithm used for finding an optimal hyperplane that maximizes the margin between classes. That hyperplane is orientated in such a way that it is as far as possible from the closest data points from each of the classes. These closest points are called support vectors [27]. The key element of the SVM algorithm is the kernel function. It transforms a non-linear feature space into a linear one before the hyperplane search [28].

Probabilistic Neural Network
The probabilistic neural network (PNN) is a feedforward neural network model. It consists of input, pattern, summation, and output layers. The input layer is represented by the features of the input vector. The pattern layer is composed of as many neurons as learning samples. The summation layer consists of N neurons, where each of them computes the signal only for patterns that belong to the n-th class. The output layer is used to yield the decision; the output with the largest probability value is 1, and the rest of the outputs are 0 [29].

Multilayer Perceptron
Multilayer perceptron (MLP) is a feedforward artificial neural network that uses the backpropagation technique for training. It is composed of one or more layers of neurons. Data are transferred to the input layer; there may be one or more hidden layers; and predictions are made on the output layer [30].

Group Method of Data Handling
The group method of data handling (GMDH) is a family of inductive algorithms for the mathematical modeling of multi-parametric datasets. It features the fully-automatic parametric and structural optimization of models. GMDH is used for constructing a high-order regression-type polynomial [31].

Gene Expression Programming
Gene expression programming (GEP) is an evolutionary algorithm that creates models, equations, or computer programs. GEP programs are encoded in the so-called chromosomes, which are mutated by computing the expression of each chromosome. Next, the predefined genetic operators are applied, and the fitness is calculated. Finally, the best chromosomes are selected to reproduce [32].

Linear Regression
Linear regression is one of the simplest and best known algorithms in statistics and machine learning used for finding a linear relationship between the target and one or more predictors. The core idea of linear regression is to obtain a line that best fits the data [33].

Radial Basis Function Network
The radial basis function network (RBF) is an artificial neural network that uses radial basis functions as activation functions. The output of the RBF network is a linear combination of radial basis functions of the inputs and neuron parameters [34].

Logistic Regression
Logistic regression is a statistical method for analyzing a dataset with one or more independent variables that determine an outcome. The goal of logistic regression is to find the best fitting model to describe the relationship between the binary dependent variable and a set of independent variables [35].

Decision Tree
The decision tree (DT) is a type of model used for both classification and regression. Trees ask sequential questions, and each answer sends the sample down a certain route of the tree. They are intuitive and provide one of the simplest portrayals for classification purposes. Tree depth represents how many questions are asked before reaching the predicted classification [36].

Random Forests
Random forests (RF) are a classification algorithm that combines decision tree predictors so that each of them depends on the values of a randomly selected independent vector with the same distribution for all trees in the forest [37]. After training, predictions for unseen samples can be made by taking the majority vote of the trees [36].

Validation Methods
Commonly-used evaluation measures are precision, sensitivity, and accuracy. These measures can be defined with the help of the four cardinalities of the confusion matrix, namely the true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) [38].

Accuracy
The accuracy metric measures the proportion of correct classifications (true positives and true negatives) among all classified samples [38]:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Sensitivity
The sensitivity (recall) measures the proportion of actual positives that are correctly identified as such (e.g., the percentage of children with type 1 diabetes who are correctly identified as having the condition) [38]:
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}$$

Specificity
The specificity measures the proportion of actual negatives that are correctly identified as such (e.g., the percentage of healthy children who are correctly identified as not having the condition) [38]:
$$\mathrm{Specificity} = \frac{TN}{TN + FP}$$

Precision
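The precision metric determines the quality of positive predictions, i.e., the proportion of true positives among all samples predicted as positive (true positives and false positives) [38]:
$$\mathrm{Precision} = \frac{TP}{TP + FP}$$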

AUC
For a binary classification problem, the evaluation of the performance is typically illustrated with the receiver operating characteristic (ROC) curve, which plots the true positive rate versus the false positive rate at various threshold settings. It is convenient to reduce it to a single scalar value representing expected performance. A common method is to calculate the area under the ROC curve (AUC). An ideal classifier achieves an AUC equal to 1, while a classifier that makes random decisions achieves an AUC equal to 0.5 [38,39].
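As an illustration, the AUC can be computed without drawing the curve, because it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one (ties counted as one half). The sketch below is only an example of that equivalence; the function name `auc_score` and the label encoding (1 for sick, 0 for healthy) are assumptions, and the software packages used in this study may compute the value differently.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a correct ranking
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier scores for four children (1 = sick, 0 = healthy).
example_auc = auc_score([1, 1, 0, 0], [0.9, 0.3, 0.4, 0.1])
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a dataset of a few hundred records.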

Goodness Index
The goodness index (G) represents the Euclidean distance between the evaluated point in the receiver operating characteristic space and the point (0,1), which represents the perfect classifier that classifies all positive cases and negative cases correctly:
$$G = \sqrt{(1 - \mathrm{Sensitivity})^2 + (1 - \mathrm{Specificity})^2}$$
G can assume values between 0 and $\sqrt{2}$, and a classifier can be categorized according to its G value, with lower values indicating better performance [40].
The G value result analysis allows evaluating the best-performing classifier [28].
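The measures defined in this section can be computed directly from the four confusion matrix cardinalities, as in the following sketch. The function name is hypothetical, and the example counts are an assumption: they were chosen only because they are consistent with percentages reported for a balanced set of 230 samples, not because they were taken from the study's confusion matrix.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Compute the validation measures from confusion matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": tp / (tp + fp),
        # Distance in ROC space from (1 - specificity, sensitivity) to (0, 1).
        "goodness_index": math.sqrt((1 - sensitivity) ** 2 + (1 - specificity) ** 2),
    }

# Hypothetical counts for 115 sick and 115 healthy children.
metrics = classification_metrics(tp=101, tn=97, fp=18, fn=14)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A goodness index of 0 corresponds to sensitivity and specificity both equal to 1, and the maximum of $\sqrt{2}$ is reached when both equal 0.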

Clustering Method
The k-means clustering algorithm is one of the most popular clustering algorithms; it is used to find groups that are not explicitly labeled in the data. It uses iterative refinement to produce a final result. The inputs of the algorithm are the dataset and the number of clusters k. A cluster is a collection of data points that have been aggregated together because of certain similarities, and the dataset is a collection of features for each data point. The algorithm starts with initial estimates for the k centroids, which can either be randomly generated or randomly selected from the dataset. Then, the algorithm iterates between two steps:
• Data assignment: each data point is assigned to its nearest centroid, based on the squared Euclidean distance. If $c_i$ is a centroid in the set of centroids $C$, then each data point $x$ is assigned to a cluster based on:
$$\underset{c_i \in C}{\arg\min} \; \mathrm{dist}(c_i, x)^2$$
where dist(·) is the standard Euclidean distance. $S_i$ is the set of data points assigned to the i-th cluster centroid.
• Centroid update: centroids are recomputed by taking the mean of all data points assigned to that centroid's cluster:
$$c_i = \frac{1}{|S_i|} \sum_{x_j \in S_i} x_j$$
The algorithm iterates between those two steps until convergence. Convergence is reached when the computed centroids do not change or the centroids and the assigned points oscillate back and forth from one iteration to the next one. The result may be a local optimum, so assessing more than one run of the algorithm with randomized starting centroids may give a better outcome [41].
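The two alternating steps can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the function names, the `max_iter` limit, and the `seed` parameter are assumptions, and the initial centroids are drawn at random from the dataset.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def k_means(points, k, max_iter=100, seed=0):
    """Cluster points into k groups; returns (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assignments = None
    for _ in range(max_iter):
        # Data assignment: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # converged: assignments (and hence centroids) are stable
        assignments = new_assignments
        # Centroid update: mean of all points assigned to each cluster.
        for i in range(k):
            members = [p for p, a in zip(points, assignments) if a == i]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, assignments

# Hypothetical two-dimensional data with two well-separated groups.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, assignments = k_means(data, k=2)
```

Because the result may be a local optimum, the function can be called with several `seed` values and the run with the smallest total within-cluster squared distance kept.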

Feature Selection Methods
Feature selection is the first and fundamental step in data analysis. This is the process of selecting a subset of relevant features (variables, predictors) for use in model construction. Feature selection methods aid in creating an accurate predictive model by choosing only features that are relevant. Irrelevant features in the dataset can decrease the performance of the models; redundant data increase the chance of decisions being based on noise, raise algorithm complexity, and slow down training.
There are three general classes of feature selection algorithms: filter methods, wrapper methods, and embedded methods. Filter feature selection methods apply a statistical measure to assign a score to each feature. The features are ranked by the score and are either selected to be kept or removed from the dataset. The methods are often univariate and consider the feature independently, or with regard to the dependent variable. Examples of some filter methods include correlation coefficient scores and information gain. These methods are used to create the feature ranking [42].
Pearson's correlation coefficient is one of the methods of measuring the association between variables of interest, and it is based on the covariance method. It gives information about the magnitude of the association, or correlation, as well as the direction of the relationship [43].
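A filter-style feature ranking based on Pearson's correlation coefficient can be sketched as below. The function names and the toy feature values are hypothetical; ranking by the absolute value of the coefficient against a 0/1 class label is one common convention and is an assumption here, not necessarily the exact procedure behind the reported ranking.

```python
import math

def pearson_r(xs, ys):
    """Pearson's correlation coefficient: covariance scaled by both spreads."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rank_features(features, labels):
    """Order feature names by |r| with the class label, strongest first."""
    scores = {name: abs(pearson_r(values, labels)) for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy data: labels use 1 for sick and 0 for healthy.
toy_features = {"steps": [1, 2, 3, 4], "noise": [1, -1, 1, -1]}
toy_labels = [0, 0, 1, 1]
ranking = rank_features(toy_features, toy_labels)
```

The sign of the coefficient gives the direction of the relationship, while its magnitude is what the ranking uses. A constant feature would make the denominator zero and would need to be filtered out beforehand.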
Entropy measures the amount of uncertainty in the dataset. The information gain is based on the decrease in entropy after splitting a dataset on an attribute [44]. It is used to generate a decision tree from a dataset. Constructing a decision tree comes down to finding an attribute that returns the highest information gain.
The information gain IG is the change in information entropy H from a prior state to a state that takes some information as given:
$$IG(d, a) = H(d) - H(d \mid a)$$
where $H(d \mid a)$ is the conditional entropy of decision d given attribute a and $H(d)$ is the entropy of decision d, which is equal to:
$$H(d) = -\sum_{i} p_i \log_2 p_i$$
where $p_i$ is the proportion of samples belonging to the i-th decision class. Information gain can be calculated for each remaining attribute. The attribute with the largest information gain is used to split the dataset on this iteration [45].
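The entropy and information gain computations can be illustrated with the short sketch below, which uses the Shannon entropy with a base-2 logarithm. The function names are hypothetical, and the attribute is assumed to be already discretized (for example, "low" or "high" weekly step count); continuous attributes would first need a split threshold.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(attribute_values, labels):
    """Entropy of the labels minus the conditional entropy given the attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(attribute_values, labels):
        groups.setdefault(value, []).append(label)
    # Conditional entropy: weighted average of the entropy within each group.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

# Hypothetical example: the attribute separates the two classes perfectly.
gain = information_gain(["low", "low", "high", "high"], [1, 1, 0, 0])
```

An attribute that separates the classes perfectly removes all uncertainty, so its gain equals the entropy of the labels, whereas an attribute unrelated to the classes yields a gain of zero.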

Data Analysis Results
The aim of this research was to answer whether type 1 diabetes among children and adolescents can be diagnosed based on physical activity. We defined the prediction problem of type 1 diabetes presence among children as a binary classification problem. The results were obtained using DTREG, Weka, and Python Scikit-Learn software packages [46-48].
The assessment of physical activity impact on the prevalence of type 1 diabetes among children and adolescents was based on parameters closely related to the intensity of physical activity. These parameters were calculated according to the Freedson Children (2005) model (Table 1). We considered the dataset consisting of parameter values of 215 sick and 115 healthy children. The selected classification parameters set was composed of the total number of steps and sedentary, light, moderate, and vigorous activity minutes per week.
Subsequently, we decided to create a feature ranking (FR) automatically. FR specifies the significance of features for a problem by ranking features according to their importance in the model using ranking algorithms [42]. The FR based on correlation coefficient scores was performed, and the results are presented in Table 2. Because evaluating the entropy, which measures the homogeneity of a sample, is a key step in the decision tree algorithm, we also decided to create an FR based on information gain, which is derived from the entropy. The results are presented in Table 3. The data presented in Table 2 showed that the most significant parameter was the step count (per week). The data presented in Table 3 indicated three important parameters, i.e., vigorous activity minutes, moderate activity minutes, and step count.
The presented results of the FR are purely illustrative, because threshold values were set to exclude unimportant parameters. We decided to use all physical activity parameters in the classification.

Classification Result
Firstly, we built a decision tree with the overall goal of extracting general information from the dataset and transforming that information into a structure that can be understood by an ordinary user. A decision tree was built from physical activity parameters, i.e., the total number of steps and the groups of sedentary, light, moderate, and vigorous activities, using the implementation of the C4.5 algorithm, called J48, from the Weka software package. The algorithm was run with default values, i.e., the confidence threshold for pruning set to 0.25 and a minimum number of instances per leaf equal to two.
At each node of the tree, the algorithm chose the attribute of the data that most effectively split the set of samples into subsets enriched in one class or the other. The splitting criterion was the normalized information gain. The information gain feature ranking results described in Section 4.1 had the vigorous activity minutes parameter in the first place. Hence, it could be expected that the root of the decision tree would be the same parameter. As can be observed in Figure 2, the results of the decision tree classification are not completely consistent with logical thinking, and in some cases, they seem contradictory. Accurate and reliable information is vital for effective decision making. Thus, we employed an undersampling technique, which adjusts the class distribution of a dataset, to obtain reliable estimates. For this purpose, 115 of the 215 sick children were chosen at random to obtain equal numbers of sick and healthy patients. After undersampling, the resulting 230 patients were considered eligible and were enrolled in the study.
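Random undersampling of the majority class can be sketched as follows. The function name and the `seed` parameter are assumptions made for the illustration, and the record identifiers are placeholders; only the class sizes (215 sick, 115 healthy) come from the text.

```python
import random

def undersample(records, labels, seed=0):
    """Randomly reduce every class to the size of the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for record, label in zip(records, labels):
        by_class.setdefault(label, []).append(record)
    n_min = min(len(group) for group in by_class.values())
    kept_records, kept_labels = [], []
    for label, group in by_class.items():
        for record in rng.sample(group, n_min):  # sampling without replacement
            kept_records.append(record)
            kept_labels.append(label)
    return kept_records, kept_labels

# Placeholder record ids; 1 = sick (215 children), 0 = healthy (115 children).
records = list(range(330))
labels = [1] * 215 + [0] * 115
balanced_records, balanced_labels = undersample(records, labels)
```

Undersampling discards data from the majority class, so results may vary with the random selection; repeating the procedure with different seeds is one way to check the stability of the estimates.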
The best results in the prediction of type 1 diabetes presence among children and adolescents were obtained with decision tree forests. This model enabled the prediction with the highest accuracy (86.09%), specificity (84.35%), and precision (84.87%). The PNN also showed high accuracy (84.35%) and the highest sensitivity (89.57%), but markedly lower specificity (79.13%) and precision (81.10%). The AUC for PNN (0.926578) also exceeded the values of this parameter for the remaining classifiers (Figure 1). The averaged accuracy, sensitivity, specificity, precision, goodness index, and AUC values obtained for all applied computational intelligence methods and a linear regression model are presented in Table 4. The values were obtained using a 10-fold cross-validation procedure [49]. The dataset of 230 samples was split into 10 folds, each of which was used once as the testing set. In the first iteration, the first fold was used to test the model, and the rest were used to train the model. In the second iteration, the second fold was used as the testing set, and the rest served as the training set. This process was repeated until each of the 10 folds had been used as the testing set. The mean of the scores obtained in all iterations was then calculated in order to assess the performance of the model.
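The cross-validation procedure can be sketched as follows. The names `k_fold_indices` and `cross_validate`, the shuffling with a fixed `seed`, and the `evaluate` callback (a user-supplied function that trains on the training indices and returns a score on the testing indices) are assumptions made for illustration; the folds here are not stratified, and the source does not state whether stratification was used.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) so each fold is the test set once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx

def cross_validate(n_samples, evaluate, k=10, seed=0):
    """Average the score returned by evaluate(train_idx, test_idx) over all folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)

# With 230 samples and 10 folds, every testing set contains 23 samples.
fold_sizes = [len(test) for _, test in k_fold_indices(230, k=10)]
```

Passing a function that fits a classifier on `train_idx` and returns, for example, its accuracy on `test_idx` yields the averaged score described in the text.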

Clustering Result
In the last step, we sought to explain the correlation between physical activity parameter values and type 1 diabetes presence. This relied on finding the equation that played a major role in the correct diabetes classification among children and adolescents. For this purpose, we assumed that we did not have a classification into sick and healthy patients and used the k-means clustering algorithm.
After clustering, we compared the obtained clusters with their corresponding classes from the dataset. It turned out that 215 of 330 records had identical classes (215/330 ≈ 65%). Then, we built a decision tree for these 215 records using the C4.5 algorithm with the same setup as described in the Classification Result section. The result of the decision tree is presented in Figure 3. The obtained results confirmed the assumption that the correlation between physical activity and type 1 diabetes presence can be evaluated based on measuring step count. It is possible to predict the presence of the disease correctly in at least 65% of the cases. As a result, a child was classified as sick when performing fewer than 60,837 steps per week.
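The comparison of clusters with classes and the resulting step-count rule can be illustrated with the sketch below. The function names are hypothetical, the cluster identifiers are assumed to have already been mapped to class names, and the example labels are placeholders constructed only to reproduce the reported proportion of 215 matching records out of 330; the threshold of 60,837 steps per week is the value given in the text.

```python
STEP_THRESHOLD_PER_WEEK = 60837  # threshold reported for the decision tree

def predict_from_steps(weekly_steps, threshold=STEP_THRESHOLD_PER_WEEK):
    """Single-split rule: fewer steps than the threshold is classified as sick."""
    return "sick" if weekly_steps < threshold else "healthy"

def agreement(cluster_labels, class_labels):
    """Fraction of records whose cluster label matches the known class."""
    matches = sum(1 for c, y in zip(cluster_labels, class_labels) if c == y)
    return matches / len(class_labels)

# Placeholder labels reproducing 215 matching records out of 330.
clusters = ["sick"] * 215 + ["healthy"] * 115
classes = ["sick"] * 330
share = agreement(clusters, classes)
```

Because k-means assigns arbitrary cluster numbers, the mapping from cluster identifier to class name has to be chosen before the agreement is computed, for example by taking the mapping that gives the higher agreement.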

Discussion
The purpose of this research was to find a relationship between the intensity of physical activity and the presence of type 1 diabetes among children and to develop a non-invasive method of type 1 diabetes detection. Assessment of the physical activity was based on ActiGraph activity monitor measurements. ActiGraph measurements have also been used in previously published health-related research [17-20].
Decision tree forests, as well as other computational intelligence methods, have been applied for the detection of different diseases, e.g., breast cancer and heart disease [50,51]. Application of decision tree forests, which included five parameters connected with the intensity of physical activity, enabled the prediction of type 1 diabetes presence among children and adolescents between the ages of six and 18 with a high accuracy of 86.09% and specificity of 84.35%. The PNN also showed a high accuracy of 84.35%. Our results were comparable to those of similar articles, in which computational intelligence methods were used for predicting diabetes presence. For example, an SVM algorithm using the RBF kernel, the same as was used in this paper, was able to predict the presence of elevated blood glucose level via electrochemical measurement of saliva with approximately 85% accuracy [52].
As the final result of the study, it was concluded that if the number of steps is lower than around 61,000 a week, it is likely that the child is suffering from type 1 diabetes. After dividing this by the seven days of a week, we obtained the average number of steps per day, which was around 9000 (60,837 / 7 ≈ 8691), but it should be noted that gender and age were not included in the calculation of this result. The updated international literature indicates that we can expect, among children, boys to average 12,000-16,000 steps/day, girls to average 10,000-13,000 steps/day, and adolescents to reach approximately 8000-9000 steps/day [53]. Thus, the obtained result was consistent with the normative international literature.
Decreased physical activity of ill children compared to healthy peers was the result of the disease. Many children also complained that it was difficult for them to walk a distance of more than 100 m, to run, play sports, exercise, lift something heavy, or take a bath or shower by themselves, and that they felt pain and were tired.
The results of the research are promising and encourage developing a mobile application for type 1 diabetes diagnosis dedicated to children and adolescents. Although the popularity of mobile phone applications for various health disorders has reached about 30%, it should be taken into account that young people are more likely to use new mobile phone applications and are more effective at using them, and the popularity and potential acceptance of mobile health solutions are increasing [54].