An Intelligent Diabetic Patient Tracking System Based on Machine Learning for E-Health Applications

Background: Continuous monitoring helps people with diabetes live better lives. A wide range of technologies, including the Internet of Things (IoT), modern communications, and artificial intelligence (AI), can assist in lowering the expense of health services. Advances in communication systems now make it possible to provide customized and remote healthcare. Main problem: Healthcare data grows daily, making storage and processing challenging. We provide intelligent healthcare structures for smart e-health apps to solve the aforesaid problem. The 5G network must offer advanced healthcare services to meet important requirements like large bandwidth and high energy efficiency. Methodology: This research suggested an intelligent system for diabetic patient tracking based on machine learning (ML). The architectural components comprise smartphones, sensors, and smart devices that gather body measurements. The preprocessed data is then normalized using a normalization procedure. To extract features, we use linear discriminant analysis (LDA). To establish a diagnosis, the intelligent system conducts data classification utilizing the suggested advanced-spatial-vector-based Random Forest (ASV-RF) in conjunction with particle swarm optimization (PSO). Results: Compared to other techniques, the simulation's outcomes demonstrate that the suggested approach offers greater accuracy.


Introduction
The healthcare industry is constantly growing and thus provides a wide range of research challenges in the field of computer science. Advances in information and communication technology (ICT), sensors, Big Data analysis, machine learning (ML), and artificial intelligence (AI) can all be employed to meet these challenges. For instance, users of IoT-enabled signal surveillance systems can forecast health-related conditions including heart attacks and chronic fevers. Consequently, this facilitates elder care, as well as senior assistance, wellness, and preventive measures [1]. Providing dependable assistance when necessary and decreasing the patient travel burden can improve the quality of care. The primary purpose of new technology is to continuously monitor patients with prolonged diseases, whose prevalence has grown in recent decades [2]. IoT technology, therefore, offers novel options for diabetic patients. The suggested IoT-based healthcare system combines the ML technique with an advanced sensor system to gather crucial human physiological signals. The gathered signal is sent across wireless media to a centralized cloud processor for processing and visualization.

• An intelligent method for monitoring diabetes patients that uses ML.
• The architectural components include smart gadgets, sensors, and cellphones, all used to obtain body measurements.
• The normalization approach is then used to normalize the pre-processed data.
• Linear discriminant analysis (LDA) is used to extract features.
• The intelligent system performs data categorization using the proposed advanced-spatial-vector-based Random Forest (ASV-RF) together with particle swarm optimization (PSO) to generate a diagnosis.

Literature Survey
In this section, a wide range of studies on the health monitoring system for diabetes patients are presented.
"An adaptive and predictive context-aware monitoring system" was presented by the authors of [8] as a solution to the problems of continuous monitoring, a shortage of abnormality-diagnosis methods, and prediction approaches needing lengthy training periods. An outline of current technology trends for developing a system for HIoT data protection is given in [9], and the accompanying security issues are then examined. Additionally, they offer a structural platform for tracking the health indicators of patients with disabilities or chronic degenerative diseases; various use-case scenarios illustrate how application components interact with one another. One of the most common medical problems in daily life is the diagnosis and prognosis of diabetes, which contributes over time to the body's long-term micro-vessel complications. A systematic experimental test was conducted in [10] utilizing a variety of machine learning classifiers to estimate the prevalence of type-A diabetes in people. The authors of [11] presented a "smart healthcare recommendation system for multidisciplinary diabetes disease patients (SHRS-M3DP)" model to predict the disease quickly and accurately in patients. However, a better generalized, efficient diagnosis and suggestion approach for various human illnesses still must be developed.
A comprehensive overview of pervasive, intelligent, and networked health services for tracking individuals with chronic and lifestyle illnesses was presented in [12]. The intelligent patient tracking and management architecture employs deep learning (DL) and cloud-based analyses. Another approach was presented in [13], where a support vector machine (SVM) was used to forecast diabetes probability. Only samples of females of Pima Indian heritage were used in the database, which introduced bias based on ethnicity and gender; even though these variables were not chosen as characteristics in the feature selection process when training the system, the underlying relationship may reduce generalizability. Consequently, this approach was only evaluated with two women helpers and one clinical examiner, and the outcomes were as anticipated.
The study of [14] focuses on real-time data for improved prediction and accuracy utilizing ML and IoT. The suggested hardware and software system aids patients in early cardiac disease prediction. Novel medical treatments are contrasted in [15] according to the actual demands and difficulties that elders and their caretakers face. A systematic review of methods for diabetes mellitus identification, detection, and self-management is conducted in [16]; although it included the main research contributions in the field from 2015 to 2020, it misses pertinent contributions from the following years until now.
The authors of [17] presented a brand-new health-monitoring mechanism to record the disease burden by forecasting illnesses based on primary data gathered from individuals in remote locations. Additionally, they suggest a safe data storage architecture for protecting patient data in cloud systems. Here, they provide two brand-new cryptographic techniques for encrypting and decrypting data. To illustrate and assess the software system's efficacy in managing diabetes, the paper [18] outlines the design and development of a programming system to enhance treatment adherence using an ML technique. The numerous aspects that have an impact on diabetes patients' health are addressed by the suggested method for this control system [21]. The current system records the user's walking activity and stores the route, but it does not link the recorded route to the number of calories burned. The author of [22] presents a literature review of the work done, focusing on the benefits of merging telemedicine with AI. These advantages present endless growth opportunities. The article also examines how AI and telemedicine have been utilized to enhance continuous monitoring and the challenges these methods are intended to solve. Using the patient's glucose and blood pressure values, the author of [23] aims to forecast their hypertension and diabetes state. Supervised machine learning classification methods are used: a system is trained to forecast the patient's blood pressure and diabetes state. The support vector machine classification method was deemed the most accurate after evaluating all the classification algorithms and was therefore selected to train the model. The study [24] provides a model that uses light transmission to calculate the amount of glucose in the body. As Li-Fi technology is both quicker and more efficient than conventional Wi-Fi networks, it is the one that is employed.
According to the author of [25], IoT and AI are employed to investigate the healthcare industries in order to enhance patient support and patient care going forward. The precision of the patient support process is decreased by the inability of traditional healthcare aid systems to anticipate precise patient health information and demands. A patient's personal details, such as their medical reports, temperature, fitness tracker, body mass, health activities, and other healthcare information, are properly predicted using an IoT sensor with AI, which aids in choosing the best help procedure. The study [26] introduced the grey filter Bayesian convolutional neural network (GFB-CNN), a real-time data- and deep-neural-network-driven IoT smart healthcare strategy. They proposed a GFB-CNN-based, AI-driven Internet of Things (IoT) eHealth architecture to enhance precision and efficiency across the essential quality-of-service parameters. The article [27] begins with a discussion of the technologies involved in the design of 5G e-health systems from the physical layer, the application layer, and the cross-layer viewpoint. Table 1 depicts the literature survey.

Methodology
The preprocessing, feature extraction, and classification steps shown in Figure 1 are the basic processes in this section's construction of an ML-based classifier.

Dataset
The records of 62 people with diabetes (44 males and 18 females), who endured 67 days of examinations on average, were incorporated into the dataset for this investigation [19]. The glucose concentration set consists of 12,612 glucose concentration datapoints and 5 characteristics.

Preprocessing
The key aspect before using ML algorithms is data preprocessing. Because actual data are frequently noisy, insufficient, and unreliable, they cannot be used immediately in the prediction step. To adequately describe the evidence for the diagnosis of diabetes disease, a preprocessing stage is used.
The diabetes disease datapoint D_d has a variety of characteristics; each characteristic has a unique range of numeric values, which makes processing more challenging. As a result, a normalization step is employed to map the datapoint D_d into the range from 0 to 1 and reduce the numerical burden of the diabetes disease prognosis computation. Data normalization can be achieved using a variety of techniques; a min-max normalization method is applied in the suggested system. Using Equation (1), this approach maps a quantitative score from the given dataset into D_nor with range [0, 1]:

D_nor = ((D_d − min(D_d)) / (max(D_d) − min(D_d))) × (ne_max − ne_min) + ne_min    (1)

Here, ne_max = 1 and ne_min = 0 are used. Using this approach, the values of all the characteristics fall inside the range from 0 to 1.
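The min-max mapping of Equation (1) can be sketched in a few lines of Python; the helper name `min_max_normalize` is ours, not the paper's, and the sample values are illustrative only:

```python
import numpy as np

def min_max_normalize(D, ne_min=0.0, ne_max=1.0):
    """Rescale each feature column of D into [ne_min, ne_max] (Equation (1))."""
    D = np.asarray(D, dtype=float)
    col_min = D.min(axis=0)
    col_max = D.max(axis=0)
    # Guard against constant columns to avoid division by zero.
    span = np.where(col_max > col_min, col_max - col_min, 1.0)
    return (D - col_min) / span * (ne_max - ne_min) + ne_min

# Example: two readings on very different numeric scales.
raw = np.array([[80.0, 1.2], [120.0, 3.4], [200.0, 2.0]])
norm = min_max_normalize(raw)
```

After normalization, every column lies in [0, 1], so no single characteristic dominates the later distance or scatter computations.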

Feature Extraction Using LDA
One of the main challenges in ML is the extraction of features, which is a crucial step. By combining the previous dimensions, feature extraction develops new dimensions.
Linear discriminant analysis (LDA) is a form of class-based discrimination. This method helps supervised learning discover a collection of basis vectors, denoted w_k. The w_k vectors are chosen so that the ratio of the between-class to the within-class scatter of the training instance set is maximized. The following generalized eigenvalue problem is solved to discover the w_k basis vectors:

S_C w_k = ψ_k S_V w_k,  1 ≤ k ≤ L

with

S_C = Σ_{k=1}^{a} M_k (μ_k − μ)(μ_k − μ)^T,  S_V = Σ_{k=1}^{a} Σ_{x ∈ X_k} (x − μ_k)(x − μ_k)^T

Here, L is the subspace dimension, S_C the between-class and S_V the within-class scatter matrix, a the number of classes, x ∈ R^N a sample, X_k the sample set of class k, M_k the number of samples in class k, μ_k the mean of class k, and μ the overall mean.
The eigenvectors belonging to the first L greatest eigenvalues {ψ_k | 1 ≤ k ≤ L} are the desired basis vectors w_k, provided S_V is not singular. Because the LDA basis vectors are orthogonal to one another, a sample may be projected into the LDA subspace with the simple linear map W^T x to derive its representation.
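The generalized eigenproblem above can be sketched directly with NumPy; the function name `lda_basis` is ours, and the sketch assumes S_V is invertible, as the text requires:

```python
import numpy as np

def lda_basis(X, y, L):
    """Top-L LDA basis vectors w_k maximizing the ratio of
    between-class (S_C) to within-class (S_V) scatter."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    mu = X.mean(axis=0)
    d = X.shape[1]
    S_C = np.zeros((d, d))  # between-class scatter
    S_V = np.zeros((d, d))  # within-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)
        S_C += len(Xc) * np.outer(mu_c - mu, mu_c - mu)
        S_V += (Xc - mu_c).T @ (Xc - mu_c)
    # Generalized eigenproblem S_C w = psi S_V w, assuming S_V invertible.
    vals, vecs = np.linalg.eig(np.linalg.solve(S_V, S_C))
    order = np.argsort(vals.real)[::-1]
    return vecs.real[:, order[:L]]  # columns are the basis vectors W

# Projection into the LDA subspace: X @ W  (i.e., W^T x per sample)
```

For two well-separated classes, projecting onto even a single basis vector keeps the class means clearly apart, which is exactly the scatter ratio the method maximizes.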

Classification Using ASV-RF Algorithm
Generally speaking, Random Forest (RF) significantly outperforms the single tree classifier. It offers an effective method for categorizing sets of sparse data. However, because basic RF chooses features at random, it can easily choose irrelevant or distracting characteristics, mainly when the training data is noisy. This can produce subpar categorization outcomes. As explained in earlier sections, the data matrix for type categorization has numerous missing values, which adds noise. The basic RF must therefore be improved before it can be used for this categorization task.
Because of the shallow feature space, there are many missing values in the training dataset. Many missing data points cause characteristics to lose importance or possibly become noisy. An unreliable classification tree will emerge from randomized feature selection for bootstrap samples, which may yield many irrelevant or noisy features. Creating a feature weighting method for building a high-quality classifier therefore seems attractive. By using a weighting system during feature selection, as opposed to random selection, we extended the basic RF. The weighting metric is the chi-squared statistic, represented as Equation (5):

χ² = Σ_i Σ_j (O_ij − E_ij)² / E_ij    (5)

Here, O_ij is the observed value, which denotes the count of a joint incident (feature value i occurring together with class j), and E_ij is the corresponding expected count. In this way, a weight is calculated for every characteristic in the feature space, and only the characteristics with high weights are considered to construct the decision tree (DT).
We constructed a collection of decision trees and then combined the output of each classifier using a likelihood estimation method. Suppose that the input case x is the testing case and that every classification model (DT) h_j [j = 1 ... k] votes for a potential target class c_i. Every classifier's output can be estimated as a class probability P_j(c_i). The final categorization result is then calculated by summing the probability values as Equation (7):

P(x ∈ c_i) = Σ_{j=1}^{k} P_j(c_i)    (7)

The input vector x belongs to c_i if and only if c_i has the highest likelihood. Algorithm 1 is a representation of the ASV-RF. Assuming that there are n features, [β·n] features are chosen for the training set in stage 3, where β represents the feature selection frequency. Stage 4 learns separate classifiers from the supplied training data. The bootstrapping approach is used to choose training data. Sampling with replacement is employed to choose t features from n' features (here, t = [log2 n' + 1]). Equation (7) is employed to categorize the unlabeled cases, and after every round, the trained DT is added to the forest M*.
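Algorithm 1 itself is not reproduced here, so the following is only a sketch of the two ideas the text does describe: chi-squared feature weighting (Equation (5)) replacing uniform feature sampling, and probability-summed voting (Equation (7)). The helper name `asv_rf_fit_predict` and the default parameters are ours, and scikit-learn is assumed for the chi-squared statistic and the base trees:

```python
import numpy as np
from sklearn.feature_selection import chi2
from sklearn.tree import DecisionTreeClassifier

def asv_rf_fit_predict(X_train, y_train, X_test, n_trees=25, seed=0):
    """Chi-squared-weighted forest sketch: features are drawn for each
    tree in proportion to their chi-squared weight rather than
    uniformly, and per-tree class probabilities are summed."""
    rng = np.random.default_rng(seed)
    n, d = X_train.shape
    # Equation (5) weights (chi2 requires non-negative features).
    weights, _ = chi2(X_train, y_train)
    p = weights / weights.sum()
    t_feats = max(1, int(np.log2(d)) + 1)  # t = [log2 n' + 1]
    trees, feat_sets = [], []
    for _ in range(n_trees):
        rows = rng.integers(0, n, size=n)              # bootstrap rows
        feats = rng.choice(d, size=t_feats, replace=False, p=p)
        tree = DecisionTreeClassifier(random_state=0)
        tree.fit(X_train[np.ix_(rows, feats)], y_train[rows])
        trees.append(tree)
        feat_sets.append(feats)
    # Equation (7): sum class probabilities across trees, take the argmax.
    proba = sum(t.predict_proba(X_test[:, f]) for t, f in zip(trees, feat_sets))
    return trees[0].classes_[np.argmax(proba, axis=1)]
```

Weighted sampling biases each tree toward informative characteristics, which is the paper's stated remedy for noisy, missing-value-heavy feature spaces.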

PSO
Fish schooling and bird flocking are the sources of inspiration for particle swarm optimization (PSO). A population of particles is created whose current positions evaluate the cost function that must be minimized to reach the optimal result in a multidimensional space. After every iteration, the velocity and position of each particle are updated based on a weighted combination of its current velocity, its distance from its own best position found so far during the search, and its distance from the foremost particle, i.e., the particle that has generated the best outcome so far.
In a multidimensional solution space, a particle's position and velocity are represented by the variables x and v. The d × 1 vectors x_i = (x_i1, x_i2, ..., x_id) and v_i = (v_i1, v_i2, ..., v_id) represent the position and velocity of particle i in d-dimensional space, respectively. For each particle i, the best location found so far is noted as another d × 1 vector (pbest_i1, pbest_i2, ..., pbest_id). The best global particle among all particles is represented as gbest, and its position in the dth dimension is gbest_d. Based on the performance of the kth iteration, the velocity and position updating formulas for particle i in the dth dimension in the (k + 1)th iteration are:

v_id^(k+1) = w · v_id^(k) + c_1 r_1 (pbest_id − x_id^(k)) + c_2 r_2 (gbest_d − x_id^(k))
x_id^(k+1) = x_id^(k) + v_id^(k+1)

Here, D represents the dimension of the multi-dimensional search problem, N_P represents the size of the population, and r_1, r_2 are uniform random numbers in [0, 1]. c_1 and c_2 are acceleration constants that give proportional random weight to the deviation from the particle's own best performance and from the best collective performance in the dth dimension. To balance global and local exploration well, the proposed system applies PSO with an adjustable inertia weight, w, during the whole search procedure. The inertia weight w is determined in this study as:

w = w_max − (w_max − w_min) · iter / iter_max

Here, iter_max is the maximum number of iterations, and iter is the present iteration number. We begin with a large value w_max so that we may run an aggressive global search for a possibly good solution, and progressively lower w so that we can fine-tune the search locally as we grow closer to the minimal point.
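The update rules and the linearly decreasing inertia schedule can be sketched as a minimal PSO loop; the function name `pso_minimize`, the search bounds, and the velocity clamp are our assumptions, not details from the paper:

```python
import numpy as np

def pso_minimize(f, dim, n_particles=30, iters=200,
                 c1=2.0, c2=2.0, w_max=0.9, w_min=0.4, seed=0):
    """Minimal PSO with the linearly decreasing inertia weight
    w = w_max - (w_max - w_min) * iter / iter_max."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5.0, 5.0, (n_particles, dim))  # positions
    v = np.zeros_like(x)                            # velocities
    pbest = x.copy()
    pbest_f = np.array([f(p) for p in x])
    gbest = pbest[pbest_f.argmin()].copy()
    for it in range(iters):
        w = w_max - (w_max - w_min) * it / iters    # inertia schedule
        r1 = rng.random(x.shape)
        r2 = rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        v = np.clip(v, -4.0, 4.0)                   # velocity limit
        x = x + v
        fx = np.array([f(p) for p in x])
        improved = fx < pbest_f
        pbest[improved] = x[improved]
        pbest_f[improved] = fx[improved]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest, float(pbest_f.min())

# Example: minimize the sphere function f(x) = sum(x_i^2).
best_x, best_f = pso_minimize(lambda p: float(np.sum(p ** 2)), dim=3)
```

The large initial w drives global exploration, and the decay toward w_min narrows the search around the best point found, exactly the balance the text describes.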

Experimental Findings
Here, we test our proposed ASV-RF method in the diabetic patient monitoring system. This experiment is carried out using Python, and the collected data samples are used to perform the tests. Our proposed algorithm is also compared with existing algorithms (sequential minimal optimization (SMO) [20], SVM [21], and DT [21]) to show that our method achieves the maximum performance in terms of the Accuracy, Sensitivity, Precision, Recall, Specificity, F1-score, TP, FP, Kappa, MAE, and RMSE metrics. This research aims to assess the employed techniques' efficacy and suggest the most effective algorithm for prediction. We assess the prediction outcomes using a variety of evaluation metrics via the confusion matrix. Figure 2 depicts the representation of the confusion matrix.
Figure 3 represents the results of properly and improperly classified data and the training time for both the existing and proposed algorithms. Training time is the time taken by an algorithm to train on the dataset: SMO takes 0.032 s, SVM 0.027 s, DT 0.051 s, and ASV-RF 0.019 s. It is therefore evident that the suggested approach trains on the dataset faster than the existing methods. Additionally, our proposed method correctly classifies more data than the existing methods.
That means the proposed method's improper classification rate is much lower than the rates of the existing methods. Similarly, Table 3 depicts the comparative results of the proposed and existing methods across the various metrics. The percentage of samples for which the suggested method correctly predicted the outcome is presented as the system's effectiveness. The accuracy, calculated using Equation (11), is a measure of how many samples are correctly categorized; it determines the degree of similarity between the final results and the input data. The graph demonstrates that the new technique is more accurate than the old ones.

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (11)
One of the most crucial metrics is precision, calculated as the proportion of properly classified positive cases to all instances of predicted-positive data, as shown in Equation (12):

Precision = TP / (TP + FP)    (12)

It measures the precision of the recommended procedure by comparing the number of actual successes with the number of expected successes. The performance of the suggested technique is evaluated by distinguishing between true and false positives.
The ability of the suggested model to recognize each significant sample in a data collection is known as sensitivity. It is determined statistically by dividing the number of TPs by the total of TPs and FNs (Equation (13)):

Sensitivity = TP / (TP + FN)    (13)
Figures 4 and 5 depict the comparative assessments of various metrics for the proposed and existing methods. The proposed model's recall is its capacity to identify every important sample in a data collection. It is defined statistically as the proportion of TPs divided by the summation of TPs and FNs (Equation (14)):

Recall = TP / (TP + FN)    (14)

The f1-score incorporates precision and recall into a single factor by calculating their harmonic mean (Equation (15)):

F1-score = 2 × (Precision × Recall) / (Precision + Recall)    (15)

Specificity is the likelihood of a negative outcome under the premise that the result is, in fact, negative. This probability is often referred to as the true negative rate.
The proportion between the number of TNs and the total number of TNs and FPs is referred to as specificity (Equation (16)):

Specificity = TN / (TN + FP)    (16)
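Equations (11)-(16) can be computed together from the four confusion-matrix counts; the helper name `classification_metrics` and the example counts below are ours, for illustration only:

```python
def classification_metrics(tp, tn, fp, fn):
    """Metrics from confusion-matrix counts (Equations (11)-(16))."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)        # Equation (11)
    precision = tp / (tp + fp)                        # Equation (12)
    sensitivity = tp / (tp + fn)                      # Equations (13)/(14): same as recall
    specificity = tn / (tn + fp)                      # Equation (16)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)  # Equation (15)
    return {"accuracy": accuracy, "precision": precision,
            "sensitivity": sensitivity, "specificity": specificity, "f1": f1}

# Example with hypothetical counts: 90 TP, 85 TN, 5 FP, 10 FN.
m = classification_metrics(90, 85, 5, 10)
```

Note that sensitivity and recall are the same quantity; the paper reports them separately only because both appear in its comparison tables.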
Figure 6 depicts the outcomes of the Kappa, MAE, and RMSE metrics for both the proposed and existing methods. Kappa is a measure that contrasts the actual accuracy with the predicted accuracy: SMO reaches 91.92%, SVM 95.06%, DT 94.03%, and ASV-RF 98.52%. MAE estimates the average magnitude of the errors in a set of forecasts; all individual differences are equally weighted in the testing sample's mean of the absolute disparities between predicted and observed values. In Table 4, SMO has 3.48%, SVM has 1.19%, DT has 2.08%, and ASV-RF has 1.01% in terms of MAE. RMSE is a metric used to evaluate the reliability of forecasts: SMO has 13.54%, SVM has 8.16%, DT has 9.08%, and ASV-RF has 7.25%. As shown, the suggested ASV-RF technique performs better than the other methods on these measures.

Providing accurate patient information to the hospital to protect the patient's life is referred to as "security of life"; a failure to comply might threaten the patient's health, since people with malicious intentions might misuse the devices to transmit inaccurate data to the hospital. Figure 7 depicts the outcomes for security of life. It is observed that SMO has 85%, SVM has 91%, DT has 73%, and ASV-RF has 91% in terms of security of life. This chart demonstrates that the suggested ASV-RF approach has a high value.

Conclusions
E-health trackers keep track of a person's actions and offer useful feedback, especially when dangerous circumstances arise. This paper presented the ASV-RF method for smart patient monitoring. With the help of this technique, it was possible to assess a person's dependencies, forecast their future health status, and foresee its decline before potential consequences. The normalization approach was used to normalize the raw dataset for further patient-monitoring processes. For feature extraction, the LDA method was employed. The study's findings revealed that the suggested strategy outperformed other current approaches in terms of the accuracy (99.86%), sensitivity (99.13%), precision (99.61%), recall (99.97%), specificity (98.97%), f1-score (98.89%), TP (99.8%), FP (0.5%), Kappa (98.52%), MAE (1.01%), and RMSE (7.25%) metrics. The proposed framework can be extended with various large datasets in the future. Researchers in this healthcare field will benefit from academic study and methods, particularly from computerized forecasting and virtual assistants for human disorders. We plan to regularly gather user input during future development and feature enhancements. This will keep the application focused on the patient, allowing us to consider users' demands while refining current features and building new ones. Last but not least, protecting the privacy of our consumers must always be our first concern, since the application could present openings for a data breach or leak.