A Hybrid Deep Learning Model Using CNN and K-Means Clustering for Energy-Efficient Modelling in Mobile Edge IoT

Abstract: In mobile edge computing (MEC), it is difficult to identify an optimal solution that can operate within a limited energy budget by selecting the best communication path and components. This research proposes a hybrid model for an energy-efficient cluster formation and head selection (E-CFSA) algorithm based on convolutional neural networks (CNNs) and a modified k-means clustering (MKM) method for MEC. We utilise a CNN to determine the best transfer strategy and the most efficient partitioning of a specific task. The MKM method allows more than one cluster head in each cluster and reduces the number of reclustering cycles, which helps to overcome the energy consumption and delay incurred during the reclustering process. The proposed model determines a training dataset that covers all aspects of the cost-function calculation. This training dataset is used to train the model, which allows efficient decision-making for optimum energy usage. In MEC, clusters are dynamic in nature and frequently change their location. Sometimes this makes it difficult for a cluster to elect a cluster head and, finally, the cluster is abandoned. The selected cluster heads must be recognised correctly and applied to maintain and supervise the clusters; the proposed pairing of the modified k-means method with a CNN fulfils this objective. The proposed method and the existing weighted clustering algorithm (WCA) and agent-based secure enhanced performance approach (AB-SEP) are tested on the network dataset. The findings of our experiments demonstrate that the proposed hybrid model is promising in terms of client device (CD) energy consumption, overhead, packet loss rate, packet delivery ratio, and throughput compared to the existing approaches.


Introduction
MEC is an intelligent technique. Smartly examining partial computational offloading can decrease the energy usage of client devices (CDs) and the quality-of-service delay. MEC still faces challenges such as data privacy and security, deployment protocol selection, energy consumption, and task scheduling. In past decades, researchers have given great attention to the expansion of IoT and mobile devices and the substantial demand from sensitive application areas, e.g., speech recognition, virtual reality, immersive gaming, Google Glass, video processing, and object recognition [8].
Appliances with limited resources can encounter poor reliability and a terrible consumer experience. Users mainly utilise AI-based automation systems and MEC techniques to deal with such issues. With data security methods, these models can work on a limited battery capacity to make precise forecasts and judgements. In MEC and IoT communication, a massive number of overloaded links can be the cause of a bottleneck; deep learning and machine learning methodologies are vital in dealing with such issues [9]. In MEC systems, energy consumption is the most significant issue.
The primary goal of this research is to deal with the energy issues in cluster head selection and cluster formation by establishing the most effective decision policies for MEC. To achieve this, the research proposes a hybrid deep learning model based on a CNN and the modified k-means clustering (MKM) method for MEC, called the "energy-efficient cluster formation and head selection (E-CFSA) algorithm". The proposed hybrid model considers the partitioning via a partial offloading method, which determines the cost of each potential partitioning and offloading strategy and then chooses the optimum.
In the proposed system, the cluster head selection and cluster formation process involves a rotational method to address the self-organisation and high-availability characteristics. A master cluster head block directs a successful team to transmit the information to the base station efficiently. The proposed method also utilises the modified k-means clustering method (MKM) to select balanced cluster heads and to choose more than one CH in a cluster to lead the group. The CH selection depends on the length and timeliness of the cluster nodes and allows more than one CH to serve in the group. This helps to reduce the reclustering cycles and achieves better energy and time results. Experimental analyses were performed on the proposed and existing methods, i.e., WCA and AB-SEP. This research also allows IoT and mobile devices to jointly discover pooled forecasting.
The article is organised as follows: Section 2 discusses the literature on energy-efficient cluster head selection for MEC. Section 3 discusses the materials and methods and covers the working of the proposed hybrid model. Section 4 covers the simulation results, discussion, analysis, and comparison. Section 5 discusses the conclusion and future work.

Literature Review
A natural energy optimisation problem is discussed in [10]. This research includes a performance review of energy utilisation, a transformation model for energy conversion, and improvements to the MEC system's power quality. The proposed techniques deal with energy challenges using collaborative block descent and fuzzy linear programming concepts. To date, significant research initiatives have been dedicated to constructing offloading schemes for MEC networks.
Deep learning (DL) techniques need a lot of processing and memory to store training data and large training models. A novel deep learning model is developed in [11]; the proposed model performs better on edge devices by implementing shallow features on IoT equipment with limited processing capacity. Another model is discussed in [12] for cases where only precise decisions must be considered. The proposed deep learning model speeds up the knowledge acquisition and convolution layers of edge devices and also decreases the width of the components for multiclass classification [13]. An early disappearance of features in the MEC system with limited learning outcomes is always challenging [14]; that research proposed an enhanced CNN model using a modified layering technique for edge devices.
To predict the optimum computational and storage transfer techniques, a three-component model is discussed in [15]. The proposed model utilises state variables, decision behaviour, and scheme utility. The state variables help to split the MEC process into static and dynamic transfer. It also utilises intelligent terminals, "I-Devices", which are more suitable for network infrastructure programs and services [16].
MEC systems have higher processing power and deal with large amounts of data processing. Subsequently, article [17] discussed a deep reinforcement method for adaptive MEC; the proposed model maximises the MEC server (MES) data sampling and enhances the transfer rate. A comprehensive replica learning strategy to reduce mobile device (MD) service delays is discussed in [18]; the cost function of the proposed model deals with the service process and offers better communication and fewer service delays. Applying practical deep-learning-based computing with a limited training dataset to MEC systems is difficult. The proposed model also suggested a high-rate binary transfer process for dynamic MEC networks [19]. This research deals with energy uncertainty issues by using a CNN; to reduce the weighted value of energy usage and latency, the proposed model uses a binary task allocation method based on time-varying workflows and different analysis features.
An energy-efficient method based on a deep learning technique was introduced in [20]. This research suggested a gradient-based deterministic strategy in MEC systems to solve the global optimisation problem. The article assesses a dense decentralised cellular network with multiple users, servers, and activities. The authors also considered MES and CD mobility when designing a region-parallel task-transfer model that enables reliable communication for low-latency application areas. A comprehensive investigation of the emerging multimedia IoT is described in [21]; it also promotes several novel applications that enhance the overall quality of life by linking all the connected devices via emerging technological solutions [22].
The primary objective of the model proposed in [23] was to emphasise an outline of MEC and its significant applications. MEC introduces edge devices between node sources and the cloud to extend cloud services. This research also deals with energy issues in MEC by enhancing capabilities and optimising the load [24].
Table 1 presents a comparative review of numerous existing studies based on various parameters in the field of MEC energy consumption.

Materials and Methods
This section describes the features of the proposed and existing methods and covers the essential parameters and dataset specifications.

Proposed Hybrid Model
We propose an energy-efficient model, E-CFSA, for MEC in this research. The proposed hybrid model utilises a CNN and modified k-means clustering (MKM). Figure 1 shows the architecture of the proposed model.
The proposed model includes several functions, which correspond to different phases:
(A) Network initialisation: responsible for network variable declaration and initialisation; it splits the network into small subgroups.
(B) Evolution of nodes: responsible for node selection; this phase calculates the trust among nodes and utilises a modified version of k-means clustering to create the best-fit clusters.
(C) Cluster head selection: this phase utilises the CNN method for the best cluster head selection to save energy.
(D) Data transmission: the last phase of the proposed model, responsible for data transmission.
A detailed description of these phases is covered in the following subsections.

CNN Model
The CNN architecture is presented in Figure 2. In the proposed hybrid model, the CNN uses three convolutional layers and a similar number of fully connected layers. In addition to the output layer, each layer in the CNN model of the proposed system is followed by a ReLU (rectified linear unit) activation function rather than the sigmoid. The ReLU utilises the step function np.heaviside(x, 1).
The sigmoid function squashes the output response to a value close to 0 or 1 [25]. Through block-by-block inspection, the CNN gathers the related data for Tw and Tc and concentrates on regional content. We consider a responsive MEC task sequence of events in which the task weight Tw is changeable and the task workload Td can be modified individually to reduce energy consumption. In the CNN model, we use batch normalisation (BN) to resolve internal covariate shift in the feature maps, which avoids overfitting the current model by correcting gradient movement and enhancing network generalisation. Table 2 shows the parameters of the CNN model.

In the proposed hybrid model, the CNN utilises forward-pass and backward-pass phases for weight training, which further helps cluster head formation.
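To make the layer stack concrete, the sketch below shows one plausible Keras realisation of such a scorer. The layer widths, kernel sizes, and optimiser are illustrative assumptions, not values taken from the paper; only the overall shape (three convolutional blocks with batch normalisation and ReLU, fully connected layers, and a sigmoid output producing a node rating between 0 and 1) follows the description above.

```python
# Hypothetical sketch of the CNN scorer described above; layer sizes are assumptions.
import tensorflow as tf

def build_cnn_scorer(num_features: int = 11) -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(num_features, 1)),        # tabular node features as a 1-D sequence
        tf.keras.layers.Conv1D(32, 3, padding="same"),
        tf.keras.layers.BatchNormalization(),            # mitigates covariate shift, as in Section 3
        tf.keras.layers.ReLU(),
        tf.keras.layers.Conv1D(64, 3, padding="same"),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.ReLU(),
        tf.keras.layers.Conv1D(64, 3, padding="same"),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.ReLU(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),   # node-reliability score in [0, 1]
    ])
    model.compile(optimizer="adam", loss="mse")           # mean squared error, as described later
    return model
```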

Forward Pass
The CNN model predicts all the possible outcomes after receiving the input data (x_i) and the training weights at the input layer. Equation (1) shows how to determine the net input; the net input (NT_input-data) depends on the weight parameters (w_ij). Equations (2) and (3) show the net input calculation for Layers 1 and 2, where x represents the input data and w represents the weights.
The inputs are squashed by applying the logistic function, which generates the new outputs represented in Equations (4) and (5).
The outputs from the neurons in the hidden layers are used as new input variables in a subsequent iteration of this procedure for the next CNN layer.
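As an illustration of Equations (1)-(5), the short NumPy sketch below runs one forward pass through a two-layer toy network; the weight and input values are arbitrary examples, not taken from the paper.

```python
# Minimal sketch of the forward pass: net input followed by logistic squashing.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.05, 0.10])                 # input data x_i (illustrative)
W1 = np.array([[0.15, 0.20],               # weights w_ij, input -> hidden layer
               [0.25, 0.30]])
W2 = np.array([[0.40, 0.45],               # weights, hidden -> output layer
               [0.50, 0.55]])

net_hidden = W1 @ x                        # Equation (1): net input = sum_i w_ij * x_i
out_hidden = sigmoid(net_hidden)           # Equations (4)-(5): squash with the logistic function
net_output = W2 @ out_hidden
out_output = sigmoid(net_output)           # outputs fed to the error computation
```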

Determining the Total Error
A mean squared error function is applied to determine the error for each output variable, and based on these results a total cumulative error is measured. Equation (6) gives the mean squared error formula [26].
The sum of the measured errors is represented in Equation (7); the total error for this CNN is En_total = En_output1 + En_output2, where the values of En_output1 and En_output2 are calculated in Equation (8).
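A toy illustration of Equations (6)-(8) is sketched below; the output and target values are arbitrary, and the one-half factor is a common convention that the paper does not state explicitly.

```python
# Per-output squared error and the summed total error.
import numpy as np

out_output = np.array([0.75, 0.77])         # forward-pass outputs (illustrative)
target = np.array([0.01, 0.99])             # desired target values (illustrative)

errors = 0.5 * (target - out_output) ** 2   # En_output1, En_output2  (Equation (6))
En_total = errors.sum()                     # Equation (7): En_total = En_output1 + En_output2
print(En_total)
```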

Backward Pass
The backpropagation (BP) method maintains the network weight information and ensures the total performance. The backpropagation method minimises the error rate across each output unit and the network if the absolute error exceeds the desired value. The BP algorithm takes the derivative of En_total with respect to the specified weights [27].
Consider the trained weights (W1, W2, W3, and W4). Using the convolutional layers from Equation (9) as the point of application of the chain rule, the error gradient for a weight is

∂En_total/∂w_n = (∂En_total/∂out_output1) · (∂out_output1/∂net_input1) · (∂net_input1/∂w_n)

After repeating the above equation for weight w_10, the hidden layer is described in Equation (10). After repeating a similar process for the additional trained weights at the hidden layer, where n = (1, 2, 3, ..., 8), the total error change for the output and the output's variation with respect to its cumulative estimated input are obtained: partially differentiating Equation (14) gives ∂out_output1, and partially differentiating Equation (15) with respect to ∂net_input1 gives the variation of the network weights for the cumulative estimated input of output1, which can be determined by Equation (16).
The total error can be determined by putting the values of Equation (11) into Equation (16). We also calculated the outcome of the fraction ∂En_total/∂w_n for each weight from n = 1 to 10. After that, a new error function is determined to overcome the total error rate. The new weight is the difference between the actual weight and the learning-rate-scaled fraction ∂En_total/∂w_n, as described in Equation (17):

w_n(new) = w_n − µ · (∂En_total/∂w_n)   (17)
where W_n(new) is the new weight, W_n the current weight, and µ the learning rate. The algorithm's hidden and output layers utilise the ReLU activation function. Neural network models use backpropagation to train and automatically update all the weights in response to the input datasets; this helps to determine the possible errors in the output and hidden neurons.
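A self-contained sketch of the backward pass and of the weight update of Equation (17) for one layer of the toy network is given below; all numeric values, including the learning rate, are illustrative assumptions.

```python
# Chain-rule gradient for the output-layer weights and the gradient-descent update.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.05, 0.10])
W1 = np.array([[0.15, 0.20], [0.25, 0.30]])       # input -> hidden weights
W2 = np.array([[0.40, 0.45], [0.50, 0.55]])       # hidden -> output weights
target = np.array([0.01, 0.99])
mu = 0.5                                          # learning rate µ (illustrative)

out_hidden = sigmoid(W1 @ x)                      # forward pass
out_output = sigmoid(W2 @ out_hidden)

# Chain rule: ∂En_total/∂w = ∂En/∂out · ∂out/∂net · ∂net/∂w
delta = -(target - out_output) * out_output * (1.0 - out_output)
grad_W2 = np.outer(delta, out_hidden)

W2_new = W2 - mu * grad_W2                        # Equation (17): w(new) = w − µ·∂En_total/∂w
```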
Rather than using the traditional stochastic gradient descent method, an interpolation method is employed. It determines an adaptive learning rate, aiding computational efficiency and cutting learning costs, until the error is reduced to an acceptable level. The model is deemed accurately trained when the final output (Y) equals the network node predicted values [28].
However, the prototype is good at forecasting future unseen samples because the testing loss is close to the training loss. In the first 100 of its 500 training epochs, the CNN learned very quickly; however, as it neared the end of its activity, the slope began to flatten. The model achieved better statistics in the simulation, but noise signals can affect the performance. As a result, the testing loss outcome is somewhat larger than the training loss.
The simulation graph clearly shows that after the training phase the proposed model forecasted the scores more precisely, within 0.085. The proposed model generated a rating between 0 and 1 for each sample, and the nodes with the best precision and lowest error were selected as cluster head terminals; in Figure 4, these are represented by black stars.
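Since the cluster heads are simply the best-scoring nodes per cluster, a minimal sketch of that selection step might look as follows; the score and label arrays are placeholders, and the cluster labels come from the k-means step described in the next subsection.

```python
# Pick the node with the highest predicted reliability score within each cluster.
import numpy as np

def select_cluster_heads(scores, labels):
    """Return a {cluster_id: node_index} mapping of the best-scoring node per cluster."""
    heads = {}
    for c in np.unique(labels):
        members = np.where(labels == c)[0]              # node indices belonging to cluster c
        heads[int(c)] = int(members[np.argmax(scores[members])])
    return heads

scores = np.array([0.91, 0.40, 0.73, 0.88, 0.12])       # CNN ratings in [0, 1] (illustrative)
labels = np.array([0, 0, 1, 1, 1])                      # cluster assignment per node (illustrative)
print(select_cluster_heads(scores, labels))             # {0: 0, 1: 3}
```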


Modified K-Means in Cluster Formation Procedure
We modified the existing k-means clustering method [29] for cluster head formation in the proposed model. It is a vector quantisation approach that divides 'n' observations into 'k' groups, with each observation closely related to a particular cluster.
In this algorithm, the variable 'k' refers to the number of nodes/clusters in a dataset. All the communication signals are placed into cluster centres assuming the k value. Equation (18) shows the formula for the "n-dimensional centroid point" (ND-CP) within the k n-dimensional space:

NDCP(X_d1, X_d2, X_d3, ..., X_dn) = ( Σ_{i=1..n} X_d1,i / k, ..., Σ_{i=1..n} X_dn,i / k )   (18)

After this step, each node's distance towards the cluster centre's coordinates is estimated. Once the model's training is completed, the Euclidean distance is calculated towards the network's x and y coordinates, Equation (19).
The node point with the lowest distance towards the cluster centroid point is merged directly into the newly calculated cluster. This process is repeated iteratively for each cluster, and at the end of the process a new clustering is formed. We also utilise the popular elbow analysis technique to recognise the best number of clusters (Ck).
Figure 5 shows that the graph starts to flatten noticeably when the number of clusters (k) is between 10 and 20; at this point, the graph looks like an elbow. This means the optimal number of clusters can be taken as any value from 10 to 20 for the given dataset, under which k-means performs well. A graph was also plotted between the number of cluster nodes and the distortion results, as shown in Figure 6; Equation (20) gives the formula used to determine the distortion value, where Ck denotes the candidate clusters, p_x the number of points, and D_x the sum of distances among the points of cluster x, with k groups formed. Furthermore, the most efficient node within each cluster is selected as the cluster head by a CNN based on four features: node degree (NDN), node speed (NS), energy consumption (ECN), and packet drop (PDN).
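A hedged sketch of this cluster-formation step is given below: k-means over the node coordinates, an elbow scan of the distortion (Equation (20)) to pick k, and the centroid assignment of Equations (18)-(19). The node positions, the scanned range of k, and the final choice of k are illustrative; the "modified" multi-head aspect of MKM is not shown here.

```python
# K-means cluster formation with an elbow scan of the distortion values.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
node_xy = rng.uniform(0, 1100, size=(1000, 2))        # synthetic node x/y coordinates in the terrain

distortion = {}
for k in range(5, 31, 5):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(node_xy)
    distortion[k] = km.inertia_                        # sum of squared distances to the centroids

# Pick a k inside the "elbow" region (the paper reports flattening between 10 and 20),
# then assign each node to its nearest centroid (Equations (18)-(19)).
k_best = 15                                            # illustrative choice within the elbow range
km = KMeans(n_clusters=k_best, n_init=10, random_state=0).fit(node_xy)
labels, centroids = km.labels_, km.cluster_centers_
```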

Dataset Description
This research utilises an online network dataset generated with a network simulator. The dataset contains 16 attributes: node number, x-coordinate, y-coordinate, number of packets received, number of packets sent, number of packets forwarded, number of packets dropped, number of neighbours, initial energy (constant), remaining energy, node speed, pause time, energy consumption, simulation time (constant), transmission range (constant), and the optimal node reliability factor [30] as the target value.
The dataset scenario was generated in the NS-2.35 simulator. An experimental setup with 1000 endpoints was positioned inside a terrain area of (1100 × 1100) metres [31]. A random way-point mobility model was used to set up each node, giving it a maximum speed in the range of 0 to 35 m/s, a data transmission range of 300 m, and an initial energy capacity of 300 joules. The data transmission packet size was 512 bytes, and constant-bit-rate UDP traffic was used to produce the data traffic pattern. On this basis, various performance features were recorded for each node during the simulation. The target value represents the node reliability factor calculated during the simulation.
The dataset used in the analysis includes 1034 data samples of 11 selected features, from which 70% of the samples were used for training and 30% for testing.
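For concreteness, a loading-and-splitting sketch along the lines described above might look as follows; the file name and the target-column name are assumptions, since the paper does not give them.

```python
# Hypothetical loading of the NS-2 trace export and the 70/30 train/test split.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("ns2_node_trace.csv")                       # hypothetical export of the NS-2 trace
target = "optimal_node_reliability_factor"                   # assumed target-column name
X, y = df.drop(columns=[target]), df[target]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42)                   # 70% training / 30% testing
```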

Data Preprocessing
The following steps have been defined under data preprocessing [32].


Statistical Analysis and Visualisation of Data
Tables 3 and 4 present the statistical data analysis as quantile and descriptive statistics, respectively. They are about analysing the data and variables to produce meaningful information. Quantile statistics refers to dividing a probability distribution into areas of equal probability (four equal parts); it includes the minimum and maximum values, the 5th percentile, Q1, the median, Q3, the 95th percentile, the range, and the interquartile range for each variable.
Descriptive statistics are also vital for analysing the data when the raw data are challenging to visualise. Moreover, they summarise the data meaningfully and offer a more accessible and straightforward interpretation. They include the standard deviation, coefficient of variation, kurtosis, mean, median absolute deviation (MAD), skewness, sum, variance, and memory size. These metrics help to quickly understand and visualise the data during preprocessing [33].
The histogram and box plot of each feature are shown in Figures 7 and 8, respectively, to visualise essential elements of the dataset. These provide a quantitative understanding to summarise the distribution of the variables and also help to identify data density, patterns, and outlier points in the dataset. In the histogram plots, data samples are displayed as a bar chart where the x-axis gives intervals or discrete bins for the observations and the y-axis shows the frequency or count of observations [33]. In Figure 8, the blue box covers the middle 50% of the data, the black line inside it marks the median, the end lines are the whiskers that summarise the range of sensible data, and the dots mark the possible outliers.
The box plot also helps to observe the skewness, spread, and outlier points: the x-axis represents the data sample, and the y-axis shows the observation values. For each attribute, a single boxplot has been drawn that summarises the middle 50% of the data in a box that starts at the observation at the 25th percentile (Q1) and ends at the 75th percentile (Q3), a span known as the interquartile range (IQR). The 50th percentile (Q2) is the median, represented by a line. Whisker lines extend from both ends of the box, demonstrating the expected range of sensible data in the distribution (minimum and maximum values) [34].
Observations outside the whiskers might be outliers and are drawn as small circles. Mathematically, the expected range is [(Q1 − 1.5 × IQR), (Q3 + 1.5 × IQR)]; any data point outside this range is considered an outlier.
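A brief sketch of the descriptive statistics and the IQR outlier rule described above is given below, applied to one assumed feature column of the dataframe from the earlier dataset sketch; the file and column names remain hypothetical.

```python
# Descriptive statistics and IQR-based outlier screening for one feature.
import pandas as pd

df = pd.read_csv("ns2_node_trace.csv")             # hypothetical NS-2 trace export (as before)
col = df["energy_consumption"]                      # hypothetical column name

print(col.describe())                               # count, mean, std, min, quartiles, max

q1, q3 = col.quantile(0.25), col.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr       # expected range [(Q1 − 1.5·IQR), (Q3 + 1.5·IQR)]
outliers = col[(col < lower) | (col > upper)]       # observations outside the whiskers
print(len(outliers), "potential outliers")
```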

Normalisation of Data and Feature Selection
The previous sections discussed the dataset in detail; this section applies the data preprocessing steps. This phase first involves data cleaning to remove null records and outliers. It then utilises z-score normalisation to standardise the feature values by putting them on the same scale. This phase eliminates the missing values and noise from the dataset and normalises it, which helps the model training process and improves the overall accuracy. Mathematically, Z-score normalisation is given by Equation (22):

z = (Xi − µ) / σ   (22)

where Xi is the original sample value, µ the mean, and σ the standard deviation of the samples. Equation (23) gives the value of σ.

σ = √( Σ(X − µ)² / N )   (23)

where N is the number of samples. Figure 9 shows the Pearson and Spearman correlation matrices. These help to quantify the relationship between features and measure how strongly they correlate. The mathematical formula for a pair of random variables (X, Y) is shown in Equation (24):

ρ(X, Y) = cov(X, Y) / (σ_X σ_Y)   (24)

When calculating the covariance cov(X, Y) to determine the direction of the relationship between features by capturing how X and Y vary together, the result may be positive or negative. Here, σ_X and σ_Y denote the standard deviations of X and Y (25). Finally, ρ(X, Y) always takes a value between −1 and 1.
The Spearman correlation coefficient is ρ(rg_X, rg_Y) = r_S = cov(rg_X, rg_Y) / (σ_rg_X · σ_rg_Y). Here, Xi and Yi are first converted into rank variables rg_X and rg_Y, respectively, and ρ is calculated between the rank variables; cov(rg_X, rg_Y) is their covariance, and σ_rg_X and σ_rg_Y are the standard deviations of the rank variables. The correlation coefficient matrix gives evidence about correlated features, measured on a scale of −1 to 1. In Figure 9, features with values near −1 (dark maroon) or 1 (dark blue) are highly correlated with each other. This analysis identifies four features, energy consumption, number of neighbours, node speed, and packet drop, which are positively correlated with the target value (optimal node reliability factor).
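A sketch of the z-score normalisation (Equations (22)-(23)) and of the correlation screening used for feature selection is given below; the file and column names are assumptions carried over from the earlier dataset sketch.

```python
# Z-score normalisation and Pearson/Spearman correlation screening against the target.
import pandas as pd

df = pd.read_csv("ns2_node_trace.csv").dropna()                # remove null records
numeric = df.select_dtypes("number")

z_scored = (numeric - numeric.mean()) / numeric.std()          # z = (X_i − µ) / σ

pearson = numeric.corr(method="pearson")                       # Equation (24)
spearman = numeric.corr(method="spearman")                     # rank-based alternative

target = "optimal_node_reliability_factor"                     # assumed target-column name
ranked = pearson[target].drop(target).sort_values(ascending=False)
print(ranked.head(4))   # expected: energy consumption, neighbours, node speed, packet drop
```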

Experimental Results and Discussions
This section covers the experimental analysis and discussion. The proposed E-CFSA model and the existing AB-SEP method [2] and WCA [3] are compared based on performance-measuring parameters.

Network Setup
This research utilised a multi-hop network connection with varying numbers of homogeneous sensor nodes. Experimental modelling was performed with NS-2 and Python [35]. Many grids were used to split the rectangular region under consideration. This is consistent with the idea that the implemented network is rectangular, which is supported by most existing studies.

A rectangular pattern of sensor networks has been used to send and receive packets. The BS was situated either toward the middle or near one of the connected edges in the simulation. Constant-power communication is used as the power transmission process [36].
The proposed hybrid model considers the partitioning procedure in a partial loading method, which determines the expense of each potential partitioning and loading strategy and chooses the optimum solution. The key simulation attributes are displayed in Table 5.

Performance Measuring Parameters
The specifications that evaluate the performance of both the traditional methods and the proposed E-CFSA are presented in this subsection. Experiments were conducted to compare the E-CFSA with the conventional approaches in terms of the routing overhead, throughput, packet delivery ratio, and cluster head stability period [37].

• Packet delivery ratio (PDR): the ratio of the data packets received by the destinations to those generated by the sources. Mathematically, it can be defined by Equation (26), where S1 is the number of packets sent and S2 is the number of packets received by the nodes.
• Throughput (Th): the ratio of the total number of delivered packets (from the source) to the total simulation time, as given by Equation (27).
• Routing or network overhead (RO): the number of control and routing packets required for communication in the network, as described in Equation (28):

RO = number of routing packets / number of packets received   (28)

• Cluster head stability time (CHST): the total period for which a network node works as a cluster head. The average of that period is known as the average stability time.
• Energy consumption (EC): the cumulative energy the system uses for data transformation, communication, and confirmation, as described in Equation (29):

EC = energy used in communication / total energy   (29)

These metrics are illustrated in the short sketch below.
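The helper functions below are a hedged implementation of Equations (26)-(29); the argument names are illustrative counters that a simulation trace would provide.

```python
# Performance-metric helpers for Equations (26)-(29).
def pdr(packets_received, packets_sent):
    return packets_received / packets_sent                 # Equation (26): PDR = S2 / S1

def throughput(packets_delivered, sim_time_s):
    return packets_delivered / sim_time_s                  # Equation (27)

def routing_overhead(routing_packets, packets_received):
    return routing_packets / packets_received              # Equation (28)

def energy_consumption(comm_energy_j, total_energy_j):
    return comm_energy_j / total_energy_j                  # Equation (29)

print(pdr(950, 1000), throughput(950, 100.0),
      routing_overhead(120, 950), energy_consumption(45.0, 300.0))
```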

Simulation Results and Discussion
As described in Table 6, the proposed E-CFSA algorithm and the existing AB-SEP and WCA methods were evaluated in different scenarios with different parameters, i.e., the number of grids, the number of clusters, and the number of nodes. The number of grids ranges, depending on the start conditions, from six to one hundred. Moreover, the node sample size depends on the network region size: it varies from 20 to 100 (in Scenario 1) and from 200 to 1000 (in Scenario 2).
The first scenario was implemented with 20 to 100 nodes, a grid size of (4 × 4), and a terrain of (500 × 500) metres. Experiments were performed to measure the performance of the proposed E-CFSA hybrid deep learning model, and a comparative analysis was performed with the existing AB-SEP and WCA methods [38]. Table 7 shows the simulation results (impact of network size), and Table 8 shows the results (impact of node speed). The simulation results based on network size and node speed are presented in Figures 10 and 11 for the proposed E-CFSA and the existing AB-SEP and WCA. In Figure 10, when the number of nodes varies from 20 to 100, the PDR % of the proposed model is best for 80 nodes; similarly, for 100 nodes the throughput is 255, and the routing overhead, average stability time (in seconds), and energy consumption (for all CH and non-CH nodes) results for the proposed method are the best compared to the existing WCA and AB-SEP. Similarly, in Figure 11, the simulation results for the impact of node speed clearly show that the proposed method achieves better PDR, throughput, packet loss rate, and average stability time of CHs compared to the existing AB-SEP and WCA methods [39].
Figures 12 and 13 show the throughput simulation results for the proposed E-CFSA and the existing WCA and AB-SEP methods with varying node speed and network node count. The experimental findings demonstrate the higher data rate of E-CFSA during both trials and across the variability. In this simulation, the network's node speed varied from 5 m/s to 25 m/s, and the number of nodes varied from 20 to 100. These outcomes also show that WCA chose the shortest route but did not consider the node reliability factor; as a result, there is regular variation in network size during routing, which reduces throughput.
The experimental results for routing overhead with changing network size (number of nodes) and for the packet loss ratio with changing node speed are shown in Figures 14 and 15, respectively. These two experiments are critical for evaluating and contrasting the efficiency of the proposed E-CFSA with the existing WCA and AB-SEP methods. The proposed method reduces both the routing overhead and the packet drop ratio.
Figures 16 and 17 show the lifetime comparison results for the proposed and existing methods. These results were calculated with dynamic network size and node speed. The results of the experiments clearly show that the proposed E-CFSA achieves better stability times across all variations in network size and node speed. The innovative combinations enable the proposed E-CFSA to outperform traditional approaches. The above experimental findings also suggest that k-means assists the proposed E-CFSA in cluster head selection, enabling it to outclass existing methods.

Scenario-Two
The second scenario was implemented with 200 to 1000 nodes, a grid size of (10 × 10), a number of rounds from 100 to 2000, and a terrain of (1000 × 1000) metres. Experiments were performed to measure the performance of the proposed E-CFSA, and then a comparison was made with the current state-of-the-art methods. Table 9 shows the results of the effect of network size.
Figures 18-21 show the simulation results of Scenario 2 for the proposed E-CFSA and the existing WCA and AB-SEP with variations in the number of nodes and the node speed in the network. The proposed method has a better PDR of 98.99% for 20 nodes, which is the best compared with the WCA and AB-SEP methods. Similarly, the proposed method achieves 94.89% throughput for a node speed of 5 m/s, which is the best result. As in Figures 18 and 19, the proposed method achieves a network lifetime of 107 s and an average CH stability time of 97 s, which is the best compared to the existing WCA and AB-SEP methods.


Conclusions
This research proposes an energy-efficient cluster formation and head selection algorithm (E-CFSA) using a CNN and modified k-means clustering for an MEC environment. This research develops an efficient way to form clusters with stable cluster heads using machine learning, in which nodes form clusters using the k-means algorithm and a CNN is trained to select an efficient cluster head. Data collection was performed through network simulation to build the training and test data, data analytics was applied to analyse the data, and feature selection was used to select the best model. The trained model predicted scores with an error of +/−0.075 on the test dataset. This procedure reduces the repetitive replacement of cluster heads, giving the member nodes in a cluster more extended stability and lifetime, which is analysed through the cluster head stability time parameter. Finally, performance was examined under the best model regarding overhead, packet delivery ratio, and throughput with variation in network size and node speed. The model has shown better results, with less overhead and packet loss rate and a higher throughput and packet delivery ratio, than the existing WCA and AB-SEP methods.
The proposed model can be extended in future work using bioinspired methods, as they offer enticing concepts; selecting cluster nodes can be considered an intelligent optimisation problem for more complex and heterogeneous networks.


Figure 1. The architecture of the proposed hybrid model.

Figure 2. CNN model in the proposed hybrid model.


Figure 3 shows the training-time graph (training error vs. testing error). In each epoch, the testing and training data points are represented by a curve that shows the model's inconsistency (one epoch = one pass over the entire dataset). The testing error demonstrates the model's resilience towards the extracted features. The model's loss decreases significantly over the first 50 epochs, which is encouraging. Ultimately, the medium-sized dataset causes some fluctuations in the training and validation curves.


Figure 3. The outcome of the learning curve.


Figure 4. Representation of cluster heads by black stars.


Figure 11. Graph: PDR vs. node speed.


Figure 15. Graph: packet loss ratio vs. node speed.


Figure 16. Graph: average stability time vs. node speed.

Figure 17. Graph: average stability time of CHs vs. number of nodes.


Table 1. Comparative analysis of existing research.

Table 2. Parameters of the CNN model in the proposed hybrid model.

Table 6. Experimental scenarios and specifications.

Table 7. Simulation results (impact of network size).

Table 8. Simulation results (impact of node speed).