Article

Dynamic Learning Framework for Smooth-Aided Machine-Learning-Based Backbone Traffic Forecasts

1
School of Electrical Engineering, University Technology Malaysia, Skudai, Johor 81310, Malaysia
2
School of Telecommunication Engineering, Future University, Khartoum 10553, Sudan
3
Department of Computer Science, University of São Paulo, São Paulo 05508-090, Brazil
4
Department of Information Technology, College of Computer and Information Sciences, Princess Nourah Bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
5
Faculty of Engineering, Uni de Moncton, Moncton, NB E1A3E9, Canada
6
Department of Electrical and Electronic Eng. Science, School of Electrical Engineering, University of Johannesburg, Johannesburg 2006, South Africa
7
International Institute of Technology and Management, Commune d’Akanda, Libreville BP 1989, Gabon
8
School of Psychology and Computer Science, University of Central Lancashire, Preston PR1 2HE, UK
*
Author to whom correspondence should be addressed.
Sensors 2022, 22(9), 3592; https://doi.org/10.3390/s22093592
Submission received: 4 March 2022 / Revised: 29 April 2022 / Accepted: 6 May 2022 / Published: 9 May 2022

Abstract

Recently, there has been an increasing need for new applications and services such as big data, blockchains, vehicle-to-everything (V2X), the Internet of things, 5G, and beyond. Therefore, to maintain quality of service (QoS), accurate network resource planning and forecasting are essential steps for resource allocation. This study proposes a reliable hybrid dynamic bandwidth slice forecasting framework that combines the long short-term memory (LSTM) neural network and local smoothing methods to improve the network forecasting model. Moreover, the proposed framework can dynamically react to all the changes occurring in the data series. Backbone traffic was used to validate the proposed method. As a result, the forecasting accuracy improved significantly with the proposed framework and with minimal data loss from the smoothing process. The results showed that the hybrid moving average LSTM (MLSTM) achieved the most remarkable improvement in the training and testing forecasts, with 28% and 24% for long-term evolution (LTE) time series and with 35% and 32% for the multiprotocol label switching (MPLS) time series, respectively, while robust locally weighted scatter plot smoothing and LSTM (RLWLSTM) achieved the most significant improvement for upstream traffic with 45%; moreover, the dynamic learning framework achieved improvement percentages that can reach up to 100%.

1. Introduction

Next-generation networks have been designed to offer reliable service with ultra-low latency, massive-scale connectivity, high security, extreme data rates, optimized energy, and better quality of service (QoS) [1,2,3]. Despite these features, the technology (infrastructure and logic) used in these networks must display an intelligence for coping with the dynamic QoS demand [4,5,6,7,8,9] and react autonomously to different dynamic and self-organizing situations. Additionally, network management is complicated due to the coupling between various service layers where congestions can arise and spread vertically as well as horizontally. Furthermore, the congestions arising due to poor management can affect the QoS and service-level agreement (SLA). Therefore, proactive approaches for managing bandwidth and network resources are highly needed. The legacy static network resource allocation indicated that a bandwidth reservation can guarantee a particular QoS. However, a dynamic network resource allocation effectively resolves this problem [10,11,12,13]. It relies on the forecasting network resources’ demands and acts accordingly to enable a timely and dynamic response. Thus, the accuracy of predictive approaches was regarded as a vital factor and essential in various applications of the predictive frameworks. Reliable artificial intelligence (AI) and machine learning (ML) techniques are crucial and widely used in different applications, such as network traffic forecasts [4,5,6,7,8,9,14], the Internet of things (IoT) [10], and wireless communications [11,15]. The data characteristics indicated that the traffic used in real-time applications in current and future networks exhibited variable, nonlinear, and unstructured data formats with slowly decaying autocorrelations between different samples. These features showed that the traffic can exhibit long-range dependence (LRD) [12,16]. 
To ensure a proper control strategy, the short-step forecasts, such as the one-step forecast models used for LRD traffic, could not respond accurately to the dynamic bandwidth allocation, especially in the higher latency links [1]. Hence, a long-term traffic forecast was required for implementing a flexible strategy to control the networks [1].
Many ML-based studies have focused on multiple-step bandwidth forecasting. In general, two approaches are used to design bandwidth forecast algorithms: the first is based on supervised ML models, while the second relies on statistical models. The first approach covers traffic forecasting models built on supervised ML, specifically artificial neural networks (ANNs). The second uses statistical models based on the generalized autoregressive integrated moving average (ARIMA) model [1,2]. The major difference between the ANN and ARIMA models is that ARIMA requires the series to be stationary; it also does not forecast accurately when handling LRD [17,18]. LSTM is a very effective ML technique for time series forecasts and has been applied to multiple-step predictions under different scenarios. The main advantage of the LSTM recurrent neural network (LSTM-RNN) is that it can quickly learn the temporal dependencies in the input data without requiring a fixed set of lagged inputs to be specified [17,18,19]. LSTM can resolve the long-term dependency issue because it retains information over extended periods, unlike linear time series forecast algorithms (such as ARIMA and its various extensions), which are affected by unnecessary fluctuations in the series [19,20,21,22]; these fluctuations are then projected onto all forecasted results. Owing to its ability to forget or remember information based on its activation functions, LSTM can re-evaluate its weights based on their correlation with the remaining time series. This makes it versatile and adaptable when dealing with errors, noise, and sample gaps.
However, as the fluctuations and noise in a time series increase, it becomes more difficult for the forecasting technique to remain accurate. Therefore, data smoothing and filtering must be conducted before any forecasting. This preprocessing can handle significant fluctuations and outliers by adjusting the built-in sliding window. Motivated by these observations, we consider the combination of LSTM and smoothing for multistep-ahead forecasting of backbone network traffic.
Moreover, to address model reliability and validity, a concept change detection mechanism must be incorporated, because the data characteristics and distribution change rapidly. In this work, a real dataset was collected from the backbone network of a premier internet service provider and analyzed. The major contributions of the study can be summarized as follows:
  • Investigation of a hybrid multistep-ahead forecast framework that combines LSTM and local smoothing techniques for network traffic forecasting;
  • A change detection framework that determines when to build a new hybrid forecast model;
  • Further analysis of the effectiveness of the model and comparison with the relevant study.
The remainder of the study is organized as follows: Section 2 discusses the related works. Section 3 provides a detailed description of the smoothing-aided LSTM model for bandwidth slice forecasts. Section 4 discusses the performance of the proposed model, including the forecasting accuracy, smoothing analysis, and statistical validation of the results. Lastly, the conclusion is presented in Section 5.

2. Related Work

Several studies have analyzed the effectiveness and superiority of the LSTM process for bandwidth forecasting [21,22,23,24,25,26,27,28,29,30]. For instance, in [28] the researchers investigated the performance of different ML techniques and assessed the forecast performances of their video over the internet. They studied neural networks (NNs), support vector machines (SVMs), and decision trees (DTs). They concluded that modeling based on the time series data was better for generating promising results. Additionally, it was seen that the ANN model showed a better performance than the other ML techniques. In [24], the researchers used a hybrid neural network-wavelet model to analyze network traffic. They used the wavelets for decomposing the input data into details and approximations, while the NN was optimized with the help of a genetic algorithm. They noted that their proposed model could significantly improve the forecast accuracy of the process. Though wavelet processing helps in eliminating the unnecessary data, it can lead to some unintentional issues via the traffic load forecast based on LSTM and deep NNs (DNN). The simulation results showed that the forecast-based scalability mechanism performed better than the threshold-based one. In [23], the researchers proposed a new mechanism for scaling the access management functions (AMFs) in the 5G virtualized environment. This mechanism was based on forecasting the mobile traffic using the LSTM NNs to estimate the user attach request rate, which helped predict the accurate number of AMF examples required to process the upcoming user traffic. Since it is a proactive technique, the proposed model helps avoid deployment latency while scaling the resources. The simulation results further indicated that the LSTM-based model was more efficient than the threshold-based model. The proposed technique used LSTM on the request rate data without preprocessing, which eventually may decrease the forecast accuracy. 
In [21], the researchers compared the performance of LSTM networks for 4G traffic forecasts with seasonal ARIMA (SARIMA) and support vector regression (SVR). For this purpose, they collected data for 122 days and divided the data points between training and testing datasets. They noted that the LSTM model performed better than the SARIMA and SVR models. In [26], the researchers developed a deep traffic predictor (DeepTP) model to forecast long-period cellular network traffic. They noted that their model showed better performance (12.3%) than the other traffic forecasting models used in the study. Furthermore, a feature-based forecasting framework that used tier 1 internet service provider (ISP) network traffic was discussed. LSTM was used as the core forecast technique. The results obtained were significant and forecasted the traffic at very small time scales (<30 s). In [22,29], the researchers proposed a hybrid empirical mode decomposition (EMD) and LSTM forecast technique. EMD decomposes the available bandwidth dataset into smoothed intrinsic mode functions (IMFs), after which LSTM is applied to forecast the traffic. They noted that their hybrid model achieved a better root mean square error (RMSE) value. In different studies [27,30], the authors used LSTM for forecasting in the vehicular ad hoc network (VANET). They determined the forecasting accuracy with the help of the RMSE and mean absolute percentage error (MAPE); the results proved the effectiveness of the proposed mechanism. The authors of [31] proposed a smooth-aided SVM-based model for video traffic forecasting, in which local smoothing techniques were applied ahead of the SVM to normalize the fluctuations in the input traffic; the obtained results were promising. The smoothed support vector machine (SSVM) achieved an improvement of 32.35% for a one-step-ahead forecast.
In the most recent study [32], the authors proposed a hybrid LSTM and convolutional neural network framework for a wireless network. The proposed solution was compared with state-of-the-art techniques, and the effectiveness and superiority of the hybrid architecture were highlighted. Table 1 summarizes the most notable related work.
Though the earlier studies presented many positive results, we noted that accurate multistep-ahead forecasting in autonomous network management remains very challenging. Owing to the noise inconsistency and bursts in the network traffic, small fluctuations occur in the traffic data that can degrade the forecast accuracy of the model [18]. Very few studies reported in the literature considered noise preprocessing, while no study presented a dynamic framework for concept changes. The noise preprocessing methods adopted in previous studies, such as wavelets and EMD, are less flexible than window-based noise processing [31]. The only notable study that used window-based noise processing, such as Gaussian smoothing, moving average, and Savitzky–Golay filters, used SVMs as the ML technique. At the same time, it was already shown in [28] that NNs and LSTM outperform SVMs in forecasting accuracy. Moreover, the mentioned study was carried out in a very limited scenario (one-step forecasts).
Due to the evolved dynamic nature of the network properties, the frameworks must detect and adapt to all changes taking place in the statistical properties of the big data traffic. The changes noted in the traffic profiles, such as a sudden surge in the traffic, took place due to the change in the users, application behavioral variations taking place in the traffic demands, and because of the emergence of novel technologies, applications, or even a global pandemic, such as that of COVID-19 [33,34]. As a result, the number of home users or eMBB traffic increases significantly compared with the corporate traffic [33], which witnessed a significant decrease owing to lockdowns and widespread adoption of work-from-home culture in the business operational model. In this study, considering the promising finding of using window-based techniques as a preprocessing method to handle all the significant fluctuations and outliers by adjusting the built-in sliding window, we extended these results to further these studies and explore the effects of hybrid local smoothing processes and the LSTM-NN technique [35].

3. Methods

To resolve the challenges related to resource management noted in next-generation network backbones, i.e., a beyond 5G (B5G) network environment, we propose a hybrid ML model. This model combines the LSTM and smoothing processes and uses them for the core network bandwidth slices. The forecasting model is called the smoothed LSTM. Figure 1 shows the proposed overall conceptual framework. The model is motivated by the promising results presented earlier [31]. The proposed ML technique is modeled as a time series batch learning process. The researchers extended this algorithm by preprocessing the dataset.
As depicted in Figure 1, the Anderson–Darling test was employed as a change detection method to manage the dynamic selection of the hybrid algorithm based on changes in the underlying statistical properties. Then, to avoid eroding the periodic patterns and trends in the series, the system studied the local and global trends separately to detect and eliminate long-term or short-term noise. The preprocessing focuses on the local variations: it applies local smoothing techniques to eliminate the fluctuations and unnecessary noise in the data, which can negatively affect the model’s prediction accuracy, especially for nonlinear and nonstationary time series. The local preprocessing techniques react more dynamically to the noise level and short-term variations than wavelet- and Hilbert–Huang transform (HHT)-based processes. A similar approach was used earlier [31], where researchers studied the superior nonlinear approximation ability of an SVM combined with the “classical” local smoothing processes, such as Gaussian smoothing, moving average, and Savitzky–Golay filters. The study results indicated that their proposed model performed better than the state-of-the-art model, viz., logistic regression. We determined the effectiveness of their proposed model by using the real and available network traffic datasets. After the local smoothing preprocessing takes place and the bandwidth slice $y$ arrives as an input, a forecast $\hat{y}_t$ is produced using the current LSTM model $\delta$, after which a loss function $f(\hat{y}_t, y_t)$ is used to update the model. Finally, the Diebold–Mariano test was conducted to statistically validate the obtained results.

3.1. Dataset

The dataset was collected from a premier internet service provider in Africa. We examined different bandwidth utilization time series; the collected data represent the aggregated backbone traffic of LTE, MPLS, and the upstream tier 1 carrier. A sample of 350 time steps was collected. Each time step represents 28.8 min, so 350 time steps represent one week. This was attributed to the limitations of the data collection tool. The values were interpolated and used for developing a time series model. The findings will benefit real-world core and backbone networks by enabling efficient network resource planning.
For this work, the computer specifications that were used to process and execute the proposed framework were Core i5 1.8 GHz with 16 GB of RAM. Figure 2 shows the backbone topology where the dataset was collected.
Table 2 shows the description for each bandwidth slice.
Three different traffic profiles were used to explore different traffic characteristics. Slice 1 represents the aggregated backbone traffic for 4G-LTE measured at the SGi interface between the packet data network gateway (PDN-GW) and the core routers in the evolved packet core (EPC). The EPC is responsible for the establishment, management, and authentication of users’ sessions. The core routers are linked to the MPLS backbone network and the tier 1 upstream providers through the upstream routers. Slice 2 is the aggregated backbone traffic for corporate data centers; it was gathered from corporate users’ virtual routing function (VRF) instances at the MPLS backbone routers. Finally, slice 3 represents the aggregated traffic at upstream router (A).
It is evident from Figure 3 that all bandwidth slices exhibit significant seasonal patterns with daily peaks. Nevertheless, the data also show a stochastic pattern with continuous irregular fluctuations between successive points. On the other hand, no long-term trend appeared to exist. Some slices exhibit a weekly pattern, such as in the MPLS slice since it is more associated with corporate users where corporate business is active mainly during weekdays rather than during weekends. Table 3 shows the summarized descriptive statistics, Figure 3 shows the sample time series dataset and Figure 4 shows the dataset histograms.
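From Table 3, the most notable statistical property is that the LTE and MPLS slices follow the Johnson SB statistical distribution, while the upstream traffic follows the generalized extreme value distribution. Equations (1)–(5) show the probability density functions (PDFs) of the fitted distributions associated with the bandwidth slices:
$$f(x) = \frac{0.829}{0.589\sqrt{2\pi}\, z(1-z)} \exp\left(-\frac{1}{2}\left(0.589 + 0.829 \ln\frac{z}{1-z}\right)^{2}\right) \tag{1}$$
where $z \equiv \dfrac{x - 3.885 \times 10^{8}}{0.589}$.
$$f(x) = \frac{0.854}{0.398\sqrt{2\pi}\, z(1-z)} \exp\left(-\frac{1}{2}\left(0.398 + 0.854 \ln\frac{z}{1-z}\right)^{2}\right) \tag{2}$$
where $z \equiv \dfrac{x - 1.637 \times 10^{8}}{0.398}$.
$$f(x) = \frac{1}{1.08 \times 10^{9}}\, t(x)^{0.48} \exp\left(-t(x)\right) \tag{3}$$
where $t(x) = \left(1 - 0.52\, \dfrac{x - 4.88 \times 10^{9}}{1.08 \times 10^{9}}\right)^{1/0.52}$, since $\xi \neq 0$.
$$f(x) = \frac{3.13}{5.92 \times 10^{7}\, \Gamma\left(\frac{1}{3.13}\right)} \exp\left(-\left(\frac{|x - 6.35|}{2.96 \times 10^{7}}\right)^{3.13}\right) \tag{4}$$
$$f(x) = \frac{1.53}{1.39 \times 10^{8}\sqrt{2\pi}\, z(1-z)} \exp\left(-\frac{1}{2}\left(0.072 + 1.5 \ln\frac{z}{1-z}\right)^{2}\right) \tag{5}$$
where $z \equiv \dfrac{x - 1.11 \times 10^{7}}{1.3966 \times 10^{8}}$.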
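As a hedged illustration of how such a fitted density can be evaluated, the sketch below codes the generalized extreme value PDF given above for the upstream slice, assuming the standard form with location 4.88 × 10^9, scale 1.08 × 10^9, and shape −0.52 read from the coefficients. The names `MU`, `SIGMA`, `XI`, `gev_t`, and `gev_pdf` are this sketch's own and do not come from the study.

```python
import math

# Parameters read off the upstream-slice density; the names are assumptions.
MU = 4.88e9      # location
SIGMA = 1.08e9   # scale
XI = -0.52       # shape (non-zero)


def gev_t(x, mu=MU, sigma=SIGMA, xi=XI):
    """t(x) = (1 + xi * (x - mu) / sigma) ** (-1 / xi), valid while the base is positive."""
    base = 1.0 + xi * (x - mu) / sigma
    if base <= 0.0:
        raise ValueError("x lies outside the support of the distribution")
    return base ** (-1.0 / xi)


def gev_pdf(x, mu=MU, sigma=SIGMA, xi=XI):
    """f(x) = (1 / sigma) * t(x) ** (xi + 1) * exp(-t(x))."""
    t = gev_t(x, mu, sigma, xi)
    return (1.0 / sigma) * t ** (xi + 1.0) * math.exp(-t)
```

At the location parameter, t(x) equals 1, so the density reduces to exp(−1) divided by the scale, which gives a quick sanity check on any re-implementation.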

3.2. Local Smoothing Techniques

As discussed, noise in the time series forecast can significantly and negatively affect the forecasts in the n steps ahead. Hence, this issue must be handled carefully. Minimizing the effects of low- and high-frequency noise can help accurately forecast the short- or long-term-scale data. Some earlier studies discussed the importance of noise removal or data processing [7,18,22,24,29]. In the subsequent section, we discuss the different local smoothing methods applied in this study.

3.2.1. Local Regression

LOWESS [36] is a first-degree polynomial model fitted with weighted linear least squares, while LOESS is a second-degree polynomial model. Both are based on the same basic fitting model, which employs localized data subsets to construct a curve that approximates the primary data, with weights derived using Equation (6). The LOWESS model evaluates the fit at $x_i$ to derive the fitted values $\hat{y}_i$ and residuals $\hat{\varepsilon}_i = \hat{y}_i - y_i$ at every observation $(x_i, y_i)$. An additional robustness weight $w_i$ is calculated from the magnitude of $\hat{\varepsilon}_i$. Accordingly, a new weight $w_i(x_i)$ is assigned to each observation, where $w_i$ is defined as shown in Equation (7) [34]:
$$w_i(x) = \frac{\Delta_i(x)}{\Delta_q(x)} \tag{6}$$
$$w_i = \begin{cases} \left(1 - \left(\dfrac{\hat{\varepsilon}_i}{6\,\mathrm{MAD}}\right)^{2}\right)^{2}, & |\hat{\varepsilon}_i| < 6\,\mathrm{MAD} \\ 0, & |\hat{\varepsilon}_i| \geq 6\,\mathrm{MAD} \end{cases} \tag{7}$$
where $\mathrm{MAD} = \mathrm{median}\left(|\hat{\varepsilon}_i|\right)$.
Two different versions of the above techniques were used, i.e., “RLOWESS” and “RLOESS”. In these robust forms, lower weights are assigned to the outliers in the regression, and zero weight is assigned to values lying outside six median absolute deviations.
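The robustness weighting can be sketched as follows. This is a minimal illustration of the bisquare rule with a cutoff of six median absolute deviations, not the study's implementation; the function name `robust_weights` and the handling of the zero-MAD case are assumptions of this sketch.

```python
from statistics import median


def robust_weights(residuals):
    """Bisquare weights: (1 - (r / (6 * MAD)) ** 2) ** 2 inside 6 * MAD, otherwise 0."""
    mad = median([abs(r) for r in residuals])
    if mad == 0:
        # Degenerate case (at least half of the residuals are zero).
        # Keeping exact fits and dropping the rest is this sketch's own choice.
        return [1.0 if r == 0 else 0.0 for r in residuals]
    cutoff = 6.0 * mad
    return [
        (1.0 - (r / cutoff) ** 2) ** 2 if abs(r) < cutoff else 0.0
        for r in residuals
    ]
```

Points with small residuals keep a weight close to 1, while a point whose residual exceeds the cutoff is excluded from the next local fit.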

3.2.2. Moving Average

Moving average (MA) [13,35,36,37] is regarded as a real-time filter that removes high-frequency components from the data. It is generally used for trend forecasting. The filter coefficients are all equal to the reciprocal of the span or bandwidth. MA is sometimes also referred to as “exponential smoothing”, although Equation (8) is the simple, equally weighted form. Here, $C_i$ is defined as the throughput at time $i$. Consider $c = \{C_i\},\ i = 1, \dots, p$ as the time series, where $p$ is the length of the time series. The MA of period $q$ at time $l$ is then calculated using Equation (8) [35,36,37,38,39]:
$$m_l^{q} = \frac{1}{q} \sum_{i=1}^{q} c_{l-i+1} \tag{8}$$
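A minimal sketch of the trailing moving average of Equation (8) is given below. It assumes that q is an integer number of samples; the function name `moving_average` is hypothetical, and outputs are produced only for time steps that have a full window of q past values.

```python
def moving_average(c, q):
    """Trailing moving average: mean of the q most recent samples at each time step."""
    if not 1 <= q <= len(c):
        raise ValueError("q must be between 1 and len(c)")
    window_sum = sum(c[:q])
    out = [window_sum / q]
    for l in range(q, len(c)):
        # Slide the window: add the newest sample, drop the oldest one.
        window_sum += c[l] - c[l - q]
        out.append(window_sum / q)
    return out
```

The running sum keeps the cost linear in the series length regardless of q, which matters when many candidate windows are compared.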

3.2.3. Savitzky–Golay Smoothing Filter

The Savitzky–Golay (SG) smoothing filter [40] is a low-pass filter characterized by two parameters, K and M. The SG filter is a weighted MA, i.e., a finite impulse response (FIR) filter. The filter coefficients are calculated using unweighted linear least squares regression and a polynomial model of a particular degree (default of 2). The time series to be determined is denoted $x(n)$, while the observed time series is $y(n) = x(n) + w(n)$, where $w(n)$ is additive white Gaussian noise. The final output is derived using Equation (9):
$$\hat{x}(n) = \sum_{k=-M}^{M} h_k\, y(n-k) \tag{9}$$
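To make the FIR form of Equation (9) concrete, the sketch below applies the well-known 5-point quadratic SG coefficients (M = 2). The constant `SG5_QUADRATIC` and the function `fir_smooth` are illustrative names only, and dropping the M boundary samples at each end is a simplification of this sketch.

```python
# Standard 5-point quadratic Savitzky-Golay coefficients: (-3, 12, 17, 12, -3) / 35.
SG5_QUADRATIC = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]


def fir_smooth(y, h):
    """x_hat(n) = sum over k = -M..M of h[k] * y(n - k), only where the window fits."""
    m = len(h) // 2
    out = []
    for n in range(m, len(y) - m):
        out.append(sum(h[k + m] * y[n - k] for k in range(-m, m + 1)))
    return out
```

Because the coefficients come from a quadratic least squares fit, the filter leaves any quadratic trend unchanged, which is why it preserves peaks better than an equally weighted MA of the same length.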
It is noted that a high-degree polynomial helps in achieving a higher smoothing level without attenuating any data features. It is worth mentioning that LOESS is also used for seasonal decomposition. However, we focused on using LOESS and other local regression techniques for smoothing in this study, since decomposition may aggressively remove some important dataset features.
The remaining question is how to choose the bandwidth q. Bandwidth plays a vital role in the general local regression fit, and the simplest approach is to select q as a constant for all $x_i$. However, a large variance is observed if the selected bandwidth is very small, because too few data points fall in the smoothing window and the fit becomes noisy. Conversely, if q is very large, not all data in the specified window will be fitted well. As a result, it is challenging to select an optimum q value that avoids unnecessary data loss from the original time series. Hence, we proposed a solution, described in Algorithm 1, that finds the minimum q value that causes minimal data loss, as reflected in the minimum mean square error (MSE).
Algorithm 1 Loss-aware smoothing
Input:
y: bandwidth slice, Z: series length, q: smoothing window size
Output:
MSE, ŷ: locally fitted values using the local smoothing technique
Process:
for n = q to Z − q do
    initialize K[]
    for j = n − q to n + q do
        ŷ_j ← smooth(y_j) with minimum MSE
        assign ŷ_j into K[]
return ŷ
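One possible reading of Algorithm 1 is sketched below: each candidate window is applied to the series, the loss is measured as the MSE between the original and smoothed points, and the least lossy candidate is reported. The centered average used as the smoother, the interior-only comparison, and all function names are assumptions of this sketch rather than details confirmed by the study.

```python
def centered_average(y, q):
    """Mean of y[n-q .. n+q] for every n that has a full window (half-width q)."""
    return [sum(y[n - q:n + q + 1]) / (2 * q + 1) for n in range(q, len(y) - q)]


def smoothing_mse(y, q):
    """Mean squared difference between the original and smoothed interior points."""
    fitted = centered_average(y, q)
    if not fitted:
        raise ValueError("series too short for this window")
    original = y[q:len(y) - q]
    return sum((a - b) ** 2 for a, b in zip(original, fitted)) / len(fitted)


def least_lossy_window(y, candidates):
    """Return (q, mse) for the candidate half-width with the smallest MSE."""
    scored = [(q, smoothing_mse(y, q)) for q in candidates]
    return min(scored, key=lambda pair: pair[1])
```

In practice the candidate list would be restricted to windows that already remove enough noise, since a very small window trivially yields a small MSE.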

3.3. Anderson–Darling

The Anderson–Darling test [41] is a nonparametric test that shows superior performance in detecting departures from normality [41]. The k-sample Anderson–Darling test is a variant used to detect whether multiple samples are generated from the same statistical distribution.
$$AD = -n - \frac{1}{n} \sum_{i=1}^{n} (2i-1)\left[\ln x_i + \ln\left(1 - x_{n+1-i}\right)\right] \tag{10}$$
where $\{x_1 < \dots < x_n\}$ is the ordered input sample of size $n$ (ranging from the smallest to the largest element). The hypothesis that $\{x_1 < \dots < x_n\}$ arises from the same distribution is rejected if the AD in Equation (10) is larger than the critical value $AD_{\alpha}$ at the given $\alpha$.
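The statistic can be sketched in a few lines. The code below implements the standard one-sample form of the Anderson–Darling statistic, assuming the inputs are values of the hypothesized cumulative distribution function evaluated at each observation (so each lies strictly between 0 and 1). The k-sample variant used for comparing windows follows a different formula and is not reproduced here; the function name `anderson_darling` is this sketch's own.

```python
import math


def anderson_darling(u):
    """AD = -n - (1/n) * sum_{i=1..n} (2i - 1) * [ln u_i + ln(1 - u_{n+1-i})].

    `u` holds hypothesized CDF values of the observations, strictly inside (0, 1).
    """
    u = sorted(u)
    n = len(u)
    if n == 0 or u[0] <= 0.0 or u[-1] >= 1.0:
        raise ValueError("values must lie strictly between 0 and 1")
    total = sum(
        (2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n
```

Samples that are spread evenly over (0, 1) give a small statistic, whereas samples clustered in one tail give a large one, which is what makes the test sensitive to distributional change.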

3.4. LSTM

LSTM [32] is a recurrent neural network that forgets and propagates information over a recurrent training period. This can improve the forecast performance. Due to its ability to correlate current and earlier information, the LSTM technique effectively forecasts time series [42]. The cell represents the basic unit of LSTM. Let $t = 1, 2, \dots, T$ denote the sample index, where $T$ is the total number of time series samples in a sequence. At every index $t$, LSTM considers the input sample $x_t$, the past cell state $a_{t-1}$, and the past hidden state $h_{t-1}$. All temporal relationships in LSTM can be derived using the equations below [32,42]:
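$$\Gamma_f^{t} = \sigma\left(W_{fh} h_{t-1} + W_{fx} x_t + b_f\right) \tag{11}$$
$$\Gamma_i^{t} = \sigma\left(W_{ih} h_{t-1} + W_{ix} x_t + b_i\right) \tag{12}$$
$$\Gamma_g^{t} = \rho\left(W_{gh} h_{t-1} + W_{gx} x_t + b_g\right) \tag{13}$$
$$\Gamma_o^{t} = \sigma\left(W_{oh} h_{t-1} + W_{ox} x_t + b_o\right) \tag{14}$$
$$a_t = \Gamma_f^{t} \odot a_{t-1} + \Gamma_i^{t} \odot \Gamma_g^{t} \tag{15}$$
$$h_t = \Gamma_o^{t} \odot \rho\left(a_t\right) \tag{16}$$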
Here, $W_{fh}$, $W_{fx}$, $W_{ih}$, $W_{ix}$, $W_{gh}$, $W_{gx}$, $W_{oh}$, and $W_{ox}$ represent the weight matrices, while $b_f$, $b_i$, $b_g$, and $b_o$ represent the bias vectors corresponding to the respective resultant vectors $\Gamma_f^{t}$, $\Gamma_i^{t}$, $\Gamma_g^{t}$, and $\Gamma_o^{t}$. The forget gate, input gate, input node, and output gate are denoted by the subscripts f, i, g, and o, respectively. The symbol $\odot$ denotes the element-wise product. In Equations (11)–(16), the weight matrices are of size T × T and the vectors are of size T × 1. The cell state acts as the memory of the LSTM, and the output of the hidden state is considered a virtual output of the cell state. The sigmoid and rectified linear unit (ReLU) were used as the activation functions in this study. The sigmoid is given by $\sigma(z) = 1/(1 + e^{-z})$, which yields an output in the range (0, 1) for any input, and $\rho$ denotes the ReLU. The sigmoid activation can be used across all the LSTM gates, where the output values decide whether the data should be propagated (values near 1) or blocked (values near 0). LSTM training involves gradient computation, which suffers from the vanishing gradient problem when the gradients shrink to zero [36]. ReLU activation can handle this issue: gradients are calculated faster and do not vanish as easily [32]. The function of the forget gate is to choose what information to retain and what information to remove from $h_{t-1}$ and $x_t$. Its output is the vector $\Gamma_f^{t}$ (11), which contains values between 0 and 1 that help in eliminating the irrelevant values from the cell state. Then, by applying the sigmoid activation, the input gate selects which entries receive new information, yielding the vector $\Gamma_i^{t}$ (12). The ReLU activation of the input node produces the candidate new values in the vector $\Gamma_g^{t}$ (13). The element-wise product of $\Gamma_i^{t}$ and $\Gamma_g^{t}$, which contains the new values, is added to $\Gamma_f^{t} \odot a_{t-1}$. This provides the updated cell state $a_t$ (15).
After this, a filtered version of the updated cell state $a_t$ is passed on as the new hidden state $h_t$. The values passed to the new hidden state are determined by first passing the updated cell state through the ReLU activation, which yields $\rho(a_t)$. The output gate then applies the sigmoid activation (14) to determine which entries of the updated cell state are retained, resulting in the vector $\Gamma_o^{t}$. Finally, the hidden state $h_t$ is calculated using Equation (16). Algorithm 2 presents the LSTM training process. The training process combines three repeated steps, i.e., forward propagation, backward propagation, and model updates, and continues in order to minimize the training error. The forward propagation forwards the training sample $X$ (where $X \in y$) with batch size $B$ and a learning rate of $\alpha$. The output is then backward propagated using $g$, where $g \in y$. After that, the error, $E$, and learning rate, $\alpha$, are updated accordingly.
Algorithm 2 Training process
Input:
y: bandwidth slice, p: epochs, B: batch size, X: training set, g: testing set, α: learning rate, ϑ̂: initial model
Output:
ϑ: LSTM model, E: forecast error, P: parameters
Process:
begin
for i = 1 to p do
    ϑ ← ForwardPropagate(ϑ̂, X, y, α, B)
    E ← BackwardPropagate(ϑ, g)
    P ← Update(ϑ, E, α)
end for
end
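The gate relationships in Equations (11)–(16) can be sketched as a single forward step of one cell. The code below uses plain Python lists instead of a tensor library and takes ρ to be the ReLU, as stated in the text; the parameter layout (`params[name] = (W_h, W_x, b)`) and all function names are assumptions made for this illustration, not the study's implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    return max(0.0, z)


def affine(W_h, W_x, b, h_prev, x):
    """Row-wise W_h @ h_prev + W_x @ x + b for plain nested lists."""
    return [
        sum(wh * h for wh, h in zip(row_h, h_prev))
        + sum(wx * xi for wx, xi in zip(row_x, x))
        + bias
        for row_h, row_x, bias in zip(W_h, W_x, b)
    ]


def lstm_step(params, h_prev, a_prev, x):
    """One forward step: gates f, i, o use the sigmoid, input node g uses ReLU."""
    gate = {}
    for name, act in (("f", sigmoid), ("i", sigmoid), ("g", relu), ("o", sigmoid)):
        W_h, W_x, b = params[name]
        gate[name] = [act(z) for z in affine(W_h, W_x, b, h_prev, x)]
    # Updated cell state: forget part of the old state, add gated new values.
    a = [f * ap + i * g for f, ap, i, g in zip(gate["f"], a_prev, gate["i"], gate["g"])]
    # Hidden state: output gate applied to the activated cell state.
    h = [o * relu(v) for o, v in zip(gate["o"], a)]
    return h, a
```

With all weights and biases set to zero, every sigmoid gate equals 0.5 and the input node is 0, so the cell state is simply halved at each step, which is a convenient check when porting the equations to another framework.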
Similar to the approach used in [23], hyperparameter selection was conducted through a grid search, as depicted in Table 4. This is due to its reliability and simplicity [42]. Other options for hyperparameter selection include random search, Bayesian optimization, particle swarm optimization (PSO), and genetic algorithm (GA) [42,43,44].

3.5. Dynamic Learning Framework

Due to the evolved dynamic nature of the existing network properties, we proposed a dynamic framework to detect and adapt to any changes in the patterns of the data traffic. Concept change is popularly used in statistics and data stream analysis. The framework, presented in Algorithm 3, consists of a list $S$ of local smoothing algorithms, where $\varphi_i$ denotes a hybrid smoothed algorithm formed by combining a local smoothing algorithm with a trained LSTM neural network. The input of change detection is a bandwidth slice $y$, which is allocated based on the window size $W_j$; $t_i$ is the time step of the bandwidth slice at index $i$, where $t_i \in y$. During the initial stages, the reference $\delta$, which represents the final selected hybrid algorithm, is initialized at the window $W_j$. The current window slides over the data series and captures the next batch of data. After detecting any change, the change detector raises the alarm. If no change occurs, the primary window $W_j$ slides step by step until a change is detected. Here, change refers to a change in the statistical distribution between $W_j$ and $W_{j+1}$, as determined by the Anderson–Darling test. In general, change detectors are used as part of online classifiers to guarantee a quick response to sudden changes. If a change is detected, the existing forecasting algorithm is assumed to be unable to forecast accurately with the new data as input. Hence, a new hybrid forecast algorithm must be trained and put in place to generate a new forecasting model.
A smoothed bandwidth slice $\hat{y}$ is produced in the next window $W_{j+1}$ by applying the local smoothing algorithms in $S$. Then, a new LSTM hybrid model $\varphi_i$ is developed using Algorithm 2. After that, the error function $E$ is calculated as a testing loss function, and the list of $\varphi_i$ is sorted by ascending error. Then, a statistical significance test is performed using the Diebold–Mariano test to verify whether the new $\varphi_i$ is statistically different from the other hybrid algorithms in the list. If the new hybrid algorithm is significantly better, the old model $\delta_{i-1}$ is replaced with the new one. The pseudocode of the change detection is described in Algorithm 3. Figure 5 depicts the block diagram for the proposed framework.
Algorithm 3 Dynamic Learning
Input:
y: bandwidth slice; S: list of local smoothing algorithms; φ_i: hybrid smoothed LSTM algorithm; k: list of hybrid smoothed LSTM algorithms (6 in this case)
Output:
δ: statistically significant smoothed LSTM algorithm; E: forecast error
Process
01: begin
02:   δ ← ∅;
03:   for all time steps t_i ∈ y do
04:     W_j ← W_j ∪ {t_i};
05:     if change is detected = true then // using Anderson–Darling
06:       stop forecasting at W_j
07:       ŷ ← smooth y in W_{j+1} using algorithms in S // Algorithm 3
08:       for φ_i in k
09:         φ_i ← build new hybrid LSTM models (ŷ, ϑ) // Algorithm 2
10:         E ← calculate forecast error of t_i in W_{j+2} using φ_i
11:         k ← k ∪ φ_i sorted with Min(E)
12:       δ ← find in k the significant φ_i with Min(E)
13:       if δ is significantly better than δ_{i−1} (old, existing forecast algorithm) then
14:         replace δ_{i−1} by δ else
15:         keep δ
16:       endif
17:     endif
18: loop
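Steps 10 to 15 of Algorithm 3 (rank the candidates by forecast error, then replace the existing model only if the best candidate is significantly better) might look like the sketch below. It assumes a one-step-ahead Diebold–Mariano test with squared-error loss and a normal approximation for the p-value; the function names, the 0.05 significance level, and the toy forecasts are illustrative assumptions, and the forecasts would in practice come from the trained hybrid LSTM models.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def rmse(actual, forecast):
    """Root mean square error between two equally long sequences."""
    return sqrt(mean((a - f) ** 2 for a, f in zip(actual, forecast)))


def diebold_mariano(actual, forecast_old, forecast_new):
    """One-step-ahead DM test with squared-error loss; returns (stat, p_value)."""
    d = [(a - fo) ** 2 - (a - fn) ** 2
         for a, fo, fn in zip(actual, forecast_old, forecast_new)]
    d_bar, d_var = mean(d), variance(d)
    if d_var == 0:
        # Degenerate case: identical loss differential at every step.
        if d_bar == 0:
            return 0.0, 1.0
        return (float("inf") if d_bar > 0 else float("-inf")), 0.0
    stat = d_bar / sqrt(d_var / len(d))
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(stat)))
    return stat, p_value


def select_model(actual, incumbent, candidates, alpha=0.05):
    """Return the name of the model to keep: 'incumbent' or a candidate name."""
    # Steps 10-12: rank candidates by forecast error and take the best one.
    best = min(candidates, key=lambda name: rmse(actual, candidates[name]))
    # Steps 13-15: replace only if the best candidate is significantly better.
    stat, p_value = diebold_mariano(actual, incumbent, candidates[best])
    if stat > 0 and p_value < alpha:
        return best
    return "incumbent"


# Toy forecasts (hypothetical numbers, used only to exercise the functions).
actual = [10.0 + (i % 5) for i in range(50)]
incumbent = [a + 2.0 + 0.5 * (i % 3) for i, a in enumerate(actual)]
good = [a + (0.5 if i % 2 == 0 else -0.5) for i, a in enumerate(actual)]
marginal = [a + (1.0 if i % 2 == 0 else 3.4) for i, a in enumerate(actual)]
chosen = select_model(actual, incumbent, {"MLSTM": good, "LLSTM": marginal})
```

Requiring both a lower loss and a significant test result keeps the framework from swapping models on differences that could be due to chance.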

4. Results and Discussion

In this study, the stationarity of the time series was confirmed using the augmented Dickey–Fuller (ADF) test, because nonstationary series can yield misleading results, as observed in earlier studies [24,27], even though LSTM can be used to model a nonstationary time series. Moreover, we normalized the time series variances using the Box–Cox power transformation. Figure 6 presents the bandwidth utilization under the MA smoothing technique: Figure 6a depicts the MPLS bandwidth utilization without smoothing, Figure 6b shows the effect of MA smoothing with q = 0.003, and Figure 6c shows the effect of MA smoothing with q = 0.05.
It is evident from Figure 6c that a higher q value may remove the main features of the time series. In today's data-centric world, losing even a small amount of data can cause a violation of the service-level agreement and lead to inefficient resource planning and utilization. Therefore, q must be selected according to Algorithm 1. Table 5 shows that the moving average produced the highest MSE, while LOWESS yielded the second largest MSE, most likely because a first-degree (linear) local polynomial does not fit the nonlinear bandwidth slice adequately. LOESS, which fits a quadratic local polynomial, produced a smaller MSE owing to the nonlinearity of the second-order local fitting models. The Savitzky–Golay filter, which also uses a second-degree polynomial, produced a smaller MSE than LOESS, whose weights are strongly influenced by q, as shown in Equation (6). Lastly, RLOESS and RLOWESS performed similarly, yielding the lowest MSE values, as shown in Table 5.
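For readers unfamiliar with the Box–Cox power transformation, the sketch below applies it for a given λ and inverts it again. The value λ = 0.5 and the sample values are arbitrary illustrations, not parameters from this study; libraries such as SciPy (`scipy.stats.boxcox`) can also estimate λ from the data, which is not shown here.

```python
from math import exp, log


def box_cox(values, lam):
    """Box-Cox power transform of strictly positive values for a given lambda."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def inverse_box_cox(values, lam):
    """Map transformed values back to the original scale."""
    if lam == 0:
        return [exp(v) for v in values]
    return [(lam * v + 1.0) ** (1.0 / lam) for v in values]


# lam = 0.5 is an arbitrary illustrative choice, not a value from the study.
utilization = [1.0, 4.0, 9.0, 16.0]
transformed = box_cox(utilization, lam=0.5)
restored = inverse_box_cox(transformed, lam=0.5)
```

Forecasts produced on the transformed scale need the inverse transform before they are compared with the original bandwidth values.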
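The trade-off between the span q and the information lost through smoothing can be made concrete with a small moving-average sketch. It assumes that q is the fraction of the series length used as the window span, that the span is rounded up and forced to be odd, and that the window shrinks near the edges; these conventions and the toy data are assumptions for illustration and may differ from the implementation behind Table 5.

```python
from math import ceil


def moving_average(y, q):
    """Centered moving average whose span is the fraction q of the series length."""
    n = len(y)
    span = max(1, ceil(q * n))
    if span % 2 == 0:
        span += 1  # assumption: force an odd span so the window is centered
    half = span // 2
    smoothed = []
    for i in range(n):
        # Shrink the window symmetrically near the edges of the series.
        h = min(half, i, n - 1 - i)
        window = y[i - h:i + h + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def mse(original, smoothed):
    """Mean squared difference between the original and smoothed series."""
    return sum((a - b) ** 2 for a, b in zip(original, smoothed)) / len(original)


# Toy slice with a single spike; q = 0.6 gives a span of 3 samples here.
slice_y = [0.0, 0.0, 3.0, 0.0, 0.0]
smoothed_y = moving_average(slice_y, q=0.6)
loss = mse(slice_y, smoothed_y)
```

In this toy case the spike is spread over its neighbours, so the MSE between the original and smoothed slice directly measures how much of that feature was removed.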
Table 6 shows the performance of combining local smoothing with LSTM, as introduced in Algorithm 3. The main objective of this study was to improve the forecast accuracy as part of better network resource allocation; therefore, the RMSE was selected as the performance metric. The tables also report the effect of the proposed methodology on the training and computational time for every bandwidth slice. The improved results are ranked (in brackets) according to the best combination of algorithms in terms of training RMSE, training time, and testing RMSE for the 350-time-step forecast. The results of an earlier study [23] were used as the performance benchmark.
Table 6 and Table 7 show that the hybrid moving average and LSTM (MLSTM) technique achieved the best training and testing RMSE for the LTE and MPLS backbone bandwidth traffic, although it may require more computation time, as in the LTE and upstream training phases. For the upstream traffic, the hybrid RLOWESS and LSTM (RLWLSTM) performed best. Although the MLSTM ranking scores were consistently high, with an average ranking of 1.5, its performance diverged between bandwidth slices that have different traffic profiles and statistical distributions.
This is further supported by the results presented in Table 3, where LTE and MPLS exhibit a similar Johnson SB distribution, while the upstream traffic follows the generalized extreme value distribution. Thus, the data reshaping that results from the smoothing process can improve the LSTM forecast accuracy, provided that the smoothing removes as little information as possible from the original data. However, some hybrid algorithms showed no significant improvement in processing time, and some incurred extra processing time. This drawback can be compensated for by increasing the computational power of the processing CPU/GPU or routing engines. Figure 7, Figure 8 and Figure 9 show the improvement and degradation in forecasting accuracy and computational time, as percentages relative to the original traffic and the results of the earlier study [23]. Figure 7a shows the training accuracy improvement for the LTE traffic. MLSTM improved the training accuracy by ≈29%, while the other algorithms performed worse; however, this enhancement required 7% extra computation time. In Figure 7c, MLSTM improved the accuracy of the 350-time-step forecast by 22% and the processing time by ≈5%, whereas LLSTM achieved the largest computational time gain at the cost of degraded training and testing performance.
For the MPLS backbone traffic profile, all hybrid algorithms outperformed the original profile [18] during the training and testing phases by almost 20% on average, with the largest gains, nearly 35%, achieved by MLSTM and LWLSTM, as depicted in Figure 8. In addition, the computational processing times for MLSTM, LLSTM, and RLLSTM improved during the training and testing phases, as presented in Figure 8.
For the upstream traffic, the training RMSE improved significantly for all algorithms except RLLSTM. Furthermore, all the algorithms except LLSTM showed better computation times during the training phase (≈8%). In addition, RLWLSTM performed 50% better during the testing phase than the other algorithms. Finally, regarding the computational time, only RLWLSTM showed a 4% lower performance, as depicted in Figure 9.
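The improvement percentages discussed above can be related to the tabulated RMSE values with a one-line calculation. The sketch assumes that improvement is defined as the relative RMSE reduction against the original LSTM benchmark; the formula is an assumption, although it is consistent with the roughly 28%, 35%, and 32% gains reported for MLSTM. The RMSE values are read from Table 6.

```python
def improvement_percent(baseline_rmse, new_rmse):
    """Relative RMSE reduction in percent (positive means the new model is better)."""
    return (baseline_rmse - new_rmse) / baseline_rmse * 100.0


# RMSE values as read from Table 6 (original LSTM benchmark vs. MLSTM).
lte_train = improvement_percent(8062987, 5783686)
mpls_train = improvement_percent(5400982, 3486200)
mpls_test = improvement_percent(4982949, 3380349)
```

A negative result from the same function corresponds to the degradation cases shown in Figures 7 to 9.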
These results show that the proposed algorithms can improve the forecasting performance. However, the performance depends on the data series (bandwidth slice) and its statistical properties. Therefore, the dynamic learning framework presented in Figure 5 and Algorithm 3 is used to detect changes in the data distribution, so that new hybrid algorithms can be introduced to replace the old forecasting algorithm. Table 8 compares the forecasting RMSE with and without the proposed dynamic learning framework.
Table 8 lists the algorithms obtained from the results in Table 7. The proposed framework detected the changes in the statistical distribution of the slices and provided new hybrid algorithms. The statistical distribution of each slice was obtained from Table 3 (referred to as the actual statistical distribution in Table 8). When each slice was tested against a different distribution (referred to as the new statistical distribution in Table 8) that had already been observed in the other slices, the forecast for 350 time steps improved by 94% for LTE, 100% for MPLS, and 100% for the upstream slice.

5. Conclusions

This study used hybrid local smoothing and LSTM modeling approaches to forecast bandwidth slice utilization. Six local smoothing techniques were studied: LOWESS, LOESS, moving average, Savitzky–Golay, RLOESS, and RLOWESS. The resultant algorithms, i.e., MLSTM, LLSTM, LWLSTM, SLSTM, RLWLSTM, and RLLSTM, indicated that the hybrid LSTM can improve the forecasting accuracy. However, the improvement can be accompanied by additional computational overhead, and the results may vary depending on the underlying statistical properties of the tested data series. Therefore, the researchers incorporated a dynamic framework that detects distribution changes and provides a new hybrid algorithm. The results were verified by statistical significance tests and compared with previous studies. The researchers believe the proposed technique can be used to forecast 4G/5G-and-beyond traffic for reliable slice resource management. Furthermore, these results can be extended and applied in automatic resource allocation algorithms as part of the slice allocator or orchestrator in 5G networks and beyond.

Author Contributions

Conceptualization, M.K.H.; methodology, M.K.H.; software, M.K.H.; validation, M.K.H.; formal analysis, S.H.S.A.; investigation, S.H.S.A. and M.H. (Monia Hamdi); resources, S.H.S.A. and M.H. (Mosab Hamdan); writing—original draft preparation, M.K.H. and M.H. (Mutaz Hamad); writing—review and editing, M.K.H.; visualization, M.K.H.; supervision, S.H.S.A. and N.E.G.; project administration, H.H. and S.K.; funding acquisition, M.H. (Monia Hamdi). All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Princess Nourah bint Abdulrahman University Researchers Supporting Project, number PNURSP2022R125, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Acknowledgments

This research work is part of the INCT InterSCity project. The authors would like to thank the Fundacao de Amparo a Pesquisa do Estado de Sao Paulo (FAPESP) Brazil for providing financial support under process no. 2021/10234-5.

Conflicts of Interest

The authors declare no conflict of interest.

Abbreviations

| Abbreviation | Definition |
|---|---|
| V2X | Vehicle-to-everything |
| QoS | Quality of service |
| LSTM | Long short-term memory |
| MLSTM | Hybrid moving average long short-term memory |
| MPLS | Multiprotocol label switching |
| RLWLSTM | Hybrid robust locally weighted scatter plot smoothing and LSTM |
| SLA | Service-level agreement |
| AI | Artificial intelligence |
| ML | Machine learning |
| IoT | Internet of things |
| LRD | Long-range dependency |
| ANN | Artificial neural network |
| ARIMA | Autoregressive integrated moving average |
| RNN | Recurrent neural network |
| NN | Neural network |
| SVM | Support vector machine |
| DT | Decision tree |
| DNN | Deep neural network |
| AMF | Access and mobility management function |
| SVR | Support vector regression |
| DeepTP | Deep traffic predictor |
| ISP | Internet service provider |
| EMD | Empirical mode decomposition |
| IMF | Intrinsic mode function |
| RMSE | Root mean square error |
| VANET | Vehicular ad hoc network |
| MAPE | Mean absolute percentage error |
| SSVM | Smoothed support vector machine |
| eMBB | Enhanced mobile broadband |
| B5G | Beyond 5G |
| HHT | Hilbert–Huang transform |
| RAM | Random access memory |
| PDN-GW | Packet data network gateway |
| EPC | Evolved packet core |
| VRF | Virtual routing function |
| PDF | Probability density function |
| LOESS | Locally estimated scatterplot smoothing |
| LOWESS | Locally weighted scatterplot smoothing |
| MAD | Mean absolute deviation |
| RLOWESS | Robust locally weighted scatterplot smoothing |
| RLOESS | Robust locally estimated scatterplot smoothing |
| MA | Moving average |
| SG | Savitzky–Golay |
| FIR | Finite impulse response |
| MSE | Mean square error |
| AD | Anderson–Darling |
| ReLU | Rectified linear unit |
| ADF | Augmented Dickey–Fuller |
| LLSTM | Hybrid LOESS and LSTM |
| LWLSTM | Hybrid LOWESS and LSTM |
| SLSTM | Hybrid Savitzky–Golay and LSTM |
| RLLSTM | Hybrid RLOESS and LSTM |

References

  1. Boutaba, R.; Salahuddin, M.A.; Limam, N.; Ayoubi, S.; Shahriar, N.; Estrada-Solano, F.; Caicedo, O.M. A comprehensive survey on machine learning for networking: Evolution, applications and research opportunities. J. Internet Serv. Appl. 2018, 9, 16.
  2. Binjubeir, M.; Ahmed, A.A.; Ismail, M.A.B.; Sadiq, A.S.; Khan, M.K. Comprehensive survey on big data privacy protection. IEEE Access 2019, 8, 20067–20079.
  3. Aldhyani, T.H.; Joshi, M.R. Enhancement of Single Moving Average Time Series Model Using Rough k-Means for Prediction of Network Traffic. Available online: http://www.ijera.com/papers/Vol7_issue3/Part-6/I0703064551.pdf (accessed on 1 January 2022).
  4. Cortez, P.; Rio, M.; Rocha, M.; Sousa, P. Internet traffic forecasting using neural networks. In Proceedings of the 2006 IEEE International Joint Conference on Neural Network, Vancouver, BC, Canada, 16–21 July 2006; pp. 2635–2642.
  5. Khairi, M.H.H.; Ariffin, S.H.S.; Latiff, N.M.A.A.; Yusof, K.M.; Hassan, M.K.; Al-Dhief, F.T.; Hamdan, M.; Khan, S.; Hamzah, M. Detection and Classification of Conflict Flows in SDN Using Machine Learning Algorithms. IEEE Access 2021, 9, 76024–76037.
  6. Ghafoor, K.Z.; Kong, L.; Rawat, D.B.; Hosseini, E.; Sadiq, A.S. Quality of service aware routing protocol in software-defined internet of vehicles. IEEE Internet Things J. 2018, 6, 2817–2828.
  7. Hassan, M.K.; Ariffin, S.H.S.; Yusof, S.K.S.; Ghazali, N.E.; Kanona, M.E. Analysis of hybrid non-linear autoregressive neural network and local smoothing technique for bandwidth slice forecast. Telkomnika 2021, 19, 1078–1089.
  8. Alauthman, M.; Aslam, N.; Al-Kasassbeh, M.; Khan, S.; Al-Qerem, A.; Choo, K.-K.R. An efficient reinforcement learning-based Botnet detection approach. J. Netw. Comput. Appl. 2020, 150, 102479.
  9. Sadiq, A.S.; Tahir, M.A.; Ahmed, A.A.; Alghushami, A. Normal parameter reduction algorithm in soft set based on hybrid binary particle swarm and biogeography optimizer. Neural Comput. Appl. 2020, 32, 12221–12239.
  10. Chabaa, S.; Zeroual, A.; Antari, J. Identification and prediction of internet traffic using artificial neural networks. J. Intell. Learn. Syst. Appl. 2010, 2, 147.
  11. Zhu, Y.; Zhang, G.; Qiu, J. Network Traffic Prediction based on Particle Swarm BP Neural Network. J. Netw. 2013, 8, 2685–2691.
  12. Li, Y.; Liu, H.; Yang, W.; Hu, D.; Wang, X.; Xu, W. Predicting inter-data-center network traffic using elephant flow and sublink information. IEEE Trans. Netw. Serv. Manag. 2016, 13, 782–792.
  13. Hassan, M.; Babiker, A.; Amien, M.; Hamad, M. SLA management for virtual machine live migration using machine learning with modified kernel and statistical approach. Eng. Technol. Appl. Sci. Res. 2018, 8, 2459–2463.
  14. Li, X.; Li, S.; Zhou, P.; Chen, G. Forecasting Network Interface Flow Using a Broad Learning System Based on the Sparrow Search Algorithm. Entropy 2022, 24, 478.
  15. Singh, S.K.; Salim, M.M.; Cha, J.; Pan, Y.; Park, J.H. Machine learning-based network sub-slicing framework in a sustainable 5G environment. Sustainability 2020, 12, 6250.
  16. Chen, Z.; Wen, J.; Geng, Y. Predicting future traffic using hidden Markov models. In Proceedings of the 2016 IEEE 24th International Conference on Network Protocols (ICNP), Singapore, 8–11 November 2016; pp. 1–6.
  17. Yoo, W.; Sim, A. Time-series forecast modeling on high-bandwidth network measurements. J. Grid Comput. 2016, 14, 463–476.
  18. Afolabi, D.; Guan, S.-U.; Man, K.L.; Wong, P.W.; Zhao, X. Hierarchical meta-learning in time series forecasting for improved interference-less machine learning. Symmetry 2017, 9, 283.
  19. Yao, W.; Khan, F.; Jan, M.A.; Shah, N.; Yahya, A. Artificial intelligence-based load optimization in cognitive Internet of Things. Neural Comput. Appl. 2020, 32, 16179–16189.
  20. Wang, J.; Li, R.; Wang, J.; Ge, Y.-Q.; Zhang, Q.-F.; Shi, W.-X. Artificial intelligence and wireless communications. Front. Inf. Technol. Electron. Eng. 2020, 21, 1413–1425.
  21. Dalgkitsis, A.; Louta, M.; Karetsos, G.T. Traffic forecasting in cellular networks using the LSTM RNN. In Proceedings of the 22nd Pan-Hellenic Conference on Informatics, Athens, Greece, 29 November–1 December 2018; pp. 28–33.
  22. Song, H.; Zhao, D.; Yuan, C. Network Security Situation Prediction of Improved Lanchester Equation Based on Time Action Factor. Mob. Netw. Appl. 2021, 26, 1008–1023.
  23. Alawe, I.; Ksentini, A.; Hadjadj-Aoul, Y.; Bertin, P. Improving traffic forecasting for 5G core network scalability: A machine learning approach. IEEE Netw. 2018, 32, 42–49.
  24. Yang, H.-J.; Hu, X. Wavelet neural network with improved genetic algorithm for traffic flow time series prediction. Optik 2016, 127, 8103–8110.
  25. Zhou, J.; Zhao, W.; Chen, S. Dynamic network slice scaling assisted by prediction in 5G network. IEEE Access 2020, 8, 133700–133712.
  26. Nihale, S.; Sharma, S.; Parashar, L.; Singh, U. Network traffic prediction using long short-term memory. In Proceedings of the 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 2–4 July 2020; pp. 338–343.
  27. Sepasgozar, S.S.; Pierre, S. An Intelligent Network Traffic Prediction Model Considering Road Traffic Parameters Using Artificial Intelligence Methods in VANET. IEEE Access 2022, 10, 8227–8242.
  28. You, J.; Xue, H.; Gao, L.; Zhang, G.; Zhuo, Y.; Wang, J. Predicting the online performance of video service providers on the internet. Multimed. Tools Appl. 2017, 76, 19017–19038.
  29. Zhao, W.; Yang, H.; Li, J.; Shang, L.; Hu, L.; Fu, Q. Network traffic prediction in network security based on EMD and LSTM. In Proceedings of the 9th International Conference on Computer Engineering and Networks, Changsha, China, 18 October 2019; pp. 509–518.
  30. Abdellah, A.R.; Koucheryavy, A. VANET traffic prediction using LSTM with deep neural network learning. In Internet of Things, Smart Spaces, and Next Generation Networks and Systems; Springer: Berlin/Heidelberg, Germany, 2020; pp. 281–294.
  31. Li, Y.; Wang, J.; Sun, X.; Li, Z.; Liu, M.; Gui, G. Smoothing-aided support vector machine based nonstationary video traffic prediction towards B5G networks. IEEE Trans. Veh. Technol. 2020, 69, 7493–7502.
  32. Mahajan, S.; HariKrishnan, R.; Kotecha, K. Prediction of Network Traffic in Wireless Mesh Networks using Hybrid Deep Learning Model. IEEE Access 2022, 10, 7003–7015.
  33. Ye, J.; Hua, M.; Zhu, F. Machine learning algorithms are superior to conventional regression models in predicting risk stratification of COVID-19 patients. Risk Manag. Healthc. Policy 2021, 14, 3159.
  34. Siriwardhana, Y.; De Alwis, C.; Gür, G.; Ylianttila, M.; Liyanage, M. The fight against the COVID-19 pandemic with 5G technologies. IEEE Eng. Manag. Rev. 2020, 48, 72–84.
  35. Singh, U.; Determe, J.-F.; Horlin, F.; De Doncker, P. Crowd forecasting based on wifi sensors and lstm neural networks. IEEE Trans. Instrum. Meas. 2020, 69, 6121–6131.
  36. Cleveland, W.S. Robust locally weighted regression and smoothing scatterplots. J. Am. Stat. Assoc. 1979, 74, 829–836.
  37. Beloglazov, A.; Abawajy, J.; Buyya, R. Energy-aware resource allocation heuristics for efficient management of data centers for cloud computing. Future Gener. Comput. Syst. 2012, 28, 755–768.
  38. Anwar, S.M.; Aslam, M.; Zaman, B.; Riaz, M. An enhanced double homogeneously weighted moving average control chart to monitor process location with application in automobile field. Qual. Reliab. Eng. Int. 2022, 38, 174–194.
  39. Raudys, A.; Pabarškaitė, Ž. Optimizing the smoothness and accuracy of moving average for stock price data. Technol. Econ. Dev. Econ. 2018, 24, 984–1003.
  40. Schafer, R.W. On the frequency-domain properties of Savitzky-Golay filters. In Proceedings of the 2011 Digital Signal Processing and Signal Processing Education Meeting (DSP/SPE), Sedona, AZ, USA, 4–7 January 2011; pp. 54–59.
  41. Aslam, M.; Algarni, A. Analyzing the Solar Energy Data Using a New Anderson-Darling Test under Indeterminacy. Int. J. Photoenergy 2020, 2020, 6662389.
  42. Hochreiter, S. The vanishing gradient problem during learning recurrent neural nets and problem solutions. Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 1998, 6, 107–116.
  43. Heravi, E.J.; Aghdam, H.H.; Puig, D. Classification of Foods Using Spatial Pyramid Convolutional Neural Network. Pattern Recognit. Lett. 2018, 105, 50–58.
  44. Elgeldawi, E.; Sayed, A.; Galal, A.R.; Zaki, A.M. Hyperparameter Tuning for Machine Learning Algorithms Used for Arabic Sentiment Analysis. Informatics 2021, 8, 79.
Figure 1. Conceptual framework.
Figure 2. Backbone topology.
Figure 3. Backbone bandwidth slices: (a) LTE, (b) MPLS, and (c) upstream traffic.
Figure 4. Dataset histograms: (a) LTE, (b) MPLS, and (c) upstream traffic.
Figure 5. Dynamic learning framework.
Figure 6. Bandwidth utilization using moving average: (a) original MPLS slice; (b) MPLS slice smoothed with q = 0.003; and (c) MPLS slice smoothed with q = 0.05.
Figure 7. Improvement percentages for LTE: (a) training RMSE, (b) training time, (c) 350 time steps’ testing, and (d) 350 time steps’ testing time.
Figure 8. Improvement percentages for MPLS traffic: (a) training RMSE, (b) training time, (c) 350 time steps’ testing, and (d) 350 time steps’ testing time.
Figure 9. Improvement percentages for upstream: (a) training RMSE, (b) training time, (c) 350 time steps’ testing, and (d) 350 time steps’ testing time.
Table 1. Related work summary.

| Ref | ML Technique | Application (Approach) | Dataset | Noise Preprocessing | Dynamic Learning |
|---|---|---|---|---|---|
| [28] | NN, DT, and SVM | Forecast and performance assessment of video over the internet | Internet trace (14-day and 10-day datasets) | No | No |
| [24] | Back propagation NN | Improvement of network forecasting accuracy | Four days of network traffic | Yes (wavelet) | No |
| [23] | LSTM | To predict the number of AMFs in 5G core | Control traffic | No | No |
| [21] | LSTM | To forecast cellular traffic | 4G traffic utilization data collected for 122 days | No | No |
| [22] | LSTM | To forecast (<30 s) tier 1 ISP traffic | Tier 1 ISP traffic variable, hourly, daily, 5 min | Yes (EMD) | No |
| [29] | LSTM | To forecast network traffic | Network traffic | Yes (EMD) | No |
| [27,30] | LSTM | To forecast V2V traffic | V2V traffic | No | No |
| [31] | SVM | To forecast video traffic | Video traffic | Yes (Gaussian smoothing, moving average, and Savitzky–Golay filters) | No |
| [32] | LSTM and convolutional neural network | To forecast wireless network traffic | Wireless traffic | No | No |
Table 2. Slice description.

| No | Bandwidth Slice | Description |
|---|---|---|
| 1 | LTE | Represents the aggregated backbone bandwidth traffic for 4G-LTE |
| 2 | MPLS | Represents the aggregated backbone traffic for corporate data centers |
| 3 | Upstream traffic | Represents the aggregated backbone traffic to the tier 1 internet service provider |
Table 3. Descriptive statistics.

| Slice | Sample Size | Range | Mean | Variance | Std. Deviation | Skewness | Min | 10% | 25% (Q1) | 50% (Median) | 75% (Q3) | Distribution |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LTE | 701 | 9.9 × 10^8 | 9.8 × 10^8 | 4.5 × 10^16 | 2.1 × 10^8 | −0.505 | 4.58 × 10^8 | 6.50 × 10^8 | 8.2 × 10^8 | 1.0 × 10^9 | 1.13 × 10^9 | Johnson SB |
| MPLS | 701 | 6.1 × 10^8 | 4.4 × 10^8 | 2.3 × 10^16 | 1.5 × 10^8 | - | 1.80 × 10^8 | 2.43 × 10^8 | 3.1 × 10^8 | 4.3 × 10^8 | 5.48 × 10^8 | Johnson SB |
| Upstream | 701 | 4.75 × 10^9 | 5.1 × 10^9 | 9.76 × 10^17 | 9.8 × 10^8 | −0.463 | 2.67 × 10^9 | 3.57 × 10^9 | 4.4 × 10^9 | 5.3 × 10^8 | 5.79 × 10^9 | Gen. extreme value |
Table 4. LSTM hyperparameters.

| Parameter | Value |
|---|---|
| Library (Python) | Tensorflow, Keras, NumPy, Sklearn |
| Batch size | 1 |
| Epochs | 20 |
| Optimizer/learning rate | ADAM |
| Loss function | RMSE |
| Neurons | 2 |
| Hidden layer | 1 |
| Activation function | ReLU |
Table 5. Smoothing MSE.

| Smoothing Technique | LTE MSE | MPLS MSE | Upstream MSE |
|---|---|---|---|
| Moving average | 2.41 × 10^7 | 4.77 × 10^7 | 7.89 × 10^7 |
| LOWESS | 2.0785 × 10^7 | 2.65 × 10^7 | 6.79 × 10^7 |
| LOESS | 6.40 × 10^4 | 1.40 × 10^7 | 1.10 × 10^5 |
| SGolay | 1.0133 × 10^−8 | 2.17 × 10^−9 | 2.25 × 10^−8 |
| RLOWESS | 1.7030 × 10^−10 | 1.70 × 10^−10 | 1.03 × 10^8 |
| RLOESS | 1.7030 × 10^−10 | 1.70 × 10^−10 | 7.03 × 10^7 |
Table 6. Performance of combining local smoothing and LSTM. Ranks are given in brackets.

| Slice | Smoothing Technique | Training RMSE | Training Time (s) | Testing RMSE for 350 Time Steps | Testing Time for 350 Time Steps (s) |
|---|---|---|---|---|---|
| LTE | Original [23] | 8062987 | 12.933 | 7395539 | 0.00560 |
| LTE | MLSTM | 5783686 (1) | 13.872 | 56527763 (1) | 0.00531 (3) |
| LTE | LLSTM | 9092992 | 14.004 | 8644144 | 0.0045 (1) |
| LTE | LWLSTM | 8065674 | 9.785 (1) | 7391435 (2) | 0.0051 (2) |
| LTE | SLSTM | 9075783 | 13.725 | 8655062 | 0.0061 |
| LTE | RLWLSTM | 9067949 | 10.120 (2) | 8635038 | 0.0054 (5) |
| LTE | RLLSTM | 9112072 | 13.237 | 8640586 | 0.00535 (4) |
| MPLS | Original [23] | 5400982 | 7.365 | 4982949 | 0.0059 |
| MPLS | MLSTM | 3486200 (1) | 7.221 (3) | 3380349 (1) | 0.0049 (3) |
| MPLS | LLSTM | 4205124 (3) | 6.849 (1) | 4022174 (3) | 0.0039 (1) |
| MPLS | LWLSTM | 3496462 (2) | 8.5821 | 3411503 (2) | 0.0059 (5) |
| MPLS | SLSTM | 4286874 (6) | 7.588 | 4071104 (6) | 0.0049 (3) |
| MPLS | RLWLSTM | 4250014 (5) | 6.979 (2) | 4045647 (4) | 0.0040 (2) |
| MPLS | RLLSTM | 4232122 (4) | 13.995 | 4045664 (5) | 0.0058 (4) |
Table 7. Performance of combining local smoothing and LSTM. Ranks are given in brackets.

| Slice | Smoothing Technique | Training RMSE | Training Time (s) | Testing RMSE for 350 Time Steps | Testing Time for 350 Time Steps (s) |
|---|---|---|---|---|---|
| Upstream | Original [23] | 4056533 | 12.367 | 3836587 | 0.005761 |
| Upstream | MLSTM | 3403134 (2) | 13.896 | 3810110 (3) | 0.00530 (3) |
| Upstream | LLSTM | 3976124 (4) | 11.346 (1) | 3792150 (2) | 0.00524 (2) |
| Upstream | LWLSTM | 3411313 (3) | 14.517 | 3889609 (5) | 0.00576 (5) |
| Upstream | SLSTM | 3997700 (5) | 13.687 | 3819971 (4) | 0.00521 (1) |
| Upstream | RLWLSTM | 2253007 (1) | 13.126 | 1989996 (1) | 0.006011 |
| Upstream | RLLSTM | 4946024 | 12.786 | 4122146 | 0.00544 (4) |
Table 8. Performance of dynamic learning.

| Slice | Technique | Actual Statistical Distribution | New Statistical Distribution | Without Dynamic Learning Framework (RMSE) | With Dynamic Learning Framework (RMSE) |
|---|---|---|---|---|---|
| LTE | MLSTM | Johnson SB | Gen. gamma (4P) | 962141749 | 56527763 (94%) |
| MPLS | MLSTM | Johnson SB | Gen. extreme value | 4752069825 | 3380349 (100%) |
| Upstream | RLWLSTM | Gen. extreme value | Gen. gamma (4P) | 5157991293 | 1989996 (100%) |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Hassan, M.K.; Syed Ariffin, S.H.; Ghazali, N.E.; Hamad, M.; Hamdan, M.; Hamdi, M.; Hamam, H.; Khan, S. Dynamic Learning Framework for Smooth-Aided Machine-Learning-Based Backbone Traffic Forecasts. Sensors 2022, 22, 3592. https://doi.org/10.3390/s22093592
