Addressing the Effectiveness of DDoS-Attack Detection Methods Based on the Clustering Method Using an Ensemble Method

Zeinalpour, Alireza; Ahmed, Hassan A.

doi:10.3390/electronics11172736



by Alireza Zeinalpour * and Hassan A. Ahmed

Department of Information Systems, Monte Ahuja College of Business, Cleveland State University, Cleveland, OH 44114, USA

* Author to whom correspondence should be addressed.

Electronics 2022, 11(17), 2736; https://doi.org/10.3390/electronics11172736

Submission received: 12 August 2022 / Revised: 26 August 2022 / Accepted: 29 August 2022 / Published: 31 August 2022


Abstract:

The curse of dimensionality, caused by the large number of network-traffic attributes, has a negative impact on machine learning algorithms that detect distributed denial of service (DDoS) attacks. This study investigated whether adding the filter and wrapper methods prior to combined clustering algorithms using the Vote classifier method was effective in lowering the false-positive rates of DDoS-attack detection methods. We examined this process to address the curse of dimensionality of machine learning algorithms in detecting DDoS attacks. The one-way-ANOVA statistical analyses did not show a statistically significant difference among the methods; however, the mean differences showed that incorporating the wrapper method performed better than the filter and clustering methods. IT professionals aim to incorporate effective DDoS-attack detection methods. Therefore, the contribution of this study is to show that incorporating the wrapper method is the most suitable option for organizations to detect attacks. Subsequently, IT professionals could incorporate the DDoS-attack detection methods that, in this study, produced the lowest false-positive rate (0.012) in comparison with all the other mentioned studies.

Keywords:

DDoS attack; detection; filter method; wrapper method; clustering algorithms

1. Introduction

DDoS-attack recognition methods that use machine learning algorithms classify DDoS attacks [1]. DDoS attacks result in service interruptions, creating major issues for organizations and agencies. According to Idhammad et al. [2], unsupervised and supervised learning are two approaches for detecting these attacks. Machine learning algorithms suffer from the curse of dimensionality [3], which arises when redundant features exist in a set of data [4]. Selecting appropriate data using feature-selection methods can improve performance [5]. We conducted one-way-ANOVA statistical analyses to assess the significance of the difference in the effectiveness of DDoS-attack detection methods that apply the filter and wrapper methods prior to the combined clustering algorithms using the Vote classifier method.

This research study builds on research conducted by Zeinalpour [1] to address the high false-positive rates of DDoS-attack detection methods by assessing whether the addition of the filter and wrapper methods prior to the clustering method is effective. The curse of dimensionality lowers the effectiveness of DDoS-attack detection methods that apply unsupervised learning algorithms [2]. Zeinalpour [1] could not verify the effectiveness of DDoS-attack detection methods when he added the filter and wrapper methods prior to each clustering algorithm. In that study, a clustering algorithm tried to form two clusters for the BENIGN and DDoS labels, representing normal network-traffic data and attacks, using the CICIDS2017 dataset to identify effective DDoS-attack detection methods. In this study, we investigated whether adding the filter and wrapper methods prior to the combined clustering algorithms using the Vote classifier method is effective in lowering the false-positive rates of DDoS-attack detection methods. The major goal of IT professionals is to incorporate effective DDoS-attack detection methods for recognizing attacks. Although the ANOVA statistical analyses did not show a statistically significant difference, the mean differences indicated that the wrapper method is the most suitable option for organizations in terms of recognizing attacks.

Problem Statement

DDoS-attack detection methods that are based on the implementation of unsupervised machine learning algorithms have high false-positive rates [6]. There were 2.9 million DDoS attacks during the first quarter of 2021, a 31% increase in attack launches over the first quarter of 2020 [7]. DDoS recognition methods that apply clustering algorithms are not effective because they generate high false-positive rates [1]. The security teams of organizations do not know whether adding the filter and wrapper methods prior to combined clustering algorithms using the Vote classifier method is effective in lowering the false-positive rates of DDoS-attack detection methods.

2. Materials and Methods

We extended the study by Zeinalpour [1] to determine whether adding the filter and wrapper methods prior to combined clustering algorithms using the Vote classifier method was effective in lowering the false-positive rates of DDoS-attack detection methods. We used an ex post facto design of an A-B-A-BC single group. This design enabled us to compare the combined clustering algorithms across the A, B, and BC phases. The A phase did not involve any intervention, the B phase involved the intervention of the filter method, and the BC phase combined the filter method with the second intervention of the wrapper method. We used the CICIDS2017 network-traffic dataset. The independent variables were the filter, wrapper, and combined clustering algorithms. The dependent variable was the false-positive rate of DDoS-attack detection methods. We used the Vote classifier method to combine the clustering algorithms that Zeinalpour [1] considered in his study. Clustering algorithms are effective in identifying abnormal events [8]. The Vote classifier is a supervised ensemble method that requires pre-labeled data for learning. An ensemble method provides the capability to integrate learning algorithms to increase effectiveness [9]. We also used the same filter and wrapper procedures that Zeinalpour [1] used. The capability of choosing features is significant for intrusion-detection systems [10]. This study may contribute to society by helping organizations and agencies provide uninterrupted services to the public.
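
The idea of combining several detectors into one decision can be illustrated with a minimal sketch. It assumes plain majority voting over three hypothetical detectors (`som_a`, `som_b`, `kmeans`); the combination rule configured in the study is not stated here, so this is an illustration of the ensemble idea rather than the authors' setup.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model label lists into one label per instance by majority vote."""
    combined = []
    for labels in zip(*predictions):
        # most_common(1) returns the label with the highest count; on a tie,
        # the label seen first (the earlier model) is kept.
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined

# Hypothetical outputs of three cluster-based detectors on five flows.
som_a = ["DDoS", "BENIGN", "DDoS", "BENIGN", "BENIGN"]
som_b = ["DDoS", "DDoS", "DDoS", "BENIGN", "BENIGN"]
kmeans = ["BENIGN", "BENIGN", "DDoS", "BENIGN", "DDoS"]

combined = majority_vote([som_a, som_b, kmeans])
print(combined)  # ['DDoS', 'BENIGN', 'DDoS', 'BENIGN', 'BENIGN']
```

With only two detectors, majority voting ties whenever they disagree, which is why the sketch uses three.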

We used the Waikato Environment for Knowledge Analysis (WEKA) workbench to model and evaluate the DDoS-attack detection methods. The statistical assessment capability provided through the WEKA workbench makes it possible to forecast events [1]. According to Kiranmai and Laxmi [11], this tool is released under the GNU General Public License. It provides the opportunity for preprocessing, clustering, and classification [12]. These three processes were the main concerns in this study.

We used the cross-industry standard process for data mining (CRISP-DM) framework. The CRISP-DM framework supports data-mining tasks [13] and the assessment of voluminous data [14]. In 2000, SPSS, NCR, and DaimlerChrysler introduced this framework [15]. Companies are able to reach their objectives using this framework [16]. One objective is attaining high-level security against DDoS attacks. The CRISP-DM framework is applicable to this research study because it facilitates the assessment of the CICIDS2017 network-traffic dataset in mining data to find effective DDoS-attack detection methods for organizations and agencies. Based on the results of the one-way-ANOVA statistical analyses obtained in this study, we were not able to reject the null hypothesis, i.e., adding the filter and wrapper methods prior to combined clustering algorithms using the Vote classifier method is not effective. Therefore, we were not able to accept the alternative hypothesis, i.e., adding the filter and wrapper methods prior to combined clustering algorithms using the Vote classifier method is effective. This was in relation to the corresponding clustering algorithms’ integrations including the assessment of the rest of the experimental results in this study. In this study, we investigated whether adding the filter and wrapper methods prior to combined clustering algorithms using the Vote classifier method was effective in lowering the false-positive rates of DDoS-attack detection methods, which led us to the following research question: Is adding the filter and wrapper methods prior to the combined clustering algorithms using the Vote classifier method effective in lowering the false-positive rates of DDoS-attack detection methods?
Therefore, the hypotheses of this study were as follows: The null hypothesis (H₀) was that adding the filter and wrapper methods prior to the combined clustering algorithms using the Vote classifier method is not effective in lowering the false-positive rates of DDoS-attack detection methods. The alternative hypothesis (H_a) was that adding the filter and wrapper methods prior to the combined clustering algorithms using the Vote classifier method is effective in lowering the false-positive rates of DDoS-attack detection methods.

3. Literature Review

This section presents the literature review in three parts. Section 3.1 provides a review of DDoS attacks. Section 3.2 provides a review of the application of clustering algorithms in DDoS-attack detection methods, and Section 3.3 covers the application of the CRISP-DM framework to the applied IT problem. The applied IT problem that we tackled is the high false-positive rates of DDoS-attack detection methods based on clustering algorithms.

3.1. DDoS Attacks

Denial of service (DoS) attacks are among the most serious risks and security issues on the Internet today. Distributed denial of service (DDoS) attacks are of special concern, as their impact can be correspondingly severe [17]. According to previous studies, malicious agents commonly carry out DDoS attacks. Snehi and Bhandari [18] illustrated traditional DDoS attacks that are launched by hacking a large number of computers spread around the globe that operate as bots for the attacks. Bhardwaj et al. [19] explained that in a DDoS attack, an attacker aims to deplete network infrastructure, capacity, or computer resources by overwhelming them with requests. Additionally, blackmail, the exhibition of attack capabilities, vandalism, political disagreements, hacktivism, corporate rivalry, distraction from exfiltration, and other data theft activities are all possible motivations for DDoS attacks [19]. Mishra et al. [20] explained that DDoS is the most severe attack on an SDN cloud. Despite so many developments in tools and technology, it is still hard to detect DDoS attacks [17]. Banitalebi Dehkordi et al. [21] noted that DDoS attacks target a wide range of resources and spots, from bank servers to new sites, introducing big challenges for the managers and users of these systems. Amaizu et al. [22] noted that DDoS attacks were a major concern in 2020, as approximately 17 million DDoS attacks were predicted to occur that year, and that DDoS attacks are still among the most dangerous attacks on the Internet. A DDoS attack occurs when several IoT devices collaborate to attack a single target; for large-scale operations, DDoS attackers also deploy a botnet, which is a network of compromised internet-connected IoT devices [23]. Shohani et al. [24] confirmed that DDoS attacks are very difficult to detect and mitigate, as flooded packets sent to the switches are quite similar to legitimate traffic. DDoS attacks are considered the biggest threat to the IT industry. Because of that, it is urgent that we face the problem of DDoS attacks and work to solve it [25].

3.2. Application of Clustering Algorithms in DDoS-Attack Detection

Many researchers have worked on DDoS-attack identification and classification using different machine learning algorithms [26]. Various algorithms are applied as classifiers for flow identification in software-defined networking (SDN). Furthermore, SDN is vulnerable to DDoS attacks, which can cause network disasters. Additionally, Xu et al. explained that the performance of different algorithms can affect the effectiveness of DDoS defense in software-defined networking. SDN flows can be classified using the self-organizing map (SOM), one of the most successful classifiers. The SOM converts high-dimensional training input into low-dimensional neural-network winning neurons and detects network flows via winning neurons [27]. In addition, based on the understanding of botnet topologies, clustering techniques have been applied to network-traffic capture to detect zombie-to-C&C communications [28]. K-means clustering is one of the most common methods to form clusters by observing the distance of data points from initially selected centroids [29]. An intrusion-detection system (IDS) is a piece of software or hardware that monitors a system (software, hardware, operating system, network, etc.). The percentage of successfully detected attacks and the corresponding percentage of undiscovered attacks (false negatives) determine the IDS accuracy. Furthermore, incorrectly activated alarms (false positives) must be considered in order to improve accuracy [30]. K-means is a clustering algorithm that is able to partition data [31]. This clustering algorithm assesses the distances among data points [1].

3.3. Application of CRISP-DM to Applied IT Problem

Purpose and Hypotheses of the Study: The aim of this investigation was to determine whether the use of the filter and wrapper methods, to be placed before the combined clustering algorithms using the Vote classifier method, is effective in lowering the false-positive rates of DDoS-attack detection methods. We assessed one null hypothesis and one alternative hypothesis. We applied the one-way ANOVA to assess the hypotheses. We applied the CRISP-DM framework to organize the construction of DDoS-attack detection methods. The CRISP-DM framework enabled us to apply the standard organizational process for building the DDoS-attack detection methods and evaluating them considering our research question and hypotheses.

CRISP-DM Framework: Based on Zeinalpour [1], the CRISP-DM framework facilitates extracting useful information from data associated with organizational and business data-mining tasks. This framework has six phases: business understanding, data understanding, data preparation, modeling, evaluation, and deployment [32]. In the context of this study, we provide the details of our findings with respect to each phase of this framework. The CRISP-DM framework provided us with the standard organizational process to construct the DDoS-attack detection methods and evaluate them considering our research question and hypotheses. The following subsections explain the phases of CRISP-DM to clarify the organizational challenges with respect to DDoS-attack detection methods and the proper evaluation of these methods.

Business Understanding: In this first phase, we identified one problem that businesses have, which is as follows: DDoS-attack detection methods that apply an unsupervised approach lead to high false-positive rates [6]. There was a 31% rise in DDoS-attack launches when comparing the first quarter of 2021 with the first quarter of 2020 [7]. The curse of dimensionality is considered a problem for machine learning algorithms [3]. This problem represents the existence of high-dimensional data. The presence of redundant attributes in the data leads to the curse of dimensionality [4]. High-dimensional data include redundant attributes. This study added the filter and wrapper methods prior to combined clustering algorithms using the Vote classifier method to identify effective DDoS-attack detection methods.

Data Understanding: This phase provided the opportunity for us to understand and pick our data for evaluation. We used the CICIDS2017 network-traffic dataset. The literature regards this dataset as a network-traffic dataset that is superior to the “KDD, NSL-KDD, AWID, CIDDS-001, ISCXIDS2012, and UNSW-NB15 datasets” [1]. The CICIDS2017 network-traffic dataset comprises data instances of DDoS attacks and normal traffic [1]. The capture of network-traffic data for this dataset included the establishment of realistic criteria [33]. Protocols that are considered common were used to collect network-traffic data from the CICIDS2017 dataset [34].

Data Preparation: This phase provided the opportunity for us to prepare and clean the CICIDS2017 network-traffic data before modeling. To prepare this dataset, we applied the six data-cleaning procedures that Zeinalpour [1] proposed in his study. The first procedure is the manual removal of the six attributes Flow ID, Source IP, Destination IP, Source Port, Destination Port, and Time Stamp. The second procedure is numeric data cleansing using the NumericCleaner procedure to make normalization possible. The third procedure is data normalization using the min–max procedure to set attribute values between 0 and 1. The fourth procedure uses the EM imputation method to replace the missing values of the Flow Bytes attribute in four data instances. The fifth procedure uses the spreadsubsample procedure to correct the imbalance between the data instances categorized as DDoS attacks and those categorized as benign. Finally, we applied the randomization of network-traffic data. These six procedures are necessary for the following reasons: These data-cleaning procedures prevent the incorrect formation of DDoS-attack detection models [1]. Data cleaning does not allow an improper model to be built [35]. This phase allows data transformation for modeling to be performed [36].
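
Three of these preparation steps (min–max normalization, class balancing by random undersampling, and randomization) can be sketched in a few lines. This is a minimal illustration in plain Python, not the WEKA procedures themselves; the function names and the toy rows are hypothetical, and NumericCleaner and EM imputation are not covered.

```python
import random

def min_max_normalize(rows):
    """Rescale every numeric column to the [0, 1] range."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]

def undersample(rows, labels, seed=0):
    """Randomly drop majority-class instances until all classes have equal size."""
    rng = random.Random(seed)
    by_label = {}
    for row, label in zip(rows, labels):
        by_label.setdefault(label, []).append(row)
    target = min(len(group) for group in by_label.values())
    balanced = []
    for label, group in by_label.items():
        for row in rng.sample(group, target):
            balanced.append((row, label))
    rng.shuffle(balanced)  # randomize the order of the remaining instances
    return balanced

# Hypothetical toy data: two attributes, three benign flows and one attack flow.
rows = [[10.0, 200.0], [20.0, 400.0], [30.0, 300.0], [40.0, 200.0]]
labels = ["BENIGN", "BENIGN", "BENIGN", "DDoS"]

normalized = min_max_normalize(rows)
balanced = undersample(normalized, labels)
print(len(balanced))  # 2: one BENIGN and one DDoS instance remain
```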

Modeling: In this phase, we applied the filter, wrapper, and clustering procedures that Zeinalpour [1] proposed to evaluate the construction of DDoS-attack detection models. The filter procedures were chi-squared and information gain. Each filter procedure uses a merit score to evaluate attributes or features, and a threshold is used to select them. These procedures do not depend on learning models to select network-traffic attributes. As Zeinalpour [1] specified in his study, the value of 0.5 was used as the threshold to determine proper attributes. On the other hand, the wrapper procedures were the Naïve Bayes and J48 classifiers, which are supervised learning algorithms. These wrapper procedures use the accuracy of classifiers to select attributes. The clustering procedures were self-organizing maps (SOMs) and k-means. We used the Vote classifier method to combine the clustering procedures. The objective of the modeling phase is to improve the outcomes to be obtained [37]. This phase provides the opportunity for evaluating learning models [36]. Chengxiang and Xiaoqing [38] presented the SOM algorithm as shown in the formula below, where the algorithm initializes the weights between the input layer and the mapped layer and updates the weights at each iteration. According to Zeinalpour [1], it calculates the Euclidean distance among data instances.

\sqrt{\sum_{i=1}^{m} \left( x_{i}(t) - \omega_{ij}(t) \right)^{2}}
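
The distance above can be illustrated with a short sketch that finds the winning neuron for one input and moves its weights toward that input. The update rule `w + learning_rate * (x - w)` and the learning rate of 0.5 are common textbook choices assumed here, not taken from the study; a full SOM would also update the neighbors of the winner, which this simplified version omits.

```python
import math

def euclidean(x, w):
    """Euclidean distance between an input vector x and a weight vector w."""
    return math.sqrt(sum((xi - wi) ** 2 for xi, wi in zip(x, w)))

def best_matching_unit(x, weights):
    """Index of the neuron whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda j: euclidean(x, weights[j]))

def update_winner(x, weights, learning_rate=0.5):
    """Move the winning neuron's weights toward x (neighbors are left unchanged)."""
    j = best_matching_unit(x, weights)
    weights[j] = [wi + learning_rate * (xi - wi) for xi, wi in zip(x, weights[j])]
    return j

# Hypothetical map with two neurons and one normalized input vector.
weights = [[0.0, 0.0], [1.0, 1.0]]
x = [0.9, 0.8]
winner = update_winner(x, weights)
print(winner)  # 1: the second neuron is closer to the input
```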

Miniak-Górecka et al. [39] presented the k-means algorithm as shown in the formula below, where according to Zeinalpour [1], k is the number of cluster centroids and c is the number of centroids within a cluster. Similarly, as explained by Zeinalpour [1], the computation is based on the Euclidean distance among data instances. The algorithm attempts to perform the minimization of the within-cluster sum of squares among the data [39].

\sum_{i=1}^{k} \sum_{x_{j} \in C_{i}}^{c} \left\| x_{j} - c_{i} \right\|^{2}
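
As a small worked example of the within-cluster sum of squares, the sketch below adds up each instance's squared Euclidean distance to the centroid of its own cluster. The two toy clusters are hypothetical, and the cluster assignment is taken as given rather than computed by k-means.

```python
def centroid(points):
    """Component-wise mean of a list of points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def within_cluster_sum_of_squares(clusters):
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    total = 0.0
    for points in clusters:
        center = centroid(points)
        for x in points:
            total += sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return total

# Hypothetical assignment of four instances to k = 2 clusters.
clusters = [
    [[0.0, 0.0], [0.0, 2.0]],  # centroid (0, 1)
    [[5.0, 5.0], [7.0, 5.0]],  # centroid (6, 5)
]
print(within_cluster_sum_of_squares(clusters))  # 4.0
```

K-means repeatedly reassigns instances and recomputes centroids so that this quantity decreases.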

Evaluation: In this phase, we used the false-positive rate metric to evaluate DDoS-attack detection models. DDoS-attack detection methods that are unsupervised produce high false-positive rates [6]. This metric aims to measure how effective DDoS-attack detection models are [40]. It is the ratio of the number of false positives to the sum of the numbers of false positives and true negatives [41].
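
The metric can be computed directly from actual and predicted labels, as in the following minimal sketch. The label names follow the BENIGN and DDoS labels of the dataset; the function name and the six example flows are hypothetical.

```python
def false_positive_rate(actual, predicted, positive="DDoS"):
    """FPR = FP / (FP + TN), treating `positive` as the attack class."""
    fp = sum(1 for a, p in zip(actual, predicted) if a != positive and p == positive)
    tn = sum(1 for a, p in zip(actual, predicted) if a != positive and p != positive)
    return fp / (fp + tn) if (fp + tn) else 0.0

# Hypothetical example: four benign flows, one of which is flagged as an attack.
actual = ["BENIGN", "BENIGN", "BENIGN", "BENIGN", "DDoS", "DDoS"]
predicted = ["BENIGN", "DDoS", "BENIGN", "BENIGN", "DDoS", "BENIGN"]
print(false_positive_rate(actual, predicted))  # 0.25
```

Only benign instances enter the calculation, so a missed attack (a false negative) does not change the false-positive rate.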

Deployment: For this phase, we recommend deploying DDoS-attack detection models outside of demilitarized zone (DMZ) areas, as Zeinalpour [1] proposed. DMZ areas are placed between the internal networks of a company and the internet [1]. To obtain the best security, the requirement is to rapidly detect security violations [42]. DDoS-attack detection methods are able to achieve security that is of high standard [1].

4. Data Analysis of the Experimentation

We used the ten-fold cross-validation method to determine whether adding the filter and wrapper methods prior to the combined clustering algorithms using the Vote classifier method is effective in lowering the false-positive rates of DDoS-attack detection methods. This method can divide a dataset into ten folds for training and testing machine learning techniques [43]. The method considers each fold as a set for validation [44]. Validation conducted using this method obtains realistic results [1]. Afterwards, we applied one-way-ANOVA analyses to evaluate the statistical significance of the difference in effectiveness among DDoS-attack detection methods. Based on the statements in the study by Mouhamadou [45], ANOVA statistical analyses make the comparison of the means among two or more categories possible and facilitate the management of type I errors.
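
The fold mechanics can be sketched as follows. The generator below is a hypothetical, simplified illustration: it assumes the instances were already randomized, does not stratify by class, and only yields index lists rather than training a model.

```python
def k_fold_indices(n_instances, k=10):
    """Yield (train_indices, test_indices) for each of k consecutive folds."""
    indices = list(range(n_instances))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_instances // k + (1 if i < n_instances % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(25))
print(len(folds))                               # 10
print([len(test) for _, test in folds])         # [3, 3, 3, 3, 3, 2, 2, 2, 2, 2]
```

Each instance appears in exactly one test fold, so every instance is used once for validation and nine times for training.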

5. Study Validity

This section presents the types of study validity that we considered. The first was internal validity. The second and third were predictive and conclusion validities. In addition, in this section, we explain why we did not consider external validity.

We used the WEKA workbench to ensure internal validity. The WEKA workbench is reliable and guarantees internal validity [1]. This tool has the proper environment for mining useful information [11]. It has an architecture that comprises modularity and extensibility [46]. It enabled the proper modeling and evaluation of DDoS-attack detection methods to be performed in this study.

To ensure predictive and conclusion validities, we applied three approaches. The first one was the use of the spreadsubsample procedure. When imbalanced data are used to build a model, the model is skewed toward the class with the larger number of network data instances [47]. This procedure balances network-traffic data instances so that the classes are equal in size [1]. It randomly eliminates data instances that belong to the majority class [48]. The second approach was applying the Randomize procedure to perform the randomization of network-traffic data. In this study, we used the ten-fold cross-validation method. It calculates the error rate in each fold [49]. According to Zeinalpour [1], the ten-fold cross-validation method may produce biased results. The reason is that the CICIDS2017 dataset stores data instances for DDoS attacks alongside each other and normal data instances alongside each other, while forecasting events depends on the balance of data instances in each fold [1]. Each fold must include the attributes for most classifications in training and testing [50]. The final approach for ensuring predictive and conclusion validities was the use of the ten-fold cross-validation method. This method evaluates on an independent test set [50]. The independent test set guarantees reliable results [1]. The ten-fold cross-validation method generalizes results [51].

We did not need to consider external validity. The CICIDS2017 network-traffic dataset includes data of normal traffic and DDoS attacks [1]. This dataset features real capture of network-traffic data [52]. The attributes of a dataset affect how well machine learning algorithms can model it [53]. Therefore, this study used the entire dataset as the population of this study, thus not requiring external validity.

6. Results

In this study, we used an ex post facto design of an A-B-A-BC single group. We assessed whether adding the filter and wrapper methods prior to combined clustering algorithms using the Vote classifier method was effective in reducing the false-positive rates of DDoS-attack detection methods. During the A phase, we used the entire dataset of CICIDS2017 to evaluate the combination of the clustering algorithms using the Vote classifier method. In the B phase, we applied the filter method to select features for evaluating the combination of the clustering algorithms. In the BC phase, we added the wrapper method after the filter method in selecting features for evaluation. After the evaluation of the methods using the ten-fold cross-validation method, we conducted one-way-ANOVA analyses to determine the statistical significance of the difference in the effectiveness of DDoS-attack detection methods.

Prior to using the ten-fold cross-validation method on the combined clustering algorithms using the Vote classifier method for the data analysis of this experimentation, we prepared the CICIDS2017 dataset using manual attribute removal, NumericCleaner, min–max normalization, EM imputation, the spreadsubsample procedure, and randomization. Figure 1 presents the way in which we modeled the DDoS-attack detection methods. First, we manually removed the six attributes of Flow ID, Source IP, Destination IP, Source Port, Destination Port, and Time Stamp to prevent biased results toward particular identification numbers, systems, and times of attack events. Then, we applied NumericCleaner to make normalization possible and performed min–max normalization. After conducting normalization, we applied EM imputation to replace the missing values of the Flow Bytes attribute; subsequently, we applied the spreadsubsample procedure to balance the data, followed by randomization to randomize the order of data instances for the BENIGN and DDoS labels. Afterwards, we incorporated the filter and wrapper methods to select proper network-traffic features before using ten-fold cross-validation for evaluating the DDoS-attack detection models in this study. The figure below illustrates the considered methods along with the ten-fold cross-validation method used to evaluate them. The dashed outlines in the figure represent the use of the ten-fold cross-validation method for evaluating the results. Appendix A (Table A1) presents the independent variables. Appendix B presents the tables of the false-positive rates obtained when applying the clustering, filter, and wrapper methods, respectively.

6.1. Statistical Analysis Using One-Way ANOVA

We conducted one-way-ANOVA analyses by considering one factor variable and one dependent variable on the full dataset representing all the obtained false-positive rates of DDoS-attack detection methods using IBM SPSS Statistics 28. The factor variable was named “class”, representing the column for grouping the methods. The dependent variable represented the false-positive rates of DDoS-attack detection methods among groups. We used three numbers to group the methods. Number “1” represented the class for the use of the clustering method without the use of the filter or wrapper method. Number “2” represented the class for the use of the filter method. Number “3” represented the class for the use of the wrapper method.

The outcomes of these statistical analyses are shown in Table 1. We reflected on the means, the standard deviation, and the homogeneity-of-variance test, including the one-way-ANOVA F-test. To determine whether the overall ANOVA test showed statistical significance among DDoS-attack detection methods, we composed the table named “Tests of between-subjects effects”. The test was not significant, with F(2, 25) = 1.53 and p = 0.2. The p-value is presented in the column named “Sig.”. As the p-value was greater than 0.05, we could not reject the null hypothesis, i.e., adding the filter and wrapper methods prior to the combined clustering algorithms using the Vote classifier method is not effective in lowering the false-positive rates of DDoS-attack detection methods. The η² (named “Partial Eta Squared” in Table 1) of 0.11 shows that there were no strong relationships among the DDoS-attack detection methods using the clustering, filter, and wrapper methods, and no changes in false-positive rates.
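
The relationship between the reported F statistic and the effect size can be checked with a short sketch. The function names and the three toy groups are hypothetical (the study's per-method false-positive rates are in Appendix B and are not reproduced here); the only values taken from the text are F = 1.53 with 2 and 25 degrees of freedom. In a one-way design, eta squared and partial eta squared coincide.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared

def partial_eta_squared_from_f(f_stat, df_between, df_within):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_stat * df_between) / (f_stat * df_between + df_within)

# Values reported in the text: F(2, 25) = 1.53.
print(round(partial_eta_squared_from_f(1.53, 2, 25), 2))  # 0.11

# Hypothetical toy groups, only to exercise the full computation.
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
print(one_way_anova(toy))  # (3.0, 2, 6, 0.5)
```

The recovered value of about 0.11 agrees with the partial eta squared reported alongside the F-test.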

We conducted a descriptive analysis. The standard deviations ranged from 0.053 to 0.083, and the variances (the squares of the standard deviations) ranged from 0.003 to 0.007, indicating that there was not much difference in the variances in terms of the effectiveness of the DDoS-attack detection methods. Table 2 presents the descriptive statistics for the mean and standard deviation. We also conducted a post hoc procedure that assumes the equality of variances (Tukey) and one that does not (Dunnett’s C) to control Type I errors across pairwise comparisons of DDoS-attack detection methods. Table 3 presents this analysis. The mean differences presented in Table 3, using Tukey and Dunnett’s C, show that the addition of the wrapper method was better than that of the filter and clustering methods. This was because the statistical analyses showed positive and higher values for the wrapper method compared with the negative and lower values of the filter and clustering methods in the “Mean Difference” column presented in Table 3. Considering the mean differences, the filter method did not perform well compared with the clustering method. However, using Tukey’s post hoc test, the pairwise analysis showed no statistical significance for the difference in the effectiveness of DDoS-attack detection methods using the clustering, filter, and wrapper methods. The homogeneity-of-variance test, presented in Table 4, with a p-value of 0.044, was only marginally significant, as the p-value was very close to 0.05. Considering the means, standard deviation, the homogeneity-of-variance test, the one-way-ANOVA F-test, and the post hoc procedures in pairwise analyses, we could not reject the null hypothesis, i.e., adding the filter and wrapper methods prior to combined clustering algorithms using the Vote classifier method is not effective in lowering the false-positive rates of DDoS-attack detection methods. Subsequently, we were not able to accept the alternative hypothesis, i.e., adding the filter and wrapper methods prior to combined clustering algorithms using the Vote classifier method is effective in lowering the false-positive rates of DDoS-attack detection methods.

6.2. Addition of the Filter Method in the B Phase and Comparison with the A Phase

Having considered the comparison with respect to the implementation of the respective clustering algorithms, the results of the study showed that in one case, experimentation resulted in lowering the false-positive rate. That was when we applied information gain prior to the double SOM clustering algorithm compared with the application of the double SOM algorithm only. Table A2 and Table A3 in Appendix B present the experimental results of when no feature-selection methods were considered compared with when the filter method was applied. The mean differences in Tukey’s and Dunnett’s C statistical test procedures in Table 4 above illustrated that the filter method did not perform well when compared with the application of the clustering method alone.

6.3. Addition of the Wrapper Method in the BC Phase and Comparison with the A Phase

Comparing each wrapper configuration with the corresponding clustering algorithms alone, the results of the study showed that the addition of the wrapper method was effective in the following cases. With respect to the double SOM clustering algorithm, the addition of the Naive Bayes classifier after chi-squared and information gain and the incorporation of the J48 classifier after chi-squared were effective. With regard to the double k-means clustering algorithm, the addition of the Naive Bayes classifier after information gain and the incorporation of the J48 classifier after chi-squared were effective. In relation to combining the k-means and SOM clustering algorithms, adding the Naive Bayes classifier after chi-squared and information gain and incorporating the J48 classifier after chi-squared were effective. Table A2 and Table A4 in Appendix B present the experimental results obtained without feature-selection methods and with the wrapper method, respectively. In the statistical analyses, the mean differences for Tukey and Dunnett’s C, presented in Table 3, illustrated that the wrapper method was more effective than the filter and clustering methods.

6.4. Comparison of Results across All DDoS-Attack Detection Methods

We also compared the results of this experimentation for all DDoS-attack detection methods considering the ex post facto A-B-A-BC design. Considering the addition of the filter compared with the wrapper method, incorporating the Naive Bayes classifier after chi-squared and information gain and adding the J48 classifier after chi-squared were effective. These results were with respect to the double SOM clustering algorithm and the combination of k-means and SOMs. In relation to double k-means, the addition of the Naive Bayes classifier after information gain and the incorporation of the J48 classifier after chi-squared were effective. However, considering double k-means, when we incorporated the J48 classifier after information gain, it was only effective compared with applying information gain, not chi-squared. In relation to the other combination of clustering algorithms, the addition of the filter and wrapper methods was not effective.
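The case-by-case comparisons in Sections 6.2 to 6.4 can be checked against the rates in Appendix B. The sketch below is one possible way to do so; the short dictionary keys and the function names are illustrative labels chosen here, not Weka identifiers, and "effective" is taken to mean a strictly lower false-positive rate.

```python
# False-positive rates transcribed from Tables A2-A4 (Appendix B).
# Keys such as "SOM+KMeans" and "InfoGain" are illustrative labels only.
BASELINE = {
    "SOM+KMeans": 0.062,
    "KMeans+SOM": 0.062,
    "KMeans+KMeans": 0.172,
    "SOM+SOM": 0.191,
}
FILTER = {
    "SOM+KMeans": {"InfoGain": 0.093, "ChiSquared": 0.062},
    "KMeans+SOM": {"InfoGain": 0.093, "ChiSquared": 0.062},
    "KMeans+KMeans": {"InfoGain": 0.180, "ChiSquared": 0.172},
    "SOM+SOM": {"InfoGain": 0.139, "ChiSquared": 0.191},
}
WRAPPER = {
    "SOM+KMeans": {
        ("InfoGain", "NaiveBayes"): 0.013, ("InfoGain", "J48"): 0.169,
        ("ChiSquared", "NaiveBayes"): 0.013, ("ChiSquared", "J48"): 0.012,
    },
    "KMeans+SOM": {
        ("InfoGain", "NaiveBayes"): 0.013, ("InfoGain", "J48"): 0.169,
        ("ChiSquared", "NaiveBayes"): 0.013, ("ChiSquared", "J48"): 0.012,
    },
    "KMeans+KMeans": {
        ("InfoGain", "NaiveBayes"): 0.014, ("InfoGain", "J48"): 0.173,
        ("ChiSquared", "NaiveBayes"): 0.211, ("ChiSquared", "J48"): 0.108,
    },
    "SOM+SOM": {
        ("InfoGain", "NaiveBayes"): 0.014, ("InfoGain", "J48"): 0.214,
        ("ChiSquared", "NaiveBayes"): 0.013, ("ChiSquared", "J48"): 0.016,
    },
}


def lower_than_baseline(results):
    """List (combination, configuration) pairs with a rate strictly below clustering alone."""
    return [
        (combo, cfg)
        for combo, cfgs in results.items()
        for cfg, rate in cfgs.items()
        if rate < BASELINE[combo]
    ]


def wrapper_beats_filters(combo, cfg):
    """Names of the filter procedures whose rate the given wrapper configuration beats."""
    rate = WRAPPER[combo][cfg]
    return [name for name, filter_rate in FILTER[combo].items() if rate < filter_rate]


print(lower_than_baseline(FILTER))
print(lower_than_baseline(WRAPPER))
print(wrapper_beats_filters("KMeans+KMeans", ("InfoGain", "J48")))
```

Because the comparison is strict, a configuration that merely ties the clustering-only rate (for example, chi-squared before the k-means and SOM combination) is not counted as an improvement.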

7. Discussion

This study attempted to build upon the study conducted by Zeinalpour [1]. Zeinalpour [1] assessed the integration of the filter and wrapper methods prior to each clustering algorithm in reducing the false-positive rates of DDoS-attack detection methods. Zeinalpour [1] could not confirm that the addition of the filter and wrapper methods prior to each clustering algorithm was not effective in reducing the false-positive rates of DDoS-attack detection methods. Likewise, Zeinalpour [1] was not able to verify that the addition of the filter and wrapper methods prior to each clustering algorithm was effective in reducing the false-positive rates of DDoS-attack detection methods. Similarly, based on the statistical analyses in this study, we were not able to verify whether adding the filter and wrapper methods prior to the combined clustering algorithms using the Vote classifier method is effective. This was with respect to the corresponding clustering algorithms’ integrations as well as comparison of the rest of the experimentation results in this study.

Based on the research question presented in this study, we stated one null hypothesis and one alternative hypothesis. The null hypothesis was that incorporating the filter and wrapper methods preceded by combined clustering algorithms using the Vote classifier method is not effective in lowering the false-positive rates of DDoS-attack detection methods. The alternative hypothesis was that incorporating the filter and wrapper methods preceded by combined clustering algorithms using the Vote classifier method is effective in lowering the false-positive rates of DDoS-attack detection methods. Based on the statistical analyses reflected above, we were not able to disprove the null hypothesis. Therefore, we were not able to accept the alternative hypothesis.

In this study, we conducted our research to determine whether the addition of the filter and wrapper methods preceded by combined clustering algorithms using the Vote classifier method is effective in lowering the false-positive rates of DDoS-attack detection methods. Compared with the study by Zeinalpour [1], we were able to lower the false-positive rate of DDoS-attack detection methods in two instances, by adding chi-squared and J48 prior to the combined clustering algorithms using k-means and SOMs, producing a false-positive rate of 0.012. This false-positive rate was lower than when Zeinalpour [1] added chi-squared, Naive Bayes, and SOMs, producing a false-positive rate of 0.013.

Unsupervised methods are likely to have high false-positive rates [6]. Xiaofei et al. [54] assessed the performance of the SOM algorithm using the KDD network-traffic dataset, obtaining a false-positive rate of 4.33%. Ning et al. [55] evaluated the performance of the k-means algorithm on the NSL-KDD, UNSW-NB15, and AWID network-traffic datasets, achieving false-positive rates of 11.75%, 13%, and 6.75%, respectively. The filter and wrapper methods conduct feature reduction [1]. Feature reduction makes it possible to select attributes that give learning models high predictive performance in classification. Among other studies in this respect, when Gu et al. [41] investigated incorporating the Self-Correlation Coefficient as the procedure of the filter method on the CICIDS2017 dataset, they achieved a false-positive rate of 29.30. In another study, Sakr et al. [56] investigated the false-positive rates in detecting intrusions using the filter and wrapper methods. Principal Component Analysis (PCA), Correlation Feature Selection (CFS), and information gain (IG) were the procedures of the filter method; Particle Swarm Optimization (PSO) and Artificial Bee Colony (ABC) were the procedures of the wrapper method in the study by Sakr et al. [56]. The false-positive rates obtained in that study are shown in Table 5. They reflect the application of filter and filter–wrapper feature-selection methods, as well as the case in which Sakr et al. [56] did not apply any feature-selection method.

8. Limitations and Implications

Limitations represent possible issues in a study [57]. The limitation of this study was the curse of dimensionality. The curse of dimensionality is considered an issue for machine learning algorithms [3]. We applied the filter and wrapper methods to address the curse of dimensionality.

One aspect of implications in studies is assumptions. Researchers accept assumptions to be true, for which they do not tend to provide proof [57]. We assumed that the results from this study represent real-world scenarios in which DDoS-attack detection methods try to detect attacks for organizations and companies. The CICIDS2017 network-traffic dataset comprises actual network-traffic data [52].

Another aspect of implications in studies is delimitations. Delimitations are factors or variables purposely left out [57]. The delimitation of this study was considering signature-based DDoS-attack detection methods. We attempted to take the research study conducted by Zeinalpour [1] further, as he addressed the curse of dimensionality of DDoS-attack detection methods that apply clustering algorithms. In this study, we also focused on addressing the curse of dimensionality by attempting to incorporate the filter and wrapper methods before combining clustering algorithms using the Vote classifier, which is a supervised approach. The filter and wrapper methods are able to select proper network-traffic features. Machine learning algorithms suffer from the curse of dimensionality [3]. According to Salimi, Ziaii, Amiri, Hosseinjani Zadeh, Karimpouli, and Moradkhani [4], this issue is because of redundancy in features.

9. Conclusions

In this study, we tried to determine whether adding the filter and wrapper methods before combined clustering algorithms using the Vote classifier method is effective in lowering the false-positive rates of DDoS-attack detection methods. We were not able to verify the effectiveness of the addition of the filter and wrapper methods. This was because the ANOVA statistical analyses illustrated that the addition of the filter method was not as effective in comparison with incorporating the wrapper and clustering methods. However, according to Zeinalpour [1], DDoS-attack detection methods are great tools for providing high security. Using the Vote classifier method may be considered an advantage in lowering the false-positive rates of DDoS-attack detection methods because it integrates several machine learning algorithms to detect attacks instead of relying on the single implementation of one algorithm. In this study, we were able to achieve the lowest false-positive rate of 0.012 for DDoS-attack detection methods in two instances by incorporating chi-squared and J48 prior to the combined clustering algorithms using k-means and SOMs. This was with respect to the implementations considered in this study. Organizations and agencies can further examine the effectiveness of DDoS-attack detection methods by integrating supervised learning algorithms with unsupervised learning algorithms using the Vote classifier method. Organizations likely want effective DDoS-attack detection methods that have low false-positive rates in detecting attacks. The limitation of this study was the curse of dimensionality of DDoS-attack detection methods. We incorporated the filter and wrapper methods to address this limitation. This study focused on combining the clustering algorithms using the Vote classifier method. One future research direction could be to investigate the curse of dimensionality of machine learning algorithms using the wrapper method prior to integrating supervised learning algorithms to detect attacks. This is because this study showed superior performance when incorporating the wrapper method.

Author Contributions

Writing—original draft, A.Z.; Writing—review & editing, H.A.A. The authors of this research study worked in collaboration to complete the research study. This study was organized by the authors. All authors have read and agreed to the published version of the manuscript.

Funding

This research had no external funding.

Data Availability Statement

The authors of this study used the CICIDS2017 dataset. This dataset is publicly available at https://www.unb.ca/cic/datasets/ids-2017.html (accessed on 30 May 2021).

Acknowledgments

The authors conducted this study to build upon the study performed by Zeinalpour [1].

Conflicts of Interest

The authors of this study declare no conflicts of interest. The authors of this research guided the study with no sponsorship.

Appendix A. Table of Independent Variables

Table A1. Independent variables.

Independent Variable	Procedures
Combined Clustering Algorithms	Vote(SelfOrganizingMap & SimpleKMeans); Vote(SimpleKMeans & SelfOrganizingMap); Vote(SelfOrganizingMap & SelfOrganizingMap); Vote(SimpleKMeans & SimpleKMeans)
Filter Method	ChiSquaredAttributeEval; InfoGainAttributeEval
Wrapper Method	WrapperSubsetEval(J48); WrapperSubsetEval(NaïveBayes)
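To illustrate what combining two models with a vote can look like, the following is a minimal sketch of one common combination rule, averaging the class probabilities of the base models. It is an assumption made for illustration: the combination rule and the settings actually used in the study are not stated here, and the function names, labels, and probabilities below are hypothetical.

```python
def vote_average(prob_lists):
    """Average the class-probability vectors produced by several base models."""
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    return [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]


def vote_predict(prob_lists, labels):
    """Return the label with the highest averaged probability."""
    averaged = vote_average(prob_lists)
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return labels[best]


# Hypothetical outputs of two base models for one network flow,
# ordered as (benign, attack).
labels = ["benign", "attack"]
model_a = [0.7, 0.3]
model_b = [0.4, 0.6]
print(vote_predict([model_a, model_b], labels))
```

In this hypothetical example the two models disagree, and the averaged probabilities favor the benign label because the first model is more confident than the second.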

Appendix B. Experimental Results

Table A2. False-positive rates using no feature-selection methods.

Applied Procedure in DDoS-Attack Detection Method	False-Positive Rates in DDoS-Attack Identification
Vote(SelfOrganizingMap & SimpleKMeans)	0.062
Vote(SimpleKMeans & SelfOrganizingMap)	0.062
Vote(SimpleKMeans & SimpleKMeans)	0.172
Vote(SelfOrganizingMap & SelfOrganizingMap)	0.191

Table A3. False-positive rates using the filter method.

Applied Procedure in DDoS-Attack Detection Method	False-Positive Rates in DDoS-Attack Identification
InfoGainAttributeEval and Vote(SelfOrganizingMap & SimpleKMeans)	0.093
ChiSquaredAttributeEval and Vote(SelfOrganizingMap & SimpleKMeans)	0.062
InfoGainAttributeEval and Vote(SimpleKMeans & SelfOrganizingMap)	0.093
ChiSquaredAttributeEval and Vote(SimpleKMeans & SelfOrganizingMap)	0.062
InfoGainAttributeEval and Vote(SimpleKMeans & SimpleKMeans)	0.180
ChiSquaredAttributeEval and Vote(SimpleKMeans & SimpleKMeans)	0.172
InfoGainAttributeEval and Vote(SelfOrganizingMap & SelfOrganizingMap)	0.139
ChiSquaredAttributeEval and Vote(SelfOrganizingMap & SelfOrganizingMap)	0.191

Table A4. False-positive rates using the wrapper method.

Applied Procedure in DDoS-Attack Detection Methods	False-Positive Rates in DDoS-Attack Identification
InfoGainAttributeEval, Naïve Bayes, and Vote(SelfOrganizingMap & SimpleKMeans)	0.013
InfoGainAttributeEval, J48, and Vote(SelfOrganizingMap & SimpleKMeans)	0.169
ChiSquaredAttributeEval, Naïve Bayes, and Vote(SelfOrganizingMap & SimpleKMeans)	0.013
ChiSquaredAttributeEval, J48, and Vote(SelfOrganizingMap & SimpleKMeans)	0.012
InfoGainAttributeEval, Naïve Bayes, and Vote(SimpleKMeans & SelfOrganizingMap)	0.013
InfoGainAttributeEval, J48, and Vote(SimpleKMeans & SelfOrganizingMap)	0.169
ChiSquaredAttributeEval, Naïve Bayes, and Vote(SimpleKMeans & SelfOrganizingMap)	0.013
ChiSquaredAttributeEval, J48, and Vote(SimpleKMeans & SelfOrganizingMap)	0.012
InfoGainAttributeEval, Naïve Bayes, and Vote(SimpleKMeans & SimpleKMeans)	0.014
InfoGainAttributeEval, J48, and Vote(SimpleKMeans & SimpleKMeans)	0.173
ChiSquaredAttributeEval, Naïve Bayes, and Vote(SimpleKMeans & SimpleKMeans)	0.211
ChiSquaredAttributeEval, J48, and Vote(SimpleKMeans & SimpleKMeans)	0.108
InfoGainAttributeEval, Naïve Bayes, and Vote(SelfOrganizingMap & SelfOrganizingMap)	0.014
InfoGainAttributeEval, J48, and Vote(SelfOrganizingMap & SelfOrganizingMap)	0.214
ChiSquaredAttributeEval, Naïve Bayes, and Vote(SelfOrganizingMap & SelfOrganizingMap)	0.013
ChiSquaredAttributeEval, J48, and Vote(SelfOrganizingMap & SelfOrganizingMap)	0.016

References

1. Zeinalpour, A. Addressing High False Positive Rates of DDoS Attack Detection Methods. Ph.D. Thesis, Walden University, Minneapolis, MN, USA, 2021.
2. Idhammad, M.; Afdel, K.; Belouch, M. Semi-supervised machine learning approach for DDoS detection. Appl. Intell. 2018, 48, 3193–3208.
3. Gahar, R.M.; Arfaoui, O.; Hidri, M.S.; Hadj-Alouane, N.B. A distributed approach for high-dimensionality heterogeneous data reduction. IEEE Access 2019, 7, 151006–151022.
4. Salimi, A.; Ziaii, M.; Amiri, A.; Hosseinjani Zadeh, M.; Karimpouli, S.; Moradkhani, M. Using a Feature Subset Selection method and Support Vector Machine to address curse of dimensionality and redundancy in Hyperion hyperspectral data classification. Egypt. J. Remote Sens. Space Sci. 2018, 21, 27–36.
5. Huang, X.; Zhang, L.; Wang, B.; Li, F.; Zhang, Z. Feature clustering based support vector machine recursive feature elimination for gene selection. Appl. Intell. 2018, 48, 594–607.
6. Xiang, Y.; Wenchao, Y.; Shudong, L.; Xianfei, Y.; Ying, C.; Hui, L. Web DDoS attack detection method based on semisupervised learning. Secur. Commun. Netw. 2021, 2021, 1–10.
7. Mittal, M.; Kumar, K.; Behal, S. Deep learning approaches for detecting DDoS attacks: A systematic review. Soft Comput. 2022.
8. Alguliyev, R.M.; Aliguliyev, R.M.; Abdullayeva, F.J. PSO+K-means algorithm for anomaly detection in big data. Stat. Optim. Inf. Comput. 2019, 7, 348–359.
9. Akhter, M.P.; Zheng, J.; Afzal, F.; Lin, H.; Riaz, S.; Mehmood, A. Supervised ensemble learning methods towards automatically filtering Urdu fake news within social media. PeerJ Comput. Sci. 2021, 7, e425.
10. Ambusaidi, M.A.; He, X.; Nanda, P.; Tan, Z. Building an intrusion detection system using a filter-based feature selection algorithm. IEEE Trans. Comput. 2016, 65, 2986–2998.
11. Kiranmai, S.A.; Laxmi, A.J. Data mining for classification of power quality problems using WEKA and the effect of attributes on classification accuracy. Prot. Control Mod. Power Syst. 2018, 3, 29.
12. Aksu, G.; Doğan, N. An analysis program used in data mining: WEKA. J. Meas. Eval. Educ. Psychol. 2019, 10, 80–95.
13. Moslehi, F.; Haeri, A.; Moini, A. Analyzing and investigating the use of electronic payment tools in Iran using data mining techniques. J. AI Data Min. 2018, 6, 417–437.
14. Cazacu, M.; Titan, E. Adapting CRISP-DM for social sciences. BRAIN Broad Res. Artif. Intell. Neurosci. 2020, 11, 99–106.
15. Contreras, Y.; Vera, M.; Huérfano, Y.; Valbuena, O.; Salazar, W.; Vera, M.I.; Borrero, M.; Barrera, D.; Hernández, C.; Molina, Á.V. Digital processing of medical images: Application in synthetic cardiac datasets using the CRISP_DM methodology. Rev. Latinoam. Hipertens. 2018, 13, 310–315.
16. Groggert, S.; Elser, H.; Ngo, Q.H.; Schmitt, R.H. Scenario-based manufacturing data analytics with the example of order tracing through BLE-beacons. Procedia Manuf. 2018, 24, 243–249.
17. Douligeris, C.; Mitrokotsa, A. DDoS attacks and defense mechanisms: Classification and state-of-the-art. Comput. Netw. 2004, 44, 643–666.
18. Snehi, M.; Bhandari, A. Vulnerability retrospection of security solutions for software-defined Cyber–Physical System against DDoS and IoT-DDoS attacks. Comput. Sci. Rev. 2021, 40, 100371.
19. Bhardwaj, A.; Mangat, V.; Vig, R.; Halder, S.; Conti, M. Distributed denial of service attacks in cloud: State-of-the-art of scientific and commercial solutions. Comput. Sci. Rev. 2021, 39, 100332.
20. Mishra, A.; Gupta, N.; Gupta, B.B. Defense mechanisms against DDoS attack based on entropy in SDN-cloud using POX controller. Telecommun. Syst. 2021, 77, 47–62.
21. Banitalebi Dehkordi, A.; Soltanaghaei, M.; Boroujeni, F.Z. The DDoS attacks detection through machine learning and statistical methods in SDN. J. Supercomput. 2021, 77, 2383–2415.
22. Amaizu, G.C.; Nwakanma, C.I.; Bhardwaj, S.; Lee, J.M.; Kim, D.S. Composite and efficient DDoS attack detection framework for B5G networks. Comput. Netw. 2021, 188, 107871.
23. Kumar, P.; Kumar, R.; Gupta, G.P.; Tripathi, R. A Distributed framework for detecting DDoS attacks in smart contract-based Blockchain-IoT Systems by leveraging Fog computing. Trans. Emerg. Telecommun. Technol. 2021, 32, e4112.
24. Shohani, R.B.; Mostafavi, S.; Hakami, V. A statistical model for early detection of DDoS attacks on random targets in SDN. Wirel. Pers. Commun. 2021, 120, 379–400.
25. Gadallah, W.G.; Omar, N.M.; Ibrahim, H.M. Machine learning-based distributed denial of service attacks detection technique using new features in software-defined networks. Int. J. Comput. Netw. Inf. Secur. 2021, 13, 15–27.
26. Smys, S.; Bestak, R.; Palanisamy, R.; Kotuliak, I. (Eds.) Computer Networks and Inventive Communication Technologies: Proceedings of Fourth ICCNCT 2021; Springer: Singapore, 2021.
27. Xu, Y.; Yu, Y.; Hong, H.; Sun, Z. DDoS detection using a cloud-edge collaboration method based on entropy-measuring SOM and KD-tree in SDN. Secur. Commun. Netw. 2021, 2021, 5594468.
28. Chio, C.; Freeman, D. Machine Learning and Security: Protecting Systems with Data and Algorithms; O’Reilly Media, Inc.: Sebastopol, CA, USA, 2018.
29. Aamir, M.; Ali Zaidi, S.M. Clustering based semi-supervised machine learning for DDoS attack classification. J. King Saud Univ.-Comput. Inf. Sci. 2021, 33, 436–446.
30. Das, S. Detection and Explanation of Distributed Denial of Service (DDoS) Attack through Interpretable Machine Learning. Ph.D. Thesis, The University of Memphis, Memphis, TN, USA, 2021.
31. Chen, Y.; Qin, B.; Liu, T.; Liu, Y.; Li, S. The comparison of SOM and k-means for text clustering. Comput. Inf. Sci. 2010, 3, 268–274.
32. Qazdar, A.; Er-Raha, B.; Cherkaoui, C.; Mammass, D. A machine learning algorithm framework for predicting students performance: A case study of baccalaureate students in Morocco. Educ. Inf. Technol. 2019, 24, 3577–3589.
33. Prasad, M.; Tripathi, S.; Dahal, K. An efficient feature selection based Bayesian and Rough set approach for intrusion detection. Appl. Soft Comput. 2020, 87, 105980.
34. Chiba, Z.; Abghour, N.; Moussaid, K.; El omri, A.; Rida, M. Intelligent approach to build a Deep Neural Network based IDS for cloud environment using combination of machine learning algorithms. Comput. Secur. 2019, 86, 291–317.
35. Manimekalai, K.; Kavitha, A. Missing value imputation and normalization techniques in myocardial infarction. ICTACT J. Soft Comput. 2018, 8, 1655–1662.
36. Nguyen, G.; Dlugolinsky, S.; Bobák, M.; Tran, V.; López García, Á.; Heredia, I.; Malík, P.; Hluchý, L. Machine learning and deep learning frameworks and libraries for large-scale data mining: A survey. Artif. Intell. Rev. 2019, 52, 77–124.
37. Cerón, J.D.; López, D.M.; Eskofier, B.M. Human activity recognition using binary sensors, BLE beacons, an intelligent floor and acceleration data: A machine learning approach. In Proceedings of the 12th International Conference on Ubiquitous Computing and Ambient Intelligence (UCAmI 2018), Punta Cana, Dominican Republic, 4–7 December 2018; p. 1265.
38. Chengxiang, S.; Xiaoqing, L. Research on clustering algorithm based on improved SOM neural network. Comput. Intell. Neurosci. 2022, 2022, 1482250.
39. Miniak-Górecka, A.; Podlaski, K.; Gwizdałła, T. Using k-means clustering in python with periodic boundary conditions. Symmetry 2022, 14, 1237.
40. Khalaf, B.A.; Mostafa, S.A.; Mustapha, A.; Mohammed, M.A.; Abduallah, W.M. Comprehensive review of artificial intelligence and statistical approaches in distributed denial of service attack and defense methods. IEEE Access 2019, 7, 51691–51713.
41. Gu, Y.; Li, K.; Guo, Z.; Wang, Y. Semi-supervised k-means DDoS detection method using hybrid feature selection algorithm. IEEE Access 2019, 7, 64351–64365.
42. Bopche, G.S.; Mehtre, B.M. Graph similarity metrics for assessing temporal changes in attack surface of dynamic networks. Comput. Secur. 2017, 64, 16–43.
43. Guo, J.; Chiang, S.; Liu, M.; Yang, C.-C.; Guo, K. Can machine learning algorithms associated with text mining from internet data improve housing price prediction performance? Int. J. Strateg. Prop. Manag. 2020, 24, 300–312.
44. Gayathri, S.; Krishna, A.K.; Gopi, V.P.; Palanisamy, P. Automated binary and multiclass classification of diabetic retinopathy using Haralick and multiresolution features. IEEE Access 2020, 8, 57497–57504.
45. Mouhamadou, T.S. Using Anova to examine the relationship between safety & security and human development. J. Int. Bus. Econ. 2014, 2, 101–106.
46. Pereira, J.; Peixoto, H.; Machado, J.; Abelha, A. A data mining approach for cardiovascular diagnosis. Open Comput. Sci. 2017, 7, 36–40.
47. Abdulraheem, M.H.; Ibraheem, N.B. A detailed analysis of new intrusion detection dataset. J. Theor. Appl. Inf. Technol. 2019, 97, 4519–4537.
48. Bashir, K.; Li, T.; Yohannese, C.W. An empirical study for enhanced software defect prediction using a learning-based framework. Int. J. Comput. Intell. Syst. 2018, 12, 282–298.
49. de Rooij, M.; Weeda, W. Cross-validation: A method every psychologist should know. Adv. Methods Pract. Psychol. Sci. 2020, 3, 248–263.
50. Anjum, S.; Qaseem, N. Big data algorithms and prediction: Bingos and risky zones in sharia stock market index. J. Islamic Monet. Econ. Financ. 2019, 5, 475–490.
51. Li, C.; Shi, D.; Chen, Y.; Zhang, H.; Geng, H.; Wang, P. A prediction scheme for the precipitation of spr based on the data mining algorithm and circulation analysis. J. Trop. Meteorol. 2019, 25, 519–527.
52. Abdulhammed, R.; Musafer, H.; Alessa, A.; Faezipour, M.; Abuzneid, A. Features dimensionality reduction approaches for machine learning based network intrusion detection. Electronics 2019, 8, 322.
53. Lamba, M.; Munjal, G.; Gigras, Y. Feature Selection of micro-array expression data (FSM)—A review. Procedia Comput. Sci. 2018, 132, 1619–1625.
54. Xiaofei, Q.; Lin, Y.; Kai, G.; Linru, M.; Meng, S.; Mingxing, K.; Mu, L. A survey on the development of self-organizing maps for unsupervised intrusion detection. Mob. Netw. Appl. 2021, 26, 808–829.
55. Ning, H.; Zhihong, T.; Hui, L.; Xiaojiang, D.; Guizani, M. A multiple-kernel clustering based intrusion detection scheme for 5G and IoT networks. Int. J. Mach. Learn. Cybern. 2021, 12, 3129–3144.
56. Sakr, M.M.; Tawfeeq, M.A.; El-Sisi, A.B. An efficiency optimization for network intrusion detection system. Int. J. Comput. Netw. Inf. Secur. 2019, 11, 1–11.
57. Ellis, T.J.; Levy, Y. Towards a guide for novice researchers on research methodology: Review and proposed methods. J. Issues Inf. Sci. Inf. Technol. 2009, 6, 323–337.

Figure 1. Evaluation approach of DDoS-attack detection methods.

Table 1. Tests of between-subjects effects.

Source	Type III Sum of Squares	df	Mean Square	F	Sig.	Partial Eta-Squared
Corrected Model	0.017 ^a	2	0.008	1.530	0.236	0.109
Intercept	0.233	1	0.233	42.081	<0.001	0.627
Class	0.017	2	0.008	1.530	0.236	0.109
Error	0.138	25	0.006
Total	0.407	28
Corrected Total	0.155	27

^a R-squared = 0.109 (adjusted R-squared = 0.038).

Table 2. Descriptive statistics.

Methods	Mean	Std. Deviation	N
Clustering Only	0.12175	0.069428	4
Filter Method	0.12400	0.053136	8
Wrapper Method	0.07356	0.083359	16
Total	0.09486	0.075865	28

Table 3. Multiple comparisons.

	(I) Methods	(J) Methods	Mean Difference (I-J)	Std. Error	Sig.	95% Confidence Interval, Lower Bound	95% Confidence Interval, Upper Bound
Tukey’s HSD	Clustering Only	Filter Method	−0.00225	0.045572	0.999	−0.11576	0.11126
	Clustering Only	Wrapper Method	0.04819	0.041601	0.488	−0.05543	0.15181
	Filter Method	Clustering Only	0.00225	0.045572	0.999	−0.11126	0.11576
	Filter Method	Wrapper Method	0.05044	0.032224	0.279	−0.02983	0.13070
	Wrapper Method	Clustering Only	−0.04819	0.041601	0.488	−0.15181	0.05543
	Wrapper Method	Filter Method	−0.05044	0.032224	0.279	−0.13070	0.02983
Dunnett’s C	Clustering Only	Filter Method	−0.00225	0.039471		−0.15616	0.15166
	Clustering Only	Wrapper Method	0.04819	0.040489		−0.10404	0.20042
	Filter Method	Clustering Only	0.00225	0.039471		−0.15166	0.15616
	Filter Method	Wrapper Method	0.05044	0.028057		−0.02681	0.12769
	Wrapper Method	Clustering Only	−0.04819	0.040489		−0.20042	0.10404
	Wrapper Method	Filter Method	−0.05044	0.028057		−0.12769	0.02681

Note: Based on observed means. The error term is mean square (error) = 0.006.

Table 4. Levene’s test of equality of error variances.

		Levene’s Statistic	df1	df2	Sig.
False Positive Rates	Based on Mean	3.539	2	25	0.044
	Based on Median	0.133	2	25	0.876
	Based on Median and with Adjusted df	0.133	2	15.940	0.876
	Based on Trimmed Mean	2.758	2	25	0.083

Note: Tests the null hypothesis that the error variance of the dependent variable is equal across groups.

Table 5. False-positive rates of filter and filter–wrapper methods.

Feature-Selection Methods	False-Positive Rates
CFS	0.015
IG	0.068
PCA	0.020
CFS-PSO	0.015
CFS-ABC	0.014
IG-PSO	0.047
IG-ABC	0.086
PCA-PSO	0.035
PCA-ABC	0.022
No Feature Selection	0.015

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
