# Using Rough Set Theory to Find Minimal Log with Rule Generation


## Abstract


## 1. Introduction

- Developing a new algorithm using RST basic concepts to create minimal reducts;
- Offering a feasible feature selection methodology scalable to huge datasets, without sacrificing performance;
- Creating a minimal rule decision database that retains information content;
- Using three benchmark UCI datasets to evaluate the performance of the methodology;
- Comparing the result of the proposed model to recent works.

## 2. Related Works

## 3. Theoretical Background

#### 3.1. Rough Set

An information table is a pair $T = (U, A)$, where $U = \{u_1, u_2, \ldots, u_N\}$ is called the universe set, a finite non-empty set of $N$ objects (or instances), and $A$ is a non-empty set of $n + k$ attributes. The set $A$ ($A = C \cup D$) is split into the following two subsets: the conditional attribute set $C$ and the decision attribute set $D$. The subset $C = \{a_1, a_2, \ldots, a_n\}$ has $n$ predictors or conditional attributes, while the subset $D = \{d_1, d_2, \ldots, d_k\}$ has $k$ output variables or decision attributes. For every single feature $a \in A$, there exists a domain which collects its possible assigned values, denoted by $V_a$.

For any subset $P \subseteq A$, the indiscernibility relation is defined as

$$IND(P) = \{(u_1, u_2) \in U \times U : \forall a \in P,\ a(u_1) = a(u_2)\},$$

where $a(u_i)$ indicates the value of attribute $a$ for object $u_i$. This shows that if two objects belong to the indiscernibility relation, $(u_1, u_2) \in IND(P)$, then, by the attributes in $P$, $u_1$ is indistinguishable or unidentifiable (indiscernible) from $u_2$. Mathematically, the relation is symmetric, reflexive, and transitive, i.e., an equivalence relation. Now let $[u]_P$ be the equivalence class of $u \in U$; these classes divide $U$ into distinct blocks labeled $U/P$.

The dependency degree of an attribute set $Q$ on an attribute set $P$ is

$$\gamma_P(Q) = \frac{|POS_P(Q)|}{|U|}, \qquad 0 \le \gamma_P(Q) \le 1,$$

where $|\cdot|$ means cardinality and $POS_P(Q)$, the positive region, is the union of those blocks of $U/P$ that are wholly contained in a single block of $U/Q$. The closer $\gamma_P(Q)$ is to 1, the more dependent $Q$ is on $P$. RST proposes two essential ideas for feature selection based on these fundamentals, which are the Core and the Reduct.

A reduct is a subset $R \subseteq C$ that preserves the dependency degree of the full conditional attribute set, i.e., $\gamma_R(D) = \gamma_C(D)$. If, in addition, no proper subset $R' \subset R$ satisfies $\gamma_{R'}(D) = \gamma_R(D)$, the reduct is called a minimal reduct: the features selected are the minimum that preserve the same dependency degree as the whole original feature set. However, we should remember that the definition allows the theory to generate a set of possible reducts, $RED^F_C(D)$, and any of them are allowed to be used.

The core is the set of attributes common to all reducts:

$$Core_C(D) = \bigcap RED^F_C(D).$$

The discernibility matrix offers another route to reducts. For an information table, its entry $c_{ij}$ can be defined by

$$c_{ij} = \{a \in C : a(x_i) \ne a(x_j)\},$$

so $c_{ij}$ contains the attributes for which $x_i$ and $x_j$ are different. If this matrix is adapted to a decision table, the definition becomes

$$c_{ij} = \{a \in C : a(x_i) \ne a(x_j)\} \ \text{if}\ d(x_i) \ne d(x_j), \quad c_{ij} = \varnothing \ \text{otherwise}.$$
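These definitions can be illustrated with a short, self-contained Python sketch (the paper's own implementation is in R; the toy decision table and attribute names `a1`, `a2` below are purely illustrative):

```python
from itertools import combinations

# Toy decision table: objects with conditional attributes C = {a1, a2}
# and a decision attribute d. All names and values are illustrative.
U = [
    {"a1": 1, "a2": 0, "d": "yes"},
    {"a1": 1, "a2": 0, "d": "yes"},
    {"a1": 0, "a2": 1, "d": "no"},
    {"a1": 1, "a2": 1, "d": "no"},
]
C = ["a1", "a2"]

def ind_classes(P):
    """Equivalence classes of IND(P): objects identical on every a in P."""
    blocks = {}
    for i, u in enumerate(U):
        blocks.setdefault(tuple(u[a] for a in P), set()).add(i)
    return list(blocks.values())

def gamma(P, d="d"):
    """Dependency degree gamma_P(d) = |POS_P(d)| / |U|."""
    pos = set()
    for block in ind_classes(P):
        if len({U[i][d] for i in block}) == 1:  # block is decision-pure
            pos |= block
    return len(pos) / len(U)

def discernibility_matrix(d="d"):
    """c_ij: attributes separating x_i and x_j when their decisions differ."""
    return {(i, j): {a for a in C if U[i][a] != U[j][a]}
            for i, j in combinations(range(len(U)), 2)
            if U[i][d] != U[j][d]}

print(gamma(C))        # 1.0: C fully determines d on this table
print(gamma(["a2"]))   # 1.0 as well, so {a2} preserves the dependency degree
print(gamma(["a1"]))   # 0.25: a1 alone leaves most objects outside POS
```

Since $\gamma_{\{a2\}}(d) = \gamma_C(d)$ here, $\{a2\}$ is a reduct of this toy table, and the singleton entries of the discernibility matrix yield the core.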

#### 3.2. R Language

## 4. Research Methodology

#### 4.1. Problem Statement and Motivation

#### 4.2. Datasets

#### 4.3. Building a Minimal Log Size (Reduct)

- Splitting the dataset into N subsets and running the proposed algorithm on each subset overcomes hardware limitations, since fewer entries mean less memory is needed to load the data, perform the computations, and store the results. Keeping an entire high-dimensional dataset in memory while performing all of the previous steps is usually impossible;
- Reducing the number of calculations: passing only the minimal elements of the discernibility matrix to the reduct calculation avoids computing every possible attribute combination, so the bound $\sum_{i=1}^{N}\binom{N}{i} = 2^{N} - 1$ no longer applies. This certainly reduces the execution time. The proposed code is given in Algorithm 1:

**Algorithm 1:** IRS Algorithm

```
Input:  T = (U, A ∪ D): information table; N: number of iterations; M: number of datasets
Output: Core–Reduct
 1: For each dataset M do
 2:   For each iteration N do
 3:     Calculate IND_N(D)
 4:     Compute DISC.Matrix_N(T)
 5:     Do while (DISC.Matrix_N(T) ≠ Ø) and i ≤ j      (the RST discernibility matrix is symmetric)
 6:       S_{i0,j0} = Sort (x_i, x_j) ∈ DISC.Matrix_N(T) according to the number of conditional attributes A
 7:     End while
 8:     Compute Reduct_N(S_{i0,j0})                    (calculate reducts for the minimal condition attributes)
 9:     Reduct_N = Reduct_N ∩ Reduct_N(S_{i0,j0})
10:   End For N
11:   Core–Reduct = Core–Reduct ∩ Reduct_N             (minimal optimal reduct)
12: End For M
```
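The pruning idea in steps 5–11 can be sketched in Python (a schematic stand-in for the authors' R implementation; the matrix entries, attribute names, and per-iteration reducts below are invented for illustration):

```python
# Hypothetical discernibility-matrix entries for one data subset: each entry
# is the attribute set separating one object pair with different decisions.
disc_entries = [{"a2"}, {"a1", "a2"}, {"a1", "a3"}, {"a3"}]

# Steps 5-7: sort the entries and keep only those with the fewest attributes,
# so the reduct search never enumerates all 2^N - 1 attribute subsets.
min_size = min(len(e) for e in disc_entries)
minimal_entries = [e for e in disc_entries if len(e) == min_size]

# Step 8: every minimal entry must be covered by the reduct candidate;
# with singleton entries this is simply their union.
subset_reduct = set().union(*minimal_entries)

# Steps 9-11: intersect the reducts found across iterations to obtain the
# final Core-Reduct for this dataset.
reducts_per_iteration = [subset_reduct, {"a2", "a3", "a4"}, {"a2", "a3"}]
core_reduct = set.intersection(*reducts_per_iteration)
print(core_reduct)  # attributes common to every iteration's reduct
```

Because only the smallest matrix entries feed the reduct computation, the per-subset cost stays far below the exponential worst case, which is the point of the splitting strategy above.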

#### 4.4. Generating Minimal Decision Rules

**Algorithm 2:** Rule Generation Algorithm

```
Input:  Reduct_N(T): minimal reduct information table; M: number of datasets
Output: Set-Rule_Min
1: For each dataset M do
2:   read.table(Reduct_N(T))
3:   Split Reduct_N(T) into a training set (60%) and a test set (40%)
4:   RI.LEM2Rules.RST(): create rules from the training set of Reduct_N(T)
5:   predict(): test the quality of prediction on the test set of Reduct_N(T)
6:   mean(): check the accuracy of the predictions
7: End For M
```
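The same workflow can be sketched in Python as a schematic stand-in for the R functions named above (`read.table`, `RI.LEM2Rules.RST`, `predict`, `mean`); the reduct attributes, event values, and the trivial majority-vote "rule learner" below are illustrative assumptions, not the paper's LEM2 induction:

```python
import random
from collections import Counter

random.seed(0)
REDUCT = ["event", "src_ip"]  # illustrative minimal-reduct attributes
# Synthetic log-like records: the decision is derived from the event type.
data = [({"event": e, "src_ip": s}, "deny" if e == "Deny" else "allow")
        for e in ("Deny", "Build", "Teardown")
        for s in ("h1", "h2")
        for _ in range(10)]
random.shuffle(data)

# Step 3: split into a 60% training set and a 40% test set.
split = int(0.6 * len(data))
train, test = data[:split], data[split:]

# Step 4 (rule creation): here, the majority decision per reduct pattern.
patterns = {}
for x, y in train:
    patterns.setdefault(tuple(x[a] for a in REDUCT), []).append(y)
rules = {p: Counter(ys).most_common(1)[0][0] for p, ys in patterns.items()}

# Steps 5-6 (predict + mean): score the rules on the held-out test set.
preds = [rules.get(tuple(x[a] for a in REDUCT)) for x, _ in test]
accuracy = sum(p == y for p, (_, y) in zip(preds, test)) / len(test)
print(f"{len(rules)} rules, accuracy = {accuracy:.2f}")
```

Training only on the reduct attributes is what shrinks the rule set: patterns that differed only on discarded attributes collapse into one rule, mirroring the before/after rule counts reported below.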

#### 4.5. Execution Time Comparison with Existing Methods

## 5. Conclusions and Future Works

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest


| Event Name | Log Source | Event Count | Low Level Category | Source IP | Source Port | Destination IP | Destination Port | User Name | Magnitude |
|---|---|---|---|---|---|---|---|---|---|
| Tear down UDP connection | ASA @ 172.17.0.1 | 1 | Firewall Session Closed | 8.8.8.8 | 53 | 172.18.12.10 | 53657 | N/A | 7 |
| Deny protocol src | R | 1 | Firewall Deny | 172.20.12.142 | 56511 | 172.217.23.174 | 443 | N/A | 8 |
| Deny protocol src | ASA @ 172.17.0.1 | 1 | Firewall Deny | 172.20.18.54 | 52976 | 213.139.38.18 | 80 | N/A | 8 |
| Deny protocol src | ASA @ 172.17.0.1 | 1 | Firewall Deny | 172.20.15.71 | 53722 | 52.114.75.79 | 443 | N/A | 8 |
| Deny protocol src | ASA @ 172.17.0.1 | 1 | Firewall Deny | 192.168.180.131 | 55091 | 40.90.22.184 | 443 | N/A | 8 |
| Built TCP connection | ASA @ 172.17.0.1 | 1 | Firewall Deny | 172.18.12.19 | 59201 | 163.172.21.225 | 443 | N/A | 8 |

| Training Data Set | Minimal Attribute | Degree of Dependency ¹ |
|---|---|---|
| First Training Set S1 (∩ three iterations), Reduct_{N=1} | A1 = {Event Name, Source IP, Source Port, Destination IP, Magnitude}, \|A1\| = 5 | 1 |
| Second Training Set S2 (∩ three iterations), Reduct_{N=2} | A2 = {Event Name, Source IP, Destination IP, Magnitude}, \|A2\| = 4 | 0.9992941 |
| Third Training Set S3 (∩ three iterations), Reduct_{N=3} | A3 = {Event Name, Source IP, Source Port, Destination IP, Magnitude}, \|A3\| = 5 | 1 |
| Core-Reduct (A1 ∩ A2 ∩ A3) | A2 = {Event Name, Source IP, Destination IP, Magnitude}, \|A2\| = 4 | 0.9992941 |

¹ A decision attribute d totally depends on a set of attributes A, written A ⇒ d, if all attribute values of d are uniquely determined by the attribute values of A.

| Training Data Set | Number of Decision Rules before Reduct | Number of Decision Rules after Reduct | Prediction Accuracy |
|---|---|---|---|
| First Training Set | S1 = 905 | A1 = 596 | 0.9552733 |
| Second Training Set | S2 = 878 | A2 = 509 | 0.9535073 |
| Third Training Set | S3 = 813 | A3 = 481 | 0.9741291 |

| Dataset | Number of Attributes | Number of Instances |
|---|---|---|
| Glass | 9 | 100 |
| Wiscon | 9 | 699 |
| Zoo | 16 | 100 |

| Data | Num. of Attributes | All Reducts: IRS | All Reducts: SPS and CDM | Execution Time (s): Classical Discernibility Matrix (CDM) | Execution Time (s): SPS | Execution Time (s): IRS |
|---|---|---|---|---|---|---|
| Wiscon | 9 | 4 | 4 | 1362.1 | 24.0956 | 9.05 |
| Glass | 9 | 2 | 2 | 23.3268 | 0.7931 | 0.7 |
| Zoo | 16 | 35 | 35 | 106.6581 | 1.2574 | 0.9967 |


© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Alawneh, T.N.; Tut, M.A.
Using Rough Set Theory to Find Minimal Log with Rule Generation. *Symmetry* **2021**, *13*, 1906.
https://doi.org/10.3390/sym13101906
