Algorithms
  • Article
  • Open Access

25 September 2021

Rough Estimator Based Asynchronous Distributed Super Points Detection on High Speed Network Edge

1 Jiangsu Police Institute, Nanjing 210031, China
2 School of Cyber Science and Engineering, Southeast University, Nanjing 211102, China
* Author to whom correspondence should be addressed.
These authors contributed equally to this work.
This article belongs to the Collection Parallel and Distributed Computing: Algorithms and Applications

Abstract

Super point detection plays an important role in network research and applications. As networks grow in scale, distributed super point detection has become an active research topic. The key challenge in detecting super points across multiple nodes is reducing communication overhead. This paper therefore proposes a three-stage communication algorithm for detecting super points in a distributed environment: the Rough Estimator based Asynchronous Distributed super points detection algorithm (READ). READ uses a lightweight estimator, the Rough Estimator (RE), which is fast to compute and requires little memory, to generate candidate super points. The well-known Linear Estimator (LE) is then applied to accurately estimate the cardinality of each candidate, so that super points are identified correctly. In READ, each node scans IP address pairs asynchronously. When the time window boundary is reached, READ starts its three-stage communication to detect the super points. This paper proves that the accuracy of READ in a distributed environment is no lower than in a single-node environment. Four groups of real-world high-speed network traffic, captured on 10 Gb/s and 40 Gb/s links, are used to test READ. The experimental results show that READ not only achieves high accuracy in a distributed environment, but also incurs less than 5% of the communication overhead of existing algorithms.
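The role of the Linear Estimator in the distributed setting can be illustrated with a minimal Python sketch. It assumes the classical linear counting form of the estimator, in which a bitmap of m bits with z zero bits yields the cardinality estimate m ln(m/z). The class name `LinearEstimator`, the SHA-256 based hash, and the bitmap size are hypothetical choices made for illustration; this is not the READ implementation and it omits the Rough Estimator and the three-stage communication. What it shows is that bitmaps built on separate nodes can be combined with a bitwise OR, which gives the same bitmap, and hence the same estimate, as a single node that saw all of the traffic.

```python
import hashlib
import math


class LinearEstimator:
    """Bitmap-based cardinality estimator (linear counting), illustrative only."""

    def __init__(self, num_bits=1024):
        self.num_bits = num_bits  # assumed bitmap size, not taken from the paper
        self.bits = 0             # a Python int used as the bitmap

    def _index(self, peer):
        # Hypothetical hash choice: first 8 bytes of SHA-256, reduced mod num_bits.
        digest = hashlib.sha256(peer.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, peer):
        # Record one opposite IP address seen for the monitored host.
        self.bits |= 1 << self._index(peer)

    def merge(self, other):
        # Combine two nodes' bitmaps with a bitwise OR.
        if self.num_bits != other.num_bits:
            raise ValueError("bitmaps must have the same size")
        merged = LinearEstimator(self.num_bits)
        merged.bits = self.bits | other.bits
        return merged

    def estimate(self):
        # Linear counting estimate: m * ln(m / z), where z is the number of zero bits.
        zeros = self.num_bits - bin(self.bits).count("1")
        if zeros == 0:
            return float("inf")  # bitmap saturated; the estimate is unbounded
        return self.num_bits * math.log(self.num_bits / zeros)
```

In use, each node would call `add` for every opposite address it observes for a given host, and the merged estimate would be compared against a super point threshold. Because adding an address only sets a bit, the order in which nodes scan their traffic does not affect the merged result.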
