Open Access Article

On Frequency Estimation and Detection of Heavy Hitters in Data Streams

Department of Engineering for Innovation, University of Salento, 73100 Lecce, Italy
* Authors to whom correspondence should be addressed.
Future Internet 2020, 12(9), 158; https://doi.org/10.3390/fi12090158
Received: 25 August 2020 / Revised: 12 September 2020 / Accepted: 14 September 2020 / Published: 18 September 2020
(This article belongs to the Section Big Data and Augmented Intelligence)
A stream can be thought of as a very large, possibly infinite, set of data that arrives sequentially and must be processed without being stored in full. Because the memory available to the algorithm is limited, the stream is scanned upon arrival and summarized in a succinct data structure that retains only the information of interest. Two of the main tasks in data stream processing are frequency estimation and heavy hitter detection. Frequency estimation requires estimating the frequency of each item, that is, the number of times or the weight with which it appears in the stream; heavy hitter detection requires reporting all items whose frequency exceeds a fixed threshold. In this work we design and analyze ACMSS, an algorithm for frequency estimation and heavy hitter detection, and compare it against the state-of-the-art ASketch algorithm. We show that, given the same memory budget, our algorithm outperforms ASketch in frequency estimation accuracy. Furthermore, we show that, under the assumptions stated by its authors, ASketch may fail to report all of the heavy hitters, whereas ACMSS reports the full list of heavy hitters with high probability.
Keywords: data stream mining; heavy hitters; frequency estimation; sketches
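
To make the two tasks concrete, the following is a minimal sketch of a generic Count-Min sketch, a standard bounded-memory summary for frequency estimation. It is not ACMSS or ASketch, whose details are not given in this abstract. The class name, the `heavy_hitters` helper, the salted BLAKE2 hashing, and the sketch dimensions are all illustrative assumptions.

```python
import hashlib
from collections import Counter


class CountMinSketch:
    """Generic Count-Min sketch (illustrative; not the paper's ACMSS)."""

    def __init__(self, width, depth):
        self.width = width
        self.depth = depth
        # depth rows of width counters: memory is fixed, whatever the stream length.
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # Assumed hashing scheme: one hash per row, salted with the row number.
        digest = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def update(self, item, weight=1):
        # Add the item's weight to one counter in every row.
        for r in range(self.depth):
            self.rows[r][self._index(item, r)] += weight

    def estimate(self, item):
        # Collisions only add weight, so the row minimum never underestimates.
        return min(self.rows[r][self._index(item, r)] for r in range(self.depth))


def heavy_hitters(sketch, candidates, phi, total):
    """Return the candidates whose estimated frequency exceeds phi * total."""
    return {x for x in candidates if sketch.estimate(x) > phi * total}


# Toy stream: two frequent items and twenty items that appear once.
stream = ["a"] * 50 + ["b"] * 30 + [f"x{i}" for i in range(20)]
sketch = CountMinSketch(width=64, depth=4)
for item in stream:
    sketch.update(item)

exact = Counter(stream)  # exact counts, kept here only for comparison
hh = heavy_hitters(sketch, exact.keys(), phi=0.25, total=len(stream))
```

In this toy setting the candidate set is taken from the exact counts for convenience. A real streaming algorithm cannot enumerate every distinct item, which is one reason heavy hitter detection typically pairs a sketch with a small structure that tracks candidate items.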

MDPI and ACS Style

Ventruto, F.; Pulimeno, M.; Cafaro, M.; Epicoco, I. On Frequency Estimation and Detection of Heavy Hitters in Data Streams. Future Internet 2020, 12, 158.

