MDPI - Publisher of Open Access Journals

24 pages, 1010 KiB

Open Access Article

Sensitivity Estimation for Differentially Private Query Processing

by Meifan Zhang, Xin Liu and Lihua Yin

Appl. Sci. 2025, 15(14), 7667; https://doi.org/10.3390/app15147667 - 8 Jul 2025

Viewed by 204

Differential privacy is a robust framework for private data analysis and query processing, which achieves privacy preservation by introducing controlled noise to query results in a centralized setting. The sensitivity of a query, defined as the maximum change in query output resulting from the addition or removal of a single data record, directly influences the magnitude of noise to be introduced. Computing sensitivity for simple queries, such as count queries, is straightforward, but it becomes significantly more challenging for complex queries involving join operations. In such cases, the global sensitivity can be unbounded, which substantially impacts the accuracy of query results. While existing measures like elastic sensitivity and residual sensitivity provide upper bounds on local sensitivity to reduce noise, they often struggle with either low utility or high computational overhead when applied to complex join queries. In this paper, we propose two novel sensitivity estimation methods based on sampling and sketching techniques, which provide competitive utility while achieving higher efficiency compared to existing state-of-the-art approaches. Experiments on real-world and benchmark datasets confirm that both methods enable efficient differentially private joins, significantly enhancing the usability of online interactive query systems. Full article
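The relationship between sensitivity and noise described in this abstract can be illustrated with the standard Laplace mechanism for a count query. This is a minimal sketch under stated assumptions, not the paper's sampling- or sketch-based estimators: the function names `laplace_noise` and `private_count` and the sample data are hypothetical, and the sensitivity of 1 applies only to a simple count (it does not hold for join queries).

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng=None):
    """Return a noisy count of the records that satisfy `predicate`."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one record changes a count by at most 1,
    # so the sensitivity of a simple count query is 1.
    sensitivity = 1.0
    # The noise scale grows with sensitivity and shrinks with epsilon.
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    ages = [23, 35, 41, 52, 67]  # hypothetical data
    print(private_count(ages, lambda age: age >= 40, epsilon=1.0))
```

Because the noise scale is `sensitivity / epsilon`, a query with a larger (or unbounded) sensitivity would need proportionally more noise for the same privacy budget, which is why tighter sensitivity estimates matter.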

(This article belongs to the Special Issue Advanced Technology of Information Security and Privacy)

21 pages, 979 KiB

Open Access Article

Efficient and Secure Traffic Scheduling Based on Private Sketch

by Yang Chen, Huishu Wu and Xuhao Ren

Mathematics 2025, 13(2), 288; https://doi.org/10.3390/math13020288 - 17 Jan 2025

Viewed by 725

Abstract

In today’s data-driven world, the explosive growth of network traffic often leads to network congestion, which seriously affects service performance and user experience. Network traffic scheduling is one of the key technologies to deal with congestion problems. Traditional traffic scheduling methods often rely on static rules or pre-defined policies, which make it difficult to cope with dynamically changing network traffic patterns. Additionally, the inability to efficiently manage tail contributors that disproportionately contribute to traffic can further exacerbate congestion issues. In this paper, we propose ESTS, an efficient and secure traffic scheduling based on private sketch, capable of identifying tail contributors to adjust routing and prevent congestion. The key idea is to develop a randomized admission (RA) structure, linking two count-mean-min (CMM) sketches. The first CMM sketch records cold items, while the second, following the RA structure, stores hot items with high frequency. Moreover, considering that tail contributors may leak private information, we incorporate Gaussian noise uniformly into the CMM sketch and RA structure. Experimental evaluations on real and synthetic datasets demonstrate that ESTS significantly improves the accuracy of feature distribution estimation and privacy preservation. Compared to baseline methods, the ESTS framework achieves a 25% reduction in average relative error and a 30% improvement in tail contributor identification accuracy. These results underline the framework’s efficiency and reliability. Full article

(This article belongs to the Special Issue Privacy-Preserving Techniques in AI, Blockchain and Cloud Systems with Formal Mathematical Analysis)

19 pages, 1342 KiB

Open Access Article

Anomaly Detection over Streaming Graphs with Finger-Based Higher-Order Graph Sketch

by Min Lu, Qianzhen Zhang and Xianqiang Zhu

Mathematics 2024, 12(19), 3092; https://doi.org/10.3390/math12193092 - 2 Oct 2024

Viewed by 1645

Abstract

A streaming graph is a constantly growing sequence of edges, which forms a dynamic graph that changes with every edge in the stream. An anomalous behavior in a streaming graph can be modeled as an edge or a subgraph that is unusual compared to the rest of the graph. Identifying anomalous behaviors in real time is essential to the early warning of abnormal or notable events. Due to the complexity of the problem, little work addressing it has been reported so far. In this paper, we propose Finger-based Higher-order Graph Sketch (FHGS for short), an approximate data structure for streaming graphs that offers linear memory usage, high update speed, and high accuracy, and that supports both edge and subgraph anomaly detection. FHGS first maps each edge into a matrix based on hash functions, and then counts its frequency in a time window with unique fingerprints for detecting anomalies. Extensive experiments confirm that our approach generates high-quality results compared to baseline methods. Full article

23 pages, 847 KiB

Open Access Article

APT Attack Detection Scheme Based on CK Sketch and DNS Traffic

by Defan Xue, Yaping Chi, Bing Wu and Lun Zhao

Sensors 2023, 23(4), 2217; https://doi.org/10.3390/s23042217 - 16 Feb 2023

Cited by 3 | Viewed by 3181

Abstract

In recent years, Advanced Persistent Threat (APT) attacks against sensors have emerged as a prominent security concern. Due to the low level of protection provided by sensors, APT attack organizations are able to develop intrusion schemes that allow them to infiltrate, attack, lurk, spread, and steal information from the target over an extended period of time. Extensive research on the APT attack process and current defense mechanisms has found that analyzing Domain Name System (DNS) traffic in the communication control phase is an effective way of detecting APT attacks. However, analyzing APT attacks based on traffic usually involves inspecting a vast amount of DNS traffic, and current data preprocessing methods do not scale down data effectively, leading to low detection efficiency. Most previous work has focused on calculating the features of request messages or corresponding messages without considering the association between request messages and corresponding messages. To address these issues, we propose a sketch-based APT attack traffic detection scheme. The scheme leverages the sketch structure to count and compress network traffic, improving the efficiency of APT detection. Our work also analyzes the limitations of traditional sketches in network traffic and proposes an improved sketch scheme. In addition, we propose several effective features for detecting APT attacks. We validate and evaluate our solution using 1,088,280 DNS traffic from a lab network and APT suspicious traffic from netresec and contagio, using eight machine learning models. The experimental results show that for the ExtraTrees model, our solution has a processing time of 0.0638 s and an accuracy of 0.97920, reducing the processing time by a factor of approximately 50 and improving detection accuracy by a small margin compared to a dataset without sketch processing. Full article

(This article belongs to the Collection Cryptography and Security in IoT and Sensor Networks)

17 pages, 3472 KiB

Open Access Article

Secure Medical Data Collection in the Internet of Medical Things Based on Local Differential Privacy

by Jinpeng Wang and Xiaohui Li

Electronics 2023, 12(2), 307; https://doi.org/10.3390/electronics12020307 - 6 Jan 2023

Cited by 9 | Viewed by 2093

Abstract

As big data and data mining technology advance, research on the collection and analysis of medical data on the internet of medical things (IoMT) has gained increasing attention. Medical institutions often collect users’ signs and symptoms from their devices for analysis. However, the process of data collection may pose a risk of privacy leakage without a trusted third party. To address this issue, we propose a medical data collection algorithm based on local differential privacy and Count Sketch (MDLDP). The algorithm first uses a random sampling technique to select only one symptom per user for perturbation. The perturbed data is then uploaded using Count Sketch. The third party aggregates the user-submitted data to estimate the frequencies of the symptoms and the mean extent of their occurrence. This paper theoretically demonstrates that the designed algorithm satisfies local differential privacy and yields unbiased estimates. We also evaluated the algorithm experimentally against existing algorithms on a real medical dataset. The results show that the MDLDP algorithm has good utility for key-value type medical data collection statistics in the IoMT. Full article
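The Count Sketch structure mentioned in this abstract can be illustrated as follows. This is a minimal, non-private sketch: it omits the sampling and local perturbation steps that the abstract attributes to MDLDP, and the class name, hashing scheme, and symptom labels are assumptions made for illustration only.

```python
import hashlib
from statistics import median


class CountSketch:
    """Plain Count Sketch: signed counters, median-of-rows estimate."""

    def __init__(self, depth=5, width=256):
        self.depth = depth
        self.width = width
        self.table = [[0] * width for _ in range(depth)]

    def _bucket_and_sign(self, item, row):
        # Derive a bucket index and a +1/-1 sign from a salted hash.
        digest = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8).digest()
        value = int.from_bytes(digest, "big")
        bucket = value % self.width
        sign = 1 if (value >> 63) & 1 else -1
        return bucket, sign

    def add(self, item, count=1):
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(item, row)
            self.table[row][bucket] += sign * count

    def estimate(self, item):
        # Each row gives a signed estimate; the median limits the
        # influence of rows where the item collided with another item.
        row_estimates = []
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(item, row)
            row_estimates.append(sign * self.table[row][bucket])
        return median(row_estimates)


if __name__ == "__main__":
    sketch = CountSketch()
    for symptom in ["fever"] * 100 + ["cough"] * 40:  # hypothetical reports
        sketch.add(symptom)
    print(sketch.estimate("fever"), sketch.estimate("cough"))
```

The random signs make collisions cancel in expectation, which is what allows the aggregator to recover approximately unbiased frequency estimates from a fixed-size table.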

(This article belongs to the Special Issue Security and Privacy Preservation in Big Data Age)

15 pages, 3178 KiB

Open Access Article

Curcumin in Wound Healing—A Bibliometric Analysis

by Faiza Farhat, Shahab Saquib Sohail, Farheen Siddiqui, Reyazur Rashid Irshad and Dag Øivind Madsen

Life 2023, 13(1), 143; https://doi.org/10.3390/life13010143 - 4 Jan 2023

Cited by 22 | Viewed by 6547

Abstract

Background: Curcumin has been widely used to treat a variety of diseases and disorders since ancient times, most notably for the purpose of healing wounds. Despite the large number of available reviews on this topic, a bibliometric tool-based meta-analysis is missing in the literature. Scope and approach: To evaluate the influence and significance of the countries, journals, organizations and authors that have contributed the most to this topic, the popular bibliometric markers, including article count, citation count, and Hirsch index (H-index), are taken into account. Their collaborative networks and keyword co-occurrence along with the trend analysis are also sketched out using the VOSviewer software. To the best of our knowledge, this is the first bibliometric review on the topic and hence it is envisaged that it will attract researchers to explore future research dimensions in the related field. Key findings and conclusions: India provided the most articles, making up more than 27.49 percent of the entire corpus. The International Journal of Biological Macromolecules published the most articles (44), and it also received the most citations (2012). The Journal of Ethnopharmacology (28 articles) and Current Pharmaceutical Design (20 articles) were the next most prolific journals with 1231 and 812 citations, respectively. The results indicate a significant increase in both research and publications on the wound-healing properties of curcumin. Recent studies have concentrated on creating novel medicine-delivery systems that use nano-curcumin to boost the effect of the curcumin molecule in therapeutic targeting. It has also been observed that genetic engineering and biotechnology have recently been employed to address the commercial implications of curcumin. Full article

(This article belongs to the Section Pharmaceutical Science)

18 pages, 528 KiB

Open Access Article

ACM: Accuracy-Aware Collaborative Monitoring for Software-Defined Network-Wide Measurement

by Jiqing Gu, Chao Song, Haipeng Dai, Lei Shi, Jinqiu Wu and Li Lu

Sensors 2022, 22(20), 7932; https://doi.org/10.3390/s22207932 - 18 Oct 2022

Cited by 3 | Viewed by 1815

Abstract

Software-defined measurement (SDM) is a simple and efficient way to deploy measurement tasks and collect measurement data. With SDM, it is convenient for operators to implement fine-grained network-wide measurements at the flow level, from which many important functions can benefit. Prior work provides mechanisms to distribute flows to monitors, such that each monitor can identify its non-overlapping subset of flows to measure, and a certain global performance criterion, such as load balance or flow coverage, is optimized. Many network management applications can benefit from a function that finds large flows efficiently, such as congestion control by dynamically scheduling large flows, caching of forwarding table entries, and network capacity planning. However, current network-wide measurements neglect the diversity of flows, as they treat large flows and small flows equally. In this paper, we present a mechanism of accuracy-aware collaborative monitoring (ACM) to improve the measurement accuracy of large flows in network-wide measurements at the flow level. A sketch is an approximate counting structure, and high measurement accuracy can be achieved by merging the results from multiple monitors with sketches, which is termed collaborative monitoring. The core idea of our method is to allocate more monitors to large flows while balancing the load, thereby providing accuracy-aware monitoring. We modeled our problem as an integer linear programming problem, which is NP-hard. Thus, we propose an approximation algorithm, named the improved longest processing time algorithm (iLPTA), and prove that its approximation ratio is $\frac{1}{2} + \frac{n}{l}$. We also propose a two-stage online distribution algorithm (TODA) and prove that its approximation ratio is $1 + \frac{n}{l - 1}$. The iLPTA is an offline approximation algorithm used to assign monitors to each flow, which demonstrates the validity and feasibility of the core idea. The TODA is an online algorithm that attempts to achieve load balance by assigning the monitor with the smallest load to a large flow. Our extensive experimental results verify the effectiveness of the proposed algorithms. Full article
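The load-balancing idea behind this abstract can be illustrated with the classic longest processing time (LPT) greedy rule, on which the name iLPTA appears to build. This is a sketch of the textbook rule only, not the authors' iLPTA or TODA (which, per the abstract, also allocate several monitors to large flows); the function name, flow labels, and sizes are hypothetical.

```python
import heapq


def lpt_assign(flow_sizes, num_monitors):
    """Assign each flow to one monitor using the classic LPT greedy rule.

    Flows are processed from largest to smallest, and each flow goes to
    the monitor that currently carries the smallest load.
    """
    # Min-heap of (current_load, monitor_id).
    heap = [(0, monitor) for monitor in range(num_monitors)]
    heapq.heapify(heap)
    assignment = {}
    ordered = sorted(flow_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for flow, size in ordered:
        load, monitor = heapq.heappop(heap)
        assignment[flow] = monitor
        heapq.heappush(heap, (load + size, monitor))
    # Recompute the final per-monitor loads for reporting.
    loads = [0] * num_monitors
    for flow, monitor in assignment.items():
        loads[monitor] += flow_sizes[flow]
    return assignment, loads


if __name__ == "__main__":
    flows = {"a": 7, "b": 5, "c": 4, "d": 3, "e": 1}  # hypothetical flow sizes
    assignment, loads = lpt_assign(flows, num_monitors=2)
    print(assignment, loads)
```

Selecting the least-loaded monitor at each step is also the rule the abstract describes for TODA's online setting; the difference in this offline sketch is that flows are sorted by size first.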

(This article belongs to the Section Sensor Networks)

14 pages, 5142 KiB

Open Access Article

A Fast Deployable Instance Elimination Segmentation Algorithm Based on Watershed Transform for Dense Cereal Grain Images

by Junling Liang, Heng Li, Fei Xu, Jianpin Chen, Meixuan Zhou, Liping Yin, Zhenzhen Zhai and Xinyu Chai

Agriculture 2022, 12(9), 1486; https://doi.org/10.3390/agriculture12091486 - 16 Sep 2022

Cited by 2 | Viewed by 2593

Abstract

Cereal grains are a vital part of the human diet. The appearance quality and size distribution of cereal grains play major roles as deciders or indicators of market acceptability, storage stability, and breeding. Computer vision is popular in completing quality assessment and size analysis tasks, in which an accurate instance segmentation is a key step to guaranteeing the smooth completion of tasks. This study proposes a fast deployable instance segmentation method based on a generative marker-based watershed segmentation algorithm, which combines two strategies (one strategy for optimizing kernel areas and another for comprehensive segmentation) to overcome the problems of over-segmentation and under-segmentation for images with dense and small targets. Results show that the average segmentation accuracy of our method reaches 98.73%, which is significantly higher than the marker-based watershed segmentation algorithm (82.98%). To further verify the engineering practicality of our method, we count the size distribution of segmented cereal grains. The results keep a high degree of consistency with the manually sketched ground truth. Moreover, our proposed algorithm framework can be used as a great reference in other segmentation tasks of dense targets. Full article

(This article belongs to the Section Artificial Intelligence and Digital Agriculture)

17 pages, 2453 KiB

Open Access Article

High-Level Design Optimizations for Implementing Data Stream Sketch Frequency Estimators on FPGAs

by Ali Ebrahim

Electronics 2022, 11(15), 2399; https://doi.org/10.3390/electronics11152399 - 31 Jul 2022

Cited by 5 | Viewed by 2299

Abstract

This paper presents simple yet effective optimizations for implementing data stream frequency estimation sketch kernels using High-Level Synthesis (HLS). The paper addresses design issues common to sketches utilizing large portions of the embedded RAM resources in a Field Programmable Gate Array (FPGA). First, a solution based on Load-Store Queue (LSQ) architecture is proposed for resolving the memory dependencies associated with the hash tables in a frequency estimation sketch. Second, performance fine-tuning through high-level pragmas is explored to achieve the best possible throughput. Finally, a technique based on pre-processing the data stream in a small cache memory prior to updating the sketch is evaluated to reduce the dynamic power consumption. Using an Intel HLS compiler, a proposed optimized hardware version of the popular Count-Min sketch utilizing 80% of the embedded RAM in an Intel Arria 10 FPGA, achieved more than 3x the throughput of an unoptimized baseline implementation. Furthermore, the sketch update rate is significantly reduced when the input stream is skewed. This, in turn, minimizes the effect of high throughput on dynamic power consumption. Compared to FPGA sketches in the published literature, the presented sketch is the most well-rounded sketch in terms of features and versatility. In terms of throughput, the presented sketch is on a par with the fastest sketches fine-tuned at the Register Transfer Level (RTL). Full article

(This article belongs to the Special Issue Recent FPGA Architectures and Applications)

25 pages, 802 KiB

Open Access Article

Interactive Graph Stream Analytics in Arkouda

by Zhihui Du, Oliver Alvarado Rodriguez, Joseph Patchett and David A. Bader

Algorithms 2021, 14(8), 221; https://doi.org/10.3390/a14080221 - 21 Jul 2021

Cited by 10 | Viewed by 3674

Abstract

Data from emerging applications, such as cybersecurity and social networking, can be abstracted as graphs whose edges are updated sequentially in the form of a stream. The central challenge of interactive graph stream analytics is responding quickly to end-user queries over graph stream data of terabyte scale and beyond. In this paper, a succinct and efficient double index data structure is designed to build the sketch of a graph stream to meet general queries. A single pass stream model, which includes general sketch building, distributed sketch based analysis algorithms and regression based approximation solution generation, is developed, and a typical graph algorithm, triangle counting, is implemented to evaluate the proposed method. Experimental results on power law and normal distribution graph streams show that our method can generate accurate results (mean relative error less than 4%) with high performance. All our methods and code have been implemented in an open source framework, Arkouda, and are available from our GitHub repository, Bader-Research. This work provides the large and rapidly growing Python community with a powerful way to handle graph stream data of terabyte scale and beyond using their laptops. Full article
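Triangle counting over an edge stream, the benchmark task named in this abstract, can be sketched in its exact, single-machine form. This is not the paper's sketch-and-regression approximation and does not use Arkouda; the function name and the sample edges are hypothetical, and the approach keeps the full adjacency structure in memory, which is exactly what a sketch-based method tries to avoid at terabyte scale.

```python
from collections import defaultdict


def count_triangles(edge_stream):
    """Count triangles exactly while reading an undirected edge stream once."""
    neighbors = defaultdict(set)
    triangles = 0
    for u, v in edge_stream:
        if u == v or v in neighbors[u]:
            continue  # ignore self-loops and repeated edges
        # Every common neighbor of u and v closes one new triangle.
        triangles += len(neighbors[u] & neighbors[v])
        neighbors[u].add(v)
        neighbors[v].add(u)
    return triangles


if __name__ == "__main__":
    edges = [(1, 2), (2, 3), (1, 3), (3, 4), (2, 4), (1, 2)]  # hypothetical stream
    print(count_triangles(edges))  # triangles 1-2-3 and 2-3-4
```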

(This article belongs to the Special Issue Scalable Graph Algorithms and Applications)

21 pages, 513 KiB

Open Access Article

PPDC: A Privacy-Preserving Distinct Counting Scheme for Mobile Sensing

by Xiaochen Yang, Ming Xu, Shaojing Fu and Yuchuan Luo

Appl. Sci. 2019, 9(18), 3695; https://doi.org/10.3390/app9183695 - 5 Sep 2019

Cited by 1 | Viewed by 2390

Abstract

Mobile sensing mines group information through sensing and aggregating users’ data. Among major mobile sensing applications, the distinct counting problem, which aims to find the number of distinct elements in a data stream with repeated elements, is extremely important for avoiding waste of resources. In addition, the privacy protection of users is a critical issue for aggregation security. However, it is a challenge to meet these two requirements simultaneously, since conventional privacy-preserving methods negatively affect the accuracy and efficiency of distinct counting. In this paper, we propose a Privacy-Preserving Distinct Counting scheme (PPDC) for mobile sensing. By integrating the basic idea of homomorphic encryption into the Flajolet-Martin (FM) sketch, PPDC allows an aggregator to conduct distinct counting over large-scale datasets without compromising the privacy of users. Moreover, PPDC supports various forms of sensing data, including camera images, location data, etc. PPDC expands each bit of the hash values of users’ original data, so that the FM sketch can be encrypted to protect users’ privacy. We prove the security of PPDC under the known-plaintext model. The theoretical and experimental results show that PPDC achieves high counting accuracy and practical efficiency, and scales to large datasets. Full article
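The Flajolet-Martin (FM) sketch that this abstract builds on can be illustrated in its plain, unencrypted form. This is a minimal sketch under stated assumptions: it omits the bit expansion and homomorphic encryption that the abstract attributes to PPDC, and the class name, hashing scheme, and item labels are hypothetical. The constant 0.77351 is the standard FM correction factor.

```python
import hashlib

PHI = 0.77351  # standard Flajolet-Martin correction factor


class FMSketch:
    """Plain FM sketch that averages over several salted hash functions."""

    def __init__(self, num_bitmaps=64, bits=32):
        self.bits = bits
        self.bitmaps = [0] * num_bitmaps

    @staticmethod
    def _hash(item, seed):
        digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big")

    def add(self, item):
        for seed in range(len(self.bitmaps)):
            h = self._hash(item, seed)
            # Position of the least significant 1-bit of the hash value.
            rho = (h & -h).bit_length() - 1 if h else self.bits - 1
            self.bitmaps[seed] |= 1 << min(rho, self.bits - 1)

    def merge(self, other):
        # The sketch of a union is the bitwise OR of the two sketches.
        merged = FMSketch(len(self.bitmaps), self.bits)
        merged.bitmaps = [a | b for a, b in zip(self.bitmaps, other.bitmaps)]
        return merged

    def estimate(self):
        if not any(self.bitmaps):
            return 0.0
        total = 0
        for bitmap in self.bitmaps:
            r = 0  # index of the lowest zero bit in this bitmap
            while r < self.bits and bitmap & (1 << r):
                r += 1
            total += r
        return 2 ** (total / len(self.bitmaps)) / PHI


if __name__ == "__main__":
    sketch = FMSketch()
    for i in range(1000):
        sketch.add(f"user-{i % 250}")  # 250 distinct hypothetical users
    print(round(sketch.estimate()))
```

Repeated items set the same bits again, so duplicates never change the sketch, and sketches from different users can be combined with a bitwise OR. That bitwise structure is what makes a per-bit encryption approach plausible.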

(This article belongs to the Special Issue Intelligent Perception, Application and Security Mechanism in the Internet of Things)

10 pages, 568 KiB

Open Access Article

A Review on Hot-IP Finding Methods and Its Application in Early DDoS Target Detection

by Xuan Dau Hoang and Hong Ky Pham

Future Internet 2016, 8(4), 52; https://doi.org/10.3390/fi8040052 - 25 Oct 2016

Cited by 1 | Viewed by 7829

Abstract

On high-speed Internet or computer network connections, the volume of IP (Internet Protocol) packet traffic is extremely high, which makes network monitoring and attack detection difficult. This paper reviews methods for finding high-frequency elements in a data stream and applies the most efficient of them to find Hot-IPs, that is, IP addresses that occur with high frequency among the IP packets passing through the network. Fast finding of Hot-IPs in the IP packet stream can be effectively used in early detection of DDoS (Distributed Denial of Service) attack targets and spreading sources of network worms. Research results show that the Count-Min method gives the best overall performance for Hot-IP detection thanks to its low computational complexity, low space requirement and fast processing speed. We also propose an early detection model of DDoS attack targets based on Hot-IP finding, which can be deployed on the target network routers. Full article
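Hot-IP finding with a Count-Min sketch, the method this abstract singles out, can be illustrated as follows. This is a minimal sketch under stated assumptions, not the detection model proposed in the paper: the class and function names, the hashing scheme, the threshold rule, and the sample addresses are all hypothetical.

```python
import hashlib


class CountMinSketch:
    """Count-Min sketch: estimates never fall below the true count."""

    def __init__(self, depth=4, width=1024):
        self.depth = depth
        self.width = width
        self.table = [[0] * width for _ in range(depth)]

    def _bucket(self, item, row):
        digest = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._bucket(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum across
        # rows is the tightest available upper bound on the true count.
        return min(self.table[row][self._bucket(item, row)]
                   for row in range(self.depth))


def find_hot_ips(packets, threshold, sketch=None):
    """Return the addresses whose estimated packet count reaches `threshold`."""
    if sketch is None:
        sketch = CountMinSketch()
    hot = set()
    for ip in packets:
        sketch.add(ip)
        if sketch.estimate(ip) >= threshold:
            hot.add(ip)
    return hot


if __name__ == "__main__":
    # Hypothetical destination addresses taken from a packet stream.
    stream = ["10.0.0.1"] * 50 + ["10.0.0.2"] * 3 + ["10.0.0.3"] * 2
    print(find_hot_ips(stream, threshold=20))
```

Memory use is fixed by `depth * width` regardless of how many distinct addresses appear, which matches the low space requirement the abstract cites. The trade-off is that an address can be overcounted, so a rare address may occasionally be reported as hot.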

(This article belongs to the Special Issue Cyber Warfare)

Search Results (12)
