1. Introduction
Federated learning (FL) enables multiple parties to collaboratively train a shared model under the principle of “data stays local, only parameters travel” [1,2,3]. This paradigm has attracted increasing attention in privacy-sensitive applications such as medical diagnosis, financial risk control, and industrial monitoring, where data centralization is restricted. Nevertheless, keeping raw data local does not automatically eliminate privacy risks: gradients or model updates uploaded by clients may still encode statistical information about local training data, which can be exploited by adversaries to infer whether specific samples participated in training (membership inference) and further reveal sensitive attributes or distributional characteristics of participants [4,5,6,7,8,9]. Such leakage poses serious threats to individual privacy and institutional data security. Therefore, providing quantifiable and controllable privacy protection while maintaining model utility remains a critical challenge for real-world FL deployment.
Differential privacy (DP) has been widely incorporated into FL as a principled mechanism that provides formal privacy guarantees and an explicit privacy–utility trade-off [10,11,12]. In typical client-level DP-FL, each client update is norm-clipped to bound sensitivity, and Gaussian noise is injected before aggregation. Privacy accounting mechanisms (e.g., Rényi differential privacy, RDP) are further used to track the accumulated privacy loss across communication rounds [11,13]. Despite its practicality, many DP-FL deployments still rely on fixed global clipping thresholds. When client update magnitudes are relatively homogeneous, a fixed threshold may offer acceptable training stability. However, under pronounced Non-IID heterogeneity, update magnitudes and convergence stages can vary substantially across clients and training phases, making it difficult for a single static threshold to accommodate them. Specifically, an overly small threshold may over-compress many client updates, causing excessive information loss and slowing convergence; an overly large threshold leaves most updates effectively unclipped, allowing a few large updates to exert disproportionate influence on aggregation and potentially inducing instability or bias. Moreover, because the noise magnitude commonly scales with the clipping threshold, an excessively large threshold introduces redundant noise, which may drown out small-magnitude updates, especially in later training stages, thereby degrading model utility and complicating the privacy–utility balance.
To address these challenges, we propose AdaCT-DPFL, a differentially private federated learning method with adaptive clipping thresholds. AdaCT-DPFL estimates the global clipping threshold at the server side in each communication round via a quantile-based strategy over the distribution of client update norms, enabling the clipping strength to adapt to heterogeneous clients and evolving training phases. Meanwhile, the noise scale is adjusted in accordance with the estimated threshold, alleviating both over-clipping and noise redundancy induced by fixed thresholds. As a result, AdaCT-DPFL improves convergence stability and final performance under identical privacy constraints, and enhances robustness against membership inference attacks.
Contributions
We identify a key mismatch in client-level DP-FL under Non-IID heterogeneity: fixed clipping can simultaneously cause over-clipping for many clients and redundant noise for small updates in later rounds.
We propose AdaCT-DPFL, which estimates a round-wise global clipping threshold via quantile statistics of client update norms and synchronizes Gaussian noise calibration accordingly.
We clarify privacy accounting under adaptive thresholds using the RDP accountant, explaining why the noise multiplier seen by the accountant stays fixed while clipping thresholds vary under proportional noise calibration.
We validate the method on CIFAR-10 and a real-world rail defect dataset with black-box membership inference attacks using multiple attack classifiers.
2. Related Work
Although FL keeps raw data local, prior studies have shown that gradients or model updates can still leak information about local training distributions, leading to practical privacy risks in federated deployments [4,5,6,7,8]. Membership inference attacks are commonly used to empirically assess such leakage in both centralized and federated settings [5,9], thereby motivating privacy mechanisms that provide rigorous protection beyond data locality.
To achieve formal and quantifiable privacy guarantees, DP has been extensively adopted in FL, establishing an explicit privacy–utility trade-off with principled accounting of privacy loss across training rounds [10,11,12]. In client-level DP-FL, sensitivity is typically bounded by applying norm clipping to each client update, followed by Gaussian perturbation prior to aggregation, and cumulative privacy consumption is tracked by standard accounting techniques such as RDP [11,12,13]. Recent DP-FL defenses further explore mitigating inference attacks using randomized response or related randomization mechanisms under DP settings [14]. Beyond the classical DP-FL pipeline, prior work has explored system- and algorithm-level designs for privacy-preserving distributed optimization, and surveys further summarize design choices and practical considerations for privacy-preserving machine learning, including cryptographic approaches and their challenges [12,15,16,17].
A central challenge in DP-FL is mitigating the utility degradation induced by clipping and noise. Existing studies have proposed improvements from multiple angles, including adaptive clipping and local training refinement to better match heterogeneous update scales, adaptive local-DP formulations, and mechanisms designed to improve robustness and performance under heterogeneous or edge environments [18,19,20]. Other work targets realistic training issues such as imbalance and evolving dynamics, proposing training adjustments under DP constraints to alleviate performance loss [21,22]. Moreover, dynamic or personalized federated learning with adaptive DP has been studied to better handle client heterogeneity by adapting learning behavior or privacy configurations across clients and stages [23]. In parallel, research on differentially private optimization investigates optimization-aware strategies (e.g., adaptive preconditioning) under privacy constraints, providing insights into the interaction between DP noise and optimizer dynamics [24]. Among these adaptive methods, DP-FedPUAC [18] adapts clipping on the client side by adjusting the clipping bound together with local iteration optimization, while AdaClip-style approaches similarly adapt clipping using heuristic or per-client statistics to stabilize DP-SGD/DP-FL training [25]. In contrast, our method estimates a round-wise global clipping threshold using quantile statistics over participating clients’ update norms and synchronizes Gaussian noise calibration with the round-wise threshold. This differs from adaptive DP optimization (e.g., adaptive preconditioning [24]), which mainly modifies optimizer dynamics under DP constraints rather than performing round-wise global threshold estimation with matched noise calibration.
Nevertheless, many practical DP-FL deployments still adopt fixed global clipping thresholds or static noise calibration. Under Non-IID heterogeneity and time-varying training dynamics, static thresholds may over-clip a large portion of clients while allowing a few large updates to exert disproportionate influence after clipping. Furthermore, when the noise scale is coupled with the clipping threshold, static threshold choices may inject redundant perturbations during phases where most updates are small, suppressing effective learning signals and undermining the privacy–utility balance. These limitations motivate globally consistent adaptive thresholding mechanisms that can track the evolving distribution of client update magnitudes.
3. Problem Formulation
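Let $N$ denote the number of clients. Client $i$ holds a local dataset $D_i$ with $n_i$ samples, and the total number of samples is $n = \sum_{i=1}^{N} n_i$. The global model parameters at round $t$ are denoted by $w_t$. After local training in round $t$, client $i$ produces a model update $\Delta_i^t$, and its $\ell_2$-norm is $\|\Delta_i^t\|_2$. The clipping threshold at round $t$ is $C_t$. We use $z$ to denote the noise multiplier, so the Gaussian noise standard deviation is $\sigma_t = z C_t$.
The local empirical risk of client $i$ is defined as
$$F_i(w) = \frac{1}{n_i} \sum_{(x, y) \in D_i} \ell(w; x, y), \tag{1}$$
where $\ell(\cdot)$ denotes the loss function. Client-level differentially private federated learning optimizes the global objective:
$$\min_{w} F(w) = \sum_{i=1}^{N} \frac{n_i}{n} F_i(w). \tag{2}$$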
Differential privacy (DP) [13,16,24] is a formal privacy framework providing theoretical guarantees against identifying private data within datasets. Within DP, $\varepsilon$ is the privacy budget and $\delta$ is the upper bound on the failure probability. For any pair of adjacent datasets $D$ and $D'$ that differ in one record, and any measurable subset $O$ of outputs, a randomized mechanism $\mathcal{M}$ satisfies $(\varepsilon, \delta)$-DP if:
$$\Pr[\mathcal{M}(D) \in O] \le e^{\varepsilon} \Pr[\mathcal{M}(D') \in O] + \delta. \tag{3}$$
In client-level DP-FL, each client computes a local update $\Delta_i^t$ and applies $\ell_2$-norm clipping to bound its sensitivity:
$$\bar{\Delta}_i^t = \Delta_i^t \cdot \min\left(1, \frac{C_t}{\|\Delta_i^t\|_2}\right). \tag{4}$$
Gaussian noise is then injected before aggregation:
$$\tilde{\Delta}_i^t = \bar{\Delta}_i^t + \mathcal{N}\left(0, z^2 C_t^2 I\right), \tag{5}$$
where $z$ is the noise multiplier determined by the privacy accountant for a target $(\varepsilon, \delta)$, and the corresponding noise standard deviation is $\sigma_t = z C_t$. In many DP-FL methods, the clipping threshold $C_t$ and noise multiplier $z$ are set as fixed hyperparameters throughout training. However, under Non-IID heterogeneity, update magnitudes can vary substantially across clients and training phases, making it difficult for a single static threshold to accommodate them. For example, in common Non-IID partitioning schemes for federated image classification (e.g., Dirichlet-based partitioning), class proportions across clients are often highly imbalanced, resulting in skewed label distributions. Such statistical heterogeneity is reflected in the optimization process, leading to pronounced scale discrepancies in update norms across clients and communication rounds, with stage-wise variations throughout training.
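The clip-then-perturb step described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `clip_update` and `add_gaussian_noise`, the small constant guarding against division by zero, and the example values are all hypothetical.

```python
import numpy as np


def clip_update(update: np.ndarray, clip_threshold: float) -> np.ndarray:
    """Rescale the update so that its L2 norm is at most clip_threshold."""
    norm = float(np.linalg.norm(update))
    # Updates already inside the ball are left unchanged (scale = 1).
    scale = min(1.0, clip_threshold / (norm + 1e-12))
    return update * scale


def add_gaussian_noise(clipped, clip_threshold, noise_multiplier, rng):
    """Add isotropic Gaussian noise with std = noise_multiplier * clip_threshold."""
    sigma = noise_multiplier * clip_threshold
    return clipped + rng.normal(0.0, sigma, size=clipped.shape)


rng = np.random.default_rng(0)
large_update = np.array([3.0, 4.0])   # L2 norm 5, will be scaled down
small_update = np.array([0.3, 0.4])   # L2 norm 0.5, passes through unchanged

clipped_large = clip_update(large_update, clip_threshold=1.0)
clipped_small = clip_update(small_update, clip_threshold=1.0)
noisy = add_gaussian_noise(clipped_large, 1.0, noise_multiplier=0.5, rng=rng)
```

Note that both clients receive noise of the same scale, although the small update carries only half the norm of the clipped large one; this is the mismatch the following paragraphs discuss.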
To characterize the effective update strength under fixed configurations, the post-clipping update norm can be written as
$$\|\bar{\Delta}_i^t\|_2 = \min\left(\|\Delta_i^t\|_2, C_t\right). \tag{6}$$
Based on this, we introduce an intuitive signal-to-noise ratio (SNR) proxy, where the injected Gaussian noise scale is proportional to the clipping bound $C_t$ under the standard DP-SGD mechanism [26]:
$$\mathrm{SNR}_i^t = \frac{\|\bar{\Delta}_i^t\|_2}{\sigma_t} = \frac{\min\left(\|\Delta_i^t\|_2, C_t\right)}{z C_t}. \tag{7}$$
Note that Equation (7) is only used as an intuitive proxy to explain the mismatch between effective update magnitude and injected noise; privacy protection is quantitatively evaluated by the ROC-AUC of membership inference attacks in Section 5.
When a portion of clients exhibit update magnitudes below the clipping threshold in most rounds, clipping has limited impact on their updates. However, noise intensity remains governed by the predefined privacy configuration, resulting in low effective update strength. This makes critical information more susceptible to being drowned out by noise, thereby diminishing their training contribution. Conversely, for clients frequently reaching the clipping threshold, their updates typically remain near the threshold scale after clipping. Under identical noise intensity, they exhibit higher effective update strength, granting them greater influence in aggregation. Simultaneously, the distinctive features of their uploaded updates become clearer, making privacy attacks like membership inference easier to execute by providing exploitable discriminative evidence. As training progresses into later stages, the magnitude of updates from most clients generally decreases. Maintaining a fixed threshold configuration will further exacerbate the problem of insufficient effective updates from the majority of clients, slowing convergence and limiting final performance. Simultaneously, the few clients that still maintain large update magnitudes exhibit stronger relative effective update strength, leading to greater update distinguishability and increased privacy attack risks.
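The effect described above can be made concrete with a small numerical sketch. It assumes the SNR proxy is the post-clipping update norm divided by the noise standard deviation (noise multiplier times clipping threshold); the function name `snr_proxy` and the chosen norms are illustrative, not values from the experiments.

```python
def snr_proxy(update_norm: float, clip_threshold: float, noise_multiplier: float) -> float:
    """Post-clipping update norm divided by the Gaussian noise standard deviation."""
    post_clip_norm = min(update_norm, clip_threshold)
    return post_clip_norm / (noise_multiplier * clip_threshold)


# Fixed configuration: threshold 1.0 and noise multiplier 1.0 for every client.
for update_norm in (0.1, 0.5, 1.0, 5.0):
    print(update_norm, snr_proxy(update_norm, clip_threshold=1.0, noise_multiplier=1.0))
```

Under this proxy, clients at or above the threshold saturate at `1 / noise_multiplier`, while clients with small updates fall proportionally below it, which matches the qualitative argument in the text.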
4. Methods
4.1. Overall Framework and Design Approach
To mitigate the mismatch between fixed clipping thresholds and static noise calibration under Non-IID heterogeneity, where update magnitudes vary across clients and training stages and a single global configuration can lead to over-clipping or redundant perturbation, we propose AdaCT-DPFL, a differentially private federated learning method with round-wise adaptive clipping thresholds. The overall workflow is illustrated in Figure 1 and can be summarized as a round-wise communication procedure:
- (1) Broadcast: The server broadcasts the current global model $w_t$ to the selected clients $S_t$.
- (2) Local training: Each client $i \in S_t$ performs local training and computes its update $\Delta_i^t$.
- (3) Norm reporting: Each client computes the update norm $u_i^t = \|\Delta_i^t\|_2$ and sends $u_i^t$ (a scalar) to the server for threshold estimation.
- (4) Threshold estimation: The server aggregates $\{u_i^t\}_{i \in S_t}$ and computes a round-wise global clipping threshold $C_t$ using a quantile strategy.
- (5) Threshold broadcast: The server broadcasts $C_t$ to clients in $S_t$.
- (6) Clipping and noise addition: Each client clips its update using $C_t$ and adds Gaussian noise calibrated as $\sigma_t = z C_t$.
- (7) Aggregation: Clients upload perturbed updates to the server, which aggregates them to update the global model $w_{t+1}$.
The AdaCT-DPFL algorithm comprises two functional modules:
- (1) Threshold-Adaptive Estimation Module: The server aggregates statistics of client update norms, computes a per-round global clipping threshold using a quantile-based strategy, and broadcasts the threshold to all clients.
- (2) Adaptive Noise Perturbation Module: Each client clips its update using $C_t$ and injects Gaussian noise with a standard deviation of $\sigma_t = z C_t$ (where $z$ is the noise multiplier) before uploading the perturbed update to the server for global aggregation.
To improve clarity and reproducibility, we summarize the complete procedure of AdaCT-DPFL in Algorithm 1.
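| Algorithm 1. AdaCT-DPFL |
| --- |
| Input: global rounds $T$; local epochs $E$; participating client set $S_t$ at round $t$; quantile coefficient $q$, $q \in (0, 1)$; noise multiplier $z$; learning rate $\eta$; client datasets $\{D_i\}$; initial global model $w_0$. |
| Output: final global model $w_T$. |
| Server side: |
| 1: Initialize $w_0$ |
| 2: for $t = 0, 1, \ldots, T-1$ do |
| 3: select participating clients $S_t$; broadcast $w_t$ to all $i \in S_t$ |
| Phase A (norm reporting): |
| 4: parallel for each $i \in S_t$ do |
| 5: $w_i^t \leftarrow$ Local-Train($w_t$, $D_i$, $E$, $\eta$) (client model) |
| 6: $\Delta_i^t \leftarrow w_i^t - w_t$ (update, Equation (8)) |
| 7: $u_i^t \leftarrow \lVert \Delta_i^t \rVert_2$ (norm of the client update, a scalar, Equation (9)) |
| 8: send scalar $u_i^t$ to server |
| 9: end parallel |
| 10: form $U_t = \{u_i^t : i \in S_t\}$ (set of client update norms, Equation (10)) |
| 11: compute $C_t$ from $U_t$ (quantile-based threshold estimation, Equation (11)) |
| 12: broadcast $C_t$ to all $i \in S_t$ |
| Phase B (clipping and perturbation): |
| 13: parallel for each $i \in S_t$ do |
| 14: $\bar{\Delta}_i^t \leftarrow$ clip($\Delta_i^t$, $C_t$) (Equation (4)) |
| 15: $\tilde{\Delta}_i^t \leftarrow \bar{\Delta}_i^t + \mathcal{N}(0, z^2 C_t^2 I)$ (Equation (5)) |
| 16: upload $\tilde{\Delta}_i^t$ to server |
| 17: end parallel |
| 18: aggregate $\{\tilde{\Delta}_i^t\}_{i \in S_t}$ and update $w_{t+1}$ |
| 19: end for |
| 20: Return $w_T$ |
| Note: $z$ is the noise multiplier; the noise standard deviation at round $t$ is $\sigma_t = z C_t$. |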
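To make the two-phase structure of Algorithm 1 easier to follow, the sketch below simulates a single communication round on synthetic updates. It is a hedged illustration only: the function name `adact_round`, the use of `numpy.quantile` with its default interpolation, and the unweighted averaging in the aggregation step are assumptions, since the algorithm listing does not fix these details.

```python
import numpy as np


def adact_round(global_model, client_updates, quantile, noise_multiplier, rng):
    """Simulate one round: norm reporting, threshold estimation, clip, noise, average."""
    # Phase A: each client reports the scalar L2 norm of its update.
    norms = np.array([np.linalg.norm(u) for u in client_updates])
    # Server: round-wise global threshold as a quantile of the reported norms.
    clip_threshold = float(np.quantile(norms, quantile))
    sigma = noise_multiplier * clip_threshold

    # Phase B: each client clips with the broadcast threshold and adds noise.
    noisy_updates = []
    for update, norm in zip(client_updates, norms):
        scale = min(1.0, clip_threshold / (norm + 1e-12))
        noisy = update * scale + rng.normal(0.0, sigma, size=update.shape)
        noisy_updates.append(noisy)

    # Aggregation: plain (unweighted) averaging is an assumption of this sketch.
    new_model = global_model + np.mean(noisy_updates, axis=0)
    return new_model, clip_threshold


rng = np.random.default_rng(0)
model = np.zeros(4)
# Synthetic client updates with heterogeneous magnitudes.
updates = [rng.normal(0.0, s, size=4) for s in (0.1, 0.2, 0.5, 1.0, 3.0)]
model, c_t = adact_round(model, updates, quantile=0.5, noise_multiplier=0.1, rng=rng)
```

With `quantile=0.5` the threshold is the median reported norm, so roughly half of the clients are clipped in each round regardless of the absolute scale of the updates.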
Next, we detail the two key components in Algorithm 1: (i) quantile-based estimation of the round-wise global clipping threshold $C_t$ (Section 4.2), and (ii) synchronized noise calibration and perturbation of clipped updates (Section 4.3).
4.2. Adaptive Clipping Threshold
Let the set of selected (online) clients at round $t$ be $S_t$. After receiving the global model $w_t$, each client $i \in S_t$ performs local training and obtains a local model $w_i^t$. The client update is defined as
$$\Delta_i^t = w_i^t - w_t. \tag{8}$$
Each client computes the update norm
$$u_i^t = \|\Delta_i^t\|_2. \tag{9}$$
The server collects the norms from participating clients and forms the norm set
$$U_t = \{u_i^t : i \in S_t\}. \tag{10}$$
The round-wise global clipping threshold is then estimated by a quantile strategy:
$$C_t = \mathrm{Quantile}_q(U_t). \tag{11}$$
Here $q \in (0, 1)$ controls the clipping intensity. Because $C_t$ is derived from the distribution of client update norms in the current round, it adapts to cross-client heterogeneity and stage-wise training dynamics. When most updates are large (e.g., early training), $C_t$ increases to avoid widespread over-clipping; as training progresses and updates become smaller, $C_t$ decreases accordingly to reduce unnecessary perturbation. Quantile estimation is also robust to a small number of extreme norms, helping prevent abrupt oscillations.
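The adaptation and robustness properties claimed for the quantile rule can be checked on toy numbers. The sketch below is illustrative only: `estimate_threshold` is a hypothetical helper built on `numpy.quantile`, and the norm values are made up to mimic an early round, a late round, and a late round with one extreme client.

```python
import numpy as np


def estimate_threshold(norms, quantile):
    """Round-wise clipping threshold as the given quantile of reported norms."""
    return float(np.quantile(np.asarray(norms, dtype=float), quantile))


early_norms = [4.0, 5.0, 6.0, 7.0, 8.0]          # large updates early in training
late_norms = [0.4, 0.5, 0.6, 0.7, 0.8]           # small updates later in training
late_with_outlier = [0.4, 0.5, 0.6, 0.7, 50.0]   # one extreme client

for name, norms in [("early", early_norms),
                    ("late", late_norms),
                    ("late + outlier", late_with_outlier)]:
    # Compare the median-based threshold with the max, which an outlier dominates.
    print(name, estimate_threshold(norms, 0.5), max(norms))
```

The median-based threshold tracks the overall scale of the round (about 6.0 early, about 0.6 late) and does not move when a single norm jumps to 50, whereas a max-based rule would.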
In our design, $C_t$ bounds sensitivity at round $t$. Noise is calibrated proportionally to $C_t$ (Section 4.3), keeping the effective noise-to-clipping ratio (noise multiplier) consistent for privacy accounting.
4.3. Adaptive Noise Addition
After receiving $C_t$, each client performs $\ell_2$-norm clipping as shown in Equation (4):
$$\bar{\Delta}_i^t = \Delta_i^t \cdot \min\left(1, \frac{C_t}{\|\Delta_i^t\|_2}\right). \tag{4}$$
This ensures that the clipped update satisfies $\|\bar{\Delta}_i^t\|_2 \le C_t$. The clipping operation explicitly bounds the magnitude of each client’s uploaded update within $C_t$, preventing a single client’s large update from dominating the global model during aggregation.
By compressing update magnitudes from different clients and training phases into a comparable dynamic range, cross-client update discrepancies are effectively suppressed, thereby weakening the differential statistical cues exploitable by membership inference attacks and improving aggregation stability and convergence performance under Non-IID settings. Moreover, as training proceeds and gradients diminish, update norms typically decrease, so the relative impact of injected noise becomes increasingly pronounced. Therefore, to reduce noise interference for each client model and improve model accuracy, the noise scale is dynamically adjusted according to the clipping threshold. The noisy update uploaded to the server is defined as shown in Equation (5):
$$\tilde{\Delta}_i^t = \bar{\Delta}_i^t + \mathcal{N}\left(0, z^2 C_t^2 I\right), \tag{5}$$
where $z$ denotes the noise multiplier determined by the privacy accountant for a target $(\varepsilon, \delta)$. Therefore, the Gaussian noise standard deviation at round $t$ is $\sigma_t = z C_t$. This proportional calibration enables the clipping bound and perturbation strength to evolve together across rounds while following a consistent privacy accounting configuration.
Privacy accounting clarification. Although $C_t$ varies across rounds, Equation (5) calibrates noise proportionally to $C_t$, so the effective noise multiplier used by the accountant remains the same. Consequently, the privacy budget is accumulated across rounds using the standard RDP accountant configuration (sampling rate, noise multiplier, and number of rounds), while the adaptive threshold mainly improves utility by reducing over-clipping in large-update phases and redundant perturbation in small-update phases.
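The claim that the accounted privacy loss depends on the noise multiplier rather than on the individual thresholds can be illustrated with the textbook Rényi-DP bound for the Gaussian mechanism. The sketch below is not the Opacus accountant used in the experiments; it assumes full client participation (no subsampling amplification), treats the thresholds as given inputs, and ignores any privacy cost of reporting the norms themselves. The names `gaussian_rdp`, `epsilon_for_thresholds`, and `ORDERS` are hypothetical.

```python
import math

# Candidate Renyi orders to optimize over (an illustrative grid).
ORDERS = [1.0 + k / 10.0 for k in range(1, 640)]


def gaussian_rdp(order: float, sensitivity: float, sigma: float) -> float:
    """RDP of the Gaussian mechanism at a given order: order * s^2 / (2 * sigma^2)."""
    return order * sensitivity ** 2 / (2.0 * sigma ** 2)


def epsilon_for_thresholds(thresholds, noise_multiplier, delta, orders=ORDERS):
    """Compose per-round RDP over the threshold sequence and convert to epsilon."""
    best = math.inf
    for order in orders:
        # Round t has sensitivity C_t and noise std z * C_t, so C_t cancels.
        rdp_total = sum(
            gaussian_rdp(order, c, noise_multiplier * c) for c in thresholds
        )
        eps = rdp_total + math.log(1.0 / delta) / (order - 1.0)
        best = min(best, eps)
    return best


fixed = epsilon_for_thresholds([1.0] * 30, noise_multiplier=1.0, delta=1e-5)
adaptive = epsilon_for_thresholds([5.0, 0.1] * 15, noise_multiplier=1.0, delta=1e-5)
print(fixed, adaptive)
```

Because the threshold cancels in each round's RDP term, the two schedules yield the same epsilon under these assumptions; only the noise multiplier and the number of rounds matter.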
5. Experiments and Discussion
5.1. Experimental Setup
To validate the effectiveness of the proposed AdaCT-DPFL algorithm, we compare it with two representative client-level differentially private federated learning methods: DP-FedAvg [11] and Fed-DPA [23]. Privacy budgets are uniformly computed and controlled using the Rényi Differential Privacy (RDP) accountant implemented in Opacus 1.5.3. All experiments are conducted in Python 3.10 and PyTorch 2.6.0 on a single NVIDIA GeForce GTX 1660 GPU (NVIDIA Corporation, Santa Clara, CA, USA).
AdaCT-DPFL introduces minimal additional overhead: each client only reports a scalar update norm per round for the global clipping threshold estimation, and the server-side quantile computation for $C_t$ is lightweight over $|S_t|$ scalars. Thus, the method is practical for cross-silo deployments such as multi-section railway condition monitoring, where client participation is limited and communication is relatively stable. In practice, the norm-reporting step can be protected by secure aggregation or by adding small noise to the reported norms, following common practice in adaptive clipping literature. Overall, AdaCT-DPFL does not require changes to model architectures or client data pipelines, making it feasible for real-world deployment.
5.1.1. Attack Model
After federated training, we assume that the global model is deployed as an online inference service. The adversary has black-box query access to the target model: for a given input sample, it can only obtain the predicted probability vector, but has no access to model parameters, gradients, or the training data. In addition, the adversary cannot control any client. The goal is to determine whether a queried sample was included in the target model’s training set, i.e., to perform a binary classification between member (participated in training) and non-member (did not participate) samples, thereby conducting a membership inference attack.
Under this threat model, we construct the membership inference dataset for each target model as follows. Samples used during federated training are collected as the member set, while samples from the test set that were not used for training are collected as the non-member set. The target model then performs forward inference on both sets, and we record the output probability vector for each sample. We assign label 1 to member samples and label 0 to non-member samples, forming an attack dataset with member/non-member labels. Finally, the dataset is split into the attacker’s training and testing sets to train and evaluate the membership inference attacker.
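The construction of the attack dataset described above can be sketched as follows. This is a minimal illustration assuming the target model's probability vectors are already available as arrays; the function name `build_attack_dataset`, the split fraction, and the synthetic probability vectors are hypothetical and do not reproduce the paper's data.

```python
import numpy as np


def build_attack_dataset(member_probs, nonmember_probs, train_fraction, rng):
    """Label members 1 and non-members 0, shuffle, and split into train/test."""
    features = np.vstack([member_probs, nonmember_probs])
    labels = np.concatenate([
        np.ones(len(member_probs), dtype=int),      # samples used in training
        np.zeros(len(nonmember_probs), dtype=int),  # held-out test samples
    ])
    order = rng.permutation(len(labels))
    features, labels = features[order], labels[order]
    split = int(train_fraction * len(labels))
    return (features[:split], labels[:split]), (features[split:], labels[split:])


rng = np.random.default_rng(0)
# Synthetic stand-ins for the target model's softmax outputs (10 classes).
member_probs = rng.dirichlet(np.ones(10) * 0.3, size=60)
nonmember_probs = rng.dirichlet(np.ones(10), size=40)
(train_x, train_y), (test_x, test_y) = build_attack_dataset(
    member_probs, nonmember_probs, train_fraction=0.7, rng=rng
)
```

The resulting `train_x`/`train_y` pair would then be passed to an attack classifier, and `test_x`/`test_y` used to score it.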
To exploit privacy-related signals in the target model outputs, we adopt three attack classifiers—Random Forest (RF), Gradient Boosting (GB), and Decision Tree (DT)—using the full predicted probability vector as the input feature. These models learn a mapping from probability distribution patterns to member/non-member labels. Compared with threshold-based attacks that rely on a single statistic (e.g., loss or maximum confidence), using the full probability vector leverages not only the confidence on the true label but also the relative relationships among class probabilities. This enables more fine-grained characterization of overfitting behavior and provides more discriminative evaluation of the privacy protection capability of different DP-FL algorithms under the same experimental setting.
5.1.2. Evaluation Metrics
This paper evaluates the proposed method from two dimensions: model utility and privacy protection. For utility, classification accuracy (Accuracy, denoted as Acc) on the global test set is used to measure the target model’s predictive performance. For privacy, the ROC-AUC of the membership inference attacker on the attack test set serves as the attack effectiveness metric, measuring the attacker’s ability to distinguish between member and non-member samples, thereby reflecting the target model’s privacy leakage risk.
For the global classification model obtained through federated learning, evaluation is conducted using the classification accuracy on the global test set, expressed as:
$$\mathrm{Acc} = \frac{TP + TN}{TP + FN + FP + TN}, \tag{12}$$
where $TP$, $FN$, $FP$, and $TN$ denote the numbers of true positives, false negatives, false positives, and true negatives, respectively. Accuracy lies in $[0, 1]$, with higher values indicating better overall prediction performance.
Treating the membership inference attacker as a binary classifier distinguishing “member/non-member,” the area under the ROC curve (ROC-AUC) is used to measure its discriminative capability. The ROC curve plots the false positive rate (FPR) on the x-axis and the true positive rate (TPR) on the y-axis, where
$$\mathrm{TPR} = \frac{TP}{TP + FN}, \qquad \mathrm{FPR} = \frac{FP}{FP + TN}. \tag{13}$$
ROC-AUC summarizes the TPR–FPR relationship across all decision thresholds and is therefore threshold-independent. It lies in $[0, 1]$. A ROC-AUC closer to 1 indicates stronger attacker discrimination capability and easier attack success, corresponding to higher privacy leakage risk for the target model. Conversely, a ROC-AUC closer to 0.5 suggests that the attacker's discrimination capability approaches random guessing, indicating better privacy protection for the target model.
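For reference, the metrics above can be computed directly from labels and attacker scores. The sketch below uses the rank-based (pairwise) definition of ROC-AUC, which equals the area under the ROC curve; the function names and toy inputs are illustrative assumptions, and a library routine such as scikit-learn's `roc_auc_score` would normally be used in practice.

```python
import numpy as np


def confusion_rates(labels, predictions):
    """Return (accuracy, TPR, FPR) for binary labels and hard predictions."""
    labels = np.asarray(labels)
    predictions = np.asarray(predictions)
    tp = int(np.sum((labels == 1) & (predictions == 1)))
    fn = int(np.sum((labels == 1) & (predictions == 0)))
    fp = int(np.sum((labels == 0) & (predictions == 1)))
    tn = int(np.sum((labels == 0) & (predictions == 0)))
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return accuracy, tp / (tp + fn), fp / (fp + tn)


def roc_auc(labels, scores):
    """Probability that a random member outscores a random non-member (ties count 1/2)."""
    labels = np.asarray(labels)
    scores = np.asarray(scores, dtype=float)
    member_scores = scores[labels == 1]
    nonmember_scores = scores[labels == 0]
    diff = member_scores[:, None] - nonmember_scores[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (len(member_scores) * len(nonmember_scores)))


labels = [1, 1, 0, 0]
print(roc_auc(labels, [0.9, 0.3, 0.6, 0.1]))   # partially separable scores
print(confusion_rates(labels, [1, 0, 1, 0]))
```

An attacker whose scores carry no membership signal yields a value near 0.5 under this definition, which is the reference point used when interpreting the tables.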
5.2. Experimental Validation
Two datasets were used for validation.
- (1) CIFAR-10 Image Dataset: The CIFAR-10 dataset [27] consists of 10 classes of RGB color images. It contains 60,000 $32 \times 32$ color images, with 50,000 for training and 10,000 for testing. The ten categories are airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck, as shown in Figure 2.
The CIFAR-10 training set is partitioned across clients into Non-IID subsets using a Dirichlet distribution with concentration parameter , serving as each client’s local training data. The CIFAR-10 test set is used to evaluate the global model performance. On CIFAR-10, we employ a convolutional neural network consisting of three convolutional layers and three fully connected layers. A lightweight CNN is employed for CIFAR-10 experiments to ensure training stability under DP noise and fair performance comparisons among DP-FedAvg, Fed-DPA and AdaCT-DPFL. Importantly, AdaCT-DPFL modifies the clipping-threshold estimation and noise calibration mechanism rather than the model architecture. Therefore, the proposed method is architecture-agnostic and can be integrated with stronger backbones (e.g., ResNet-style networks) and realistic FL workloads without changing its core design. We will further evaluate larger backbones and cross-device settings in future work.
- (2) Real-world rail defect dataset: The experimental dataset consists of rail vibration signals collected from multiple track sections managed by a Chinese railway locomotive depot. Accelerometers mounted on a rail defect inspection vehicle recorded vibration responses induced by wheel–rail interactions during train operation. Each track section is treated as one client, and its data form the local dataset of that client, thereby constructing a cross-section federated learning setting. Sample labels are determined based on on-site maintenance records and manual verification, covering three typical defect types: rail corrugation, fish-scale defects, and spalling. The raw vibration signals are first processed by outlier removal and normalization, and then segmented into fixed-length time windows to construct input samples. The data acquisition and preprocessing pipeline is illustrated in Figure 3.
After client partitioning, each client further splits its data into training and testing sets with an 8:2 ratio. The training samples remain on the client for local training, while the testing samples are aggregated to form a global test set for unified evaluation. Due to variations in inspection frequency, operating conditions, and sensor installation across different track sections, clients exhibit both sample-size imbalance and feature distribution shifts, resulting in a typical non-IID setting. On the rail defect dataset, we employ a lightweight convolutional neural network consisting of two convolutional layers and two fully connected layers.
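The Dirichlet-based Non-IID partition mentioned for CIFAR-10 in item (1) is commonly implemented by drawing, for each class, a vector of client proportions from a Dirichlet distribution. The sketch below shows one such scheme under stated assumptions: the function name `dirichlet_partition`, the synthetic labels, and the value `alpha=0.5` are illustrative and are not the setting reported for the experiments.

```python
import numpy as np


def dirichlet_partition(labels, num_clients, alpha, rng):
    """Split sample indices across clients with per-class Dirichlet proportions."""
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(num_clients)]
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)
        # Smaller alpha gives more skewed class shares across clients.
        proportions = rng.dirichlet(alpha * np.ones(num_clients))
        cut_points = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client_id, part in enumerate(np.split(idx, cut_points)):
            client_indices[client_id].extend(part.tolist())
    return client_indices


rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=1000)   # synthetic stand-in for class labels
parts = dirichlet_partition(labels, num_clients=10, alpha=0.5, rng=rng)
print([len(p) for p in parts])            # client sizes are typically unequal
```

Every sample is assigned to exactly one client, and both the label mix and the sample count differ across clients, which is the kind of heterogeneity the experiments target.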
5.2.1. Analysis of CIFAR-10 Dataset Experimental Results
Experimental parameters for the CIFAR-10 dataset were set as follows: 10 clients, 30 communication rounds, 4 local training epochs per round, a batch size of 64, and a learning rate of .
Membership inference attacks were conducted on the target models trained by DP-FedAvg, Fed-DPA, and the proposed AdaCT-DPFL method under different privacy budgets $\varepsilon$. Results are reported in Table 1. As $\varepsilon$ increases, the ROC-AUC of membership inference attacks generally increases, indicating that attacks become more successful under weaker privacy protection. Under the same privacy budget, the proposed method consistently achieves lower ROC-AUC across different attacker configurations and remains closer to 0.5 overall, suggesting that the attacks approach random guessing and that the proposed method provides stronger protection against membership inference. This improvement is attributed to adaptive clipping, which suppresses the stage-wise dominance of a small number of anomalous updates during aggregation in non-IID settings, thereby weakening memorization of training samples and reducing discriminative cues exploitable by attackers.
Table 2 reports the classification accuracy of the three algorithms on the CIFAR-10 dataset under different privacy budgets. As $\varepsilon$ increases, the accuracy of all three methods improves overall, indicating that the target model attains better utility under weaker privacy protection (i.e., reduced relative noise perturbation). Compared with DP-FedAvg and Fed-DPA, the proposed method achieves higher classification accuracy across all privacy budget settings. This gain mainly arises from the dynamic matching of the adaptive clipping threshold and adaptive noise scale across training stages: it mitigates information loss caused by over-clipping in the early stage, while reducing the interference of injected noise on small-magnitude updates in the later stage. Consequently, the proposed method improves classification accuracy under the same privacy budget.
Figure 4 shows the accuracy convergence curves over communication rounds for the three algorithms on the CIFAR-10 dataset. Compared with DP-FedAvg and Fed-DPA, the proposed method achieves a higher and more stable final accuracy. These results suggest that adaptive thresholding stabilizes the training dynamics and enables more efficient convergence under non-IID settings.
In summary, the results on CIFAR-10 demonstrate that, under the same privacy budget, the proposed method further reduces the discriminative power of membership inference attacks while effectively improving classification accuracy. Moreover, it exhibits more stable convergence during training, thereby validating the effectiveness of the proposed approach.
5.2.2. Analysis of Experimental Results on the Rail Dataset
For the rail defect dataset experiments, the hyperparameters were set as follows: 10 clients, 40 rounds of global communication, 5 local training epochs per round, a batch size of 16, and a learning rate of .
Membership inference attacks were conducted on target models trained by DP-FedAvg, Fed-DPA, and the proposed AdaCT-DPFL under different privacy budgets $\varepsilon$, with results reported in Table 3. Under the same $\varepsilon$, AdaCT-DPFL achieved lower ROC-AUC across all three attackers (RF, GB, and DT). Averaged over all $\varepsilon$ settings, AdaCT-DPFL achieved an average ROC-AUC of 0.528, which is lower than that of DP-FedAvg and Fed-DPA by 0.9% and 0.4%, respectively, indicating that the attackers’ discrimination capability becomes closer to random guessing. These results demonstrate the stronger privacy protection of the proposed method.
Table 4 presents the classification accuracy comparison results of the three algorithms on the rail defect dataset. Under different $\varepsilon$ settings, the proposed method consistently achieves higher accuracy than DP-FedAvg and Fed-DPA. For example, at one of the evaluated budgets, the proposed method achieves an accuracy of 85.86%, representing improvements of 4.16% and 1.34% over DP-FedAvg and Fed-DPA, respectively. Under a larger privacy budget, the proposed method achieves an accuracy of 88.84%, representing improvements of 2.38% and 1.04% over DP-FedAvg and Fed-DPA, respectively.
Figure 5 presents the accuracy convergence curves on the rail defect dataset. Compared with DP-FedAvg and Fed-DPA, the proposed method captures informative patterns more effectively in the early training phase and reaches higher accuracy within the same number of communication rounds. During the middle and late stages, DP-FedAvg and Fed-DPA exhibit less stable convergence with more pronounced fluctuations. In contrast, AdaCT-DPFL achieves faster and more stable convergence by coordinating adaptive clipping thresholds and adaptive noise scaling to better accommodate heterogeneous update magnitudes across clients, thereby reducing aggregation oscillations caused by mismatched clipping and redundant noise.
In summary, this section compares DP-FedAvg, Fed-DPA, and the proposed method on the rail defect dataset from three aspects: resistance to membership inference attacks, classification accuracy, and convergence behavior. The results show that, under the same privacy budget, the proposed method further reduces the attacker’s discriminative capability and achieves better accuracy and more stable convergence than the baselines, demonstrating superior overall performance under non-IID settings with pronounced client heterogeneity.
6. Conclusions
This paper proposes AdaCT-DPFL, a differentially private federated learning framework with quantile-based adaptive clipping thresholds and synchronized noise scaling. By dynamically estimating the clipping threshold according to the distribution of client update norms in each communication round, the proposed method mitigates the mismatch between heterogeneous update magnitudes and fixed privacy configurations under Non-IID settings. Extensive experiments on CIFAR-10 and a real-world rail defect dataset demonstrate that, under identical privacy budgets, AdaCT-DPFL achieves improved privacy–utility trade-offs compared with DP-FedAvg and Fed-DPA. Specifically, it reduces the discriminative capability of membership inference attacks while maintaining competitive and more stable classification performance. The adaptive threshold mechanism effectively alleviates over-clipping in early training stages and redundant noise perturbation in later stages. However, this work has several limitations. First, the current privacy analysis is primarily based on RDP accounting under a round-wise adaptive threshold, and a tighter theoretical bound for dynamically varying sensitivity remains to be investigated. Second, experiments are conducted under moderate-scale FL settings; larger cross-device scenarios with partial participation should be further explored. Third, stronger attack models, including adaptive adversaries and gradient-based inference attacks, should be evaluated. Future research will focus on (1) deriving formal privacy bounds under adaptive sensitivity; (2) extending the framework to personalized and hierarchical FL architectures; (3) integrating secure aggregation to provide end-to-end protection; and (4) exploring adaptive threshold learning mechanisms beyond quantile estimation, such as learning-based or Bayesian threshold selection strategies.