- Article
BAF–FedLLM: Behavior-Aware Federated Modeling of Student Actions via Privacy-Preserving Large Language Model
- Wei Ji
- Zuobin Ying
- Hanying Gan
Analyzing fine-grained student actions across institutions can drive timely feedback, early warning, and personalized support, yet it is constrained by privacy regulations, heterogeneous curricula, and non-IID behavior logs. This paper introduces BAF–FedLLM, a behavior-aware federated modeling framework that adapts large language models to next-action and outcome prediction without centralizing student data. The key idea is to treat multichannel interaction streams as semantically typed action tokens linked by a learned ActionGraph, and to align their temporal structure with an LLM through behavior prompts that inject domain context (task, resource, pedagogy, and affordance cues). We propose three novel components: (i) BP–FIT, a behavior-prompted federated instruction tuning scheme that trains low-rank adapters locally and aggregates them with secure masking and Rényi–DP accounting to ensure client-level privacy; (ii) ProtoAlign, a cross-client prototype contrastive objective that shares only noisy class-conditional anchors via secure aggregation to mitigate drift under non-IID partitions; and (iii) CBR, a causal behavior regularizer that penalizes intervention-sensitive shortcuts by enforcing invariance of predicted risks across detected instructional regimes. We further derive convergence guarantees for federated instruction tuning with noisy updates and partial client participation, and provide end-to-end privacy bounds. On three public education datasets (EdNet, ASSISTments, and OULAD) with institution-level partitions, BAF–FedLLM improves next-action AUC by 4.2–7.1% over strong federated baselines while reducing expected calibration error by up to 28% and cutting communication cost through adapter sparsity, all under a typical client-level (ε, δ) differential-privacy budget. These results indicate that behavior-aware prompting and prototype alignment make LLMs practical for privacy-preserving student action analysis at scale, offering a principled path to deployable, regulation-compliant analytics across diverse learning ecosystems.
9 February 2026
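
As a concrete illustration of the BP–FIT aggregation step described in the abstract, the following minimal NumPy sketch (not the authors' implementation; all function names, shapes, and hyperparameters are hypothetical) combines per-client clipping of low-rank adapter updates, pairwise secure-masking terms that cancel in the server-side sum, and Gaussian noise whose (ε, δ) cost would be tracked by a separate Rényi-DP accountant.

```python
import numpy as np

rng = np.random.default_rng(0)

def masked_dp_aggregate(adapter_updates, clip_norm=1.0, noise_multiplier=1.0):
    """Aggregate per-client low-rank adapter updates (illustrative sketch).

    Pairwise additive masks, which cancel in the sum, stand in for secure
    aggregation; per-client clipping plus Gaussian noise gives client-level
    DP, with the privacy budget accounted for offline.
    """
    n = len(adapter_updates)
    shape = adapter_updates[0].shape

    # Pairwise masks: client i adds m_ij and client j subtracts it, so the
    # server sees only masked uploads yet recovers the true (noised) sum.
    masks = [np.zeros(shape) for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            m = rng.normal(size=shape)
            masks[i] += m
            masks[j] -= m

    masked_sum = np.zeros(shape)
    for update, mask in zip(adapter_updates, masks):
        scale = min(1.0, clip_norm / (np.linalg.norm(update) + 1e-12))
        masked_sum += update * scale + mask  # clipped, masked upload

    # Gaussian noise calibrated to the clipping norm (central DP noise).
    masked_sum += rng.normal(scale=noise_multiplier * clip_norm, size=shape)
    return masked_sum / n


# Toy usage: three clients, rank-4 adapter deltas for a 16x16 weight matrix.
updates = [rng.normal(scale=0.1, size=(16, 4)) for _ in range(3)]
global_delta = masked_dp_aggregate(updates, clip_norm=0.5, noise_multiplier=1.2)
print(global_delta.shape)  # (16, 4)
```

In a real deployment the masks would be derived from pairwise secrets negotiated through a secure-aggregation protocol rather than sampled in one place as done here for brevity.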
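
The ProtoAlign and CBR terms can be sketched in the same spirit; the snippet below is an assumption-laden illustration rather than the paper's code, pairing a temperature-scaled contrastive loss against shared class prototypes with a penalty on how much the mean predicted risk shifts across detected instructional regimes.

```python
import numpy as np

def prototype_contrastive_loss(embeddings, labels, prototypes, tau=0.1):
    """ProtoAlign-style objective (sketch): pull each local embedding toward
    the shared, noise-protected prototype of its class and away from others.

    embeddings: (B, d) local behavior representations
    labels:     (B,) integer class ids indexing rows of `prototypes`
    prototypes: (C, d) global class-conditional anchors from secure aggregation
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = z @ p.T / tau  # cosine similarity scaled by temperature
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def regime_invariance_penalty(predicted_risks, regime_ids):
    """CBR-style regularizer (sketch): penalize variation of the mean
    predicted risk across detected instructional regimes, discouraging
    regime-specific shortcuts."""
    regimes = np.unique(regime_ids)
    means = np.array([predicted_risks[regime_ids == r].mean() for r in regimes])
    return means.var()


# Toy usage with hypothetical shapes.
rng = np.random.default_rng(1)
emb = rng.normal(size=(8, 32))
labels = rng.integers(0, 4, size=8)
protos = rng.normal(size=(4, 32))
risks = rng.uniform(size=8)
regimes = rng.integers(0, 2, size=8)
loss = prototype_contrastive_loss(emb, labels, protos) \
       + 0.1 * regime_invariance_penalty(risks, regimes)
print(float(loss))
```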