Interpretable SLM-Driven Trust Framework for Smart Cities: Managing Distributed Energy Resources in Networked Microgrids

Iqbal, Razi; Hamill, Nathan Stuart

doi:10.3390/smartcities8060186


by Razi Iqbal ^* and Nathan Stuart Hamill

Department of Computer Science, Central Michigan University, Mount Pleasant, MI 48859, USA

^* Author to whom correspondence should be addressed.

Smart Cities 2025, 8(6), 186; https://doi.org/10.3390/smartcities8060186

Submission received: 19 September 2025 / Revised: 23 October 2025 / Accepted: 30 October 2025 / Published: 5 November 2025

(This article belongs to the Special Issue Energy Strategies of Smart Cities)


Highlights

What are the main findings?

Design a framework that leverages an SLM for parsing unstructured DER data into key trust attributes like availability, reliability, productivity, stability and reputation.
Develop a neural network-based trust framework that uses extracted indicators to compute precise and adaptive trust scores.
Integrate both intrinsic and extrinsic explainability methods to provide transparency in the model’s decision making process that increases user confidence in AI-driven trust delivery.

What are the implications of the main findings?

The transformation of unstructured data into meaningful metrics for improved automation in an NMG environment.
The calculation of overall trust for the DER and also enhancing the interpretability of the model.

Abstract

Networked Microgrids (NMGs) have revolutionized the energy landscape by enhancing grid flexibility and decentralizing power generation, playing a pivotal role in the development of smart cities. Distributed Energy Resources (DERs) are a fundamental component of a typical NMG; hence, their trustworthiness is of utmost importance for the reliable and efficient operation of NMGs within smart city environments. However, the processing and analysis of unstructured data when performing trust assessments of these DERs is still not well explored. This research fills this gap by proposing a new trust framework that leverages the advanced capabilities of Neural Networks to assess the trustworthiness of DERs in NMGs. Furthermore, the proposed framework analyzes and converts the unstructured data from DERs into a structured format for generating trust scores for DERs. The framework has two primary components: (1) an SLM (Small Language Model)-based module for data analysis and (2) a neural network-based module for trust score calculation. Together, these two components provide an end-to-end process for transforming an unstructured input into meaningful trust metrics. Several experiments were conducted to evaluate the performance of the proposed framework, and the results show that it produces precise, accurate and consistent trust scores. Furthermore, the proposed framework outperformed the existing frameworks in size and efficiency, making it a promising solution for trustworthy DER management in smart city microgrid ecosystems.

Keywords:

networked microgrids (NMGs); distributed energy resources (DERs); trust; small language models (SLMs); neural networks

1. Introduction

The rapid advancement of smart cities relies on innovative energy systems that ensure sustainability, resilience and efficiency. Networked Microgrids (NMGs) are a cornerstone in this domain because their interconnected microgrid clusters offer reliability, efficiency and operational flexibility [1]. NMGs utilize Distributed Energy Resources (DERs) such as solar panels, wind turbines and generators to provide localized energy solutions that align with smart city objectives of reducing carbon footprints and optimizing resource utilization in a highly dynamic urban environment.

Motivation

In smart city energy systems, NMGs often operate in decentralized or hybrid configurations, where individual DERs, such as a solar panel on a resident’s house, contribute power to the local grid, enhancing energy resilience and sustainability. For example, in Copenhagen, Denmark, community-owned solar panels and wind turbines integrate with municipal grids in a hybrid setup, enabling localized power generation that supports the city’s goal of carbon neutrality by 2025 (see Copenhagen’s Smart City Energy Plan: https://urbandevelopmentcph.kk.dk, accessed on 23 October 2025). In such systems, the trustworthiness of each DER is critical to ensure stable operation, as a single faulty or malicious unit (e.g., a compromised solar inverter) can disrupt power supply or compromise grid security. Even in centralized grid configurations, where a state utility may control power input, assessing individual DER trustworthiness ensures reliable contributions, facilitating efficient energy management and fault prevention across the NMG.

Moreover, DERs remain vulnerable to tampering, whether through physical interference, software exploits or data manipulation, even when operated by private power companies or integrated into virtual power plants (VPPs). For instance, in the U.S., the National Renewable Energy Laboratory (NREL) has demonstrated that compromised rooftop solar inverters in residential microgrids can inject false telemetry data or destabilize frequency regulation [2]. Thus, individual DER trust assessment serves as an essential layer of defense, enabling internal validation within private utilities, resilience in VPPs and overall grid integrity in smart city deployments [3]. This motivates the need for a trust evaluation framework capable of processing unstructured DER data, such as SCADA logs, to derive actionable trust metrics for smart city applications.

Several trust frameworks tailored for smart grids and microgrids are available in the existing literature; however, most of them rely on structured telemetry data or predefined metrics [4]. In real-world smart city deployment, DERs often generate unstructured data, including log files, technical reports, status messages and other textual information. This unstructured data contains rich contextual information about DER behavior that mostly remains unexplored due to the limitations of conventional data processing techniques, as illustrated in Case A in Figure 1. Furthermore, most of the existing trust frameworks lack either explainability or adaptability to dynamically changing trust contexts or are very sensitive to noise in the training data, as shown in Case B in Figure 1. Consequently, there exists a pressing need for a trust evaluation system that can effectively interpret unstructured data and derive actionable insights from it to support smart city energy infrastructure.

Recent advances in Language Models (LMs) and Neural Networks have demonstrated remarkable success in analyzing unstructured data using techniques like Natural Language Processing (NLP), classification and information extraction [5]. These techniques provide unique opportunities to reimagine trust assessment methodologies especially in systems like NMGs where non-tabular data is common.

In this research, we propose a novel framework specifically designed for NMGs. The framework comprises two key components:

An SLM module capable of processing unstructured DER data to extract meaningful trust indicators including availability, reliability, productivity, stability and reputation.
A Neural Network-based trust framework that uses these trust indicators to calculate overall trust for the DER and also enhances the interpretability of the model.

This two-stage architecture bridges the gap between raw DER-generated data and informed decision making at the control center. The SLM ensures contextual extraction while the neural network provides adaptability, interpretability and accuracy in trust assessment. To assess the effectiveness of the proposed framework, we conducted a series of experiments. The results indicated that our framework achieves high precision and consistency in trust evaluation, outperforming traditional methods in performance and efficiency. Furthermore, by transforming unstructured data into meaningful metrics, the framework significantly improves interpretability and automation in an NMG environment. The major contributions of this research are listed below:

Design a framework that leverages an SLM for parsing unstructured DER data into key trust attributes like availability, reliability, productivity, stability and reputation.
Develop a neural network-based trust framework that uses extracted indicators to compute precise and adaptive trust scores.
Integrate both intrinsic and extrinsic explainability methods to provide transparency in the model’s decision making process that increases user confidence in AI-driven trust delivery.

The rest of this paper is organized as follows: Section 2 presents related work in the domain of contextual information extraction and trust assessment. Section 3 describes the design and architecture of the proposed framework. Section 4 presents the results and discussion on the experiments conducted to assess the suitability of the proposed framework. Finally, Section 5 concludes the paper and outlines future research directions.

2. Related Work

2.1. Language Models

Recent studies on Language Models (LMs) have highlighted their potential across multiple domains of life when it comes to productivity and efficiency. Unlike traditional AI systems that mostly rely on predefined rules, LMs can process and interpret diverse forms of data that enable them to not only perform rapid data analysis but also support informed decision making in complex and dynamic environments. With IoT devices generating an enormous amount of structured and unstructured data, there is a growing interest in leveraging LMs to extract actionable insights from heterogeneous sources. This section provides a summary of the existing studies conducted on the various features, applications and challenges of LMs with a particular focus on their role in data processing in IoT systems.

Zong et al. [6] studied the application of LLMs (Large Language Models) in IoT through three case studies: DDoS attack detection, macroprogramming over IoT systems and sensor data processing. Their study revealed that the GPT model provides good accuracy with few-shot learning but can provide even better accuracy when fine-tuned. They concluded that the GPT model shows efficacy in processing a vast amount of sensor data by offering rapid and high quality responses compared with using traditional machine learning models.

Shirali et al. [7] highlighted the importance of utilizing LLMs in event abstraction and integration. Their approach aims to create event records from raw sensor data collected from diverse IoT devices and merge them into a single event log for further processing. They not only used LLMs for detecting and labeling sensor data but also developed methods for abstracting events based on data streams which are suitable for online applications.

Kök et al. [8] explored the integration of LMs with edge, fog and cloud computing paradigms to show how this synergy can assist in providing scalable solutions for complex IoT applications. They introduced a system model for predictive maintenance and monitoring in industrial applications that used the Tree of Thought (ToT) reasoning method to enhance the cognitive capabilities of LLMs. They used predefined goals at each layer to select the most efficient route for an IoT task, which allowed the model to derive rational results from complex tasks.

Ashiwal et al. [9] proposed a system that uses an LM for extracting vulnerability information from unstructured sources and converting it into a machine-readable format. To demonstrate their proposed system, they used supplier emails as unstructured data to feed the LLM, which in turn extracted useful information and transformed it into a structured CSAF VEX format that provides information about a product vulnerability. Their system contributes to improving security in software by enabling actionable vulnerability information integration into a DevSecOps workflow.

Wang et al. [10] introduced a data analytics system for unstructured data using natural language queries. They defined logical operators with LLM-based implementation to handle semantic reasoning. Their experiments showed that their model can reduce query execution time while maintaining a high accuracy for real-world unstructured data analytics. Furthermore, their system can perform semantic cost modeling and cardinality estimation in parallel, which demonstrates its effectiveness and scalability.

As illustrated in Table 1, most existing studies treat unstructured data as a secondary concern, and the ones that focus on processing unstructured data rely heavily on high-cost models like GPT without optimizing them for a domain-specific context. This leads to issues of scalability, high cost and limited contextual awareness. The framework proposed in this paper addresses these gaps by introducing a Small Language Model (SLM) which is much smaller in size than a traditional LLM and still provides excellent accuracy in information extraction from both structured and unstructured data sources.

2.2. Trust Frameworks

Trust has always been a topic of interest in computing as it plays a key role in the overall security and privacy of a system. Several trust frameworks have been proposed to evaluate and manage trust in various IoT applications including microgrids and broader cyber–physical systems. Traditional trust assessments in energy systems often rely on rule-based techniques that normally lack feature importance and adaptability and offer little insight into the decision making of the systems. A number of recent studies have introduced machine learning models such as Random Forest, Support Vector Machine (SVM) and Deep Neural Networks (DNNs) to enhance the overall performance and efficiency of these systems. This section provides a summary of existing trust framework techniques that not only proposed an efficient trust framework but also put effort into making these models more interpretable.

Faisal et al. [11] explored the integration of machine learning models with SHAP to improve both the accuracy and transparency of energy consumption forecasting. They evaluated linear regression, Support Vector Machine and Decision Tree Regressor using XAI to visualize and interpret model decisions. Their approach enhances trust in AI-driven forecasting by providing informed decision making in energy management.

Mohammad et al. [12] evaluated various XAI methods comparing trade-offs between interpretability and performance. Their study revealed that SHAP provides high trust but is less suitable for real-time applications, whereas LIME offers faster but less trustworthy explanations. The study outlined key challenges such as integrating XAI with IoT-powered digital twins along with discussing stability issues in energy systems maintenance.

Amir et al. [13] utilized the high-order synchrosqueezing transform technique to extract key features from high-voltage signals such as the time of arrival, magnitude and polarity of traveling waves, which are then fed as inputs to a Quantum Deep Neural Network (QDNN). Combining deep learning and quantum computing provided the models with a high fault localization accuracy in noisy environments. The authors have used SHAP to enable operators to understand the influence of each input feature on the model’s predictions. The approach proved to be robust across varying voltages, offering high performance and interpretability.

Joy et al. [14] studied the need for resilient energy systems during extreme weather conditions and grid disruptions. They identified critical performance indicators like reliability, stability and flexibility and emphasized the importance of advanced control strategies. They proposed an integrated framework that combines a microgrid design and operational strategies for increased resilience. Their framework also provided practical guidelines for practitioners, policy makers and researchers.

Aslam et al. [15] introduced a trust assessment framework for the Social Internet of Things (SIoT) that integrates social networking concepts with the IoT to enable efficient information and resource sharing in a dynamic environment. Unlike other existing frameworks, where the focus is mainly on the trustworthiness of the service provider, their framework evaluates the trust for the service itself. They used a parameter called service trust which was mathematically derived by combining various Quality of Service (QoS) metrics. By addressing the potential of services to act maliciously, their framework enhances the reliability of service selection in SIoT environments.

Li et al. [16] proposed a method of quantifying the explainability of machine learning models by comparing their explanations with human intuition. Unlike existing approaches that focus on validating the models technically, this method emphasizes alignment with human decision making to enhance user trust. The study revealed discrepancies between human reasoning and machine explanations that highlighted potential issues in the current machine learning model selection process.

Parisineni et al. [17] introduced a methodology that enhances the interpretability of complex machine learning models. Their methodology combines the strengths of LIME and SHAP to generate explanations, allowing for the clear interpretation of individual predictions. The methodology focuses not only on local but also on global model understanding across multiple instances, offering a practical and scalable solution that supports user trust and the deeper comprehension of AI-driven decisions.

Sanchez et al. [18] proposed FederatedTrust, an algorithm integrated into the FederatedScope framework, for computing trustworthiness scores. They addressed the challenges of evaluating the trustworthiness in Federated Learning (FL) models for IoT and edge computing, where centralized ML/DL methods face limitations due to distributed and sensitive data. Several experiments using the FEMNIST and N-BaIoT datasets with varying configurations validated the algorithm’s utility in real-world IoT security scenarios, emphasizing its role in ensuring trustworthy FL models for smart city applications.

Gourav et al. [19] proposed a recommendation model using hyperedge and transitive closure to enhance trust in social networks, leveraging an Influence Product Graph (IPG) to improve the recommendation accuracy for products. This graph-based trust propagation was evaluated on the Epinions and FilmTrust datasets and demonstrated how relational structures can enhance decision making.

Table 2 summarizes the key features, such as the use of attention mechanisms, SHAP-based explainability and dynamic trust evaluation in the existing literature. The summary clearly highlights that most of the existing trust frameworks lack either explainability or adaptability to dynamically changing trust context or are very sensitive to noise in the training data. There is an obvious need for the design of a trust framework that offers robust and dynamic trust scoring, integrated interpretability combining built-in attention with post-hoc SHAP explanations and resilience to data noise.

3. Proposed NMG Architecture

The main focus of this research is to propose an innovative trust framework tailored to NMGs, leveraging advanced computational techniques to convert unstructured data into structured trust metrics and provide comprehensive trustworthiness evaluations. Figure 2 illustrates the proposed framework for the NMG architecture.

As shown in Figure 2, data from diverse DERs is received by the control center which preprocesses and cleans the collected data and generates metrics like availability, reliability, productivity, stability and reputation using SLM capabilities [20]. The trust framework then uses these metrics to generate a trust evaluation. These evaluations are then analyzed and sent to the DER operator along with explainable results for further decision making.

The following sections dive into the details of the components of the proposed NMG architecture to achieve precise, accurate and efficient trust assessments.

3.1. Small Language Model (SLM)

The SLM employed in this framework has a transformer-based architecture inspired by GPT models and is designed to process unstructured logs from DERs into trust metrics such as availability, reliability, productivity, stability and reputation.

As shown in Figure 3, the input to the proposed framework comprises unstructured data from heterogeneous DERs in NMGs, including SCADA logs, technical reports, status messages and operational emails. A sample of solar DER log entries is presented in Figure 4. These raw textual inputs are tokenized using a GPT-2-based tokenizer (vocabulary size: 50,257) and embedded into 256-dimensional dense vectors with positional encoding to preserve the temporal context.

The core processing is performed by a lightweight Small Language Model (SLM) consisting of four transformer blocks. Each block includes (1) layer normalization with residual connections for training stability, (2) multi-head self-attention (8 heads, head dimension = 32) to model contextual dependencies (e.g., associating fault events with reduced reliability) and (3) feedforward layers (256 → 1024 → 256) with GELU activation for non-linear feature learning.

Dropout ($p = 0.1$) is applied at multiple stages to improve generalization. The final output layer applies mean pooling across the sequence, followed by a linear regression head with sigmoid activation to generate five bounded trust metrics:

$$\begin{aligned} z &= W h + b \\ \hat{y} &= \sigma(z) \\ \hat{y} &= [\hat{y}_{Availability}, \hat{y}_{Reliability}, \hat{y}_{Productivity}, \hat{y}_{Stability}, \hat{y}_{Reputation}] \end{aligned}$$

(1)
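
As a minimal sketch of Equation (1), the pooling and regression head can be written in plain Python as below. It assumes that the pooled state $h$ is the mean of the per-token vectors, as stated above, and uses toy 4-dimensional states in place of the 256-dimensional embeddings. The names `regression_head`, `W` and `b` and all numeric values are illustrative assumptions, not taken from the authors' implementation.

```python
import math

METRICS = ["availability", "reliability", "productivity", "stability", "reputation"]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def regression_head(token_states, W, b):
    """Mean-pool the token vectors, apply a linear layer, then a sigmoid per metric."""
    n_tokens = len(token_states)
    dim = len(token_states[0])
    # h: mean over the sequence dimension
    h = [sum(tok[i] for tok in token_states) / n_tokens for i in range(dim)]
    # z = W h + b, one row of W per metric
    z = [sum(w_i * h_i for w_i, h_i in zip(row, h)) + b_k for row, b_k in zip(W, b)]
    return dict(zip(METRICS, (sigmoid(z_k) for z_k in z)))


# Toy example: 3 tokens with 4-dimensional states (hypothetical values).
states = [[0.2, -0.1, 0.4, 0.0], [0.0, 0.3, -0.2, 0.1], [0.1, 0.1, 0.1, 0.2]]
W = [[0.5, -0.3, 0.8, 0.1] for _ in METRICS]  # hypothetical weights
b = [0.0] * len(METRICS)
scores = regression_head(states, W, b)
```

The sigmoid keeps every metric strictly between 0 and 1, which is what allows the outputs to be compared directly with the expected ranges defined in Section 4.1.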

This end-to-end pipeline transforms unstructured, real-world DER data into interpretable and actionable trust scores, enabling the proactive management of NMGs in smart cities without the reliance on structured telemetry or manual feature engineering.

To formalize the workflow and highlight the steps taken to design an SLM for processing unstructured logs into actionable trust metrics, we present Algorithm 1: SLM Architecture for Transforming Unstructured Logs to Structured Metrics.

Given the complexity of logs and the need for accurate metric predictions, the model’s architecture and training parameters are tailored to handle the unique challenges of the NMG environment. Table 3 presents the configuration and setup details of the SLM we created for this study.

Algorithm 1 SLM Architecture for Transforming Unstructured Logs to Structured Metrics

1: Input: Unstructured log, context size $C = 1024$
2: Output: Structured metrics $\{availability, reliability, productivity, stability, reputation\}$
3: Step 1: Tokenization
4: Convert log text to token IDs using tokenizer
5: Resulting token sequence: $t \in \mathbb{Z}^{L}$, where $L \leq C$
6: Step 2: Embeddings
7: Compute token embeddings: $E_{tok} = W_{tok} \cdot t$
8: Add positional embeddings: $E_{pos} = W_{pos} \cdot p$, where $p$ is position indices
9: Combine and apply dropout: $E = Dropout(E_{tok} + E_{pos})$
10: Step 3: Transformer Blocks (Repeat for 4 layers)
11: For each block:
12: Normalize input: $x_{norm1} = LayerNorm(x)$
13: Compute multi-head attention (8 heads): $A = MultiHeadAttention(x_{norm1})$
14: Add residual: $x = x + Dropout(A)$
15: Normalize again: $x_{norm2} = LayerNorm(x)$
16: Apply feedforward: $F = FFN(x_{norm2}) = GELU(W_{1} x_{norm2} + b_{1})$
17: $O = W_{2} F + b_{2}$
18: Add residual: $x = x + Dropout(O)$
19: Step 4: Final Normalization and Logits
20: Normalize output: $h_{norm} = LayerNorm(h)$
21: Compute logits: $L = W_{out} h_{norm}$
22: Step 5: Regression Output
23: $h_{pooled} = mean(h_{norm}, \dim = 1)$
24: $M = sigmoid(W_{head} h_{pooled} + b_{head})$
25: $M$ is a vector $\{availability, reliability, productivity, stability, reputation\} \in [0, 1]$
26: Step 6: Metric Assignment
27: $metrics = \{availability, reliability, productivity, stability, reputation\} = M$
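
Step 3 of Algorithm 1 can be sketched in plain Python as below. This is an illustrative toy rather than the authors' implementation: it assumes inference mode (dropout omitted), layer normalization without learnable gain and bias, no causal mask, no bias terms in the attention projections, and small dimensions (8-dimensional states, 2 heads) in place of the 256 dimensions and 8 heads used in the paper. All function and parameter names are hypothetical.

```python
import math
import random


def layer_norm(v, eps=1e-5):
    mean = sum(v) / len(v)
    var = sum((x - mean) ** 2 for x in v) / len(v)
    return [(x - mean) / math.sqrt(var + eps) for x in v]


def gelu(x):
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def linear(W, v, b=None):
    out = [sum(w * x for w, x in zip(row, v)) for row in W]
    return out if b is None else [o + b_i for o, b_i in zip(out, b)]


def softmax(z):
    m = max(z)
    e = [math.exp(x - m) for x in z]
    total = sum(e)
    return [x / total for x in e]


def multi_head_attention(X, p, n_heads):
    d_head = len(X[0]) // n_heads
    Q = [linear(p["Wq"], x) for x in X]
    K = [linear(p["Wk"], x) for x in X]
    V = [linear(p["Wv"], x) for x in X]
    out = []
    for q in Q:
        heads = []
        for h in range(n_heads):
            s = slice(h * d_head, (h + 1) * d_head)
            # Scaled dot-product scores of this query against every key
            scores = [sum(a * b for a, b in zip(q[s], k[s])) / math.sqrt(d_head) for k in K]
            w = softmax(scores)
            heads += [sum(w_j * v[s][c] for w_j, v in zip(w, V)) for c in range(d_head)]
        out.append(linear(p["Wo"], heads))
    return out


def transformer_block(X, p, n_heads):
    # Pre-norm attention sub-layer with residual connection
    A = multi_head_attention([layer_norm(x) for x in X], p, n_heads)
    X = [[x_i + a_i for x_i, a_i in zip(x, a)] for x, a in zip(X, A)]
    out = []
    for x in X:
        # Pre-norm feedforward sub-layer (GELU) with residual connection
        f = [gelu(u) for u in linear(p["W1"], layer_norm(x), p["b1"])]
        o = linear(p["W2"], f, p["b2"])
        out.append([x_i + o_i for x_i, o_i in zip(x, o)])
    return out


def random_params(d_model, d_ff, rng):
    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {"Wq": mat(d_model, d_model), "Wk": mat(d_model, d_model),
            "Wv": mat(d_model, d_model), "Wo": mat(d_model, d_model),
            "W1": mat(d_ff, d_model), "b1": [0.0] * d_ff,
            "W2": mat(d_model, d_ff), "b2": [0.0] * d_model}


rng = random.Random(0)
params = random_params(d_model=8, d_ff=32, rng=rng)
tokens = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(5)]
hidden = transformer_block(tokens, params, n_heads=2)
```

Because both sub-layers enter through residual connections, zeroing the output projections `Wo` and `W2` turns the block into an identity map, which is a convenient sanity check for an implementation of this kind.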

3.2. Trust Framework

Most existing trust frameworks focus on developing a model that can either compute a trust score for a node or classify it as trustworthy or untrustworthy. However, these traditional trust models are mostly rule-based and lack explanation regarding how individual components contribute to the overall trust assessment. Recent models attempting to explain the contributions of the components fail to provide feature prioritization due to a limited depth of interpretability. In order to overcome these difficulties, we propose a trust framework that not only classifies a node as trustworthy or untrustworthy but also enhances interpretability by integrating intrinsic (attention mechanism) and extrinsic (SHAP) explainability methods [21].

As illustrated in Figure 5, the framework takes metrics like availability, reliability, productivity, stability and reputation as inputs. The equation below represents these inputs in a feature vector format, where $x^{j}$ denotes the five metrics received from the SLM, as described in the previous section, and $y^{j}$ is the binary trustworthiness label (1 for trustworthy and 0 otherwise).

$$D = \{(x^{j}, y^{j})\}_{j = 1}^{N}, \quad \text{where } x^{j} \in \mathbb{R}^{d} \ (d = 5)$$

(2)

The attention layer then computes attention probabilities from the input $x$ with weights $W_{a} \in \mathbb{R}^{d \times d}$ and biases $b_{a} \in \mathbb{R}^{d}$ as below:

$$\begin{aligned} z &= W_{a} x + b_{a} \\ a &= softmax(z), \quad a_{i} = \frac{\exp(z_{i})}{\sum_{k = 1}^{d} \exp(z_{k})} \end{aligned}$$

(3)

These features are then scaled by multiplying them element-wise with the attention weights to emphasize the most important features, as illustrated in the equation below:

$$x_{weighted} = x \odot a$$

(4)
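
The feature-level attention of Equations (3) and (4) can be illustrated with a short sketch. The weight matrix below is set to the identity and the bias to zero purely for illustration; in the framework these are learned parameters, and the input values are hypothetical metric scores.

```python
import math


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def attention_weighted(x, W_a, b_a):
    """Return the attention weights a and the element-wise product x * a."""
    z = [sum(w * x_i for w, x_i in zip(row, x)) + b for row, b in zip(W_a, b_a)]
    a = softmax(z)
    return a, [x_i * a_i for x_i, a_i in zip(x, a)]


# Order: availability, reliability, productivity, stability, reputation (hypothetical values)
x = [0.97, 0.99, 0.85, 0.80, 0.98]
W_a = [[1.0 if i == j else 0.0 for j in range(5)] for i in range(5)]  # illustrative, not learned
b_a = [0.0] * 5
a, x_weighted = attention_weighted(x, W_a, b_a)
```

Since the softmax output sums to one, the attention vector can be read directly as a relative importance score per metric, which is what the intrinsic explainability in this framework relies on.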

The weighted features are then batch-normalized to zero mean and unit variance, followed by a learned scale and shift:

$$x_{norm} = \mathrm{BatchNormalization}(x_{weighted}) = \gamma \cdot \frac{x_{weighted} - \mu_{B}}{\sqrt{\sigma_{B}^{2} + \epsilon}} + \beta$$

(5)

In the above equation, $\mu_{B}$ and $\sigma_{B}^{2}$ are the batch mean and variance, $\gamma$ and $\beta$ are learned parameters and $\epsilon$ is a small constant to ensure numeric stability.
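
A minimal sketch of the batch normalization step is given below, using the standard form in which the shift $β$ is added after the normalized value is scaled by $γ$. It computes the statistics over a small hypothetical batch, as done in training mode; running averages used at inference time are not modeled here, and the function name is an assumption.

```python
import math


def batch_norm(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale by gamma and shift by beta."""
    n = len(batch)
    d = len(batch[0])
    mu = [sum(row[i] for row in batch) / n for i in range(d)]
    var = [sum((row[i] - mu[i]) ** 2 for row in batch) / n for i in range(d)]
    return [
        [gamma[i] * (row[i] - mu[i]) / math.sqrt(var[i] + eps) + beta[i] for i in range(d)]
        for row in batch
    ]


# Hypothetical batch of 4 samples with 2 features each.
batch = [[0.1, 0.9], [0.3, 0.5], [0.2, 0.7], [0.4, 0.3]]
normalized = batch_norm(batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```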

Two fully connected hidden layers use the ReLU activation function [22] to learn the patterns as below:

$$\begin{aligned} h_{1} &= relu(W_{1} x_{norm} + b_{1}) \\ h_{2} &= relu(W_{2} \, \mathrm{BatchNormalization}(h_{1}) + b_{2}) \end{aligned}$$

(6)

Finally, the output layer uses the sigmoid function for calculating the trustworthiness probabilities as below:

$$p = \sigma(w_{o}^{T} h_{2} + b_{o}) = \frac{1}{1 + \exp(-(w_{o}^{T} h_{2} + b_{o}))}$$

(7)

The trust label is $\hat{y} = 1$ if $p \geq \theta$, where $\theta = 0.7$, and 0 otherwise.
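
The output layer and the thresholding rule can be sketched as follows. The hidden activations and output weights are hypothetical placeholders; only the sigmoid form and the threshold of 0.7 come from the text.

```python
import math

THETA = 0.7  # decision threshold stated in the text


def trust_probability(h2, w_o, b_o):
    """Sigmoid of the weighted sum of the last hidden layer's activations."""
    z = sum(w * h for w, h in zip(w_o, h2)) + b_o
    return 1.0 / (1.0 + math.exp(-z))


def trust_label(p, theta=THETA):
    return 1 if p >= theta else 0


p = trust_probability(h2=[0.4, 1.2, 0.0], w_o=[0.9, 1.1, -0.5], b_o=-0.3)  # hypothetical values
label = trust_label(p)
```

Using a threshold above 0.5 makes the classifier conservative: a DER is labeled trustworthy only when the predicted probability is clearly high.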

The model is trained to minimize binary cross-entropy loss [23] over M training samples:

$$L = -\frac{1}{M} \sum_{j = 1}^{M} \left[ y^{j} \log(p^{j}) + (1 - y^{j}) \log(1 - p^{j}) \right]$$

(8)
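
For illustration, the loss in Equation (8) can be computed directly as below. The clipping constant `eps` is an added assumption that guards against `log(0)`; it is not part of the equation.

```python
import math


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Mean binary cross-entropy over the samples."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return -total / len(y_true)


# Hypothetical labels and predicted probabilities.
loss_good = binary_cross_entropy([1, 0, 1], [0.9, 0.1, 0.8])
loss_bad = binary_cross_entropy([1, 0, 1], [0.2, 0.7, 0.3])
```

Confident correct predictions give a loss near zero, while confident wrong predictions are penalized heavily, so `loss_good` is smaller than `loss_bad`.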

For optimization, we used the Adam optimizer [24] with early stopping on the validation loss (patience = 10). Accuracy on the test set is computed as below, where $T$ is the number of test samples and $I$ is the indicator function:

$$Accuracy = \frac{1}{T} \sum_{k = 1}^{T} I(\hat{y}^{k} = y^{k})$$

(9)
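
The accuracy in Equation (9) and the patience-based stopping rule can be sketched as below. The stopping logic is a generic illustration of early stopping with patience = 10, assuming that any decrease in validation loss counts as an improvement; the loss values are hypothetical and the function names are not from the authors' code.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for y, y_hat in zip(y_true, y_pred) if y == y_hat) / len(y_true)


def early_stopping_epoch(val_losses, patience=10):
    """Return the index of the epoch at which training would stop."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0  # improvement: reset the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(val_losses) - 1  # patience never exhausted


# Hypothetical validation curve: improves for 3 epochs, then plateaus.
val_losses = [1.0, 0.8, 0.7] + [0.75] * 15
stop_epoch = early_stopping_epoch(val_losses, patience=10)
acc = accuracy([1, 0, 1, 1], [1, 0, 0, 1])
```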

Algorithm 2 provides a detailed overview of the proposed trust framework’s operation.

Algorithm 2 Trust Scoring with Attention-Based Neural Network and SHAP

1: Input: Dataset $D = \{(x^{(j)}, y^{(j)})\}_{j = 1}^{N}$ with 5 features; threshold $\theta = 0.7$
2: Output: Trust scores $\{p^{(k)}\}$, binary labels $\{\hat{y}^{(k)}\}$, SHAP values $\{\phi_{i}^{(j)}\}$, attention weights $\{a^{(j)}\}$
3: Step 1: Data Preparation
4: Split $D$ into training (70%) and testing (30%): $X_{train}, X_{test}$
5: Step 2: Model Definition
6: Define attention-based neural network:
Input: $x$
Attention: $a = softmax(W_{a} x + b_{a})$
Weighted input: $x_{weighted} = x \odot a$
Hidden layers: BatchNorm → Dense(32, ReLU) → BatchNorm → Dense(16, ReLU)
Output: $p = \sigma(W_{o}^{\top} h_{2} + b_{o})$
7: Step 3: Model Training
8: Train the model using binary cross-entropy loss: $L = -\frac{1}{M} \sum_{j = 1}^{M} [y^{(j)} \log p^{(j)} + (1 - y^{(j)}) \log(1 - p^{(j)})]$
9: Use Adam optimizer and early stopping (patience = 10)
10: Step 4: Inference
11: for each $x^{(k)}$ in $X_{test}$ do
12: Compute trust score: $p^{(k)}$
13: Assign label: $\hat{y}^{(k)} = 1$ if $p^{(k)} \geq \theta$, else 0
14: end for
15: Step 5: SHAP Explainability
16: Select 100 background samples from $X_{train}$
17: Estimate baseline prediction: $\phi_{0} = E[p(x)]$
18: Initialize SHAP GradientExplainer with model and background
19: Compute SHAP values for test subset: $p^{(j)} \approx \phi_{0} + \sum_{i = 1}^{d} \phi_{i}^{(j)}$
20: Step 6: Attention Interpretation
21: For each sample in test subset, extract attention vector: $a^{(j)} = softmax(W_{a} x^{(j)} + b_{a})$
22: Compute average weights: $\bar{a}_{i} = \frac{1}{S} \sum_{j = 1}^{S} a_{i}^{(j)}$
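
The additivity property used in Step 5, $p \approx ϕ_{0} + \sum_{i} ϕ_{i}$, can be illustrated without any external library. The sketch below computes exact Shapley values by enumerating all feature subsets, which is feasible for five features. It is a stand-in for, not a reproduction of, the SHAP GradientExplainer named in Algorithm 2, and the toy logistic model, its weights and the background samples are hypothetical.

```python
import math
from itertools import combinations


def shapley_values(f, x, background):
    """Exact Shapley values of f at x, with absent features drawn from background rows."""
    d = len(x)

    def value(subset):
        # Features in `subset` come from x; the rest come from each background row.
        total = 0.0
        for row in background:
            mixed = [x[i] if i in subset else row[i] for i in range(d)]
            total += f(mixed)
        return total / len(background)

    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            weight = math.factorial(size) * math.factorial(d - size - 1) / math.factorial(d)
            for combo in combinations(others, size):
                s = frozenset(combo)
                phi[i] += weight * (value(s | {i}) - value(s))
    return value(frozenset()), phi  # baseline phi_0 and per-feature attributions


def model(features):
    weights = [1.2, 1.5, 0.8, 0.6, 1.0]  # hypothetical stand-in for the trained network
    z = sum(w * v for w, v in zip(weights, features)) - 4.0
    return 1.0 / (1.0 + math.exp(-z))


background = [[0.90, 0.90, 0.80, 0.80, 0.90],
              [0.95, 0.97, 0.85, 0.80, 0.98],
              [0.60, 0.65, 0.70, 0.50, 0.70]]
x = [0.97, 0.99, 0.85, 0.80, 0.98]
phi0, phi = shapley_values(model, x, background)
```

With exact enumeration the attributions sum to the prediction minus the baseline up to floating-point error, whereas gradient-based estimators satisfy this only approximately.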

4. Results and Discussion

Several experiments were conducted to evaluate the performance of both the SLM and trust framework proposed for this architecture. This section dives into the details and discussion of the experiments.

4.1. Experimental Evaluation of SLM

As illustrated in the previous sections, the primary function of the SLM is to transform unstructured data into trust metrics such as availability, reliability, productivity, stability and reputation. The IEEE Standard Definitions for Use in Reporting Electric Generating Unit Reliability, Availability, and Productivity [25] outlines typical units required to measure these trust metrics for a solar DER, as illustrated in Table 4. Given the scarcity of accessible DER log datasets for this project, we synthetically generated the log data, adhering to the IEEE standard [25]. While synthetic data provides a controlled environment, future work will validate the framework using real-world SCADA logs from operational NMGs, capturing complexities like data noise. The data is primarily used to demonstrate the effectiveness of our algorithms at this development stage, where exact real-world data is not yet essential. We calculated the actual ranges of trust metrics such as availability, reliability, productivity, stability and reputation to evaluate the performance of the SLM. Below are the formulas used for calculating those ranges based on the data generated for a solar DER as mentioned earlier:

Availability: The Availability Generation Factor (AGF) is the ratio of available hours to total active hours, which is expected to be between 0.95 and 0.99 based on the following formula:

$$AGF = \frac{EG - (PUG + MUG + FUG)}{EG}$$

(10)

Reliability: The Forced Outage Rate ($FOR$) is the proportion of time a unit was in a forced outage state relative to the time it was in service or on forced outage. Reliability is its complement and is expected to be between 0.98 and 0.99 based on the following formula:

$$\begin{aligned} FOR &= \frac{FOH}{SH - EPRRH + FOH + RUH} \\ Reliability &= 1 - FOR \end{aligned}$$

(11)

Productivity: The Performance Index (PI) is the ratio of the gross actual generation produced to the expected generation for the period, which is expected to be between 0.75 and 0.95 based on the following formula:

$$PI = \frac{GAAG}{EG}$$

(12)

Stability: The overall effectiveness of the unit, which is expected to be between 0.75 and 0.90 based on the following formula:

$$Stability = AGF \times FOR \times PI$$

(13)

Reputation: This is computed from the counts of messages classified as expected (CExMSG) and unexpected (CUxMSG), with the unexpected count weighted by a factor a, and is expected to be between 0.98 and 0.99 based on the following formula:

\mathrm{Reputation} = \frac{\mathrm{CExMSG}}{\mathrm{CExMSG} + (a \times \mathrm{CUxMSG})} \qquad (14)
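To make Equations (10) to (14) concrete, the following minimal Python sketch computes the five metrics from the quantities in Table 4. The class name `SolarDerLog`, the function `trust_metrics`, the default weight `a = 1.0` and the sample values are illustrative assumptions and are not taken from the paper. Equation (13) is followed as printed; because FOR is close to zero for a healthy unit, that product falls far below the 0.75 to 0.90 range quoted above, so the sketch also offers an optional variant (an assumption, not stated in the text) that substitutes Reliability = 1 - FOR.

```python
from dataclasses import dataclass


@dataclass
class SolarDerLog:
    """Aggregated quantities for one reporting period (names follow Table 4)."""
    eg: float     # Expected Generation (kWh)
    pug: float    # Planned Unavailable Generation (kWh)
    mug: float    # Maintenance Unavailable Generation (kWh)
    fug: float    # Forced Unavailable Generation (kWh)
    foh: float    # Forced Outage Hours
    sh: float     # Service Hours
    eprrh: float  # Equivalent Partial Reserve Reduction Hours
    ruh: float    # Resource Unavailable Hours
    gaag: float   # Gross Actual Generation (kWh)
    cexmsg: int   # count of messages classified as "expected"
    cuxmsg: int   # count of messages classified as "unexpected"


def trust_metrics(log, a=1.0, use_reliability_in_stability=False):
    """Evaluate Equations (10)-(14); `a` is the weight on unexpected messages."""
    agf = (log.eg - (log.pug + log.mug + log.fug)) / log.eg      # Eq. (10)
    forced_outage_rate = log.foh / (
        log.sh - log.eprrh + log.foh + log.ruh)                  # Eq. (11)
    reliability = 1.0 - forced_outage_rate
    pi = log.gaag / log.eg                                       # Eq. (12)
    # Eq. (13) as printed multiplies by FOR; the flag switches to 1 - FOR,
    # which is an assumption of this sketch, not a statement from the paper.
    middle = reliability if use_reliability_in_stability else forced_outage_rate
    stability = agf * middle * pi
    reputation = log.cexmsg / (log.cexmsg + a * log.cuxmsg)      # Eq. (14)
    return {
        "availability": agf,
        "reliability": reliability,
        "productivity": pi,
        "stability": stability,
        "reputation": reputation,
    }


if __name__ == "__main__":
    # Hypothetical sample values chosen only to exercise the formulas.
    sample = SolarDerLog(eg=1000, pug=10, mug=10, fug=10, foh=5, sh=400,
                         eprrh=10, ruh=105, gaag=900, cexmsg=990, cuxmsg=10)
    for name, value in trust_metrics(sample).items():
        print(f"{name}: {value:.4f}")
```

Keeping each equation on its own line makes it straightforward to compare the values extracted by a language model against the values computed directly from the log quantities.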

To evaluate the performance of the proposed custom SLM, we conducted a comparative study against existing state-of-the-art Small Language Models, including Gemma2 2B, DeepSeek R1 1.5B, Phi-3-mini 3.8B and a fine-tuned variant of DeepSeek R1 1.5B. To measure prediction variability, five independent runs were performed during these experiments. The Mean Squared Error (MSE) was computed to quantify the deviation between the predicted and the true values (calculated from the formulas provided above for each metric). As illustrated in Figure 6, the existing models showed significant variability and error across multiple runs. Although the fine-tuned DeepSeek R1 1.5B demonstrated relatively better consistency, its MSE values remained higher than those of the custom SLM across most of the metrics. The proposed custom SLM consistently achieved the lowest MSE values across all five metrics. It demonstrated strong alignment with the true values, especially for availability and stability, where other models struggled.
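The evaluation protocol described above (per-metric MSE, repeated over independent runs) can be sketched as follows. This is a minimal illustration only: the function names `mse_per_metric` and `summarize_runs`, the dictionary layout and the sample numbers are assumptions, and the paper does not specify how the runs were aggregated.

```python
from statistics import mean, pstdev

METRICS = ("availability", "reliability", "productivity",
           "stability", "reputation")


def mse_per_metric(true_rows, pred_rows):
    """Mean squared error of each metric over paired samples.

    Both arguments are lists of dicts keyed by the names in METRICS.
    """
    if len(true_rows) != len(pred_rows):
        raise ValueError("true and predicted rows must have the same length")
    return {
        m: mean([(p[m] - t[m]) ** 2 for t, p in zip(true_rows, pred_rows)])
        for m in METRICS
    }


def summarize_runs(true_rows, runs):
    """Mean and population standard deviation of the MSE across runs.

    `runs` is a list with one list of predicted rows per independent run.
    """
    per_run = [mse_per_metric(true_rows, pred_rows) for pred_rows in runs]
    return {
        m: (mean([r[m] for r in per_run]), pstdev([r[m] for r in per_run]))
        for m in METRICS
    }


if __name__ == "__main__":
    # Hypothetical values: two samples, two runs (the paper used five runs).
    truth = [{m: 0.9 for m in METRICS}, {m: 0.8 for m in METRICS}]
    run_a = [dict(row) for row in truth]                          # exact
    run_b = [{m: v + 0.1 for m, v in row.items()} for row in truth]
    for metric, (avg, spread) in summarize_runs(truth, [run_a, run_b]).items():
        print(f"{metric}: mean MSE={avg:.4f}, std={spread:.4f}")
```

Reporting the spread alongside the mean separates a model that is consistently slightly off from one that is occasionally far off, which is the distinction the multi-run comparison is meant to expose.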

In addition to predictive accuracy, the computational footprint of a Language Model plays a critical role in deployment. Several experiments were performed to compare the disk space and runtime memory requirements of the existing models and the custom SLM. Figure 7 presents the results of these experiments. While existing Language Models often achieve moderate accuracy, their computational overhead makes them impractical for deployment in resource-limited environments, especially SCADA systems. Even the fine-tuned model, which achieved reasonable accuracy, requires significant disk space and runtime memory, which is difficult to accommodate in SCADA systems. In contrast, the custom SLM strikes a balance by achieving high accuracy while maintaining an exceptionally small footprint.
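For a rough sense of scale, the parameter count implied by the configuration in Table 3 can be estimated with a short calculation. The sketch below assumes a standard GPT-2-style layout (learned positional embeddings, biased linear layers, two LayerNorms per block plus a final one, and an output head tied to the token embedding); none of these layout details are confirmed by the paper, so the result is an order-of-magnitude estimate and not the measured footprint reported in Figure 7. The function name `gpt_param_count` is hypothetical.

```python
def gpt_param_count(vocab_size, context_length, d_model, n_layers, d_ff,
                    tie_output_head=True):
    """Estimate trainable parameters of a GPT-2-style decoder (assumed layout)."""
    token_embedding = vocab_size * d_model
    position_embedding = context_length * d_model
    # Attention: fused Q/K/V projection plus output projection, with biases.
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)
    # Feedforward: two linear layers with biases.
    feedforward = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    per_block = attention + feedforward + layer_norms
    final_layer_norm = 2 * d_model
    total = (token_embedding + position_embedding
             + n_layers * per_block + final_layer_norm)
    if not tie_output_head:
        total += d_model * vocab_size  # separate, bias-free output projection
    return total


if __name__ == "__main__":
    # Values from Table 3; the head count does not change the total because
    # 8 heads x 32 dimensions equals the embedding size of 256.
    params = gpt_param_count(vocab_size=50257, context_length=1024,
                             d_model=256, n_layers=4, d_ff=1024)
    print(f"parameters: {params:,}")
    print(f"approx. float32 size: {params * 4 / 2**20:.1f} MiB")
```

Under these assumptions most of the parameters sit in the token embedding, which suggests that vocabulary size, rather than depth, dominates the footprint of a model this small.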

This dual advantage highlights the practicality of the custom SLM in real-world monitoring, control and trust management tasks where both robustness and efficiency are essential. To validate the trust metrics against realistic DER failure scenarios, we augmented the synthetic dataset with failure events, such as outages, based on the IEEE standard definitions. These events were cross-referenced with thresholds on critical metrics; for example, a reliability below 0.7 indicates a potential failure.

4.2. Experimental Evaluation of Trust Framework

To assess the robustness and performance of the proposed trust framework, a series of experiments was conducted that simulated the noisy real-world conditions common in a Networked Microgrid environment. The evaluation measured the classification performance of the trust framework against two reference models, a Baseline classifier and a Random Forest classifier, which are commonly used for evaluating trust frameworks. All the models were evaluated under increasing Gaussian noise to simulate potential sensor fluctuation, varying weather conditions, weak communication signals and other unknown circumstances. The noise levels tested were σ = [0.00, 0.01, 0.02, 0.05, 0.10, 0.15]. At each noise level, experiments were repeated several times to measure accuracy, F1 score and Area Under the ROC Curve (AUC).
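The noise sweep described above can be sketched as a small, dependency-free harness. It perturbs every input feature with zero-mean Gaussian noise at each σ, scores the perturbed inputs, and averages accuracy, F1 and AUC over several repeats. The scorers `mean_scorer` and `majority_baseline`, the 0.5 decision threshold, the repeat count and the sample rows are hypothetical stand-ins; the paper's Attention NN and Random Forest models are not reproduced here.

```python
import random

SIGMAS = [0.00, 0.01, 0.02, 0.05, 0.10, 0.15]  # noise levels from the text


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Pairwise AUC; ties count one half. Needs both classes present."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate_under_noise(score_fn, rows, labels, sigmas=SIGMAS,
                         repeats=5, seed=0, threshold=0.5):
    """Average accuracy, F1 and AUC of `score_fn` at each noise level."""
    rng = random.Random(seed)
    results = {}
    for sigma in sigmas:
        accs, f1s, aucs = [], [], []
        for _ in range(repeats):
            noisy = [[x + rng.gauss(0.0, sigma) for x in row] for row in rows]
            scores = [score_fn(row) for row in noisy]
            preds = [int(s >= threshold) for s in scores]
            accs.append(accuracy(labels, preds))
            f1s.append(f1_score(labels, preds))
            aucs.append(roc_auc(labels, scores))
        results[sigma] = {"accuracy": sum(accs) / repeats,
                          "f1": sum(f1s) / repeats,
                          "auc": sum(aucs) / repeats}
    return results


def mean_scorer(row):
    """Hypothetical stand-in for a trained model: mean of the five metrics."""
    return sum(row) / len(row)


def majority_baseline(row):
    """Always predicts the positive class, so its score ignores the input."""
    return 1.0


if __name__ == "__main__":
    rows = [[0.9] * 5, [0.2] * 5, [0.8] * 5, [0.3] * 5]  # illustrative inputs
    labels = [1, 0, 1, 0]
    for sigma, m in evaluate_under_noise(mean_scorer, rows, labels).items():
        print(sigma, m)
```

Because the baseline score does not depend on the input, its metrics are unaffected by σ, which matches the flat baseline curves described for Figures 9 and 10.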

Figure 8 compares the accuracies of the models under varying Gaussian noise conditions. As depicted in the figure, our proposed trust framework (Attention NN) outperforms the other models in noisy conditions. It maintains a reasonable accuracy even under the highest noise level tested.

Both Random Forest and Attention NN experience a gradual decline in performance as noise increases. However, the Attention NN maintains a superior F1 score even at a high noise level, as demonstrated in Figure 9. The Baseline model remains constant because it always predicts the majority class, which also makes it perform poorly in noise-free conditions.

The Attention NN consistently matches or exceeds the performance of the Random Forest model in AUC scores even at a high noise level, while the Baseline model remains flat, as illustrated in Figure 10.

These findings highlight the importance of the Attention NN framework for NMGs where environmental factors are significant. The Attention NN not only demonstrates robustness but, when combined with SHAP, enables high interpretability.
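To illustrate the kind of attribution that SHAP provides, the sketch below enumerates exact Shapley values for a model with five inputs, measured against a single reference input. With only five trust metrics there are 32 coalitions, so exact enumeration is feasible. This is not the `shap` library and not the paper's Attention NN: the linear `score` function, its weights, the reference input and the function name `exact_shapley` are all hypothetical and chosen so the result is easy to verify by hand.

```python
from itertools import combinations
from math import factorial

FEATURES = ("availability", "reliability", "productivity",
            "stability", "reputation")


def exact_shapley(score_fn, x, baseline):
    """Exact Shapley values of score_fn at x relative to a baseline input.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (score_fn(with_i) - score_fn(without_i))
    return phi


# Hypothetical linear trust score used only for illustration.
WEIGHTS = [0.3, 0.3, 0.2, 0.1, 0.1]


def score(row):
    return sum(w * v for w, v in zip(WEIGHTS, row))


if __name__ == "__main__":
    x = [0.97, 0.60, 0.90, 0.85, 0.99]   # illustrative DER with low reliability
    reference = [0.95, 0.98, 0.85, 0.80, 0.98]
    for name, value in zip(FEATURES, exact_shapley(score, x, reference)):
        print(f"{name}: {value:+.4f}")
```

For a linear scorer each attribution reduces to the weight times the feature's deviation from the reference, and the attributions sum to the difference between the score at `x` and the score at the reference, which is the property that makes such values convenient for explaining an individual trust score.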

The trust framework contributes to smart city applications by enabling reliable and efficient energy management in NMGs. The trust metrics provide actionable insights for urban energy systems, such as identifying DERs at risk of failure (e.g., with a low reliability due to inverter faults) to prioritize maintenance. This enhances grid resilience and supports sustainable energy distribution in smart cities. The experiments demonstrate the ability of the proposed framework to process diverse SCADA logs, showing potential for efficient urban microgrid deployment.

5. Conclusions

This research introduces a novel trust framework for assessing the trustworthiness of DERs within NMGs. By leveraging an SLM to transform unstructured DER data, such as SCADA-like logs, into structured metrics and a neural network for trust score computation, the proposed framework addresses a significant gap in the analysis of unstructured data for trust assessment. Comparative analysis shows that the custom SLM outperforms existing state-of-the-art models, including Gemma2 2B, DeepSeek R1 1.5B, Phi-3-mini 3.8B and a fine-tuned variant of DeepSeek R1 1.5B, in terms of mean squared error across five trust metrics: availability, reliability, productivity, stability and reputation. The custom SLM aligned closely with the true values while keeping an exceptionally small computational footprint, making it suitable for deployment in resource-constrained environments such as SCADA systems.

Furthermore, the proposed trust framework (Attention NN) maintained high accuracy, F1 score and AUC even at high noise levels, demonstrating resilience against real-world challenges like sensor fluctuations, weather variations and weak communication signals. The integration of SHAP with the Attention NN further enhances interpretability, providing valuable insights into the factors affecting the trust scores in NMGs in the context of smart cities.

The current evaluation relies on synthetic SCADA logs to ensure reproducibility and alignment with IEEE Standards. However, real-world SCADA logs may contain complexities, especially data noise. Future work will incorporate datasets from operational NMGs to validate the SLM’s performance under realistic conditions. Moreover, the proposed framework will be optimized to work with other types of unstructured data, such as emails, status messages and technical reports, and will be tested with highly anomalous data. Future work could also include testing the proposed framework on other DERs, such as wind turbines and generators, and deploying it in live NMG environments.

Author Contributions

Conceptualization, R.I.; methodology, R.I. and N.S.H.; software, R.I. and N.S.H.; validation, R.I.; formal analysis, R.I.; investigation, N.S.H.; resources, N.S.H.; data curation, N.S.H.; writing—original draft preparation, R.I.; writing—review and editing, N.S.H.; visualization, R.I.; supervision, R.I.; project administration, R.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

Dataset available on request from the authors.

Acknowledgments

GenAI has been used for purposes such as generating sample SCADA-like logs and partial code generation for experimental evaluation. During the preparation of this manuscript, the authors used ChatGPT-5 for the purposes of improving parts of the text from the Abstract and Conclusion section. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

NMG	Networked Microgrid
DER	Distributed Energy Resource
SLM	Small Language Model
NN	Neural Network
LM	Language Model
NLP	Natural Language Processing
IoT	Internet of Things
LLM	Large Language Model
ToT	Tree of Thought
SVM	Support Vector Machine
DNN	Deep Neural Network
SHAP	SHapley Additive exPlanations
LIME	Local Interpretable Model-agnostic Explanations
XAI	Explainable Artificial Intelligence
QDNN	Quantum Deep Neural Network
SIoT	Social Internet of Things
QoS	Quality of Service
SCADA	Supervisory Control And Data Acquisition
ReLU	Rectified Linear Unit
AUC	Area Under the ROC Curve

References

1. Bordbari, M.J.; Nasiri, F. Networked microgrids: A review on configuration, operation, and control strategies. Energies 2024, 17, 715.
2. McFly, S.; Peterson, J.; Reynolds, T. Distributed Energy Resource Cybersecurity Framework and Cyber Range Integration; Technical Report NREL/TP-5R00-82545; National Renewable Energy Laboratory (NREL): Golden, CO, USA, 2022.
3. Ali, W.; Din, I.U.; Almogren, A.; Zareei, M.; Roshan-Biswal, R. MicroTrust: Empowering Microgrids with Smart Peer-to-Peer Energy Sharing Through Trust Management in IoT. IEEE Access 2024, 12, 134985–134996.
4. Boakye-Boateng, K.; Ghorbani, A.A.; Lashkari, A.H. A trust-influenced smart grid: A survey and a proposal. J. Sens. Actuator Netw. 2022, 11, 34.
5. Jha, R.K. Strengthening smart grid cybersecurity: An in-depth investigation into the fusion of machine learning and natural language processing. J. Trends Comput. Sci. Smart Technol. 2023, 5, 284–301.
6. Zong, M.; Hekmati, A.; Guastalla, M.; Li, Y.; Krishnamachari, B. Integrating large language models with internet of things: Applications. Discov. Internet Things 2025, 5, 2.
7. Shirali, M.; Sani, M.F.; Ahmadi, Z.; Serral, E. LLM-based event abstraction and integration for IoT-sourced logs. In Proceedings of the International Conference on Business Process Management, Krakow, Poland, 1–6 September 2024.
8. Kök, İ.; Demirci, O.; Özdemir, S. When IoT meet LLMs: Applications and challenges. In Proceedings of the 2024 IEEE International Conference on Big Data (BigData), Washington, DC, USA, 15–18 December 2024.
9. Ashiwal, V.; Finster, S.; Dawoud, A. LLM-based vulnerability sourcing from unstructured data. In Proceedings of the 2024 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW), Vienna, Austria, 8–12 August 2024.
10. Wang, J.; Feng, J. Unify: An unstructured data analytics system. In Proceedings of the 2025 IEEE 41st International Conference on Data Engineering (ICDE), Hong Kong, China, 19–23 May 2025.
11. Faisal, G.B.; Thamir, T.A.; Mohd, F.; Mohamad, K.I.; Abdul, S.D. Utilizing Machine Learning and SHAP Values for Improved and Transparent Energy Usage Predictions. Comput. Mater. Contin. 2025, 83, 3553–3583.
12. Mohammad, R.S.; Hamid, M.; Hamid, R.S. Explainable Artificial Intelligence for energy systems maintenance: A review on concepts, current techniques, challenges, and prospects. Renew. Sustain. Energy Rev. 2025, 216, 115668.
13. Amir, H.P.; Farhad, N. Explainable AI-Driven Quantum Deep Neural Network for Fault Location in DC Microgrids. Energies 2025, 18, 908.
14. Joy, D.B.; Bo, N.J.; Zheng, M. A Framework for Resilient Community Microgrids: Review of Operational Strategies and Performance Metrics. Energies 2024, 18, 405.
15. Aslam, M.J.; Din, S.; Rodrigues, J.J.; Ahmad, A.; Choi, G.S. Defining service-oriented trust assessment for social internet of things. IEEE Access 2020, 8, 206459–206473.
16. Li, Z.; Bouazizi, M.; Ohtsuki, T.; Ishii, M.; Nakahara, E. Toward Building Trust in Machine Learning Models: Quantifying the Explainability by SHAP and References to Human Strategy. IEEE Access 2023, 12, 11010–11023.
17. Parisineni, S.R.A.; Pal, M. Enhancing trust and interpretability of complex machine learning models using local interpretable model agnostic shap explanations. Int. J. Data Sci. Anal. 2024, 18, 457–466.
18. Pedro, M.S.S.; Alberto, H.C.; Ning, X.; Gérôme, B.; Gregorio, M.P.; Burkhard, S. FederatedTrust: A solution for trustworthy federated learning. Future Gener. Comput. Syst. 2024, 152, 83–98.
19. Gourav, B.; Himanshu, A.; Rinkle, R. A graph-based model to improve social trust and influence for social recommendation. J. Supercomput. 2020, 76, 4057–4075.
20. Popov, R.O.; Karpenko, N.V.; Gerasimov, V.V. Overview of small language models in practice. In Proceedings of the 7th Workshop for Young Scientists in Computer Science and Software Engineering, Kryvyi Rih, Ukraine, 27 December 2024.
21. Andrade, J.R.; Rocha, C.; Silva, R.; Viana, J.P.; Bessa, R.J.; Gouveia, C.; Almeida, B.; Santos, R.J.; Louro, M.; Santos, P.M.; et al. Data-driven anomaly detection and event log profiling of SCADA alarms. IEEE Access 2022, 10, 73758–73773.
22. Banerjee, C.; Mukherjee, T.; Pasiliao, J.E. An empirical study on generalizations of the ReLU activation function. In Proceedings of the 2019 ACM Southeast Conference, Kennesaw, GA, USA, 18–20 April 2019.
23. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30, 5998–6008.
24. Bock, S.; Weiß, M. A proof of local convergence for the Adam optimizer. In Proceedings of the 2019 International Joint Conference on Neural Networks (IJCNN), Budapest, Hungary, 14–19 July 2019.
25. IEEE Power and Energy Society. IEEE Standard Definitions for Use in Reporting Electric Generating Unit Reliability, Availability, and Productivity; IEEE SA Standards Board: Piscataway, NJ, USA, 2023.

Figure 1. Traditional NMG trust framework/unstructured data processing in NMGs.

Figure 2. Proposed NMG architecture based on SLM and Neural Network.

Figure 3. GPT-based SLM architecture.

Figure 4. Sample SCADA-like log entries from a solar DER.

Figure 5. Proposed trust framework combining attention mechanism + SHAP.

Figure 6. Mean Squared Error comparison of custom SLM with popular models.

Figure 7. Size comparison of custom SLM with popular models.

Figure 8. Comparison of model accuracy under Gaussian noise conditions.

Figure 9. Comparison of model F1 score under Gaussian noise conditions.

Figure 10. Comparison of model AUC score under Gaussian noise conditions.

Table 1. Existing studies for utilizing LMs in IoT systems in the literature.

Reference	Features	Limitations
Zong et al. [6]	DDoS attack detection; Macroprogramming; Sensor data processing	Lacks unstructured data processing; Relies on resource-heavy GPT-4
Shirali et al. [7]	Event abstraction; Log integration	Processes structured data only; Relies on resource-heavy GPT-4
Kök et al. [8]	LLM, fog, cloud synergy; ToT for LLM reasoning	Resource-heavy computations; Lacks context-awareness
Ashiwal et al. [9]	Unstructured data processing; Actionable vulnerability info extraction	Single unstructured data type (emails); Small dataset testing
Wang et al. [10]	Unstructured data processing; Effective and scalable	Reliance on expensive LLMs; Manual operator definitions

Table 2. Existing studies on trust frameworks in the literature (attention/SHAP/dynamic trust).

Reference	Approach	Attention	SHAP/XAI	Trust Feature	Robustness to Noise
Faisal et al. [11]	Ensemble ML + SHAP	No	Yes	Static	Moderate
Mohammad et al. [12]	Blackbox DNN + SHAP	No	Yes	Static	Moderate
Amir et al. [13]	Self-Attention in DNN	Yes	Yes	Partial	Moderate
Joy et al. [14]	Control Methodologies	No	No	No	Low
Aslam et al. [15]	Service-Oriented Trust	No	No	Dynamic	Moderate
Li et al. [16]	Quantifying Human Intuition	No	Yes	Static	Low
Parisineni et al. [17]	SHAP + LIME	No	Yes	No	Low
Sanchez et al. [18]	Federated Trust	No	No	Static	Low
Gourav et al. [19]	Recommendation Model	No	No	Static	Low

Table 3. Configuration and setup of SLM.

Parameter	Value
Vocabulary Size	50,257
Context Length	1024 tokens
Embedding Dimensions	256
Number of Transformer Layers	4
Number of Attention Heads	8
Head Dimension	32
Feedforward Dimensions	1024
Dropout Rate	0.1
Number of Epochs	5
Batch Size	2
Early Stopping Patience	10
Temperature	0.3

Table 4. Typical quantities and units for calculating trust metrics in a solar DER.

Metric	Unit
Available Hours (AHs)	Hours
Active Hours (ACTHs)	Hours
Resource Unavailable Hours (RUHs)	Hours
Expected Generation (EG)	kWh
Forced Outage Hours (FOHs)	Hours
Forced Unavailable Generation (FUG)	kWh
Maintenance Unavailable Generation (MUG)	kWh
Planned Unavailable Generation (PUG)	kWh
Planned Outage Hours (POHs)	Hours
Service Hours Generating (SHs)	Hours
Equivalent Partial Reserve Reduction Hours (EPRRHs)	Hours
Maximum Generation (MG)	kWh
Partial Reserve Reduction Generation (PRRG)	kWh
Gross Actual Generation (GAAG)	kWh
Count of Messages classified as “expected” (CExMSG)	Integer
Count of Messages classified as “unexpected” (CUxMSG)	Integer

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
