A Trust-Influenced Smart Grid: A Survey and a Proposal

Abstract: A compromised Smart Grid, or any of its components, can have cascading effects that can affect lives. This has led to numerous cybersecurity-centric studies focusing on the Smart Grid in research areas such as encryption, intrusion detection and prevention, privacy, and trust. Even though trust is an essential component of cybersecurity research, it has received considerably less attention than the other areas within the context of the Smart Grid. At the time of this study, we observed that no study had assessed trust within the Smart Grid and that no trust model could detect malicious attacks within the substation. With these two gaps as our objectives, we begin by presenting a mathematical formalization of trust within the context of Smart Grid devices. We then categorize the existing trust-based literature within the Smart Grid under the NIST conceptual domains and priority areas, multi-agent systems, and the derived trust formalization. We then propose a novel substation-based trust model and implement a Modbus variation to detect final-phase attacks. The variation was tested against two publicly available Modbus datasets (EPM and ATENA H2020) under three kinds of tests, namely external, internal, and internal with IP-MAC blocking. The first test assumes that adversaries external to the substation remain external, and the second test assumes that all adversaries are within the substation. The third test makes the same assumption as the second but blacklists any device that sends malicious requests. The tests were performed from a Modbus server's point of view and a Modbus client's point of view. Aside from detecting the attacks within the datasets, our model also revealed the behaviour of the attack datasets and their influence on the trust model components. Detecting all labelled attacks in one of the datasets also increased our confidence in the model's detection of attacks in the other dataset. We also believe that variations of the model can be created for other OT-based protocols as well as extended to other critical infrastructures.


Introduction
The Smart Grid is a transformation of the traditional grid in which the grid is combined with cyber devices to automate monitoring and control and to enable two-way communication between systems [1]. The Smart Grid's performance, just like that of the traditional grid, is centred on factors such as distribution, transmission, and generation. The coupling of the traditional power grid's physical components with the cyber infrastructure has made the creation and continuous improvement of the Smart Grid possible. The diverse nature of the Smart Grid introduces varying applications and the integration of components such as electric vehicles, renewable energy resources, and variants of distributed power generators. The Smart Grid has also introduced and improved vendor-independent standards that devices must conform to, thus allowing the seamless operation and integration of these devices into the Smart Grid.
Unfortunately, the cyber infrastructure's integration into the power grid enlarges the attack surface of the Smart Grid, thereby making the security of the Smart Grid of paramount importance. In response, research has been undertaken on topics such as encryption [2], generation and management of cryptographic keys [3], privacy [4], risk assessment [5], and trust. Trust within the Smart Grid is important for determining whether an action, transaction, or communication is malicious or not. In the case of the notorious Stuxnet [6], trust could possibly have been implemented in devices to ascertain the legitimacy of the malicious commands before any responses or actions were taken on those commands.
We also observed that research on trust has not received the attention it deserves within the Smart Grid, even though it abounds in other research areas such as e-commerce. Furthermore, other branches of security within the Smart Grid have received far more contributions than trust. At the time of writing, no study assessing trust within the Smart Grid existed. We also noticed that only a limited number of trust models detected operational faults within the substation. However, these models could neither determine whether the faults were malicious nor detect obvious or stealthy malicious attacks within the substation. Such attacks are predominant in advanced persistent threats (APTs).
The contributions of this study are as follows:
• We present a mathematical formalization of trust within the context of Smart Grid devices.
• We categorize the existing trust-based literature within the Smart Grid under the NIST conceptual domains and priority areas, multi-agent systems, and the derived trust formalization.
• We propose a novel substation-based trust model and implement a Modbus variation to detect final-phase attacks. We believe other protocol variants of the trust model can be created; developing these will be addressed in future work.
• We test the variation against two publicly available Modbus datasets (EPM and ATENA H2020) under three kinds of tests, namely external, internal, and internal with IP-MAC blocking.
• We perform the tests from a Modbus server's point of view and a Modbus client's point of view.
• We detect all attacks and reveal the behaviour of the attacks based on their impact on the trust model's components.
In this paper, we provide a background on the priority areas and conceptual domains of the Smart Grid as described by the National Institute of Standards and Technology (NIST) in Sections 2 and 3. We then present a background on trust, its definitions, and trust-related attacks in Section 4. We categorize existing literature in Sections 5-8. We present a proposed trust model and its Modbus variation in Sections 9-11. Implementation and results are presented in Sections 12 and 13, respectively. We provide our conclusions and future work in Sections 14 and 15. We also include a table of notations in Appendix A to be used as a reference for the equations in the paper.

NIST Priority Areas On Smart Grid
The inclusion of a cyber infrastructure exposed deficiencies in a myriad of standards, which made maintaining the efficiency of the Smart Grid extremely challenging. In light of that, NIST identified nine key priority areas to focus on to tackle these challenges [7]. These areas are discussed in this section.

Energy Storage
One major challenge in the power industry is the storage of energy. Because of the immense difficulties posed by such storage, supply and demand are carefully balanced. This challenge brings about the need to invest in and investigate new technologies to store energy, which will improve efficiency within the grid from supplier to consumer.

Wide-Area Situational Awareness (WASA)
Monitoring the various components within the Smart Grid is essential to ensure their optimization. Such monitoring facilitates the processes of demand and supply as well as utilization forecasts. Thus, novel technologies and strategies are required to create tools that monitor and display these components within the Smart Grid.

Advanced Metering Infrastructure (AMI)
Power usage by consumers is a key parameter in observing demand within the Smart Grid. In the traditional grid, meters were manually read and recorded, and the readings were then used to compute the actual utilization within a given period. The introduction of the Smart Grid enables the near-real-time monitoring of power usage with AMI. AMI creates a two-way network between the smart meters and the business systems of utility providers. This enables the collection and distribution of meaningful data to customers and utility providers as well as competitive retail suppliers. Such information can be used to implement residential demand response. Although AMI designs vary, each consists of communications hardware and software together with the associated system and data management software.

Distributed Energy Resources (DERs)
DERs are resources that generate and/or store electricity for a local distribution system or a facility within that system. As such, DERs connect to these systems. DERs include combined heat and power (CHP) generators, electric vehicles/plug-in electric vehicles (PEVs), battery storage systems, solar panels, and microgrids [8,9]. Because these technologies are relatively new, they continuously evolve.
One key concern is using these resources to ensure a resilient, safe, and uninterrupted power grid and safeguarding the efficient generation, utilization, and storage of power from these resources.

Distribution Grid Management
Distribution grid management systems integrate customer operations, networked distribution systems, and transmission systems with actual physical components, such as transformers, feeders, circuit breakers, and relays, to enable real-time functionalities such as the monitoring of system performance and load utilization [7]. Thus, the automation of distribution systems is important to the operations of the Smart Grid, especially where systems such as AMI and PEVs are deployed to provide benefits such as reduced peak loads, the location of malfunctioning devices for field engineers, and increased reliability.

Network Communications
Communication is important to ensure real-time monitoring, operations, and maintenance within the Smart Grid. Therefore, various technologies such as fibre-optic, wireless, and cellular communication (with 5G currently trending) are required in strategic areas or locations to aid Smart Grid operations. Different routing algorithms are also required to ensure fast communication for the time-sensitive operations of some devices within the Smart Grid. Access to public and private communication networks will be required, with various restrictions in place. It is also critically important to ensure that there is no collision or loss of messages during transmission. Power network interfaces are required for long-distance transmission, and cost-effective solutions are always required. The efficient translation of protocols is also required, as are global standards that vendors can comply with, thereby making communication seamless.

Demand Response and Consumer Energy Efficiency
Technologies to balance supply and demand are being used by electricity suppliers and system planners. These technologies allow them to provide incentives (mostly financial) and mechanisms for consumers that lead to the efficient use of power during unstable power periods or peak periods. By providing detailed information to clients about consumption, they can save energy by engaging in practices and investing in devices that ensure the efficient utilization of power. Offering time-based rates such as critical peak rebates, variable peak pricing and time-of-use pricing can allow customers to take part in demand response efforts. Customers could allow utility companies to use direct load control programs to cycle water heaters and air conditioners on and off during peak periods in exchange for lower bill charges or incentives that may be financial or non-financial.

Electric Transportation
Clean energy ensures reduced carbon emissions, reduced dependency on fossil fuels to drive the economy, and a reduced carbon footprint for nations. The large-scale usage and patronage of PEVs are essential to achieving this. Technologies that enable the cost-effective mass production of these electric vehicles and improve their storage capacity are therefore crucial.

Cybersecurity
In a world where everything is being relocated to the cyber-domain, cybersecurity is critical to ensure the safety, availability, and reliability of the Smart Grid. It is very important to ensure that the operations of the Smart Grid are not adversely affected when security is applied within the grid. Cybersecurity plays a critical role in the operations of the previously mentioned areas (Figure 1). There has been research into (but not limited to) network communication [10,11], demand response [12,13], PEVs [14,15], AMI [16,17], and DER [18,19]. This research includes encryption [19], privacy [20], intrusion detection and prevention [21], and trust. In this paper, we present a survey of the research on trust within the Smart Grid, especially within the priority areas and conceptual domains of the Grid. For systems to trust each other, they must be cognitive. It is for this reason that we also investigate the application of trust in multi-agent systems research within the Smart Grid. We also propose a trust model for substations within the Smart Grid.

NIST Conceptual Domain Model
The conceptual domain model represents seven logical domains within the Smart Grid [7]. These domains represent the present and near-future view of the Smart Grid (Figure 2). The domains communicate with each other through interfaces. Figure 3 shows the mapping of legacy systems in the grid to the conceptual domains.

Generation Domain
This is the domain where electricity is generated from renewable or non-renewable forms of energy, and applications in this domain are the first processes in the delivery of power to customers [22]. It is from here that power is transferred to the transmission or the distribution domain. Thus, the connections with those two domains must remain reliable because power cannot be served to customers without them. Applications that can be found in this domain are asset management, protection, measurement, records/logging, and control.

Transmission Domain
The transmission domain is responsible for the bulk transfer of electrical power to the distribution domain from the generation domain through the use of multiple substations. A transmission network is usually managed and operated by a transmission-owning entity with the primary responsibility to ensure stability on the electrical grid by balancing supply (power generation) with demand (power consumption) across the transmission network. A Supervisory Control and Data Acquisition (SCADA) system, which comprises a communication network, control devices and field monitoring devices, is used to monitor the transmission network.

Distribution Domain
The distribution domain is electrically connected between the transmission domain and the customer domain. The electrical distribution system may be structured in a number of ways, such as meshed, looped, or radial, and each structure affects the reliability of the system. Initially, the communications interfaces within this domain were unidirectional and hierarchical, but now they work in a bi-directional manner. Typical applications within this domain are measurement and control, substation, DERs, distribution generation, and storage.

Operations Domain
This domain ensures that the power system runs smoothly. A regulated utility is assigned the responsibility of ensuring this. Even though some of the functions in this domain may be provided by the service provider as the Smart Grid continuously evolves, there will always be core functions maintained in this domain. Typical applications in this domain are customer support, fault management, operation planning, monitoring, network calculations, maintenance and construction, analysis, records and assets, control, extension planning and reporting, and statistics.

Service Provider Domain
The service provider domain provides services that support the other domains, such as home energy generation, the management of energy use, and billing and customer account management. Its communication with the operations and markets domains is critical for situational awareness, system control, and enabling economic growth. Typical applications in the service provider domain include building management, customer management, installation and management, account management, and billing.

Markets Domain
The sale and purchase of grid assets are conducted in the markets domain; hence, it is important that communications within this domain are transparent and reliable. This domain ensures the balance of supply and demand as well as the exchange of prices within the power system. It must also be noted that, due to the evolving nature of the Smart Grid, the markets domain is bound to evolve, which in turn will define the Smart Grid in the future. The markets domain communicates with the entity that controls the assets (operations domain), the customer domain, and the other domains that supply the assets. The efficient matching of the production of power with the consumption of power depends on the markets domain; thus, the communication flow between that domain and the domains that supply the power is critical. Bulk generation and DERs (which are usually served through aggregators) are examples of power suppliers, with DERs likely to become greater participants as the interactive nature of the grid increases. Typical applications in the markets domain include market management, DER aggregation, market operations, trading, ancillary operations, and retailing.

Customer Domain
The customer is the main beneficiary of the Smart Grid and is the reason the Grid was created. The primary purpose of the customer is to consume the electricity generated by the grid. The customer domain is usually divided into home, commercial/building, and industrial sub-domains due to the difference in their energy demands. Each sub-domain has a meter and an interface that connects to other domains for utility-to-customer interactions. This may be done over the Internet or the AMI. Home or building automation is one of the applications in the customer domain that relies on these interfaces to function. Home automation allows the control of appliances within the house. Industrial automation, which is similar to home automation, allows the control of industrial processes such as manufacturing. The interfaces also allow the storage of energy in thermal energy units and batteries as well as the generation of energy from renewable sources, such as solar panels, that are close to the customer. In addition to communicating with and being electrically connected to the distribution and generation domains, the customer domain communicates with the service provider, operations, and markets domains.

Trust
The world would not function without trust. Without trust, it would be difficult for interactions and/or transactions to exist. As a concept, trust is fundamental in the building and maintenance of stability in human relations. Trusting someone or something helps create interactions between people and organizations. In the digital age, with the current existence of virtual markets and communities, the interest in trust has matured and as such, can be expanded into other domains. Thus, any effort undertaken towards the proper management of trust by sharing information that enables interactions between participants in the open environment is essential and challenging. It is worth noting that trust is only useful in uncertain situations where people or agents must cooperate to achieve goals.

Trust Definition and Formalization
According to the literature, trust has many definitions. A definition from the social sciences states that trust is the degree of subjective belief about the behaviours of a particular entity [23]. Trust is also defined as an agent's subjective probability that another agent will perform a particular act [24]. In this paper, we define the trusting entity as the agent and the entity being trusted as the subject. Marsh [25] describes three levels of trust, namely basic trust, general trust, and situational trust. Basic trust is the general trusting disposition of an agent. General trust is the trust that an agent has in a subject at a certain time. Situational trust is the trust that the agent has in the subject, taking into account a certain situation.
It must be noted that trust has been applied in different contexts, which explains why it has many definitions. A trust model must therefore be designed within a context or in terms of the system being designed. The factors chosen to design the trust model must also rest on objective grounds to ensure that the trust being modelled is objective. This is what makes modelling trust difficult. Regardless, trust models must have a component that accounts for risk because, without the assessment of risk, there is no trust.
NIST defines risk as "a measure of the extent to which an entity is threatened by a potential circumstance or event" [26]. Thus, for an agent, $a_i$, and a subject, $a_j$, we define the risk, $r_{ij}$, of a transaction, $\alpha_{ij}$, involving $a_i$ and $a_j$ as a function as shown in (1). There must also be a component of knowledge, $k^t_{ij}$, within the trust model. Before and after a transaction, knowledge about $\alpha_{ij}$ and previous transactions ($k_{ij}$) with the subject, the environment ($k_e$), knowledge of $a_j$ ($k_{a_j}$), and the time period ($t$) are also of prime importance in determining trust. We formulate knowledge as shown in (2). $k_{ij}$ is a collection of transactions before the current transaction, and this is formulated in (3).
Thus, with risk, $r_{ij}$, and knowledge, $k^t_{ij}$, the decision on trust can be made. Therefore, trust, $T_{ij}$, can be expressed as the output of a function that takes a tuple of elements as shown in (4), where the $T_{ij}$ appearing in the tuple is the previous trust value between $a_i$ and $a_j$. The previous trust value has an influence on the decision for $a_i$ to trust $a_j$ to undertake $\alpha_{ij}$. Trust is represented as a continuous variable over a specified range, usually $-1 \le T \le 1$ or $0 \le T \le 1$, where 1 represents complete trust, $-1$ represents complete mistrust, and 0 represents no trust. It must be noted that the transitive property of trust may or may not exist. In a situation where it does not exist, for three agents $a_i$, $a_j$, and $a_k$, the fact that $a_i$ trusts $a_j$ and $a_j$ trusts $a_k$ does not mean that $a_i$ trusts $a_k$ (see (5)). In a situation where transitivity exists, $a_i$ trusts $a_j$ and $a_j$ trusts $a_k$; therefore, $a_i$ trusts $a_k$ (see (6)).
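As a minimal sketch of the idea that trust is the output of a function of risk, knowledge, and the previous trust value, the following Python snippet clamps a weighted combination of these inputs to the range [-1, 1]. The weighting scheme, the threshold, and the names `update_trust`, `trusts`, `risk`, `knowledge`, and `weight` are illustrative assumptions; they are not the functions defined in (1)-(4).

```python
def clamp(value, low=-1.0, high=1.0):
    """Keep a trust value inside the chosen range."""
    return max(low, min(high, value))


def update_trust(prev_trust, risk, knowledge, weight=0.5):
    """Hypothetical trust function: new trust from risk, knowledge, and history.

    prev_trust -- previous trust value between agent and subject, in [-1, 1]
    risk       -- assumed to lie in [0, 1]; higher means a riskier transaction
    knowledge  -- assumed to lie in [-1, 1]; a summary of evidence about the subject
    weight     -- assumed mixing factor between history and new evidence
    """
    evidence = knowledge * (1.0 - risk)  # riskier transactions discount the evidence
    return clamp(weight * prev_trust + (1.0 - weight) * evidence)


def trusts(trust_value, threshold=0.5):
    """Assumed decision rule: the agent trusts the subject above a threshold."""
    return trust_value >= threshold


# Transitivity is not guaranteed: the agent trusts a_j, and a_j trusts a_k,
# yet the agent's own evaluation of a_k may still fall below the threshold.
t_ij, t_jk, t_ik = 0.8, 0.7, 0.2
print(trusts(t_ij), trusts(t_jk), trusts(t_ik))
```

In this sketch, a transaction with maximal risk contributes no new evidence, so the trust value decays towards zero unless the history supports it.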
Trust can be directly or indirectly evaluated. Direct trust is calculated based on direct interactions between the agent and the subject. The default definition of trust is direct trust, and that is formulated in (4). In the situation where no interaction exists between the agent, $a_i$, and the subject, $a_j$, trust is built based on opinions from other agents about the subject; this is termed indirect trust. As formalized in (7), in an environment of $n$ agents, trust is computed based on the recommendations of, at most, $n - 2$ agents.
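The bound of n - 2 recommenders follows from excluding the agent and the subject themselves. A minimal Python sketch of indirect trust under that exclusion is given below; the unweighted mean and the name `indirect_trust` are assumptions made for illustration, and the aggregation in (7) may differ (for example, by weighting each recommender).

```python
def indirect_trust(agent, subject, recommendations):
    """Average third-party opinions about `subject`.

    recommendations -- dict mapping recommender id -> trust value in [-1, 1]
    The agent and the subject are excluded, so in an environment of n agents
    at most n - 2 opinions are counted.
    Returns None when no third-party opinion is available.
    """
    opinions = [value for recommender, value in recommendations.items()
                if recommender not in (agent, subject)]
    if not opinions:
        return None
    return sum(opinions) / len(opinions)


# Four agents in the environment; a1 has never interacted with a2.
opinions_about_a2 = {"a2": 1.0, "a3": 0.5, "a4": 1.0}
print(indirect_trust("a1", "a2", opinions_about_a2))
```

The subject's opinion of itself is ignored here, which is why only the values from `a3` and `a4` contribute.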

Trust-Based Attacks
To undermine the trust mechanisms in an environment, adversaries employ different attacks or strategies [27,28]. Some of these attacks are as follows:
• Misleading feedback attack: In this attack, a compromised agent feeds bad reports or recommendations to other nodes to denigrate agents with good reputations. It is also known as a bad-mouthing attack or betrayal attack.
• Sybil attack: This attack involves a malicious agent within the system creating fake identities to gain a larger influence over other agents using false rankings.
• Newcomer attack: This attack involves the malicious agent reintroducing itself as a new agent within the system in an attempt to erase its history of bad scores.
• Ballot-stuffing attack: In this attack, malicious agents collude by providing inaccurate recommendations or reports in an attempt to take over the system. It is also known as a collusion attack.
• On-off attack: This attack involves a malicious agent repeatedly switching between being honest and dishonest in an attempt to remain undetected. It is also known as an inconsistency attack.
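To illustrate why recommendation-based attacks such as ballot-stuffing matter, the following Python sketch shows how a group of colluding agents can flip an unweighted average of recommendations about a poorly behaved subject. The averaging rule and the numeric values are assumptions chosen for illustration only; they do not come from any of the cited models.

```python
def mean_recommendation(values):
    """Unweighted mean of recommendation values in [-1, 1]."""
    return sum(values) / len(values)


# Honest agents report a poorly behaved subject.
honest = [-0.8, -0.6, -0.7]

# Five colluding agents all report complete trust in the same subject.
colluders = [1.0] * 5

before = mean_recommendation(honest)
after = mean_recommendation(honest + colluders)
print(before < 0, after > 0)
```

Because every opinion carries equal weight in this sketch, the colluders only need to outnumber the honest agents to change the sign of the aggregated value.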

Trust: State of the Art in Smart Grid
In this section, we present literature on trust within the Smart Grid, categorized by the priority areas, conceptual domains, and trust definitions, after which we briefly discuss our observations. We searched the IEEE, Science Direct, Scopus, Web of Science, ACM, and Springer Link databases to find literature by using the keywords trust, reputation, trust management, mistrust, and trust model. We further narrowed the set of papers by pairing each keyword with each of the following keywords: cyber-physical systems, critical infrastructure, distributed energy resources, micro-grids, smart grid, smart meters, substations, advanced metering infrastructure, building automation and control systems, distribution automation, and industrial control systems. We streamlined the list by reading the abstracts to ensure that the papers were relevant to the subject matter. The remaining papers were scrutinized and then either categorized or left out if they were not relevant to the subject matter.

Research Areas
Cheng et al. sought to assess the credibility of data from different sources by establishing trust in those sources [29]. Though they were not specific about which part of the Smart Grid they were working on, their work implied that it could be used in all areas of the Smart Grid because it deals with big data. In their paper, they used trust and credibility interchangeably. Even though the knowledge component exists in terms of previous trust values and a forgetting rate, no measure of risk was computed for the data or for the data source itself. There were no tests against trust-based attacks.
Moving away from big data to secure routing, another paper sought to compute trust for secure routing in wireless communications in the Smart Grid [30]. Network-based features such as the average transmission rate, buffering capacity, and time-to-live (TTL) are used to compute trust. The algorithm first computes direct trust between nodes, then indirect trust based on recommendations from other nodes, and finally uses that information to decide how to route information from one node to another within the Smart Grid communication infrastructure. This algorithm would work best in AMI but not in the generation and distribution domains of the Smart Grid, where communications are more wired than wireless. This work improved on the authors' previous trust model to identify benign and malicious nodes based on various features using a combination of Bayes, Dempster-Shafer, and fuzzy theory [31]. They employed a water cycle algorithm (WCA) to improve its efficiency and tested it using the NS-2 simulator. The parameters used are clear indicators of the knowledge component of trust; however, there was no measurement of risk to show the impact should a node be wrongfully trusted. The algorithm was also not tested against trust-based attacks.
Another paper also proposed a fuzzy logic-based trust model to ensure secure routing in the network [32]. It computes a global trust value by computing direct and indirect trust to allow nodes to make decisions on compromised nodes. They tested their work against trust-based attacks, but their algorithm had no risk component.
Still focusing on routing, Xiang et al. presented a trust-based geographical routing protocol that placed trusted nodes in a trust list [33-35]. To be part of the trust list, a node was required to have a good performance ratio as well as good recommendations from other nodes. Based on that list, a routing algorithm is implemented to route from one trusted node to another. Their work did not include a risk component and was not tested against trust-based attacks, even though it was tested against WSN-based DoS attacks. Their experiment was simulated using a Java-based simulator called J-Sim.
Although they did not create their own trust model, Bello et al. explored the impact of transitivity in network topology on the performance of the well-known EigenTrust model [36]. They demonstrated that, in a network containing established transitivity connections, a benevolent node was quickly identified by a node, thereby reducing the average energy consumption. An improved version was tested against trust-based attacks and showed that structural similarity has an impact on robustness against trust-based attacks and malicious nodes [37].
To detect a compromised node in a network, a trust management model based on fuzzy logic was proposed that computes trust using the packet error rate, interaction duration, and packet loss rate as features [38]. There was no risk component in the calculation, nor was the algorithm tested against trust-based attacks. The trust model was simulated using Xfuzzy-3.5.
Moving away from networks but still within AMI, Pliatsios et al. computed trust based on three features, namely consumption, polling, and connection, to detect malicious devices [39]. A continuous-time Markov chain was utilized to compute the trust value of a node. The model was tested purely with numerical parameters. The trust value of a device was decreased or increased in unit steps within the range of 1-3 (inclusive) depending on the behaviour of the device. The state of the Markov chain stores the state of a previous interaction. However, there is no risk component to determine the extent of a possible threat on or from the device. Furthermore, an on-off attack can be used to ensure that the device's trust value is maintained.
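The weakness to on-off behaviour can be seen in a simplified, discrete-step sketch of a trust value that moves in unit steps within the range 1-3. This Python snippet is not the continuous-time Markov chain of [39]; the function names and the alternating behaviour pattern are assumptions used only to illustrate the observation above.

```python
MIN_LEVEL, MAX_LEVEL = 1, 3  # trust range reported for the model in [39]


def step_trust(level, behaved_well):
    """Move the trust level one unit up or down, staying within 1-3."""
    delta = 1 if behaved_well else -1
    return max(MIN_LEVEL, min(MAX_LEVEL, level + delta))


def replay(start, behaviour):
    """Return the trust level after each observed interaction."""
    levels, level = [], start
    for behaved_well in behaviour:
        level = step_trust(level, behaved_well)
        levels.append(level)
    return levels


# On-off behaviour: one malicious interaction followed by one honest one.
on_off = [False, True] * 5
print(replay(MAX_LEVEL, on_off))
```

Under these assumptions, a device that alternates between malicious and honest interactions never reaches the lowest trust level and returns to the highest level after every honest interaction.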
In tackling meter tampering within the AMI, Pradhan et al. did the reverse of calculating trust by using mistrust [40]. Their algorithm involved comparing the data presented by houses with the actual data from smart meters to see whether a house is being truthful or not. A dishonest house is added to a mistrust table. Their algorithm has no risk component and was not tested against trust-based attacks.
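A mistrust table of this kind can be sketched as a comparison between reported and metered consumption. The Python snippet below is a hypothetical illustration only: the relative tolerance, the dictionary layout, and the name `build_mistrust_table` are assumptions and are not taken from [40].

```python
def build_mistrust_table(reported, metered, tolerance=0.05):
    """Flag houses whose reported consumption deviates from the meter reading.

    reported, metered -- dicts mapping house id -> consumption in kWh
    tolerance         -- assumed relative deviation that is still accepted
    Returns a dict mapping each mistrusted house to its relative deviation.
    """
    mistrust = {}
    for house, claimed in reported.items():
        actual = metered.get(house)
        if actual is None:
            continue  # no meter reading to compare against
        # Fall back to the absolute difference when the meter reads zero.
        deviation = abs(claimed - actual) / actual if actual else abs(claimed)
        if deviation > tolerance:
            mistrust[house] = deviation
    return mistrust


reported = {"h1": 100.0, "h2": 60.0}
metered = {"h1": 101.0, "h2": 90.0}
print(sorted(build_mistrust_table(reported, metered)))
```

Here house `h2` under-reports its consumption by a third and is added to the table, while the small deviation of `h1` stays within the assumed tolerance.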
In tackling cascading power failures, a trust management toolkit was proposed that computes a trust value using the simple trust algorithm [41], which takes the threshold of grid values as input [42]. Using the resulting trust values and Dijkstra's shortest path algorithm, the toolkit allows power to flow in an optimal direction to prevent cascading failures. This work was improved upon to create a special protection system (SPS) that implemented a trust mechanism that is con-resistant and mitigates transient instabilities (which are aperiodic in time) within the grid by using load-shedding strategies [43]. One of the key features in calculating trust values was ensuring that a node reports a frequency value around a specific threshold. There was no risk component, and their work was not tested against trust-based attacks.
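One way to combine trust values with Dijkstra's shortest path algorithm is to make poorly trusted nodes expensive to traverse. The Python sketch below assumes that entering a node costs one minus its trust value; this cost definition, the graph layout, and the name `most_trusted_path` are illustrative assumptions and do not reproduce the toolkit in [42].

```python
import heapq


def most_trusted_path(graph, trust, source, target):
    """Dijkstra's algorithm where entering a node costs (1 - trust[node]).

    graph -- dict mapping node -> iterable of neighbouring nodes
    trust -- dict mapping node -> trust value in [0, 1]
    Returns (total_cost, path) or None if the target is unreachable.
    """
    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour in graph.get(node, ()):
            if neighbour not in visited:
                step = 1.0 - trust[neighbour]  # low trust means high cost
                heapq.heappush(queue, (cost + step, neighbour, path + [neighbour]))
    return None


graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
trust = {"A": 1.0, "B": 0.2, "C": 0.9, "D": 1.0}
print(most_trusted_path(graph, trust, "A", "D"))
```

Because all costs are non-negative when trust lies in [0, 1], the usual correctness argument for Dijkstra's algorithm applies, and the route through the poorly trusted node `B` is avoided.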
Other papers assume that trust is already manifest in firewalls, intrusion detection systems (IDSs), and other security devices and therefore apply the term trust nodes to these devices. Their research involves placing these devices at vantage points within the AMI [44-48] or SCADA network [49] and computing an optimal routing algorithm for them, especially when a node is compromised. These papers do not include any computations of trust because they assume that trust is already embedded in the devices.

Discussion
From Table 1, it can be observed that the majority of the papers reviewed focused on the AMI and network communications areas. Only one paper [29] fits across all the priority areas. Only two papers [42,43] were specifically focused on distribution grid management. Research on trust in the areas of energy storage, electric transportation, demand response and consumer energy efficiency, WASA, and DER is lacking.
In Table 2, research by Cheng et al. [29] covers all seven conceptual domains. Only two papers specifically cover transmission, distribution, generation and operation domains. The rest were focused on customer and service provider domains.
In Table 3, none of the papers had a risk component for computing trust, and only two of the papers [32,37] tested their work against trust-based attacks. The knowledge component of most of the papers did not include previous transactions or states; thus, trust was computed based on the values of parameters that were provided for computation. Only two papers [38,39] implemented direct trust, and the rest computed both direct trust and indirect trust.

Trust: State of the Art in Substations
Substations, aside from other functions, are responsible for transforming low voltage into high voltage or vice versa [50]. They are considered integral to the transmission and distribution of power within the Smart Grid. Substation automation systems (SASs), consisting of the station level, process level, and bay level, enable the integration of substations into the Smart Grid. The station level contains SCADA and some variations of HMI; the bay level comprises IEDs; and the process level comprises high-voltage primary devices (see Figure 4). IEDs are responsible for controlling circuit breakers which are responsible for the connection or disconnection of power lines. It is SCADA that controls the IEDs by sending commands to them.

Research Areas
Trust has been stated as an important reflection of the state of the substation, the execution of legitimate commands of devices within the substation and the dissemination of sensitive substation information [51]. To detect malicious nodes in the protection zones of substations, trust was implemented in wireless sensor nodes [52,53] by using their wireless range. It must be noted that most substations that exist, at the time of this paper, do not use wireless sensor nodes in protection zones for substations but rather use IEDs which are serial-based or Ethernet-based.
Another paper presented the measurement of trust between substations by the use of behavioural pattern analysis [54,55]. The analysis used machine learning and statistical tools, with logs from the security gateway of substations as the source of data. These logs contained communication between substations. A threat value was computed for each substation, and the trust value was taken as its inverse. However, the analysis is external to the substations, and therefore, an attack within a substation is likely to be over before an analysis is completed. Furthermore, most attacks originate from SCADA with legitimate commands, and these can go undetected.
Nasr et al. [56] built a system to secure SCADA from deontological threats. The system aims to limit the access of an attacker or a naive/unskilled operator to a critical substation. The performance of an operator in controlling remote substations and resolving alarms is considered in determining the operator's trustworthiness.
Rashid et al. designed a trust system for securing IEC 61850 GOOSE communication [57]. The untested trust system comprised modules that mimicked firewall policies, checked frame formats and access control.

Discussion
None of the papers tested their work against trust-related attacks, nor did they include a risk component in their models (see Table 4). The knowledge component of most of the papers did not include previous transactions or states and, as a result, trust was computed based on the values of parameters that were provided for computation. Only two papers implemented both direct and indirect trust.

Multi-Agent Systems (MASs)
A multi-agent system (MAS) is a system consisting of two or more intelligent agents [58]. An intelligent agent is described as an entity with four characteristics, namely social ability, reactivity, pro-activeness, and autonomy. Social ability requires that the agent should be able to interact with other agents. This is often mistaken as just the exchange of messages. However, it requires the ability to cooperatively interact and negotiate; in simple terms, agents should be able to converse. Reactivity requires that when the environment of the agent changes, the agent must react promptly and, based on its goals and those changes, take some appropriate action. Pro-activeness requires that the agent must dynamically change its behaviour to achieve its goals. Autonomy requires that agents must operate without any intervention from humans or any external system.
An MAS has an overall objective or goal, and the goals of each agent within the MAS must contribute to achieving it. There are three kinds of MAS architectures, namely centralized, decentralized, and hybrid. Centralized architecture has agents reporting to a central agent from whom the agents await instructions. Decentralized architecture has agents communicating with each other in a clustered manner, with each having the same level of priority. In the case of centralized architecture, the demise of the central agent spells the demise of the MAS. The optimization of MAS goals is challenging with a decentralized architecture because of the local nature of the connection between agents. The hybrid architecture combines the two previous architectures to utilize their advantages.

MAS Tools
The development of intelligent agents and MASs requires tools to make this feasible. The major software frameworks identified are presented in this section.

JADE
Java Agent Development Framework (JADE) is a software framework fully developed in the Java language [64][65][66]. JADE uses middleware to simplify the implementation of MAS, which ensures its implementation across a platform-independent distributed system. It also incorporates a set of graphical tools that are essential in remote configuration, debugging, and deployment. JADE is also free to use and is compliant with the specifications of the Foundation for Intelligent Physical Agents (FIPA).

ZEUS
Zeus [65] is an open source agent development platform developed with the Java language. It is FIPA-compliant and supports knowledge query and manipulation language (KQML). It has, however, been discontinued.

VOLTTRON
VOLTTRON [65,67] is a framework specifically designed for use in electrical power systems. It was developed by the Pacific Northwest National Laboratory (PNNL), and it is available in Python. It is a modular, open source platform that is intended to support transactions between networked elements over the grid.

Aglets
Initially developed at the IBM Tokyo Research Laboratory, Aglets is a mobile agent platform and library written in Java [68] that eases the development of agent-based applications. Aglets includes a stand-alone server called Tahiti and a library that enables the developer to build mobile agents, as well as include the Aglets technology within their applications.

JACK
JACK [69,70] is a commercially licensed agent-oriented development environment. It was developed in Java and acts as a Java extension that provides classes for implementing agent behaviour. It provides a graphical user interface for creating agents within projects. It is highly portable and platform independent.

MASs with Trust in the Smart Grid
The application of trust within MASs will have a positively impactful role on security within the Smart Grid. However, there has been extremely limited research in this area. The few studies which were identified are mentioned in this section.

Research Areas
Zhao et al. [71] implemented both direct-based and reputation-based trust mechanisms to create a modified version of the contract net protocol (CNP) [72]. The new trust-based CNP model, which was implemented in distributed MAS architecture, was used in Smart Grid scheduling to ensure improved decision quality which led to improved energy efficiency. With the direct trust mechanism, the time and rating value of the trustee were used to calculate the direct trust. These values are stored individually by each agent. The recommendation trust requires the trust rating of the trustee from all other agents in the MAS. The values generated by the trust mechanisms are fed into the CNP model, which is used to calculate which agent a task is delegated to. The model was tested via simulation using JADE; therefore, a real-world test was not made. This model has not been tested against trust model-related attacks.
In another paper, an MAS-based negotiation mechanism was implemented to combat jamming attacks in the Smart Grid power market [73]. Their work involved using the trust-based CNP [71] during local marginal price (LMP) [74] negotiations. Their work was simulated on a PJM 5-bus system [75], and it was not tested against any trust-related attacks.
Pereira et al. implemented a trust model in testing the resilience of control systems in power purchasing in cyber-physical systems [76]. The trust is used to calculate the cost of power to be sold by a producer agent to a consumer agent based on the trust level of the consumer. The model was tested using the JADE and GridLab-D power distribution and analysis tool [77].
In another study, trust was used in the secure operation of state estimation algorithms in networked microgrids [78]. Each microgrid within the network was modelled as an agent. Each agent implements direct trust when an agent provides state estimation values that are below a certain threshold. A malicious node is then isolated by the peer agents from the network. Neither the historical data on which the behaviour of a node was judged nor the tool used for simulation was specified. Their work was not tested against trust-related attacks to test its resilience.
Matei et al. [79] proposed a trust-based security mechanism for protecting the state estimation process against false data injection attacks by using a multi-agent filtering scheme. The agents assign a trust metric that is used to disregard messages from low-trusted agents. The mechanism involved a mathematical simulation and was not tested against trust-related attacks.
Cunningham et al. [80] wanted to see the impact of trust in a hierarchical agent-based socio-technical system. They ran a scenario replicating the 2003 Northeast Blackout, which was the largest blackout in the history of North America [81]. The system comprised the elements responsible for the handling of the blackout. Each element was identified as an agent. The trust value was a score based on how an agent successfully or unsuccessfully handled a task. Their work was simulated using JADE, and it was not tested against any trust-related attacks.
Hussain et al. [82] implemented trust in the inclusion of DERs in Smart Grid. The update of the trust score of an agent was dependent on the adherence to the Service License Agreement between it and other agents. Their work was simulated using the JACK-AOS [83] multi-agent platform and was not tested against trust-related attacks.
Borowski et al. [84] implemented reputation-based trust in an agent-based backup protection scheme that aims to mitigate the effects of faults and faulty agents in substations. Their work was simulated using NS-2 [85], EPOCHS [86] and PSCAD/EMTDC [87] but was not tested against trust-related attacks.

Discussion
In stark contrast to Section 5, MAS-based trust within the AMI and network communication priority areas does not exist, as shown in Table 5. Furthermore, the energy storage, electric transportation, and WASA priority areas are still uncharted territories when it comes to trust. There are only three papers each for DER and distribution grid and only two for the demand response and consumer energy efficiency areas. Clearly, this shows that a lot of work is required on trust in MAS-based environments in the Smart Grid. Table 6 shows that the generation, customer and service provider domains have yet to be explored, while the markets domain only has two papers. Three papers were focused on the transmission, distribution, and operation domains, while only one was focused on the operations domain and only two focused on only the distribution domain.
Only one paper includes the risk component in its trust model, as shown in Table 7. The knowledge component of most of the papers did not include previous transactions or states; therefore, trust was computed based on the values of parameters that were provided for computation. There were three papers that exclusively focused on direct trust, and one paper focused on indirect trust. Five papers focused on both types of trust.
All of the works were simulated, and JADE was the most used framework among the tools, as shown in Table 8. Other types of frameworks or applications were used, but they are not discussed because they were not specifically designed for MAS. Six of the papers implemented a decentralized MAS architecture, while three of them implemented a centralized architecture.

Motivation
Sections 5 and 6 demonstrate the scarcity of trust-related research within the Smart Grid. Even more so, Section 8 shows the scarcity of trust-related MAS research in the Smart Grid. Trust is essential, especially with respect to communication among IEDs and SCADA. As future work, it would be important for vendors to make IEDs autonomous in a security-centric manner by incorporating trust, so that they can play an impactful security role within substations. For existing IEDs that are resource-constrained, the integration of intelligent agents with IEDs could make this possible.
Trust among devices within the substation must be defined differently. The key parameters required to compute trust within devices are reliant on the communication among devices and SCADA. The type of communication can be a request, a command or a response from a device or SCADA. As such, the risk involved in accepting each communication that is received has to be computed to calculate trust. Furthermore, a history of communications must be stored and used as a reference to compute trust. Concerning the formulation of trust in Section 4.1, trust among IEDs (and also SCADA) can be seen as a tuple with some modifications, as shown in Equation (8).
Here, $m_{ij}$ is the message being analyzed before it can be trusted and accepted, $d_i$ is the agent device, $d_j$ is the subject device, $r_{ij}$ is the risk involved should the message be accepted or trusted, and $h_{ij}$ is the history of communication between $d_i$ and $d_j$.
A simple conceptual algorithm is presented in Algorithm 1, where $d_i$ receives $m_{ij}$ from $d_j$ and computes $T_{ij}$ based on $m_{ij}$. If $T_{ij}$ equals or exceeds the threshold value, $m_{ij}$ is received and acted upon; otherwise, it is dropped and an alert is raised. It must be noted that trust can be scaled on a continuum such that certain actions are taken when certain thresholds on that scale are exceeded [90]. Actions can range from sending warnings and raising alarms to, in the worst-case scenario, refusing to communicate with a non-trusted device.
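The thresholded decision and the graded responses described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation used in this work: the function name `decide`, the action labels, and the 0.5-wide bands are hypothetical, and only the rule that a message is accepted when trust meets the threshold comes from the text.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept and act on the message"
    WARN = "accept the message but send a warning"
    ALARM = "drop the message and raise an alarm"
    REFUSE = "refuse further communication with the sender"


def decide(trust: float, threshold: float = 0.0) -> Action:
    """Map a trust score to an action on a graded scale.

    Assumption: scores lie in [-1, 1] and the bands are 0.5 wide.
    Only the accept/drop split at `threshold` is taken from the text.
    """
    if trust >= threshold:
        # Accepted messages that sit close to the threshold still warn.
        return Action.ACCEPT if trust >= threshold + 0.5 else Action.WARN
    # Below the threshold the message is never acted upon.
    return Action.ALARM if trust > threshold - 0.5 else Action.REFUSE
```

With the default threshold, a score of 0.8 is accepted outright, 0.2 is accepted with a warning, -0.2 is dropped with an alarm, and -0.9 leads to refusal.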

Algorithm 1 Pseudo-algorithm for trust computation for agent device
$d_i$ receives $m_{ij}$ from $d_j$
Compute $T_{ij}$ based on $m_{ij}$
if $T_{ij} \geq$ threshold then
    Accept $m_{ij}$ and act upon it
else
    Drop $m_{ij}$
    Raise alarm
end if

In Figure 5, we present a proposed trust model that can be implemented in a substation environment. We define consequence as the measure of the damaging impact an action has on a substation. Consequence represents the risk involved when a current action/message is taken within the substation and requires some parameters from familiarity as input. Consequence requires knowing the state of the substation (environment state) and the dependencies (criticality) within the substation to calculate the risk or consequence of the action to be undertaken. We define familiarity as a measure of the consistency of actions/messages of different types between devices. Familiarity, in this situation, maps to the history of communication or existing knowledge in the trust formalization presented so far. According to Yonelinas [91] and Zhan et al. [92], the factors that influence familiarity are exposure intensity, exposure frequency, and similar exposure. Exposure frequency is defined as the frequency with which messages/actions are exposed to the devices. Exposure intensity is defined as the length of time in which the messages/actions are exposed to the devices. Similar exposure is the measure of the similarity of the messages/actions being exposed to the devices. The mathematical formulation of this model and the results are discussed in the remaining sections of this paper.
The environment state is computed using standard computations to ensure fault protection scenarios such as overvoltage, undervoltage, etc. [93]. Computation of the environment is out of the scope of this paper.

Criticality
To determine the dependency of devices within the substation, we need to provide a ranking of each device in terms of how critical it is within the substation. The higher the ranking, the higher the cascading effect within the substation. To achieve this, we utilized an artifact from the literature to create the criticality rankings for a substation [94]. According to the paper, for a list of $n$ devices, $D$ is defined in Equation (9).
For each $d_i$, a list of devices (including $d_i$) that are functionally dependent on $d_i$, $R_{d_i}$, is generated as shown in Equation (10). The reverse is also performed, where the list of devices that functionally influence $d_i$, $An_{d_i}$, is identified as shown in Equation (11). The intersection of $R_{d_i}$ and $An_{d_i}$, $I_{d_i}$, is identified using Equation (12). Within $m$ rounds, every device that has the same devices in $R_{d_i}$ and $I_{d_i}$ is given the same ranking, $l$ (see Equation (13)). This results in a set of criticality rankings, $L$, as shown in Equation (14). Devices in a single line diagram (Figure 6) were ranked as shown in Table 9, where devices starting with IED are the primary focus of this paper and the others can be ignored (details can be found in [94]).
Table 9. Criticality ranks of substation devices.
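The round-based procedure above resembles the level-partitioning step of interpretive structural modelling, and can be sketched as follows. This is a minimal Python illustration under stated assumptions rather than the method of [94]: the input format, the function names, the device names in `EXAMPLE`, and the choice that later rounds receive larger rank numbers are all hypothetical.

```python
def reachable(graph, start):
    """Return `start` plus every device reachable from it along graph edges."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def criticality_levels(dependants):
    """Assign a rank to each device by iterative level partitioning.

    `dependants` maps a device to the devices that functionally depend on
    it (assumed input format). Rank 1 is assigned in the first round.
    """
    devices = set(dependants) | {d for ds in dependants.values() for d in ds}
    reach = {d: reachable(dependants, d) for d in devices}          # R
    ante = {d: {x for x in devices if d in reach[x]} for d in devices}  # An
    levels, remaining, rank = {}, set(devices), 1
    while remaining:
        # A device is ranked in this round when its remaining reachability
        # set equals its remaining intersection set (R == R & An).
        current = {
            d for d in remaining
            if (reach[d] & remaining) == (reach[d] & ante[d] & remaining)
        }
        for d in current:
            levels[d] = rank
        remaining -= current
        rank += 1
    return levels


# Hypothetical topology: a transformer feeds two IEDs, which control breakers.
EXAMPLE = {"T1": {"IED_A", "IED_B"}, "IED_A": {"CB_A"}, "IED_B": {"CB_A", "CB_B"}}
```

Under these assumptions, `criticality_levels(EXAMPLE)` places the breakers in the first round, the IEDs in the second, and `T1` in the third, so devices on which more others depend receive larger rank numbers.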

Substation Model
We define the substation, $\Xi$, as a three-tuple entity in Equation (15), where $M$, $S$, and $N$ represent sets of clients, servers, and network devices, respectively (Equations (16)-(18)). $N$ interconnects $S$ and $M$. There exists a set of queries, $Q$, and a set of corresponding responses, $R$, defined in Equations (19) and (20). Periodically, $m_i$ sends $Q$ to $s_i$ and receives $R$ from $s_i$. Each $m_i$-$s_i$ pair may have a unique pair of $Q$ and $R$. A query and its associated response have either read ($\vartheta = 0$) or write ($\vartheta = 1$) operations. Queries and responses made by the attacker are defined in Equations (21)

Attack Scenarios
Within the substation, the ultimate goal of the attacker is to gain control of one or more elements of $S$ to cause an outage within the Smart Grid. In most cases, the IED is that device. We present two scenarios where one or more $s_i$ are compromised.

Compromised Network, A N
When $N$ is compromised, the attacker, acting as a malicious client or server, sends malicious $Q$ and/or $R$ to a device, or uses any compromised element in $M$ or $S$ to do so. Unfortunately, there are no publicly recorded incidents of this nature; thus, we use a literature-sourced scenario [96]. In this scenario, $A_N$, the attacker is oblivious to the substation's architecture and, as a result, requires cyber attacks to identify $S$ before transmitting malicious $Q$ and/or $R$. It is assumed that the attacker has already achieved this. Therefore, the possible attacks are those identified in $A_M$ and the following:
• Man-in-the-middle (MitM) attack: the malicious client (or server) impersonates a device to send a malicious query or response;
• Maliciously crafting packets: the malicious client (or server) sends a maliciously crafted query (or response) to drop a payload or trigger a buffer overflow;
• Query flooding: the malicious client (or server) exhausts a device's resources with a bombardment of malicious queries or responses.

Compromised Client
One notable device in $M$ is SCADA. Publicly documented attacks on utility companies have identified SCADA as the entry point, preceded by successful social engineering attacks. The most notable attacks are Stuxnet, BlackEnergy [97], and Havex [98]. In this scenario, $A_M$, the attacker takes control of $m_i$, turning it into a malicious client, before transmitting malicious queries. SCADA's compromise guarantees the attacker an architecture-wide view of the substation. Although rarely reported publicly, it is also possible for an attacker to compromise $s_i$, turning it into a malicious server that transmits malicious responses. Thus, the considered attack scenarios are:
• Reconnaissance: For $\vartheta = 0$, the compromised $m_i$ transmits queries to $s_i$ for all existing Modbus addresses.
• Loading Malicious Firmware: The compromised $m_i$ makes $s_i$ inaccessible by loading malicious firmware. This can be performed by using device-specific software within SCADA or by embedding malicious bytes in a query. The former option is not within the scope of this paper.
• Baseline Replay Attack: The compromised $m_i$ (or $s_i$) replays $Q$ or $R$ to a device after profiling the substation to avoid detection.
• Write attack: Without reconnaissance and for $\vartheta = 1$, a malicious query is sent to $s_i$ for all existing Modbus addresses. Another variant requires a completed reconnaissance attack: a malicious query, where $\vartheta = 1$, is sent to target an address of a specific $s_i$. It can also be executed after a baseline replay attack.

Modbus TCP
Modbus TCP [100], which is the TCP variant of Modbus [101], is used because its documentation is readily available and because it is used by modern and legacy substations (which form a significant percentage of substations worldwide [99]). Our selection is reinforced by the current literature centred on its security [102], vulnerabilities [103], attack mitigation [104,105], and utilization in testbeds [106,107]. Modbus TCP uses TCP port 502, and its implementation requires a client-server architecture. Modbus does not support unsolicited responses from servers. The Modbus TCP frame/packet consists of the Modbus Application Protocol (MBAP) header and the Protocol Data Unit (PDU), with their sizes and those of their components specified in Figure 7. The function code determines the request type that is sent to the server, and the server responds using the same function code. The address(es) being accessed and/or the value(s) written to or read from the server are specified in the data section of the PDU. The minimum Modbus request size is 12 bytes and the minimum response size is 10 bytes; the maximum size for both is 260 bytes. Table 10 shows the function codes selected for this work based on multiple datasets that were reviewed.
When $q_i$ or $r_i$ is transmitted, a set of features, $Z$ (Equation (23)), is created and used to compute exposure intensity, $E_i$, as shown in Equation (29), where $E_i \in [0, 1]$. An alert description, $\kappa_{E_i}$, is associated with the value of $E_i$. The description of each feature is available in the table of notations. The sender's current message's arrival time, $t_i$, the sender's previous message's arrival time, $t_{i-1}$, the sender's first message's arrival time, $t_0$, the sender's last message's arrival time, $t_n$, and the recipient's dispatched message's time, $t_{d_i}$, are required to define the features in Equations (24)-(28).
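The frame layout described in this section can be illustrated with a small parser. This is a minimal Python sketch under stated assumptions: the helper name `parse_modbus_tcp`, the returned dictionary keys, and the `length_ok` consistency check are hypothetical and are not taken from the model's implementation. The field sizes follow the published Modbus TCP layout (a 7-byte MBAP header followed by the PDU).

```python
import struct

MBAP_SIZE = 7         # transaction id (2) + protocol id (2) + length (2) + unit id (1)
MAX_FRAME_SIZE = 260  # maximum Modbus TCP frame size quoted in the text


def parse_modbus_tcp(payload: bytes) -> dict:
    """Split one TCP payload into MBAP fields, function code and data."""
    if len(payload) < MBAP_SIZE + 1:
        raise ValueError("payload too short for an MBAP header plus function code")
    transaction_id, protocol_id, length, unit_id = struct.unpack(
        ">HHHB", payload[:MBAP_SIZE]
    )
    # The MBAP length field counts every byte that follows it: unit id + PDU.
    declared_total = 6 + length
    return {
        "transaction_id": transaction_id,
        "protocol_id": protocol_id,  # 0 identifies Modbus
        "unit_id": unit_id,
        "function_code": payload[MBAP_SIZE],
        "data": payload[MBAP_SIZE + 1:],
        # Assumed check: fails, for example, when two frames share one payload.
        "length_ok": declared_total == len(payload) <= MAX_FRAME_SIZE,
    }


# Read Holding Registers (function code 3), start address 0, quantity 2: 12 bytes.
EXAMPLE_REQUEST = bytes.fromhex("000100000006010300000002")
```

Parsing `EXAMPLE_REQUEST` yields function code 3 with a consistent length, whereas concatenating two such requests into one payload makes `length_ok` false, which is one plausible way a length mismatch could be surfaced.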

Similar Exposure
When $q_i$ or $r_i$ is transmitted, a Moore machine, $\Upsilon = \{\rho, \sigma, \delta, \rho_0, \Psi, \lambda\}$ (Equation (30)), is defined to parse through it as follows (the definition of each symbol can be found in the table of notations):
• $\rho$, defined in Equation (31), represents a set of states where each state represents $q_i$ or $r_i$ and $\rho_0$ is the initial state. Accept states are not required due to the endless transmission of $q_i$ or $r_i$.
• $\sigma$, defined in Equation (32), is the input alphabet extracted from $q_i$ or $r_i$.
• $\delta$ is the transition function defined in Equation (33).
• A set of features, $\Psi$, is the output of $\lambda$ (Equation (36)).
• The output function, $\lambda$, defined in Equation (34), maps $\rho$ to $\Psi$. Equations (35)-(39) define the mappings.
Finally, $E_s$ is defined in Equation (43), where $E_s \in [0, 1]$, and based on the generated value, the associated $\kappa_{E_s}$ is generated.

Exposure Frequency
For each $q_i$ or $r_i$ that is received, $\Gamma$ is defined in Equation (44).
For each $q_i$ or $r_i$, each feature is defined as follows:
Exposure frequency, $E_f$, is finally defined in Equation (45), where $E_f \in [0, 1]$, and the corresponding $\kappa_{E_f}$ is generated.

Familiarity
Using all the exposures, we define familiarity,

Consequence-Based Definitions
In determining consequence-related values, any $q_i$ or $r_i$ where $\vartheta = 1$ that is transmitted within a non-permitted time or scenario results in a value of 1. For scenarios or periods where $\vartheta = 0$, the ratio of the criticality of the device (see Equation (14)) to the highest criticality ranking is used, except in exceptional cases.
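The rule above can be written as a small helper. This is a minimal Python sketch under stated assumptions: the function name and parameters are hypothetical, the exceptional cases mentioned in the text are not modelled, and treating a permitted write like a read (by returning the criticality ratio) is an assumption that the text does not confirm.

```python
def base_consequence(is_write: bool, permitted: bool,
                     criticality: int, max_criticality: int) -> float:
    """Return a consequence-related value in [0, 1] for one message.

    `criticality` is the rank of the targeted device and `max_criticality`
    is the highest rank in the substation (both assumed to be known).
    """
    if max_criticality <= 0 or not 0 <= criticality <= max_criticality:
        raise ValueError("criticality must lie between 0 and max_criticality")
    if is_write and not permitted:
        # Write operation in a non-permitted time or scenario.
        return 1.0
    # Assumption: all remaining cases use the criticality ratio.
    return criticality / max_criticality
```

For instance, a read aimed at a device of rank 2 in a substation whose highest rank is 8 gives 0.25, while a non-permitted write gives 1 regardless of rank.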

Environment Status Attack Value
The function $E(p) \to \{0, 1\}$ determines the state of $p$, where $p$ is a substation property. The environment flag, $\tau$, is evaluated as shown in Equation (48).

Replay Attack Value
Here, the count of replay, $y$, increases by 1 if $q_i = q_{i-1}$ or $r_i = r_{i-1}$. Therefore, using $y$, with $y_T$ being its threshold, the replay attack value, $\omega$, is calculated in Equation (49).
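The counting step can be sketched as follows. This is a minimal Python illustration under loud assumptions: the class name is hypothetical, the count is never reset because the text only states when it increases, and the clipped ratio of the count to its threshold is a stand-in for Equation (49), whose actual form is not reproduced here.

```python
class ReplayCounter:
    """Track consecutive identical messages from one sender."""

    def __init__(self, threshold: int):
        if threshold <= 0:
            raise ValueError("threshold must be positive")
        self.threshold = threshold  # plays the role of the replay threshold
        self.count = 0              # plays the role of the replay count
        self._previous = None

    def observe(self, message: bytes) -> float:
        """Update the count and return an assumed replay value in [0, 1]."""
        if self._previous is not None and message == self._previous:
            self.count += 1  # current message equals the previous one
        self._previous = message
        # Assumed stand-in for Equation (49): ratio clipped at 1.
        return min(self.count / self.threshold, 1.0)
```

With a threshold of 2, the second consecutive identical message yields 0.5 and the value saturates at 1 once the count reaches the threshold.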

Reconnaissance Attack Value
Using $\iota$, $\iota_{max}$ and $\iota_T$, for any $q_i$ or $r_i$, the reconnaissance ranking, $\xi$, is described in Equation (50).

Query Flooding Attack Value
Utilizing $\psi_{us}$ and $\zeta_{pt}$, the query flooding rating, $\chi$, is calculated in Equation (51).

Packet Manipulation Attack Value
Using $l_f$ and $id$, for any $q_i$ or $r_i$, the score of the datagram manipulation, $\varphi$, is estimated in Equation (52).

Consequence
Using $\tau$, $\omega$, $\xi$, $\chi$, and $\varphi$, the consequence, $C_i$, is calculated in Equation (53).

Trust
Trust, $T_i$ in Equation (54), is prescribed as an ordered set of values (tuple) with $\beta_i$ as the score of the trust. The values of $\kappa$ describe what negatively altered trust. $\beta_i$ is interpreted in Equation (55), where $\beta_i \in [-1, 1]$, $\theta_I$ is the original state prior to the calculation of trust, $\beta_i^{o}$ is the score of the previous trust, $\beta_i^{T}$ is the threshold of the trust score, $\mu$ represents the weight of forgiveness with $\mu \in [0, 1]$, and $\theta_\mu$ represents the condition/state of the forgiveness. The attributes of forgiveness are deferred to later work. With respect to Equation (4), $r_{ij}$ maps to $C_i$, $T_{ij}$ maps to $\beta_i^{o}$, and the additional parameters are linked to the three exposures, primarily because of the inherent information these exposures contain about those parameters.

Implementation
Prior to our model assessment, the following assumptions were made:
• The network communication of this substation is predictable because $Q$ is pre-set by engineers.
• The substation is in a pristine state; therefore, $\vartheta = 1$ queries are considered malicious.
• A determinate number of devices exists inside the network of the substation for the Modbus communication; hence, the set of IP-MAC pairs, $H$, is also bounded. These pairs can be categorised into two groups: the client group, $H_m$, and the server group, $H_s$. Additionally, $H_s$ is restricted from sending arbitrary responses. IP-MAC pairs outside these groups are considered malicious and grouped as $H_a$.
• Attacks that are neither Modbus nor IT-related are publicly disclosed by numerous CVE and CWE mitigation techniques; accordingly, they are considered outside the scope of this paper.
• The networking port used by a device for Modbus communication is restricted to the port number stated in the Modbus specification document.
• The attacker has penetrated the substation, achieved persistence, and successfully evaded detection.
Datasets with both malicious and normal traffic were critical to testing our proposed model effectively. Two datasets met our requirements: the EPM dataset [108] and the ATENA H2020 dataset [109]. We took the following steps:
• The reference features (Equations (23), (36) and (44)) for the exposures in Section 11.4 were generated using the benign traffic captures of the two datasets.
• Based on the established documentation of the datasets and careful analysis of every network capture file (pcap file) using Wireshark, $H_m$, $H_s$, and $H_a$ were identified.
• From $H_m$ and $H_s$, members that were compromised were grouped as H. The rest of the members were the target devices, $H_t$.
• For each dataset, we concentrated on communications that involved $H_t$ and generated sub-capture files containing their communication with the other groups.
Because H was limited in the datasets, we relied on three types of tests to cover the attack scenarios (see Section 11.2) mentioned in this paper:
• External Attack Test: Here, the existing grouping into $H_m$, $H_s$, and $H_a$ is maintained; hence, $H_a$ corresponds to the attack scenario $A_N$ described in Section 11.2.1. As expected, any $Q$ or $R$ sent from $H_a$ is flagged without probing into the Modbus frame (see the first definition of Equation (43)).
• Internal Attack Test: For this test, we have $H_m$ (Equation (56)) and $H_s$ (Equation (57)) to depict $A_M$ as described in Section 11.2.2. Any $r_i$ or $q_i$ sent from these groups is flagged accordingly.
• Internal Attack Test with IP-MAC Blacklisting: The test and groups are the same as in the internal test, with the exception that any device that has $\beta_i < \beta^{T}$ is added to a group of blacklisted MAC-IP pairs, $H_b$, and is blocked from further communication.
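The blacklisting behaviour in the third test can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Java application used in this work: the class and method names are hypothetical, trust scores are assumed to be computed elsewhere, and MAC addresses are compared case-insensitively as a convenience.

```python
class BlacklistGate:
    """Drop traffic from IP-MAC pairs whose trust fell below the threshold."""

    def __init__(self, trust_threshold: float = 0.0):
        self.trust_threshold = trust_threshold
        self.blacklist = set()  # plays the role of the blacklisted group

    def admit(self, ip: str, mac: str, trust_score: float) -> bool:
        """Return True if the message may be processed further."""
        pair = (ip, mac.lower())
        if pair in self.blacklist:
            # Once blacklisted, every later packet from the pair is dropped.
            return False
        if trust_score < self.trust_threshold:
            self.blacklist.add(pair)
            return False
        return True
```

Because a blacklisted pair is rejected before its trust score is consulted, all subsequent packets from that device are dropped regardless of their content, which is consistent with a larger number of flagged packets in this test.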
A Java application of the trust model was built to test the generated sub-pcap files. We used the pcap4j library [110] to parse the Modbus packets. We then mapped a trust scale from the literature [90] to the Multi-State Information Sharing and Analysis Center (MS-ISAC) Alert Information [111] (Figure 8). The threshold flags for the exposures were set to 0.6 and $\beta^{T} = 0$. All three tests were implemented for trust computation on the server side because both external and internal devices were used as attackers. Only the internal test was implemented for trust computation on the client side because only internal devices were used for attack. Furthermore, the IP-MAC blacklist was well demonstrated in the server-side test, so presenting it for the client side in this paper was deemed redundant.

Evaluation
This section discusses the results of the implementation described in Section 12. An abridged description of each dataset is given before the discussion of the results. Due to page limitations, a summary of the trust scores and alert descriptions/flags is provided only for sub-captures with unique characteristics.

EPM Dataset
The Modbus dataset was used to test our work, and the covert channel dataset of the EPM dataset was ignored [108]. This dataset has six benign capture files and five capture files that contain both benign and malicious traffic. The following attacks were implemented in the dataset: reconnaissance (characterization), command-and-control, moving malicious files, sending fake commands, and exploits. With the exception of reconnaissance, all attacks were labelled; thus, we were able to provide the percentage of attacks detected by our model for those attacks.
We were able to perform all three kinds of test from the server-side point of view. However, from the client-side point of view, only the internal tests were performed because, by default, clients do not send queries to external entities. Furthermore, the client-side tests were only performed on the command-and-control and moving-malicious-files attacks because those involved a client device, whereas the other attacks did not. On the server side, we identified all malicious client-side messages, and we did the reverse for the client side. Tables 11 and 12 show that our model detected all the labelled attacks. We explain our observations in more depth in the following sections.

External Attack Test towards Server
All packets sent from a member of H were flagged as an IP-MAC Mismatch; thus, they were also assigned a Complete Distrust score and a Red-Severe alert level (Figure 9a). Figure 9b shows whether the packet is Modbus-related (a benign or malicious query) or not. These attacks affected $E_s$.
In scenarios where attacks were from a member of $H_m$, the model flagged them and gave the appropriate scores. In the reconnaissance attack, a member of $H_m$ sent packets using a non-Modbus port; these were flagged accordingly and assigned Complete Distrust scores (Figure 10). This attack also affected $E_s$. Assuming that the substation environment was in a normal state, any query where $\vartheta = 1$ was flagged as an APT threat, as displayed in Figure 11; it also affected $E_f$.

Internal Attack Test towards Server
For the exploit, moving-files, and CnC (Figure 12) attacks, all packets from a member of H were given a Complete Distrust score and a Red-Severe alert level, which affected $E_s$. Characterization, however, showed that a member of H sent and replayed $Q$; these queries were given a Low Medium Distrust score and a Yellow-Elevated alert level, as shown in Figure 13. In the send-fake-command scenario, a member of H sends a write request and is flagged accordingly, as shown in Figure 14.

Internal Attack Test with IP-MAC Blacklisting towards Server
For these attacks, when the trust score of q falls below the threshold, a member of H becomes a member of H b ; this is visible in the exploit, CnC, and moving-files captures when an unknown port was used (Figure 15). From that point on, all packets bearing the blacklisted member's IP and MAC addresses were dropped, which explains the relatively larger number of flagged packets. The same applies to the send-fake-command scenario (Figures 16 and 17).
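
The blacklisting behaviour can be illustrated with a short sketch. It assumes, purely for illustration, that trust scores lie in [0, 1], that the threshold is 0.5, and that the packet which triggers the blacklisting is itself dropped. The class name `Blacklist` and its method `handle` are hypothetical and do not come from the paper.

```python
class Blacklist:
    """Tracks blacklisted (IP, MAC) pairs, playing the role of H b in the text."""

    def __init__(self, threshold=0.5):
        # Assumed threshold; the paper does not give a numeric value here.
        self.threshold = threshold
        self.blocked = set()

    def handle(self, ip, mac, trust_score):
        """Return "drop" or "accept" for a packet with the given trust score."""
        key = (ip, mac.lower())
        if key in self.blocked:
            # Every packet from a blacklisted device is dropped,
            # whatever its own trust score would have been.
            return "drop"
        if trust_score < self.threshold:
            # The first violating packet moves the device into the blacklist.
            self.blocked.add(key)
            return "drop"
        return "accept"
```

Because later packets from a blacklisted pair are dropped without further inspection, benign-looking packets from that device are also counted, which is consistent with the larger number of flagged packets reported above.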

Internal Attack Test towards Client
The attack scenarios in which actual client devices were used are the CnC and moving-files attacks; as such, those were the captures that we used to test our model. Figures 18 and 19 show that the malicious packets used a different port and, as such, were flagged by the model accordingly.

ATENA H2020 Dataset
In this dataset, ICMP flooding, TCP SYN flooding, Modbus query flooding, and MitM attacks were implemented. Of the four attacks, the Modbus query flooding and MitM attacks were focused more on Modbus. Regardless, the capture files for all four attacks contained some Modbus packets, so we focused on those. The dataset was grouped into three sets of capture files. The length of each capture file was either 30 min, 1 h, or 6 h. The attack duration was in series of either 1, 5, 15, or 30 min. We observed that only one read-access function code was implemented in this dataset; thus, we deactivated Equation (49) for consequence.

External Attack Test towards Server
The MitM captures show that requests from members of H a were detected and assigned IP-MAC Mismatch and Red-Severe, as shown in Figure 22. Furthermore, some requests from members of H m were flagged Length Mismatch and Red-Severe because they contained packets that had more than one Modbus frame (Figure 23). In the query flooding captures (Figure 24), our model flagged some requests from members of H m as Query Flooding of Known Read Query, EI: Below Threshold (E i affected) with a Blue-Guarded alert level. Investigations show that these were the result of requests delayed by query flooding attacks from members of H a . In the clean captures (Figure 25), there were two malicious requests from members of H m that contained multiple Modbus frames; thus, they were flagged with Length Mismatch and Red-Severe.
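
One way to detect a TCP payload that carries more than one Modbus frame is to compare the payload size with the length declared in the MBAP header. The sketch below assumes this is how a Length Mismatch could be computed; the paper does not state the exact rule, and the function name `check_length` is hypothetical. The header layout (transaction ID, protocol ID, length, unit ID) and the 260-byte maximum ADU size follow the Modbus/TCP specification.

```python
import struct

MBAP_HEADER_LEN = 7   # transaction ID (2) + protocol ID (2) + length (2) + unit ID (1)
MAX_ADU_LEN = 260     # 7-byte MBAP header + 253-byte maximum PDU


def check_length(payload: bytes) -> str:
    """Return "OK" or "Length Mismatch" for one Modbus/TCP payload."""
    # A valid frame needs the full header plus at least a function code.
    if len(payload) < MBAP_HEADER_LEN + 1 or len(payload) > MAX_ADU_LEN:
        return "Length Mismatch"
    _tid, _proto, length, _unit = struct.unpack(">HHHB", payload[:MBAP_HEADER_LEN])
    # The length field counts the unit ID and the PDU, i.e. everything
    # after the first six bytes of the header.
    expected_total = 6 + length
    if len(payload) != expected_total:
        # Extra bytes suggest a second (stacked) frame; missing bytes a truncated one.
        return "Length Mismatch"
    return "OK"


# Example request: Read Holding Registers (function code 3), address 0, quantity 10.
single = struct.pack(">HHHBBHH", 1, 0, 6, 1, 3, 0, 10)
stacked = single + single
```

Here `check_length(single)` accepts the frame, while `check_length(stacked)` reports a mismatch because the payload is twice as long as the header declares.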

Internal Attack Test towards Server
In the MitM captures, Figure 26 shows that a member of H a performed a baseline replay but did not perform any final attack and, as such, was not detected by the model. However, when a baseline replay was followed by a final attack, it was detected, and the model showed the packet and its multiple Modbus frames (Figure 27).
For the query flooding captures, there are unknown write requests that were flagged Unknown Write Query and Red-Severe, as shown in Figure 28. Figure 29 shows Q sent by a member of H a in less than the expected time period; these requests were finally flagged with Query Flooding Attack and Red-Severe once they exceeded ζ T pt . For the sake of simplicity, we set ζ T pt to five requests, even though it will vary from substation to substation. The first four requests were marked as Blue-Guarded and the fifth was marked as Orange-High. Figure 30 shows requests being flagged with Red-Severe because the packet manipulation attack by H a triggered φ.
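
The escalation of alert levels during query flooding can be sketched as a per-source counter of requests that arrive too quickly. This is an illustrative reading of the description above, not the authors' algorithm: the class `FloodDetector`, the parameter `min_period` (the expected minimum gap between requests), and the reset rule are assumptions. The limit of five and the Blue-Guarded, Orange-High, and Red-Severe levels are taken from the text.

```python
class FloodDetector:
    """Counts consecutive requests that arrive faster than expected."""

    def __init__(self, limit=5, min_period=1.0):
        self.limit = limit            # flooding threshold (five requests in the text)
        self.min_period = min_period  # assumed expected gap between requests, in seconds
        self.last_seen = {}           # source -> timestamp of the previous request
        self.rapid = {}               # source -> count of consecutive rapid requests

    def observe(self, source, timestamp):
        """Return an alert level, or None if the request arrived at a normal pace."""
        previous = self.last_seen.get(source)
        self.last_seen[source] = timestamp
        if previous is not None and timestamp - previous < self.min_period:
            self.rapid[source] = self.rapid.get(source, 0) + 1
        else:
            # A normally spaced request resets the counter (an assumption).
            self.rapid[source] = 0
        count = self.rapid[source]
        if count == 0:
            return None
        if count < self.limit:
            return "Blue-Guarded"
        if count == self.limit:
            return "Orange-High"
        return "Red-Severe"
```

With the default settings, four rapid requests are marked Blue-Guarded, the fifth Orange-High, and any further rapid request Red-Severe, mirroring the sequence reported for Figure 29.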

Internal Attack Test with IP-MAC Blacklisting towards Server
Figures 31-33 show that packets are dropped once β falls below β T and the device is placed in H b . As with the EPM dataset, this provides clarity on the attack that caused the violation. It also reveals that a compromised or attacking device can be well behaved before acting to impair the target device.

Internal Attack Test towards Client
In this test, there were no external devices posing as servers; thus, we performed the internal test only. Attacks towards the client side from a member of H s mostly affected E i . Further probing of the packets revealed that there were delayed responses to requests (Figures 34 and 35). We also observed replayed responses, which caused a large increase in query-response time. There were a few instances in which the Modbus frame size exceeded the maximum frame size (Figure 36).

Testing with Criticality Variation
The results presented so far used a value of i reflecting a low criticality-ranked device in Table 9. Since the most critical ranked devices have i = 1, which would generate a significant number of false alarms, we use the weights specified in Section 11.5 to adjust i to a suitable value. We implemented this on the client side to raise the necessary alarms for the critical IED. The results are more sensitive, and this can be used to promptly raise alarms for critical devices so that action can be taken on them. Comparing Figures 34 and 37, it can be seen that Figure 37 is more sensitive.

Discussion
The results from our work showed that it was possible to characterize the attacks in the datasets. Tests on the server side of the EPM dataset showed that E s and E f were the most affected because the attacks were more focused on TCP ports and Modbus read-only queries (see Figures 10-17). However, tests on the client side show that only E s was affected (see . The labelling of this dataset allowed us to determine the accuracy of our model, as shown in Tables 11 and 12. Detecting all the labelled attacks in this dataset increases our confidence in the model's detections on the ATENA H2020 dataset, even though that dataset is not labelled. On the server-side test, we observed that E i , E s , and E f were affected by the attacks, which shows how comprehensive the attacks were (see . However, on the client side, it was mostly E i that was affected by the attacks (see Figures 34-36).

Furthermore, we found that the dataset authors did not describe the mix of attacks. We noticed that other works (not on trust, however) do not give the specifics that our work provides when identifying attacks [112][113][114]. The comparison of our work with other trust models was also challenging because there was only one trust model [84] that was utilized in a scenario such as ours. Notwithstanding, that trust model computed trust based on the ratio of responses to requests, and it would fail against baseline replay attacks. It would also not detect attacks contained in responses.
We noticed that the ATENA H2020 dataset had the same transaction ID throughout, and such an implementation makes it easy for an attacker to include malicious packets because there are no similarities in the MBAP header. We recommend that transaction IDs be made sequential to enable the tracking of packets. We also recommend that each request use its own TCP session to mitigate TCP session hijacking, and that stacked Modbus PDU requests be dropped by an application's Modbus implementation.
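
The recommendation on sequential transaction IDs can be illustrated with a small tracker that checks, per connection, whether each request's transaction ID follows the previous one. This is a sketch of one possible check, not part of the proposed model: the names `transaction_id` and `TransactionTracker` are hypothetical, and wrapping from 65535 back to 0 is an assumption that follows from the field being 16 bits wide.

```python
import struct


def transaction_id(payload: bytes) -> int:
    """Extract the 16-bit big-endian transaction ID from an MBAP header."""
    return struct.unpack(">H", payload[:2])[0]


class TransactionTracker:
    """Checks that transaction IDs increase by one on each connection."""

    def __init__(self):
        self.last = {}  # connection key -> last transaction ID seen

    def in_sequence(self, conn, tid):
        """Return True if tid directly follows the previous ID on this connection."""
        previous = self.last.get(conn)
        self.last[conn] = tid
        if previous is None:
            return True  # first request on this connection, nothing to compare
        # The transaction ID is 16 bits wide, so it wraps around at 65536.
        return tid == (previous + 1) % 0x10000
```

A capture in which every request carries the same transaction ID, as observed in the ATENA H2020 dataset, would fail this check from the second request onwards.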

Conclusions
In this paper, we presented a categorized review of the literature related to trust within the Smart Grid. This categorization was guided by the trust definitions in the literature and by the NIST priority areas and conceptual domains. The review makes it clear that much work remains to be done in the field of trust within the Smart Grid, including efforts to implement it in a cognitive environment in which components can adapt to situations.
We also presented and tested a novel trust model for substations that detects attacks within the substation. We stated that familiarity and consequence are required to compute trust. We included the output of the novel risk assessment tool to compute the consequence of an attack on a substation. Using the model, we tested our work on two publicly available datasets using three kinds of tests. The first is the external test, in which purely attacker devices (not compromised substation devices) are assumed not to be part of the substation's network. The second is the internal test, in which all devices are assumed to be part of the substation's network. The final test is the internal test with IP-MAC blocking, which makes the same assumption as the second test but blacklists any device that sends a malicious message.
Our model also revealed the behaviour of the datasets, which has not been done by other trust models and has not been detailed as such in papers that used those datasets.

Future Work
We believe that our model can be embedded in a device's logic, extended to other OT-based protocols such as DNP3 (future work), and implemented in other critical infrastructures. Queries made out of order during troubleshooting will create false alarms; this is a weakness of our work and will be addressed in future work. We also aim to study the community computation of trust across multiple devices to manage trust-based attacks. The transferability of a device's trust from one substation to another is also marked for future work. Finally, we observed that a Modbus dataset containing network captures and attack scenarios specific to substations is needed, and this will also be addressed in future work.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The