Intrusion Detection System for the Internet of Things Based on Blockchain and Multi-Agent Systems

Abstract: With the popularity of Internet of Things (IoT) technology, the security of the IoT network has become an important issue. Traditional intrusion detection systems have limitations when applied to the IoT network because of resource constraints and complexity. This research focuses on the design, implementation and testing of an intrusion detection system which uses a hybrid placement strategy based on a multi-agent system, blockchain and deep learning algorithms. The system consists of the following modules: data collection, data management, analysis, and response. The National Security Lab-Knowledge Discovery and Data Mining (NSL-KDD) dataset is used to test the system. The results demonstrate the efficiency of deep learning algorithms when detecting attacks from the transport layer. The experiment indicates that deep learning algorithms are suitable for intrusion detection in an IoT network environment.


Introduction
The Internet of Things (IoT) can be considered an interconnected system, based on approved protocols, in which devices operating via the internet exchange information [1]. Recent advancements in IoT introduce the concept of smartness to devices, sensors, homes, streets and even cities. IoT is one of the leading and growing fields of modern computing and communication technology and has made major contributions to various domains, from the agricultural sector to vehicle automation. Nowadays, IoT is also referred to as the Internet of Everything (IoE) [2], as it deals with any type of connected device in daily life. It is anticipated that in 2025 the number of connected devices may reach up to 21.5 billion [2].
IoT combines several layers, including a network layer [3]. The design of the network layer is based on the traditional layers of Internet communication and is mainly responsible for transferring data packets between hosts. Moreover, the network layer is a complex and vulnerable part of the IoT architecture, which leads to various security issues. Nevertheless, several security frameworks are in place to address the security issues [4]. These frameworks must be installed in the IoT architecture and/or on the devices in order to resolve security threats effectively. Unfortunately, most security frameworks require considerable computational power, as well as storage [5]. However,

History of Intrusion Detection System
Generally, an IDS includes both software and hardware mechanisms and is responsible for identifying malicious activities by monitoring network environments and systems. In other words, an IDS is used for detecting cyber-attacks and providing immediate alerts. Overall, an IDS acts as a safeguard for networks and systems. It is normally deployed after the firewall and is used with an intrusion prevention system. IDS is not a new term in IoT research on security and privacy. A significant number of publications have appeared in recent years. Cyber security experts have been concerned about the security and privacy of IoT environments for some time. This has led to the introduction of the concept of embedding IDS into IoT architectures and devices to deal with cyber-attacks [11][12][13]. Researchers are mostly interested in inventing new mechanisms and models to counter intruders in conventional network protocols. However, traditional IDS mechanisms are incompatible with IoT devices connected through IPv6 and other complex network structures [14]. More comprehensive research on the use of machine learning methods is essential for IDS to secure and protect privacy in IoT.

Classes of IDS
IDS is classified into two main categories as follows:
1. Host-based IDS
2. Network-based IDS
Host-based IDS detects intrusion behavior by scanning log and audit records. This kind of IDS is usually used on important hosts to protect the host security from all directions. The advantage of the host-based IDS is that it provides more detailed information, lower false alarm rates, and lower complexity than network-based IDS. However, it reduces the efficiency of the application system and relies excessively on the log data and monitoring capability of the host [15].
Because of the characteristics of the IoT and because many IoT devices can be connected to the network, network-based IDSs need attention. Network-based IDS can detect abnormal behavior and data flow in the network in order to find potential intrusions. It does not change the host configuration and does not affect the performance of the business system. Even if the network IDS fails, it will not affect normal business operation. A problem is that a network-based IDS only checks the network segment to which it is directly connected and does not look at other network segments. It is also difficult to process encrypted sessions with network-based IDS [16].

Detection Methods
Based on the nature of various intrusion attacks (Figure 1), intrusion detection methods are classified into four major categories: signature-based methods, specification-based methods, anomaly-based methods and hybrid methods [16].
Signature-based methods first scan the data in the network and compare it with a signature database. If the scanned data is found to match the features in the signature database, the data will be treated as an intrusion. The advantage is that this approach can accurately determine the type of attack. It is relatively convenient to use, and the demand for resources is comparatively small.
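The matching step of a signature-based method can be illustrated with a minimal sketch. The signature names and byte patterns below are hypothetical examples chosen for illustration; they are not taken from the proposed system or from any real signature database.

```python
# Hypothetical signature database: attack name -> byte pattern to look for.
SIGNATURES = {
    "sql_injection": b"' OR '1'='1",
    "path_traversal": b"../../",
}


def match_signatures(payload: bytes, signatures: dict[str, bytes] = SIGNATURES) -> list[str]:
    """Return the names of all signatures whose pattern occurs in the payload."""
    return [name for name, pattern in signatures.items() if pattern in payload]


benign = b"GET /index.html HTTP/1.1"
malicious = b"GET /login?user=admin' OR '1'='1 HTTP/1.1"

print(match_signatures(benign))     # no signature matches
print(match_signatures(malicious))  # the SQL injection signature matches
```

Because a match names the signature that fired, the type of attack is known immediately, which is the advantage described above. Traffic that matches no stored signature passes unnoticed.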
Specification-based methods require the system administrators to set rules and thresholds in advance. The IDS checks the status of the current system and network against the rules and thresholds set by the administrators [17]. If a threshold is exceeded or a rule is violated, the IDS will detect an abnormal situation and act accordingly.
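As a rough sketch of this idea, the rules can be represented as upper limits per monitored metric. The metric names and threshold values are assumptions made for this example only.

```python
# Hypothetical administrator-defined thresholds (upper limit per metric).
RULES = {
    "packets_per_second": 500,
    "failed_logins_per_minute": 5,
}


def violated_rules(status: dict[str, float], rules: dict[str, float] = RULES) -> list[str]:
    """Return the metrics whose observed value exceeds the configured threshold."""
    return [metric for metric, limit in rules.items()
            if status.get(metric, 0) > limit]


normal = {"packets_per_second": 120, "failed_logins_per_minute": 1}
suspicious = {"packets_per_second": 900, "failed_logins_per_minute": 2}

print(violated_rules(normal))      # no rule is violated
print(violated_rules(suspicious))  # the packet-rate threshold is exceeded
```

The quality of detection depends entirely on how well the administrator chooses the rules and thresholds.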
Anomaly-based methods identify abnormal patterns by comparing observed traffic patterns with normal behavior. The advantage of using this method is that it enables the detection of new and unknown intrusions. However, the primary limitation is that the method tends to result in high false positive rates. Research is now focusing on applying machine learning algorithms in anomaly-based intrusion detection methods to improve the robustness of this kind of method. By employing machine learning algorithms, anomaly-based intrusion detection methods can monitor ongoing intrusion footprints and compare them with existing datasets in order to be alert to potential future attacks.
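A simple statistical variant of anomaly detection can be sketched as follows. The z-score test is used here only as an illustrative stand-in for the machine learning models discussed in this paper, and the baseline values and thresholds are made up for the example.

```python
from statistics import mean, stdev


def is_anomalous(baseline: list[float], observation: float, z_threshold: float = 3.0) -> bool:
    """Flag an observation lying more than z_threshold standard deviations from the baseline mean."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        return observation != mu
    return abs(observation - mu) / sigma > z_threshold


# Hypothetical packet rates observed during normal operation.
baseline_rates = [98.0, 102.0, 101.0, 99.0, 100.0, 103.0, 97.0]

print(is_anomalous(baseline_rates, 150.0))                   # far from the baseline
print(is_anomalous(baseline_rates, 104.0))                   # within the default tolerance
print(is_anomalous(baseline_rates, 104.0, z_threshold=1.5))  # stricter threshold flags it
```

The last call shows the trade-off mentioned above: a stricter threshold catches more deviations, but also flags traffic that is only slightly unusual, which raises the false positive rate.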
Hybrid methods refer to the use of any combination of the above-mentioned detection methods in the same IDS. This approach can help to overcome the shortcomings of a single method, thereby enhancing the reliability of the entire IoT system. However, the obvious drawback is that the entire IDS will become very large and complex. This will make the whole system more difficult to operate and will require more resources. Especially when there are many protocols involved in the IoT system, the intrusion detection process will demand considerable resources and time.

Placement Strategy
Centralized IDS (Figure 1) consists of several monitoring nodes that observe the behavior of the host or the network. The data (alerts, system logs or network logs) are then shared with the main server or node for further analysis. As the system becomes more complex, the central server load will increase, reducing system performance and potentially even causing security risks. In the IoT environment, most of the data interaction occurs in the network layer and the perceptual layer, and the distribution of the IoT facilities is complex and scattered. The centralized IDS therefore cannot effectively detect intrusions on IoT devices [18].
Distributed IDS (DIDS) consists of several modules, each of which is responsible for different tasks. They usually share the monitoring tasks. For example, in the IoT environment, the threats faced by the network layer, the perception layer, and the application layer are not the same, so the corresponding intrusion detection modules need to be different as well. This can greatly increase the security of the system, without the need for excessive data storage and sharing among the modules. The primary advantage is high scalability, which means it can be adapted to future IoT environments. The disadvantage is that the resource consumption and communication costs can be high [18].
Some hybrid placement strategies have been presented in recent research. The first strategy for hybrid placement separates the network into different regions. In each region, one node has an intrusion detection function and monitors the other nodes in the same cluster. These nodes are also responsible for monitoring the data packets coming from their neighbor nodes. If any anomalous behavior is found, it is concluded that the neighbor nodes were compromised by attackers. The border router collects information from the nodes and makes the final decision about the intrusion. It should also be noted that hybrid IDS with distributed components is more accurate than centralized IDS. However, the hybrid system has large resource demands [19].

Previous Research on IDS for IoT
Improving the IDS of IoT [17] is a major focus of this research. This section provides an overview of the improvements and developments of IDS for IoT. These can be classified based on target threats, placement strategy and detection method. An unsupervised, hybrid intrusion detection framework to recognize selective-forwarding and sinkhole attacks in IoT was proposed by Bostani in 2017 [20]. It uses a hybrid placement strategy in which an anomaly-based central agent is combined with several specification-based components, which share their own detection information with a centralized intrusion detection module in order to detect anomalies. It applies an unsupervised optimum-path forest (OPF) algorithm with a MapReduce architecture.
Yulong et al. proposed a new intrusion detection method for the IoT environment based on an automata model [21]. This method detects three types of IoT attacks: false-attacks, jam-attacks and reply-attacks. It is an extension of labelled transition systems. The placement of this IDS is centralized, since the information gathered by network nodes is sent to the intrusion detection module and helps to build the event databases. The IDS uses an event analyzer, based on a specification-based method, to identify intrusions. By comparing the collected activity flows, this IDS can efficiently detect jam-attacks, false-attacks, and reply-attacks in IoT networks.
Kapitnov et al. [22] proposed a method, also in 2017, that combines multi-agent systems and blockchain technology to improve communication among the autonomous agents involved in unmanned aerial vehicles (UAVs). They proposed an architectural protocol for communication through Ethereum blockchain technology in peer-to-peer connected networks. The result is a protocol that is potentially compatible with autonomous agents. The proposed protocol ensures security of the communication process and assists in anticipating the robustness of variable working states. This has led to further investigations in combining blockchain technologies with multi-agent systems.
Calvaresi et al. [23] stated that integration of blockchain technology (BCT) and MAS can achieve the distribution of trust and can remove the necessity for trusted third parties (TTP), which are the main single points of failure. In 2019, Calvaresi et al. proposed implementing mechanisms for transparent reputation management via BCT, which addressed the challenge of enforcing trust in MAS. They designed and implemented a system integrating MAS, based on the Java agent development framework (JADE), and BCT, based on Hyperledger Fabric. By extending the management of the agent identity to the membership service of Hyperledger Fabric, a trustworthy community is created by the system. Moreover, the association agent is offered services using the ledger. Mechanisms are implemented to transparently compute and store agents' reputations based on the communication patterns between the agents and the evaluations of the interactions. The system is evaluated with various test cases. Agent behaviors consist of autonomous and user-dependent behaviors. Smart contract mechanisms are combined with computing and monitoring the agents' reputation based on agent behaviors. Because of the available Hyperledger Fabric implementation, the system is robust and scalable. Testing can be done easily and quickly through a classic command-line interface, and a graphical interface eases the interactions. Although a prototype integrating BCT and MAS was implemented, technical limitations still exist. These include preventing single points of failure of the system by outlining a reasonable mapping between real entities and distributed components of the underlying blockchain technology, enforcing cryptographic solutions to enhance security and privacy, verifying the precision of the smart contract implementation, and approving the blockchain and agent technology in real-world systems. There are also ethical considerations for BCT and MAS integration in real-world scenarios.
There is a direct relationship between the immutability property of the ledger and the transparency of reputation management and achieving an application area that enhances the system users' power and supports trustworthy interactions. Deliberate or accidental malfunction of the system can compromise the users' privacy, and the framework relies completely on the smart contracts. Although the BCT-enabled MAS would preclude intermediaries in conflict resolution, the reliability of the software and the verification process in the BCT environment is an open question.
In 2018, Calvaresi et al. [24] reviewed multi-agent systems and blockchain technologies. Their review concluded that the challenges in using blockchain technology are similar to the previously described problems in multi-agent systems for different application domains. Various factors were discussed, such as motivations and requirements, mechanisms and application domains, strengths and limitations. The authors emphasized that a comprehensive evaluation of these systems is necessary, not just a discussion of possible advantages. However, they also concluded that combining blockchain technology with multi-agent system-based platforms is serviceable and can be a robust mechanism for gathering information and data from a very large number of associated nodes in a network, e.g., IoT. Multi-agent systems combined with blockchain technology could cope more effectively with the challenges of sophisticated interconnected environments.
In a multi-agent system, agents may often get stuck in a situation where reaching 'consensus' becomes impossible. Identifying conflicts of interest among agents and resolving the resulting stagnation is not a trivial task. The model proposed by [25] examines cooperative tendencies by considering the probabilities of cooperation for each agent in the environment. The collective payoff can be augmented by candidate policies (identifying the candidate action sets). The Nash bargaining solution (NBS) algorithm is then used to select the top candidate action sets. The solution to reduce these dilemmas is expected to decrease memory footprint and 'convergence time'. However, the current implementation is not yet suitable for high-dimensional environments with extensive state spaces.
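The selection step can be illustrated with the textbook form of the Nash bargaining solution, which chooses the outcome that maximizes the product of each agent's gain over its disagreement payoff. The sketch below is not the model of [25]; the candidate action sets, payoffs and disagreement point are hypothetical values for a two-agent example.

```python
from math import prod


def nash_bargaining_choice(candidates: dict, disagreement: tuple):
    """Pick the candidate whose payoff gains over the disagreement point have the largest product."""
    # Only outcomes that leave no agent worse off than disagreement are feasible.
    feasible = {
        name: payoffs for name, payoffs in candidates.items()
        if all(u >= d for u, d in zip(payoffs, disagreement))
    }
    if not feasible:
        return None
    return max(
        feasible,
        key=lambda name: prod(u - d for u, d in zip(feasible[name], disagreement)),
    )


# Hypothetical candidate action sets with (agent 1, agent 2) payoffs.
candidates = {
    "both_cooperate": (4.0, 4.0),
    "agent1_defects": (6.0, 1.0),
    "agent2_defects": (1.0, 6.0),
}
disagreement = (1.0, 1.0)  # payoff each agent receives if no agreement is reached

print(nash_bargaining_choice(candidates, disagreement))
```

With these numbers, mutual cooperation gives a Nash product of 3 × 3 = 9, while either defection leaves one agent with no gain and a product of 0, so the cooperative action set is selected.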
In 2018, Diro et al. proposed an IDS for IoT in which deep learning is utilized for anomaly detection [26]. This was demonstrated to be effective in identifying IoT fog network attacks compared to conventional IDS.
Air traffic flow management (ATFM) systems play a significant role in efficiently managing air traffic control. For high-precision decision making, a high-fidelity and mathematically reliable model of ATFM is often required, but this is quite difficult to design due to the complex nature of airspaces. To address the issue, Ta et al. [27] developed an intelligent, multi-agent and distributed ATFM solution named BlockAgent, which is based on blockchain and reinforcement learning. The system comprises three parts: the local layer, the blockchain layer and the global optimization layer. The local layer allows local stakeholders placed in regional bases. The blockchain layer provides distributed and decentralized coordination mechanisms to manage and coordinate a range of diverse information related to airports, aircraft, varied and dynamic weather conditions, route optimization, etc., most often controlled from a central location in contrast to the usual multi-agent structure of ATFM systems. The global optimization layer enables the execution of smart contracts to aggregate information that has been locally obtained. The system uses reinforcement learning to reduce the delays experienced by aircraft both in the air and before take-off. BlockAgent outperformed traditional ATFM systems in handling delays when tested in a limited setup. The authors aim to enhance the system to make it viable for large-scale implementation.
Casado-Vara et al. [28] presented a hybrid IoT architecture that allows decentralized data management via blockchain. This architecture includes a computation distribution under the edge computing paradigm, which enables optimizing the connection between IoT and blockchain. Another part of this hybrid architecture is data management. This consists of a big data ecosystem that simplifies the management of large amounts of data in the blockchain. A new algorithm based on game theory is proposed and applied to the data collected by IoT devices to enhance data quality and false data detection. The existing end-to-end architectures are optimized by the proposed hybrid architecture.
In 2019, Li et al. proposed a method for IoT data feature extraction and a new IDS for smart cities based on deep migration learning [29]. By using deep migration learning, the proposed IDS can overcome the lack of suitable training sets and resolve sample misclassifications and the spatial constraints of data clustering problems, optimizing the intrusion detection model and improving efficiency. Although the experiments show that the proposed system performs better than conventional methods and reduces the clustering time effectively, the classification accuracy decreases when compressing.
Another deep learning-based model for intrusion detection in networks was put forward by Le et al. in 2019 [30]. A framework was developed for a feature engineering process, selecting intrusion features intelligently. The outcome of this stage was a dimension reduction resulting in the important features for intrusion detection, a subset of the original feature set. The next part of the process was building multiple IDSs and training them using the selected features. The learning process took place using several deep learning methods, e.g., recurrent neural networks (RNN). The authors reported a very high classification accuracy over two datasets containing intrusion footprints.
Electronics 2020, 9, 1120
Another notable work, presented by Arshad et al. in 2019, proposed a new intrusion detection framework for resource-constrained IoT devices [31]. This framework aims to distribute intrusion detection between the IoT devices and the edge router. IoT devices are used as IDS nodes to scan network packets so that the edge router can have an overall view of the network. The limitation of this framework is that the edge router will receive the raw packets from the host node, which may contain sensitive information.
Anthi et al. in 2019 proposed a three-layer IDS design to detect real-time malicious behavior in home IoT devices [32]. This architecture focuses on classifying the type and profile of the normal behavior of each IoT device in order to detect and classify attacks. This IDS architecture has been evaluated using network activity data from a real testbed and scripts that launch multilevel attacks representing the behavior of an attacker. Future work should investigate how the extensive need for feature engineering and data labelling could be circumvented.
A survey conducted by Chaabouni et al. in 2019 indicates that machine learning techniques are successful in terms of network security and privacy [33]. The authors concluded that machine learning based frameworks for intrusion detection showed very promising results (up to 99% detection accuracy with a 0.01% false positive rate). They advocate further experimentation, combining IDSs with novel machine learning models.
The research summaries above show current progress in the development of IDSs for IoT. In the following section, the technological aspects of IDS models for IoT are further explored [34].

Technology
This section explains the technology behind our proposed IDS, involving a multi-agent system and the application of blockchain technology.

Multi-Agent System
In the fields of artificial intelligence (AI) and computer science, an agent is described as computer software that exhibits intelligent behavior. The term "multi-agent" generally refers to MAS or multi-agent technology (MAT). A MAS is a set of multiple agents and is an extension and application of distributed artificial intelligence (DAI) [35]. Complex systems can be simplified and modularized using a MAS. The agents are responsible for coordination and communication through their respective tasks. Being thoroughly autonomous and form-independent, every agent can be an individual and a cluster in the MAS [24,36,37]. Although the agents may be developed in different languages and have divergent design patterns, they should use standard communication modes that enable mutual communication, which is not present in a single-agent system. Each agent has its own set of properties and operation rules. Based on these rules, the agents execute tasks during the operation of the whole MAS. The cooperation between agents in MASs helps humans to resolve some complex phenomena. Agents should have the following four basic characteristics: autonomy, responsiveness, initiative, and sociality. These are mainly manifested in the intelligence and agent capability of the system. Intelligence refers to the ability of an application system to use reasoning, learning, and other techniques to analyze and interpret all kinds of information and knowledge [38]. Agent capability refers to the ability of an agent to perceive messages from the outside world and react autonomously according to its own knowledge.
Savaglio et al. [35] provided a survey reporting the most relevant contemporary contributions in agent-based computing (ABC) for the IoT in order to assess the suitability of ABC for IoT development. Although the authors stated that, in terms of computing, storage, and communication, there are no technological limitations which would prevent the full realization of the IoT ecosystem, its multifaceted development issues still require further attention. The analysis indicated that by applying ABC, both SO and IoT system modeling and programming of interoperability, automaticity, and distributed intelligence can be done to various degrees. This modeling has advantages over other approaches, such as object-oriented, service-oriented, and component-oriented computing paradigms. Furthermore, validation of multiple design choices can be performed by agent-based simulation before the deployment phase. In synergy with other paradigms such as cloud and edge computing, it can be performed systematically and efficiently by agent-based methodologies. The survey confirmed that an agent-based development approach represents an effective choice for many SOs and IoT systems.
In this research, the proposed system uses the MAS platform JADE [39], which provides a simple but powerful task execution and composition model. It uses asynchronous message passing for communication between agents. JADE is based on Java, so it is platform-independent. JADE can be used in different Java-oriented environments such as Android devices and J2ME-CLDC MIDP 1.0 devices. Moreover, JADE allows a configuration of networks characterized by partial connectivity.
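The idea of asynchronous message passing can be sketched without JADE. The example below is written in Python for brevity and does not use the JADE API (JADE itself is a Java framework). The Agent class, the AclMessage fields and the agent names are hypothetical, loosely modelled on the performative, sender, receiver and content fields of a FIPA-ACL message.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class AclMessage:
    performative: str  # e.g. "inform" or "request"
    sender: str
    receiver: str
    content: str


class Agent(threading.Thread):
    """An agent with its own mailbox; messages are handled on the agent's own thread."""

    def __init__(self, name: str):
        super().__init__(name=name, daemon=True)
        self.mailbox: queue.Queue = queue.Queue()
        self.handled: list[AclMessage] = []

    def send(self, other: "Agent", performative: str, content: str) -> None:
        # put() returns immediately: the sender does not wait for the receiver.
        other.mailbox.put(AclMessage(performative, self.name, other.name, content))

    def run(self) -> None:
        while True:
            message = self.mailbox.get()
            if message is None:  # sentinel used to stop the agent
                break
            self.handled.append(message)


collector = Agent("collector")
detector = Agent("detector")
detector.start()

collector.send(detector, "inform", "flow-1")
collector.send(detector, "inform", "flow-2")
detector.mailbox.put(None)  # ask the detector to stop after the queued messages
detector.join()

print([m.content for m in detector.handled])
```

The sender only places a message in the receiver's mailbox and continues with its own work; the receiver processes the mailbox in its own time, which is the decoupling that asynchronous message passing provides.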

Deep Learning
In recent years, the term "deep learning" has gained popularity among researchers who work with artificial neural network (ANN) based machine learning techniques. Like ANNs, deep learning is inspired by the general structure and functionality of the brain. Deep learning involves a collection of multiple stacked ANN layers. It includes both input and output layers, building a layered data flow network. Deep learning is also described by the term deep neural network (DNN) or deep structured learning [40]. The key difference between ANN learning and deep learning is in the hidden layers. The hidden layers are placed hierarchically between the input and output layers. Deep learning is more robust than a typical ANN as it processes and computes the given inputs to a further extent than an ANN. In the real world, the volume of data is growing, leading to more data complexity. The intrinsic character of deep learning is the ability to learn from the previous layers or from a large set of data. This feature makes deep learning a very strong and powerful candidate in the selection of machine learning models, especially in classifying data of unlabeled sample datasets [41].
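The layered data flow described above can be made concrete with a minimal forward pass through stacked layers. The layer sizes, weights and activation functions below are hand-picked for illustration; they are not trained and are not the network used in this research.

```python
import math


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b), computed row by row."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def relu(z):
    return max(0.0, z)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative, untrained weights: 3 inputs -> 2 hidden -> 2 hidden -> 1 output.
LAYERS = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1], relu),
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),
    ([[1.0, 1.0]], [-0.5], sigmoid),
]


def forward(features):
    """Pass the input through each layer in turn; each layer consumes the previous layer's output."""
    activations = features
    for weights, biases, activation in LAYERS:
        activations = dense(activations, weights, biases, activation)
    return activations


score = forward([1.0, 0.0, 1.0])[0]
print(round(score, 4))
```

Each hidden layer transforms the output of the layer before it, which is what distinguishes a deep network from a single-layer model. In practice the weights are learned from data rather than set by hand.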

Multi-Agent Reinforcement Learning
Reinforcement learning (RL) methods [42] enable a computer to complete a task through continuous training and learning from scratch. In recent years, since AlphaGo, developed by DeepMind, excelled in complex tasks, RL has shown great research potential [43]. RL has a long history, as Sutton et al. began to study RL as early as 1979. The method was initially used in the field of intelligent control, but with the further development of science and technology, RL has now been widely applied in autopilots, voice interactions and many other fields. RL is of great importance in the area of machine learning and computational intelligence.
The sequence of actions of multiple agents regulates the environment of multi-agent systems. If an agent fails to perceive the interrelation between its own actions and environmental changes, it may generate a non-standard Markov environment. As the approach of RL is defined by the design patterns of MASs, concurrent isolated RL (CIRL) can be employed where a single agent does not interact with other agents and only relies on an unsupervised learning technique. Interaction learning of multiple agents is achievable by applying interactive RL. Single-agent RL only needs to consider temporal credit assignment problems, while multi-agent RL needs to consider structural credit assignment [44].

Blockchain
Decentralization is the main motivation behind blockchain technology [45]. The distributed and transparent nature of the blockchain ledger means that the failure of one node does not affect the whole system. Blockchain changed the framework of the transaction network from a star to a point-to-point (P2P) layout. This transformed framework allows two parties to deal with each other directly with the help of encryption and security based on code and algorithm protection. Since the parties engaged in the transaction system are only required to trust the employed algorithm for establishing mutual trust, there is no necessity for knowledge about the trustworthiness of the parties [46]. Moreover, the framework does not require any security endorsement by third-party agents as the algorithm is fully responsible for all kinds of authentication.
Ethereum [47] and Hyperledger Fabric [48] are the two most popular blockchain application development platforms. Their underlying technologies are the same. The fundamental difference between Ethereum and Hyperledger lies in the way they are designed and in the target users. Ethereum provides the Ethereum virtual machine (EVM), smart contracts and public blockchains, and is targeted at applications that are used by general consumers. Hyperledger Fabric has a very modular architecture that is more suited to business applications. It provides great flexibility and applies business logic more freely. The goal is to simplify work and trade processes using blockchain technology, that is, to solve the problem of inter-firm credit.

SESS (Smart Efficient Secure and Scalable) System
This paper uses Design Science Research and Science and Engineering Research methodologies [49]. By combining science research with engineering design and implementation, the feasibility and value of this newly proposed intrusion detection system framework can be explored.

System Perspective
This system will be used to discover attacks from current network traffic and monitor the details of the IoT network condition. A web portal will be used for visualization of detection reports and configuration of response rules. The system consists of five parts: a blockchain smart contract module, a detection and analysis module, a response module, a data process module and a collection module. The system will need to communicate with IoT devices and monitor the dataflow of these devices. Since the system needs data to detect malicious behavior in the IoT network, it requires a database for data storage. The system will add and modify the data, while the web portal can only read the data. All the database communication will occur over the Internet/Intranet.

System Functions
The smart, efficient, secure, and scalable (SESS) system [34] will enable the IoT network administrator to monitor IoT devices based on network traffic data. The area that will be monitored is based on criteria defined by the administrator. There are several options for the administrator to adjust the monitoring process of the system collection module. Collected data will be processed by the data process module, which performs the first attack detection based on feature classification. These data will then be divided into two parts: an unidentified dataset and a training dataset. These two datasets will be moved to the detection and analysis module. The training dataset will be used to train the detection agent. The unidentified dataset will be used to analyze the performance of the model. All results will be sent to the response module, which acts based on configuration rules set by the administrator. All processes will be stored in the blockchain ledgers of hosts which have a SESS module.
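The division into a training dataset and an unidentified dataset can be sketched as follows. The text does not specify the criterion for the split, so this sketch assumes that records which the first detection step can label go to the training dataset and the remaining records go to the unidentified dataset. The rule in first_pass_label, the field names and the sample records are all hypothetical.

```python
def first_pass_label(record: dict):
    """Hypothetical first detection step: return a label, or None if the record is undecided."""
    if record["failed_logins"] > 10:
        return "attack"
    if record["failed_logins"] == 0 and record["bytes"] < 1000:
        return "normal"
    return None


def split_records(records: list) -> tuple:
    """Divide records into a labelled training dataset and an unidentified dataset."""
    training, unidentified = [], []
    for record in records:
        label = first_pass_label(record)
        if label is None:
            unidentified.append(record)
        else:
            training.append({**record, "label": label})
    return training, unidentified


records = [
    {"src": "10.0.0.5", "failed_logins": 25, "bytes": 300},
    {"src": "10.0.0.6", "failed_logins": 0, "bytes": 420},
    {"src": "10.0.0.7", "failed_logins": 3, "bytes": 90000},
]

training, unidentified = split_records(records)
print(len(training), len(unidentified))
```

Both datasets would then be passed to the detection and analysis module, as described above.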

User Characteristics
The IoT network administrator can use the system for intrusion detection, and can add data, configure the data process and detection criteria, and reduce the number of agents when desired. For large-scale IoT networks, the system modules will be installed and executed on different hosts. This means that each SESS module requires its own administrator. These administrators should be able to configure the modules that they manage.
The intrusion detection system described [34] in this paper (Figure 2) adopts multi-agent technology. Each agent is a relatively independent unit. Agents are grouped into four different types of modules and communicate with each other through the communication agent residing in each module. In this way, each module can work relatively independently, which reduces dependencies between modules. The whole system consists of a collection module, a data processing module, a detection and analysis module and a response module. SESS uses the Foundation for Intelligent Physical Agents Agent Communication Language (FIPA-ACL) as its agent communication language, because FIPA-ACL is supported by many communities.
The four communication agents will be modified and improved through interactive reinforcement learning. This means that each agent is affected by the other agents. Every successful action achieved by other agents creates feedback, which is sent to the communication agent in the same module. Communication agents create a feedback report after collecting feedback from the other agents in the module. The feedback reports will be used for training the communication agents. The credit for feedback will be assigned to communication agents based on their contributions. Each action that a communication agent makes is considered a transaction. Because communication agents are the only agents that have the right to issue commands, most security issues will be raised by the communication agents. Transactions are recorded on the blockchain, and only the system manager has access to the smart contract (chaincode) of this blockchain.

System Development
Traditional IoT environments face many issues, including a lack of device security, centralized control of data, rigid architecture, lack of communication compatibility and difficulties in cooperation between multiple parties. The IDS used for IoT networks needs to deal with these challenges.
Blockchain can improve device security through its decentralized way of storing and passing communication messages, which are the commands that control the behavior of the system. Blockchain also helps to build trust between multiple parties by using an agreed smart contract to regulate system operation and the behavior of agents. Agents allow the system to scale down and scale up, making it more suitable for coping with different types of IoT environment structures and communication protocols. As the scale of IoT networks increases rapidly, IoT networks change very fast. Agents can help the system to adapt to these changes, while blockchain can ensure the security of the large-scale IoT network. The source code of this system has been released to the open source community and can be downloaded from GitHub [50].

Blockchain Component
In this system, a private chain is used to secure communication between agents. The system integrates the embedded database and the blockchain node in the same agent, so that both start and stop at the same time. The latest interactions are added to the cache area, which reduces the search time and speeds up the use and storage of disk data by agents. Storing a copy of the database (DB) in all the agents reduces the overhead of the communication agent, because newly updated information can be found by a local search. The communication agent of the detection and analysis module is a super node of the blockchain module. It contains the entire blockchain ledger and is used to make sure that all nodes in the network can receive the correct copy and connect with each other. All other communication agents are full nodes. They contain and distribute the entire blockchain ledger and are essential for validating every communication record in the blockchain. Other agents in each module can be enabled as light nodes and connect to the communication agent, which is their parent node. These light nodes only contain block headers and are used to check whether their parent node has been tampered with.
The blockchain module is not only used to regulate and validate the behavior of communication agents and to ensure the safety of the system, but also to build trust in IoT networks in which multiple parties are involved, each with their own agents. In addition, data from communication messages can be used to analyze attacks and improve agents by applying reinforcement learning. The following describes the Java classes we have used to implement the blockchain.

AgentAccount
The AgentAccount object stores the agent name, the agent publicKey and the agent privateKey. These keys are generated by the generatePublicAndPrivateKey method as a key pair for the Elliptic Curve Digital Signature Algorithm (ECDSA), which is also used to sign Bitcoin transactions. This ensures that only the correct agent can generate certain communication messages. Because a public key and a private key are randomly generated for each agent, agents can secure their information during communication.

BlockChainUtil
This class contains common methods that are needed in blockchain products. These methods mainly deal with encryption and decryption based on cryptographic algorithms.

Communication
The Communication class is used as a message container among communication agents. Because communication agents are the administrators of the system, their communication has a crucial impact on system performance. It is therefore important to ensure the integrity and security of the information that is shared between communication agents. A Communication object contains a hash, data, the previous communication hash (prevCmHash), the signature created with the sender agent's privateKey, the sender agent's publicKey and the recipient agent's publicKey. The hash argument is the identifier of the communication object. The data argument contains the information that the sender agent wants to send to the recipient agent. The prevCmHash argument is the hash value of the previous communication object, which is used to track the order of related communication objects. The signature argument is created with the sender agent's privateKey and can be verified with the sender agent's publicKey. It is used to make sure that the information was sent by the sender itself. The sender and recipient arguments are the publicKeys of the sender agent and the recipient agent.
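The way prevCmHash links communication objects can be illustrated with a minimal sketch. The authors implemented this class in Java; the Python below is only an illustration. The field names, the JSON encoding used for hashing and the helper chain_is_consistent are assumptions rather than the authors' implementation, and the ECDSA signature is treated as an opaque placeholder string because it is not computed here.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class Communication:
    """Illustrative message container; field names follow the description in the text."""
    data: str
    prev_cm_hash: str
    sender: str       # sender agent's public key (opaque string in this sketch)
    recipient: str    # recipient agent's public key (opaque string in this sketch)
    signature: str    # placeholder only: no ECDSA signing is performed here
    cm_hash: str = field(init=False)

    def __post_init__(self):
        self.cm_hash = self.compute_hash()

    def compute_hash(self) -> str:
        # Assumption: the identifier is the SHA-256 digest of a canonical JSON encoding.
        payload = json.dumps(
            [self.prev_cm_hash, self.sender, self.recipient, self.data],
            separators=(",", ":"),
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def chain_is_consistent(messages):
    """Check that each stored hash is intact and each prev_cm_hash points at its predecessor."""
    for index, message in enumerate(messages):
        if message.cm_hash != message.compute_hash():
            return False
        if index > 0 and message.prev_cm_hash != messages[index - 1].cm_hash:
            return False
    return True
```

Changing the data of any message after it has been created makes its stored hash disagree with the recomputed one, so the check fails for the whole chain.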

Block
A blockchain is a chain consisting of different blocks. A block contains the previous block hash, its own hash value, the generation time, the MerkleRoot and the max block height. The MerkleRoot is calculated in a recursive way. The first tree layer is calculated from the hashes of the communications in the block, and each subsequent tree layer is calculated from the previous tree layer, until the tree root is obtained. The MerkleRoot is an efficient way to track a target communication and verify its integrity. The previous hash is used to identify the position of this block, while the hash is used as an identification value. Once the size of the list of communications has reached MAX_BLOCK_HEIGHT, the MerkleRoot and hash are calculated and the block is added to the blockchain. The updated blockchain is then saved on the local host and broadcast to other modules. An example of a block is shown below.
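The layer-by-layer MerkleRoot calculation can be sketched as follows. This is a Python illustration under stated assumptions, not the authors' Java code: it uses an iterative loop instead of recursion, hashes the concatenation of two hexadecimal digests with SHA-256, and duplicates the last hash when a layer has an odd number of entries. The source does not specify any of these details.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def merkle_root(communication_hashes):
    """Fold a list of communication hashes into a single root hash."""
    if not communication_hashes:
        return sha256_hex("")  # assumption: root of an empty block
    layer = list(communication_hashes)
    while len(layer) > 1:
        if len(layer) % 2 == 1:
            layer.append(layer[-1])  # assumption: pad odd layers with the last hash
        # Each entry of the next layer is the hash of a pair from the current layer.
        layer = [sha256_hex(layer[i] + layer[i + 1]) for i in range(0, len(layer), 2)]
    return layer[0]
```

Because every layer depends on all hashes below it, changing a single communication changes the root, which is what makes the root usable as an integrity check for the block.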
Electronics 2020, 9, x FOR PEER REVIEW 12 of 27

Contract
The Contract class contains all the business logic of the communication process. It defines permissions and validations, which specify the communication messages that a communication agent may send to other specific communication agents and determine when such a communication action is valid. Communication agents can only trigger the communication function through this smart contract. Because the contract only regulates communication agents, other agents that provide new functions and are under the control of communication agents can still be added to or removed from the system. The contract is only used to secure communication between communication agents within the management layer, in order to maintain both security and scalability. When the detection and analysis module is started, the blockchain is initialized. The genesis block and communication are saved in the blockchain, and the blockchain is saved on the local host and broadcast to other modules. The blockchain is initialized by the detection and analysis module because this module is the core module of the system. The types of communication that can be sent and saved are decided by the Contract. Smart contracts can only be edited by system managers before initialization. They cannot be changed once the system is initialized.
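A permission check of this kind might look like the following minimal sketch in Python. The agent name DPCommunication, the message type names and the rule table are hypothetical and chosen only for illustration; the source does not list the actual rules of the authors' Java Contract class.

```python
# Hypothetical rule table: (sender, recipient) -> message types the contract permits.
ALLOWED_MESSAGES = {
    ("DPCommunication", "DACommunication"): {"NEW_DATA_PACKAGE", "TRAINING_SET_UPDATED"},
    ("DACommunication", "RCommunication"): {"DETECTION_REPORT", "TRAINING_REPORT"},
}


class Contract:
    """Illustrative contract that validates who may send which message to whom."""

    def __init__(self, rules):
        # The rules are copied and frozen at construction, mirroring the statement
        # that the contract cannot be changed once the system is initialized.
        self._rules = {pair: frozenset(types) for pair, types in rules.items()}

    def is_permitted(self, sender, recipient, message_type):
        """Return True only if this sender may send this message type to this recipient."""
        return message_type in self._rules.get((sender, recipient), frozenset())
```

Agents that are not communication agents do not appear in the rule table at all, which is consistent with the idea that they can be added or removed without touching the contract.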
Example of a block (Box 1):

QEDMgAEb4ZjAt85ZODTgqSvTDR/USW78eu/dw26AlsVjz/FpP3d9O5+iYriNTrHJc/y+/zf"}]}
Each communication agent has an AgentAccount, which stores the agent name, the agent publicKey and the agent privateKey. These keys are generated for the ECDSA cryptographic algorithm to ensure that only the correct agent can generate and access certain communication messages. Agent accounts are generated when the agents in the system are initialized. When communication agents need to send a communication message (either a command or feedback), the message can only be processed by methods defined in the smart contract. Receivers can only read the message after verifying the integrity of the data by using the sender's public key, the data and the signature. Smart contracts also decide which messages can be sent or received by which communication agents. Each communication contains prevCmHash, hash, signature, data, sender and receiver. The signature is calculated from the sender's private key and the raw data.
Each block contains the previous block hash, its own hash value, the generation time, the MerkleRoot, the max block height and the communications. The communications contain the details of each communication message issued by the communication agents. The MerkleRoot is calculated in a recursive way. The first tree layer is calculated from the hashes of the communications in the block, and each subsequent tree layer is calculated from the previous tree layer until the tree root is obtained. The MerkleRoot is an efficient way to track a target communication and verify its integrity. The previous hash is used to identify the position of this block, while the hash, which is generated by SHA-256, is used as an identification value. Once the size of the list of communications reaches MAX_BLOCK_HEIGHT, the MerkleRoot and hash are calculated and the block is added to the blockchain. The updated blockchain is then saved on the local host and broadcast to other modules.
The system integrates an embedded database and a blockchain node in the same agent, starting and stopping at the same time. Interactions will be added to cache, reducing the search time, and speeding up the usage and storage of data in agents.

Detection and Analysis Module
This module is responsible for the detection and analysis of malicious traffic/patterns. The following describes the Java classes we have used to implement this module.

DACommunication Agent
The DACommunication agent is the communication agent of the detection and analysis module. Its workflow is illustrated in Figure 3. Once the data process module sends a communication stating that a new data package needs to be analyzed, the DACommunication agent sends a command to the detection agent. When the detection agent has finished detection, the DACommunication agent receives the detection report from the detection agent. The detection report is encapsulated in a communication object and sent to the RCommunication agent. If the DACommunication agent receives a communication object stating that the training set has been updated, it sends a command to the training agent. After the training agent has finished training the model, a training report is sent to the DACommunication agent as well as to the RCommunication agent.

Detection Agent
When the Detection agent receives a command from the DACommunication agent, it starts using the DNN model to analyze the new data package and detect attacks. After the process is finished, a detection report is generated and sent to the DACommunication agent.

Training Agent
When the Training agent receives commands from the DACommunication agent, the Training agent will start training the DNN model by using the updated training dataset. After the process is finished, a training report will be generated and sent to the DACommunication agent.

DNNUtil
This class contains the methods that are relevant for DNN model management. The new raw dataset will be standardized and normalized using normalization methods. The transformed structure can be reviewed and adjusted via the transformProcess function and file.

RCommunication Agent
The RCommunication agent is the communication agent of the response module. Once the RCommunication agent receives a Communication message that contains the detection report or the training report, it sends a command to the response agent. These reports are saved on the local host.

Response Agent
After the Response agent receives a command from the RCommunication agent, the response agent will start generating data visualization charts for the users, based on the administrator's response plan, and make firewall and host security changes based on the RCommunication agent's commands.
The procedure sequence of the intrusion detection system SESS is as follows (Figure 4):

Figure 4. Smart, efficient, secure, and scalable (SESS) system sequence diagram.

Data Visualization
The BarChartUtil and LineChartUtil classes are used to manage data visualization. Parameters such as the size or the theme of the charts can be adjusted.

Data Process Module
The primary aim of this module (Figure 5) is to process the data for network, host and other agents. The network data management agent and the host data management agent deal with data that come from the network and the host, respectively. Both agents need to perform data pre-processing, including missing value processing, data integration and data standardization. After the pre-processing stage, the data are scanned based on intrusion detection rules generated by other open source IDS communities and/or rules defined by the system manager. In addition, the pre-processed data are packaged and compressed. The discovered intrusion behaviors and the corresponding processing measures are recorded in the intrusion detection package. After this package has been compared with the result of the detection agent, it is sent to the response agent. The labelled data are merged into a training set package, which is used to improve the accuracy of the detection agent.
The database agent (Figure 6) is the only agent that can alter the database. It can change the database only if it has received a command from the communication agent. Once the database agent receives the update command from the communication agent, it changes the database based on the data provided by the communication agent. After the database agent has changed the database, it sends a feedback message to the communication agent.
The communication agent for the data process (Figure 7) manages and controls the host data process agent, the network data process agent and the database agent. It sends the data packages from the collection module to the data process agents, gives the data process agents commands and receives their feedback. It asks the database agent to run regular update checks, making sure the system is up to date. This agent also communicates with other communication agents and adjusts its work pattern according to the state of the system.

Dataset
In 1998, MIT Lincoln Laboratories undertook an important project, the Defense Advanced Research Projects Agency (DARPA) Intrusion Detection Assessment Project, which was designed to evaluate the performance of intrusion detection systems. An important achievement of this evaluation project was the establishment of a data set for simulating various attacks. However, the dataset is too large, which is not conducive to a fair comparison of different intrusion detection algorithms. In addition, the recorded feature information is too complex, with different protocols using different formats, which is not conducive to the selection of feature attributes.
The network security audit data set KDDCUP99, published by Professor Stolfo at Columbia University's Intrusion Detection Laboratory and others, was analyzed and compiled from the IDS data set of MIT Lincoln Laboratory in 1998, but it contained only network traffic data [51].
The NSL-KDD dataset solves some of the inherent problems of the KDD99 data set and is widely used in the development of intrusion detection systems. Although this data set still has some flaws, it can be used as an effective benchmark data set to help researchers compare different intrusion detection methods. The sizes of the NSL-KDD training set and test set are reasonable, so that the evaluation results of different researchers are consistent and comparable. The data of the NSL-KDD data set come from three different protocols: Transmission Control Protocol (TCP), User Datagram Protocol (UDP) and Internet Control Message Protocol (ICMP). The NSL-KDD data set contains four attack classes (DoS, Probe, R2L, U2R) and 39 different attack types [52]. This research uses the NSL-KDD data set as the data source for intrusion detection simulation.
Although the data from NSL-KDD are not specific to IoT environments, the protocols and attacks covered by this dataset are valuable for IoT intrusion detection research. The attack types in this dataset also occur in real-world IoT networks, and protocols such as TCP are becoming more important because both the traditional application layer and cloud platform connections are an important part of IoT environments. The NSL-KDD dataset is therefore used as the dataset for IoT. Samples from the NSL-KDD dataset are labelled as normal or anomaly. 70% of the subset named "Train + 20%" is used as the training dataset, while the remaining 30% of this subset is used as the validation dataset. A subset of the "Test+" dataset that contains 1200 data samples is used as the test dataset (Tables 1 and 2).
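A 70/30 split of this kind can be sketched in Python as follows. The function name, the seeded shuffle and the rounding of the cut point are assumptions made for illustration; the source does not state whether the samples were shuffled before splitting.

```python
import random


def split_train_validation(samples, train_fraction=0.7, seed=0):
    """Shuffle a copy of the samples and split it into training and validation parts."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # assumption: a seeded, reproducible shuffle
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
```

Every sample ends up in exactly one of the two parts, so the validation set contains no sample that the model has seen during training.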

Evaluation Measures
The performance of intrusion detection is measured by accuracy, precision, recall and F1 score. Accuracy is the ratio of the number of correctly predicted samples to the total number of samples. Precision [53] is defined with respect to the prediction results: it indicates how many of the samples predicted as positive are true positives. Recall is defined with respect to the original samples: it indicates how many of the positive samples are predicted correctly. The F1 score combines precision and recall as their harmonic mean. The larger the F1 score, the better the quality of the model. Overfitting needs to be taken into consideration as well.
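For a binary normal/anomaly labelling, these four measures can be computed as in the following Python sketch. Treating "anomaly" as the positive class and returning 0.0 when a denominator is zero are assumptions of this example, not choices stated in the source.

```python
def classification_metrics(true_labels, predicted_labels, positive="anomaly"):
    """Compute accuracy, precision, recall and F1 score for a binary labelling."""
    pairs = list(zip(true_labels, predicted_labels))
    if not pairs:
        raise ValueError("at least one sample is required")
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

For example, with three anomalies of which two are detected, and one false alarm among five normal samples, accuracy is 6/8 while precision, recall and F1 are all 2/3.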

Pre-Processing
All raw data for the training, validation and test datasets are standardized by one-hot encoding and z-score measures, and normalization is done using the same schema. For this experimental design, one-hot encoding uses the categorical values from the training, validation and test datasets together, instead of using the categorical values of each individual dataset. For each parameter that is one-hot encoded, one categorical value is removed to prevent the dummy variable trap.
The last field, "class", is the output label. Variables such as protocol_type, service and flag contain multiple categorical values, so they need to be converted to numerical values using one-hot encoding. In order to avoid the dummy variable trap, one categorical value of each of protocol_type, service and flag is removed. The "class" field is encoded to numerical values as well.
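The two transformations can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, the dropped category is simply the first one in sorted order, the population standard deviation is used for the z-score, and a constant column is given a standard deviation of 1.0 to avoid division by zero. None of these choices is specified in the source.

```python
import statistics


def fit_one_hot(rows, column):
    """Collect the categories of a column and drop one as the reference level."""
    categories = sorted({row[column] for row in rows})
    return categories[1:]  # assumption: the first category in sorted order is dropped


def one_hot(value, kept_categories):
    """Encode a value as indicator variables; the dropped category maps to all zeros."""
    return [1.0 if value == category else 0.0 for category in kept_categories]


def fit_z_score(values):
    """Return the mean and population standard deviation of a numeric column."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return mean, (std if std > 0 else 1.0)


def z_score(value, mean, std):
    """Standardize a value with a previously fitted mean and standard deviation."""
    return (value - mean) / std
```

To follow the design described above, fit_one_hot would be called on the union of the training, validation and test rows, so that all three datasets share the same encoded columns.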

Findings and Discussions
To assess the performance of the detection and analysis module, its performance in different situations has been compared (Tables 3-5). Testing the detection and training module with different batch sizes, batch numbers and epochs shows that the DNN model performs better with a batch size of 48, a batch number of 300 and 100 epochs than with the other values tested. The accuracy on the validation set is 98.27%, while the accuracy on the test set is 83.17%. On the test set, precision is 83.53%, recall is 84.14% and the F1 score is 83.94%. This means that the DNN model in this system has a good ability to distinguish attacks from normal data traffic. These experiments also indicate that taking enough epochs can ensure that the model has a good performance, even if the training set has fewer samples.
Table 6 shows the performance of intrusion detection based on different datasets. The measures are denoted as follows: accuracy on validation set as AV, precision on validation set as PV, recall on validation set as RV, F1 score on validation set as FV, accuracy on test set as AT, precision on test set as PT, recall on test set as RT and F1 score on test set as FT. Table 7 summarizes the accuracy of detection when using different data-splitting conditions. The model could potentially be upgraded for more complex datasets.
Table 8 gives a brief description of the parameters of the data extracted from NSL-KDD, along with their serial numbers. In total, 41 parameters are used.

No. | Parameter | Description
...
28 | Srv_rerror_rate | The percentage of connections that have activated the flag (4) REJ, among the connections aggregated in srv_count (24)
29 | Same_srv_rate | The percentage of connections that were to the same service, among the connections aggregated in count (23)
30 | Diff_srv_rate | The percentage of connections that were to different services, among the connections aggregated in count (23)
31 | Srv_diff_host_rate | The percentage of connections that were to different destination machines, among the connections aggregated in srv_count (24)
...
36 | Dst_host_same_src_port_rate | The percentage of connections that were to the same source port, among the connections aggregated in dst_host_srv_count (33)
37 | Dst_host_srv_diff_host_rate | The percentage of connections that were to different destination machines, among the connections aggregated in dst_host_srv_count (33)
38 | Dst_host_serror_rate | The percentage of connections that have activated the flag (4) s0, s1, s2 or s3, among the connections aggregated in dst_host_count (32)
...
40 | Dst_host_rerror_rate | The percentage of connections that have activated the flag (4) REJ, among the connections aggregated in dst_host_count (32)
41 | Dst_host_srv_rerror_rate | The percentage of connections that have activated the flag (4) REJ, among the connections aggregated in dst_host_srv_count (33)

Table 9 shows the performance of the detection and analysis module for different numbers of parameters. Because the SESS intrusion detection system is designed to be used on IoT networks of different scales, and some parameters are difficult to collect on some IoT networks, it is important to know how the performance of the system is affected by the number of parameters. The results in Table 9 were obtained with a batch size of 48, a batch number of 300 and 100 epochs. The total number of parameters of the NSL-KDD dataset is 41.
The first parameter removed is service, because the type of service differs between IoT networks, which has a large impact on the normalization process. When only the service parameter is excluded, so that 40 parameters are used, the DNN model has an 82.33% accuracy on the test set, 0.84 percentage points lower than with all 41 parameters.
When the data sets use 25 parameters, the DNN model has an accuracy of 80.75% on the test set, and the accuracy increases to 81.33% when only 17 parameters are used.
The accuracy for 17 parameters is higher than for 25 parameters because some parameters need to be combined with other parameters to detect attacks. When the number of parameters is reduced from 40 to 25, some parameters lose the parameters they are combined with, which has an adverse impact on the weighting process. Once these combined parameters are removed as well, the accuracy increases.
When only seven parameters are used, i.e., duration, protocol_type, flag, src_bytes, dst_bytes, count and srv_count, the DNN model has a 78.08% accuracy rate. This means that the DNN model still has a decent detection ability even if the collection module cannot get many details from the IoT network. When only duration, protocol_type, src_bytes, dst_bytes, count and srv_count are used as parameters, the DNN model has a 72.68% accuracy and a 68.74% recall on the test set. In this case, removing the flag parameter has reduced the accuracy rate, which shows the importance of this parameter. However, the flag parameter also complicates the normalization process. In order to investigate the performance of this system on an IoT network that does not provide many details, the flag parameter should be removed, because it can be hard to capture in certain circumstances.
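The effect of dropping a categorical parameter such as flag on the model input can be sketched as follows. This is a minimal illustration, assuming one-hot encoding for the categorical parameters (protocol_type, flag) and min-max scaling for the numeric ones; the paper does not state which encoding or scaling it uses, and the helper names and toy records below are hypothetical.

```python
# Hypothetical preprocessing sketch; not the paper's actual pipeline.
SEVEN_PARAMS = ["duration", "protocol_type", "flag", "src_bytes",
                "dst_bytes", "count", "srv_count"]
CATEGORICAL = {"protocol_type", "flag"}


def build_encoder(records, params):
    """Learn category lists and min/max ranges for the selected parameters."""
    categories = {p: sorted({r[p] for r in records})
                  for p in params if p in CATEGORICAL}
    ranges = {p: (min(r[p] for r in records), max(r[p] for r in records))
              for p in params if p not in CATEGORICAL}
    return categories, ranges


def encode(record, params, categories, ranges):
    """Turn one connection record into a flat numeric vector."""
    vector = []
    for p in params:
        if p in CATEGORICAL:
            # One-hot encoding: one slot per category seen while fitting.
            vector.extend(1.0 if record[p] == c else 0.0 for c in categories[p])
        else:
            lo, hi = ranges[p]
            # Min-max scaling to [0, 1]; constant columns map to 0.
            vector.append((record[p] - lo) / (hi - lo) if hi > lo else 0.0)
    return vector


# Toy records with made-up values, for illustration only.
records = [
    {"duration": 0, "protocol_type": "tcp", "flag": "SF", "src_bytes": 200,
     "dst_bytes": 4000, "count": 2, "srv_count": 2},
    {"duration": 10, "protocol_type": "udp", "flag": "SF", "src_bytes": 100,
     "dst_bytes": 0, "count": 10, "srv_count": 1},
    {"duration": 0, "protocol_type": "tcp", "flag": "S0", "src_bytes": 0,
     "dst_bytes": 0, "count": 250, "srv_count": 20},
]

# Seven-parameter input, including flag.
categories, ranges = build_encoder(records, SEVEN_PARAMS)
with_flag = [encode(r, SEVEN_PARAMS, categories, ranges) for r in records]

# Six-parameter input, with flag removed.
six_params = [p for p in SEVEN_PARAMS if p != "flag"]
categories6, ranges6 = build_encoder(records, six_params)
without_flag = [encode(r, six_params, categories6, ranges6) for r in records]

print(len(with_flag[0]), len(without_flag[0]))
```

With these toy records the seven-parameter vector has 9 entries and the six-parameter vector has 7: each categorical parameter contributes one input per observed category, which is one way in which flag can complicate normalization when its values differ between networks.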
If the data sets use only five parameters, the DNN model has a 72.33% accuracy on the test set. If the data sets use protocol_type, src_bytes, dst_bytes and count as parameters, the DNN model has a 73.92% accuracy on the test set. This means that using only four parameters results in better performance than using five or six parameters.
If the data sets use three parameters, the DNN model has a 71.75% accuracy on the test set, but only a 58.15% accuracy on the validation set. This is because the DNN model uses dropout, a regularization technique that randomly deactivates some units of the network during training. In this situation, using dropout affects the accuracy on the validation set. Increasing the batch size or removing dropout can resolve this problem, although this may in turn lead to overfitting. If dropout is removed from the DNN model, the model has an accuracy of 89.23% on the validation set and 70% on the test set. If the data sets use only src_bytes and dst_bytes as parameters, the DNN model has a 68.5% accuracy on the test set. With dst_bytes as the only parameter, the accuracy is 43% on the test dataset, but the F1 score on both the validation and the test dataset is 0%. This means that all samples in the data sets are assigned to a single class, an overfitting problem that arises because some hyper-parameters, such as the number of epochs, are too high for one-parameter datasets.
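A non-zero accuracy combined with an F1 score of 0% can be reproduced with a small worked example. The sketch below assumes that attack is treated as the positive class and uses an illustrative label mix (43 normal, 57 attack), not the actual NSL-KDD counts; binary_metrics is a hypothetical helper, not code from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall, F1) for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Ratios with an empty denominator are reported as 0 by convention.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, precision, recall, f1


# Illustrative labels only: 0 = normal, 1 = attack (assumed positive class).
y_true = [0] * 43 + [1] * 57
# A collapsed model that labels every sample as normal.
y_pred = [0] * 100

accuracy, precision, recall, f1 = binary_metrics(y_true, y_pred)
print(accuracy, precision, recall, f1)  # 0.43 0.0 0.0 0.0
```

Because no attack is ever predicted, there are no true positives, so precision, recall and F1 are all zero, while the accuracy simply equals the share of normal samples.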
As shown in Figure 8, the model works well on the TCP and ICMP datasets (accuracy of 99.1% and 98.1%, respectively). However, the accuracy on the UDP protocol is only 93.7%. One reason is that the data set contains very few UDP records; the characteristics of the UDP protocol are another. For a dataset whose class labels are the different attack types, the DNN also performs well (accuracy of 97%). However, some attack types that occur in only a small share of the data cannot be distinguished very well.
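A per-protocol breakdown of this kind can be computed by grouping predictions by protocol, as in the following minimal sketch. The function name and the toy labels are hypothetical and are not taken from the paper's experiments; the sample count is returned alongside each accuracy because an accuracy measured on few records is a noisier estimate.

```python
from collections import defaultdict


def accuracy_by_group(groups, y_true, y_pred):
    """Return {group: (accuracy, sample_count)} for each group label."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, truth, prediction in zip(groups, y_true, y_pred):
        total[group] += 1
        correct[group] += int(truth == prediction)
    return {g: (correct[g] / total[g], total[g]) for g in total}


# Toy data for illustration only: 1 = attack, 0 = normal.
protocols = ["tcp", "tcp", "tcp", "tcp", "icmp", "icmp", "udp", "udp"]
y_true = [1, 0, 1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]

per_protocol = accuracy_by_group(protocols, y_true, y_pred)
for protocol, (acc, n) in per_protocol.items():
    print(f"{protocol}: accuracy={acc:.2f} over {n} samples")
```

In this toy run a single misclassified UDP record halves the UDP accuracy, which shows how strongly a protocol with few records reacts to individual errors.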
In conclusion, these experiments show the usability of the SESS system on IoT networks of different scales. During the experiments, the detection time was around 0.18 s, which means that the system can detect attacks and respond to them in real time. Timing was recorded for several stages, including initializing and loading the data model, training, and detection. It may vary with circumstances such as the hardware, other processes using the server's resources and memory needs. These measurements need further investigation to make sure that the proposed model can perform in a high-speed live environment. The findings from these experiments will also be used to define statuses for the multi-agent reinforcement learning process.
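Stage-wise timing of the kind described above could be collected with a monotonic clock, as in the following sketch. The stage names, the load_model and detect stand-ins and the toy records are all hypothetical; the paper does not describe how its timings were instrumented.

```python
import time
from contextlib import contextmanager

timings = {}


@contextmanager
def timed(stage):
    """Record the wall-clock duration of a stage in `timings` (seconds)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


def load_model():
    # Stand-in for initializing and loading a trained model:
    # flags a record when the sum of its features exceeds 1.0.
    return lambda record: sum(record) > 1.0


def detect(model, batch):
    # Stand-in for the detection step: one decision per record.
    return [model(record) for record in batch]


with timed("load_model"):
    model = load_model()

with timed("detection"):
    alerts = detect(model, [[0.2, 0.3], [0.9, 0.8]])

for stage, seconds in timings.items():
    print(f"{stage}: {seconds * 1000:.3f} ms")
```

Reporting each stage separately makes it possible to tell a one-off cost, such as loading the model, from the per-request detection latency that matters for real-time response.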

Limitations
The results of these experiments show that this system can be used on IoT networks of different scales, including more complex networks. However, more research still needs to be done in this area. Although the accuracy of the DNN model in distinguishing different attack types is quite high, some rare attack types cannot be detected accurately. The lengthy learning period of the agents and the need for large training sets will need to be resolved in the future. How multiple agents can cooperate properly in a large IoT network needs to be investigated further. The large computing power requirements are also an issue that needs to be addressed in future designs.

Conclusions
This paper proposed an intrusion detection system based on a multi-agent system, blockchain and deep learning. We have described the working mode of each component of the model and the operation of the whole system in detail. The flexibility of multi-agent systems means that this new IDS can be applied across IoT environments of various sizes. All actions caused by communication agents will be recorded on the blockchain, which secures the system against threats such as information tampering and information disclosure. The use of a multi-agent reinforcement algorithm can help the system to improve its performance continuously.

The application of neural networks is studied. The simulation results show that the deep learning algorithm performs better than traditional methods on the same type of IoT network. The implementation of blockchain technology ensures that this system can be distributed to different remote hosts, because ACL communication is secured by the blockchain and communication among agents is regulated by the smart contract. The experiments using the NSL-KDD dataset demonstrate that the DNN achieves a high accuracy of intrusion detection on the transport layer of the IoT environment. The performance of the DNN model in distinguishing anomalous from normal traffic is better than that of other machine learning methods, such as decision trees. This demonstrates the potential of using deep learning algorithms in IDSs for IoT devices. The experiments demonstrate a high performance of the system in different scenarios, such as networks of different complexity and different attack types.

Future Work
This work draws on multi-agent technology, blockchain and neural networks to propose an intrusion detection system model for the IoT. The next step is to continuously improve the modules by running the system in an actual environment so that the agents can work well with one another. The following stage is to collect IoT network data sets to train the detection model in order to improve the performance of the system. In addition, creating datasets of novel attack types is something that could be explored further. To improve the performance of the system, more reliable and faster multi-agent algorithms could be used to train the communication agents and to optimize the feedback training process. Advanced deep learning algorithms could also be explored to improve the performance of the system. Since we have tested the blockchain on a smaller variant, the maximum block height of the blockchain and the hyper-parameters of the DNN model can be adjusted by using multi-agent reinforcement learning in the next version. For a larger-scale IoT network, an unspent communication output pool should be implemented to manage the order of communications. Testing against the UNSW-NB15 dataset could also be explored. For future IoT system testing, an IoT data simulator [54] or a similar tool could be used to generate data to further evaluate the accuracy of the IDS.