A Novel Semantic IoT Middleware for Secure Data Management: Blockchain and AI-Driven Context Awareness

: In the modern digital landscape of the Internet of Things (IoT), data interoperability and heterogeneity present critical challenges, particularly with the increasing complexity of IoT systems and networks. Addressing these challenges, while ensuring data security and user trust, is pivotal. This paper proposes a novel Semantic IoT Middleware (SIM) for healthcare. The architecture of this middleware comprises the following main processes: data generation, semantic annotation, security encryption, and semantic operations. The data generation module facilitates seamless data and event sourcing, while the Semantic Annotation Component assigns structured vocabulary for uniformity. SIM adopts blockchain technology to provide enhanced data security, and its layered approach ensures robust interoperability and intuitive user-centric operations for IoT systems. The security encryption module offers data protection, and the semantic operations module underpins data processing and integration. A distinctive feature of this middleware is its proficiency in service integration, leveraging semantic descriptions augmented by user feedback. Additionally, SIM integrates artificial intelligence (AI) feedback mechanisms to continuously refine and optimise the middleware’s operational efficiency.


Introduction
The advancement in technologies such as artificial intelligence, machine learning, and 5G has catalysed the evolution and adoption of the Internet of Things (IoT).Improved connectivity and reduced component costs have led to the large-scale adoption of IoT to support diverse applications in varied sectors ranging from healthcare, smart cars, agriculture, smart homes, and many others [1].The backbone of the IoT infrastructure is a network of interconnected devices, which enable real-time data collection and analysis.By 2030, the number of IoT devices are forecast to reach 30 billion [2] worldwide, while the IoT market worth is expected to increase from USD 182 billion (2020) to USD 621 billion by 2030 [3].
As of 2023, the global spending on IoT has already surpassed its expected forecast [4].This spending has linearly increased over the years despite the COVID-19 pandemic because of the exemplary role IoT plays in the development of systems that enable remote monitoring and control.The impact of IoT on healthcare extends beyond the pandemic.The healthcare IoT market is projected to exhibit an annual growth rate (CAGR) of 12.32% from 2023 to 2028 [5].
However, there are several challenges associated with the management and handling of medical data.A typical medical record contains a multitude of different types of information, such as medical history, treatment plans, lab test results, and demographics.Thus, the fundamental challenge is to store, manage and analyse these data.Another significant challenge includes maintaining data quality as the data are typically collected from multiple sources, which include information management systems and healthcare providers.The inconsistencies that may arise because of the integration of data from different sources result in unreliable analysis and decisionmaking.
Other challenges associated with medical data include privacy and security concerns.Regulations and laws securing patient privacy, such as the Health Insurance Portability and Accountability Act (HIPAA) [6] in the United States and General Data Protection Regulation (GDPR) [7] in the United Kingdom and European Union, impose strict requirements for storing, managing, handling, and sharing medical information.Any research including non-commercial and commercial projects involving patient data must adhere to these regulations.
To address some of the challenges in the IoT domain, our previous work introduced the Internet of Things Management Platform (IoT-MP) [8,9].The IoT-MP was developed to tackle the complexities of managing an expansive IoT infrastructure, especially considering the constrained resources of typical IoT devices.This platform supports fundamental management functions necessary for optimal operation, monitoring, and communication.
Building on the foundational works of the IoT-MP, subsequent research [10] has introduced a framework for securely authenticating patients and facilitating the secure sharing of biomedical data through cloud computing technologies.Further advancements were achieved with [11], which seamlessly incorporated blockchain into the aforementioned framework.This evolution in our research culminated in the introduction of BioChainReward (BCR), a blockchain-based framework for secure and privacy-preserving biomedical data sharing [12].BCR stands out by implementing an AI-enabled privacy-preservation sublayer, adapting its privacy measures depending on the data request's nature and purpose.
Despite the progress made with platforms like IoT-MP and BCR, there are still formidable challenges to be addressed.A paramount issue is the heterogeneity of medical data.Data from various sources like hospitals, clinics, and labs come in different formats, each employing unique coding systems and terminologies [13].Such diversity complicates the task of data integration and analysis.For instance, patient outcome data sourced from various healthcare institutions might differ structurally, impeding meaningful comparisons.Another considerable obstacle lies in the realm of semantic annotations.Understanding medical measurements often demands domain-specific knowledge.For instance, a single data point like a heart rate exceeding 150 beats per minute can signify different health states depending on whether the person was at rest or running.Therefore, the data context is critical, and any misinterpretations can drastically skew diagnoses and treatment decisions.
Given the gaps left by solutions such as the IoT-MP and the challenges still at play, middleware remains a promising answer to these challenges.Serving as a buffer layer between IoT devices and data-utilising applications, middleware offers compatibility, interoperability, standardisation, and abstraction, fostering seamless data integration and inter-system communication.When combined with technologies like AI and blockchain, the potential will be further amplified.On one hand, AI's prowess lies in data analysis and insight generation, particularly from heterogeneous sources.On the other hand, blockchain provides a robust framework for secure data management and sharing in IoT systems.Thus, their amalgamation into a semantic middleware may reduce some of the challenges encountered.
To this end, this paper presents the Semantic IoT Middleware (SIM).This novel middleware is designed to address the challenges identified previously.SIM employs blockchain technology with artificial intelligence (AI) to enhance data security, providing an encompassing user-centric solution tailored for diverse IoT systems, especially in the healthcare area.SIM provisions semantic annotations and user feedback to ensure data security and integrity.Additionally, it facilitates service integration and improves operational efficiency.Thus, the primary contributions of this work are as follows: 1.
A comprehensive proposal of the Semantic IoT Middleware (SIM), designed for broad IoT applications with a focus on healthcare, addressing challenges in data interoperability, heterogeneity, and system complexity.

2.
The integration of blockchain technology within SIM for enhanced data security and integrity, coupled with an innovative AI Feedback and Analysis Module for continuous system improvement based on user feedback and data analysis.

3.
Introduction of an innovative AI Feedback and Analysis Module, harnessing the power of artificial intelligence to continually refine and enhance system operations based on user feedback and real-time data analysis.

4.
The introduction of detailed implementation algorithms as a blueprint for practical application and a proof of concept demonstrating the feasibility and adaptability of SIM in various scenarios.

5.
An expanded discussion on user-centric operations in SIM, showcasing its application in diverse contexts beyond healthcare, emphasising its intuitive and accessible user interface.6.
A comparative assessment of the proposed middleware against existing methodologies, demonstrating its capabilities in terms of security, interoperability, and usercentric design.
The remainder of the paper is organised as follows: Section 2 delves into the background of IoT middleware and its application in healthcare.Section 3 presents the proposed middleware, detailing its architecture and modules.Section 4 presents a case study and scenario.The proof of concept is provided in Section 5. Section 6 focuses on the evaluation of the proposed system.Section 7 discusses the challenges encountered during this research.Section 8 offers the concluding remarks and summarises the contributions of this work.

Background and Related Work
The evolution of IoT has given birth to a range of smart devices.However, to develop a sustainable ecosystem, all IoT devices need to work together seamlessly regardless of their technology or manufacturer.As the world transitions towards a future that is expected to use IoT devices in greater capacity, it is crucial to find a solution to this problem of interoperability.On the other hand, the semantic web involves annotating data with semantic information to preserve the context and meaning and allow them to be machineunderstandable [14].
Using the same concept, semantic annotations when used in IoT allow data from different devices to be understandable by different machines and frameworks, enabling efficient context-aware data exchange and integrated analysis.Thus, the concept of the semantic web can be borrowed to address the interoperability issues in IoT.In other words, a unified framework for data communication and representation can greatly impact the usefulness and efficiency of IoT systems.Nonetheless, the integration of semantic annotation suffers from several challenges, such as complexity, security, and performance, and, as IoT evolves, the evolution of an integrative use of these technologies is also expected for the development of sustainable IoT ecosystems.
IoT middleware is a software layer that enables communication among the diverse components of IoT ecosystems, which include network hardware, network stacks, operating systems, and applications [15].Interoperability, self-adaptability, reliability, scalability, lightweight nature, and real-time capabilities are some of the key characteristics of IoT middleware [16].However, the development of IoT middleware to maintain machineto-machine communications and demonstrate the above-mentioned characteristics is a daunting task [17].
Various architectures for IoT middleware have been proposed over the years.Existing work [18] in this area classified these architectures into several categories, including application-specific, agent-based, VM-based, database-oriented, tuple spaces, serviceoriented, and message-oriented architectures.Architectures of IoT middleware have evolved over the years to support and cater to the changing demands of IoT environments.
Table 1 provides an analysis of the benefits and shortcomings of the aforementioned architectures.The best-fit architecture for an IoT application is identified based on the specific requirements of the application, such as scalability, modularity, nature of devices, and need for real-time processing.With the ever-growing complexity and challenges associated with IoT systems, hybrid architectures [19] may also be employed to address specific challenges and cater to the diverse demands of the IoT system.

Use Cases Pros Cons
Application-specific Tailored for particular use cases or industries [20,21] High optimization, reduced overhead, fine-tuned performance

High complexity and data redundancy
Service-oriented Modular services for specific functions [33] High modularity, service reuse, adaptability Several commercial-grade IoT middleware solutions are also available.Some of the popular solutions include AWS IoT Core [36], FIWARE-IoT-Agent [37], FIWARE-ORION [38], Linksmart [39], Microsoft Azure [40], OpenIoT [41], and Symbius IoT [42].Table 2 provides a technical comparison of these commercial-grade solutions, which demonstrates an overarching trend in areas of future innovation.These solutions can benefit from the integration of improved security features, AI/ML, and edge computing capabilities [43][44][45][46].In academia, research into IoT middleware continues apace as commercial-grade products, despite their merits, still possess limitations and scope for innovation.Moreover, the explosive growth regarding IoT devices and their diverse applications poses a plethora of challenges to the evolution of these products.Microsoft Azure (IoT Hub) [43] Supports bi-directional communication between IoT applications and the devices it manage, supports a wide array of SDKs, built-in device authentication Enhanced AI and ML capabilities, better integration with non-Azure systems, more comprehensive device management tools OpenIoT [46] Open-source, focuses on semantic interoperability for cloud-based IoT, offers utility-based sensing via virtual sensors More user-friendly deployment options, enhanced real-time analytics, better security features.
Symbius IoT [42] Offers a unified middleware solution, supports compatibility with major sensor manufacturers, streamlined data accessibility, reduced deployment complexities Expanding compatibility, enhancing security, and offering richer data insights IBM Watson IoT Platform [47] Includes advanced AI capabilities for data analytics, anomaly detection, and predictive maintenance.Demonstrates high scalability, real-time data processing, easy integration with other products, and provides tools for data management.
AI capabilities can be enhanced to support automated decisionmaking.In addition to this, it can include enhanced support for blockchain and edge computing, with the addition of tools to facilitate development of custom solutions.

Google Cloud IoT Core [48]
Fully managed service that is scalable and allows seamless data integration.Moreover, it also provides efficient data management and robust security features.
Discontinued in August 2023 due to non-technical reasons.
As the IoT landscape diversifies with the introduction of new devices and protocols, it is imperative to evolve existing platforms to ensure seamless interoperability and integration.In this context, [49] presents an innovative multilayer IoT middleware rooted in the concepts of knowledge graph, effectively addressing the challenge of connecting a heterogeneous array of IoT devices employing various communication protocols.Another work, [50], was proposed in this area, with a prime focus on resolving privacy and security issues, specifically concentrated on the message queuing telemetry transport (MQTT) protocol.Some of the recent works that have amalgamated the concept of semantic annotations in IoT middleware include GMSCA [51], S2NetM [52], and SEDIA [53].GMSCA is a generic IoT middleware for smart city applications that targets to address challenges regarding heterogeneity, service management, and security.However, the paper acknowledges that security is still a challenge that needs to be addressed in future work.S2NetM introduces the concept of Semantic Social Network of Things Middleware, and the purpose of this study is to enhance semantic interoperability in Social Internet of Things (SIoT) systems by comprehending and using social relationships.The study acknowledges several challenges, such as security and standardisation in SIoT environments.
On the other hand, SEDIA is a platform for integrating IoT data with semantic enrichment.This work is focused on smart city applications and aims to address specific challenges such as integrating diverse data sources and incorporating geographical data.
A proof of concept is provided using air quality monitoring as a case study.This study is a significant step towards developing a generic and sustainable platform for IoT data management.Although no specific challenges have been identified by the study, the proposed platform uses MQTT, which has associated security and privacy challenges [54].These challenges are particularly grave for healthcare applications, where identity and data protection are the highest priorities.Therefore, this paper proposes the use of a blockchain-based decentralised identity system that will facilitate the authentication of users and administrators.
Blockchain's potential for securing IoT devices and their usage is widely recognised in both academia and industry.Therefore, it is viewed as a technology that is all set to revolutionise how IoT devices are administered, operated, and, most importantly, secured.
Recent research [55] introduced BPIIoT, a decentralised peer-to-peer Industrial Internet of Things platform underpinned by blockchain.This platform facilitates direct communication between peers in a trustless network without necessitating a central intermediary.Another significant contribution [56] integrated permissioned blockchain mechanisms into a privacyconscious IoT middleware.This middleware functions without any central authority and does not intrinsically trust the IoT platform.
Despite significant advancements in both IoT and blockchain and the growing research on their integration into IoT middleware, a clear gap persists.While our previous works, including IoT-MP [8,9], the Biometric Authentication Framework [10], and BCR [12], made progress in certain areas, middleware solutions on the whole still often lack cohesive semantic translation or annotation capabilities.These solutions, although pioneering, underscore the broader challenge of fully integrating security, transparency, and other essential IoT features.This highlights the urgent need for a comprehensive solution that combines semantic annotation, heightened security, transparency, and core IoT functionalities.Such a solution would address the current system limitations while capitalising on their inherent strengths.In the following section, the architecture of the Semantic IoT Middleware (SIM), designed to fill this critical gap and meet the highlighted needs, will be presented.

The Proposed Semantic Internet of Things Middleware
The proposed Semantic IoT Middleware (SIM) seamlessly integrates semantic annotation capabilities, blockchain technology, and AI-powered feedback mechanisms.The middleware harnesses the security and decentralised advantages of blockchain to manage and store semantic annotations, ensuring data integrity and protection from unauthorised modifications.Additionally, the embedded AI Feedback and Analysis Module employs advanced artificial intelligence techniques, focusing on refining the quality and usability of semantic annotations through user feedback and system events.As a result, end users, applications, and third parties, such as health monitoring apps, emergency response teams, and healthcare data analytics providers, benefit from secure, transparent, and meaningful access to users' IoT devices.The architecture of the middleware is depicted in Figure 1 and  detailed below: 1.
IoT Data and Event Generation Module: Constitutes the data and event sources in the IoT system, which may be varied, i.e., diverse types, capabilities, protocols, formats, locations, and domains.

2.
Semantic Annotation Component: Processes and attributes common vocabulary and structure to data and events sourced from the IoT module.It leverages a semantic annotation model for defining concepts, properties, and relationships pertinent to the IoT domain, with an annotation tool that either automatically or semi-automatically assigns semantic annotations.

3.
Data Security and Encryption Module: This component, placed after the Semantic Annotation Component, ensures the secure transition of data to the blockchain network.It handles encryption and decryption of data and events, authentication and authorisation of users and applications, and digital signing and verification of semantic operations.It interacts with the Blockchain Component to secure the storage and retrieval of semantic annotations.It also optimises lightweight and efficient security protocols and algorithms compatible with the IoT environment.

4.
Blockchain Component: Stores and manages semantic annotations in a distributed, decentralised manner.It incorporates a blockchain network comprising multiple nodes that maintain a shared ledger of transactions, with each transaction containing a verified and validated semantic annotation.It ensures the security, integrity, and trustworthiness of the semantic annotations, mitigating unauthorised access and modification.

5.
Semantic Query Module: Enables discovery and access of blockchain-stored semantic annotations, utilising a semantic query language for expressing user and application information needs based on concepts, properties, and relations defined in the semantic annotation model.It uses a semantic query engine to process and execute semantic queries over the blockchain network and return relevant results.6.
Semantic Analysis and Inference Engine: Analyses and infers from semantic annotations stored on the blockchain network, utilising logical rules and algorithms to derive new knowledge and insights from the semantic annotations.It employs AI and deep learning techniques to enhance reasoning capabilities and tackle complex and uncertain scenarios.7.
Service Integration and Composition Module: Facilitates integration and composition of IoT services and applications based on the semantic annotations in the blockchain network.It uses a semantic service model to define functionalities, inputs, outputs, and parameters of various IoT services and applications.It also uses a semantic service discovery and composition tool to automatically or semi-automatically find and combine suitable IoT services and applications according to their semantic descriptions and user requirements.

8.
Semantic Visualisation Component: Provides graphical and interactive representations of semantic annotations, queries, reasoning, and services.9.

AI Feedback and Analysis Module:
This module acts as a central entity in the middleware that harnesses the power of artificial intelligence and machine learning techniques to enhance the system's capabilities.It primarily focuses on gathering and analysing user feedback and system events, ensuring that all insights are accounted for to refine the quality and usability of semantic annotations, queries, reasoning, and services.

Design Decisions: Incorporating AI into the Middleware
The introduction of AI into the middleware represents a significant advancement in the domain of IoT and Health Informatics, and our approach required a great deal of thought and deliberation.We considered three options: Each of these options had its unique merits and drawbacks.Option 1, introducing an AI module within the middleware, offered a fine balance between enhancing the system with AI capabilities and preserving the decentralised nature of the middleware.This approach ensured centralised control over the AI functionality while maintaining the autonomy of other middleware modules.It aligned with the principle of separation of concerns, allowing each component to concentrate on its core functionalities while the dedicated AI module focuses on enhancing system-wide performance and learning.It further streamlined the management, up-gradation, and scalability of the AI capabilities.However, the integration of a new module required significant architectural modifications.
Option 2, integrating AI within each component, promised a high level of tailored intelligence and autonomy for each component.This could result in more context-aware and adaptive functionality.Nevertheless, the downside was the potential for redundant AI development across components and the necessity for stringent oversight to maintain consistency and reliability of AI decisionmaking.Option 3, a centralised AI entity, provided a singular point of control and consistency for AI functions.However, this approach contradicted the inherently decentralised nature of IoT middleware and could limit the autonomy of individual components and devices.
Upon weighing the pros and cons, we concluded that Option 1-introducing a dedicated AI module within the middleware-was the most suitable approach for our requirements.This decision hinged upon the need for centralised AI capability while still respecting the decentralised architecture of the middleware.The new AI module allows for continuous learning and improvement, thereby enhancing the overall middleware performance and usability.

The AI Feedback and Analysis Module: Role in the Middleware
The AI Feedback and Analysis Module is a crucial entity embedded at the core of SIM.It employs advanced machine learning techniques, including deep learning, natural language processing, clustering and classification, and reinforcement learning to improve the middleware's functional capabilities and efficiency.A schematic representation of this module is presented in Figure 3.The module primarily focuses on 1. Enhanced Insights Analysis for Service Integration: The AI module derives complex insights from data by utilising deep learning algorithms, which enriches the Service Integration and Composition Module's ability to manage services effectively.
2. Semantic Annotation Suggestions: Utilising natural language processing techniques, the module discerns the context and semantics of diverse IoT devices and data, thereby suggesting more accurate and relevant semantic annotations and enhancing the data processing efficiency.
3. Event and Feedback Analysis: Using clustering and classification techniques, the AI module comprehensively analyses system events and users' feedback, leading to the identification of patterns, trends, and potential areas for improvement.
4. Adaptive Learning and Continuous Improvement: Through reinforcement learning, the AI module perpetually learns and adapts from its interactions and experiences.It consistently monitors and assesses system operations, user feedback, and interactions, detecting potential areas for improvement and suggesting proactive modifications to other middleware components.This keeps the middleware adaptive, optimised, and user-centric.

Ensuring Robust Security in the Middleware
Healthcare data, by their highly personal and sensitive nature, necessitate top-tier security measures.The Semantic IoT Middleware (SIM), as shown in Figure 1, integrates a layered approach to safeguarding data integrity and privacy, especially when transiting through different components.Figure 4 further details the security capabilities of SIM, mainly the following: • Data Annotation: The process begins with the acquisition of raw health data and events from IoT devices.These raw data may include information such as the vital signs of a patient or specific medical events.Initially, interfacing with the Semantic Annotation Module occurs, where the data are labelled with relevant annotations, adding semantic context to raw readings.This step not only aids in data understandability but also prepares them for subsequent security measures.

Use Case Study: Alice's Healthcare Monitoring
This section illustrates the operation of the Semantic IoT Middleware (SIM) through a practical scenario involving Alice, an elderly lady with a history of high blood pressure and previous cardiac events.Alice utilises an array of IoT healthcare devices, including a heart rate monitor, blood pressure monitor, and temperature sensor to monitor her health conditions continuously.These devices are crucial for tracking her health and alerting her and her healthcare providers to potential risks or emergencies.The scenario also includes Bob, the provider of a health monitoring app; Charlie, part of an emergency response team; and Dave, a healthcare data analyst.

Scenario Overview
Alice's array of IoT healthcare devices are integral to her daily health monitoring routine.They capture critical data, such as heart rate, blood pressure, and body temperature.These data are then fed into SIM's Semantic Annotation Component, which utilises a specialised model to assign relevant medical terminology to the raw data, thereby transforming them into structured, semantically annotated data.For example, heart rate readings might be categorised as "Normal", "Elevated", or "Low" based on predetermined thresholds.
Upon structuring and annotating Alice's data, the Data Security and Encryption Module encrypts the information, ensuring its secure transfer to the Blockchain Component.The data are securely stored and safeguarded against unauthorised access and modification.When Bob's health monitoring app needs to access Alice's data, it queries the Semantic Query Module.This module utilises a semantic query language to search through the blockchain-stored annotations and retrieve relevant data.For example, it could search for instances where Alice's annotated heart rate data are labelled as "Elevated".
The Semantic Analysis and Inference Engine analyses the queried data.For example, continuous instances of "Elevated" heart rate could indicate a potential cardiac risk.
Insights derived by the analysis engine are then transferred to the Service Integration and Composition Module.This module composes relevant services based on these insights.
For instance, if a potential cardiac risk is detected in Alice's health data, it could trigger a service to alert Charlie's emergency response team and send a health advisory to Alice via Bob's app.
All the information, from the original data to the final insights and services, is visualised using the Semantic Visualisation Component.This allows Alice, Bob, Charlie, and Dave to visually monitor the process and outcome, aiding in understanding and decision-making.
Lastly, the AI Feedback and Analysis Module captures user feedback and preferences to improve the system.Feedback from Alice about her health monitoring experience, Bob on the app's performance, Charlie on emergency response efficiency, and Dave on data analysis is collected and used to refine the system's operations.Figure 5 shows a diagram illustrating how Alice's data are collected and processed by different modules of SIM to deliver services or trigger alerts.
To further demonstrate the practical application of SIM in this scenario, several algorithms that act as a blueprint for implementation are provided.These algorithms outline the step-by-step operational workflow of SIM, showcasing its functionality.Algorithm 1 provides a high-level pseudo-code for SIM implementation for healthcare monitoring.
Figure 6 provides a visual guide to the system's process, from data collection to continuous monitoring and improvement.-Perform Blockchain Verification

The Pivotal Role of Blockchain in Augmenting the Security of SIM
Blockchain serves as a key security feature of the Semantic IoT Middleware, providing an immutable, transparent, and secure mechanism for handling health data.This critical role is depicted in the sequence diagram presented in Figure 8.The data flow within this sequence diagram is as follows: 1.
Data Generation and Initial Processing: The starting point involves the generation of data from IoT devices and end users.These raw data, which might be derived from sources like Alice's health monitors, first interface with the 'Data Security and Encryption Module'.This module ensures that the data undergo essential encryption and security measures before moving forward.

2.
Blockchain Interaction: Upon being secured and annotated, the health data are relayed to the 'Blockchain Component'.Each data entry, conceptualised as a transaction within the blockchain, undergoes rigorous verification.Subsequently, a new block, exclusive to the specific data (e.g., Alice's health readings), is formulated.

3.
Achieving Consensus: Before its incorporation into the blockchain, the new block must be unanimously validated by nodes within the network.This is achieved through the 'Consensus Mechanism'.Once consensus is ascertained, the blocknow housing Alice's secured health data-joins the chain, offering permanence and tamper resistance.

4.
Integration with Data Processing: After its blockchain journey, the health data find themselves in the 'Integrated Data Processing Module'.This module acts as a nexus, facilitating the retrieval of blockchain-secured health data for diverse applications or analytics.5.
Data Access Protocols for End Users: Encompassed within the 'end users' label are distinct entities such as Dave, a healthcare data analyst, and the Emergency Services.Their interactions with the blockchain are driven by specific requirements.For instance, Dave's analysis endeavours may demand a deep dive into Alice's health data, while Emergency Services might seek swift access to Alice's records during exigent situations.Smart contracts provision these interactions, ensuring that data access occurs only after thorough authentication and following the predefined contractual conditions.After detailing the pivotal role of blockchain in enhancing the security of SIM, it becomes essential to understand the specific processes involved in this security enhancement.To this end, a supplementary algorithm that details the operational steps of the Blockchain Component within SIM is provided in Algorithm 2. Algorithm 2 provides a comprehensive view of how data from IoT devices, such as Alice's health monitors, are securely processed, verified, and stored within the blockchain framework.It also details the mechanisms for data retrieval and access by authorised entities, such as healthcare data analysts and emergency services.
Algorithm 2 shows the sequence of operations, starting from the initial encryption and semantic annotation of the data through its journey in the blockchain network, including transaction creation, data verification, block addition, and consensus achievement.Furthermore, it highlights how data integrity and security are maintained during access requests, ensuring that only authenticated and authorised requests are processed.It provides a practical guide on how these security measures can be implemented in real-world scenarios.Building upon the enhanced security features of SIM, a comparative analysis with other middleware solutions in IoT and smart city applications is presented in Table 3.This analysis highlights the unique security implementations of SIM, including its blockchain integration and encryption techniques, in comparison with other recently proposed platforms in the literature.

Applying the AI Feedback and Analysis Module to the Scenario
The AI Feedback and Analysis Module, depicted in Figure 9, plays a crucial role in improving the middleware's functioning based on feedback from stakeholders.It employs a variety of machine learning techniques, such as deep learning, natural language processing (NLP), clustering and classification, and reinforcement learning (RL), to analyse the feedback and historical data, identify patterns, and suggest improvements.
Consider a case where Dave, the healthcare data analyst, identifies a pattern where high heart rate alerts are being triggered during Alice's exercise routines, a known situation that does not require immediate alerts.Dave provides feedback to the system, suggesting that Alice's physical activities should be taken into account alongside her heart rate before triggering an alert.
This feedback is sent to the AI Feedback and Analysis Module.In this module, deep learning algorithms evaluate the feedback alongside historical data and infer that certain high heart rate events correlate with exercise sessions.It suggests that the middleware should contextualise heart rate readings with concurrent physical activity data, thereby helping to reduce unnecessary alerts.
Meanwhile, NLP techniques are employed to improve the semantic annotations.Based on the feedback and the identified pattern, NLP suggests refining the annotation schemes of physical activity data to better capture the context in future data collection.On the other hand, clustering and classification techniques help to recognise patterns and classify them into actionable segments.This makes the feedback more structured and insightful.
RL methods foster adaptive learning and continuous improvement.They incentivise the system to adjust its decisionmaking strategy based on the feedback and the outcome of the decisions made, leading to a system that evolves and improves its alert generation strategy.
Incorporating the insights and suggestions from the AI Feedback and Analysis Module, the Service Integration and Composition Module adjusts the alert system to consider Alice's physical activities in its decisionmaking process.The Semantic Annotation Module improves its annotation schemes as suggested, and the Semantic Analysis and Inference Engine incorporates the enhanced data for its analyses.
Lastly, the Semantic Visualisation Component reflects these updates, and Dave is informed that the alert system now considers physical activities in its analysis.Through this AI-driven iterative process, the system becomes smarter, providing more context-aware and personalised services to its users.To demonstrate the operation of the AI Feedback and Analysis Module within the scenario discussed in this work, Algorithm 3 is provided.It delineates the module's process of responding to and incorporating user feedback for system enhancement.Algorithm 3 illustrates how the module employs advanced machine learning techniques, including deep learning, natural language processing (NLP), clustering and classification, and reinforcement learning (RL), to analyse user feedback, similar to the insights provided by Dave.It details the iterative process of identifying patterns, refining semantic annotations, and making informed adjustments to the system.It also details the steps within the rest of the SIM modules, such as the analysis of feedback content, the application of various AI techniques for pattern recognition and contextual understanding, and the subsequent system modifications based on the insights gained.This includes adjustments in the alert mechanisms to consider factors like physical activities during heart rate monitoring.Algorithm 3 also shows how these refinements are integrated back into the system, culminating in enhanced personalised user experiences.-Visualise the updated alert system 34: Notify Users of System Improvements -The Semantic Annotation Component labels these metrics, identifying periods of rest, activity, and potential health anomalies.-The AI Feedback and Analysis Module processes Alice's feedback on false alerts during exercise, adjusting the context recognition algorithms to differentiate between exercise-induced and health-related heart rate elevations.
-The Blockchain Component securely logs these adjustments for consistent application in future scenarios.
Outcome: Alice receives personalised health monitoring that intelligently distinguishes between her physical activity and genuine health risks, enhancing her trust in the system's alerts.

Smart Home Management: Tailored Environmental Preferences
Architecture Utilisation: -The system learns Emma's preferences for lighting, temperature, and security settings through continuous interaction with the IoT Data and Event Generation Module.-The Semantic Analysis and Inference Engine integrates these preferences to automate home environment adjustments based on time of day, weather conditions, and Emma's presence in the home.-The Service Integration and Composition Module synchronises various smart devices to create a seamless living experience.-Emma uses the Semantic Visualisation Component to easily adjust settings or override automated controls.
Outcome: Emma's home environment dynamically responds to her lifestyle, providing comfort and efficiency without the need for constant manual adjustments.

Agricultural Monitoring: Precision Agriculture for Enhanced Crop Yield
Architecture Utilisation: -Lucas's farm sensors track soil moisture levels, weather patterns, and plant growth, feeding data into the Semantic Annotation Component.-The Semantic Analysis and Inference Engine analyses these data to predict optimal watering times, fertiliser needs, and potential pest threats.-The AI Feedback and Analysis Module refines predictions based on Lucas's input and observed crop responses, enhancing the accuracy of future recommendations.-Data integrity and history are maintained in the Blockchain Component for long-term agricultural planning.
Outcome: Lucas benefits from data-driven insights that improve his crop yields and resource efficiency, demonstrating SIM's capability to support advanced agricultural practices.

Industrial Automation: Proactive Anomaly Detection and Resolution
Architecture Utilisation: -Factory sensors monitor machine performance, sending real-time data to the IoT Data and Event Generation Module.-The Semantic Annotation Component contextualises these data, identifying patterns indicative of potential equipment failures.-The AI Feedback and Analysis Module uses historical data and Aisha's feedback to refine its predictive maintenance algorithms, reducing false alarms and pinpointing actual issues.-The Blockchain Component ensures an immutable record of machine performance and maintenance actions for compliance and auditing purposes.
Outcome: Aisha oversees a more efficient and proactive maintenance regime, with SIM providing critical insights to address manufacturing issues, enhancing productivity, and reducing downtime preemptively.
These scenarios illustrate the depth and practicality of SIM's user-centric approach, leveraging its comprehensive architecture to deliver tailored, intelligent, and efficient experiences across diverse domains.

Proof of Concept: Demonstrating SIM in Action
This section presents a proof of concept of the Semantic IoT Middleware (SIM) developed in Python, specifically designed to demonstrate the feasibility of the core functionalities proposed for SIM within the context of the practical scenario described in Section 4. The proof of concept specifically implements Alice's scenario, showcasing how SIM can be applied to real-world health monitoring and decision making processes.As a reminder, the scenario involves Alice, an elderly lady with health conditions, using various IoT healthcare devices to monitor her health, supported by healthcare providers and other stakeholders for timely and effective health management.To bridge the theoretical concepts with this practical application, certain assumptions and abstractions were made for simplicity and to focus on conceptual demonstrations: • Assumptions: The sensor data for heart rate, blood pressure, and temperature are simulated using random number generation within typical physiological ranges, with thresholds set for elevated levels.• Abstractions: Complex functionalities such as blockchain transactions are abstracted with placeholders, allowing us to focus on the data flow and logic rather than the technical specifics of these operations.
The codebase for this proof of concept includes scripts for generating sensor data, annotating these data semantically, simulating encryption for secure data transfer, and representing blockchain operations.These elements directly correspond to the IoT Data and Event Generation Module, Semantic Annotation Component, Data Security and Encryption Module, and Blockchain Component of the SIM architecture, respectively.By simulating these operations, we demonstrate how each module of SIM could interact and process data in the context of Alice's scenario, providing a holistic and secure health monitoring solution.
A key aspect of the proof of concept is the simulation of the AI Feedback and Analysis Module, which showcases SIM's adaptability and potential for real-world application by adjusting the system based on user feedback.This reflects the scenario's need for a responsive system that can adapt to Alice's unique health condition and requirements.
Selected code snippets from the proof of concept implementation are provided below to illustrate these interactions: Generating Sensor Data: The code snippet provided in Listing 1 simulates the generation of health data from IoT devices, akin to those used by Alice for monitoring her heart rate, blood pressure, and temperature.These data are the starting point in the SIM architecture, feeding into subsequent modules for processing and analysis.Listing 2 shows the code for annotating sensor data.The heart rate readings are annotated as normal, elevated, or low, as shown in Figure 10." h e a r t _ r a t e _ s t a t u s " : heart_rate_status , 11 " bl ood_pr essure " : reading [ " bloo d_pres sure " ] , 12 " b l o o d _ p r e s s u r e _ s t a t u s " : blood_pressure_status , 13 " temperature " : reading [ " temperature " ] ,

14
" t e m p e r a t u r e _ s t a t u s " : t e m p e r a t u r e _ s t a t u s These data serve as a basis for the subsequent annotation, encryption, and blockchain operations, all crucial for the secure and effective management of health data within the SIM framework.By simulating these operations, we demonstrate how SIM handles and processes data and also show its alignment with the needs and complexities of real-world health monitoring scenarios like Alice's.

Simulating Data Encryption:
The encryption process is a critical aspect of data security in SIM.It ensures that Alice's health data remains confidential and secure during transit and storage.The code shown in Listing 3 simulates the encryption of data using the SHA-256 hashing algorithm, reflecting the type of cryptographic operations that would be performed in the Data Security and Encryption Module.These simulations represent how SIM's Data Security and Encryption Module and Blockchain Component work together to maintain the security, integrity, and utility of health data.By encrypting the data and utilising smart contracts for data management, SIM ensures that health data like Alice's are handled securely and efficiently, facilitating trustworthy and responsive health monitoring and care.These operations, while simplified in the proof of concept, illustrate the potential of SIM to provide a robust framework for health data security and management in real-world applications.

Simulating Third-Party Interactions
Within the comprehensive SIM framework, third-party interactions are facilitated by the Service Integration and Composition Module, Semantic Visualisation Component, and AI Feedback and Analysis Module.These modules ensure that the system not only captures and processes data efficiently but also interacts seamlessly with healthcare providers and patients.Here, we simulate potential interactions to demonstrate SIM's versatility and user-centric design.

Healthcare Provider Interaction
Healthcare providers, pivotal in the healthcare ecosystem, interact with the system to monitor and manage patient health.SIM facilitates these interactions by providing streamlined access to processed health data and relevant alerts.The code provided in Listing 5 simulates a healthcare provider's interaction with the system, focusing on how they might review and respond to elevated health alerts, reflecting the real-time decisionmaking facilitated by SIM.
This simulation underlines how the Service Integration and Composition Module and Semantic Visualisation Component could empower healthcare providers to make informed decisions, tailoring care to individual patient needs.
Listing 5. Healthcare Provider Interaction Function.
1 def h e a l t h c a r e _ p r o v i d e r _ i n t e r a c t i o n ( e le va t ed _a l er ts ) : 2 for alert in e l ev at e d_ a le rt s : Incorporating patient feedback into the health monitoring process, as simulated above, showcases the potential of the AI Feedback and Analysis Module to adapt and refine the system continuously.It ensures that the health monitoring is as accurate and personalised as possible, aligning with SIM's goal of providing a user-centric and adaptable healthcare monitoring solution.

Adaptability Through AI Feedback and Analysis
The AI Feedback and Analysis Module processes user input to continuously refine and personalise the health monitoring experience.This adaptability is demonstrated through a proof of concept visualisation, illustrating how SIM responds to user feedback regarding exercise context, thereby adjusting health data categorisation.
Figure 12 effectively demonstrates the dynamic nature of SIM.Initially, without considering exercise context, some health readings might be misinterpreted as anomalies or signs of distress (indicated by the blue bars).However, after integrating user feedback about recent physical activities, the system adjusts its interpretation (indicated by the orange bars).This leads to a more accurate representation of the user's health status, avoiding unnecessary alerts and providing a tailored health monitoring experience.
This ability to incorporate and respond to user feedback is pivotal, especially in diverse and variable scenarios such as health monitoring.It allows SIM to adapt over time, improving its accuracy and effectiveness.The AI Feedback and Analysis Module does not just adjust parameters but learns from interactions, ensuring that the system evolves to meet the unique needs of each user.

Performance Considerations and Typical Delays
One of the critical aspects of middleware systems, especially those dealing with realtime data and decisionmaking like SIM, is performance.Performance in such systems is often gauged by their latency, or the time taken to complete an operation, from data acquisition to actionable insights.In the context of SIM, several operations could introduce delays: The justifications for these assessments are based on the anticipated performance metrics derived from the design and features of SIM.However, it is critical to note that these theoretical evaluations are conjectural and await empirical validation.The insights gained from this comparison set the stage for more extensive research, where SIM can be empirically tested and validated.
This groundwork is the beginning of a broader journey to transform SIM from a theoretical model into a practical and impactful IoT system.Moving forward, our focus is on empirical research to validate the effectiveness of SIM in real-world applications, crucial for its evolution and implementation.
SIM exhibits key benefits, such as enhanced interoperability through its semantic approach, intelligent decisionmaking via AI integration, and robust security and data integrity through Blockchain Components.These features distinguish SIM from traditional, cloud-centric, and edge-centric middleware, each having its own merits in isolated metrics.Our comparative analysis underscores SIM's comprehensive strengths, marking it as a promising choice for modern and future IoT applications.

Challenges and Advancements in SIM
Although the Semantic IoT Middleware (SIM) demonstrates numerous potential benefits for robust and efficient data sharing in distributed systems, several challenges are apparent that must be addressed in future research.

Computational Complexity and Storage Requirements
One significant limitation of SIM involves the computational complexity and storage requirements associated with the integration of semantic annotation and blockchain technology.Semantic annotation, which involves attaching metadata to content, is a highcomplexity and computing-intensive task, particularly when automated using natural language processing and machine learning techniques.Blockchain technology, known for transaction processing and smart contracts, adds to this complexity, especially in decentralised implementations.
Implications: These complexities can impact the scalability and real-time processing capabilities of SIM.High computational demands may hinder the system's ability to efficiently process large volumes of data or operate effectively in resource-constrained environments.
Potential Solutions: To mitigate these challenges, future research could explore more efficient data processing algorithms, the utilization of cloud-based services for storage, or implementing hybrid blockchain models to balance decentralization with computational efficiency.
Contextualization Within Design: Within the overall design of SIM, these complexities are crucial considerations.The middleware's architecture may need to incorporate mechanisms to manage these demands effectively, ensuring that the system remains both functional and efficient.
An analysis of these complexities is summarised in Table 5, providing a theoretical assessment based on existing literature and design considerations.While the Semantic IoT Middleware (SIM) offers robust data management capabilities, there are inherent concerns related to data privacy and security.The semantic layer, although beneficial for understanding and managing data, might inadvertently expose sensitive data patterns or semantics to unauthorised entities.To address these risks, it is crucial to implement advanced encryption and obfuscation techniques tailored for semantic data.
Our previous work, "BioChainReward: A Secure and Incentivised Blockchain Framework for Biomedical Data Sharing", presents a relevant and effective approach to this challenge.In the BCR framework, we developed an AI-enabled privacy Preservation sublayer, which automatically manages the selection and application of privacy-preserving techniques based on the data transaction context.
This sublayer incorporates methods such as anonymisation, obfuscation, and differential privacy.Anonymisation is used when the data request does not require personally identifiable information (PII), obfuscation is applied to protect specific sensitive attributes in the data, and differential privacy is utilised for generating statistical summaries while maintaining individual privacy.The selection of these methods is driven by an AI system, which uses decision trees to determine the most suitable privacy technique based on the data request's purpose, data sensitivity, and the data owner's privacy preferences.
Incorporating a similar AI-enabled privacy preservation approach into SIM could significantly enhance its data security capabilities.The AI-driven decision-making process from BCR can be adapted to the needs of SIM, ensuring that data privacy is maintained without compromising the utility of the data.Moreover, this approach aligns with SIM's focus on user-centric operations as it allows for customisable privacy settings based on user preferences.

Other Challenges
Beyond computational and storage challenges, other limitations include: Semantic Layer Maintenance: The comprehensive creation and maintenance of semantic representations, especially in rapidly evolving domains, might prove challenging.
Standardisation Adoption: The middleware's effectiveness depends on the widespread adoption of standardised semantic representations, where inconsistency can dilute potential benefits.
Legacy System Integration: While SIM endeavours to bridge the gap between different data sharing systems, integrating legacy systems that do not support semantic technologies remains a challenge, requiring the development of adaptors or translators.

Conclusions
This study introduced SIM, a novel semantic-, blockchain-, and IoT-based middleware designed to enhance the secure sharing of medical data with third-party entities.The architecture's pivotal Semantic Annotation Component transcends traditional data representation, powering raw data with a structured, universally understood vocabulary that encompasses the vast heterogeneity intrinsic to IoT environments.
SIM seamlessly integrates nine elements, combining the benefits of blockchain technology, AI, and IoT.These elements synergistically deliver secure data sharing, real-time annotation, adaptive access controls, and profound insights drawn from semantic analysis.
A key feature of SIM is the integration of cutting-edge cryptographic measures within the Data Security and Encryption Module.This module, interacting closely with the Blockchain Component, guarantees the secure transition of semantically annotated data to a blockchain network, addressing multifaceted concerns in data security and user privacy.
Moreover, the incorporation of an AI Feedback and Analysis Module harnesses artificial intelligence capabilities in order to continually refine the middleware, ensuring holistic enhancement of the middleware based on user feedback and intricate system event analyses.

Figure 1 .
Figure 1.SIM architecture showing all its modules and components.

Figure 2
Figure 2 presents additional insights into the Service Integration and Composition process within SIM.Using an example, it presents a step-by-step workflow of how user requirements lead to the discovery, assessment, selection, and integration of services.

Figure 2 .
Figure 2. Service Integration and Composition workflow of SIM.

• Option 1 :
Introducing a dedicated AI module within the middleware.• Option 2: Embedding AI functionalities within each component of the middleware.• Option 3: Implementing a standalone centralised AI entity outside the middleware.

Figure 3 .
Figure 3.The AI Feedback and Analysis Module.This image shows all the different machine learning techniques this module uses.
• Encryption and Authentication: Once annotated, the health data enter the Data Security and Encryption Module.This pivotal component encompasses multiple operations to ensure data confidentiality and integrity: -Encryption: Using the AES-256 encryption standard.-Device Authentication: Through Public Key Infrastructure (PKI), the module authenticates the source IoT device.This ensures that the data origin is legitimate and trustworthy.-Authorisation: Role-Based Access Control (RBAC) is employed to authorise the data transfer, ensuring that only entities with the requisite permissions can access the data.-Digital Signing: The module employs the Elliptic Curve Digital Signature Algorithm (ECDSA) to digitally sign the data, ensuring the validity and integrity of the transmitted information.-Blockchain Integration: Once the health data are secured, they are passed on to the Blockchain Component.This step amalgamates the robustness of blockchain technology with the previously applied security protocols, granting the data an added layer of protection against tampering or unauthorised access.-Data Processing and Feedback: From the blockchain, the data progress to the Semantic Processing Component, where operations like query, analysis, and integration occur.The processed data are subsequently directed to the AI Feedback and Analysis Module.This AI-driven component harnesses the data to generate actionable insights or composed services, capitalising on rich, secured, and semantically rich information.

Figure 6 .
Figure 6.Flowchart illustrating the practical application of SIM in the scenario.

Figure 7
Figure7provides a detailed view of the operations within each module of the middleware.It highlights the role of the AI Feedback and Analysis Module in refining system performance and demonstrates the sophisticated process that transforms Alice's raw data into actionable insights and services.

Figure 7 .
Figure 7. Detailed semantic annotations sequence diagram showing the ML-enhanced insights.

Figure 8 .
Figure 8. Sequence diagram depicting the generic data flow.It shows how annotated health data are passed from the data security module to the blockchain module and the operations involved in this module.

Figure 9 .
Figure 9. AI module: usage of deep learning to implement adjustment for alerts.

35 :-
Inform Dave about the enhanced alert mechanism 36: Repeat Process for Continuous Learning and System Evolution 4.4.User-Centric Operations in SIM: Beyond Healthcare This section details the user-centric characteristics of SIM through multiple scenarios.4.4.1.Healthcare Scenario: Context-Aware Health MonitoringArchitecture Utilisation: -The IoT Data and Event Generation Module captures detailed health metrics from Alice, including heart rate, blood pressure, and activity levels.

Listing 1 . 2 # 11 " 12 "
Generate Sensor Data Function.1 def g e n e r a t e _ s e n s o r _ d a t a ( num_readings ) : Code to generate random sensor data 3 data = [] 4 for _ in range ( num_readings ) : 5 heart_rate = random .randint (55 , 120) 6 b l o o d _ p r e s s u r e _ s y s t o l i c = random .randint (85 , 140) 7 b l o o d _ p r e s s u r e _ d i a s t o l i c = random .randint (55 , 90) 8 temperature = round ( random .uniform (36.0 , 39.0) , 1) bl ood_pr essure " : ( blood_pressure_systolic , b l o o d _ p r e s s u r e _ d i a s t o l i c ) ,

Listing 2 . 2 #
Annotate Sensor Data. 1 def a n n o t a t e _ s e n s o r _ d a t a ( sensor_data ) : Code to annotate sensor data 3 anno tated_ data = [] 4 for reading in sensor_data :

Figure 11
Figure 11  depicts the distribution of heart rate readings generated by the above script, illustrating the type of data that SIM would handle and analyse in real-world applications.

Figure 11 .
Figure 11.Distribution of heart rate readings.

Listing 3 . 2 #
Encryption using SHA-256. 1 def encrypt_data ( data ) : Code to simulate encryption using SHA -256 hash 3 data_string = json .dumps ( data ) .encode () 4 hash_object = hashlib .sha256 ( data_string ) 5 return hash_object .hexdigest ()Simulating Smart Contract Execution: After securing the data, SIM interacts with the Blockchain Component to ensure the integrity and traceability of health data.The following snippet shown in Listing 4 simulates the execution of a smart contract within a blockchain network, an operation that would handle tasks such as verifying health data transactions or alerting healthcare providers in response to specific conditions detected in the data.

Listing 4 . 2 #
Smart Contract Function. 1 def e x e c u t e _ s m a r t _ c o n t r a c t ( en crypte d_data ) : Code to simulate smart contract execution 3 if 'a ' in enc rypted _data : # Placeholder condition 4 print ( " Smart Contract : Alert sent to healthcare provider ." ) Contract : No action needed ." )

Figure 12 .
Figure 12.Combined visualisation of annotation categorisation before and after AI feedback.The blue bars represent the initial annotations, and the orange bars show the updated annotations following feedback, demonstrating the system's adaptability to incorporate user-provided exercise context.

Table 1 .
Comparative analysis of advanced features in IoT middleware architectures.

Table 2 .
Comparison of commercial-grade IoT middleware solutions.

Algorithm 2
Blockchain Role in SIM for Enhanced Security 8: Initialise Blockchain Component 9: for each piece of data from IoT devices do

Table 3 .
Comparative analysis of middleware solutions in IoT and smart city applications.
Algorithm 3 AI Feedback and Analysis in SIM for Personalised Healthcare 15: Initialise AI Feedback and Analysis Module 16: for each feedback instance from users do participation in their health monitoring is crucial for the efficacy of healthcare systems.SIM acknowledges this by incorporating mechanisms for patients to provide feedback and contextual information, which is crucial for accurate and personalised health monitoring.Listing 6 represents how patients might interact with SIM, providing context such as recent physical activities, which can significantly influence health data interpretation.
4return " Context updated in patient profile "

• Data Acquisition and Annotation:
The initial phase involves collecting data from IoT devices and annotating them.While generally fast, this process's speed can vary based on the complexity of the data and the efficiency of the annotation algorithms.User Control: Enhanced user control in SIM is anticipated owing to its Blockchain Component, which allows for transparent and user-controlled data access and manipulation.•Latency:The distributed nature of SIM, with processing completed closer to data sources, is expected to result in lower latency compared to other approaches that might rely on centralised processing.
• AI Feedback and Analysis: The AI-driven feedback and analysis module, while providing valuable adaptability and learning capabilities, can also introduce computational delays.The extent depends on the amount of data processed and the efficiency of the implementation.•

Table 5 .
Detailed assessment of computational complexity and storage in SIM.