3.1. System Architecture
The proposed system comprises several subsystems, including a backend cloud system, an Edge service, IoT nodes, and specific IOTA subsystems. Our proposed architecture shifts from a cloud-centric IoT system, where the Super nodes simply aggregate and push data to the cloud, to a node-centric system, where each Super node owns the data pushed into a distributed and decentralized database (i.e., the Tangle). The backend serves as a consumer of data and a provider of additional resources, such as an administration panel, analytics, and a data marketplace. Specifically, a narrowing-down approach was followed, commencing from the three-tier reference architecture proposed by Tranoris et al. [38], shown in Figure 8, which was modified accordingly to meet our needs.
Figure 9 illustrates the activity diagram of a typical IoT application that was used as a template for the implementation of our prototype [38]. The diagram was further simplified in order to keep the complexity of the prototype low: the semantic notation submodule was removed as it was deemed unnecessary. The MAM submodule is an addition to the generalized IoT architecture; it provides a service that receives the data and, based on the metadata, creates the appropriate MAM messages. This submodule is also responsible for connecting to the IOTA Full Node. The final architecture that was adopted is shown in Figure 10 and Figure 11. Process-wise, data originating from the sensors are decoded in the Edge through specific drivers. Afterwards, they are enriched with timestamp and node denotation metadata and saved. Local storage not only eliminates latency but, in conjunction with a rule engine, could enable local intelligence. The data are then transmitted to the Cloud, where they traverse the same subsystems, as shown in Figure 10.
As Figure 11 depicts, the cloud module also offers a data marketplace based on IOTA, as well as a dashboard for the IoT system owner. Essentially, these are activities built on top of the Open API. Various sub-modules were discarded as they increased the complexity of the system without contributing to the implementation’s purpose.
3.2. Envisioned Core Activities
The architecture is built around three key activities, namely the setup phase, the data log, and the data request, which encompass the key innovations of this design.
Setup Phase: During the setup phase, IoT nodes create the proper MAM channels in order to start broadcasting in the Tangle. Using a unique seed, the IoT node runs the createChannel(dimension) function twice. Note that the higher the dimension of the Merkle tree, the higher the memory requirement. The first channel is called status_channel; it has no endpoints other than the central one and is used to publish status messages of 1300 bytes (1300 ASCII characters), structured as follows:
status_message: [string node_id, integer geo_loc_lat,
integer geo_loc_alt, integer battery, string data_channel_chid,
string last_msgid, string chid_PSK, string data_description,
integer error_code]
node_id: Unique id of the IoT node
geo_loc_lat: geographic location latitude
geo_loc_alt: geographic location altitude
battery: Battery level between [0, 100]
data_channel_chid: string of trytes representing channel_id
last_msgid: string of trytes representing the message_id of the last message
chid_PSK: string of the pre-shared key required to unlock the data stream
data_description: short string representing the data
error_code: a unique integer that represents an error message (e.g., 103)
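As an illustration only, the field list above can be assembled into a serialized status message along the following lines; the helper name, the JSON serialization, and the input shape are our assumptions, not the authors' implementation:

```javascript
// Hypothetical helper: assemble a status_message from the fields listed above
// and enforce the 1300-byte ceiling dictated by the specification.
function buildStatusMessage(node) {
  const msg = {
    node_id: node.id,
    geo_loc_lat: node.lat,
    geo_loc_alt: node.alt,
    battery: Math.max(0, Math.min(100, node.battery)), // clamp to [0, 100]
    data_channel_chid: node.dataChid,
    last_msgid: node.lastMsgid,
    chid_PSK: node.psk,
    data_description: node.description,
    error_code: node.errorCode ?? 0, // 0 when no error is reported
  };
  const payload = JSON.stringify(msg);
  if (payload.length > 1300) {
    throw new Error("status_message exceeds 1300 bytes");
  }
  return payload;
}
```

The payload would then be encrypted (with NTRU, as described below) before being published on the status channel.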
Each message is encrypted with an NTRU scheme, using a public_key shared by the backend system. According to the above-described scheme, the IoT Node ID is publicly auditable, since it resides in the Tangle, but it is only readable by the parties that hold the appropriate private key; in our case, only the backend can read it. As such, everyone can verify its integrity and correctness, but only the backend can extract the sensitive information. When the channel exhausts the Merkle tree, the last message points to a newly generated status_channel_chid. It is important to underline that the IoT node is able to dynamically change the format of the offered data streams, by changing how the sensor data logs (SDLs) are encapsulated into different channels/endpoints and the frequency of the PSK change. All the changes are eventually mirrored in the status messages. In case a sensor needs to be added to the system, the setup phase is initiated through the functions in setup_driver.js, and its data are stored through Mongoose in the MongoDB database.
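The channel-rollover behavior described above can be sketched as follows. This is a minimal model, not the MAM library API: createChannel, the index counter, and the pointer field name are all hypothetical, assuming a Merkle tree of 2^dimension leaves:

```javascript
// Hypothetical model of a MAM channel: 81-tryte chid, 2^dimension message slots.
function createChannel(dimension) {
  const chid = [...Array(81)]
    .map(() => "9ABCDEFGHIJKLMNOPQRSTUVWXYZ"[Math.floor(Math.random() * 27)])
    .join("");
  return { dimension, chid, index: 0 };
}

// Publish a status message; when the Merkle tree is about to be exhausted,
// the last message carries a pointer to a newly generated channel.
function publishStatus(channel, message) {
  const capacity = 2 ** channel.dimension;
  if (channel.index === capacity - 1) {
    const next = createChannel(channel.dimension);
    message.status_channel_chid = next.chid; // pointer to the new channel
    channel.index += 1;
    return { message, next };
  }
  channel.index += 1;
  return { message, next: null };
}
```

The trade-off noted in the text follows directly: a larger dimension defers rollover but raises the memory needed to hold the Merkle tree.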
Data Log Phase: The IoT node services are activated on an event basis and read data from the sensors; these data are ultimately stored in the Tangle. Before that, the data are enriched with metadata (e.g., sensor id, type, etc.) and stored for local use (e.g., local decision-making). The MAM service then, adhering to the business logic, encapsulates the data into data streams. In case a sensor data log (SDL) is divided into multiple data streams, these are referred to as data log segments. Afterwards, the MAM service creates the proper MAM Bundle, in which the header and packet transactions are placed and hashed. The service then sends the MAM Bundle to an IOTA Full node in order to issue it to the network. Finally, the MAM service updates the PSK and data_channel_chid fields of the status_message accordingly.
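The enrichment and segmentation steps above can be sketched as follows; the function names and the 65-SDL default (65 SDLs of 20 bytes filling the 1300-byte MAM payload) are illustrative assumptions, not the authors' code:

```javascript
// Enrich a raw sensor reading with node and timing metadata before local storage.
function enrich(reading, nodeId) {
  return { ...reading, node_id: nodeId, timestamp: Date.now() };
}

// Split a batch of SDLs into data log segments of at most `perMessage` SDLs,
// one segment per MAM message.
function segment(sdls, perMessage = 65) {
  const segments = [];
  for (let i = 0; i < sdls.length; i += perMessage) {
    segments.push(sdls.slice(i, i + perMessage));
  }
  return segments;
}
```

Each resulting segment would then be placed into a MAM Bundle and issued to an IOTA Full node.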
Each data stream can be viewed as a time series of SDLs at corresponding time instances. Likewise, as the PSK changes every m SDLs (m ≥ 1), the SDL time series gives rise to a PSK time series and a data channel chid time series. In the proposed data configuration, in n SDLs there are m PSKs (m ≤ n) and z chids (z ≤ m), since the chid changes every d PSK messages, where d is an arbitrary integer, as mentioned in Section 1.4
. All the above are illustrated in Figure 12: Figure 12a describes the state machine of the data log phase, Figure 12b illustrates the required activities, while Figure 12c showcases the sequence of the messages between the actors (users and services).
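Under the assumption stated above (PSK rotating every m SDLs, chid rotating every d PSKs), the key and channel counts can be sketched with a small helper; the function is purely illustrative:

```javascript
// Hypothetical counting helper: given n SDLs, a PSK change every m SDLs,
// and a chid change every d PSKs, return how many PSKs and chids are produced.
function keyRotationCounts(n, m, d) {
  const psks = Math.ceil(n / m);     // one PSK per m SDLs
  const chids = Math.ceil(psks / d); // one chid per d PSKs
  return { psks, chids };
}
```

For example, 1000 SDLs with a PSK change every 10 SDLs and a chid change every 5 PSKs yield 100 PSKs and 20 chids, consistent with the ordering PSKs ≤ SDLs and chids ≤ PSKs.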
Data Request Activity:
During the data request phase, a user accesses the marketplace, creates an account and, by applying appropriate filters (e.g., data types, geolocation, etc.), reveals data streams on the map. The user can purchase data in chunks of m SDLs, selecting from specific streams, where m results from the PSK change frequency of each data stream. The user can either purchase access to data already in the Tangle or pre-purchase access to future SDLs by subscribing to a data stream for a specific number of SDL chunks. After the user selects the appropriate data stream and accepts the exchange of tokens, the backend generates an IOTA address and an OAuth token by using a uniquely generated seed [39]. Consequently, the user receives the IOTA address to deposit the requested amount, with the OAuth token in the data_field. Upon verification of the transfer, the backend searches the database for the appropriate msgid combinations, which are used to find the transactions in the Tangle, fetches the Bundles, and decrypts the data. Data are served through the web interface to the end user, as the corresponding diagrams depict. The above-mentioned service is also offered through a REST API that can be used in lieu of the web interface. Figure 13a showcases the state machine of the data request phase, Figure 13b depicts the required activities, while Figure 13c illustrates the sequence of the messages between the actors (users and services).
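The backend-side lookup described above can be sketched as follows. This is a simplified model under assumed names (the in-memory `db` shape, `serveRequest`, and the chunk/msgid/PSK record layout are hypothetical), not the actual backend:

```javascript
// Hypothetical backend lookup: after the token transfer is verified, resolve
// the purchased chunks to the stored msgid/PSK pairs used to fetch and decrypt
// the corresponding Bundles from the Tangle.
function serveRequest(db, request) {
  if (!db.payments.has(request.oauthToken)) {
    throw new Error("transfer not verified");
  }
  const records = db.msgids.get(request.streamId) || [];
  // Each record pairs a Tangle msgid with the PSK valid for that chunk.
  return records
    .filter(r => request.chunks.includes(r.chunk))
    .map(r => ({ msgid: r.msgid, psk: r.psk }));
}
```

The returned msgid/PSK pairs would then drive the Tangle fetch and decryption, with the result served over the web interface or the REST API.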
3.3. Architecture Metrics
We can identify two distinct metrics that encapsulate the capabilities of the proposed architecture. Those two metrics are:
IOTA Transaction Throughput
The IOTA transaction throughput, that is, the number of transactions per second (TPS) that the network can support, depends solely on the inherent characteristics of the IOTA protocol. It is important because it directly influences the average time a transaction needs in order to be accepted by the network. The current implementation of the Tangle supports a TPS of 4–5 [40] with an average confirmation time of 10 min. The Tangle is thus mature enough to support the described architecture, as it offers an acceptable transaction time with no fees. Finally, this latency, although not adequate for real-time transactions, is bound to decrease through improvements of the IOTA network architecture itself.
Proof of Work
This metric is important because it highlights and ultimately defines the scalability of the proposed system. PoW must be performed for each transaction (in the current state of the network, and for the foreseeable future). PoW can either be conducted on the device hardware, which is preferable as it increases the autonomy of the IoT device, or be offloaded to a server and offered as a service. The latest advancements in FPGA implementations [41] of the PoW algorithm have illustrated the possibility of adding a relatively cheap dedicated PoW core to the IoT hardware, able to conduct PoW in an extremely efficient manner, both in terms of time and power consumption. By deploying an FPGA array for cloud-based PoW, the cost is reduced even further, proving the architecture to be extremely cost-effective even with a large increase in IoT nodes.
Each transaction can encapsulate 1 to n data transmissions. By increasing the number of data log packets in each transaction, we decrease the required PoW (by lowering the number of transactions needed to transmit the same amount of information), but we also decrease the granularity of data access (since access is granted on a per-transaction basis). The current time needed to perform PoW is shown below:
Raspberry Pi 3 (4× ARM Cortex-A53, 1.2 GHz): a median computing time of 90 s [41]
Natively on our IoT hardware (IoT node, Cortex-M): unfeasible, both in terms of time and power consumption
On x86 CPU (4× cores + GPU, upgraded ccurl implementation “dcurl” [42]): 9 s
FPGA implementation: 70 ms
Each MAM message can encapsulate up to 1300 bytes of information, or 65 SDLs, since each Sensor Data Log has a size of 20 bytes, as dictated by the specification. In a scenario where each node emits an SDL every 10 min and 12,000 nodes, used in various applications, are supported by the same backend server, the following metrics are found with regard to data access granularity and PoW cost. In this scenario, it is assumed that each MAM message demands 3 distinct IOTA transactions to be fully transmitted (this can change in the future, as MAM is still in development). The final number of transactions is increased by 10% in order to allow for unforeseen errors that may lead the IOTA node to re-perform PoW. Figure 14 shows, on a per-node basis, the number of transactions issued per day versus the access control granularity, translated into the number of SDLs encapsulated into every MAM message. Figure 15 illustrates the linear increase of the daily cost to conduct PoW for the whole system, assuming that it is conducted in a centralised data center on FPGA arrays that consume approximately 8 W and the cost of 1 kWh is $
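The per-node transaction count in this scenario can be reproduced with a short back-of-the-envelope helper; the function name is ours, while the constants (one SDL every 10 min, 3 transactions per MAM message, 10% retry margin) come from the scenario above:

```javascript
// Back-of-the-envelope model for the scenario in the text: transactions issued
// per node per day as a function of the access granularity (SDLs per message).
function txPerNodePerDay(sdlsPerMessage) {
  const sdlsPerDay = (24 * 60) / 10;               // one SDL every 10 min -> 144/day
  const messages = Math.ceil(sdlsPerDay / sdlsPerMessage);
  const txPerMessage = 3;                          // one MAM message spans 3 IOTA transactions
  return messages * txPerMessage * 1.1;            // +10% margin for PoW re-runs
}
```

At the coarsest granularity of 65 SDLs per message, each node needs 3 messages per day, i.e., roughly 10 transactions per day after the 10% margin; at the finest granularity of 1 SDL per message, this grows to roughly 475, illustrating the granularity-versus-PoW trade-off plotted in Figure 14.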