Block Analytics Tool Integrated with Blockchain-Based IoT Platform

Abstract: The Internet of Things (IoT) is currently the paradigm of connectivity and the driving force behind state-of-the-art applications and services. However, the exponential growth in the number of IoT devices and services, their distributed nature, and the scarcity of resources have increased the number of security and privacy concerns, ranging from the risk of unauthorized data alteration to the potential discrimination enabled by data analytics over sensitive information. A blockchain-based IoT-platform is introduced to address these issues. Built upon the tamper-proof architecture, the access management mechanisms ensure the authenticity and integrity of data. Moreover, a novel approach called Block Analytics Tool (BAT), integrated with the platform, is proposed to analyze and make predictions on data stored on blockchain. BAT acts as an interface to off-chain processing, enabling data-analysis applications to be developed in an optimized manner using the data stored in the platform. A pharmaceutical supply chain is the use case scenario used to show the functionality of the proposed platform. Furthermore, a model to forecast the demand for pharmaceutical drugs is investigated using a real-world data set to demonstrate the functionality of BAT. Finally, the performance of BAT integrated with the platform is evaluated.

This paper presents an IoT-platform that uses blockchain with other state-of-the-art technologies to solve the identified problems. The decentralized, peer-to-peer architecture of blockchain addresses problems related to centralized architectures, and the tamper-proof mechanisms of blockchain provide data integrity, making the system resistant to unapproved modifications. Smart contracts are used for the communication between IoT devices, verifying the authenticity of data sources, and a new access control mechanism is proposed to protect data and prevent unauthorized access. A smart [...] granted.


The Internet of Things (IoT) currently plays a significant role in the convenience of daily human life through various innovative applications and services. Further, it empowers the concept of autonomous systems, creating a new social paradigm. The enormous amount of data generated by these services and systems is usually stored on on-premises servers and cloud servers that depend on a centralized authority.

The data is stored in separate storage and the hash pointer is stored on blockchain. In this mechanism, every time a transaction is processed, the data stored on the off-chain network has to be retrieved to identify the state of the block. For instance, in supply-chain transactions, the current owner of a specific object has to be identified to perform the transaction between the current owner and the new owner of the object. For this purpose, the off-chain database has to be queried. Hence, the performance of transaction processing is reduced drastically. In our study, we introduce a new approach, called BAT, to mitigate these constraints and drawbacks.

[...] conditions. According to the World Health Organization, more than 10% of medicines worldwide are counterfeit [24]. Manipulating the expiry date and producing drugs with no active chemical ingredients [25] are some instances of drug counterfeiting. Once these counterfeit drugs have been distributed, users are unable to identify them. There is no proper mechanism to verify whether the package distributed by the third-party logistics company is the original one [24]. Most of the above-mentioned problems can be addressed by establishing trust between the parties in the supply chain. A blockchain-based IoT-platform can be used to process the transactions that occur in the supply chain, and the implementation is presented in Section 5. Figure 1 shows the scenario of the chosen use case.
In the use case, the production companies produce medicine, and Third Party Logistics (3PL) companies supply the medicine to the warehouse. Through the warehouse, the medicine is distributed to [...]

Blockchain acts as a data storage system for transactions, as all the transactions that occur in the supply chain are recorded in blockchain. These data can be used for data analytics of the supply chain. Maintaining supply and demand in pharmaceutical supply chains is very important.

Demand forecasting in the pharmaceutical industry is critical, as the availability of drugs at the needed time affects patients' lives [26]. Furthermore, the demand for drugs by different pharmaceutical companies is a complex combination of the necessity of the drugs, their shelf life, regulations and the associated cost. The consumption method [27] of forecasting uses historical data on past drug consumption. As the data is recorded in blockchain in a well-ordered manner, it can be used to predict future drug requirements. The functionality of BAT was tested with the use case scenario of developing a demand forecasting model for pharmaceutical drugs, as shown in Section 5.2.

Figure 2 shows an overview of the platform architecture. A modular architecture is adopted with a layered structure that ensures that each layer can be designed separately without altering the other layers. The architecture is divided into four layers. Each layer is interfaced with the other layers through a communication medium.

The device layer, which consists of the sensors and actuators, is connected through a local gateway [...] IoT devices, as it is lightweight and, because of its small header, more suitable for communication between constrained devices [29]. Furthermore, the network layer uses Transport Layer Security (TLS) [30] to encrypt data. The network layer is connected to the blockchain network through a message broker. In our implementation, an MQTT broker is used as the message broker.

Blockchain is the main actor in the platform. The service layer is interconnected with blockchain. [...]

Hence, blockchain acts as a data warehouse [33] that stores data from different sources (e.g., IoT devices, DApps data, management data, etc.). These data are very useful, especially for industries and businesses. With the proposed system, data can be retrieved and visualized easily, especially for business analytics. A novel approach to query blockchain is introduced in Section 4. On top of the services provided by blockchain sits the application layer, which exposes these services to external users.

This layer is interfaced with the service layer using an API gateway. Transaction processing and interaction with front-end applications are implemented through DApps.

In the study, a special tool called BAT facilitates the development of Machine Learning (ML) and Artificial Intelligence (AI) applications using the data stored in blockchain, as explained in Section 4. Each IoT device is registered before it transmits data in the platform. Device management in the application layer is used to handle IoT devices. Registration and management are performed through a smart contract created specifically for that purpose. Each device is also provided with a wallet to prove its identity. Hence, this mechanism ensures that unauthorized devices cannot communicate in [...]

After the authentication, the request for the specific transaction or process is submitted to the API gateway through the application. The smart contract is invoked once the request is received by the blockchain network. The specific logic function differs from one smart contract to another.
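The registration and authorization check described above can be pictured with a small in-memory sketch. It is only an illustration under stated assumptions: `DeviceRegistry`, `register`, `is_authorized` and `submit_reading` are hypothetical names, the registry stands in for the device-management smart contract, and the wallet address is simply a hash of a key string, whereas a real wallet would prove identity with a cryptographic signature.

```python
import hashlib


class DeviceRegistry:
    """Hypothetical in-memory stand-in for the device-management smart contract."""

    def __init__(self):
        self._wallets = {}  # device_id -> wallet address

    def register(self, device_id, public_key):
        # Assumption: derive a wallet-style address by hashing the key string.
        address = hashlib.sha256(public_key.encode("utf-8")).hexdigest()[:40]
        self._wallets[device_id] = address
        return address

    def is_authorized(self, device_id, address):
        # A device is accepted only if it presents the address it registered with.
        return self._wallets.get(device_id) == address


def submit_reading(registry, device_id, address, payload):
    """Accept a reading only from a registered device; reject everything else."""
    if not registry.is_authorized(device_id, address):
        raise PermissionError(f"device {device_id!r} is not registered")
    return {"device": device_id, "payload": payload}


registry = DeviceRegistry()
wallet = registry.register("rfid-1", "example-public-key")
accepted = submit_reading(registry, "rfid-1", wallet, {"temperature_c": 4})
```

A request from a device that was never registered, or one that presents the wrong address, raises `PermissionError` instead of reaching the transaction logic.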

The use of a smart contract removes the need for centralized control or third-party access [...]

In this section, we propose a novel approach called BAT for the analytical processing of the data stored in blockchain, which overcomes the drawbacks of the state-of-the-art approaches. We aim to ensure the privacy and security of the data used for that analysis. Hence, all communications that take place through BAT also adopt the security features used in blockchain [...]

The overall execution procedure of BAT is shown in Figure 5.

There are four main tiers in BAT with separate functionalities, and these are explained in detail along with the execution procedure. Tier 1 is the extraction of data from blockchain. Extracting data from blockchain acts as a bottleneck because of its effect on transaction processing.

As mentioned earlier, rich queries can degrade the performance of a blockchain system. Furthermore, to query a single block, all the blocks in blockchain have to be searched. Hence, we use a novel approach to extract data from blockchain through an index created specifically for the transactions occurring on blockchain. A special index called Block Index is proposed to increase the efficiency of the data extraction process. Block Index acts as a filtering process for the data.

$$\text{Objects}(B_x) = B_1, B_2, \ldots, B_m$$
$$\text{Parties}(p_m, p_n) = P_{1,2}, P_{2,3}, \ldots, P_{n-1,n}$$
$$\text{Transactions}(t) = B_1 P_{1,2}, B_2 P_{2,3}, \ldots, B_n P_{n-1,n}$$

Here, a transaction related to object 1 between party 1 and party 2 is represented as $B_1 P_{1,2}$.

When object 1's transaction occurs between party 2 and party 3, the details about the object remain the same, whereas the current owner changes from party 2 to party 3. This relationship is taken into account, and the Block Index is created as a matrix representation of the transactions. Each row in the Block Index represents the transactions related to one particular object. The columns represent the transactions between different parties. The chain of transactions related to the objects between different parties is thus mapped into a matrix representation. [...] Index is shown in Algorithm 1. Through this representation, the data extraction process can be performed more efficiently. If we assume that blockchain recorded the data about transactions as below: [...]

Suppose the details about the object related to the 10th transaction, $t_{10}$, are required. Then, instead of searching through 10 transactions, the Block Index gets the required details by searching only the first column ($P_{1,2}$), as all the other columns replicate the same details. In the worst-case scenario, this reduces the data [...]
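The matrix idea described above can be illustrated with a minimal sketch. It is not the paper's Algorithm 1: the function names, the tuple layout of a transaction and the choice to store a block number in each cell are all assumptions made for illustration.

```python
def build_block_index(transactions):
    """Build a matrix-like index: one row per object, one column per party pair.

    `transactions` is an iterable of (block_number, object_id, party_pair)
    tuples; this layout is an assumption. Each cell stores the number of the
    block in which that transaction was recorded.
    """
    columns = []  # party pairs, in order of first appearance
    index = {}    # object_id -> {party_pair: block_number}
    for block_number, object_id, party_pair in transactions:
        if party_pair not in columns:
            columns.append(party_pair)
        index.setdefault(object_id, {})[party_pair] = block_number
    return columns, index


def object_details_block(columns, index, object_id):
    """Find the block holding an object's details by reading only the first column."""
    return index[object_id].get(columns[0])


ledger = [
    (1, "B1", ("P1", "P2")),
    (2, "B2", ("P1", "P2")),
    (3, "B1", ("P2", "P3")),
    (4, "B1", ("P3", "P4")),
]
columns, index = build_block_index(ledger)
details_block = object_details_block(columns, index, "B1")
```

Because the object's details are the same in every transaction of its row, the lookup touches a single cell in the first column rather than scanning every recorded transaction.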

Through the configuration file, a user can specify the data to be extracted. The specified data are then mapped for use by the Block Index architecture. For instance, if the user wants the data about the transactions between parties $P_m$ and $P_n$, the data from $P_{m,n}$ has to be retrieved. After the data are mapped, rules are created to retrieve data from blockchain. These rules are the queries used to get data from the blockchain storage system. The data are extracted through the Block Index.

The communication between the platform and BAT takes place through a smart contract.

The extraction process takes place through pagination to ensure the consistency of data and to avoid memory overflow. After the initial extraction, BAT keeps track of the last set of data queried from the platform, and this is kept as a snapshot of the data retrieved. For instance, if the details about block k ($B_k$) were queried in the last extraction, BAT stores a tag to ensure that the same data is not copied again in the next extraction, which reduces the data extraction time. As soon as a transaction occurs in the platform, an event is triggered to update the Block Index.
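The paginated, incremental extraction described above can be sketched as follows. This is an assumption-laden illustration rather than BAT's implementation: `fetch_page` is a hypothetical stand-in for a paginated ledger query, the ledger is modelled as a plain list, and the stored tag is simply the count of blocks already extracted.

```python
def fetch_page(ledger, start, page_size):
    """Hypothetical stand-in for a paginated query against the platform."""
    return ledger[start:start + page_size]


class IncrementalExtractor:
    """Extract only the blocks added since the previous extraction run."""

    def __init__(self, page_size=2):
        self.page_size = page_size
        self.last_block = 0  # tag: how many blocks have already been extracted

    def extract_new(self, ledger):
        extracted = []
        while True:
            # Read one bounded page at a time so memory use stays limited.
            page = fetch_page(ledger, self.last_block, self.page_size)
            if not page:
                break
            extracted.extend(page)
            self.last_block += len(page)  # advance the tag past this page
        return extracted


ledger = ["B1", "B2", "B3"]
extractor = IncrementalExtractor(page_size=2)
first_run = extractor.extract_new(ledger)
ledger.append("B4")
second_run = extractor.extract_new(ledger)
```

The second run returns only the block appended after the first run, which is the effect the stored tag is meant to achieve.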

Updating the Block Index takes a small amount of time, whereas in the current approaches to analytical processing, the latency associated with transaction processing is very high. The performance of the two is compared in Section 6.

The user can explicitly specify the operations and conditions that the data set must go through to transform the data. BAT creates a precedence order of operations that must be followed to ensure that [...]

Unlike the blockchain data storage system, the tools can perform rich queries on, and preprocess, the data stored in BDC. For ML development, latency and throughput play a critical role. Using BDC instead of acquiring data directly from blockchain reduces latency and increases throughput. Different users can make use of the same BAT instance. The users can provide a composer file that specifies the type of data each of them requires, and the data are made available in the data analysis tools specified by the user.

[...] and processing of data. They are used by production companies, suppliers, warehouses and issuers for the transactions. Through another application, an end user can examine where a specific batch is located at that moment. Eventually, the end user at the issuer can see, with complete transparency, how the batch has been transported, stored and issued by the organizations while favorable conditions were maintained for the batches. Figure 6 shows the user interface of the application at the end of transferring a batch through the supply chain.

The transactions between different organizations can be shown as follows: production to supplier = $P_{pr,s}$, supplier to warehouse = $P_{s,w}$, warehouse to issuer = $P_{w,i}$, issuer to patients = $P_{i,p}$. Different [...]

|       | $P_{pr,s}$     | $P_{s,w}$     | $P_{w,i}$     | $P_{i,p}$     |
| $B_1$ | $B_1 P_{pr,s}$ | $B_1 P_{s,w}$ | $B_1 P_{w,i}$ | $B_1 P_{i,p}$ |
| $B_2$ | $B_2 P_{pr,s}$ | $B_2 P_{s,w}$ |               |               |
| $B_3$ | $B_3 P_{pr,s}$ | $B_3 P_{s,w}$ | $B_3 P_{w,i}$ | $B_3 P_{i,p}$ |
| $B_4$ | $B_4 P_{pr,s}$ | $B_4 P_{s,w}$ | $B_4 P_{w,i}$ |               |
| $B_5$ | $B_5 P_{pr,s}$ |               |               |               |
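To make the use-case index concrete, the following sketch encodes the five batches and four transaction columns listed above and runs two column-oriented lookups. The dictionary layout and the helper names `batches_in_column` and `latest_stage` are assumptions for illustration, not part of BAT.

```python
# Columns of the use-case Block Index, in supply-chain order.
STAGES = ["P_pr,s", "P_s,w", "P_w,i", "P_i,p"]

# One row per batch: the columns in which a transaction has been recorded.
BLOCK_INDEX = {
    "B1": ["P_pr,s", "P_s,w", "P_w,i", "P_i,p"],
    "B2": ["P_pr,s", "P_s,w"],
    "B3": ["P_pr,s", "P_s,w", "P_w,i", "P_i,p"],
    "B4": ["P_pr,s", "P_s,w", "P_w,i"],
    "B5": ["P_pr,s"],
}


def batches_in_column(index, stage):
    """Return the batches that have a transaction in the given column."""
    return [batch for batch, stages in index.items() if stage in stages]


def latest_stage(index, batch):
    """Return the most recent transaction column recorded for a batch."""
    return index[batch][-1]


issued_to_patients = batches_in_column(BLOCK_INDEX, "P_i,p")
b4_location = latest_stage(BLOCK_INDEX, "B4")
```

Here `issued_to_patients` reads only the fourth column, and `b4_location` shows that batch $B_4$ has reached the issuer but has not yet been issued to patients.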

In this section, we present the analysis of the performance of BAT integrated with the platform. [...] the new batch, that is, the transaction between production and supplier companies. This is due to the additional device registration time related to RFID and temperature sensors.

The following are a few mechanisms that ensure the authenticity and integrity of batches while assuring quality. [...]

When BAT is used for analytics, the overall transaction performance of the platform should not be affected. [...] processing remains constant at almost 0 ms when using BAT, whereas in the conventional method, the effect on transaction processing increased exponentially with the increase in size.

Transaction throughput is the number of transactions that can be processed in a second. Even though the total throughput decreases as the block size increases, the throughput of the conventional method is lower than that of BAT, as shown in Figure 10. Furthermore, the off-chain database duplicates all the data in blockchain, creating redundant data, whereas the size of the Block Index remains less than 1 MB most of the time. The storage cost associated with the current approach is drastically higher, as it stores a lot of redundant data.

The data acquisition process using BAT and the conventional process are compared in Figure 11. BAT utilizes the Block Index in the process of acquiring data, whereas the conventional process uses the normal querying architecture. In the figure, "batches issued to the users" means retrieving the data of the 4th column of the Block Index. "Selecting quantity for the batch x" means getting data about a batch; as the ownership is the only variable that changes along a row, the data can be retrieved by querying the first column without reading all the blocks. Data related to one organization or transaction can be retrieved through the Block Index, which increases the efficiency by almost 100%, as shown in Figure 11. The search pointer reads only the first column and checks for the blocks where the drug type is z. The Block Index gives the search pointer no particular advantage in obtaining the transaction history for a batch, as it has to read all the columns to retrieve the transaction history.

In this paper, we design and implement a blockchain-based IoT-platform that addresses the centralized control, scalability, data security and access control problems found in current IoT systems. Furthermore, we propose a novel approach called BAT, integrated with the platform, that ensures the integrity and authenticity of data used for data-analytics applications. The smart contracts, authentication and access control mechanisms of the platform, as well as BAT, ensure the security and integrity of the data. We present the functionality of the blockchain-based IoT-platform and the integration of BAT with the platform through the use case scenario of the pharmaceutical supply chain.

According to our implementation and evaluation, the novel approach of BAT utilizing BDC saves resources such as storage compared with the conventional approaches of creating a mirror storage that duplicates resources. Furthermore, the total transaction processing time with BAT is considerably lower than with the conventional approach of creating an off-chain database, owing to the minimal event processing time needed to create the Block Index. The costs associated with data retrieval from the platform are reduced by the use of the Block Index. Currently, batch processing is conducted in BAT. In the future, we will mainly focus on bringing the platform towards edge computing, with BAT optimized for real-time processing.