Algorithms
  • Article
  • Open Access

27 November 2020

Monitoring Blockchain Cryptocurrency Transactions to Improve the Trustworthiness of the Fourth Industrial Revolution (Industry 4.0)

1 Faculty of Industrial Engineering, Urmia University of Technology, Urmia 17165-57166, Iran
2 Industrial Engineering Department, Iran University of Science and Technology, Tehran 13114-16846, Iran
3 Muma College of Business, University of South Florida, Tampa, FL 33620, USA
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Using Blockchain Technology in the Industry 4.0 Era: Cases and Applications

Abstract

A completely new economic system is required for the era of Industry 4.0. Blockchain technology and blockchain cryptocurrencies are the best means to confront this new trustless economy. Millions of smart devices are able to complete transparent financial transactions via blockchain technology and its related cryptocurrencies. However, via blockchain technology, internet-connected devices may be hacked to mine cryptocurrencies. In this regard, monitoring the network of these blockchain-based transactions can be very useful to detect the abnormal behavior of users of these cryptocurrencies. Therefore, the trustworthiness of the transactions can be assured. In this paper, a novel procedure is proposed to monitor the network of blockchain cryptocurrency transactions. To do so, a hidden Markov multi-linear tensor model (HMTM) is utilized to model the transactions among nodes of the blockchain network. Then, a multivariate exponentially weighted moving average (MEWMA) control chart is applied to the monitoring of the latent effects. Average run length (ARL) is used to evaluate the performance of the MEWMA control chart in detecting blockchain network anomalies. The proposed procedure is applied to a real dataset of Bitcoin transactions.

1. Introduction

The fourth industrial revolution has changed human life in many ways and has transformed the world into a vast information system [1]. It has brought innovations such as the Internet of Things (IoT) and smart machines into our lives. In this era, the digitalization and networking of systems increase the amount of data available to support analysis in different industries [2]. Various Industry 4.0-based applications, in sectors including manufacturing, agriculture, healthcare, logistics, and finance, deal with large volumes of data [3]. Building such an automated world requires a large number of tools connected to the internet [4]. Blockchain technology has brought new ways of carrying out work processes in such industries; in other words, it is a new way of transferring and storing data. For example, in finance, blockchain enables business transactions to be completed safely with smart contracts [1].
A blockchain is a decentralized ledger system in which each block holds a collection of transactions between nodes [5]. Blockchain provides data security without relying on a third-party organization [6]. Combining blockchain with Industry 4.0 offers a high level of automation, more flexible systems, smart contracts, security, and micropayments [4,7,8]. Blockchain is therefore regarded as one of the most important innovations of the latest industrial revolution. Its inherent security, which prevents content from being changed once it has been stored in a block, together with its decentralization, fosters collaboration and transparency. One of the most important effects of Industry 4.0 is the blockchain cryptocurrency market, which started with Bitcoin. Bitcoin is the most common cryptocurrency, the one for which blockchain technology was created, and it is the world's largest digital currency [9,10]. According to www.coinmarketcap.com, the number of Bitcoin nodes is growing and currently exceeds 47,000. However, the emergence of blockchain and cryptocurrencies has put internet-connected devices in greater danger of being hijacked by hackers to mine cryptocurrencies [11]. In this regard, monitoring blockchain cryptocurrency networks and analyzing their factors and characteristics, as well as the unobserved structure of the blockchain network, can be useful for observing abnormal behaviors of nodes and transactions.
A network is a structure that represents interactions and relationships among individual entities. With the rapid growth of network data, it has become important to use these data to capture information on social, biological, computer, financial, and other networks. In this regard, Luqman et al. [12] applied complex neutrosophic hypergraphs to the theory of social networks and discussed lower truncation, upper truncation, and transition levels of the proposed model to deal with the periodic nature of inconsistent information in hyper networks. Behera et al. [13] proposed a computing method to calculate network centrality values in large-scale datasets, including social networks. They focused on the centrality analysis of time-varying social network entities for a Twitter dataset. Salehi Rizi and Granitzer [14] proposed a methodology to identify graph properties that explain the similarity of the local neighborhood of a node under random-walk-based graph embedding methods. They then examined whether embeddings can be used to predict centrality values directly. Recently, the emergence of blockchain technology has given access to stored financial transaction data [15,16,17]. The financial industry is the main user of blockchain because of its cryptocurrency application [18]. Accordingly, many research studies have focused on transaction network analysis. We review some papers related to this field in Section 2.
This paper proposes an approach to monitor the latent patterns of a blockchain-based cryptocurrency transaction network. To do so, the blockchain network is modeled as a dynamic network in which every blockchain cryptocurrency account is considered as a node and edges represent the transactions between two nodes. Then, the network is modeled with a hidden Markov multi-linear tensor model (HMTM), and its parameters are estimated by a Markov Chain Monte Carlo (MCMC) algorithm. Finally, a multivariate exponentially weighted moving average (MEWMA) control chart is utilized to monitor the estimated latent parameters. Therefore, if any abnormal behavior in the blockchain cryptocurrency transactions occurs, the control chart can signal an out-of-control condition. The proposed procedure can also be used to monitor other network applications, as Industry 4.0 has made network transactions an essential part of manufacturing, financial, social, etc. systems. Despite the importance of monitoring cryptocurrency transactions for a more trustful Industry 4.0 economy, this line of research has not gained much attention. Therefore, statistical modeling and monitoring of blockchain cryptocurrency transaction networks are investigated in this research. To the best of our knowledge, this is the first research study that attempts to statistically monitor the network of cryptocurrency transactions.
The remainder of this paper is organized as follows. In Section 2, we review related papers. In Section 3, we briefly discuss the multi-linear tensor model (MTRM) and its Bayesian parameter estimation using an MCMC algorithm. Section 4 describes the monitoring scheme for the proposed network model. In Section 5, the performance of the proposed approach is evaluated through simulation. In Section 6, the Mt. Gox Bitcoin dataset is used as a real-world example. Finally, concluding remarks and directions for future research are given in Section 7.

3. Statistical Modeling of Network and Parameter Estimation

3.1. Notations

The notations used for formulating the proposed model are defined in Table 1.
Table 1. Notations list.

3.2. Hidden Markov Multi Linear Tensor Model

Recently, dynamic modeling of longitudinal networks has gained considerable attention. These networks have found applications in many fields, including social, economic, and biological systems. One can model a longitudinal network with node set $\mathcal{N} = \{1, \ldots, N\}$ as a multilayer network with $T$ layers, in which the relationships between nodes at time $t$ are represented by an $N \times N$ matrix $Y_t = \{y_{ijt} \mid i, j \in \mathcal{N}\}$ for $t \in \{1, \ldots, T\}$. To model the matrix $Y_t$, the hidden Markov multi-linear tensor model (HMTM) proposed by Park and Sohn [29] is used. Park and Sohn [29] extended the work of Hoff [30,31], which presented a multi-linear tensor model (MTRM). According to Hoff [30,31], network effects can be formulated as follows:
$$\Pr(y_{ijt} = 1 \mid x_{ijt}, u_i, u_j, v_t) = x_{ijt}\beta + \langle u_i, v_t, u_j \rangle + \varepsilon_{ijt} \tag{1}$$
where $U = (u_1, \ldots, u_N)^T$ and $v_t = (v_{1t}, \ldots, v_{Rt})^T$ are, respectively, the $R$-dimensional latent node positions and the node connection rules at time $t$. Additionally, $\varepsilon_{ijt}$ follows $N(0, \sigma^2)$, and $U$ and $V = (v_1, v_2, \ldots, v_T)$ follow matrix normal distributions, where
$$U \sim \mathrm{MN}_{N \times R}(\mathbf{1}\mu_U, I_N, \varphi_U) \tag{2}$$
$$V \sim \mathrm{MN}_{N \times R}(\mathbf{1}\mu_V, I_N, \varphi_V) \tag{3}$$
Park and Sohn [29] combined MTRM with a hidden Markov model and developed the hidden Markov multi-linear tensor model (HMTM). In an HMTM that involves $k$ breaks, the probability distribution of the network data, $B_t$, is modeled as follows:
$$B_t = Y_t - \Omega_t \tag{4}$$
$$B_t = \beta \mathbf{1}_N + U_{\Psi_t} V_t U_{\Psi_t}^T + E_t \tag{5}$$
where $\Psi_t$ is a hidden state variable, $\mathbf{1}_N$ is an $N \times N$ matrix of ones, and $E_t$ follows the matrix normal distribution $\mathrm{MN}_{N \times N}(0, \sigma_{\Psi_t}^2 I_N, I_N)$. We use the procedure and prior distributions adopted by Park and Sohn [29] for parameter estimation. For more details on MTRM, HMTM, and parameter estimation methods, readers can refer to the works of Park and Sohn [29] and Hoff [30,31].
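To make the structure of the bilinear term concrete, the following is a minimal sketch of how one layer of the model could be simulated for a single hidden state. It assumes, following Hoff's formulation, that $V_t$ is the diagonal matrix built from $v_t$, so that entry $(i, j)$ of $U V_t U^T$ equals $\sum_r u_{ir} v_{rt} u_{jr}$. The function name `simulate_layer` and the toy values are hypothetical and are not taken from the paper.

```python
import random

def simulate_layer(U, v_t, beta, sigma, rng):
    """Draw one N x N layer B_t = beta + U diag(v_t) U^T + E_t.

    Assumes V_t = diag(v_t) and i.i.d. N(0, sigma^2) noise entries.
    """
    n = len(U)
    layer = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Trilinear term <u_i, v_t, u_j> = sum_r u_ir * v_rt * u_jr
            signal = sum(a * v * b for a, v, b in zip(U[i], v_t, U[j]))
            noise = rng.gauss(0.0, sigma) if sigma > 0 else 0.0
            layer[i][j] = beta + signal + noise
    return layer

# Hypothetical toy values: N = 3 nodes, R = 2 latent dimensions, no noise.
U = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
v_t = [2.0, -1.0]
B_t = simulate_layer(U, v_t, beta=0.1, sigma=0.0, rng=random.Random(0))
```

With the noise switched off, each entry is just the intercept plus the trilinear product, which makes it easy to see how a change in one component of $v_t$ rescales the contribution of the corresponding latent dimension for every pair of nodes.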

4. Monitoring Scheme

The MEWMA control chart is used for the simultaneous monitoring of several variables [32]. In this research, the means of the latent node positions over all nodes and the node connection rules are monitored. We consider $R = 2$ dimensions for the latent positions and connection rules. Therefore, the vector $L_t = (\bar{u}_{1t}, \bar{u}_{2t}, v_{1t}, v_{2t})$ is monitored over time. To obtain the MEWMA statistics, the vector $Z_t$ is defined by the following relation:
$$Z_t = \Lambda L_t + (I - \Lambda) Z_{t-1} \tag{6}$$
where $\Lambda$ is the diagonal matrix of smoothing parameters $\lambda_1, \lambda_2, \ldots, \lambda_p$, $I$ is the identity matrix, and $p$ is the number of variables. If all the variables are equally important, then $\lambda_1 = \lambda_2 = \cdots = \lambda_p$ [33]. The statistic calculated for the MEWMA control chart is
$$T_t^2 = Z_t^T Q^{-1} Z_t \tag{7}$$
where $Q$ is the variance–covariance matrix of the variables. The $T^2$ statistic is monitored using a control chart with an upper control limit of $\frac{(n-1)p}{n-p} F^{-1}_{(\alpha, p, n-p)}$, where $F^{-1}_{(\alpha, p, n-p)}$ denotes the upper $100(1-\alpha)$ percentile of the $F$ distribution with $p$ and $n-p$ degrees of freedom [32].
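The recursion for $Z_t$ and the quadratic form for $T^2$ can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the authors' implementation: it assumes the chart starts from $Z_0 = 0$, that the vectors $L_t$ have already been centered at their in-control mean, and that $Q$ is supplied by the caller. The helper names `solve_linear` and `mewma_t2` are hypothetical.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def mewma_t2(observations, lambdas, Q):
    """Return the T^2 series for centered vectors L_t, starting from Z_0 = 0."""
    z = [0.0] * len(lambdas)
    stats = []
    for obs in observations:
        # Z_t = Lambda L_t + (I - Lambda) Z_{t-1}, with Lambda diagonal
        z = [lam * x + (1.0 - lam) * prev for lam, x, prev in zip(lambdas, obs, z)]
        q_inv_z = solve_linear(Q, z)  # Q^{-1} Z_t without forming the inverse
        stats.append(sum(a * b for a, b in zip(z, q_inv_z)))
    return stats

# Hypothetical two-variable example with equal smoothing parameters.
t2_series = mewma_t2([[2.0, 0.0], [2.0, 0.0]], lambdas=[0.5, 0.5],
                     Q=[[1.0, 0.0], [0.0, 1.0]])
```

Solving the linear system instead of inverting $Q$ keeps the sketch short and avoids an explicit inverse; for the four-dimensional vector $L_t$ used here, either approach is inexpensive.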
To design the MEWMA control chart in phase I, a number of in-control networks are collected over time. The parameters of these networks are estimated with the MCMC algorithm, and the $T^2$ statistics of the networks are calculated accordingly. Then, based on the $T^2$ statistics, an upper control limit (UCL) is chosen so that the desired type I error rate ($\alpha$) is achieved. In phase II, new snapshots of the network are collected and their $T^2$ statistics are compared to the UCL. An out-of-control signal is triggered if the $T^2$ statistic of any network exceeds the UCL.

5. Performance Evaluation Using Simulation

In this section, the ability of the MEWMA control chart to detect abrupt changes in the network is assessed through simulation. To do so, average run lengths (ARL) of simulated networks in phase II are calculated after applying different shifts to the edge creation probabilities between nodes of the network. Therefore, 500 in-control consecutive networks with 600 nodes are generated to determine the UCL. The networks are modeled with HMTM, and their parameters are estimated through the MCMC algorithm. In this regard, the following steps are followed to generate 10,000 random networks in phase I and obtain the UCL:
(1) For $t = 1$ to $10000$:
  • For $1 \le i < j \le 500$, generate $y_{ijt}$ based on the probability of edge creation between nodes $i$ and $j$;
  • Use the MCMC algorithm to estimate the vector $L_t$ for the generated network;
(2) For $t = 1$ to $10000$, calculate $Z_t$ based on relation (6);
(3) For $t = 1$ to $10000$, evaluate the $T^2$ statistics based on relation (7);
(4) From all $T^2$ statistics, find the UCL that attains the type I error $\alpha$.
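The last step of the list above can be sketched as an empirical quantile of the in-control statistics. This is one plausible reading of the step, not a description of the authors' code: the UCL is taken as the smallest observed $T^2$ value that at most a fraction $\alpha$ of the in-control statistics exceed. The function names and the toy values are hypothetical.

```python
import math

def empirical_ucl(in_control_t2, alpha):
    """Smallest observed T^2 value exceeded by at most a fraction alpha of the sample."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(in_control_t2)
    n = len(ordered)
    # At least k of the n statistics must be <= UCL; round() guards against
    # floating-point noise such as 98.99999999999999 instead of 99.
    k = math.ceil(round((1.0 - alpha) * n, 9))
    k = min(max(k, 1), n)
    return ordered[k - 1]

def false_alarm_rate(in_control_t2, ucl):
    """Fraction of in-control statistics that would trigger a signal."""
    return sum(t2 > ucl for t2 in in_control_t2) / len(in_control_t2)

# Hypothetical phase I statistics.
phase_one = [5.0, 1.0, 4.0, 2.0, 3.0]
ucl = empirical_ucl(phase_one, alpha=0.2)
rate = false_alarm_rate(phase_one, ucl)
```

Checking the realized false alarm rate on the same phase I sample is a quick sanity test that the chosen limit matches the intended $\alpha$.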
The UCL is obtained for a type I error of $\alpha = 0.01$, so the in-control ARL of the control chart is $1/\alpha = 100$. For phase II, different edge probabilities are used to evaluate the performance of the proposed control chart in detecting network anomalies. The following steps are followed to evaluate the out-of-control ARL of the control chart:
(1) For $i = 1$ to $10000$:
(a) Set $RL_i = 0$;
(b) While $T^2 < \mathrm{UCL}$:
  • Generate a random network based on different probabilities;
  • Estimate model parameters with the MCMC algorithm and obtain the $T^2$ statistic from relation (7);
  • Set $RL_i = RL_i + 1$;
(2) Evaluate $\mathrm{ARL} = \sum_{i=1}^{10000} RL_i / 10000$.
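The run-length loop above can be sketched as follows. Because generating a network and running the MCMC algorithm is outside the scope of a short example, the sketch takes a caller-supplied function that returns the next $T^2$ value. The exponential stand-in used in the demonstration is purely hypothetical and does not represent the HMTM, and the names `run_length` and `estimate_arl` are illustrative.

```python
import random

def run_length(next_t2, ucl, max_steps=100_000):
    """Count samples drawn until T^2 first reaches or exceeds the UCL."""
    for step in range(1, max_steps + 1):
        if next_t2() >= ucl:
            return step
    return max_steps  # censored: no signal within max_steps samples

def estimate_arl(make_sampler, ucl, replications):
    """Average the run length over independent replications.

    make_sampler() must return a fresh T^2 generator for each replication,
    because the MEWMA statistic carries state from one sample to the next.
    """
    total = sum(run_length(make_sampler(), ucl) for _ in range(replications))
    return total / replications

# Hypothetical stand-in: independent exponential "statistics" with unit rate.
rng = random.Random(1)

def make_sampler():
    return lambda: rng.expovariate(1.0)

arl = estimate_arl(make_sampler, ucl=2.0, replications=2000)
```

For this stand-in, each draw exceeds the limit with probability $e^{-2}$, so the estimate should lie near $e^{2} \approx 7.4$. In the procedure described above, each call to the sampler would instead generate a shifted network, run the MCMC estimation, and update the MEWMA statistic.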
The results of the simulation for phase II are illustrated in Figure 1. It can be observed that the $T^2$ control chart is capable of detecting changes in the HMTM parameters. The out-of-control ARL values are much smaller than the in-control ARL.
Figure 1. Average run length (ARL) values for different amounts of probabilities.

6. Real-World Example

In this section, we use the leaked Mt. Gox Bitcoin transaction dataset, covering 2011 to 2013, to analyze users' behavior in a Bitcoin network. Each user has a unique identity as a seller or buyer. Therefore, we used user IDs as the nodes of the transaction network, and a link between two nodes was created if there was at least one transaction between them during a month. Figure 2 shows a sample plot of the transaction network graph for one month. We estimated the latent parameters for each month and calculated the $T^2$ statistics to detect abnormal behaviors. The MCMC algorithm was implemented with a Markov chain of length 1000. The shape parameter of the inverse gamma prior distribution for U is 10, and the scale parameter is one. Additionally, the shape parameter of the inverse gamma prior on the variance parameters of V is 10, and the scale parameter is the time length of Y. The prior distribution of the error variances is an inverse gamma distribution with shape and rate parameters equal to 0.1. The estimated $L_t$ vectors for $t = 1, \ldots, 25$ are reported in Table 2. We used 15 months as phase I and 10 months as phase II. Figure 3 shows the monthly statistics and the upper control limit for the 15 months of phase I, together with the phase II statistics for the next 10 months. Figure 3 shows an increase starting from the 20th month, in phase II, and some latent variables have evidently caused this increase in the value of the statistic for the 20th–25th months. Therefore, we conclude that there is a significant change in monthly transactions, especially in the 23rd month, which falls in phase II.
Figure 2. Transaction network model of April 2011.
Table 2. Estimated latent positions and node connection rules.
Figure 3. Phase I and II control chart.
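The construction of the monthly networks described in this section can be sketched as follows. The record layout `(buyer_id, seller_id, month)` is a hypothetical simplification of the leaked dataset, the links are assumed to be undirected, and self-trades are ignored; none of these details are stated in the text, and the function name is illustrative.

```python
from collections import defaultdict

def monthly_adjacency(transactions):
    """Return (sorted user IDs, {month: symmetric 0/1 adjacency matrix}).

    transactions is an iterable of (buyer_id, seller_id, month) tuples.
    """
    transactions = list(transactions)
    users = sorted({u for buyer, seller, _ in transactions for u in (buyer, seller)})
    index = {user: k for k, user in enumerate(users)}
    n = len(users)
    layers = defaultdict(lambda: [[0] * n for _ in range(n)])
    for buyer, seller, month in transactions:
        if buyer == seller:
            continue  # assumption: a self-trade does not create a link
        i, j = index[buyer], index[seller]
        # One or more transactions in the month create a single link.
        layers[month][i][j] = 1
        layers[month][j][i] = 1
    return users, dict(sorted(layers.items()))

# Hypothetical records for two months.
records = [("a", "b", "2011-04"), ("b", "a", "2011-04"),
           ("a", "c", "2011-05"), ("c", "c", "2011-05")]
users, layers = monthly_adjacency(records)
```

Using one shared node index across all months keeps every layer the same size, which matches the $N \times N$ layers $Y_t$ assumed by the model.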
In Figure 3, the $T^2$ statistic is plotted versus time. To better analyze the latent effects, we also plot these effects over time. Figure 4 and Figure 5 show the trends of $\bar{u}_{1t}$ and $\bar{u}_{2t}$, and of $v_{1t}$ and $v_{2t}$, across months. According to Figure 5, the increase in $v_{2t}$ during the 19th–23rd months has caused an increase in the $T^2$ statistic in phase II. Therefore, changes in $v_{2t}$ can be a reason for the changes in $T^2$ in phase II.
Figure 4. $\bar{u}_{1t}$ and $\bar{u}_{2t}$ over time.
Figure 5. $v_{1t}$ and $v_{2t}$ over time.
For a better analysis of the network, one can monitor separate parts of the network to detect abrupt changes in specific parts of the transaction networks. In this regard, separate control charts can be designed for different parts of the network. In addition, the number of traded Bitcoins can also be monitored. In the current research, when the number of transactions abruptly changes, the control chart will trigger a signal. In Figure 6, the amount of money spent for buying Bitcoins is plotted versus time. This money can be expended by one user or a number of users. Figure 3 cannot be compared with Figure 6 because Figure 3 is based on the number of transactions, and Figure 6 shows the value of transactions. However, Figure 6 indicates an increase starting from the 21st month, which could be an indication of an increase in the number of transactions from the 21st to the 25th month in Figure 3. The proposed procedure can monitor the number of transactions over time. However, the value of transactions can also be a matter of importance. For future research, this procedure can be extended to take into account the value of transactions.
Figure 6. Amount of money spent to buy Bitcoin.

7. Conclusions

In this paper, a procedure was proposed to monitor anomalies in the transaction networks of blockchain cryptocurrencies. To do so, networks of blockchain transactions were modeled by a hidden Markov multi-linear tensor model (HMTM), and the MCMC algorithm was used for parameter estimation. Then, the means of the latent node positions and the node connection rules were monitored by an MEWMA control chart. Simulation studies confirm the ability of the MEWMA control chart to detect network anomalies. A real dataset of Bitcoin transactions was used to show the applicability of the proposed procedure. For future research, blockchain cryptocurrency transactions can be modeled by weighted networks in which the value of transactions is monitored in addition to their quantity.

Author Contributions

Conceptualization, K.S.-L., S.J.G. and A.M.; Methodology, K.S.-L. and F.E.; Software, F.E.; Validation, K.S.-L., S.J.G. and A.M.; Formal analysis, K.S.-L. and F.E.; Investigation, K.S.-L. and F.E.; Resources, K.S.-L., S.J.G., F.E. and A.M.; K.S.-L. and F.E.; Writing—original draft preparation, K.S.-L. and F.E.; Writing—review and editing, S.J.G. and A.M.; Visualization, F.E.; Supervision, K.S.-L., S.J.G. and A.M.; Project administration, K.S.-L., S.J.G. and A.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Akdoğan, D.A.; Kurular, G.Y.S.; Geyik, O. Cryptocurrencies and Blockchain in 4th Industrial Revolution Process: Some Public Policy Recommendations. Available online: https://www.researchgate.net/publication/337635983_Cryptocurrencies_and_Blockchain_In_4th_Industrial_Revolution_Process_Some_Public_Recommendations (accessed on 12 November 2020).
  2. Lasi, H.; Fettke, P.; Kemper, H.G.; Feld, T.; Hoffmann, M. Industry 4.0. Bus. Inf. Syst. Eng. 2013, 6, 239–242.
  3. Bodkhe, U.; Tanwar, S.; Parekh, K.; Khanpara, P.; Tyagi, S.; Kumar, N.; Alazab, M. Blockchain for industry 4.0: A comprehensive review. IEEE Access 2020, 8, 79764–79800.
  4. Mushtaq, A.; Haq, I.U. Implications of blockchain in industry 4. o. In Proceedings of the 2019 International Conference on Engineering and Emerging Technologies (ICEET), Lahore, Pakistan, 21–22 February 2019.
  5. Lee, S.C. Magical capitalism, gambler subjects: South Korea’s bitcoin investment frenzy. Cult. Stud. 2020, 1–24.
  6. Yli-Huumo, J.; Ko, D.; Choi, S.; Park, S.; Smolander, K. Where Is Current Research on Blockchain Technology?—A Systematic Review. PLoS ONE 2016, 11, e0163477.
  7. Hofmann, E.; Rüsch, M. Industry 4.0 and the current status as well as future prospects on logistics. Comput. Ind. 2017, 89, 23–34.
  8. Aitzhan, N.Z.; Svetinovic, D. Security and Privacy in Decentralized Energy Trading Through Multi-Signatures, Blockchain and Anonymous Messaging Streams. IEEE Trans. Dependable Secur. Comput. 2016, 15, 840–852.
  9. Chen, W.; Wu, J.; Zheng, Z.; Chen, C.; Zhou, Y. Market manipulation of bitcoin: Evidence from mining the Mt. Gox transaction network. In Proceedings of the IEEE INFOCOM 2019-IEEE Conference on Computer Communications, Paris, France, 29 April–2 May 2019.
  10. Nakamoto, S. Bitcoin: A Peer-to-Peer Electronic Cash System. Manubot. 2019. Available online: https://git.dhimmel.com/bitcoin-whitepaper (accessed on 10 October 2020).
  11. Teichmann, F.M.J.; Falker, M.-C. Cryptocurrencies and Financial Crime: Solutions from Liechtenstein. Available online: https://www.emerald.com/insight/content/doi/10.1108/JMLC-05-2020-0060/full/html (accessed on 10 October 2020).
  12. Luqman, A.; Akram, M.; Smarandache, F. Complex Neutrosophic Hypergraphs: New Social Network Models. Algorithms 2019, 12, 234.
  13. Behera, R.K.; Rath, S.K.; Misra, S.; Damaševičius, R.; Maskeliūnas, R. Distributed Centrality Analysis of Social Network Data Using MapReduce. Algorithms 2019, 12, 161.
  14. Rizi, F.S.; Granitzer, M. Properties of Vector Embeddings in Social Networks. Algorithms 2017, 10, 109.
  15. Swan, M. Blockchain: Blueprint for a New Economy; O’Reilly Media, Inc.: Newton, MA, USA, 2015.
  16. Nakamoto, S. Bitcoin: A Peer-to-Peer Electronic Cash System. Bitcoin. Org. 2008. Available online: https://bitcoin.org/bitcoin.pdf (accessed on 24 February 2020).
  17. Wu, J.; Lin, D.; Zheng, Z.; Yuan, Q. T-EDGE: Temporal weighted multidigraph embedding for Ethereum transaction network analysis. arXiv 2019, arXiv:1905.08038.
  18. Nofer, M.; Gomber, P.; Hinz, O.; Schiereck, D. Blockchain. Bus. Inform. Syst. Eng. 2017, 59, 183–187.
  19. Chan, S.; Chu, J.; Nadarajah, S.; Osterrieder, J. A statistical analysis of cryptocurrencies. J. Risk Financ. Manag. 2017, 10, 12.
  20. Chu, J.; Nadarajah, S.; Chan, S. Statistical Analysis of the Exchange Rate of Bitcoin. PLoS ONE 2015, 10, e0133678.
  21. Bakar, N.A.; Rosbi, S. High volatility detection method using statistical process control for cryptocurrency exchange rate: A case study of Bitcoin. Int. J. Eng. Sci. 2017, 6, 39–48.
  22. Szetela, B. The Use of Control Charts in the Study of Bitcoin’s Price Variability. In Quality Control and Assurance—An Ancient Greek Term Re-Mastered; IntechOpen: London, UK, 2017; p. 201.
  23. Li, Z.; Wang, Y.; Huang, Z. Risk Connectedness Heterogeneity in the Cryptocurrency Markets. Front. Phys. 2020, 8, 243.
  24. Motamed, A.P.; Bahrak, B. Quantitative analysis of cryptocurrencies transaction graph. Appl. Netw. Sci. 2019, 4, 1–21.
  25. Elliott, A.; Cucuringu, M.; Luaces, M.M.; Reidy, P.; Reinert, G. Anomaly detection in networks with application to financial transaction networks. arXiv 2019, arXiv:1901.00402.
  26. Lin, D.; Wu, J.; Yuan, Q.; Zheng, Z. Modeling and Understanding Ethereum Transaction Records via a Complex Network Approach. IEEE Trans. Circuits Syst. II Express Briefs 2020, 67, 2737–2741.
  27. Ferretti, S.; D’Angelo, G. On the Ethereum blockchain structure: A complex networks theory perspective. Concurr. Comput. Pr. Exp. 2020, 32, 5493.
  28. Javarone, M.A.; Wright, C.S. From Bitcoin to Bitcoin Cash: A network analysis. In Proceedings of the 1st Workshop on Cryptocurrencies and Blockchains for Distributed Systems, Munich, Germany, 15 May–15 June 2018.
  29. Park, J.H.; Sohn, Y. Detecting Structural Changes in Longitudinal Network Data. Bayesian Anal. 2020, 15, 133–157.
  30. Hoff, P.D. Hierarchical multilinear models for multiway data. Comput. Stat. Data Anal. 2011, 55, 530–543.
  31. Hoff, P. Multilinear tensor regression for longitudinal relational data. Ann. Appl. Stat. 2015, 9, 1169–1193.
  32. Crowder, S.V.; Wiel, S.A. Exponentially Weighted Moving Average (EWMA) Control Chart; Wiley StatsRef: Statistics Reference Online; John Wiley & Sons, Ltd.: Hoboken, NJ, USA, 2014.
  33. Lowry, C.A.; Woodall, W.H.; Champ, C.W.; Rigdon, S.E. A Multivariate Exponentially Weighted Moving Average Control Chart. Technometrics 1992, 34, 46–53.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
