# Change Point Analysis of Time Series Related to Bitcoin Transactions: Towards the Detection of Illegal Activities


## Abstract


## 1. Introduction

## 2. Related Work

## 3. Materials and Methods

#### 3.1. Feature Extraction

#### 3.2. Feature Selection

- Boundary condition: ${p}_{1}=(1,1)$ and ${p}_{L}=(N,M)$;
- Monotonicity condition: ${n}_{1}\le {n}_{2}\le \cdots \le {n}_{L}$ and ${m}_{1}\le {m}_{2}\le \cdots \le {m}_{L}$;
- Step size condition: ${p}_{l+1}-{p}_{l}\in \{(1,0),(0,1),(1,1)\}$ for $l\in [1:L-1]$.
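The three conditions above are exactly what the standard dynamic-programming recursion for DTW enforces: the path starts at $(1,1)$, ends at $(N,M)$, never moves backwards, and advances by one of the three allowed steps. The following is a minimal sketch for two univariate series, assuming the absolute difference as the local cost (the local cost and the function name `dtw_distance` are illustrative assumptions, not taken from the paper).

```python
import math


def dtw_distance(x, y):
    """Minimal accumulated cost of a warping path between sequences x and y.

    Assumption: the local cost between two samples is their absolute difference.
    """
    n, m = len(x), len(y)
    # acc[i][j] holds the cheapest path cost from (1, 1) to (i, j), 1-based.
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0  # sentinel that forces every path to start at (1, 1)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # Allowed predecessors correspond to steps (1, 0), (0, 1), (1, 1).
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    # Reading the value at (N, M) enforces the end boundary condition.
    return acc[n][m]
```

Because one sample may be matched to several samples of the other series, two series that differ only by a local stretch in time obtain a small DTW distance even when their pointwise distance is large.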

#### 3.3. Change Point Detection

## 4. Results

#### 4.1. Experimental Setup

- The timestamp of the transaction;
- The ID of the spending entity;
- The transacted amount;
- The ID of the receiving entity;
- The progressive balance;
- The blockchain transaction ID.

#### 4.2. Pirate@40 HYIP Scheme

**Feature extraction.** The features listed in Table 1 are extracted from the Pirate@40 dataset on a daily basis, yielding 28 time series of length $T=408$ (days) each, covering the period from 22 June 2011 until 26 August 2012.
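To make the per-day aggregation concrete, the sketch below computes three of the Table 1 features (the number of transactions and the received and spent ratios) for a single entity. The record layout `Tx`, the Unix-seconds timestamp encoding, the UTC day boundaries, and the function name `daily_features` are all assumptions made for illustration; the paper does not specify them here.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Tx:
    """Hypothetical transaction record; field names are illustrative."""
    timestamp: int   # assumed to be Unix seconds
    sender: str      # ID of the spending entity
    receiver: str    # ID of the receiving entity
    amount: float    # transacted amount


def daily_features(txs, entity):
    """Per-day transaction count and received/spent ratios for one entity."""
    counts = defaultdict(lambda: [0, 0, 0])  # [all, received, spent]
    for tx in txs:
        received = tx.receiver == entity
        spent = tx.sender == entity
        if not (received or spent):
            continue  # transaction does not involve the entity
        # Assumption: days are delimited in UTC.
        day = datetime.fromtimestamp(tx.timestamp, tz=timezone.utc).date()
        counts[day][0] += 1
        counts[day][1] += received
        counts[day][2] += spent
    return {
        day: {"f_TX": n, "r_received": r / n, "r_spent": s / n}
        for day, (n, r, s) in sorted(counts.items())
    }
```

Collecting one such value per day for each feature gives one time series per feature; days without any transaction would need an explicit fill rule, which this sketch leaves open.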

**Feature selection.** To group the extracted features into homogeneous clusters, so that the medoid time series of each cluster can serve as input to the CPD algorithm, the PAM clustering algorithm is applied to the constructed time series using the DTW distance. The optimal number of clusters is estimated with the Silhouette index, selecting the number of clusters that maximises its value (see Section 3.2); the results for different numbers of clusters are presented in Table 2.
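The selection procedure can be sketched as follows, assuming a precomputed symmetric distance matrix `dist` (for example, pairwise DTW distances between the feature time series). This is a plain textbook version of PAM (greedy BUILD phase followed by best-improvement SWAP) and of the mean Silhouette index, written for readability rather than speed; it is not the authors' implementation, and the function names are illustrative.

```python
def total_cost(dist, medoids):
    """Sum of distances from every point to its closest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))


def pam(dist, k):
    """Partition Around Medoids on a precomputed distance matrix."""
    n = len(dist)
    medoids = []
    # BUILD: repeatedly add the point that lowers the total cost the most.
    while len(medoids) < k:
        candidates = [c for c in range(n) if c not in medoids]
        medoids.append(min(candidates, key=lambda c: total_cost(dist, medoids + [c])))
    # SWAP: apply the best medoid/non-medoid exchange until none improves.
    while True:
        best_cost, best_trial = total_cost(dist, medoids), None
        for pos in range(k):
            for c in range(n):
                if c in medoids:
                    continue
                trial = medoids[:pos] + [c] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best_cost:
                    best_cost, best_trial = cost, trial
        if best_trial is None:
            break
        medoids = best_trial
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
    return medoids, labels


def silhouette(dist, labels):
    """Mean Silhouette index; requires at least two clusters."""
    n = len(dist)
    scores = []
    for i in range(n):
        own = [j for j in range(n) if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster: score is 0 by convention
            scores.append(0.0)
            continue
        a = sum(dist[i][j] for j in own) / len(own)
        b = min(
            sum(dist[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / n


def best_number_of_clusters(dist, k_values):
    """Number of clusters that maximises the mean Silhouette index."""
    return max(k_values, key=lambda k: silhouette(dist, pam(dist, k)[1]))
```

The returned medoid indices identify the representative time series of each cluster, which is what the change point analysis takes as input.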

**Change point analysis.** The medoids of the five clusters form a five-dimensional time series, which constitutes the input to the CPD algorithm presented in Section 3.3. The results of the change point analysis of this multidimensional time series are presented in Table 4 and depicted graphically in Figure 7.
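Since the details of Section 3.3 are not reproduced here, the following is only a generic illustration of how a single change point in a multidimensional series can be located and tested nonparametrically; it should not be read as the authors' algorithm. It assumes a scaled energy-distance statistic between the two segments and a permutation test for significance; all function names and defaults are illustrative.

```python
import math
import random


def _dist(a, b):
    """Euclidean distance between two observations (tuples of floats)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def best_split(series, min_size=2):
    """Return (tau, q): the split maximising the scaled energy distance
    between series[:tau] and series[tau:]."""
    t = len(series)
    d = [[_dist(a, b) for b in series] for a in series]
    best_tau, best_q = None, -math.inf
    for tau in range(min_size, t - min_size + 1):
        left, right = range(tau), range(tau, t)
        n, m = tau, t - tau
        between = sum(d[i][j] for i in left for j in right) / (n * m)
        within_l = sum(d[i][j] for i in left for j in left) / (n * n)
        within_r = sum(d[i][j] for i in right for j in right) / (m * m)
        q = (n * m / (n + m)) * (2 * between - within_l - within_r)
        if q > best_q:
            best_tau, best_q = tau, q
    return best_tau, best_q


def change_point_test(series, n_perm=199, seed=0):
    """Estimate one change point and its permutation p-value."""
    rng = random.Random(seed)
    tau, q_obs = best_split(series)
    shuffled = list(series)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # destroys any temporal structure
        if best_split(shuffled)[1] >= q_obs:
            exceed += 1
    # The +1 terms count the observed series as one of the permutations.
    return tau, (exceed + 1) / (n_perm + 1)
```

Multiple change points can be obtained by applying the same test recursively to the resulting segments until no further split is significant. Note that with this p-value definition the smallest attainable value is $1/(R+1)$ for $R$ permutations, for instance 0.002 when $R=499$.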

#### 4.3. The MintPal Exchange Platform

**Feature extraction.** The features listed in Table 1 are extracted from the MintPal dataset on a daily basis, yielding 28 time series of length $T=271$ (days) each, spanning the period from 2 February 2014 until 31 October 2014.

**Feature selection.** As in the Pirate@40 case, the extracted features are grouped into clusters using the PAM clustering method with the DTW distance, and the optimal number of clusters is selected using the Silhouette index; the values of this index for different numbers of clusters are presented in Table 5.

**Change point analysis.** The medoids of the two clusters form a two-dimensional time series, which constitutes the input to the CPD algorithm presented in Section 3.3. The results of the change point analysis of this multidimensional time series are presented in Table 7 and depicted graphically in Figure 9.

## 5. Discussion

## 6. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## Abbreviations

Abbreviation | Definition |
---|---|
AIC | Akaike Information Criterion |
ARIMA | Autoregressive Integrated Moving Average |
BIC | Bayesian Information Criterion |
BTC | Bitcoin |
CPD | Change Point Detection |
DTW | Dynamic Time Warping |
HYIP | High-Yield Investment Programme |
GARCH | Generalized Autoregressive Conditional Heteroskedasticity |
PAM | Partition Around Medoids |
USD | United States Dollar |

## Notes


**Figure 4.** Overview of the CPD algorithm where the time series is segmented into clusters of observations after the detection of statistically significant change points (CPs).

**Figure 6.** Medoid time series of the clusters related to the Pirate@40 case: (**a**) 1st cluster, (**b**) 2nd cluster, (**c**) 3rd cluster, (**d**) 4th cluster, and (**e**) 5th cluster.

**Figure 7.** Time locations (vertical lines) of the estimated change points in the five-dimensional time series.

**Figure 8.** Medoid time series of the clusters related to the MintPal case: (**a**) 1st cluster and (**b**) 2nd cluster.

**Figure 9.** Time locations (vertical lines) of the estimated change points in the two-dimensional time series.

No | Feature Name | Description |
---|---|---|
1 | ${f}_{TX}$ | The number of transactions per day |
2 | ${r}_{received}$ | Ratio of received transactions to all transactions |
3 | ${r}_{spent}$ | Ratio of spent transactions to all transactions |
4 | ${r}_{coinbase}$ | Ratio of coinbase transactions to all transactions |
5 | ${r}_{received,spent}$ | Ratio of received transactions to spent transactions |
6 | ${r}_{received}^{amount}$ | Ratio of received amount over all transacted amount |
7 | ${r}_{spent}^{amount}$ | Ratio of spent amount over all transacted amount |
8 | ${r}_{coinbase}^{amount}$ | Ratio of coinbase transactions amount over all transacted amount |
9 | ${r}_{received,spent}^{amount}$ | Ratio of received amount over the spent amount |
10 | ${m}_{spent\_USD}^{amount}$ | Mean amount of spent transactions in USD |
11 | ${m}_{received\_USD}^{amount}$ | Mean amount of received transactions in USD |
12 | ${m}_{coinbase\_USD}^{amount}$ | Mean amount of coinbase transactions in USD |
13 | ${m}_{balance\_USD}$ | Mean balance of the wallet in USD |
14 | ${m}_{balance}$ | Mean balance of the wallet in BTC |
15–21 | ${f}_{i\_spent\_USD}$ | Frequency of spent transactions where the amount (in USD) is: ${10}^{i-1}<USD\le {10}^{i}$ for $i\in \{-1,0,1,2,3,4,5\}$ |
22–28 | ${f}_{i\_received\_USD}$ | Frequency of received transactions where the amount (in USD) is: ${10}^{i-1}<USD\le {10}^{i}$ for $i\in \{-1,0,1,2,3,4,5\}$ |
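The binned frequency features (15–21 and 22–28) assign each transaction amount to the order-of-magnitude bin ${10}^{i-1}<USD\le {10}^{i}$. A minimal sketch of that binning is given below; it interprets "frequency" as a raw count per bin, which is an assumption, and the names `usd_bin` and `bin_counts` are illustrative.

```python
EXPONENTS = range(-1, 6)  # i = -1, 0, ..., 5, as in the feature table


def usd_bin(amount_usd):
    """Return i such that 10**(i-1) < amount_usd <= 10**i, or None if the
    amount falls outside all seven bins (at most 0.01 USD or above 100,000 USD)."""
    for i in EXPONENTS:
        if 10.0 ** (i - 1) < amount_usd <= 10.0 ** i:
            return i
    return None


def bin_counts(amounts_usd):
    """Count how many amounts fall into each bin (assumed meaning of 'frequency')."""
    counts = {i: 0 for i in EXPONENTS}
    for amount in amounts_usd:
        i = usd_bin(amount)
        if i is not None:
            counts[i] += 1
    return counts
```

Applying `bin_counts` separately to the spent and the received amounts of each day would yield the seven spent and seven received series; amounts outside the covered range are simply ignored in this sketch.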

**Table 2.** Calculation of the Silhouette index for different number of clusters in the Pirate@40 case. In **bold** the number of clusters that corresponds to the maximum value of the index.

No. of Clusters | Silhouette Index |
---|---|
2 | 0.1683 |
3 | 0.1429 |
4 | 0.1568 |
**5** | **0.1820** |
6 | 0.1303 |
7 | 0.0401 |
8 | 0.0985 |
9 | 0.0411 |
10 | 0.0887 |
11 | −0.0525 |
12 | 0.0529 |

Cluster | Features in Cluster | Medoid Feature | Label of Medoid |
---|---|---|---|
1st | 4, 10, 11, 13, 14, 17, 21, 23, 27, 28 | 11 | Mean amount of received transactions (USD) |
2nd | 12, 18 | 12 | Mean amount of coinbase transactions (USD) |
3rd | 3, 6, 7 | 7 | Ratio of spent amount |
4th | 2, 8, 9, 15, 16, 22 | 8 | Ratio of coinbase transactions amount |
5th | 1, 5, 19, 20, 24, 25, 26 | 24 | ${f}_{1\_received\_USD}$ |

**Table 4.** Estimated change points for the five-dimensional time series along with the corresponding significance values at 5% significance level.

# | Time | Date | p-Value |
---|---|---|---|
1 | 106 | 5 November 2011 | 0.002 |
2 | 161 | 30 December 2011 | 0.002 |
3 | 265 | 12 April 2012 | 0.002 |
4 | 302 | 19 May 2012 | 0.006 |
5 | 365 | 21 July 2012 | 0.030 |

**Table 5.** Calculation of the Silhouette index for different number of clusters in the MintPal case. In **bold** the number of clusters that corresponds to the maximum value of the index.

No. of Clusters | Silhouette Index |
---|---|
**2** | **0.3056** |
3 | 0.1679 |
4 | 0.1987 |
5 | 0.1161 |
6 | 0.1125 |
7 | 0.1286 |
8 | 0.1082 |
9 | 0.0016 |
10 | 0.1100 |
11 | 0.0584 |
12 | 0.1120 |

Cluster | Features in Cluster | Medoid Feature | Label of Medoid |
---|---|---|---|
1st | 1, 3, 11, 13, 14, 15, 16, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28 | 24 | ${f}_{1\_received\_USD}$ |
2nd | 2, 4, 5, 6, 7, 8, 9, 10, 12, 17 | 10 | Mean amount of spent transactions (USD) |

**Table 7.** Estimated change points for the two-dimensional time series along with the corresponding significance values at 5% significance level.

# | Time | Date | p-Value |
---|---|---|---|
1 | 35 | 9 March 2014 | 0.002 |
2 | 70 | 13 April 2014 | 0.002 |
3 | 111 | 24 May 2014 | 0.002 |
4 | 141 | 23 June 2014 | 0.002 |
5 | 171 | 23 July 2014 | 0.002 |
6 | 209 | 30 August 2014 | 0.006 |
7 | 242 | 2 October 2014 | 0.022 |


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Theodosiadou, O.; Koufakis, A.-M.; Tsikrika, T.; Vrochidis, S.; Kompatsiaris, I.
Change Point Analysis of Time Series Related to Bitcoin Transactions: Towards the Detection of Illegal Activities. *J. Risk Financial Manag.* **2023**, *16*, 408.
https://doi.org/10.3390/jrfm16090408
