A Similarity Measure for Linking CoinJoin Output Spenders
Abstract
1. Introduction
1.1. Related Work
1.2. Scope
2. Background
2.1. Blockchain
2.2. Address Clustering
2.3. CoinJoin
- The transaction uses outputs within a defined range of denominated amounts.
- The transaction creates outputs within a defined range of denominated amounts.
- CoinJoin transactions have a set number of inputs and outputs.
- Payment of the CoinJoin and transaction fee.
2.4. Whirlpool CoinJoin
2.5. Wasabi 2.0 CoinJoin
2.6. Dash CoinJoin
3. Improving Wasabi 2.0 Classification
- The output denomination selection in Wasabi 2.0 is based on the provided input amounts. Given that the input amounts have to be split into several denominations, the likelihood that only common denominations will be selected is low.
- Given a random selection of the output denomination, with a 33% chance for each output to be of a common denomination, a transaction with 20 denominated outputs has a chance of about 0.000000029% of containing only common denominations. Wasabi 2.0 CoinJoin transactions have a high number of outputs (236 on average based on our findings). Therefore, the likelihood that a Wasabi 2.0 transaction has only outputs of common denominations is low.
- The proposed rule should significantly reduce the number of transactions dated before the Wasabi 2.0 release date that are classified as Wasabi 2.0 CoinJoins (reducing false positives).
- The proposed rule should not remove more transactions over time after the Wasabi 2.0 release date than it does before the release date. This is because the entities responsible for creating transactions that are misclassified without the proposed rule are likely to continue generating such transactions on an ongoing basis. In other words, the number of transactions detected as misclassified should be steady over time (low number of false positives).
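The 20-output estimate above can be reproduced with a short calculation. This is a minimal sketch that assumes each denominated output is independently of a common denomination and reads the stated 33% as exactly one third (using 0.33 gives a slightly smaller value). The function name `prob_all_common` is hypothetical and introduced only for illustration.

```python
def prob_all_common(p_common: float, n_outputs: int) -> float:
    """Probability that all n_outputs are of a common denomination,
    assuming independent draws with probability p_common each."""
    return p_common ** n_outputs


# 20 denominated outputs, one-third chance each (assumption: 33% read as 1/3).
p20 = prob_all_common(1 / 3, 20)
print(f"{p20 * 100:.9f}%")  # prints 0.000000029%

# With more outputs the probability only shrinks further.
p236 = prob_all_common(1 / 3, 236)
print(p236 < p20)  # prints True
```

The second call uses the reported average of 236 outputs purely as an upper illustration; not every output of a real transaction is necessarily denominated.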
4. Finding Similar CoinJoin Spenders
4.1. Similarity Measure
- Not sensitive to aberrations: timestamps of one set that have a large Euclidean distance to the timestamps of the other set should not influence the distance measure if points with a low Euclidean distance to the timestamps of the other set are present. Input timestamps are in some cases years apart; therefore, a similarity measure sensitive to such aberrations would skew the result.
- Set size: the distance measure should be able to compare sets with unequal lengths (although sampling could overcome this requirement, it would still distort the input of the measure and the result). Input counts of transactions in blockchain systems can range from a single input to several hundred inputs.
- Hausdorff distance [20]: This measure compares the dissimilarity of two point clouds by finding the largest distance from any point in one cloud to its nearest point in the other. This makes it sensitive to aberrations. It can handle sets of unequal length.
- Earth Mover distance [21]: This measure determines how similar two point clouds are by measuring how much one of the point clouds has to be changed to become equal to the other. Although it can handle sets of unequal length, it is not robust to aberrations.
- Chamfer distance [22]: The measure finds the mean minimum distance between all points in the compared point clouds in both directions. It is sensitive to aberrations and can handle sets of unequal length.
- One-sided Chamfer distance: The one-sided variant of the Chamfer distance only finds the mean minimum distance from one point cloud to the other. It is not sensitive to aberrations and can handle sets of unequal length.
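To make the difference between the two-sided and one-sided variants concrete, the following minimal sketch computes a one-sided Chamfer distance over two sets of timestamps. It assumes one-dimensional points (so the Euclidean distance reduces to an absolute difference) and unsquared distances; the function name and the sample timestamps are hypothetical and are not taken from the evaluation.

```python
from typing import Sequence


def one_sided_chamfer(source: Sequence[float], target: Sequence[float]) -> float:
    """Mean distance from each point in `source` to its nearest point in `target`."""
    if not source or not target:
        raise ValueError("both timestamp sets must be non-empty")
    return sum(min(abs(s - t) for t in target) for s in source) / len(source)


# Hypothetical timestamps: `b` contains one value far away from everything in `a`.
a = [100, 110]
b = [101, 109, 10_000_000]

d_ab = one_sided_chamfer(a, b)  # the far-away point in b is never a nearest neighbour
d_ba = one_sided_chamfer(b, a)  # here the far-away point dominates the mean

print(d_ab)  # prints 1.0
print(d_ba > 1_000_000)  # prints True
```

In this sketch the direction matters: an aberration in the target set is ignored, whereas an aberration in the source set is not. The sets also have unequal lengths, which the measure handles without sampling.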
4.2. Evaluation of the Proposed Similarity Measure
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Dash CoinJoin Transaction Classification Algorithm
- The number of inputs and outputs must be equal
- The transaction fee must be zero
- All input and output amounts must be part of the defined denominations
- All inputs and outputs must be of the same denomination
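The four rules above could be checked along the following lines. This is a sketch under stated assumptions, not the classification algorithm itself: amounts are integers in the smallest currency unit, the function name `is_dash_coinjoin` is hypothetical, and the denomination values are placeholders to be replaced by the defined Dash denominations.

```python
from typing import Collection, Sequence


def is_dash_coinjoin(
    input_amounts: Sequence[int],
    output_amounts: Sequence[int],
    denominations: Collection[int],
) -> bool:
    """Apply the four listed rules to one transaction."""
    # Rule 1: the number of inputs and outputs must be equal.
    if not input_amounts or len(input_amounts) != len(output_amounts):
        return False
    # Rule 2: the transaction fee (inputs minus outputs) must be zero.
    if sum(input_amounts) != sum(output_amounts):
        return False
    amounts = set(input_amounts) | set(output_amounts)
    # Rule 3: every amount must be one of the defined denominations.
    if not amounts <= set(denominations):
        return False
    # Rule 4: all inputs and outputs must share a single denomination.
    return len(amounts) == 1


# Placeholder denominations (assumption, for illustration only).
EXAMPLE_DENOMINATIONS = {100_001, 1_000_010}

print(is_dash_coinjoin([100_001] * 3, [100_001] * 3, EXAMPLE_DENOMINATIONS))  # prints True
print(is_dash_coinjoin([100_001, 1_000_010], [1_000_010, 100_001], EXAMPLE_DENOMINATIONS))  # prints False
```

The second call fails only the last rule: both amounts are valid denominations and the fee is zero, but two different denominations are mixed.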
Appendix B. Whirlpool CoinJoin Transaction Classification
- Must have at least 5 outputs
- Must have at most 8 outputs
- Must have the same number of inputs as outputs
- All inputs and outputs must have the same denomination
- Must spend at least one input from a Whirlpool Tx0 transaction
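A minimal sketch of these five checks follows. Several points are assumptions made for illustration: the function name is hypothetical, "the same denomination" is interpreted as identical amounts, and the set of known Tx0 transaction identifiers (`tx0_txids`) is supplied by the caller, since how Tx0 transactions are identified is not covered by the rules above.

```python
from typing import Collection, Sequence


def is_whirlpool_coinjoin(
    input_amounts: Sequence[int],
    output_amounts: Sequence[int],
    input_source_txids: Sequence[str],
    tx0_txids: Collection[str],
) -> bool:
    """Apply the five listed rules to one transaction."""
    n_outputs = len(output_amounts)
    # Rules 1 and 2: between 5 and 8 outputs.
    if n_outputs < 5 or n_outputs > 8:
        return False
    # Rule 3: same number of inputs as outputs.
    if len(input_amounts) != n_outputs:
        return False
    # Rule 4: one shared denomination (assumption: identical amounts).
    if len(set(input_amounts) | set(output_amounts)) != 1:
        return False
    # Rule 5: at least one input spends an output of a known Tx0 transaction.
    return any(txid in tx0_txids for txid in input_source_txids)


# Hypothetical transaction: five equal inputs and outputs, one input funded by a Tx0.
amounts = [1_000_000] * 5
sources = ["a", "b", "c", "d", "e"]
print(is_whirlpool_coinjoin(amounts, amounts, sources, {"c"}))  # prints True
print(is_whirlpool_coinjoin(amounts, amounts, sources, set()))  # prints False
```

If inputs in practice deviate slightly from the pool amount, the fourth check would need a tolerance instead of strict equality.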
Appendix C. Wasabi 2.0 CoinJoin Transaction Classification Algorithm
- All output scripts must be unique
- Input amounts must be at least 5000 Satoshi
- At least half of the output amounts must be part of the defined denominations
- The number of outputs must be at least the number of minimum participants
- Transactions must contain at least one uncommon denomination (new rule)
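The rules above, including the new uncommon-denomination rule, could be combined as in the sketch below. The function name, the parameter names, and all sample values are hypothetical. In particular, the defined denominations, the subset treated as common, and the minimum number of participants are passed in as parameters, and an "uncommon denomination" is interpreted here as a defined denomination that is not in the common subset.

```python
from typing import Collection, Sequence, Tuple


def is_wasabi2_coinjoin(
    input_amounts: Sequence[int],
    outputs: Sequence[Tuple[str, int]],  # (output script, amount in satoshi)
    denominations: Collection[int],
    common_denominations: Collection[int],
    min_participants: int,
) -> bool:
    """Apply the five listed rules to one transaction."""
    scripts = [script for script, _ in outputs]
    amounts = [amount for _, amount in outputs]
    # Rule 1: all output scripts must be unique.
    if len(set(scripts)) != len(scripts):
        return False
    # Rule 2: every input amount must be at least 5000 satoshi.
    if any(amount < 5000 for amount in input_amounts):
        return False
    # Rule 3: at least half of the output amounts are defined denominations.
    denominated = [amount for amount in amounts if amount in denominations]
    if 2 * len(denominated) < len(amounts):
        return False
    # Rule 4: at least as many outputs as the minimum number of participants.
    if len(amounts) < min_participants:
        return False
    # Rule 5 (new): at least one denominated output is of an uncommon denomination.
    return any(amount not in common_denominations for amount in denominated)


# Placeholder values (assumptions, for illustration only).
DENOMS = {5000, 6561, 8192, 10000}
COMMON = {5000, 10000}

mixed = [("s1", 5000), ("s2", 8192), ("s3", 10000), ("s4", 12345)]
only_common = [("s1", 5000), ("s2", 10000), ("s3", 10000)]

print(is_wasabi2_coinjoin([20000, 30000], mixed, DENOMS, COMMON, 3))  # prints True
print(is_wasabi2_coinjoin([20000, 30000], only_common, DENOMS, COMMON, 3))  # prints False
```

The second transaction passes the first four rules and is rejected only by the new rule, which is the case the rule is meant to filter out.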
References
- Deuber, D.; Schröder, D. CoinJoin in the Wild: An Empirical Analysis in Dash. In Computer Security—ESORICS 2021; Springer International Publishing: Cham, Switzerland, 2021; pp. 461–480.
- Baldimtsi, F.; Brandao, J.; Chatzigiannis, P.; Karantaidou, I. Dash Cryptocurrency Deanonymization. 2023. Available online: https://cina.gmu.edu/wp-content/uploads/2023/04/Dash-Cryptocurrency-Deanonymization.pdf (accessed on 17 September 2025).
- Biryukov, A.; Tikhomirov, S. Transaction Clustering Using Network Traffic Analysis for Bitcoin and Derived Blockchains. In Proceedings of the IEEE INFOCOM 2019—IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), Paris, France, 29 April–2 May 2019; pp. 204–209.
- Awan, M.K.; Cortesi, A. Blockchain Transaction Analysis Using Dominant Sets. In Computer Information Systems and Industrial Management; Springer International Publishing: Cham, Switzerland, 2017; pp. 229–239.
- Lu, Y.; Wang, H. Similarity Matching, Classification, and Recognition Mechanism for Transaction Analysis in Blockchain Environment. IEEE Trans. Consum. Electron. 2024, 70, 7018–7027.
- Tovanich, N.; Cazabet, R. Fingerprinting Bitcoin entities using money flow representation learning. Appl. Netw. Sci. 2023, 8, 63.
- Zavřel, J.; Koutenský, M.; Dolejška, D.; Veselý, V. Tumbling down the stairs: Exploiting a tumbler’s attempt to hide with ordinary-looking transactions using wallet fingerprinting. Forensic Sci. Int. Digit. Investig. 2025, 52, 301869.
- Ziegler, M.H.; Nowostawski, M.; Katt, B. The Privacy Impact of Dash Mixing Fee Payments. In Data and Applications Security and Privacy XXXIX; Springer Nature Switzerland: Cham, Switzerland, 2025; pp. 427–438.
- Goldfeder, S.; Kalodner, H.; Reisman, D.; Narayanan, A. When the cookie meets the blockchain: Privacy risks of web payments via cryptocurrencies. Proc. Priv. Enhancing Technol. 2018, 2018, 179–199.
- Stütz, R.; Stockinger, J.; Haslhofer, B.; Moreno-Sanchez, P.; Maffei, M. Adoption and Actual Privacy of Decentralized CoinJoin Implementations in Bitcoin. arXiv 2021, arXiv:2109.10229.
- Schnoering, H.; Vazirgiannis, M. Heuristics for Detecting CoinJoin Transactions on the Bitcoin Blockchain. arXiv 2023, arXiv:2311.12491.
- Nakamoto, S. Bitcoin: A Peer-to-Peer Electronic Cash System. 2008. Available online: https://bitcoin.org/bitcoin.pdf (accessed on 17 September 2025).
- Meiklejohn, S.; Pomarole, M.; Jordan, G.; Levchenko, K.; McCoy, D.; Voelker, G.M.; Savage, S. A fistful of bitcoins: Characterizing payments among men with no names. In Proceedings of the 2013 Conference on Internet Measurement Conference (IMC’13), Barcelona, Spain, 23–25 October 2013.
- Maxwell, G. CoinJoin: Bitcoin Privacy for the Real World. 2013. Available online: https://bitcointalk.org/index.php?topic=279249 (accessed on 17 September 2025).
- Ziegler, M.H.; Nowostawski, M.; Katt, B. A Systematic Literature Review of Information Privacy in Blockchain Systems. J. Cybersecur. Priv. 2025, 5, 65.
- Samourai Wallet Authors. Samourai Wallet—Whirlpool Repository. 2019. Available online: https://github.com/Samourai-Wallet/Whirlpool/commits/whirlpool/ (accessed on 17 September 2025).
- Wasabi Wallet Authors. Wasabi Wallet Documentation. 2025. Available online: https://docs.wasabiwallet.io/FAQ/FAQ-UseWasabi.html (accessed on 17 September 2025).
- Dash Core Group, Inc. Dash Core CoinJoin Documentation. 2025. Available online: https://docs.dash.org/projects/core/en/stable/docs/guide/dash-features-coinjoin.html (accessed on 17 September 2025).
- Dickey, D.A.; Fuller, W.A. Distribution of the Estimators for Autoregressive Time Series with a Unit Root. J. Am. Stat. Assoc. 1979, 74, 427.
- Hausdorff, F. Grundzüge der Mengenlehre; Chelsea: New York, NY, USA, 1978.
- Rubner, Y.; Tomasi, C.; Guibas, L. A metric for distributions with applications to image databases. In Proceedings of the Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271), Bombay, India, 4–7 January 1998; Narosa Publishing House: Delhi, India, 1998; pp. 59–66.
- Barrow, H.G.; Tenenbaum, J.M.; Bolles, R.C.; Wolf, H.C. Parametric correspondence and chamfer matching: Two new techniques for image matching. In Proceedings of the Image Understanding Workshop, Palo Alto, CA, USA, 20–21 October 1977; pp. 21–27.
- Harrigan, M.; Fretter, C. The Unreasonable Effectiveness of Address Clustering. In Proceedings of the 2016 Intl IEEE Conferences on Ubiquitous Intelligence & Computing, Advanced and Trusted Computing, Scalable Computing and Communications, Cloud and Big Data Computing, Internet of People, and Smart World Congress (UIC/ATC/ScalCom/CBDCom/IoP/SmartWorld), Toulouse, France, 18–21 July 2016; pp. 368–373.
Metric | Dash | Whirlpool | Wasabi 2.0 |
---|---|---|---|
CSTs | 284,069 | 198,984 | 342,228 |
Included CSTs | 128,528 | 95,379 | 102,169 |
Address Clusters | 146,066 | 247,112 | 432,919 |
Included Address Clusters | 21,168 | 41,897 | 35,356 |
Linked Address Clusters (top 10) | 14.22% | 8.42% | 10.40% |
Linked Address Clusters (top 20) | 27.33% | 17.64% | 20.97% |
Linked Address Clusters (top 30) | 63.19% | 42.50% | 40.29% |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ziegler, M.H.; Nowostawski, M.; Katt, B. A Similarity Measure for Linking CoinJoin Output Spenders. J. Cybersecur. Priv. 2025, 5, 88. https://doi.org/10.3390/jcp5040088