Article

Application of Benford’s Law on Cryptocurrencies

by 1,2,*,† and 1,3,†
1
Faculty of Mathematics Natural Sciences and Information Technologies, University of Primorska, 6000 Koper, Slovenia
2
Research Centre of the Slovenian Academy of Sciences and Arts, The Fran Ramovš Institute, 1000 Ljubljana, Slovenia
3
InnoRenew CoE, 6310 Izola, Slovenia
*
Author to whom correspondence should be addressed.
†
These authors contributed equally to this work.
Academic Editor: Jani Merikivi
J. Theor. Appl. Electron. Commer. Res. 2022, 17(1), 313-326; https://doi.org/10.3390/jtaer17010016
Received: 7 November 2021 / Revised: 7 February 2022 / Accepted: 8 February 2022 / Published: 25 February 2022
(This article belongs to the Special Issue Blockchain Commerce Ecosystem)

Abstract

This manuscript studies whether the Benford’s law conformity test, a well-proven tool in accounting fraud discovery, can be applied to a new domain: the discovery of anomalies (possibly fraudulent behaviour) in cryptocurrency transactions. Blockchain-based currencies, or cryptocurrencies, have become a global phenomenon known to most people as a disruptive technology and a new investment vehicle. However, due to the decentralized nature of these markets, regulators have had difficulty finding a balance between nurturing innovation and protecting consumers. Growing concerns about illicit activity have forced regulators to seek new ways of detecting, analyzing, and ultimately policing public blockchain transactions. Extensive research on machine learning and transaction graph analysis algorithms has been done to track suspicious behaviour. However, having a macro view of a public ledger is equally important before pursuing a more fine-grained analysis. Benford’s law, the law of the first digit, has been extensively used as a tool to discover accounting fraud (many other use cases exist). The basic motivation for the research presented in this paper was to test the applicability of this well-established method to a new domain, in this case the identification of anomalous behavior in cryptocurrencies using the Benford’s law conformity test. The research focused on transaction values in all major cryptocurrencies. A suitable time period was identified that was long enough to provide a sufficiently large number of observations for Benford’s law conformity tests and was also far enough in the past that anomalies had been identified and well documented.
The results show that most of the cryptocurrencies that did not conform to Benford’s law had well-documented anomalous incidents, while the first digits of aggregated transaction values of all well-known cryptocurrency projects conformed to Benford’s law. Thus, the proposed method is applicable to the new domain.
Keywords: cryptocurrency; Benford’s law; anomaly detection; method application

1. Introduction

Benford’s law [1], also known as the first-digit law, has been widely used as a tool to discover anomalies in various data, ranging from accounting records, stock prices, and house prices to electricity bills, population numbers, natural phenomena, death rates, and, recently, COVID-19 case reports. Cryptocurrencies, also referred to as blockchain-based currencies or crypto coins, have become a global phenomenon known to most people. Throughout the paper we rely on the definition of a cryptocurrency presented by [2]. The term cryptocurrency is in fact quite a narrow, albeit recognizable, description of a subset of the umbrella class of cryptoassets. While cryptocurrencies are still somewhat geeky and not understood by most people, banks, governments, and many companies are aware of their importance.
Since the inception of Bitcoin, many alternative systems have been developed. Some remain blockchain-based, where transactions are stored and consequently timestamped in blocks to create a canonical chain through consensus. Others employ directed acyclic graph-based data structures, where there is no single canonical chain. Instead, transactions reference and confirm previous transactions in order to increase the system’s throughput by sacrificing some security features. Moreover, the transaction structure can be changed to achieve privacy, e.g., using ring signatures in Monero [3]. Regardless of the underlying data structure, consensus mechanism, or network protocol, cryptocurrencies are decentralized and permissionless computer networks that maintain a transparent ledger of transactions. Unlike cryptocurrencies, where a user can have an arbitrary number of wallets (identities), centralized and permissioned systems are easier to monitor: the approaches for detecting suspicious behaviour or anomalies are analogous to those used in traditional banking systems, as users are assumed to have a verifiable identity.
A report from the World Economic Forum [4] predicts that 10% of global gross domestic product will be stored on blockchain-based public ledgers. The growing interest has led many developers, researchers, and innovators to dedicate their time to improving the existing systems. The effects can be observed in the thousands of cryptocurrencies and networks that exist at present. The growing velocity of these networks further increases the difficulty regulators face in protecting consumers and the stability of the financial system. The United Nations Office on Drugs and Crime estimated that laundered money amounts to up to 5% of global GDP [5]. Assuming fraud grows in parallel with the velocity and total value locked in the underlying network, a method for fast and efficient anomaly detection is paramount. However, with the growth of innovation in this space, the techniques employed must aim for a generic solution that makes few or no assumptions about the underlying network.
Our approach attempts to provide a technology-agnostic tool that analyzes open ledgers and alerts to potentially suspicious behaviour which requires further, more fine-grained analysis. Although more than 12 years have passed since the first transaction of the first cryptocurrency, Bitcoin (BTC) [6], only the last few years have seen a large enough number of transactions and a long enough time frame for statistical analysis to be carried out. Our research focused on empirically testing whether Benford’s law [1], a law of anomalous numbers, could be used in an unaltered form for discovering fraudulent or at least suspicious activity in cryptocurrencies in the same way it is used in standard financial forensics.
Although we could treat cryptocurrency transactions as just another financial tool that should comply with all the established mechanisms (among them Benford’s law conformity for identifying fraud and other anomalous behavior), there are some properties that must be addressed or at least observed:
  • Mining transactions (mostly with mining pools) for all cryptocurrency assets that are based on the Proof of Work (PoW) [7] consensus mechanism, by which the cryptocurrency blockchain network achieves distributed consensus. Mining pools, where most of the miners are concentrated, pay out rewards to miners based on the computing power contributed. The payouts are mostly scheduled to occur once the miner is owed more than the threshold to save up on transaction fees. As many miners keep the default threshold, many transactions are possibly of the same value;
  • Default transaction fees (GAS) are the same. GAS refers to the pricing value required to successfully conduct a transaction or execute a contract on the Ethereum blockchain platform.
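To illustrate why such default-valued transactions matter for a first-digit test, the following minimal Python sketch mixes a Benford-like synthetic sample with many identical payouts. The data are purely hypothetical: powers of two stand in for organic transaction values, and the payout threshold of 0.1 coins is made up. Nothing here is taken from the studied ledgers.

```python
def leading_digit(x):
    """Return the first significant digit of a non-zero number."""
    # Scientific notation always starts with the first significant digit.
    return int(f"{abs(x):.16e}"[0])


def share_of_digit(values, digit):
    """Fraction of values whose leading digit equals `digit`."""
    digits = [leading_digit(v) for v in values]
    return digits.count(digit) / len(digits)


# Hypothetical "organic" transaction values: powers of two are a classic
# example of a sequence whose leading digits follow Benford's law closely.
organic = [2 ** k for k in range(1, 1001)]

# Hypothetical mining-pool payouts: 500 miners all keep the same
# (made-up) default payout threshold of 0.1 coins.
payouts = [0.1] * 500

before = share_of_digit(organic, 1)
after = share_of_digit(organic + payouts, 1)
print(f"share of leading digit 1 without payouts: {before:.3f}")
print(f"share of leading digit 1 with payouts:    {after:.3f}")
```

In this toy setting the share of the leading digit 1 rises from roughly 30% to more than half of all values, which is the kind of distortion that repeated default amounts could plausibly introduce into non-aggregated transaction data.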
The basic idea of the research was to test if Benford’s law conformity can be used as a tool to detect anomalies in cryptocurrencies. The paper is structured as follows: Section 2 presents the basic properties of Benford’s law and its usages, Section 3 presents the state of the art, followed by Methods and Materials in Section 4. The results are presented in Section 5 and are discussed in Section 6.

2. Benford’s Law

Benford’s law, also called the Newcomb–Benford law or the first-digit law, is an observation about the frequency distribution of leading digits. The observation was first made by [8] and later rediscovered by [1]. Benford’s law defines a fixed probability distribution for the leading digits of any kind of numeric data with the following properties [9]:
  • Data with values from several distributions;
  • Data that has a wide variety in the number of digits (e.g., data with plenty of values in the hundreds, thousands, tens of thousands, etc.);
  • A data set that is fairly large, as a rule of thumb consisting of at least 50–100 observations [10], although usually thousands of observations;
  • Data is right-skewed (i.e., the mean is greater than the median), and the distribution has a long right-tail rather than being symmetric;
  • Data has no predefined maximum or minimum value (with the exception of a zero minimum).
The distribution of digits is presented in Figure 1; the digit 1 occurs in roughly 30% of the cases, and the other digits follow in a logarithmic curve. It has been shown that this result applies to a wide variety of data sets [9]. Some examples are presented in Section 3. The equation for the distribution of the first digits of observed data is presented in Equation (1).
$$P(d) = \log_{10}(d + 1) - \log_{10}(d) = \log_{10}\left(1 + \frac{1}{d}\right) \qquad (1)$$
The quantity $P(d)$ is proportional to the space between $d$ and $d + 1$ on a logarithmic scale. Therefore, this is the distribution expected if the logarithms of the numbers (but not the numbers themselves) are uniformly and randomly distributed.
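As a quick numerical check of Equation (1), the expected first-digit probabilities can be tabulated with a few lines of Python. This is only an illustrative sketch; the function name `benford_expected` is chosen here and does not come from the paper.

```python
import math


def benford_expected(d):
    """Expected probability of leading digit d (1..9) under Benford's law."""
    if not 1 <= d <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Probabilities for all nine possible leading digits.
expected = {d: benford_expected(d) for d in range(1, 10)}

for d, p in expected.items():
    print(f"{d}: {p:.3f}")
```

The nine probabilities sum to one, because the sum of the terms log10(d + 1) − log10(d) telescopes to log10(10) − log10(1) = 1.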

3. State of the Art

Benford’s law has been thoroughly researched and its theoretical grounds have been proved in many scientific papers. The methodology and basic mathematical grounds are discussed in greater detail by [11]. Many researchers have verified for themselves that the law is widely obeyed but have also noted that the popular explanations are not completely satisfying [12]. To the authors’ knowledge, there has been no research in using Benford’s law as a tool for the detection of anomalies in cryptocurrency transactions.
Benford’s law has been extensively used in accounting fraud detection and prevention, and there has been a lot of research in the area, such as [13,14], who present a literature overview of the area. Ref. [15] introduces Benford’s law and digital analysis (analysis of digit and number patterns of a data set), which can be used as an analytical procedure and fraud detection tool. Ref. [16] presents Benford’s law as a simple and effective tool for the detection of fraud. The purpose of that paper is to assist auditors in the most effective use of digital analysis based on Benford’s law by identifying data sets that can be expected to follow Benford’s distribution and by presenting the types of fraud that would or would not be detected by such analysis. However, some research findings, such as [17], point out inherent problems that can arise when Benford’s law is used in the auditing process.
The simplicity of Benford’s law as a tool allows for a broad range of uses. Ref. [18] examined crime statistics at the USA National, State, and local level in order to test the conformity to Benford’s law distribution. Ref. [19] observed the distribution of initial digits of physical constants; however, their results were inconclusive.
One of the more recent studies involving Benford’s law is [20]. The authors tested the reported number of cases of coronavirus disease 2019 in China against Benford’s law and report that the reported numbers of affected people abide by Benford’s law.
Ref. [21] presented an overview of identified frauds that can be committed in the cryptocurrency paradigm. Identified frauds include Ponzi schemes [22], fake initial coin offering schemes, pump and dump schemes, as well as cryptocurrency theft. Ref. [23] identified the main reasons for fraud and manipulation in cryptocurrencies: lack of consistent regulation, relative anonymity, low barriers to entry, exchange standards, and sophistication. Ref. [24] performed an end-to-end characterization of counterfeit tokens in the Ethereum network, targeting ERC20 coins. Ref. [25] aimed to demonstrate that Bitcoin, the best-known cryptocurrency, constitutes a substantial danger in terms of criminal enterprise. Ref. [26] presented an economic analysis of money laundering schemes utilizing cryptocurrencies, which aims at answering the open question of whether cryptocurrencies constitute a driver for money laundering. Ref. [27] proposed an approach to detect illicit accounts on the Ethereum blockchain using well-proven machine learning techniques. Recent anomaly detection makes use of machine learning approaches. Support Vector Machines (SVM) were used to detect anomalies in the Bitcoin network [28]. However, the analysis is on the network level, and not on individual transactions. A clustering approach with Random Forest (RF) was used to detect wallets with anomalous behaviour [29]. However, the approach makes assumptions about the underlying structure of transactions to extract the features needed, and thereby lacks generality. A recent study showed that neural networks can be used to detect abnormalities with good stability and effectiveness, but the technique is limited to smart contract platforms, and not general transaction networks. Kamišalić et al. [30] presented a detailed overview of various techniques used for anomaly detection. This highlights the need for a simpler, implementation-agnostic technique for preliminary screening of public ledgers.

4. Methodology

As mentioned in Section 1, this paper proposes a methodology for identifying out-of-the-ordinary behavior and possibly detecting fraud in blockchain-based currencies. As such, the purpose is to present scientific grounds for the feasibility and usefulness of the method as well as to propose a set of usage guidelines and a use case where our hypotheses were confirmed.
Our research experiment started with gathering all transactions on the Ethereum (ETH) network. Ethereum was chosen for these properties: It is one of the biggest cryptocurrencies by market capitalization and number of transactions processed; the network houses multiple cryptocurrencies (tokens) that could be compared directly (this part of the experiment is still open); and it is a well-documented and accessible blockchain. The first preliminary results revealed that transaction values (non-aggregated) of the whole Ethereum network do not conform to Benford’s law [1] as is presented in Figure 2. Blue color depicts the leading digits that conform to Benford’s law, red color depicts the non-conforming digits. The reasoning is further discussed in Section 4.1.
Figure 2. The leading digits of all ETH transaction values do not conform to Benford’s law. The daily aggregated values conform to the same metric (see Figure 3), leading to a possible conclusion that there are too many automatic transactions in the network, but the aggregated values avoid this effect.
Although this does not mean that there was any artificial manipulation or any other kind of anomaly, we investigated further. According to [31], the Benford’s law metric can be used to achieve similar goals on aggregated data. We explored the same phenomenon on aggregated values (number of transactions in an observed period, aggregated transaction values, …). Most of the aggregated values conform to Benford’s law according to the chi-square ($\chi^2$) goodness-of-fit test [32], which most of the literature, such as [16], considers a suitable tool to test Benford’s law conformity. We extended our research to all major cryptocurrencies with enough transactions in the selected time period.

4.1. Methods

The observation sets need to conform to all the basic prerequisites for Benford’s law as described in Section 2. This is the agenda for the executed research:
  • Take all major cryptocurrencies into consideration;
  • Express all aggregated daily transactions in one currency—we selected USD ($) as the most used fiat currency in comparisons;
  • Select a viable observation period:
    The starting date for each currency was the date of the first successful transaction;
    The ending date for the observation period was set far enough in the past that fraud or abnormal behavior was well documented (in the form of lawsuits, scandals, vanished cryptocurrencies, or well-documented special properties of specific currencies). We selected the end of 2018, almost three years in the past, as the ending date;
    The observation period must be long enough to make a Benford’s law conformity test feasible (as presented in Section 2). In the body of surveyed literature, the sample size varies from 200 [33] to a few hundred thousand. We opted for doubling the minimum sample size, selecting all cryptocurrencies with 400 or more transaction days;
  • Perform the MAD test [34], classify all the cryptocurrencies according to [35], and visually observe all conformity graphs;
  • Perform a literature review for all the currencies that do not conform to Benford’s law and establish whether any abnormalities are documented for the selected time frame.
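The data preparation steps above (daily aggregation in USD, a fixed end of the observation period, and a minimum of 400 transaction days) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the record layout `(day, amount, usd_price)`, the function names, and the toy coins "AAA" and "BBB" are hypothetical and are not taken from the authors’ code or data.

```python
from collections import defaultdict
from datetime import date

END_OF_PERIOD = date(2018, 12, 31)  # end of the observation period
MIN_DAYS = 400                      # minimum number of transaction days


def daily_usd_totals(transactions):
    """Aggregate (day, amount, usd_price) records into USD totals per day.

    The record layout is an assumption made for this sketch.
    """
    totals = defaultdict(float)
    for day, amount, usd_price in transactions:
        if day <= END_OF_PERIOD:
            totals[day] += amount * usd_price
    return dict(totals)


def eligible_coins(totals_by_coin, min_days=MIN_DAYS):
    """Keep only coins with at least `min_days` days of non-zero volume."""
    return sorted(
        coin
        for coin, totals in totals_by_coin.items()
        if sum(1 for value in totals.values() if value > 0) >= min_days
    )


# Tiny made-up example; min_days is lowered so the toy data can pass.
toy = {
    "AAA": daily_usd_totals([
        (date(2018, 1, 1), 2.0, 100.0),
        (date(2018, 1, 1), 1.0, 100.0),
        (date(2018, 1, 2), 0.5, 120.0),
    ]),
    "BBB": daily_usd_totals([(date(2019, 1, 1), 1.0, 50.0)]),
}
print(eligible_coins(toy, min_days=2))
```

With the default threshold of 400 days neither toy coin would qualify, which mirrors how the minimum number of observations acts as the main filter on the set of analyzed cryptocurrencies.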
Conformity to Benford’s law distribution has been tested with many goodness-of-fit tests, including Pearson’s chi-squared test [36], the Kolmogorov–Smirnov D statistic [37], Freedman’s modification of the Watson $U^2$ statistic [38], the Euclidean distance d statistic, and many others. However, no real data will ever follow the exact distribution; hence, most analyses supplement statistical testing with graphical representations that help in pointing out suspicious patterns in the data for further investigation. Additionally, different tests react differently to sample size. The chi-square test suffers from an excess power problem: when the number of observations becomes large (above 5000 records, as estimated by [35]), it becomes more sensitive to insignificant spikes, leading to the conclusion that the data does not conform. Ref. [39] suggested that some statistical tests can render misleading results when applied to a large number of observations. On the other hand, ref. [40] concludes that the Mean Absolute Deviation (MAD) test [34] is reliable with as few as 200 observations (as an additional safety measure, we opted to double that value to 400 in our experiment). Ref. [41] proposed the Mantissa Arc test, which is a very interesting geometrical test. Unfortunately, it tolerates little deviation from Benford’s distribution.
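The excess power problem of the chi-square test can be made concrete with a small Python sketch. The digit counts below are made up for illustration, and the critical value 15.507 is the standard 5% critical value of the chi-square distribution with 8 degrees of freedom; the code is not the authors’ implementation.

```python
import math

# Expected Benford proportions for leading digits 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

# Critical value of the chi-square distribution with 8 degrees of freedom
# at the 5% significance level.
CRITICAL_5PCT_DF8 = 15.507


def chi_square(counts):
    """Pearson chi-square statistic of observed digit counts vs. Benford."""
    n = sum(counts)
    return sum((c - n * p) ** 2 / (n * p) for c, p in zip(counts, BENFORD))


# Made-up digit counts for 1000 observations, slightly off Benford.
small = [310, 172, 123, 96, 78, 67, 58, 51, 45]
# The same proportions with 100 times as many observations.
large = [c * 100 for c in small]

for name, counts in (("n = 1000", small), ("n = 100000", large)):
    stat = chi_square(counts)
    verdict = "reject" if stat > CRITICAL_5PCT_DF8 else "do not reject"
    print(f"{name}: chi-square = {stat:.2f} -> {verdict} conformity")
```

Because the statistic grows linearly with the number of observations when the proportions are held fixed, the same small deviation passes at 1000 observations and fails at 100,000. A measure based only on proportions, such as MAD, does not change in this experiment.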
Ref. [35] concluded that the best test is the Mean Absolute Deviation (MAD), and much of the state-of-the-art literature agrees with this proposal. Ref. [35] also presents a list of thresholds to classify the observed conformity (the value given is the lower bound of each MAD range):
  • Close conformity (from 0.000);
  • Acceptable conformity (from 0.006);
  • Marginally acceptable conformity (from 0.012);
  • Nonconformity (0.015 and above).
The adapted MAD is used to measure the average deviation between the heights of the bars and the Benford line. The higher the MAD, the lower the conformity. We opted to perform conformity tests using all three of the aforementioned tests as our sample sizes are well within the acceptable ranges. All presented statistical tests are also supplemented with graphical representations; the results are presented in Section 5.
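The MAD computation and the threshold classification can be sketched in Python as follows. This is a minimal sketch that takes the thresholds listed above as lower bounds of each class; the function names and the example digit counts are assumptions made for illustration.

```python
import math

# Expected Benford proportions for leading digits 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]


def mean_absolute_deviation(counts):
    """Average absolute gap between observed and expected proportions."""
    n = sum(counts)
    if n == 0:
        raise ValueError("at least one observation is required")
    return sum(abs(c / n - p) for c, p in zip(counts, BENFORD)) / len(BENFORD)


def classify(mad):
    """Map a first-digit MAD value to a conformity class."""
    if mad < 0.006:
        return "Close conformity"
    if mad < 0.012:
        return "Acceptable conformity"
    if mad < 0.015:
        return "Marginally acceptable conformity"
    return "Nonconformity"


# Made-up digit counts (digits 1..9) for two hypothetical data sets.
benford_like = [301, 176, 125, 97, 79, 67, 58, 51, 46]
all_ones = [1000, 0, 0, 0, 0, 0, 0, 0, 0]

for counts in (benford_like, all_ones):
    mad = mean_absolute_deviation(counts)
    print(f"MAD = {mad:.4f}: {classify(mad)}")
```

Unlike the chi-square statistic, the MAD depends only on the observed proportions, so it is not inflated merely because the sample is large.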

The Criteria That the Objects under Scrutiny Must Meet

Select a big enough set of aggregated data that conforms to Benford’s law prerequisites described in Section 2. Observing only ledgers, the prerequisites that must be met are:
  • The ledger must support querying for transactions that contain the sending address, receiving address, amount, and timestamp;
  • The assets being transferred must be denominated in any universally comparable form (any fiat currency (i.e., US Dollars) meets this criterion) at the time of transfer.
Count leading digits and perform Mean Absolute Deviation (MAD) conformity [35] on the gathered data. Plot simple bar charts with the numbers for each leading digit and visually and manually observe the distribution. If the data does not conform to Benford’s law, investigate further.
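The counting and plotting step can be sketched as follows, with a plain-text bar chart standing in for a graphical one. The sample values are invented for illustration and the helper names are hypothetical; extracting the leading digit via scientific notation is one possible approach, not necessarily the one used by the authors.

```python
import math


def leading_digit(x):
    """Return the first significant digit of a non-zero number."""
    # Scientific notation always starts with the first significant digit.
    return int(f"{abs(x):.16e}"[0])


def digit_counts(values):
    """Count leading digits 1..9; zeros are skipped (no leading digit)."""
    counts = [0] * 9
    for v in values:
        if v:
            counts[leading_digit(v) - 1] += 1
    return counts


def text_chart(counts, width=50):
    """Render observed vs. expected proportions as a text bar chart."""
    n = sum(counts) or 1  # avoid division by zero for empty input
    lines = []
    for d, c in enumerate(counts, start=1):
        observed = c / n
        expected = math.log10(1 + 1 / d)
        bar = "#" * round(observed * width)
        lines.append(f"{d} {bar:<{width}} obs={observed:.3f} exp={expected:.3f}")
    return "\n".join(lines)


# Made-up daily USD totals, for illustration only.
sample = [1234.5, 0.00089, 42, 170000, 19.99, 0, 3.2e6, 1.05]
counts = digit_counts(sample)
print(text_chart(counts))
```

A sample this small cannot say anything about conformity; the sketch only shows the mechanics of counting digits and comparing them with the expected proportions.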

4.2. Materials

DataHub cryptocurrency datasets (DataHub cryptocurrency datasets: https://datahub.io/cryptocurrency accessed on 1 March 2021) hosts daily aggregated data about all transactions on all crypto coin networks from the first mined block on the Bitcoin network till the end of 2018. As such, it presents the perfect data source for our research. The problem that arises is how to get more recent data. The problem is further discussed in Section 6.
The data that support the findings of this study are openly available on Zenodo (Zenodo: https://zenodo.org/record/4682976 accessed on 1 January 2022, doi:10.5281/zenodo.4682976).

5. Results

This section presents the results of the experiment following the methodology from Section 4. All the figures in this section have the same format: a graph showing the distribution of leading digits. Red colored bars represent suspect values, which skew the distribution the most. Suspects are classified where the mean absolute deviation is above the threshold of 4. The threshold can be adjusted to increase the sensitivity. Suspects are useful as a starting point for further investigation in the case of nonconformity.
The time interval selected was between 2009 and 2018. Most of the cryptocurrencies were in an early development phase without a use-case or product, and consequently the amount of transactions recorded was negligible. Table 1 presents all cryptocurrencies that conformed to the prerequisites presented in Section 2 and Section 4. The most discriminating factor in this phase was the minimum number of observations, which was set to 400 days (roughly double the minimal number of observations for Benford’s law to be meaningful). This property eliminated all currencies that were started later than the last quarter of 2017. Each cryptocurrency is presented by its name and the ticker, number of observations (equal to the number of days), starting and ending date of the observation period and all the values from Benford’s law conformance test. The currencies were grouped into four groups according to [35] and were also sorted according to this grouping from best to worst conformance.
All non-conformant cryptocurrencies were thoroughly examined, and a list of publicly announced anomalies and even frauds was compiled for each of these cryptocurrencies. The two best-conforming cryptocurrencies and the two cryptocurrencies with the biggest market capitalization were also examined in detail. The results are presented in the remainder of the section. All the other cryptocurrencies can be further analyzed using the available accompanying data (Zenodo: https://zenodo.org/record/4682976 accessed on 1 January 2022, doi:10.5281/zenodo.4682976), which contain the raw aggregated data, a list of Benford’s law conformity values, and charts.
Two “best conforming” cryptocurrencies, Ethereum classic (ETC) and Vertcoin (VTC), both still respectable projects, were classified as “Close conformity”. The two biggest blockchain platforms regarding market capitalization, Bitcoin (BTC) and Ethereum (ETH), were classified as “Acceptable conformity” and “Marginally acceptable conformity”, respectively. Figure 4 shows Benford’s law conformance chart for further visual examination for all four cryptocurrencies.
Six of the currencies from Table 1 were classified as “non-conformant” to Benford’s law: EOS (EOS), TENX token (TENX), Veritaseum (VERI), Basic Attention Token (BAT), PIVX (PIVX), and Dogecoin (DOGE). Each of the cryptocurrencies from this list is presented and discussed below.

5.1. TENX Token (TENX)

Figure 5 shows the TENX aggregated transactions and the conformance to Benford’s law. The well-documented Wirecard scandal (Crypto.com, TenX crypto debit cards were frozen following the Wirecard scandal: https://decrypt.co/33695/crypto-debit-cards-frozen-following-wirecard-scandal accessed on 1 March 2021) is a possible reason for the non-conformity indicated by the MAD value.

5.2. Veritaseum (VERI)

Figure 6 shows the VERI aggregated transactions and the conformance to Benford’s law.
The U.S. Securities and Exchange Commission (SEC) said it has reached a settlement with Reggie Middleton, organizer of the fraught $14.8 million Veritaseum (VERI) initial coin offering (ICO) (Analysis of the Veritaseum Scam: https://steemit.com/money/@financialcritic/analysis-of-the-veritaseum-scam accessed on 1 March 2021). The case was closed in October 2019, but the frauds were committed well within the observation period of our research.

5.3. Dogecoin (DOGE)

Figure 7 shows the DOGE aggregated transactions and the conformance to Benford’s law. The coin was initially introduced as a satire in December 2013 and included an image of the Doge meme as its logo. The author of this cryptocurrency revealed this motivation publicly. Some properties showing the soundness of our decision are as follows:
  • On 24 September 2018 (a randomly chosen working day at the end of our observation period), the last tweet from the official Twitter account dated from 14 July 2018 (80 days) (Dogecoin twitter account: https://twitter.com/Dodgecoin accessed on 1 March 2021);
  • It is presented as a fun and friendly internet currency; the Dogecoin logo is a dog from a meme;
  • The 24 h trading volume on all exchanges according to CoinCodex (CoinCodex: https://coincodex.com/crypto/XXX/exchanges/ accessed on 1 March 2021) was USD 42.51 million.
In recent years Dogecoin has gained a positive reputation as a “lost cause” funding platform, and, especially in 2021, the coin has seen a rapid increase in price with the help of celebrity exposure [42]. However, these recent developments were excluded from our analysis, as we fixed the observation period from the start of the crypto-assets until the end of 2018.

5.4. Basic Attention Token (BAT)

Figure 8 shows the BAT aggregated transactions and the conformance to Benford’s law. The transactions of the BAT coin are mostly automatically generated, as this coin is the basis of a digital marketing platform that periodically rewards users for participation, and as such they break the prerequisites of Benford’s law.

5.5. PIVX (PIVX)

Figure 9 shows the PIVX aggregated transactions and the conformance to Benford’s law. There was no scandal reported for the PIVX project in the observation period (in fact, the authors could not find any notable anomaly for this cryptocurrency). The only speculation that the authors could give is that the PIVX network relies on anonymous transactions that could be used to hide anomalies.

5.6. EOS (EOS)

Figure 10 shows the EOS aggregated transactions and the conformance to Benford’s law. EOS is regarded as a valid project and survived until 2021. The only drawback is that in 2018 the project was in its starting phase, and the capital raised by the backers of the project was an order of magnitude bigger than what the proposed project promised to accomplish (“Why EOS Failed to Kill Ethereum: The Fatal Flaw of Centralization in a Decentralized Market”: https://coincodex.com/article/10454/why-eos-failed-to-kill-ethereum-the-fatal-flaw-of-centralization-in-a-decentralized-market/ accessed on 1 March 2021).

5.7. Additional Currencies

An examination of all remaining cryptocurrencies that did not meet the criteria presented in Section 4, mainly due to the lack of data, shows additional cases that support the validity of the presented method. By lowering the requirement for the minimum number of observations to 300 days, we can observe additional cryptocurrencies that do not conform to Benford’s law and that have documented scams and scandals attributed to the observation period, such as: Enigma (ENG) (Enigma Ethereum marketplace was hijacked, its investors duped by phishing scam: https://www.zdnet.com/article/enigma-ethereum-marketplace-hijacked-by-attackers/ accessed on 1 March 2021); SALT (SALT) (SALT COIN EXIT SCAM! Massive selloff predicted by Morgan Stanley: https://www.youtube.com/watch?v=E2iNt3Z6qaY accessed on 1 March 2021); and Waltonchain (WTC) (Monumentally stupid tweet blows up in blockchain company’s face: https://mashable.com/2018/02/28/waltonchain-twitter-scam-wtc/?europe=true accessed on 1 March 2021).

6. Discussion and Future Work

The main goal of the presented research was to test the applicability of Benford’s law to the cryptocurrency transaction networks as a preliminary screening tool. The research focused on some well-documented anomalies and frauds from the past and compared the proposed metric on proven ecosystems that performed normally in the same time period. We focused on the time period between 2009 (time of the first transaction on the Bitcoin network) and 2018, as there were already enough transactions to meet all of Benford’s law prerequisites, but also enough time had passed so that the anomalies and frauds had already emerged to the public.
The results show that the proposed method is suitable for the proposed domain. All the big blockchain platforms by market capitalization that were not affected by any big scandal or lawsuit and that are still functioning three years after the observation time frame, such as Bitcoin (BTC), Ethereum (ETH), or OmiseGo (OMG), conform to Benford’s law. However, failing to conform to Benford’s distribution does not necessarily imply fraud. The method can produce false positives, where a cryptocurrency does not conform and no particular fraudulent reason can be found. This can result from the nature of the transactions of the observed currency. The method does not find the actual anomaly, but it can be used as a preliminary screening that should always lead into fine-grained methods such as machine learning methods and graph-based searching. The inspection of the six cryptocurrencies that were classified as non-conforming to Benford’s law revealed three currencies with well-documented anomalies: two (TENX and VERI) were tainted by scandals and lawsuits, and one (DOGE) was invented as a joke and was regarded as such in its first years. As an additional observation, Dogecoin is now a respected cryptocurrency and in the last year grew to a market capitalization of USD 50 billion. The method is obviously not suitable for predicting the future of an observed cryptocurrency. The transactions of the BAT coin are mostly automatically generated, as this coin is the basis of a digital marketing platform. The two remaining cryptocurrencies that were identified by the method as possible candidates for anomalous behaviour were EOS and PIVX, and although we could speculate to some extent why these two did not conform to Benford’s law, the results are inconclusive.
All major cryptocurrencies that existed in the selected time-frame (2009–2018) were tested for the conformity to Benford’s law. The data availability statement is presented in Section 4.2.
Future work, which is already underway, will focus on newer data. One such possible source has already been identified: Kaggle (Cryptocurrency Historical Prices: https://www.kaggle.com/sudalairajkumar/cryptocurrencypricehistory accessed on 1 March 2021). Another open issue that can be tackled with the same methodology is a comparison of all ERC20 tokens [43]. Ethereum-based cryptocurrencies were selected to ensure a common (thus fair) technical basis—all these cryptocurrencies use the same technological platform, so all possible reasons for differences that arise from basic technology are eliminated.

Author Contributions

Conceptualization, A.T. and J.V.; methodology, A.T. and J.V.; software, A.T.; validation, A.T. and J.V.; formal analysis, A.T. and J.V.; investigation, A.T. and J.V.; funding acquisition and resources, J.V.; data curation, A.T.; writing–original draft preparation, A.T. and J.V.; writing–review and editing, A.T. and J.V.; visualization, A.T. and J.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by H2020 grant number 739574 and by the Slovenian Research Agency (ARRS) grant number J2-2504.

Institutional Review Board Statement

The data gathering process did not involve the use of human subjects.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data that support the findings of this study are openly available on Zenodo (Zenodo: https://zenodo.org/record/4682976 accessed on 1 January 2022, doi:10.5281/zenodo.4682976).

Acknowledgments

The authors gratefully acknowledge the European Commission for funding the InnoRenew project (Grant Agreement #739574) under the Horizon2020 Widespread-Teaming program and the Republic of Slovenia (Investment funding of the Republic of Slovenia and the European Regional Development Fund). They also acknowledge the Slovenian Research Agency ARRS for funding the project J2-2504.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Benford, F. The law of anomalous numbers. Proc. Am. Philos. Soc. 1938, 78, 551–572.
  2. Lansky, J. Possible state approaches to cryptocurrencies. J. Syst. Integr. 2018, 9, 19–31.
  3. Noether, S. Ring Signature Confidential Transactions for Monero. IACR Cryptol. ePrint Arch. 2015, 2015, 1098.
  4. Mettler, M. Blockchain technology in healthcare: The revolution starts here. In Proceedings of the 2016 IEEE 18th International Conference on e-Health Networking, Applications and Services (Healthcom), Munich, Germany, 14–16 September 2016; pp. 1–3.
  5. Campbell-Verduyn, M. Bitcoin, crypto-coins, and global anti-money laundering governance. Crime Law Soc. Chang. 2018, 69, 283–305.
  6. Nakamoto, S. Bitcoin Whitepaper. Technical Report. 2008. Available online: Bitcoin.org (accessed on 1 March 2021).
  7. Jakobsson, M.; Juels, A. Proofs of work and bread pudding protocols. In Secure Information Networks; Jakobsson, M., Juels, A., Eds.; Springer: Leuven, Belgium, 1999; pp. 258–272.
  8. Newcomb, S. Note on the Frequency of Use of the Different Digits in Natural Numbers. Am. J. Math. 1881, 4, 39–40.
  9. Singleton, T.W. IT Audit Basics: Understanding and Applying Benford’s Law. Isaca J. 2011, 3, 6.
  10. Kenny, D.A. Measuring Model Fit. 2015. Available online: http://davidakenny.net/cm/fit.htm (accessed on 1 March 2021).
  11. Berger, A.; Hill, T.P. A basic theory of Benford’s Law. Probab. Surv. 2011, 8, 1–126.
  12. Fewster, R.M. A Simple Explanation of Benford’s Law. Am. Stat. 2009, 63, 26–32.
  13. Kumar, K.; Bhattacharya, S. Detecting the dubious digits: Benford’s law in forensic accounting. Significance 2007, 4, 81–83.
  14. Nigrini, M.J. Audit sampling using Benford’s law: A review of the literature with some new perspectives. J. Emerg. Technol. Account. 2017, 14, 29–46.
  15. Drake, P.D.; Nigrini, M.J. Computer assisted analytical procedures using Benford’s Law. J. Account. Educ. 2000, 18, 127–146.
  16. Durtschi, C.; Hillison, W.; Pacini, C. The effective use of Benford’s law to assist in detecting fraud in accounting data. J. Forensic Account. 2004, 5, 17–34.
  17. Cleary, R.; Thibodeau, J.C. Applying Digital Analysis Using Benford’s Law to Detect Fraud: The Dangers of Type I Errors. Audit. J. Pract. Theory 2005, 24, 77–81.
  18. Hickman, M.J.; Rice, S.K. Digital Analysis of Crime Statistics: Does Crime Conform to Benford’s Law? J. Quant. Criminol. 2010, 26, 333–349.
  19. Burke, J.; Kincanon, E. Benford’s law and physical constants: The distribution of initial digits. Am. J. Phys. 1991, 59, 952.
  20. Zhang, J. Testing Case Number of Coronavirus Disease 2019 in China with Newcomb-Benford Law. arXiv 2020, arXiv:2002.05695.
  21. Baum, S.C. Cryptocurrency Fraud: A Look into the Frontier of Fraud. Ph.D. Thesis, Georgia Southern University, Statesboro, GA, USA, 2018.
  22. Zuckoff, M. Ponzi’s Scheme: The True Story of a Financial Legend; Random House Incorporated: New York, NY, USA, 2006.
  23. Twomey, D.; Mann, A. Fraud and manipulation within cryptocurrency markets. In Corruption and Fraud in Financial Markets: Malpractice, Misconduct and Manipulation; Alexander, C., Cumming, D., Eds.; Wiley: Hoboken, NJ, USA, 2020; pp. 205–250.
  24. Gao, B.; Wang, H.; Xia, P.; Wu, S.; Zhou, Y.; Luo, X.; Tyson, G. Tracking Counterfeit Cryptocurrency End-to-end. Proc. ACM Meas. Anal. Comput. Syst. 2020, 4, 1–28.
  25. Brown, S.D. Cryptocurrency and criminality: The Bitcoin opportunity. Police J. 2016, 89, 327–339.
  26. Brenig, C.; Müller, G. Economic Analysis of Cryptocurrency Backed Money Laundering; ECIS 2015 Completed Research Papers; Association for Information Systems: Atlanta, GA, USA, 2015; pp. 1–18.
  27. Farrugia, S.; Ellul, J.; Azzopardi, G. Detection of illicit accounts over the Ethereum blockchain. Expert Syst. Appl. 2020, 150, 113318.
  28. Sayadi, S.; ben Rejeb, S.; Choukair, Z. Anomaly Detection Model Over Blockchain Electronic Transactions. In Proceedings of the 2019 15th International Wireless Communications Mobile Computing Conference (IWCMC), Tangier, Morocco, 24–28 June 2019; pp. 895–900.
  29. Baek, H.; Oh, J.; Kim, C.Y.; Lee, K. A Model for Detecting Cryptocurrency Transactions with Discernible Purpose. In Proceedings of the 2019 Eleventh International Conference on Ubiquitous and Future Networks (ICUFN), Zagreb, Croatia, 2–5 July 2019; pp. 713–717.
  30. Kamišalić, A.; Kramberger, R.; Fister, I. Synergy of Blockchain Technology and Data Mining Techniques for Anomaly Detection. Appl. Sci. 2021, 11, 7987.
  31. Shi, J.; Ausloos, M.; Zhu, T. Benford’s law first significant digit and distribution distances for testing the reliability of financial reports in developing countries. Phys. A Stat. Mech. Its Appl. 2018, 492, 878–888.
  32. Fisher, R.A. Statistical Methods for Research Workers; Oliver and Boyd: Edinburgh, UK, 1925.
  33. Carslaw, C.A. Anomalies in income numbers: Evidence of goal oriented behavior. Account. Rev. 1988, 63, 321–327.
  34. Gorard, S. Revisiting a 90-year-old debate: The advantages of the mean deviation. Br. J. Educ. Stud. 2005, 53, 417–430.
  35. Nigrini, M.J. Benford’s Law: Applications for Forensic Accounting, Auditing, and Fraud Detection; Wiley: Hoboken, NJ, USA, 2012; p. 352.
  36. Pearson, K. X. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Lond. Edinb. Dublin Philos. Mag. J. Sci. 1900, 50, 157–175.
  37. Berger, V.W.; Zhou, Y. Kolmogorov–Smirnov Test: Overview; Wiley Statsref: Statistics Reference Online: Hoboken, NJ, USA, 2014.
  38. Freedman, L.S. Watson’s U_N^2 statistic for a discrete distribution. Biometrika 1981, 68, 708–711.
  39. Nigrini, M. Digital Analysis Using Benford’s Law: Tests and Statistics for Auditors. EDPACS 2001, 28, 1–2.
  40. Druică, E.; Oancea, B.; Vâlsan, C. Benford’s law and the limits of digit analysis. Int. J. Account. Inf. Syst. 2018, 31, 75–82.
  41. Alexander, J.C. Remarks on the Use of Benford’s Law. 2009. Available online: http://dx.doi.org/10.2139/ssrn.1505147 (accessed on 1 March 2021).
  42. Livni, E. Serious money is flowing to the joke cryptocurrency Dogecoin. New York Times, 2 August 2021; pp. 1–2.
  43. Somin, S.; Gordon, G.; Altshuler, Y. Network analysis of ERC20 tokens trading on Ethereum blockchain. In International Conference on Complex Systems; Springer: Berlin/Heidelberg, Germany, 2018; pp. 439–450.
Figure 1. The distribution of digits in accordance to Benford’s law [9]. Blue colored bars represent digits that conform to Benford’s law.
Figure 3. The leading digits of daily aggregated ETH transaction values in USD conform to Benford’s law. Blue colored bars represent digits that conform and red colored bars represent digits that do not conform to Benford’s law.
Figure 4. The two best conforming (ETC) and (VTC) currencies with “Close conformity” and the two biggest cryptocurrencies (BTC)—“Acceptable conformity” and (ETH)—“Marginally acceptable conformity” for aggregated value in USD transaction history.
Figure 5. TENX aggregated transactions and the conformance to Benford’s law. Digit 1 overflows, digit 4 (almost) underflows. Overall, the daily aggregated transaction values do not conform.
Figure 6. VERI aggregated transactions and the conformance to Benford’s law. Digit 1 overflows. Overall the daily aggregated transaction values do not conform.
Figure 7. DOGE aggregated transactions and the conformance to Benford’s law. Digit 1 overflows, digit 3 underflows. Overall the daily aggregated transaction values do not conform.
Figure 8. BAT aggregated transactions and the conformance to Benford’s law. Digit 1 overflows, digit 2 underflows, digit 7 (almost) overflows. Overall the daily aggregated transaction values do not conform.
Figure 9. PIVX aggregated transactions and the conformance to Benford’s law. Digit 1 overflows, digit 3 underflows. Overall the daily aggregated transaction values do not conform.
Figure 10. TENX aggregated transactions and the conformance to Benford’s law. Digit 1 overflows, digits 2 and 4 (almost) underflow. Overall the daily aggregated transaction values do not conform.
Table 1. Conformity tests for all major cryptocurrencies in the observed time-period with more than 400 days of transactions on the blockchain. The records are sorted according to MAD Conformity column, from close conforming to nonconforming.

| Currency | Obs. | Pearson’s Chi-Squared Test: X-Squared | Pearson’s Chi-Squared Test: p-Value | Mantissa Arc Test: L2 | Mantissa Arc Test: p-Value | MAD | MAD Conformity | Distortion Factor | Start Date | End Date |
|---|---|---|---|---|---|---|---|---|---|---|
| Ethereum Classic (ETC) | 750 | 1.766027 | 0.9873638 | 0.0000861 | 0.9374726 | 0.00351481 | Close | −0.1409321 | 2015-07-30 | 2018-08-12 |
| Vertcoin (VTC) | 1666 | 7.30948 | 0.5036398 | 0.000165673 | 0.7588044 | 0.005795195 | Close | −1.621333 | 2014-01-10 | 2018-08-12 |
| Metal (MTL) | 400 | 5.15115 | 0.7413057 | 0.001522525 | 0.543889 | 0.01089584 | Acceptable | −0.1257945 | 2017-06-29 | 2018-08-12 |
| Status (SNT) | 411 | 7.692396 | 0.4640798 | 0.001050824 | 0.6492816 | 0.01005221 | Acceptable | −1.560401 | 2017-06-19 | 2018-08-12 |
| Aragon (ANT) | 452 | 5.696092 | 0.6812311 | 0.005388913 | 0.08752867 | 0.01078389 | Acceptable | 3.448495 | 2017-05-15 | 2018-08-12 |
| Waves (WAVES) | 603 | 5.14964 | 0.7414692 | 0.003501951 | 0.1210349 | 0.008216651 | Acceptable | −1.721726 | 2016-06-02 | 2018-08-12 |
| Iconomi (ICN) | 658 | 10.17673 | 0.2528404 | 0.0008252317 | 0.5810012 | 0.0104604 | Acceptable | 0.8235436 | 2016-09-30 | 2018-08-12 |
| NEO (NEO) | 665 | 3.823118 | 0.8727192 | 0.0008927334 | 0.5522979 | 0.006478303 | Acceptable | 1.316035 | 2016-09-09 | 2018-08-12 |
| Lisk (LSK) | 811 | 11.45478 | 0.1772377 | 0.001803885 | 0.231552 | 0.009606102 | Acceptable | 3.072645 | 2016-04-06 | 2018-08-12 |
| Stellar (XLM) | 1009 | 9.622045 | 0.2925614 | 0.002200075 | 0.1086226 | 0.007992198 | Acceptable | 1.221268 | 2014-08-05 | 2018-08-12 |
| Verge (XVG) | 1387 | 8.300241 | 0.4047048 | 0.002115592 | 0.05316656 | 0.007575786 | Acceptable | −2.84182 | 2014-10-09 | 2018-08-12 |
| MaidSafeCoin (MAID) | 1560 | 10.43771 | 0.2356377 | 0.003279288 | 0.006001835 | 0.007513696 | Acceptable | 3.73407 | 2014-04-22 | 2018-08-12 |
| Dash (DASH) | 1641 | 5.958045 | 0.6519316 | 0.001418531 | 0.09750916 | 0.00621291 | Acceptable | 0.8615983 | 2014-01-19 | 2018-08-12 |
| DigiByte (DGB) | 1649 | 25.9 | 0.00111 | 0.00321 | 0.005 | 0.01088511 | Acceptable | −2.4136 | 2014-01-10 | 2018-08-12 |
| Bitcoin (BTC) | 1933 | 30.8193 | 0.0001512958 | 0.0006696828 | 0.2740357 | 0.01158613 | Acceptable | 5.881506 | 2013-04-28 | 2018-08-12 |
| Gnosis (GNO) | 468 | 8.754344 | 0.3634412 | 0.006937894 | 0.03889326 | 0.01312756 | Marginally acceptable | 1.135551 | 2017-04-18 | 2018-08-12 |
| Golem (GLM) | 633 | 11.07461 | 0.1975074 | 0.003690431 | 0.0967096 | 0.0129236 | Marginally acceptable | 6.131378 | 2016-11-11 | 2018-08-12 |
| Zcash (ZEC) | 653 | 20.82315 | 0.007632357 | 0.001029657 | 0.5104994 | 0.01293599 | Marginally acceptable | −0.9372237 | 2016-10-28 | 2018-08-12 |
| Decred (DCR) | 915 | 17.6832 | 0.02373108 | 0.0005975181 | 0.5788401 | 0.01375337 | Marginally acceptable | −1.586765 | 2016-02-08 | 2018-08-12 |
| Ethereum (ETH) | 1102 | 25.77399 | 0.00115 | 0.000378 | 0.658996 | 0.01482756 | Marginally acceptable | −0.08431323 | 2015-08-07 | 2018-08-12 |
| NEM (XEM) | 1230 | 27.13364 | 0.0006703807 | 0.008295528 | 0.000037 | 0.01417723 | Marginally acceptable | 3.19854 | 2015-03-29 | 2018-08-12 |
| Tether (USDT) | 1258 | 34.91683 | 0.0000277 | 0.0138 | 0.00000003 | 0.01391653 | Marginally acceptable | −5.969747 | 2014-10-06 | 2018-08-12 |
| EOS (EOS) | 401 | 15.36398 | 0.05244271 | 0.003494984 | 0.2462301 | 0.0200535 | Nonconformity | −2.819878 | 2017-06-20 | 2018-08-12 |
| TENX token (TENX) | 402 | 10.5 | 0.234 | 0.00808 | 0.0389 | 0.01539412 | Nonconformity | −7.119347 | 2017-06-27 | 2018-08-12 |
| Veritaseum (VERI) | 431 | 11.32151 | 0.1841391 | 0.01211339 | 0.005402612 | 0.01726905 | Nonconformity | −1.603899 | 2017-04-25 | 2018-08-12 |
| Basic Attention Token (BAT) | 438 | 19.05523 | 0.01456707 | 0.01293943 | 0.003456598 | 0.02196946 | Nonconformity | 0.2319942 | 2017-05-29 | 2018-08-12 |
| PIVX (PIVX) | 903 | 28.08438 | 0.0004584671 | 0.01199764 | 0.0000197 | 0.01890993 | Nonconformity | −7.031687 | 2016-01-30 | 2018-08-12 |
| Dogecoin (DOGE) | 1702 | 83.1755 | 1.12 × 10^−14 | 0.02422157 | 1.25 × 10^−18 | 0.0214206 | Nonconformity | −9.527495 | 2013-12-08 | 2018-08-12 |
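
The mantissa arc columns of Table 1 can be read with the following minimal Python sketch. It assumes the usual center-of-mass formulation of the test: each value is mapped to a point on the unit circle by the fractional part of its base-10 logarithm, L2 is the squared distance of the mean point from the origin, and the p-value is exp(−N·L2) for N observations. The function names are hypothetical, and the formula is an assumption about how the reported values were obtained, although the p-values in Table 1 appear consistent with it.

```python
import math


def arc_p_value(l2, n):
    """p-value of the mantissa arc test for statistic l2 and n observations."""
    return math.exp(-l2 * n)


def mantissa_arc_test(values):
    """Return (L2, p_value) for the non-zero, finite entries of values."""
    xs, ys = [], []
    for v in values:
        if v == 0 or not math.isfinite(v):
            continue  # zero and non-finite values have no mantissa
        mantissa = math.log10(abs(v)) % 1.0  # fractional part in [0, 1)
        angle = 2 * math.pi * mantissa
        xs.append(math.cos(angle))
        ys.append(math.sin(angle))
    n = len(xs)
    if n == 0:
        raise ValueError("no usable observations")
    # Squared distance of the center of mass from the origin.
    l2 = (sum(xs) / n) ** 2 + (sum(ys) / n) ** 2
    return l2, arc_p_value(l2, n)


if __name__ == "__main__":
    # Metal (MTL) row of Table 1: L2 = 0.001522525 with 400 observations.
    print(round(arc_p_value(0.001522525, 400), 6))
```

Mantissae spread evenly around the circle give an L2 near zero and a p-value near one, whereas mantissae concentrated on one arc pull the center of mass away from the origin and drive the p-value towards zero.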
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.