Big-Crypto: Big Data, Blockchain and Cryptocurrency

Abstract: Cryptocurrency has been a trending topic over the past decade, pooling tremendous technological power and attracting investments valued at trillions of dollars on a global scale. Cryptocurrency technology and its network have been endowed with many superior features owing to their unique architecture, which also determines their worldwide efficiency, applicability and data-intensive characteristics. This paper introduces and summarises the interactions between two significant concepts in the digitalized world, i.e., cryptocurrency and Big Data. Both subjects are at the forefront of technological research, and this paper focuses on their convergence and comprehensively reviews the very recent applications and developments after 2016. Accordingly, we aim to present a systematic review of the interactions between Big Data and cryptocurrency and to serve as a one-stop reference directory for researchers with regard to identifying research gaps and directing future explorations.


Introduction
Cryptocurrency, as an emerging topic, has been the focus of developers, investors and researchers in the past few years. Although the market shows significant volatility [1], the total market value has reached hundreds of billions of US dollars, with some experts suggesting it would hit a USD 1 trillion valuation this year [2]. In addition, new cryptocurrencies, trading platforms, developers, and banking and institutional partners are joining the market regularly. Today, this market, valued at nearly a trillion dollars, is certainly influencing how people invest and transact.
The digitalization and technological progression of the modern world have prompted the collection, analysis and implementation of Big Data, and Big Data analytics have become embedded in every aspect of daily life and are progressing rapidly [4]. The Internet of Things (IoT) [5] is changing the network and communication infrastructure, cloud computing [6] is altering the way computation and data storage are carried out, while data mining techniques, machine learning and Artificial Intelligence [7] are revolutionizing knowledge extraction, problem solving, decision making and operation optimization. These Big Data analytic technologies are not just the trending focus of research and implementation, but also possible solutions and leading strategies for many aspects of human life, such as disease prediction [8] and healthcare [9,10]. For instance, the MapReduce programming framework [11], as a fusion of Big Data analytics processes, has provided a significant paradigm for both industry and academia.
Cryptocurrencies are encrypted digital currencies that operate in a system which cannot be materialized, and the well-structured, comprehensive records of the tremendous overall network satisfy the 5 V features of Big Data (volume, variety, velocity, veracity and value) [12]. Thus, the network serves as a good resource for Big Data analytics, while Big Data analytics also hold the keys to the revolution and development of cryptocurrencies. For instance, the global scale of digitalization and the popularization of the IoT are prompting the adoption of novel technologies in general, which makes cryptocurrency a more promising alternative. Moreover, Big Data analytics can help investors and developers to make better decisions and to overcome the infrastructure limitations of cryptocurrency. On the other hand, the technologies underlying cryptocurrencies have proved applicable to a wide range of subjects, which has further accelerated digitalization and extended the Big Data analytics network. In brief, there are mutual benefits to exploit when considering the interactions between Big Data and cryptocurrency, and the potential remains immeasurable.
This paper focuses directly on the interactions between Big Data and cryptocurrency, two significant concepts that have each been comprehensively investigated individually. We aim to present a comprehensive investigation of their convergence and a systematic review of recent developments for all stakeholders. The paper is intended to be accessible to both academic and industrial stakeholders who seek a better understanding of the interactions between Big Data and cryptocurrency or aim to explore their future potential.
The remainder of this paper is organized as follows. Cryptocurrency is comprehensively introduced in Section 2. Research on the interactions between Big Data and cryptocurrency is summarized and reviewed by topic in Section 3. Finally, Section 4 concludes the findings and presents directions for future research.

Cryptocurrency
Cryptocurrency is an encrypted digital currency that incorporates cryptographic techniques. The first cryptocurrency, Bitcoin, was invented by Satoshi Nakamoto [13] in 2008 and has been in circulation since 2009. Since then, it has become the most famous cryptocurrency and the representative term for cryptocurrencies and digital currencies [14]. As can be seen in Figure 1, the market price of Bitcoin reached around 20,000 USD by the end of 2017, and today it averages around 8000 USD. Technically, the "mining" of cryptocurrency means contributing computational power to the cryptocurrency network in order to approve transactions, in return for which a very small fee is paid in cryptocurrency. Although it is now extremely difficult to mine Bitcoin [15], its huge potential and market value have attracted more miners and developers to this growing market. The total supply of Bitcoin has grown continuously ever since it was first implemented. Unlike banknotes, cryptocurrency is an encrypted digital currency which cannot be materialized. Considering the rapid development of means of payment and transaction over the last decade, although it is still uncertain whether cryptocurrency will be the future currency, its significance and possible influence should not be underestimated. In this section, we briefly introduce the history, key features and challenges of cryptocurrency. Those interested in a more detailed introduction to Bitcoin and cryptocurrency technologies are referred to [16].
Online payments are now a more widely preferred form of transaction than they were years ago, and payment intermediary platforms like PayPal have further enhanced the security and privacy protection of online payments. However, the use of cryptocurrency can overcome many drawbacks of the existing transaction system by incorporating cryptographic techniques [17]. The most important features of cryptocurrency are the following: it uses decentralized control, so that the buyer and seller transact with each other directly (peer-to-peer); the traders remain anonymous, so that privacy is protected to the maximum extent; the records are irreversible; it is applicable worldwide; it is efficient; and it is free from double-spending concerns.
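The irreversibility of records mentioned above can be illustrated with a minimal sketch of hash-linked blocks. This is a simplified illustration under stated assumptions, not the Bitcoin data format: the field names, the use of JSON for serialization and the helper names block_hash, build_chain and is_valid are all hypothetical, and mining, signatures and networking are left out entirely.

```python
import hashlib
import json


def block_hash(index, prev_hash, transactions):
    """Hash the block contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "transactions": transactions},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(batches):
    """Build a list of blocks, each one committing to its predecessor."""
    chain = []
    prev_hash = "0" * 64  # placeholder hash for the first block (assumption)
    for index, transactions in enumerate(batches):
        digest = block_hash(index, prev_hash, transactions)
        chain.append({
            "index": index,
            "prev_hash": prev_hash,
            "transactions": transactions,
            "hash": digest,
        })
        prev_hash = digest
    return chain


def is_valid(chain):
    """Recompute every hash and check each link to the previous block."""
    prev_hash = "0" * 64
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev_hash"], block["transactions"])
        if block["hash"] != expected:
            return False
        prev_hash = block["hash"]
    return True


# Toy transactions, invented for illustration only.
chain = build_chain([
    [{"from": "alice", "to": "bob", "amount": 5}],
    [{"from": "bob", "to": "carol", "amount": 2}],
])
untampered_ok = is_valid(chain)

# Rewriting an old record changes its recomputed hash, so validation fails.
chain[0]["transactions"][0]["amount"] = 500
tampered_ok = is_valid(chain)
```

Because every block stores the hash of its predecessor, altering an earlier record would require recomputing all later blocks, which is what makes silent modification impractical once many participants hold copies of the chain.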

Blockchain
Among the disadvantages of blockchain, there are complex regulatory issues to resolve before the technology can be implemented as the mainstream transaction system [61]. Moreover, there is evidence of blockchain-based platforms being hacked (e.g., the Mt. Gox hack, the DAO hack on Ethereum, the Bitfinex hack, the NiceHash hack and the very recent 500 million Coincheck hack, to name a few), and it is evident that the substantial value of cryptocurrency has attracted not only developers and investors, but also cybercriminals [62].
To overcome these limitations, developers and researchers have been enhancing this technology through a variety of approaches, the majority of which focus on improving security and privacy. In [55], the authors divided the challenges of blockchain into eight aspects and collected the corresponding solutions up until 2016. More recent developments and discussions, such as business process adoption, localization and quantum resistance, can be found in [40,63,64]. Two further aspects of development are highlighted here, considering their substantial significance and under-researched status.

Blockchain and Tangle and Hashgraph
Along with the rapid growth of the cryptocurrency market, two alternative technologies have been developed with the aim of outperforming the fundamental blockchain: Tangle [65] and Hashgraph [66]. A detailed introduction to and comparison of these two alternative technologies can be found in [67].
Tangle is based on the IOTA protocol, and its main feature is a directed acyclic graph structure; it aims to achieve a faster machine-to-machine micro-payment system that requires no transaction fees [68]. Tangle is designed with IoT devices in mind, and the most obvious difference, in brief, is that it uses a web-like tangle instead of a chain structure for consensus (a collection of consensus protocols in practice can be found in [69]). In a blockchain, transactions are verified by the miners in return for a growing transaction fee, whereas in Tangle the issuer of a transaction performs the Proof of Work itself (i.e., with Tangle a transaction can be made by X if two random transactions are validated by X). In theory, this will substantially improve efficiency and reduce energy intensity. Moreover, Tangle allows offline operation when nodes are not connected to the main tangle, whereas a blockchain functions only when connected to the network, both for updating the chain and for preventing double spending. However, since there are no transaction fees as in blockchain, the missing rewards will result in less participation and certainly fewer developers and investors.
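The rule that a transaction can be made by X once X has validated two other transactions can be sketched as a small directed acyclic graph. This is a toy model under loud assumptions: the class name Tangle and its methods are hypothetical, the two approved transactions are drawn uniformly at random from all earlier ones, and Proof of Work, signatures and the actual IOTA selection strategy are omitted.

```python
import random


class Tangle:
    """Toy directed acyclic graph in which each transaction approves two others."""

    def __init__(self, seed=0):
        # Maps a transaction id to the ids of the transactions it approves.
        self.approves = {"genesis": []}
        self.rng = random.Random(seed)

    def add_transaction(self, tx_id):
        """Attach tx_id by approving up to two randomly chosen earlier transactions."""
        if tx_id in self.approves:
            raise ValueError(f"duplicate transaction id: {tx_id}")
        existing = list(self.approves)
        chosen = self.rng.sample(existing, min(2, len(existing)))
        self.approves[tx_id] = chosen
        return chosen

    def tips(self):
        """Transactions that no later transaction has approved yet."""
        approved = {t for targets in self.approves.values() for t in targets}
        return [tx for tx in self.approves if tx not in approved]


tangle = Tangle(seed=42)
for i in range(10):
    tangle.add_transaction(f"tx{i}")
```

Since every new transaction only points to transactions that already exist, the structure cannot contain cycles, and the validation work is spread across the issuers rather than concentrated in a separate group of miners.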
Another technology that applies a directed acyclic graph is Hashgraph, which uses gossip to achieve consensus [67]: a transaction is made when a participant shares all its information with a few random nodes in the network, and every node combines all the information it has received with new transaction information and passes it on to multiple random nodes in the network, and so on. Hashgraph was initially introduced in [70] and is a patented technology held by Swirlds. Although its efficiency is claimed to be exceptional, its private ownership has prevented its validation in a public setting. Therefore, its performance relative to blockchain and Tangle remains unknown.
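The gossip mechanism described above can be sketched as a round-based simulation in which every node that has heard of a transaction forwards it to a few random peers. This is only an illustration of how quickly information spreads under gossip; it does not reproduce the consensus procedure of Hashgraph. The function name gossip_rounds, the fanout parameter and the synchronous rounds are assumptions made for the sketch.

```python
import random


def gossip_rounds(num_nodes, fanout=2, seed=0, max_rounds=1000):
    """Simulate push gossip and return (rounds used, number of informed nodes)."""
    rng = random.Random(seed)
    informed = {0}  # node 0 creates the transaction
    rounds = 0
    while len(informed) < num_nodes and rounds < max_rounds:
        newly_informed = set()
        for node in sorted(informed):
            peers = [p for p in range(num_nodes) if p != node]
            # Each informed node passes the information to a few random peers.
            for peer in rng.sample(peers, min(fanout, len(peers))):
                newly_informed.add(peer)
        informed |= newly_informed
        rounds += 1
    return rounds, len(informed)


rounds_used, reached = gossip_rounds(num_nodes=64, fanout=2, seed=1)
```

With a fanout of two, the informed set can at most triple in each round, so the number of rounds needed grows roughly with the logarithm of the network size, which is the intuition behind the efficiency claims made for gossip-based designs.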

Blockchain and Artificial Intelligence (AI)
Cryptocurrency is only a small part of the rapid technological progress and revolutions of the past decade. Another trending topic that holds an important place nowadays is Artificial Intelligence (AI), which has had an overwhelmingly positive impact on a variety of subjects. The term AI was coined by Prof. John McCarthy in 1956 and defined as 'the science and engineering of making intelligent machines, especially intelligent computer programs' [71]. Since then, AI has evolved rapidly, and Figure 2 summarises its various branches (more details can be found in [72]). This figure alone can provide the reader with several ideas of how AI can be beneficial for the banking system. There is evidence of substantial application of AI in banking, and those interested are referred to [73], where the authors present a review of 196 studies which employ operational research and AI in the assessment of banking performance up until 2010. More recently, the authors of [74] considered the opportunities, challenges, implications and potential of AI and machine learning in consumer banking. Machine learning, too, has been well exploited in banking. Those interested in the history of machine learning are referred to [75]. In [76], the authors provide a comprehensive review of Big Data mining in banking, which also covers various applications of machine learning algorithms. However, our interest is in the interactions between AI and blockchain technologies, which can result in considerable gains and immeasurable value for the banking sector.
AI uses advanced computer science to analyze and make sense of complex data, so that machines are trained to react intelligently in terms of reasoning and problem solving [77]. In fact, both AI and cryptocurrency were the buzz at one of Japan's largest technology conferences this year [78]. Cryptocurrency platforms and investors are implementing AI technologies for profit optimization and decision making [79,80]. The limitations of blockchain may find solutions with the assistance of AI technology, for instance, by making blockchain more energy efficient, customizing the blockchain adoption process and improving security. Moreover, the publicly accessible data that blockchain offers are a good resource for AI processing, which may also help to improve artificial trust [81].

Trends
Google Trends has made it possible to know the keywords that Internet users are searching for worldwide. These real-time indices of interest are significantly beneficial for researchers in keeping track of the things that people value the most. For 2017, Google Trends announced the top five searches by category, under the name "Year in Search 2017". It is of note that "Bitcoin" was the second trending search in global news worldwide, while "how to buy Bitcoin" was the third trending "How to" search globally. These records reflect the growing importance of cryptocurrency and its popularity on a global scale. As can be seen in Figure 3, in 2017 the worldwide Google Trends indices for "Bitcoin" and "How to buy Bitcoin" rise after April and shoot up to near the maximum value of 100 after November. The Google Trends indices for "Digital Currency" and "Cryptocurrency" are also shown along with "Bitcoin" in Figure 4, which provides their corresponding interest indices since 2016. These related terms all indicate rising interest after April 2017 and a huge boost at the end of 2017. Referring to Figure 1, this boost coincides with the peak market price of Bitcoin near 20,000 US dollars. Along with the rising interest in cryptocurrency in general, certain types of cryptocurrency, trading platforms and techniques have also received overwhelming attention. Litecoin, Tron, Lumens, Neo, Iota, Monero, Zcash, NEM, Ethereum, EOS, Stellar, Ripple, Binance and Tangle are a few examples (these are collected from the top/rising related terms by Google Trends). The wide variety of cryptocurrencies and their rapid development due to technological progress have further stimulated the market and attracted both investors and developers all over the world. According to the most up-to-date cryptocurrency list by Investing.com, there are currently nearly 2000 cryptocurrencies in the market, with new ones joining regularly. This tremendous market, valued at hundreds of billions, is changing the way that people invest and make payments, while its decentralization feature is significantly influencing the allocation of financial power.

When Cryptocurrency Meets Big Data
The era of Big Data has brought overwhelming challenges along with immeasurable opportunities across the globe. Innovation and progress in a broad range of subjects have been prompted by Big Data analytics, for instance, crime [82], causality analysis [83], energy [84], forecasting [85] and banking [76]. Exhaustive evidence in [76] indicates that Big Data analytics are assisting the banking sector with regard to security enhancement, risk management, customer relationship management and marketing, which has significantly improved its operational efficiency and profits. As a rapidly growing industry in recent years, cryptocurrencies are inextricably linked to Big Data in myriad ways [86]. In this section, we investigate and summarize the interactions between cryptocurrency and Big Data, two big concepts in the modern digital world. Please note that a recent review of blockchain applications in Big Data can be found in [87], which selectively reviewed a few applications up until 2016. This paper presents only the very recent research progress post 2016, and provides the most up-to-date review that systematically summarizes the interactions between Big Data and cryptocurrency.
The convergence between cryptocurrency and Big Data is mutual. As introduced in the previous section, the nature of the cryptocurrency network makes it a valuable resource for Big Data analytics. Taking a fundamental blockchain architecture as an example, the decentralized system contains all the transaction records of every participant, and the data are well structured and accurate; this makes it a data-intensive environment and an ideal resource for applying Big Data analytics. Given the immeasurable value of cryptocurrency as an alternative currency and the wide applicability of the blockchain technology behind it, interest from developers and investors has popularized cryptocurrency-related technology, and the growing number of cryptocurrencies and of participants who choose to embrace the era of technology and digitalization further boosts the data available for building up Big Data. Cryptocurrency can supply well-structured, high-quality data to Big Data analytics, and it is also crucial to investigate the value that Big Data can bring to the cryptocurrency industry. Here, we review the interactions between Big Data and cryptocurrency and summarize two main focuses: security and privacy enhancement, and analyses and prediction. Briefly, for the first aspect, cryptocurrency-related technology serves as a secured network for storing and sharing large volumes of data among a huge network of participants; Big Data analytic techniques can, in turn, further enhance the security of the already stable architecture by analysing the Big Data provided by this network, for instance, by identifying cybercriminal entities and detecting majority attacks. The second perspective looks into Big Data analytic techniques that help to gain a better knowledge of the cryptocurrency industry (e.g., price, users and adoption) using either the transaction data of the network or other sources of Big Data, i.e., social media. For each perspective, a collection of the corresponding recent research is presented in the following subsections, and Table 1 gives a detailed directory of the reviewed references by summarizing the key techniques and areas of implementation.
It is of note that the research scope of this paper is, in brief, Big Data and cryptocurrency related applications since 2016. A keyword search was conducted to systematically select the literature reviewed, using two search engines, i.e., Google Scholar and ScienceDirect, while ensuring that significant terms for both Big Data and cryptocurrency and/or Bitcoin were directly relevant. Further requirements restricted the academic sources to reliable and established ones, for instance, established journals and conference proceedings. A manual filtering process was also conducted at the end to ensure that the applications identified are both directly relevant and academically reliable.

Security and Privacy Enhancement
The cryptocurrency market now contains approximately 2000 different cryptocurrencies, and the market itself is valued in the trillions. Its digital and decentralized features make it a vulnerable and remunerative market that attracts cybercriminals. According to [55], the majority of cryptocurrency research has focused on privacy and security. This trend remained unaltered when researchers encountered Big Data and its related technologies.
Abundant research has investigated the application of blockchain technology (the key to cryptocurrency) to managing Big Data and controlling access to it [88]. Researchers have extended blockchain technology to achieve decentralized data management while maintaining privacy. For instance, in order to provide patients with an immutable log of, and comprehensive access to, their medical records across providers and treatment sites, data management systems for this kind of sensitive and private information are investigated in [47,89,90]. Specifically, the authors of [89] proposed the MedRec system, which brings patients, medical information providers and other medical stakeholders into a blockchain-based system while supplying Big Data to empower researchers. Yue et al. [47] presented secure multi-party computing to enable untrusted third parties to process patient data without violating privacy. A user-centric health data sharing solution was proposed in [90], where healthcare data from personal wearable devices are collected by a mobile application using a decentralized and permissioned blockchain. It is also of note that a tree-based data processing method is adopted for personal healthcare Big Data processing. The interoperability challenges of healthcare applications are discussed in [91]. Griggs et al. [92] introduced a smart-contract-based system for achieving secured real-time patient monitoring and medical interventions. A recent systematic review and discussion of blockchain applications in the biomedical and healthcare domain can be found in [93].
The Internet of Things is now widely embedded in everyday technological life, and its intersections with cryptocurrency technologies are inevitable [43]. Recent surveys of its security and privacy vulnerabilities and solutions can be found in [94,95]. A special focus on intelligent transportation systems can be found in [96], where the blockchain structure was applied to capturing departure information, encapsulating blocks to transport keys and executing rekeying to vehicles. A recent paper by Singh et al. [97] investigated the feasibility and performance of an Internet of Things based information and communication system for the safety and productivity of underground mines, in which the blockchain-based system successfully curbs penetration and disrupts cyber-attacks arising from heterogeneous devices and the distributed network. A blockchain-based security framework was proposed in [98], serving smart devices as the secured communication platform in a smart city. Dorri et al. [99] conducted a smart home case study on the interaction of blockchain technology with Internet of Things security and privacy. Recent research by Hammi et al. [100] proposed the blockchain-based, decentralized system Bubbles of Trust, which enables robust identification and authentication of devices for efficient Internet of Things security. Qu et al. [101] introduced self-organized blockchain structures that verify the credibility of Internet of Things entities and achieved promising performance, with merits in response efficiency and storage requirements.
Cloud computing has been widely implemented on account of its computational and storage capability, especially nowadays for coordinating the operation and Big Data processing of the Internet of Things. A special focus on blockchain security solutions in cloud computing can be found in [102], in which the authors broke down the security challenges of settlement, transaction, wallet and software, and presented a secure use and removal protocol for the cloud computing environment to protect the system to the maximum extent. Moreover, the authors of [103] illustrated the key mechanisms of the blockchain-ed Internet of Things security nexus by adopting centralized cloud servers, and validated its applicability in a wide range of subjects.
Besides the blockchain technology of cryptocurrency, which provides a decentralized framework for secured peer-to-peer interactions, researchers have applied advanced Big Data related technologies on top of this layer to further enhance the security of the already stable architecture. Yin and Vatrapu [104] applied supervised machine learning classification to identify the proportion of cybercriminal entities in the Bitcoin ecosystem. Dey [105] proposed using machine learning and algorithmic game theory in intelligent software agents for the detection of majority attack activity in a blockchain network. With regard to reducing anonymity for the prevention of illicit activities in the cryptocurrency market, with potential applications to forensics and financial compliance, the authors of [106] incorporated the gradient boosting algorithm for supervised machine learning and successfully revealed the type of yet-unidentified entities.
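To make the notion of majority attack detection concrete, the following is a minimal rule-based sketch that flags any miner whose share of recently produced blocks exceeds one half. It is a simple baseline for illustration and is not the machine learning and game theory approach proposed in [105]. The function name majority_share_alerts, the window length and the assumption that each block can be attributed to a known miner label are all hypothetical.

```python
from collections import Counter


def majority_share_alerts(block_miners, window=100, threshold=0.5):
    """Return (block_index, miner, share) for every trailing window in which
    a single miner produced more than `threshold` of the blocks."""
    alerts = []
    for end in range(window - 1, len(block_miners)):
        recent = block_miners[end - window + 1:end + 1]
        miner, count = Counter(recent).most_common(1)[0]
        share = count / window
        if share > threshold:
            alerts.append((end, miner, share))
    return alerts


# Toy sequence of miner labels, one per block (invented for illustration).
observed = ["a", "b", "c", "d", "e", "e", "e", "a"]
alerts = majority_share_alerts(observed, window=4, threshold=0.5)
```

In this toy sequence the monitor stays silent while the blocks are evenly spread and raises an alert once miner "e" has produced three of the last four blocks. A learned detector would replace the fixed threshold with features and a trained model.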

Analyses and Prediction
Big Data analytics help to gain better knowledge of complex, large-volume data. As a digital, data-intensive market, cryptocurrency technology is in a way partly formed by Big Data. Its well-structured and high-quality data are ideal resources for Big Data analytics. Many studies have focused on the volatility of the cryptocurrency market and applied a variety of Big Data analytical techniques for better prediction and analysis. These studies mainly aim to assist with profit maximization and risk reduction in investment.
The authors of [107] applied text classification techniques (including supervised machine learning algorithms such as support vector machines, logistic regression and naive Bayes) to real-time Twitter data about cryptocurrency, so as to develop advantageous algorithmic trading strategies. Their approach achieved a prediction accuracy of over 90% on cryptocurrency market movement. Another study, by Kim et al. [108], also works with big text data: the authors analyzed user comments in online cryptocurrency communities with sentiment classification techniques to achieve better forecasts of the price and the number of transactions. Lu et al. [109] worked with big social media data to investigate the determinants of Bitcoin adoption in Taiwan: key phrases were identified for social media content mining, and sentiment analyses were conducted to find the significant factors for Bitcoin adoption.
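As an illustration of the kind of text classification mentioned above, the following is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing, written with the Python standard library only. It is not the pipeline of [107]: the class name NaiveBayesText, the whitespace tokenizer, the "up"/"down" labels and the four training sentences are invented for the example, and no claim about accuracy is made.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberate simplification)."""
    return text.lower().split()


class NaiveBayesText:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior of each class for the text."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for label, doc_count in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(doc_count / total_docs)  # class prior
            for word in tokenize(text):
                if word not in self.vocabulary:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, invented for illustration only.
texts = [
    "bitcoin price surging to new high",
    "great rally buy more bitcoin",
    "market crash panic selling bitcoin",
    "exchange hacked price falling fast",
]
labels = ["up", "up", "down", "down"]
model = NaiveBayesText().fit(texts, labels)
prediction_up = model.predict("bitcoin rally new high")
prediction_down = model.predict("panic selling after crash")
```

A realistic study would use far larger labelled corpora, proper tokenization and held-out evaluation; the sketch only shows how word frequencies per class are turned into a prediction.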
Apart from big text data, Maesa et al. [110] focused on cryptocurrency transaction graph records and aimed to identify artificial user behaviours from the topological properties of the users graph. Clustering heuristics were applied and peculiar chains of transactions were detected. With regard to cryptocurrency market volatility, Jang and Lee [111] applied Bayesian neural networks to Bitcoin price prediction and reached satisfactory performance with low error rates. Recent research in [24] investigated the high-frequency volatility of cryptocurrencies with a Big Data analytics technique that combines the traditional generalized autoregressive conditional heteroskedasticity (GARCH) model with machine learning support vector regression; the newly proposed model outperformed the existing approaches and achieved better predictions for both low and high frequencies. Another Big Data analytics technique, artificial neural networks, was employed in [112] to give better return predictions for Bitcoin intraday technical trading. McNally et al. [113] aimed to better predict the direction of Bitcoin price movements by applying a Bayesian optimised recurrent neural network and a long short-term memory network. Moreover, the authors of [114] recently proposed a hybrid volatility forecasting framework for Bitcoin price prediction; the framework incorporates GARCH and an artificial neural network whose input data come from technical analysis, with principal components analysis for data preprocessing. Similarly, a few researchers have promoted machine learning applications in cryptocurrency price prediction with a variety of techniques or combinations of techniques [115,116,117].
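Several of the studies above build on the GARCH model. As a point of reference, the standard GARCH(1,1) recursion updates the conditional variance as sigma2[t] = omega + alpha * r[t-1]^2 + beta * sigma2[t-1], where r is the (zero-mean) return. The sketch below computes this variance path for given parameters. It covers only the plain GARCH(1,1) component and not the hybrid models of [24] or [114]; the function name garch11_variance is hypothetical, and the parameter values are assumed rather than estimated from data.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance path of a zero-mean GARCH(1,1) model.

    path[0] is the starting variance, and path[t + 1] is the variance
    implied after observing returns[t] (a one-step-ahead forecast).
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0 and alpha + beta < 1")
    variance = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    path = [variance]
    for r in returns:
        variance = omega + alpha * r * r + beta * variance
        path.append(variance)
    return path


# Assumed parameters and toy returns, for illustration only.
path = garch11_variance([0.0, 2.0], omega=0.1, alpha=0.1, beta=0.8)
```

A quiet period lets the variance decay towards its long-run level, while a large return of either sign pushes the next forecast up, which is the volatility clustering that makes GARCH a natural baseline for cryptocurrency markets.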

Conclusions and Future Research
This paper has comprehensively introduced cryptocurrency and the key blockchain technology behind it, and has provided a systematic review of the research indicating the close interactions between Big Data and cryptocurrency. It is of note that this paper focuses directly on the convergence of these two concepts in an academic-friendly format and also presents the most up-to-date review post 2016 of this interactive subject.
We have found that the rapidly growing interest in and attention to an emerging, tremendous and valuable market like cryptocurrency come along with criticisms and misgivings. Cryptocurrency provides a worldwide, efficient, decentralized, peer-to-peer transaction system while retaining anonymity and privacy. However, its digitalized and anonymous features also make it a vulnerable target for cybercriminals, and less easily adoptable for the majority of people who lack technological skills. The blockchain technology behind cryptocurrency has widely extended its capacities and has been implemented in a variety of areas, such as smart contracts, data trading and management, governance and digital ownership. Researchers have been investigating solutions for overcoming its limitations and further improving this technology, and we have found relatively new developments such as the Tangle and Hashgraph technologies that can substantially improve efficiency and reduce costs. Also, the convergence with artificial intelligence has attracted considerable attention. With regard to the interactions between Big Data and cryptocurrency, we have reviewed the relevant developments and summarized them into two main aspects: security and privacy enhancement, and analyses and prediction.
It is noticeable that the interactions of Big Data and cryptocurrency are under-researched in general. There is currently no clear information on full adaptation and the relevant processes for the adoption of cryptocurrency as a mainstream currency or the adoption of the blockchain technology behind it [63]. The majority of the research targets a limited selection of topics: using Big Data analytics for knowledge extraction and for better understanding cryptocurrency market volatility; applying cryptocurrency technology to enhance Big Data management and access control; and adopting Big Data analytics for an extra layer of security enhancement. Many potential directions remain untackled and are certainly worth exploring as future research. For instance, the transaction records are not fully exploited, mainly due to the lack of usability of the application programming interface. The participants who have access to the transaction Big Data may have less academic interest than profit-oriented goals. A few disadvantages of cryptocurrency, such as energy inefficiency, computational scalability, market entry barriers and regulatory challenges, can possibly find solutions through Big Data analytics. It is also quite unexpected to see the

Figure 1. Total Bitcoins in circulation and its market price since 2009.

Figure 2. The branches of AI.

Table 1. Summary table of Big Data and cryptocurrency related applications since 2016.