Technical Sustainability of Cloud-Based Blockchain Integrated with Machine Learning for Supply Chain Management

Abstract: Supply chain parties in a blockchain network face the challenge of keeping and manipulating an ever-growing volume of immutable transaction records, together with the opportunity of running sophisticated analyses on the big data generated from these records. In response, this paper proposes the design of a robust blockchain architecture based on a cloud infrastructure. The design is presented with consideration of technical sustainability in terms of scalability and big data processing and analytics. A case study illustrates how technical sustainability is achieved by applying the proposed design to real-time maritime risk detection. The case also illustrates how a machine learning mechanism helps to reduce maritime risk by guiding a cargo ship that has detoured toward a danger zone back to the planned or safe route. The paper also discusses the implications for further research.


Introduction
A blockchain is "a distributed database, which is shared among and agreed upon a peer-to-peer network. It consists of a linked sequence of blocks (a storage unit of transaction), holding timestamped transactions that are secured by public-key cryptography (i.e., "hash") and verified by the network community. Once an element is appended to the blockchain, it cannot be altered, turning a blockchain into an immutable record of past activity" [1]. As such, blockchain technology improves trust and security. With the application of blockchain technology, the parties involved in transactions can create, manage, and sustain transaction records which are distributed and shared among the parties in the blockchain peer-to-peer (P2P) network. In a P2P network, all the peers (or parties) have the same privileges. There is no centralized control of the distributing and sharing activities in that network. Instead, all the peers distribute and share their transaction records with each other. In this way, the completeness and accuracy of transactions can be confirmed through the consensus of independent verifiers in the blockchain P2P network [2].
However, the sustainability of supply chain management that adopts blockchain technology is questionable because it involves the challenges of dealing with a tremendous number of processes and supply chain parties, and of handling the big data generated from them. At the same time, this big data opens new opportunities for big data analysis in the supply chain field.
With these challenges and opportunities in mind, the design of a blockchain architecture using a cloud infrastructure is proposed to ensure the sustainability of supply chain management using blockchain technology. Because the historical transaction data stored on cloud-based blockchains are immutable and grow rapidly in volume, this paper also discusses how machine learning can be integrated into data analytics to efficiently process, analyze, and obtain insights from the supply chain big data. A case study on a real business use case of this blockchain architecture implemented with machine learning is presented to illustrate the proposed ideas.
The significance of the research presented in this paper is twofold. First, as shown in the reported studies about the sustainability issues of adopting blockchains for supply chains reviewed in this paper and by other researchers (e.g., [16,20]), many previous studies (e.g., [21][22][23][24][25][26][27][28][29][30]) focused on how blockchain technology can be applied to improve the environmental, economic, and social sustainability of supply chains, but little research explored technical sustainability. Technical sustainability refers to the long-term technical administration, maintenance, development, and support of an innovative technology to ensure its continuation, advancement (e.g., increased efficiency [31]) and resilience (e.g., resilient supply chains [32]). The technical sustainability of a blockchain itself is also a challenge that should be addressed. Although the applications of blockchain technology to the supply chain field are still in their infancy, as blockchains grow and more rapidly expanding blockchains emerge in the blockchain P2P network, the supply chain parties have to face the problem of blockchain sustainability in terms of key performance issues such as availability, scalability, latency, throughput, and big data storage capacity [33,34]. This technical problem also hinders the supply chain users' adoption of blockchain technology [20]. The study in this paper attempted to design a blockchain architecture based on cloud infrastructure (or simply, blockchain on cloud) in order to ensure technical sustainability for supply chain management. Second, the tremendous number of processes and the data derived from a variety of supply chain parties are sources for big data analytics, which can be enhanced by machine learning. To the best of the authors' knowledge, such a scenario was not investigated in previous studies. This study highlighted the important issue of integrating machine learning into data analytics on cloud infrastructure for supply chain management insights and practices.
The rest of the paper is divided into five sections. Section 2 presents a blockchain structure and discusses the challenges and opportunities for developing sustainable blockchains in supply chain. Section 3 presents the literature review on blockchain sustainability and the machine learning mechanism in supply chains. Section 4 provides the research methodology, which consists of two processes. Section 4.1 shows the first process, a technical review of a cloud infrastructure and other supporting technologies for designing a blockchain architecture and integrating machine learning into that architecture. Section 4.2 shows the other process, a case study that illustrates how the proposed blockchain architecture in supply chain, with machine learning integrated into data analytics on cloud, can be applied to maritime risk detection. The document analysis of the case is also described in this section. Section 5 discusses the findings. Conclusions with implications for further investigations are given in Section 6.

Sustainability of Blockchains in Supply Chain
From a perspective other than the study by Park and Li [11] which addressed the effect of blockchain technology on supply chain sustainability, the sustainability of blockchains in supply chain has to be examined. Blockchain technology applicable to supply chain context provides a timestamped series of supply chain transaction records distributed and shared within the network of the supply chain parties.
When a supply chain party creates a transaction record such as a delivery schedule, that transaction record is stored in a block. As a block has limited storage capacity, a sequence of blocks is created to hold many transaction records. Blockchain technology uses a cryptographic method to secure each block. Each block in a blockchain consists of two major parts, the block header and the block body [35], as shown in Figure 1. The block header can be regarded as the identity of that block in a blockchain. The block identity is determined by a hash of its previous block, a timestamp and a Merkle root. A hash is a numerical value returned from a function that converts data of any format (such as a string) and any size to a fixed-size numerical value. That hash represents the cryptographic signature which helps to determine valid transactions and is essential in blockchain management [36]. The header of the first block in a blockchain, which is termed the genesis block, has no hash of a previous block. The second block header in the blockchain is determined by the hash of its previous block, the timestamp and its Merkle root. In this sense, the hash of a previous block is a pointer that connects to the previous block. The sequence of these linked blocks becomes a blockchain, as shown in Figure 1. When a block is produced or published (or mined, in the Bitcoin case), a timestamp is created to record the time when this block is generated. A Merkle root is built by a cryptographic method that uses the concept of a Merkle tree [37], as indicated in Figure 2. The Merkle root can be used to verify the block body transactions [38]. The Merkle tree, which is stored in a block body [39], is constructed by repeatedly applying a hash function to pairs of hashes from the body transactions, working from the bottom up to the top. The top hash is the Merkle root, which can be regarded as the hash value of the hashes of all the transactions stored in the block body. That Merkle tree structure enables blockchain data to be verified efficiently. A blockchain can be regarded as a digital ledger of transactions which is duplicated and distributed across the network of various supply chain parties [40,41]. With the blockchain structure of the hash of a previous block, the timestamp and the Merkle root as a proof of work, the supply chain parties involved in the transactions can record and publish their transaction records in a distributed and shared ledger in a secure and immutable way because their transaction records are hashed, timestamped and stored in a blockchain. This structure confirms the transactions through the consensus mechanism in the blockchain network [42]; makes it impossible for hackers to reverse the blockchain's entire historical transaction records under the scrutiny of all the users (i.e., supply chain parties) who can access the distributed and shared ledger [43]; supports decentralization without the presence of a third party (e.g., government authority, clearing house and central bank) [44,45]; and provides auditability and transparency in keeping transaction records in the blockchain network [10].
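The block structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the implementation of any particular blockchain platform: SHA-256 is used as the hash function, an odd hash at any tree level is paired with a copy of itself, and the names `merkle_root`, `Block`, `build_chain` and `chain_is_valid` are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def merkle_root(transactions: Sequence[str]) -> str:
    """Hash pairs of hashes bottom-up until one top hash remains."""
    if not transactions:
        return sha256_hex(b"")
    level = [sha256_hex(tx.encode("utf-8")) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: an unpaired hash is paired with a copy of itself.
            level.append(level[-1])
        level = [
            sha256_hex((level[i] + level[i + 1]).encode("utf-8"))
            for i in range(0, len(level), 2)
        ]
    return level[0]


@dataclass(frozen=True)
class Block:
    prev_hash: Optional[str]       # None for the genesis block
    timestamp: float
    transactions: Tuple[str, ...]  # the block body

    @property
    def header_hash(self) -> str:
        """Block identity: previous hash, timestamp and Merkle root."""
        header = json.dumps(
            {
                "prev_hash": self.prev_hash,
                "timestamp": self.timestamp,
                "merkle_root": merkle_root(self.transactions),
            },
            sort_keys=True,
        )
        return sha256_hex(header.encode("utf-8"))


def build_chain(batches: Sequence[Sequence[str]], start_time: float = 0.0) -> List[Block]:
    """Link one block per batch; each block points to its predecessor's hash."""
    chain: List[Block] = []
    prev: Optional[str] = None
    for i, batch in enumerate(batches):
        block = Block(prev, start_time + i, tuple(batch))
        chain.append(block)
        prev = block.header_hash
    return chain


def chain_is_valid(chain: Sequence[Block]) -> bool:
    """Check that every stored previous hash matches the recomputed one."""
    if not chain:
        return True
    if chain[0].prev_hash is not None:
        return False
    return all(
        chain[i].prev_hash == chain[i - 1].header_hash
        for i in range(1, len(chain))
    )
```

Because each header hash depends on the Merkle root of its body, changing any transaction in an earlier block changes that block's identity and breaks the pointer held by the next block, which is what `chain_is_valid` detects.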
There are challenges in maintaining the sustainability of blockchains in supply chain and opportunities to explore once blockchains in supply chain are sustainable. With blockchain applications in supply chain, the supply chain stakeholders encounter the challenges of recording and manipulating more and more immutable transaction records derived from a tremendous number of processes such as transactions, sourcing, procurement, delivery scheduling, tracking and tracing of goods, and coordination and collaboration among a large number of suppliers, intermediaries (e.g., customs brokers, freight forwarders, third-party logistics like escrow service providers), and customers. Moreover, there are challenges of keeping secure and permanent longitudinal supply chain data in an immutable and rapidly growing blockchain in a distributed and shared blockchain P2P network. The supply chain stakeholders need further investment to upgrade and expand their on-premises computing resources and infrastructure in order to scale the services for these growing transaction data. The supply chain stakeholders also have the opportunity to leverage sophisticated analytics on the big data generated from a large variety of immutable historical transaction records. They expect advanced technologies to obtain insights from the analysis of the big data in order to enhance supply chain services, including efficient issue identification, further transport cost reduction and optimization of shipment scheduling (e.g., delay avoidance and error prevention) [2].
Given these challenges, cloud computing technology can be adopted to save investment costs and enhance the scalability of services [46,47] because it can adjust services by scaling computing resources up or down to meet actual business demands. Investment costs can therefore be reduced by shifting from capital expenditure to operational expenditure [48]. This enables more enterprises to establish their blockchain use cases, especially in the initial stage.
With the growing supply chain transaction data volume and variety, the supply chain stakeholders need sophisticated tools to analyze such big data to obtain insights and make decisions. Cloud computing provides analytics tools for analyzing the big data [49]. These tools can perform advanced analysis including the artificial intelligence-based machine learning for the data analytic needs on blockchain records.

Literature Review
To examine how the sustainability of blockchains in supply chains and the application of machine learning to supply chains were addressed in the literature, a literature review was conducted, as illustrated in Figure 3. This review consists of the processes of setting the inclusion criteria, deriving the search terms and conditions, and reviewing the found articles.

Literature Search
Initially, articles to be reviewed were determined by setting the inclusion criteria as well as the search terms and conditions. The inclusion criteria were as follows:
• Studies related to the sustainability of blockchains in supply chains.
• Studies about machine learning or data analysis on blockchain records in supply chains.
The search terms and conditions derived from the inclusion criteria were categorized into the 5W + 1H pattern [50]. 5W stands for 'Who' referring to who the researcher was, 'Which' referring to which area the study focused on, 'What' referring to what was investigated, 'Where' referring to where the study was reported, and 'When' referring to when the study was reported. 1H means 'How', which refers to how the study was conducted (that is, what research methods were used to conduct the study). Table 1 shows the search terms and conditions used initially for the literature search. A combination of the search terms and conditions using logical operators and key words such as 'blockchain sustainability' and/or 'machine learning' or 'data analytics' or 'data analysis' and 'supply chain' or 'logistics' and 'from 2016 to 2021' was used to search through the Internet search engines (e.g., ERIC, Google, ProQuest, ScienceDirect, Scopus and Web of Science) and the web databases of the journal publishers with high impact factors (e.g., MDPI, Elsevier, Emerald, IEEE, Routledge, Springer, Taylor and Francis) and university libraries.

Search Results
The search results showed no previous studies covering blockchain sustainability together with machine learning and data analytics/analysis in the supply chain field. As mentioned in [16], only a little research on the application of blockchain technology with emerging technologies such as big data analytics and artificial intelligence in supply chains was noted. For example, the studies [54][55][56] explored the integration of technologies such as IoT sensors and data analytics into blockchain technology for tracking and tracing in supply chains. The study presented in this paper integrates machine learning, which is a subset of artificial intelligence, into data analytics in a cloud infrastructure for blockchain application in supply chains.
Then, less stringent conditions were used in the literature search by taking out the specific search terms 'machine learning', 'data analytics' and 'data analysis' from the search. After that, some previous studies and literature review papers (e.g., [16,20,57,58]) on adoption of blockchain for sustainable supply chain (instead of sustainable blockchain) were found. The reference lists from these found papers provided sources for recursive literature search that could identify more previous studies missed by the previous internet search.

Literature about Impact of Blockchain in Supply Chains
As highlighted by Varriale, Cammarano, Michelino and Caputo [20], the previous studies about the sustainability aspects of the impact of blockchain in supply chains explored three forms of sustainability: environmental, economic, and social. Environmental sustainability is concerned with protecting and maintaining environmental resources for the betterment of the living environment. Examples of the previous studies on the environmental sustainability generated by using blockchain technology in supply chains include the study on how blockchain improves the environment by monitoring carbon emissions and transmitting the monitoring messages in the blockchain P2P network [21] and the study on how blockchain facilitates financial transactions and rewards for motivating people to protect the environment by storing recyclable materials [22]. Economic sustainability is the ability of an economy to continue growing by utilizing resources efficiently. Such efficiency, for example in supply chain processing, transaction handling and delivery, can be achieved in supply chain management with the help of blockchain [23,24]. The study by Mercuri, della Corte and Ricci [25] found that smart contracts in blockchains for supply chains increase trust among the subjects, improve traceability and promote transparency of transactions, leading to the elimination of intermediaries and therefore achieving economic sustainability by reducing transaction costs. Social sustainability is about the ability to maintain the well-being of a society. The well-being of a society is predicated on elements such as fairness, ethics, human rights, and acceptance of diversity. Some previous studies (e.g., [26][27][28][29][30]) examined how blockchain helps to maintain social sustainability. For example, the recent case studies by Tseng and Shang [55] indicated that the blockchain's features of security, immutability, resiliency, auditability, and permissibility provide trust and anti-tampering among the blockchain users. As another example, the study by Choi [56] showed that the immutability and transparency of blockchain historical transaction records can ensure authentication of a diamond and prevent illegal diamond trades. Some other studies found a trade-off between some forms of sustainability. For example, Christidis and Devetsikiotis [41] found that smart-contract adoption in blockchains leads to transaction cost reduction, thus achieving economic sustainability. However, disintermediation and unemployment result from the smart-contract adoption, causing a problem for maintaining social sustainability.
Then, after the search term 'sustainability' was deleted from the search, the search results showed that many articles were published on the impact of blockchain in supply chains. For example, the study by Kshetri [59] found that blockchain affects some key supply chain performance dimensions such as cost, speed, dependability, and risk reduction. As blockchain was still a new technology in the supply chain field, some researchers focused on whether the supply chain users accept using blockchain technology. For example, based on the Technology Acceptance Model (TAM) [60], the study by Yang [61] investigated the impact of blockchain in the maritime shipping supply chain and found that customs clearance and management, digitalizing and easing paperwork, standardization and platform development influence the supply chain stakeholders' intention to use blockchain technology. Another example is the study by Queiroz and Wamba [62], which explored the supply chain stakeholders' intention to use blockchain technology by applying the extended TAM called the Unified Theory of Acceptance and Use of Technology (UTAUT) [63]. This study tested the hypothesis that "social influence affects the supply chain user's intention to use blockchain" and found that this hypothesis was accepted for the supply chain users in India but not for the supply chain users in the USA. Wong, Tan, Lee, Ooi, and Sohal [64] also used UTAUT in their study and found that the facilitating condition, technology readiness and technology affinity have a positive influence on the users' intention.

Literature about Applicability of Blockchain in Supply Chains
The applicability of blockchain technology in supply chains is also a major research direction in the literature. Two main areas were explored. One area is the investigation of applications to a particular supply chain operation. For example, the previous studies investigated whether blockchain technology can be applied to order management [65,66], co-ordination [67,68], logistics [2,69,70], risk management [30,71,72,73] and smart contracts [74][75][76].
The other area is the exploration of how the blockchain features improve supply chain operations. These features are security, shareability (or transparency), immutability, auditability, cost efficiency, reliability (or resiliency), traceability and trust. The security of blockchain is guaranteed by the elimination of data tampering because the data are decentralized and distributed among the blockchain users [77], the adoption of encryption technology to prevent information theft and modification [78] and the consensus mechanism [79]. The consensus mechanism establishes trust among the blockchain users [42]. The shareability of blockchain is featured by distributing the transaction records (e.g., supply chain legal and contracting records) to all supply chain parties involved in the transactions. Park and Li [11] and Wang, Wu, Chen, and Evans [68] illustrated how the blockchain system developed by IBM and Maersk allows information sharing. Wang, Jie and Abareshi [80] illustrated how blockchain enables transparency, as any transaction updates will be propagated to the parties (e.g., delivery company, manufacturer, and stock management department) in the blockchain network. Once the transactions are approved by all the involved parties in the blockchain network, the transaction records stored at all the parties are immutable, facilitating the verification processes among the parties for auditing [29], achieving cost efficiency by eliminating unnecessary intermediaries and so reducing transaction costs [29], and enabling traceability of transactions [11,81,82]. The case study in Park and Li [11] demonstrated how Wal-Mart has achieved traceability using blockchain technology and improved the process of tracking the food delivery from a supplier to a consumer. The reliability of blockchain can be maintained by the community of the blockchain users and the decentralization of the user nodes in the blockchain network: if any user node fails or exits the blockchain network, the whole blockchain can keep running as the other user nodes can continue to perform the blockchain functions [29].

Insights from the Literature
In line with the technical sustainability difficulties identified by Lim, Li, Wang and Tseng [16] and the recommendation by Liu, Zhang and Zhen [83] to use big data and artificial intelligence in further studies, this study proposed an architecture that tackles two groups of technical sustainability difficulties: scaling to handle a tremendous number of blockchain processes and transmission delay; and storing, manipulating and analyzing, with the application of machine learning, the supply chain transactional big data stored in the blockchains. To implement blockchain technology for supply chain management, improved storage management and an advanced cloud computing infrastructure are required [24].

Technical Review and Design
To design a blockchain architecture, a technical review was conducted to explore the appropriate cloud computing technology that can fit the blockchain features. In addition to the blockchain features, the technical sustainability challenges in terms of scalability and big data processing and analytics were considered when exploring the cloud technology. The documents used for this exploration included the websites, published books, articles, technical reports, user manuals and training kits of the cloud computing products (e.g., [84][85][86][87]). With reference to the results from the document analysis, Amazon Web Services (AWS) was adopted as a reference model for the main reason that AWS provides the most reliable and the fastest serverless functions among the prominent cloud technologies [88]. As such, the AWS cloud infrastructure provides the service components that should be contained in any cloud infrastructure. These service components are appropriate for building blockchains in supply chains, as presented in Section 4.1.1. Based on the cloud infrastructure, the design of the blockchain architecture could be conducted, as presented in Section 4.1.2.
For big data analytics, the cloud infrastructure provides the data analytics architecture, as shown in Section 4.1.3. Machine learning can be used to enhance the big data analytics for supply chain operations. To apply machine learning to supply chain operations, a machine learning algorithm was developed. This algorithm is illustrated in Section 4.1.4. To enable machine learning in supply chain management, the developed machine learning algorithm was integrated into the data analytics built on top of the blockchain on cloud, as illustrated in Section 4.1.5.

Cloud Infrastructure for Blockchains
Cloud computing technology should offer high availability, high performance, scalability, security, and alert features. It should also contain application programming tools for developers to customize cloud computing services, storage for big data and analytics tools. The AWS cloud infrastructure is taken into consideration as it comprises many service components that exhibit these features. These service components are also available in other prominent cloud infrastructures such as IBM Cloud, Microsoft Azure and Google Cloud, and are classified into six categories, namely, virtualization, integration, security, storage, analytics, and notification.
The virtualization service component provides a scalable cloud computing service which allows computing resources (e.g., memory, storage, and processing power) to be scaled up or down to handle different users' requirements. This service component allocates parts of cloud computing servers, called virtual servers, to users and allows them to develop and deploy applications, configure security and networking, and manage storage on the virtual servers.
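The scaling behaviour described above can be illustrated with a simple sizing rule. The sketch below is hypothetical and is not taken from any cloud provider's API: the function name `desired_servers`, its parameters and the default limits are assumptions chosen for illustration only.

```python
import math


def desired_servers(
    requests_per_second: float,
    capacity_per_server: float,
    min_servers: int = 1,   # hypothetical lower limit
    max_servers: int = 20,  # hypothetical upper limit
) -> int:
    """Return how many virtual servers are needed for the current load."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    needed = math.ceil(requests_per_second / capacity_per_server)
    # Clamp the result so the pool never shrinks below or grows beyond its limits.
    return max(min_servers, min(max_servers, needed))
```

When the transaction load rises, the rule asks for more virtual servers (scaling up); when the load falls, it releases them (scaling down), so that resources follow the actual business demand.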
The integration service component provides an application programming interface (API) for programmers to develop web server programs that can be integrated with cloud computing. An API acts as an intermediary between two applications or systems. Programmers can develop a program using the API to forward signals and messages between the two applications and therefore allow the two applications to communicate with each other. The integration service component lets developers create custom APIs to power the blockchain applications.
The security service component generates and hosts encryption keys for encrypting messages during transmission, and deals with the exchange, use, destruction and replacement of encryption keys. For a blockchain, the security service component facilitates the creation and management of the blockchain's encryption keys, strengthening the security of transmitting the encrypted transactions and ensuring privacy in the blockchain P2P network.
The storage service component contains a cloud storage that allows users to store and retrieve objects and data in cloud computing. It also provides highly scalable, available, and efficient data storage. The storage service component is based on database technologies developed to address big data issues. It is built on two database models that can handle big data, namely, the key-value database model and the document database model. The storage service component can be leveraged to serve as an off-chain database solution to support blockchain applications and store metadata (that is, information about the data). Because there is limited storage capacity for a block in a blockchain, large amounts of transaction data have to be converted into two forms. One form is the original transaction data, which can be stored in the off-chain database, meaning the transaction data are stored outside a block; the other form is a hash value of the original transaction data, which can be stored on-chain, meaning the hash value is stored in the block [89]. With this mechanism, the hash value in the block serves as a link to the corresponding large-sized transaction data stored outside the block in the cloud storage, so the integrity of the transaction data can be maintained, while the immutability of the transaction data can be checked by recomputing and comparing the hash value.
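The off-chain and on-chain split can be sketched as follows. This is an illustrative sketch only: the class `OffChainStore` is a hypothetical in-memory stand-in for a cloud key-value store, SHA-256 over a canonical JSON encoding is an assumed choice of hash, and the function names are not taken from any product.

```python
import copy
import hashlib
import json
from typing import Dict, List


def digest(record: dict) -> str:
    """Hash a record using a canonical JSON encoding (assumed format)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class OffChainStore:
    """Hypothetical stand-in for a cloud key-value database."""

    def __init__(self) -> None:
        self._items: Dict[str, dict] = {}

    def put(self, record: dict) -> str:
        """Store the full record off-chain, keyed by its hash."""
        key = digest(record)
        self._items[key] = copy.deepcopy(record)
        return key

    def get(self, key: str) -> dict:
        return self._items[key]


def record_transaction(store: OffChainStore, block_body: List[str], record: dict) -> str:
    """Keep the large record off-chain and only its hash on-chain."""
    on_chain_hash = store.put(record)
    block_body.append(on_chain_hash)
    return on_chain_hash


def verify(store: OffChainStore, on_chain_hash: str) -> bool:
    """Recompute the hash of the off-chain record and compare it."""
    return digest(store.get(on_chain_hash)) == on_chain_hash
```

If the off-chain copy is altered after the hash has been written into a block, the recomputed hash no longer matches the on-chain value, so the change is detectable even though the data themselves are stored outside the blockchain.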
The analytics service component comprises analytics tools which utilize artificial intelligence algorithms for real-time analysis on blockchain data. This component can collect, process, and analyze real-time streaming data and provides a cloud data warehouse designed for big data storage and analysis. In these regards, the analytics tools allow users to gain insights in nearly real time.
The notification service component allows messaging among users, administrators, and even other service components in the distributed cloud systems. This service component allows users to queue and then process messages. It also allows application-to-application and application-to-person communication. Application-to-application communication includes messaging between distributed systems, microservices, and event-driven serverless applications. The application-to-person communication feature enables events in the cloud to send messages to users at scale via short message service, mobile push, and email. Besides, the notification service lets developers run code without provisioning or managing cloud servers. The developers just write the code and upload it to the notification service component, which then scales the cloud computing resources for running the code. With the notification service component, the developers can enjoy serverless computing, a cloud computing execution model that allocates cloud computing resources on demand, so that the developers pay only for the resources the cloud platform has allocated for running the code, rather than paying more for resources provisioned for developing and running the code. The notification service component sends notifications or leverages serverless computing to respond to events related to processing blockchain records.
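The queue-then-process behaviour described above can be sketched with an in-memory publish and subscribe hub. This is a simplified illustration, not the interface of any cloud notification product: the class `NotificationHub`, its methods and the topic name used in the usage note are hypothetical.

```python
from collections import defaultdict, deque
from typing import Callable, Deque, Dict, List, Tuple

Handler = Callable[[dict], None]


class NotificationHub:
    """Hypothetical in-memory hub that queues messages and delivers them."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[Handler]] = defaultdict(list)
        self._queue: Deque[Tuple[str, dict]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler to be run when a message arrives on topic."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        """Queue a message; nothing runs until process() is called."""
        self._queue.append((topic, message))

    def process(self) -> int:
        """Deliver queued messages in order and return the delivery count."""
        delivered = 0
        while self._queue:
            topic, message = self._queue.popleft()
            for handler in self._subscribers[topic]:
                handler(message)  # event-driven: code runs only on an event
                delivered += 1
        return delivered
```

For example, a handler subscribed to a hypothetical topic such as `"block.committed"` would run only when a new blockchain record is published to that topic, which mirrors how a handler is invoked on demand in response to an event.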

Designing Blockchain Architecture on Cloud
To maintain the sustainability of blockchains in supply chain, the blockchain on cloud using different technologies, as shown in Figure 4, is proposed. The infrastructure layer contains the service components. The actual cloud services used depend on specifications required by each blockchain application in supply chain. Above the infrastructure layer is a platform layer which contains API, blockchain framework and container technology sublayers. API sublayer provides application programming tools for programmers to develop supply chain application services such as sourcing, procurement, delivery, inventory management and insurance claiming processes. The programmers can use the integration service component at the infrastructure layer to allow the programs developed at the API sublayer to connect to and talk with the cloud infrastructure. The API sublayer also defines interactions between multiple systems. Spring Boot and Node.js are some examples of application programming tools in this API sublayer.
The blockchain framework sublayer is where the blockchain technology is applied for supply chain transactions. This sublayer contains Hyperledger and Ethereum. Hyperledger is a project hosted by the Linux Foundation to develop applications, solutions and tools using blockchain technology. Hyperledger Fabric is a subproject of Hyperledger which provides an open-source blockchain framework for developing applications, solutions, and tools with a modular architecture. This modular architecture satisfies a broad range of industry use cases. Ethereum is a distributed public blockchain network of computers running decentralized applications. These decentralized applications include voting, global supply chains, methods of payment, and financial systems.
The container technology sublayer provides containers, a form of operating system virtualization in which each container packages all necessary executables, application libraries and configuration files. Operating system virtualization shares one operating system among many isolated instances in order to optimize the usage of computing resources. Each instance is not the actual operating system and is therefore called a virtual operating system or container. Because containers in this sublayer are lightweight (that is, they are small, at several megabytes in size, and quick to start) and portable (that is, they are easy to move across computers) with significantly less overhead, they consume fewer resources. Docker, Kubernetes and OpenShift are some examples of container technologies.

Data Analytics Architecture for Supply Chain Management with Blockchains on Cloud
The data analytics architecture for supply chain management with blockchains on cloud contains sequential collecting, storing, processing and consuming stages, as shown in Figure 5. At stage 1, supply chain data are first classified into structured, semi-structured, and unstructured data types, and then collected. The rationale for this classification is that different data types require different storage types and different processes. Structured data refer to the data organized (or normalized) in a structure that can be stored in a relational database. Codd [90] designed a relational model for reducing data redundancy. A database system developed with reference to the relational model is called a relational database. Structured data can be manipulated in relational databases by the standard language called structured query language (SQL). Semi-structured data do not follow any particular data model, but they carry a partial structure based on organizational properties expressed through identifiers or tags. For example, the identifiers or tags defined in extensible markup language (XML) are commonly used to characterize semi-structured data. As semi-structured data do not conform to the relational model, they cannot be stored directly in relational tables and manipulated by SQL; they have to be further manipulated for use and analysis. Unstructured data refer to the data in raw or original format such as text, graphics, audio, and video. They do not conform to any data model to form a structure or semi-structure, and therefore cannot be processed by SQL or defined by XML.
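As an illustration of the stage 1 classification, the following sketch sorts incoming records into the three data types. The rules are simplifying assumptions made for this example and are not part of the proposed design: a flat mapping of scalar fields stands in for a relational row, tagged text (XML or JSON) and nested objects stand in for semi-structured data, and everything else is treated as unstructured. The function name `classify` is hypothetical.

```python
import json
import xml.etree.ElementTree as ET

SCALARS = (str, int, float, bool, type(None))


def classify(record):
    """Return 'structured', 'semi-structured' or 'unstructured' for a record."""
    # A flat mapping of scalar fields could be stored as one relational row.
    if isinstance(record, dict) and all(isinstance(v, SCALARS) for v in record.values()):
        return "structured"
    # Nested objects carry tags or keys but do not fit a single flat row.
    if isinstance(record, (dict, list)):
        return "semi-structured"
    if isinstance(record, str):
        text = record.strip()
        try:
            if isinstance(json.loads(text), (dict, list)):
                return "semi-structured"
        except ValueError:
            pass
        try:
            ET.fromstring(text)  # well-formed XML counts as tagged data
            return "semi-structured"
        except ET.ParseError:
            pass
    # Free text, images, audio, video and other raw payloads.
    return "unstructured"


# Usage on three illustrative supply chain records.
samples = [
    {"order_id": 1, "qty": 5},
    "<shipment id='7'><port>Rotterdam</port></shipment>",
    b"\x89PNG raw image bytes",
]
labels = [classify(s) for s in samples]
```

In practice the label would decide whether a record is routed to a database or to a data lake at the storing stage.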
At stage 2, the collected supply chain data are stored in a database or a data lake. Structured and semi-structured data can be stored in a database while unstructured data are usually stored in a data lake which is a storage repository that holds raw data in its native format. These unstructured data such as video, photos and codes may be generated from mobile devices and sensors like radio frequency identification readers and IoT devices used in logistics for supply chain.
At stage 3, the stored supply chain data are retrieved for processing. This may involve data cleansing, transforming, sorting and aggregating processes. This is the stage at which data analytics mainly occurs. Meanwhile, machine learning algorithms can be applied to explore patterns in the processed data. For example, historical sales records can be analyzed with machine learning in order to set stock reorder levels for supply chain management.
At stage 4, the analytical results from the previous stage are consumed. That is, the analytical results are presented in order to obtain insights. This consuming stage may involve visualization and business intelligence software tools to build charts, tables, and dashboards for presenting the insights to supply chain stakeholders.
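The processing stage and the reorder-level example can be sketched as follows. The sketch uses a classical statistical rule (mean demand over the lead time plus a safety stock) as a simple stand-in for the machine learning model mentioned above; the rule, the service factor `z`, and the field names `date` and `qty` are assumptions made for illustration and are not taken from the proposed design.

```python
import math
import statistics


def cleanse(rows):
    """Cleansing: drop rows with missing or negative quantities."""
    return [r for r in rows if r.get("qty") is not None and r["qty"] >= 0]


def daily_demand(rows):
    """Sorting and aggregating: total quantity sold per day, in date order."""
    totals = {}
    for r in rows:
        totals[r["date"]] = totals.get(r["date"], 0) + r["qty"]
    return [totals[d] for d in sorted(totals)]


def reorder_level(demand, lead_time_days, z=1.65):
    """Mean demand over the lead time plus a safety stock (assumed rule)."""
    mean = statistics.mean(demand)
    sd = statistics.stdev(demand) if len(demand) > 1 else 0.0
    safety_stock = z * sd * math.sqrt(lead_time_days)
    return mean * lead_time_days + safety_stock


# Usage: historical sales records, one of which is invalid and gets cleansed.
sales = [
    {"date": "2020-01-01", "qty": 8},
    {"date": "2020-01-02", "qty": 12},
    {"date": "2020-01-03", "qty": 10},
    {"date": "2020-01-03", "qty": None},
]
level = reorder_level(daily_demand(cleanse(sales)), lead_time_days=4)
```

The resulting level is the kind of analytical result that the consuming stage would present on a dashboard for supply chain stakeholders.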

Machine Learning for Supply Chain Management with Blockchains on Cloud
"Machine learning is a subfield of artificial intelligence that gives computers the ability to learn without explicitly being programmed" [91]. In this regard, machine learning is an artificial intelligence method that enables systems to learn automatically. The systems learn from the processed data, identify patterns in the data, and then adjust the pre-programmed machine learning algorithms accordingly in order to perform specific tasks better. As machine learning is exposed to new data, it adapts and keeps training and adjusting itself for better performance and decisions.
For data analytics in supply chain management with blockchains on cloud, the cloud infrastructure provides tools to build, train and deploy machine learning models. Figure 6 shows a cycle of building, training, and deploying stages. At the first stage, building, supply chain scenarios and use cases are defined. Then, training data are collected and prepared for use by appropriate machine learning algorithms, as in the cases specified by Li [92].
At the second stage, training, the infrastructure for machine learning on cloud is set up. Then, the training data are loaded into the machine learning environment on cloud in order to build and validate machine learning models.
At the final stage, deploying, the machine learning models built at the previous stage are deployed as API endpoints that developers can call and configure, and that can be scaled and managed on the cloud infrastructure.
New supply chain training data cause the data analytics tools to restart the building process and go through training and deploying again. The data analytics tools repeat this cycle of building, training, and deploying for better performance.
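The cycle of building, training, and deploying can be illustrated with a deliberately small sketch. A one-variable least-squares model stands in for whatever machine learning algorithm a real use case would require, a plain Python callable stands in for an API endpoint, and validation reuses the training data for brevity; all names (`build`, `train`, `validate`, `deploy`, `Lifecycle`) are hypothetical.

```python
def build(records):
    """Building stage: prepare (feature, target) pairs for the defined use case."""
    return [(float(x), float(y)) for x, y in records]


def train(data):
    """Training stage: fit y = a*x + b by ordinary least squares."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in data)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    a = sxy / sxx if sxx else 0.0
    return a, mean_y - a * mean_x


def validate(model, data):
    """Mean absolute error of the model (on the training data, for brevity)."""
    a, b = model
    return sum(abs(a * x + b - y) for x, y in data) / len(data)


def deploy(model):
    """Deploying stage: expose the model as a callable 'endpoint'."""
    a, b = model
    return lambda x: a * x + b


class Lifecycle:
    """Re-runs build, train and deploy whenever new training data arrive."""

    def __init__(self):
        self.records, self.endpoint, self.error, self.version = [], None, None, 0

    def ingest(self, new_records):
        self.records.extend(new_records)
        data = build(self.records)
        model = train(data)
        self.error = validate(model, data)
        self.endpoint = deploy(model)
        self.version += 1
        return self.endpoint


# Usage: the first batch creates version 1; new data trigger version 2.
cycle = Lifecycle()
cycle.ingest([(1, 2), (2, 4), (3, 6)])
cycle.ingest([(4, 8)])
```

Each call to `ingest` corresponds to one pass around the cycle in Figure 6, with the version counter making the redeployment visible.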

Integrating Machine Learning into Data Analytics for Supply Chain Management with Blockchains on Cloud
The integration process proposed by Yeung, Wong, Tam and So [49] can be applied to supply chain management with blockchain on cloud. That process integrates machine learning (Figure 6) into the data analytics process (Figure 5) to obtain insights, as shown in Figure 7. At the collecting stage, the supply chain data are classified into structured, semi-structured, and unstructured data types. Batch load and streaming are used to load these structured, semi-structured and unstructured data into a database or a data lake at the storing stage. Batch load periodically loads the data into a database or a data lake through extraction, transformation and loading processes. Streaming loads data into a database or a data lake continuously from data sources in real time. At this storing stage, machine learning can be integrated to structure streaming data into a format that is ready for analysis and to explore the patterns and characteristics of the vast amounts of raw data in the data lake so that they become analysis-ready data. Machine learning helps to categorize, standardize, aggregate, annotate and transform data to facilitate data analysis. Some data can be used as training data for machine learning development at the processing stage. New data models may be developed by machine learning and then deployed as an API endpoint for developers to build applications, solutions, or tools at the consuming stage. Meanwhile, some other data will be analyzed by business intelligence tools to carry out data analysis for reporting at the consuming stage.
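The difference between batch load and streaming at the storing stage can be sketched as below. A Python list stands in for the database or data lake, and the record fields `ship` and `teu` are hypothetical; the `transform` step is a plain rule-based standardization, used here in place of the machine learning step described above.

```python
def transform(raw):
    """Standardize one raw record into an analysis-ready form."""
    return {"ship": raw["ship"].strip().upper(), "teu": int(raw["teu"])}


def batch_load(source, store):
    """Batch load: extract, transform and load a whole batch in one periodic run."""
    store.extend(transform(r) for r in source)
    return len(source)


def stream_load(source, store):
    """Streaming: load records one at a time as they arrive from a live source."""
    for raw in source:
        store.append(transform(raw))
        yield store[-1]  # each record is available for analysis immediately


# Usage: a nightly batch followed by records arriving from a live feed.
lake = []
batch_load([{"ship": " alpha ", "teu": "120"}, {"ship": "beta", "teu": "80"}], lake)
for record in stream_load(iter([{"ship": "gamma", "teu": "45"}]), lake):
    pass  # a real consumer would analyze or forward each record here
```

Both paths apply the same transformation, so downstream analysis sees one consistent format regardless of how the data arrived.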

Case Study
Di Vaio, Varriale and Alvino [93] (pp. 230-231) explained that "the case study approach is well-documented and recognized throughout the academic literature as a useful method for examining phenomena still unexplored [94][95][96]. Case studies allow the investigation of phenomena separately from their context examining specific variables [97,98]". Accordingly, a real business use case of a technology similar to the proposed blockchain on cloud integrating machine learning into data analytics was studied. This was a single case study because, owing to confidentiality agreements, no other cases adopting this new design of the blockchain on cloud with machine learning integrated into data analytics for supply chain management were approachable. The technical sustainability, in terms of scalability and big data manipulation and analysis, was adopted as the theoretical base to analyze the case.

Background
The case company is the multinational shipping company Maersk, which has been using TradeLens for its international logistics. TradeLens is a blockchain application based on IBM cloud technology and accessible via an open API powered by Hyperledger Fabric blockchain technology. The benefits of the TradeLens blockchain features for Maersk reported in the literature include cost efficiency [11,59], security, shareability, and immutability [11], and traceability and trust [59].
In addition to applying blockchain on cloud to supply chain operations, Maersk wanted to integrate machine learning into data analytics on the cloud-based blockchain for marine insurance. In partnership with Ernst and Young, Guardtime and Microsoft, Maersk developed a blockchain-based platform integrated with machine learning to improve efficiency, transparency, and accuracy, and to maintain security and compliance across a large variety of supply chain parties. This blockchain-based platform is based on Microsoft Azure cloud infrastructure.

Implementation
After an order notification is received and a shipment is requested, Maersk schedules the shipment and decides on an appropriate route for a cargo ship from the port of origin to the port of destination. The shipment schedule and route, together with the required documents (e.g., bill of lading, letter of credit and forwarder's certificate of transport), are relayed to the parties involved (e.g., the bank, the supplier, the port authorities, and the buyer). The blockchain-based platform integrated with machine learning mainly addresses the real-time detection of maritime risk, as illustrated in Figure 8. The machine learning algorithm for this detection is based on computational and statistical analyses of the big data of previously assigned routes and historical danger zones. That algorithm involves a cycle of computational and statistical analyses on new data. This cycle, known as learning or improving through experience, renders better predictions. For this case, the big data of previously assigned routes and historical danger zones provide sources for the machine learning algorithm to learn from and to make better decisions on a new route and a safe zone. A new contract is generated and stored in a blockchain network on cloud when a shipment is confirmed and assigned a planned route. The assigned cargo ship follows the planned route from the source to the destination. The geo-location data of the ship travelling along the planned route are traced continuously by global positioning system (GPS) sensors installed in the ship.
If the ship unexpectedly detours to a danger zone (or no-go area [99], e.g., a bad weather zone), the GPS sensors will detect the detour. Like an IoT device, the GPS sensors send the updated geo-location data of the ship to the blockchain network on cloud. Then, the contract is updated accordingly. The updated route is stored in the blockchain and processed by machine learning integrated into data analytics on cloud. The analytical results generated by the machine learning algorithm integrated into the data analytics processes on cloud show that the detour to the danger zone is prone to accidents and therefore incurs a higher insurance premium because of the higher shipment risk (e.g., piracy) [100]. The machine learning algorithm enables a learning process for guiding the ship to return to the planned route or to find a new safe route to the destination. Once the ship is adjusted to travel in the safe zone, the contract is updated, the shipment risk level is lowered, and the insurance premium is reduced. These transaction records are stored in the blockchain and shared among the supply chain parties (e.g., the supplier, the shipping company, the insurance company, and the buyer).
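The detour check and the resulting premium adjustment can be sketched as follows. This is an illustrative rule-based stand-in, not the machine learning algorithm or the platform used in the case: danger zones are assumed to be circles given by a centre and a radius, the distance is computed with the haversine formula, and the 25% surcharge is an arbitrary assumed value. The function names and dictionary fields are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def in_danger_zone(position, zones):
    """True if the GPS position lies inside any circular danger zone."""
    lat, lon = position
    return any(
        haversine_km(lat, lon, z["lat"], z["lon"]) <= z["radius_km"] for z in zones
    )


def assess(position, zones, base_premium, surcharge=0.25):
    """Contract update for one GPS reading (surcharge value is an assumption)."""
    risky = in_danger_zone(position, zones)
    return {
        "position": position,
        "risk": "high" if risky else "normal",
        "premium": base_premium * (1 + surcharge) if risky else base_premium,
    }


# Usage: one reading inside a hypothetical danger zone, one after returning to safety.
zones = [{"lat": 12.0, "lon": 48.0, "radius_km": 100.0}]
detour = assess((12.1, 48.1), zones, base_premium=1000.0)
back_on_route = assess((30.0, 32.0), zones, base_premium=1000.0)
```

Each returned record corresponds to one contract update that would be appended to the blockchain and shared with the supply chain parties; the premium rises during the detour and returns to the base value once the ship is back in the safe zone.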

Data Collection and Analysis
A confirmatory approach was used to carry out document analysis in order to confirm whether the technical sustainability, in terms of scalability and big data processing and analytics, can be achieved. The documents used for analysis included the websites, news, reports and technical manuals (e.g., https://www.refrigeratedtransporter.com/carriers/article/21720715/maersk-teams-with-microsoft-for-digital-transformation (accessed on 17 June 2020)) about the blockchain with machine learning on Microsoft Azure used by Maersk.

Findings and Discussion
The blockchain with machine learning based on Microsoft Azure is expected to achieve the technical sustainability of scalability for two reasons. First, the Microsoft Azure Autoscale service component can scale automatically to handle increased workload, as reported in many technical documents (e.g., https://azure.microsoft.com/en-us/features/autoscale (accessed on 8 December 2020)). Second, Maersk did not report any severe degradation, downtime or delay of supply chain services; had such failures occurred, the global supply chain would have been seriously disrupted. Moreover, the technical sustainability of big data processing and analytics is achieved, as the blockchain with machine learning based on Microsoft Azure manages to detect maritime risk. In addition, the blockchain with machine learning on Microsoft Azure helps to achieve reliability by providing real-time visibility into in-transit assets to establish accurate, dynamic, and fair underwriting and pricing. It also helps to achieve auditability by ensuring regulatory reporting and compliance through an audit trail provided by accurate and transparent data sharing across the supply chain parties in the blockchain P2P network, and by streamlining claim and settlement processes while reducing errors.
As mentioned before, the case company Maersk uses a technology that is similar to the proposed design. The companies approached that use the proposed design preferred to stay anonymous in any publications and reports. This concern is understandable because these companies are still testing the proposed blockchain architecture with machine learning and are therefore not confident that their use cases are success cases to share. This situation also reflects that the supply chain parties are wary of the new blockchain technology on cloud integrated with machine learning owing to uncertainty about the skills and knowledge needed for adopting this technology.

Conclusions and Implications
One of the strongest drivers for change in today's society is sustainability. This will continue for years to come, influencing consumer behaviors, government regulations and business activities. Blockchain technology establishes a relatively trustworthy network for real data exchange and sharing among different parties in the community. However, the sustainability of supply chain management using blockchain technology is subject to performance and storage issues because of the immutability and the growing volume and variety of supply chain transaction data in the blockchain P2P network of various supply chain parties. To maintain the sustainability of supply chain management using blockchain technology, a blockchain architecture supported by a sophisticated cloud infrastructure was proposed. The cloud infrastructure provides all the necessary web services with scalability, availability, security and serverless computing features that facilitate distributing, sharing, and storing a considerable amount of immutable transaction records with the help of security and privacy web services and the use of the notification web services. On top of the cloud infrastructure, the platform layer of the proposed blockchain architecture contains the API, blockchain framework and container technology tools for developing blockchain applications in the supply chain. The big data derived from the ever-expanding volume and variety of the supply chain transaction data in a blockchain provide opportunities for applying sophisticated analysis tools to these big data. The proposed analysis tool integrates machine learning into data analytics for supply chain management with blockchains on cloud. A use case about the detection of maritime risk is highlighted as an example of applying blockchain technology in supply chain management.
Moreover, this case demonstrates how to create additional business values or collaborations in the practical environment, with other advanced technologies like cloud, data analytics and machine learning services.
The major contribution of this study is the determination that blockchain with machine learning on cloud achieves the technical sustainability, as the cloud technology can scale to meet the expanding blockchain requirements and learn from the blockchain big data. However, this study has limitations. First, this study was restricted to a single case due to the limited number of approachable organizations using the blockchain with machine learning. Therefore, this single case study is open to challenge regarding the generalizability of the findings. As stated by Yin [98a], the findings from a case study are generalizable to theoretical propositions. Such a proposition provides a knowledge base for a better understanding in the next similar case study. In the near future, more case studies on the use of the blockchain architecture on cloud integrated with machine learning should be conducted. Second, this study focuses on achieving the technical sustainability at the expense of the economic sustainability and environmental sustainability, given the increasing operational costs and energy consumption. Owing to the serverless computing (that is, on-demand) feature of the blockchain on cloud, the supply chain stakeholders can start from a small-scale blockchain network and scale up to meet the actual business needs later. However, the trend of increasing operational costs as a blockchain network becomes larger and larger calls for further investigation into the economic sustainability of this financial technology issue. Meanwhile, the higher energy consumption due to more blockchain operations is a challenge for environmental sustainability. Third, this study lacks consideration of the social sustainability.
With the use of blockchain architecture, the transaction records stored in the blockchain network are immutable due to the consensus mechanism of blockchain technology. This leads to a verifiable, traceable, and trustworthy environment, but it also raises access right and privacy issues concerning the immutable, everlasting records, especially when there are legal or political changes among the different regions and countries involved in the supply chain. The social sustainability of the access right and privacy issues has to be investigated. This paper presents a conceptual idea of applying blockchain technology to maintain the sustainability of supply chain management. As such, an empirical study on the impacts of blockchain technology on sustainability performance is a potential research direction for theoretical development. As pointed out by Lim, Li, Wang, and Tseng [16], the quantitative studies on blockchain-based supply chains are inadequate. Moreover, " . . . quantitative metrics reflecting blockchain in the business world are needed" [20]. In these regards, more quantitative empirical studies on sustainable supply chain performance models are needed. In addition, the small number of use cases of blockchain technology, especially with machine learning, is mainly due to the scarcity of supply chain users and their lack of knowledge and understanding of the benefits and technical aspects of blockchain technology. Blockchain technology in supply chains involves a large variety of supply chain users. As a result, it is hypothesized that the use of blockchain technology is influenced by the technology usage expectations of the peer supply chain users (e.g., business partners, the suppliers, and the buyers). Therefore, further investigation is required to explore the factors (e.g., peer influence) affecting blockchain adoption behaviors in the supply chain field.
Furthermore, simulation or experimental work is proposed to build up smart logistics and the digital supply chain [101], which are a trend of Industry 4.0, by using blockchain technology.