Design of an IOTA Tangle-Based Intelligent Food Safety Service Platform for Bubble Tea

: Many food safety incidents have occurred in the world in the past 20 years, causing major threats and harm to human life and health. Each country or region has established different food safety management systems (FSMSs) in response, to increase food safety and to reduce food safety risks. Hence, it is important to develop an FSMS service platform with convenience, consistency, effectiveness, scalability, and lightweight computing. The aim of this study is to design and propose an IOTA Tangle-based intelligent food safety service platform for bubble tea—called IF4BT—which modularizes and integrates hazard analysis and critical control point (HACCP) principles to increase data transparency. The deep learning inference engine is based on long short-term memory and Siamese networks to check and extract signiﬁcant rare data of high-risk factors, exception factors, and noises, depending on daily check and audit. IF4BT can ensure the correctness of the information of food manufacturers, so as to increase food safety and to reduce food safety issues such as allergen cross-contamination, food expiration, food defense, and food fraud.


Introduction
In the past 20 years, many major food safety incidents have occurred in the world, e.g., the melamine incident, plasticizer incident, toxic starch incident, toxic soy sauce incident, gutter oil incident, etc. [1][2][3]. Food safety incidents can present major threats to human life and health. Governments and industries of various countries are actively investing in the field of food safety [4,5]. Each country or region has formulated different food safety management systems (FSMSs) to increase food safety and reduce food safety risks [6]. Each national accreditation body (AB) and certification body (CB) issues a certification after certifying that the production line or the product meets food safety standards. Manufacturers need to sell goods internationally, so they must have different food safety certifications. The CB needs to audit production status of the manufacturer from time to time. In fact, many problems can be solved by the manufacturer's own internal audit [5]. However, there is a lack of professional confirmation of each process. Because it is uncertain whether or not the internal audit information has been tampered with, a third-party verification agency is still required in order to ensure that all processes are correct.
The latter must conduct internal and external audits and prepare verification procedures and related documents for different certification units. It takes a lot of time to prepare many kinds of documents. It is not easy to save the collected data and to find the missing process. Some unscrupulous manufacturers even have inconsistent documents with the production line, which may cause food safety issues such as food expiration, food defense, or food fraud. Therefore, effective tracking and tracing, and verifying the correctness of the data-preventing unscrupulous manufacturers from tampering with the content-is an important issue.
Bubble tea-milk tea with tapioca balls-is a very popular ethnic local and international beverage. The ingredients of a cup of bubble tea are tapioca balls, milk, tea, and sugar. Each of its ingredients may have food safety issues, food defense issues, and food fraud issues. Hence, to use IT techniques to construct a standard operating procedure (SOP)-based service platform for food safety is an important issue. Consequently, five problems must be considered: convenience, consistency, effectiveness, scalability, and lightweight computing.
(1) Convenience: Convenience is defined as records being quickly and automatically convertible into officially checkable documents for auditing, applications, or examinations. This can save a lot of labor time and manual input errors. To tackle this problem, a web-based service platform using Extensible Markup Language (XML) and an intelligent mapper can effectively interchange document formats. Extensible Markup Language (XML) is a tag-based language, which can easily migrate, integrate, and transfer data with other services [7]; (2) Data consistency: Consistency is defined as each user seeing a consistent view of the data. Data cannot be deliberately modified. To tackle this problem, the blockchainbased framework is a decentralized architecture. When data are uploaded on the distributed ledger, it records all collected data, transactions, and documents via a tamper-resistant mechanism [8]. To tackle this problem, the distributed ledger ensures that the raw materials, production processes, processing, sales, and audit data in the food safety supply chain are consistent; (3) Effectiveness: Effectiveness is defined as the ability of participants to effectively track and trace goods in the food safety supply chain. Traditional tracking and tracing of goods require a lot of data to consult. However, the data are often broken, and the paper data and electronic data are interlaced and coexist, so they are not easy to acquire. To tackle this problem, transactions are electronically stored in blockchain or Tangle. Participants can effectively track and trace authorized data and transactions via authorities [9]; (4) Scalability: Scalability is defined as a manufacturer being able to use low-cost computer equipment or a low-power IoT device to create a new node of the blockchain, reducing the manufacturer's construction cost. To tackle this problem, devices can participate in the editing of the distributed ledger. 
IOTA Tangle is helpful for data transmission and connection by multiple IoT devices [9]. Allowing devices to directly participate in data and message transmission can reduce the chance of data tampering [10]; (5) Lightweight computing. Lightweight computing is defined as a low-power device being able to quickly analyze and confirm exceptional data or conditions. To tackle this problem, a participant uses lightweight equipment-e.g., Raspberry Pi-to automatically set multiple triggers, thresholds, and rules by the server for detecting and computing exceptions of ingredients and foods [11].
Regarding the aforementioned five design issues, this study designs and proposes an IOTA Tangle-based intelligent food safety service platform for bubble tea-called IF4BTwhich modularizes and integrates hazard analysis and critical control point (HACCP) principles to increase data transparency. IF4BT is composed of a risk management service platform and a cloud-based deep learning service platform; it can check daily transactions and exceptions. Users can easily trace and track goods. IF4BT can ensure the correctness of the information available to food manufacturers, so as to increase food safety and reduce food safety issues, such as allergen cross-contamination, food expiration, food defense, and food fraud.
The remainder of this paper is organized as follows: Section 2 presents the literature review. Section 3 describes the research methodology. Section 4 contains the results and discussion. Finally, Section 5 presents the conclusion.

Literature Review
In recent years, people have experienced various food safety incidents, and the requirements for food safety have become greater and greater. It is important to prevent food safety problems before they happen. Hence, many researchers have devoted efforts to designing and developing various information technology (IT)-based applications to help, check, and reduce food safety risks, which can be roughly divided into the following four categories: (1) The Internet of Things (IoT)-based Applications: To deploy various IoT devices-such as RFID, environmental sensors, or GPS to monitor and control food safety risks in the food safety supply chain-Raju et al. [12] proposed near-field and ultrahigh-frequency (UHF) wireless passive sensors for monitoring of food quality indices and food spoilage indicators. Zheng et al. proposed a meat food traceability system based on RFID technology to achieve full process supervision on a food supply chain; the authors built a fault-tolerant RFID mechanism to ensure the practicability of the system [13]. Prajwal et al. designed a device to check the quality of food, using pH sensors, gas sensors, temperature sensors, etc.; this helped them to identify food conditions, preventing the consumption of rotten food [11]. Pal et al. proposed an IoTbased sensing and communications infrastructure for the fresh food supply chain to help reduce food waste; it efficiently improved transportation and distribution, and quickly removed contaminated or spoiled products from the fresh food supply chain [14].
(2) Informational assistance: To assist manufacturers in inspections and audits during production processes via hazard analysis and critical control point (HACCP)-based food safety standards, Dusenko et al. proposed a system to ensure the safety of food services based on the HACCP method. The proposed system uses digital technology to monitor products and processes during manufacturing, storage, transportation, sales, and disposal [15]. Jing discussed applications of different links in the grain storage and transportation process; the author proposed a food storage and transportation safety information management platform to enhance the informatization process of food enterprises [16]. Gallo et al. proposed a traceability support system to control safety and sustainability indicators in food distribution, aiding practitioners in the monitoring of unsupervised operations throughout the supply chain [17]. Todorović et al. proposed the use of augmented reality to provide relevant information on food production and packaging systems. the authors tried to apply mobile augmented reality in the food market. This can help consumers to obtain information on goods quickly and in a user-friendly manner [18].
(3) To build a blockchain: To use the decentralized, tamperproof, trackable, and traceable characteristics of the blockchain, all data are stored in the blockchain to ensure that the data will not be arbitrarily tampered with. Yu et al. proposed an architecture based on blockchains and RFID. By scanning the traceable QR code of the foot ring with a smart device, all the data of each process in the chicken life cycle can be obtained [19]. Tao et al. proposed a food safety supervision system based on a hierarchical multidomain blockchain network and a secondary check mechanism. The proposed system can support the timely correction and replacement of malicious nodes through the joint governance of regional nodes, auxiliary verification of supervisory nodes, and arbitration of higher level regions [20]. Tian proposed a supply chain traceability system for food safety based on HACCP, blockchain, and Internet of Things; the information collection stage and food transaction stage are the purpose of the proposed system. All supply chain members are open, transparent, neutral, reliable, and secure [21].
(4) To design a data analysis technology: To design a technology to analyze data in the supply chain and predict possible food safety incidents, Wahyuni et al. used a Bayesian network to measure and analyze Processes 2021, 9, 1937 4 of 19 food safety risks. The proposed method was used to identify risk events in a case study, determine their probability, develop a Bayesian network structure, calculate condition probability tables, and analyze of food safety [22]. Dong et al. proposed a hazard analysis and critical control point (HACCP)-based system combined with fault tree analysis (FTA) to build a traceability model for the production and sales processes of vegetables. The proposed system can ensure food safety and hygiene by controlling key factors in the supply chain [23]. Hebbar used IoT technology to detect the gas released by food, and used deep learning to estimate the probability of food spoilage; if an abnormality was found, an alarm was issued [24].

Methods
This study designs and proposes an IOTA Tangle-based intelligent food safety service platform for bubble tea, which is called IF4BT. IF4BT modules and divides hazard analysis and critical control points (HACCP), and increases data transparency [25]. IF4BT aims to control food risk, including four aspects: (1) Food safety: To list all hazards in the processes, including acceptance testing of raw materials, processing, manufacturing, packaging, storage, and transportation. This ensures food hygiene and food safety, which can reduce the pathogenic bacteria and prevent food poisoning; (2) Allergens: In addition to the outer packaging being clearly marked, allergens must be controlled, and sources of allergens must be prevented from cross-contaminating the product; (3) Food defense: Prevent food from being physically, chemically, biologically, or radioactively contaminated in a deliberate manner; (4) Food fraud: Manufacturers are prohibited from making fakes, counterfeiting, diluting, false labeling, using raw materials from unknown sources, or making false or misleading claims about products.
According to the above-mentioned purpose and design concept, the scenario of IF4BT is shown in Figures 1 and 2: (1) the ingredients are tapioca balls, milk, tea, and sugar; (2) the service platform is based on IOTA Tangle; (3) the food safety management system is based on hazard analysis and critical control point (HACCP) principles; and (4) reports are in the form of audit forms, warning messages, analysis reports, and XML reports.

Participants
The participants in the HACCP-based Tangle system are consumers, manufacturers, accreditation bodies, and certification bodies.

Consumer
Consumers use QR codes to inquire about the purchased products, from raw materials to product sales across the entire food supply chain for the processing and production of the relevant details of the products. Consumers can inquire about product-related information and passed certification marks on the risk management service platform. The data and records of Tangle should comply with the General Data Protection Regulation (GDPR) principles and related Computer-Processed Personal Data Protection Law [26]. If a problematic product is purchased, it can also help consumers report to it to the manufacturer, and quickly notify consumers who have purchased the same batch of products. This can effectively accelerate the entire response to the food safety incident.

Participants
The participants in the HACCP-based Tangle system are consumers, manufacturers, accreditation bodies, and certification bodies.

Consumer
Consumers use QR codes to inquire about the purchased products, from raw materials to product sales across the entire food supply chain for the processing and production of the relevant details of the products. Consumers can inquire about product-related information and passed certification marks on the risk management service platform. The data and records of Tangle should comply with the General Data Protection Regulation (GDPR) principles and related Computer-Processed Personal Data Protection Law [26]. If a problematic product is purchased, it can also help consumers report to it to the manufacturer, and quickly notify consumers who have purchased the same batch of products. This can effectively accelerate the entire response to the food safety incident.

Participants
The participants in the HACCP-based Tangle system are consumers, manufacturers, accreditation bodies, and certification bodies.

Consumer
Consumers use QR codes to inquire about the purchased products, from raw materials to product sales across the entire food supply chain for the processing and production of the relevant details of the products. Consumers can inquire about product-related information and passed certification marks on the risk management service platform. The data and records of Tangle should comply with the General Data Protection Regulation (GDPR) principles and related Computer-Processed Personal Data Protection Law [26]. If a problematic product is purchased, it can also help consumers report to it to the manufacturer, and quickly notify consumers who have purchased the same batch of products. This can effectively accelerate the entire response to the food safety incident.

Manufacturer
Manufacturers upload relevant documents such as product information, production and sales history, inspection reports, etc., to Tangle. IOTA Tangle distributes and synchronizes ledgers of data in secure, distributed, decentralized, and permissionless environments [9]. After data, transactions, and documents have been uploaded to the IF4BT, they cannot be tampered with. Hence, this information on data, transactions and documents is helpful for audit and improvement processes. When performing an internal audit, it can be supplemented with other equipment information to perform the audit based on HACCP. Furthermore, it allows third-party auditors to quickly understand the on-site situation, and the certification body can continue to understand whether the company's audit behavior is correct. In this way, its production process can be confirmed, as can its coaching improvement status.

Accreditation Body and Certification Body
The accreditation body and certification body mainly store the third-party audit status, missing content, missing correction status, and actual factory visits of the manufacturer to Tangle. In addition, allowing the certification body to objectively understand the internal audit situation of the manufacturer requires (1) reconfirming the internal audit of the manufacturer based on the collected data; (2) allowing the certification body to objectively observe production status; (3) pre-preparation before the actual visit to the factory; and (4) helping auditors to perform key audits in accordance with the established important audit procedures.
Therefore, in order to enable effective communication and sharing of information, relevant information on manufacturers is uploaded to the main chain of the risk management service platform for all relevant parties to view relevant reports.

Risk Management Service Platform
The risk management service platform is responsible for constructing and maintaining subchains, allowing users to upload raw data, documents, and reports. After transactions are uploaded to the subchain, data of transactions are normalized by the data normalization unit and evaluated through the constructed rules. If an exception occurs, warning messages will be issued. All uploaded data are formatted and normalized by the data normalization unit, providing it to the deep-learning-oriented food safety, risk, reasoning, and analysis model for reasoning out rules. The working flow of the data normalization unit is shown in Figure 3. After data being collected from the supply chain, they are manipulated by the normalization phase and Shapes Constraint Language (SHACL) validation. Data are tagged by XML tags. These data are changed to resource description framework (RDF) serialization formats, which can be stored in Tangle using the same data format.

Manufacturer
Manufacturers upload relevant documents such as product information, production and sales history, inspection reports, etc., to Tangle. IOTA Tangle distributes and synchronizes ledgers of data in secure, distributed, decentralized, and permissionless environments [9]. After data, transactions, and documents have been uploaded to the IF4BT, they cannot be tampered with. Hence, this information on data, transactions and documents is helpful for audit and improvement processes. When performing an internal audit, it can be supplemented with other equipment information to perform the audit based on HACCP. Furthermore, it allows third-party auditors to quickly understand the on-site situation, and the certification body can continue to understand whether the company's audit behavior is correct. In this way, its production process can be confirmed, as can its coaching improvement status.

Accreditation Body and Certification Body
The accreditation body and certification body mainly store the third-party audit status, missing content, missing correction status, and actual factory visits of the manufacturer to Tangle. In addition, allowing the certification body to objectively understand the internal audit situation of the manufacturer requires (1) reconfirming the internal audit of the manufacturer based on the collected data; (2) allowing the certification body to objectively observe production status; (3) pre-preparation before the actual visit to the factory; and (4) helping auditors to perform key audits in accordance with the established important audit procedures.
Therefore, in order to enable effective communication and sharing of information, relevant information on manufacturers is uploaded to the main chain of the risk management service platform for all relevant parties to view relevant reports.

Risk Management Service Platform
The risk management service platform is responsible for constructing and maintaining subchains, allowing users to upload raw data, documents, and reports. After transactions are uploaded to the subchain, data of transactions are normalized by the data normalization unit and evaluated through the constructed rules. If an exception occurs, warning messages will be issued. All uploaded data are formatted and normalized by the data normalization unit, providing it to the deep-learning-oriented food safety, risk, reasoning, and analysis model for reasoning out rules. The working flow of the data normalization unit is shown in Figure 3. After data being collected from the supply chain, they are manipulated by the normalization phase and Shapes Constraint Language (SHACL) validation. Data are tagged by XML tags. These data are changed to resource description framework (RDF) serialization formats, which can be stored in Tangle using the same data format. Tangle uses decentralized application (DApp) to ensure that the decentralization of the data provided by the participants cannot be controlled or tampered with by a part- Tangle uses decentralized application (DApp) to ensure that the decentralization of the data provided by the participants cannot be controlled or tampered with by a partner. Therefore, Tangle can be used to improve the speed and reliability of tracking and tracing Processes 2021, 9, 1937 7 of 19 when food safety incidents occur, and to assist in the construction of inference models based on collected data. The backbone of IF4BT is based on IOTA Tangle-a distributed ledger based on Tangle, which is illustrated in Figure 4. IOTA Tangle is based on a directed acyclic graph (DAG) [9,27]. IOTA Tangle is an improved technology based on blockchain; it is also a distribute-and-synchronize ledger platform that uses blocks and connects the blocks in series [11]. 
IOTA improves the transaction speed and power consumption; hence, IOTA Tangle is helpful for data transmission and connection by multiple IoT devices [28][29][30]. Buffered blocks can be quickly proofed and added to Tangle [31,32]. Furthermore, the unspent transaction output (UTXO) model is responsible for making transactions with the accounting balance, where the indicated total inputs are equal to the total outputs and total unspent outputs. The unspent transaction output model is illustrated in Figure 5. UTXO can prevent transaction conflicts, making transactions more effectively.

SHACL Validation
Processes 2021, 9, 1937 7 of 19 ner. Therefore, Tangle can be used to improve the speed and reliability of tracking and tracing when food safety incidents occur, and to assist in the construction of inference models based on collected data. The backbone of IF4BT is based on IOTA Tangle-a distributed ledger based on Tangle, which is illustrated in Figure 4. IOTA Tangle is based on a directed acyclic graph (DAG) [9,27]. IOTA Tangle is an improved technology based on blockchain; it is also a distribute-and-synchronize ledger platform that uses blocks and connects the blocks in series [11]. IOTA improves the transaction speed and power consumption; hence, IOTA Tangle is helpful for data transmission and connection by multiple IoT devices [28][29][30]. Buffered blocks can be quickly proofed and added to Tangle [31,32]. Furthermore, the unspent transaction output (UTXO) model is responsible for making transactions with the accounting balance, where the indicated total inputs are equal to the total outputs and total unspent outputs. The unspent transaction output model is illustrated in Figure 5. UTXO can prevent transaction conflicts, making transactions more effectively.

Confirmed Blockchain Buffered Blocks
One Block Multiple Blocks

Cloud-Based Deep Learning Service Platform
A cloud-based deep learning service platform is responsible for identifying the influential factors on food safety and risk management. The inference engine is called the deep-learning-oriented food safety, risk, reasoning, and analysis model (DORAM), which is composed of a heterogeneous data integration unit, a weight-added class-imbalance learning unit, and a deep learning training unit, and is used to identify significant rare data. Significant rare data are defined as rarely occurring, and being a serious matter once they do. Figure 6 illustrates the working flow of the deep-learning-oriented food safety, risk, reasoning, and analysis model. Processes 2021, 9, 1937 7 of 19 ner. Therefore, Tangle can be used to improve the speed and reliability of tracking and tracing when food safety incidents occur, and to assist in the construction of inference models based on collected data. The backbone of IF4BT is based on IOTA Tangle-a distributed ledger based on Tangle, which is illustrated in Figure 4. IOTA Tangle is based on a directed acyclic graph (DAG) [9,27]. IOTA Tangle is an improved technology based on blockchain; it is also a distribute-and-synchronize ledger platform that uses blocks and connects the blocks in series [11]. IOTA improves the transaction speed and power consumption; hence, IOTA Tangle is helpful for data transmission and connection by multiple IoT devices [28][29][30]. Buffered blocks can be quickly proofed and added to Tangle [31,32]. Furthermore, the unspent transaction output (UTXO) model is responsible for making transactions with the accounting balance, where the indicated total inputs are equal to the total outputs and total unspent outputs. The unspent transaction output model is illustrated in Figure 5. UTXO can prevent transaction conflicts, making transactions more effectively.

Confirmed Blockchain Buffered Blocks
One Block Multiple Blocks

Cloud-Based Deep Learning Service Platform
A cloud-based deep learning service platform is responsible for identifying the influential factors on food safety and risk management. The inference engine is called the deep-learning-oriented food safety, risk, reasoning, and analysis model (DORAM), which is composed of a heterogeneous data integration unit, a weight-added class-imbalance learning unit, and a deep learning training unit, and is used to identify significant rare data. Significant rare data are defined as rarely occurring, and being a serious matter once they do. Figure 6 illustrates the working flow of the deep-learning-oriented food safety, risk, reasoning, and analysis model.

Cloud-Based Deep Learning Service Platform
A cloud-based deep learning service platform is responsible for identifying the influential factors on food safety and risk management. The inference engine is called the deep-learning-oriented food safety, risk, reasoning, and analysis model (DORAM), which is composed of a heterogeneous data integration unit, a weight-added class-imbalance learning unit, and a deep learning training unit, and is used to identify significant rare data. Significant rare data are defined as rarely occurring, and being a serious matter once they do. Figure 6 illustrates the working flow of the deep-learning-oriented food safety, risk, reasoning, and analysis model.  Figure 6. The working flow of the deep-learning-oriented food safety, risk, reasoning, and analysis model.

Heterogeneous Data Integration Unit
The heterogeneous data integration unit is responsible for data normalization and integration [33]. Figure 7 illustrates a detailed flowchart of the heterogeneous data integration unit. The data formats generated by each platform are different from the table defined in each database. If the raw data are not filtered and formatted, the cost of subsequent operations will be greatly increased in data allocation, formatting, and acquisition. Data are tagged as three different types, including real-time data, audit data, and law data. After tagging, data are packaged and then delivered to subchains and the weight-added class-imbalance learning unit.

Weight-Added Class-Imbalance Learning Unit
The weight-added class-imbalance learning unit is responsible for observing and extracting the significant rare data of high risk. The main purpose of this unit is to establish hypotheses for observing and filtering significant rare data via automatic neighborhood size determination-synthetic minority oversampling technique (AND-SMOTE), edited nearest neighbors (ENN), agglomerative hierarchical clustering (AHC), ordering points to identify the clustering structure (OPTICS), and affinity propagation (AP). The significant rare data are not produced easily, but have a high effect on the food safety. When a manufacturer performs production-related activities, they follow standard procedures. IF4BT monitors exceptional events, and uses the weight-added mode to accurately analyze the occurrence of food safety risk factors. Therefore, this study proposes a weight-added class-imbalance learning unit to solve this problem. The original data use AND-SMOTE to generate new data points, strengthening the weight value of all data for analysis, such that the information will not be ignored because some data are too small [34]. After generating new relevant data points, the weight values of all data are readjusted, using feature extraction algorithms to reduce the amount of data that need to be calculated and to maintain the original appearance of the data [35]. Figure 8 illustrates the flowchart of the weight-added class-imbalance learning unit, including the oversampling phase, the downsampling phase, and the create hypothesis phase.

Heterogeneous Data Integration Unit
The heterogeneous data integration unit is responsible for data normalization and integration [33]. Figure 7 illustrates a detailed flowchart of the heterogeneous data integration unit. The data formats generated by each platform are different from the table defined in each database. If the raw data are not filtered and formatted, the cost of subsequent operations will be greatly increased in data allocation, formatting, and acquisition. Data are tagged as three different types, including real-time data, audit data, and law data. After tagging, data are packaged and then delivered to subchains and the weight-added class-imbalance learning unit.  Figure 6. The working flow of the deep-learning-oriented food safety, risk, reasoning, and analysis model.

Heterogeneous Data Integration Unit
The heterogeneous data integration unit is responsible for data normalization and integration [33]. Figure 7 illustrates a detailed flowchart of the heterogeneous data integration unit. The data formats generated by each platform are different from the table defined in each database. If the raw data are not filtered and formatted, the cost of subsequent operations will be greatly increased in data allocation, formatting, and acquisition. Data are tagged as three different types, including real-time data, audit data, and law data. After tagging, data are packaged and then delivered to subchains and the weight-added class-imbalance learning unit.

Weight-Added Class-Imbalance Learning Unit
The weight-added class-imbalance learning unit is responsible for observing and extracting the significant rare data of high risk. The main purpose of this unit is to establish hypotheses for observing and filtering significant rare data via the automatic neighborhood size determination synthetic minority oversampling technique (AND-SMOTE), edited nearest neighbors (ENN), agglomerative hierarchical clustering (AHC), ordering points to identify the clustering structure (OPTICS), and affinity propagation (AP). The significant rare data are produced only rarely, but have a high effect on food safety. When a manufacturer performs production-related activities, they follow standard procedures. IF4BT monitors exceptional events, and uses the weight-added mode to accurately analyze the occurrence of food safety risk factors. Therefore, this study proposes a weight-added class-imbalance learning unit to solve this problem. AND-SMOTE is applied to the original data to generate new data points, strengthening the weight value of the minority data for analysis, such that the information is not ignored because some classes contain too few samples [34]. After generating the new data points, the weight values of all data are readjusted, using feature extraction algorithms to reduce the amount of data that needs to be calculated while maintaining the original appearance of the data [35]. Figure 8 illustrates the flowchart of the weight-added class-imbalance learning unit, including the oversampling phase, the downsampling phase, and the create hypothesis phase.

(1) Oversampling phase
In the oversampling phase, although the data have already been processed by the heterogeneous data integration unit, the AND-SMOTE algorithm is responsible for increasing the balance and weight of the significant rare data in order to increase their effects. The AND-SMOTE algorithm is composed of automatic neighborhood size determination (AND) and the synthetic minority oversampling technique (SMOTE).
The AND uses characteristic values to divide the data into appropriately sized neighborhoods, as shown in Equations (1)-(3), where i and j are counters, m is the number of samples, k is the neighborhood size, and T is the number of minority class samples.
After determining the neighborhood size k for SMOTE, x_ref is an object randomly selected from the minority neighborhood of x found by AND. The magnification α is a random variable between 0 and 1. Equation (4) is used to determine the synthetic point x_new, and Equation (5) is used to produce a new dataset S that increases the weighting of the original x.
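As a concrete illustration of Equations (4) and (5), the following Python sketch generates synthetic minority points along the segment between a sample x and a randomly chosen minority neighbor x_ref, i.e., x_new = x + α(x_ref − x). The function name, the plain-list data layout, and the use of squared Euclidean distance are illustrative assumptions, not the paper's implementation.

```python
import random

def smote_oversample(minority, k_neighbors, n_new, seed=0):
    """Sketch of SMOTE (Eqs. (4)-(5)): x_new = x + alpha * (x_ref - x),
    where x_ref is drawn from the k nearest minority neighbours of x."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (squared Euclidean distance)
        neighbors = sorted((p for p in minority if p != x),
                           key=lambda p: sum((a - b) ** 2 for a, b in zip(p, x)))
        x_ref = rng.choice(neighbors[:k_neighbors])
        alpha = rng.random()  # magnification alpha in [0, 1)
        synthetic.append(tuple(xi + alpha * (ri - xi) for xi, ri in zip(x, x_ref)))
    return minority + synthetic  # the enlarged dataset S
```

Because each synthetic point is a convex combination of two minority samples, the oversampled set stays inside the region spanned by the original minority class.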
(2) Downsampling phase
This phase uses the edited nearest neighbors (ENN) algorithm to clean the database by removing samples close to the decision boundary, based on the heterogeneous value difference metric (HVDM) [36,37]. HVDM considers the similarity of each value to calculate the distances between values. The ENN algorithm weights neighboring points based on the distance between each point and the target point Sq. The weight value is calculated by the ENN algorithm, as described in Equation (6).
C represents the 1 to k datasets associated with Sq. f(x) is a parameter depending on the type of x. Ŝ is the dataset of the k points closest to Sq, and the distance function d calculates their similarity. δ(a, b) is a parameter depending on a and b: if a is equal to b, δ(a, b) = 1; otherwise, δ(a, b) = 0.
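The ENN cleaning rule above can be sketched as follows: a sample is dropped when the majority of its k nearest neighbors carry a different label. For this numeric-only sketch, plain squared Euclidean distance stands in for HVDM; the function name and list layout are assumptions for illustration.

```python
def enn_clean(samples, labels, k=3):
    """Edited nearest neighbours: keep sample i only when at least half of
    its k nearest neighbours share its label (Euclidean here; the paper
    uses HVDM).  Returns the indices of the kept samples."""
    kept = []
    for i, (x, y) in enumerate(zip(samples, labels)):
        # squared distance and label of every other sample, nearest first
        neigh = sorted(
            (sum((a - b) ** 2 for a, b in zip(x, samples[j])), labels[j])
            for j in range(len(samples)) if j != i)
        votes = [lab for _, lab in neigh[:k]]
        if votes.count(y) * 2 >= k:  # neighbourhood agrees often enough
            kept.append(i)
    return kept
```

A point labeled 1 that sits inside a cluster of label-0 points is removed, which is exactly the boundary-cleaning behavior the downsampling phase relies on.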
(3) Create hypothesis phase
This phase is responsible for identifying the high-effect significant rare data via a constructed hypothesis, avoiding incorrect results. Hence, this phase is composed of three algorithms: agglomerative hierarchical clustering (AHC), ordering points to identify the clustering structure (OPTICS), and affinity propagation (AP).
Agglomerative hierarchical clustering (AHC) is used to group objects into clusters based on their similarity [38]. Through repeated iterative evolutions, all data are aggregated and classified according to the weights of each layer and, finally, a tree structure is produced. This study used Ward's minimum variance method to construct the agglomerative hierarchical clustering, as shown in Formula (7). The data form clusters, which are combined according to their total variance; Ward's method merges the pair of clusters with the smallest increase in total variance.
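Ward's criterion can be sketched in a few lines: the merge cost of two clusters equals the increase in total within-cluster variance their union would cause, and the cheapest pair is merged repeatedly. This is a minimal pure-Python illustration under that assumption, not the paper's implementation.

```python
def ward_agglomerative(points, n_clusters):
    """Agglomerative clustering with Ward's minimum-variance criterion:
    repeatedly merge the pair of clusters whose union increases the
    total within-cluster variance the least."""
    clusters = [[p] for p in points]

    def centroid(c):
        return tuple(sum(col) / len(c) for col in zip(*c))

    def ward_cost(a, b):
        # variance increase of merging a and b:
        # |a||b| / (|a|+|b|) * ||centroid(a) - centroid(b)||^2
        ca, cb = centroid(a), centroid(b)
        d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * d2

    while len(clusters) > n_clusters:
        i, j = min(((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
                   key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

Stopping at a target cluster count corresponds to cutting the tree produced by the iterative merging at a fixed level.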
Ordering points to identify the clustering structure (OPTICS) is an algorithm for finding density-based clusters in spatial data [39]. The OPTICS algorithm needs two parameters: ε and MinPts. ε is the maximum distance (radius) centered on a point, while MinPts is the minimum number of points that must be included within that range. ε and MinPts define the core point p, as shown in Equation (8). To calculate the distance from p to its MinPts-th neighbor, and to calculate the reachability (the distance at which p can be reached from another point o), the expressions for the core distance and reachability distance are shown in Equations (9) and (10), respectively.
Through constant iteration of the above formulae, each point o is assigned to the group of its most similar point p. The algorithm continuously calculates and divides the weight value of each candidate point p to each remaining point o, outputting the result when no independent or unclassified point o remains in the space. The OPTICS algorithm does not need the number of clusters to be specified in advance; it is a clustering algorithm based on data density, so it can find highly distinctive cluster centers that are suitable for constructing hypotheses.
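The two quantities in Equations (9) and (10) can be written directly: the core distance is the distance from p to its MinPts-th nearest neighbor (undefined when too few neighbors lie within ε), and the reachability distance of o from p is the larger of dist(p, o) and p's core distance. Function names are illustrative; Euclidean distance is assumed.

```python
import math

def dist(p, o):
    return math.dist(p, o)  # Euclidean distance

def core_distance(p, data, min_pts, eps):
    """Core distance (Eq. (9)): distance from p to its MinPts-th nearest
    neighbour, or None when fewer than MinPts neighbours lie within eps."""
    dists = sorted(dist(p, o) for o in data if o != p)
    in_eps = [d for d in dists if d <= eps]
    if len(in_eps) < min_pts:
        return None  # p is not a core point
    return in_eps[min_pts - 1]

def reachability_distance(p, o, data, min_pts, eps):
    """Reachability distance (Eq. (10)): max(core-distance(p), dist(p, o))."""
    cd = core_distance(p, data, min_pts, eps)
    return None if cd is None else max(cd, dist(p, o))
```

OPTICS orders the points by increasing reachability distance; valleys in that ordering correspond to dense clusters.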
The affinity propagation (AP) algorithm uses the similarity and attractiveness of each point to group the data [40]. The best clustering result is based on multiple iterations and weight corrections. The calculation process sets all sample points as candidate group centers in order to find the exemplar point that best represents the center of each group, and establishes the similarity between all points. Equation (11) describes the similarity. After the similarity is calculated, the iterative method is used to continuously adjust the information on the degree of attraction between the sample points, i.e., the responsibility and the availability, as shown in Formulae (12) and (13).
a(i, k) ← min(0, r(k, k) + ∑_{i′∉{i,k}} max(0, r(i′, k))) for i ≠ k, and a(k, k) ← ∑_{i′≠k} max(0, r(i′, k)) (13)

where r(·) is the responsibility and a(·) is the availability. By continuously correcting r(i, k), a(i, k), and the position of the group center until all of the data stop changing, the final group center position can be found.
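The damped responsibility/availability iteration of Formulae (12) and (13) can be sketched in NumPy as below. The similarity matrix, damping factor, and iteration count are assumptions for illustration; each point's exemplar is read off as argmax_k (a(i,k) + r(i,k)).

```python
import numpy as np

def affinity_propagation(S, n_iter=200, damping=0.5):
    """Iterate the responsibility (Eq. (12)) and availability (Eq. (13))
    updates on a similarity matrix S; point i's exemplar is
    argmax_k (a(i,k) + r(i,k))."""
    n = S.shape[0]
    R = np.zeros((n, n))
    A = np.zeros((n, n))
    rows = np.arange(n)
    for _ in range(n_iter):
        # r(i,k) <- s(i,k) - max_{k' != k} (a(i,k') + s(i,k'))
        AS = A + S
        best = AS.argmax(axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new
        # a(i,k) <- min(0, r(k,k) + sum_{i' not in {i,k}} max(0, r(i',k)))
        Rp = np.maximum(R, 0)
        Rp[rows, rows] = R[rows, rows]           # keep r(k,k) on the diagonal
        col = Rp.sum(axis=0)
        A_new = np.minimum(0, col[None, :] - Rp)
        A_new[rows, rows] = col - R[rows, rows]  # a(k,k) has no min(0, .) cap
        A = damping * A + (1 - damping) * A_new
    return (A + R).argmax(axis=1)
```

The diagonal of S (the "preference") controls how many exemplars emerge: lower preferences yield fewer, larger groups.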

Deep Learning Training Unit
The deep learning training unit is based on long short-term memory mechanisms, and is responsible for observing the hypotheses of the reasoned effect factors and significant rare data. A traditional artificial neural network manipulates data at a single point in time. Long short-term memory combines the data of past events in order to comprehensively consider the correlation between time and data. Using a Siamese network, the data of a past time section and a new time section can be converted into comparable values in order to further determine whether the production situation has changed. Hence, the key advantage of the deep learning training unit is that it does not use data at a single point in time as the basis for judgment, but combines historical data to analyze new event conditions and understand the current status by comparing data over a period of time. Figure 9 illustrates a detailed flowchart of the deep learning training unit.
Long short-term memory networks are used to classify, process, and make predictions based on time-series data [41]; they can handle the unknown relationships between important events in a time series, and work via multiple cells, each with a forget gate, an input gate, and an output gate. Hence, this study adapted them to manipulate the significant rare data and their effects. Figure 10 illustrates the working flow of the long short-term memory network.
(1) The first step is to decide whether the information needs to be forgotten. This decision is determined by σ, the forget gate, as shown in Equation (14). The value of σ applied to the previous output and the current input lies between 0 and 1. If f_t is 1, the memory is completely retained; if f_t is 0, the memory is completely forgotten;
(2) Data will be recorded and updated by the value of σ, between 0 and 1, from the input gate. If i_t is 1, it means complete replacement; if i_t is 0, no replacement is allowed. These processes are shown in Equations (15) and (16). After determining i_t, the candidate value C̃_t is created via tanh() to update the cell state; (3) C_t is a new state that combines C_{t−1} and C̃_t, as described in Equation (17): C_t = f_t * C_{t−1} + i_t * C̃_t. f_t determines the influence of previous memories from the forget gate, while i_t determines the influence of the current state from the input gate; (4) Finally, the hypothesis value h(t) is used to evaluate whether the significant rare data are memorized or forgotten. The detailed formulae are described in Equations (18) and (19). After constructing the long short-term memory network, this study used a Siamese network to evaluate the status and to check the consistency of auditing and HACCP [42,43]. A schematic of the Siamese network is shown in Figure 11.
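Steps (1)-(4) above, i.e., Equations (14)-(19), amount to one LSTM cell update, which can be sketched in NumPy as follows. The weight layout (all four gates stacked in one matrix W) and the function names are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM cell step following Eqs. (14)-(19): forget gate f_t,
    input gate i_t, candidate C~_t, new cell state C_t, output h_t."""
    z = W @ np.concatenate([h_prev, x_t]) + b  # shared affine transform
    H = h_prev.size
    f_t = sigmoid(z[0:H])                      # Eq. (14): forget gate
    i_t = sigmoid(z[H:2 * H])                  # Eq. (15): input gate
    c_tilde = np.tanh(z[2 * H:3 * H])          # Eq. (16): candidate value
    c_t = f_t * c_prev + i_t * c_tilde         # Eq. (17): new cell state
    o_t = sigmoid(z[3 * H:4 * H])              # Eq. (18): output gate
    h_t = o_t * np.tanh(c_t)                   # Eq. (19): hypothesis value
    return h_t, c_t
```

Running the step over a sequence and carrying (h_t, c_t) forward is what lets the unit weigh past production events against the current observation.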
After factors are calculated via the Siamese network formed by the long short-term memory model, the fully connected layer evaluates food safety factors, allergen factors, food defense factors, and food fraud factors. Figure 12 illustrates the fully connected layer used to evaluate the scenario of food safety. Figure 13 illustrates each node of the fully connected layer in detail.
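The Siamese comparison can be sketched as two branches sharing one set of encoder weights: both time sections are mapped into the same space, and a similarity score is taken from the distance between the embeddings. The linear-plus-tanh encoder and the exp(-L1) score below are illustrative assumptions, not the paper's exact architecture.

```python
import numpy as np

def encode(x, W):
    """Shared-weight encoder: both branches of the Siamese network use the
    same W, so two time sections are mapped into one comparable space."""
    return np.tanh(W @ x)

def siamese_similarity(x_past, x_new, W):
    """Similarity score in (0, 1]: exp(-L1 distance) between embeddings;
    values near 1 suggest the production situation has not changed."""
    d = np.abs(encode(x_past, W) - encode(x_new, W)).sum()
    return float(np.exp(-d))
```

Because the weights are shared, a low score reflects a genuine change between the past and new sections rather than a difference between two separately trained encoders.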
The function of the fully connected layer is to multiply the value of each connected upper-layer node by the weight of its edge, then combine all of the products and the bias to obtain the value of the node [44]. Finally, the result of the tanh function, with a value between 1 and −1, is output, as shown in Equation (20):

Output = tanh(∑ (X_n * W_n) + bias) (20)
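Equation (20) is a single weighted sum squashed by tanh, which can be written directly; the function name is an illustrative assumption.

```python
import math

def fc_node(inputs, weights, bias):
    """Fully connected node (Eq. (20)): Output = tanh(sum(X_n * W_n) + bias)."""
    return math.tanh(sum(x * w for x, w in zip(inputs, weights)) + bias)
```

The tanh squashing keeps every node's output in (−1, 1), so downstream factor scores remain on a common scale.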

Processes 2021, 9, 1937 13 of 19
Hence, this study designed an inference engine to collect and classify relevant data using the heterogeneous data integration unit. The weight-added class-imbalance learning unit is used to find important significant rare data. The deep learning training unit is used to extract the significant rare data and to generate a deep learning model that identifies factors related to food safety risks.

Results and Discussion
This section presents the IOTA Tangle-based intelligent food safety service platform for bubble tea, including the significant rare data and system interfaces. The current status of each batch number can be seen in the system, and the boxes show the current status of the Tangle system. Figure 14 illustrates the states of Tangle. The green box means that the data are uploaded to Tangle, the orange box means that the data are missing, and the gray box means that the data are scheduled for production jobs. Each circle represents an HACCP audit point. Green indicates processes that have been completed and passed the CCP, blue indicates current working processes, and gray indicates waiting processes. Figure 15 illustrates the detailed hazard analysis and critical control points of the tea. The production CCPs are checked and recorded. Table 2 shows an example of the HACCP for bubble tea ingredients.

IF4BT observes and extracts significant rare data that are high risk via the deep learning inference engine. Figure 16 illustrates the significant rare data of high risk. When a rare factor is found, IF4BT informs the manager. Although this point is not included in the CCP, it should be considered a factor of the CCP. Figure 17 illustrates the system interface of each CCP. Each CCP is listed by batch number.
The green circle means that the process is completed and has passed the CCP. The blue circle is the current working process. The gray circle is a waiting process. If the internal check is finished, the status, the system time, and the signature are recorded. Figure 18 illustrates a detailed report of a CCP. If the manager wants detailed information on a check point, they can click the check point to receive it. This shows who performed the check, the detailed timestamp of the check, and the signature of the checker. Finally, the result of the check is displayed.


Conclusions
This study presents an IOTA Tangle-based intelligent food safety service platform for bubble tea (IF4BT), which modularizes and integrates hazard analysis and critical control point (HACCP) principles to increase data transparency. The inference engine is based on a deep learning mechanism composed of long short-term memory, a Siamese network, and a fully connected layer. IF4BT enables inspectors to quickly carry out daily inspections in accordance with the HACCP procedures. All data are stored in Tangle for fast tracking and tracing. Five problems are considered and overcome: convenience, consistency, effectiveness, scalability, and lightweight computing. (1) IF4BT uses a web-based service platform to overcome the problem of convenience; (2) IF4BT uses distributed ledger techniques to overcome the problem of data consistency; (3) IF4BT uses IOTA Tangle to overcome the problems of effectiveness and scalability; (4) IF4BT uses lightweight equipment (a Raspberry Pi) to automatically set reasoned rules via the cloud-based deep learning service platform to overcome the problem of lightweight computing. Hence, this study makes four contributions: (1) developing a verification unit for food safety risk factors, with IF4BT responsible for collecting the manufacturer's usual production information and internal and external audit information in order to analyze the risk factors; (2) constructing an HACCP-based food safety management system that is highly flexible for ISO 22000, FSSC 22000, SQF, and BRCGS; (3) designing a deep learning inference engine responsible for identifying the significant rare data of the risk factors for food safety; and (4) maintaining data consistency and transparency.
Using the tamperproof mechanism of the IOTA Tangle platform, the internal inspection data are input to the food safety supply chain, including raw materials, production processes, processing, sales, and audit data, on a daily basis in order to ensure the correctness of the information of the food manufacturer. Each check is an audit to increase food safety and reduce food safety issues such as food expiration, food defense, and food fraud.
Three areas of study are proposed for future researchers: (1) to design and develop a knowledge base, which increases the HACCP domain knowhow for other products and can improve the flexibility of the proposed intelligent food safety service platform; (2) to add other food safety certifications, which increases data compatibility; and (3) to integrate the business flow, logistics flow, information flow, and money flow in one service platform. Finally, IF4BT can serve as a reference model for researchers and engineers.