Integrating Machine Learning, Radio Frequency Identification, and Consignment Policy for Reducing Unreliability in Smart Supply Chain Management

Abstract: Adopting smart technologies for supply chain management leads to higher profits. The manufacturer and the retailer are the two supply chain players, where the retailer is unreliable and may not send accurate demand information to the manufacturer. As an advanced smart technology, Radio Frequency Identification (RFID) is implemented to track and trace each product's movement in the inventory on a real-time basis, which turns the supply chain into a smart supply chain. This research proposes a Machine Learning (ML) approach for demand forecasting under smart supply chain management. Using Long Short-Term Memory (LSTM), the demand is forecasted to obtain accurate demand information and reduce overstock and understock situations. A measurement of the environmental effect is also incorporated into the model. A consignment policy is applied in which the manufacturer controls the inventory, and the retailer receives a fixed fee along with a commission for each product sold. The manufacturer installs RFID technology at the retailer's place. Two mathematical models are solved using a classical optimization technique. The results from the two models show that the ML-RFID model gives a higher profit than the existing traditional system.


Introduction
In supply chain management (SCM), each manufacturer observes customer demand to decide production quantity. If the demand is uncertain, either overstock or understock situations occur at the stockholder. The supply chain players play an essential role in this situation. Demand uncertainty can be caused if the players are unreliable or if they are not exchanging proper information between them. Depending on the demand, the manufacturer produces products and sends those products to the retailer for selling. The retailer controls the overall inventory and sends money to the manufacturer after selling the products. However, sometimes, the retailer is untrustworthy. The retailer does not provide accurate information to the manufacturer. In this case, a consignment policy is applied to reduce the information asymmetry between supply chain players. Additionally, using smart technology, such as RFID, gives an extra advantage for getting real-time information about the products in the inventory. Incorporating these features takes supply chain management to an advanced level as compared to traditional systems.
A consignment policy is an approach where the manufacturer manages the financial part, and the retailer maintains the operational functions. In this policy, the manufacturer acts as a leader, and the retailer acts as a follower. The overall inventory is controlled by the manufacturer only. The retailer sells the products and sends the revenue to the manufacturer [1,2]. The manufacturer provides specific incentives to the retailer for each product sold [3]. By mutual agreement between the two players, unsold products are returned to the legal owner. Additionally, the manufacturing process needs to be free of defects to improve the quality of the products (Sarkar and Sarkar [4]). A service-level constraint for quality improvement was introduced by Moon et al. [5]. By implementing a service-level restriction in this supply chain model, the product quality is improved by optimizing the model through a distribution-free approach [6]. This improved service strategy for the consignment stock is applied in this research to extend the traditional model so that it gives a higher profit.
A typical warehouse has certain limitations when tracking and tracing products, which causes information asymmetry and unnecessary time consumption. As an advanced smart technology, RFID is implemented to observe every movement of each product in the inventory [7]. The real-time tracking, tracing, and inspecting of products are done in a completely automatic procedure with minimum human interaction and minimum labor cost. This automation policy allows for smart supply chain management in the model [8]. Another advantage of RFID is that the tags can be reused, and some amount of data can be stored on them if needed [9]. RFID tags are attached to products and traced by scanners mounted in the warehouse. The real-time information of each product is sent to a server, and users can get information about the inventory from the server by using a computer or smartphone. RFID implementation in supply chain management for reducing unreliability and increasing the overall profit was briefly described by Biswal et al. [10].
Demand forecasting plays a crucial role in this smart supply chain model. When the retailer is unreliable and does not send accurate demand information to the manufacturer, demand uncertainty arises [11]. Thus, the manufacturer cannot predict the exact market demand. In this situation, the holding cost or shortage cost increases, and the total profit of the supply chain management system decreases [12]. Based on previous demand data, LSTM is applied to forecast future demand. A multi-layer LSTM network was proposed by Abbasimehr et al. [13], which can predict volatile demand data by considering various combinations of LSTM hyperparameters for a given time series. Using the expected demand leads to a higher profit than the traditional model [14]. The forecasted demand creates an essential competitive advantage for generating superior business performance [15]. As information asymmetry makes this type of smart supply chain management unreliable, this study maximizes the total profit of the entire smart supply chain by minimizing the unreliability between supply chain players.
In the existing traditional system, the retailer is unreliable and does not provide accurate inventory information to the manufacturer. As a result, information asymmetry arises between supply chain players, and the manufacturer does not have precise demand information for future production. The main contribution of this research is forecasting the future demand using an LSTM method and incorporating RFID and a consignment policy into the traditional model so that the profit is higher than in the traditional system. As information asymmetry arises in the traditional model, RFID is implemented to get accurate real-time information about the inventory and thereby reduce unreliability. A consignment policy is applied in which the manufacturer controls the overall inventory, and the retailer sells the products and sends the money to the manufacturer. The manufacturer provides a fixed amount of money and a commission to the retailer for each product sold. As the demand is forecasted using LSTM, the manufacturer produces goods based on the predicted future demand [16]. In this case, the holding cost or the shortage cost is minimized, and the total profit is much higher than in the existing traditional model.
Section 1 contains the introduction and the contribution of authors, along with the problem definition, assumptions, and notation of this research. The rest of the paper is organized as follows. Section 2 presents the traditional model. The ML-RFID model is formulated in Section 3. The results of this research are described in Section 4, followed by a sensitivity discussion on the parameters. Section 5 contains managerial implications, and Section 6 concludes this research.

Contribution of Authors
An author's contribution table based on keywords and literature is described in Table 1 which clearly illustrates the contrast of this research with previous research.

Problem Definition, Assumptions, and Notation
The problem definition of this study is demonstrated in this section. The assumptions and the notation used to mathematically validate the proposed model are described below.

Problem Definition
In this study, the difference between the traditional and ML-RFID models in smart supply chain management is described for reducing unreliability. Two supply chain players, i.e., the manufacturer and the retailer, play an essential role but have asymmetric power. Because the retailer is unreliable, the Stackelberg game approach is applied to determine the decision-maker. The demand depends on the service, and the future demand is forecasted with a machine learning technique, i.e., LSTM. Under a consignment policy, a fixed fee is given to the retailer by the manufacturer to implement RFID, which reduces the players' information asymmetry by providing accurate information about the inventory. The retailer sells the products, sends the revenue to the manufacturer, and then gets a commission for each product sold. By implementing RFID and a consignment policy together with machine learning, this research obtains a higher profit than the traditional model and turns the supply chain into a smart supply chain.

Assumptions
The following assumptions are used for this research.

1. A supply chain management is considered with manufacturer and retailer as two players.
2. Demand depends on the service b, and it is a decision variable.
3. In the traditional model, the inventory is controlled by the retailer. In the ML-RFID model, the manufacturer controls the inventory by giving a fixed fee along with a commission to the retailer for each product sold.
4. The lead time demand does not follow a known probability distribution function but it has a known value of the mean and standard deviation [26].
5. As the retailer is unreliable, the manufacturer installs RFID technology at the retailer's place to get real-time information about the inventory [24].
6. The demand dataset consists of ten years of data, and 65% of the data are taken as the training dataset while the remaining 35% are taken as the test dataset.
7. The data in the test set do not exist in the train set and the demand is always greater than zero.
8. Using Stackelberg game theory, the manufacturer acts as a leader and controls the inventory. The retailer acts as a follower of the manufacturer [27].

Notation
The notation associated with this research is given in Appendix A.

Traditional Model
The traditional model is a conventional model where the manufacturer makes products and sends them to the retailer. In this model, the inventory is controlled by the retailer. The manufacturer receives demand information and the money after selling products from the retailer. This traditional model has some limitations related to information asymmetry. When the retailer is unreliable and does not send accurate demand information to the manufacturer, the overstock or understock situation arises.
Further, to reduce its environmental impact, each industry must follow some legislative regulations. Each industry mitigates its carbon footprint by incorporating a cap and trade policy [28]. For this purpose, a measurement of the environmental effect is considered in this model [29]. This section consists of three subsections: the retailer's traditional model, the manufacturer's traditional model, and the total expected profit for the traditional model. Figure 1 represents the process diagram for the traditional model.

Retailer's Traditional Model
In the traditional model, the retailer incurs the holding cost of the entire product inventory. The entire lot is purchased at a wholesale price by the retailer from the manufacturer. Thus, the retailer holds ownership of the products, and products are sold to the customer at a selling price. In this model, the manufacturer no longer has responsibility for the products and does not incur any holding cost for a product after making the delivery to the retailer. The demand of the retailer is $D = a b^{\gamma} e_{\eta}^{\beta}$, where $a$ is the scaling parameter, $\gamma$ and $\beta$ are shape parameters, $b$ is the service provided to the customers, and $e_{\eta}$ is the measurement of the environmental effect.
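The demand function above can be evaluated directly. The following is a minimal sketch; the function name and the numerical values are illustrative assumptions and are not taken from the paper's data.

```python
def demand(a, b, gamma, e_eta, beta):
    """Service- and environment-dependent demand D = a * b**gamma * e_eta**beta."""
    if b <= 0 or e_eta <= 0:
        raise ValueError("service b and environmental measure e_eta must be positive")
    return a * b ** gamma * e_eta ** beta


# Illustrative values only (assumed, not the paper's parameters).
print(demand(a=100.0, b=4.0, gamma=0.5, e_eta=3.0, beta=2.0))
```

Because both the service and the environmental measurement enter as powers, the shape parameters act as elasticities of demand with respect to each quantity.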
The total profit of the retailer under the traditional model is given by Equation (1). As the retailer is unreliable, the manufacturer does not get proper demand information. Thus, an overstock or understock situation arises in this traditional model. Information about the lead time demand is necessary to calculate the holding or shortage quantity during the lead time period. According to Gallego and Moon [18], even without distribution information, the expected holdings or shortages during the lead time can be bounded from above using a lemma. Using this lemma, (i) the expected quantity for overstock is given by Equation (2), and (ii) the expected quantity for understock is given by Equation (3). Using Equations (2) and (3) in Equation (1), the expected profit for the retailer can be written as Equation (4).
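The distribution-free lemma can be illustrated numerically. The sketch below uses the general form of the bound for a demand X with mean mu and standard deviation sigma and a stock level q; how the paper maps these generic symbols to lead time quantities in Equations (2) and (3) is not reproduced here, so the names `q`, `mu`, and `sigma` are assumptions.

```python
import math


def expected_shortage_bound(q, mu, sigma):
    """Upper bound on E[(X - q)^+] over all distributions with mean mu, std sigma."""
    gap = q - mu
    return 0.5 * (math.sqrt(sigma ** 2 + gap ** 2) - gap)


def expected_overstock_bound(q, mu, sigma):
    """Matching bound on E[(q - X)^+], since (q - X)^+ = (q - X) + (X - q)^+."""
    gap = q - mu
    return 0.5 * (math.sqrt(sigma ** 2 + gap ** 2) + gap)


# Generic illustrative numbers (assumed): stock 120, mean demand 100, std 15.
print(expected_shortage_bound(120, 100, 15), expected_overstock_bound(120, 100, 15))
```

The two bounds always differ by exactly q - mu, so stocking more shifts the bound from shortage to overstock without needing any distributional assumption.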

Manufacturer's Traditional Model
The manufacturer follows the optimized decision in the traditional model. The revenue of the manufacturer is $\omega Q_r$ and the manufacturing cost is $k Q_r$, where the unit production cost is denoted by $k$. The expected profit of the manufacturer under the traditional model is given by Equation (5).

Total Expected Profit for the Traditional Model
By adding Equations (4) and (5), the total expected profit under the traditional model is obtained as Equation (6). According to Sardar and Sarkar [25], the maximum profit can be obtained by taking the derivative of Equation (6) with respect to $Q_r$, $L$, $b$, and $e_{\eta}$ and setting each derivative to zero, which yields the optimum values of the decision variables.

Integrating ML, RFID, and Consignment Policy to Reduce Unreliability (ML-RFID Model)
In the ML-RFID model, products are sent to the retailer by the manufacturer, and the retailer maintains the inventory. The retailer does not have to pay any money until the products are sold. In a consignment policy, two different segments are present. In the first segment, the inventory prepares for receiving the products. In the second segment, the retailer sells the products, and the revenue is sent to the manufacturer. The Stackelberg game approach is used where the manufacturer behaves as a leader by controlling the overall inventory, and the retailer acts as a follower [30]. The manufacturer maintains the consignment stock and pays the retailer a fixed fee plus a commission for each product sold, which generates the retailer's revenue. Implementation of RFID reduces the unreliability by real-time tracking of each product in the inventory, which leads to higher profits than the traditional model. LSTM is used to reduce the uncertainty of demand so that the holding cost and shortage cost are minimized, increasing the profit of the whole smart supply chain management strategy. The implementation of demand forecasting in the ML-RFID model is shown in Figure 2.
In the LSTM method, the training and test datasets are taken in a specific ratio. The demand patterns are similar for each day. A preprocessing step assigns different weights to the input data, which affects the forecasted data [33]. For LSTM, a dataset matrix is formed by converting the array of values, followed by reshaping the inputs for prediction and checking the performance metrics. Figure 3 is a pictorial representation of the LSTM architecture used in demand forecasting.

Formulation of LSTM
Suppose there is a time series dataset of length $t_j$ that consists of $\{i(t_1), i(t_2), i(t_3), \ldots, i(t_j)\}$, which should first be preprocessed.
The time series $i(t_j)$ is normalized as given in Equation (7), where $\max(s)$ and $\min(s)$ are the maximum and minimum values of $s$.
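Since the normalization is defined through the maximum and minimum of the series, it is assumed here to be ordinary min-max scaling to the interval [0, 1]. The sketch below shows that assumed form; the function name is hypothetical.

```python
def min_max_normalize(series):
    """Scale a demand series to [0, 1] using its own minimum and maximum."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be min-max scaled")
    return [(x - lo) / (hi - lo) for x in series]


# Illustrative demand values (assumed).
print(min_max_normalize([10, 20, 30]))
```

In practice the minimum and maximum would typically be taken from the training portion only, so that no information from the test period leaks into the scaling.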
Reshaping the data is done in this step followed by splitting the data into a training set and test set. The training set is considered as the input of the LSTM model and the prediction evaluation in terms of the performance metrics is done in the test set.
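The reshaping and splitting step can be sketched as follows. The 65% training share comes from the assumptions of this research; the sliding-window layout, the `look_back` parameter, and the function names are illustrative assumptions about how the series is turned into input-target pairs.

```python
def chronological_split(series, train_fraction=0.65):
    """Split a time series in order: the first part trains, the rest tests."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def make_windows(series, look_back=1):
    """Turn a series into (window of past values, next value) pairs."""
    inputs, targets = [], []
    for start in range(len(series) - look_back):
        inputs.append(series[start:start + look_back])
        targets.append(series[start + look_back])
    return inputs, targets


demand_series = list(range(20))          # placeholder data (assumed)
train, test = chronological_split(demand_series)
train_x, train_y = make_windows(train, look_back=3)
print(len(train), len(test), train_x[0], train_y[0])
```

Splitting before windowing keeps every test observation out of the training windows, which matches the assumption that data in the test set do not exist in the training set.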
$i(t_j)$ is the input demand of the first LSTM hidden layer with previous state $h_1(t_{j-1})$, which gives the forecasted output demand at time $t_{j-1}$. The output value of the first layer is $h_1(t_j)$. The output forecasted demand of the first layer is used by the second hidden layer with the previous output. The output of hidden layer $h_1(t_j)$ is used with the previous state to calculate the output value of $h_2(t_j)$. Then, $h_2(t_{j-1})$ becomes the input demand of the next hidden layer along with output value $h_2(t_j)$. This process continues in each layer by calculating the time series demand data prediction and passing the value to the next layer until the last layer computes the final forecasted demand. The forecasted output for the output gate is denoted by $o(t_j)$ at time $t_j$.
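The layer-by-layer flow described above can be sketched with the standard LSTM cell equations (forget, input, and output gates plus a candidate state). This is a toy illustration under strong assumptions: each layer has a single unit, the weights are random and untrained, and all names are hypothetical. It is not the architecture or the trained network used in this research.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_cell(rng):
    """Random (input weight, recurrent weight, bias) for each gate of a one-unit cell."""
    return {gate: (rng.uniform(-0.5, 0.5), rng.uniform(-0.5, 0.5), 0.0)
            for gate in ("forget", "input", "candidate", "output")}


def lstm_step(cell, x, h_prev, c_prev):
    """One time step of a standard LSTM cell with scalar input and scalar state."""
    def pre(gate):
        w_x, w_h, bias = cell[gate]
        return w_x * x + w_h * h_prev + bias

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much of the candidate to admit
    g = math.tanh(pre("candidate"))   # candidate cell state
    o = sigmoid(pre("output"))        # output gate o(t_j)
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


def stacked_forward(cells, series):
    """Feed the series through stacked cells; layer k reads the output of layer k-1."""
    states = [(0.0, 0.0) for _ in cells]
    for x in series:
        layer_input = x
        for k, cell in enumerate(cells):
            h_prev, c_prev = states[k]
            h, c = lstm_step(cell, layer_input, h_prev, c_prev)
            states[k] = (h, c)
            layer_input = h
    return states[-1][0]


rng = random.Random(0)
cells = [init_cell(rng), init_cell(rng)]          # two hidden layers
forecast_state = stacked_forward(cells, [0.1, 0.4, 0.3, 0.8])
print(forecast_state)
```

In a trained network the final hidden state would pass through an output layer and be mapped back from the normalized scale to demand units.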

Retailer's Expected Profit for the ML-RFID Model
The retailer carries the product's inventory but does not bear the total inventory holding cost. The retailer functions as a follower in the Stackelberg game approach. Only the operational parts, such as the stock handling and storage area, are carried by the retailer, which means the operational holding cost ($h^{CM}_r$) and shortage cost ($s^{CM}_r$) are carried by the retailer. To remove the unreliability of the traditional model, RFID is introduced, and the cost of RFID implementation, which scales with the order quantity $Q_r$, is borne by the retailer. The retailer also receives an amount of money ($T$) from the manufacturer for installing the RFID system. The retailer agrees to a contract and makes revenue from the commission and the fixed fee given by the manufacturer. Therefore, the retailer's expected total profit for the ML-RFID model is formulated from these terms, and, using Equations (2) and (3), the expected profit of the retailer can be written as Equation (8).

Manufacturer's Expected Profit for the ML-RFID Model
As per the Stackelberg game approach, the manufacturer is the leader. Thus, the ownership of the warehouse stock is controlled by the manufacturer only. The manufacturing cost of the manufacturer is $k Q_r$. Unlike in the traditional model, the financial part of the holding cost ($h^{CM}_m$) and the shortage cost ($s^{CM}_m$) is borne by the manufacturer. The manufacturer also gives a fixed fee ($A$) to the retailer.
Therefore, the manufacturer's expected total profit for the ML-RFID model is formulated from these terms, and, again using Equations (2) and (3), the expected profit of the manufacturer can be written as Equation (9).

Total Expected Profit for the ML-RFID Model
By adding Equations (8) and (9), the expected total profit under the ML-RFID model is formulated as Equation (10). The maximum profit can be obtained by taking the derivative of Equation (10) with respect to $Q_r$, $L$, $b$, and $e_{\eta}$ and setting each derivative to zero (see Appendix B), which yields the optimal values of the decision variables.

Experimental Results
This section presents the experimental results along with the validation of the model. Section 4.1 describes the results of demand forecasting, Section 4.2 compares the proposed model with the traditional model, and Section 4.3 presents the sensitivity analysis.

Forecasting Using LSTM
The demand dataset for forecasting consists of ten years of data. Depending on the complexity and number of features of the dataset, different numbers of hidden layers and cells are considered. Initially, one hidden layer with four memory cells is used to forecast the demand. The dataset is divided into two separate parts: training and test sets. The training set is input to the LSTM network to obtain a forecasting model. The test set is used to evaluate the predictive power of the built model in terms of performance metrics. Based on the dependent and independent features, the first 65% of the data are used as the training dataset and the remaining data are used as the test dataset. Thus, data in the test set do not exist in the training set. If the performance does not improve, another hidden layer and more memory cells are added to make the network deeper. The forecasted demand is then incorporated into the mathematical model, and the total profit with the ML-RFID model is observed to be higher than it is with the traditional system. Table 2 lists the hyperparameter values used in the experiment, and Figure 4 shows the result obtained. The blue line plots the 10-year demand dataset, the orange line denotes the training set, and the green line indicates the forecast on the test set.
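The performance metrics are not named in this section. As an assumption, the sketch below uses two common forecast error measures, the root mean squared error (RMSE) and the mean absolute error (MAE), evaluated on held-out test data; the function names and sample values are illustrative only.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between observed and forecasted demand."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error between observed and forecasted demand."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative test-set values (assumed, not the paper's data).
observed = [10.0, 20.0, 30.0]
forecast = [12.0, 18.0, 33.0]
print(rmse(observed, forecast), mae(observed, forecast))
```

A rule such as "add another hidden layer while the test error keeps decreasing" can then be expressed as a comparison of these values between successive network configurations.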

Numerical Analysis
A numerical study is used to validate the experimental results. Supportive data are taken from Sarkar et al. [23] and Sardar and Sarkar [25]. See Table 3.
The expected profits of both the traditional and ML-RFID models are computed in this numerical experiment. The optimal result is obtained by optimizing the decision variables for the given parameter values. The traditional model gives a good result. However, forecasting the future demand using LSTM reduces the holding cost and the shortage cost, which produces a better result. Implementation of RFID and consideration of environmental measurements lead the ML-RFID model to a higher profit than the traditional model. The total profit with the ML-RFID model is 43% higher than with the traditional model. The retailer's order quantity is increased in the ML-RFID model. A lead time crashing cost is incurred to reduce the lead time, which results in a lead time of three weeks. The result shows that the ML-RFID model gives better service than the traditional model. It also indicates that the environmental measurement of the ML-RFID model is $65.433/unit, whereas the environmental measurement of the traditional model is $96.587/unit. Thus, the environmental measurement of the ML-RFID model is much lower than that of the traditional model, and the environment is better protected under the proposed ML-RFID model. Therefore, the ML-RFID model performs better than the traditional model in profit, service, and environmental measurement.
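As a quick check on the size of the environmental improvement, the relative reduction implied by the two reported values can be computed directly. This is plain arithmetic on the figures quoted above and is not an additional result from the paper.

```latex
\frac{96.587 - 65.433}{96.587} = \frac{31.154}{96.587} \approx 0.323
```

That is, the environmental measurement of the ML-RFID model is roughly 32% lower than that of the traditional model.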

Sensitivity Analysis
The sensitivity analysis is carried out numerically by analyzing how changing the parameters affects the expected profit. Each parameter is varied from −50% to +50% while the other parameters are kept unchanged. The key parameters taken for the sensitivity analysis are $p$ (retail price), $h^{CM}_r$ (retailer's holding cost for the ML-RFID model), $s^{CP}_r$ (retailer's shortage cost for the ML-RFID model), $h^{CP}_m$ (manufacturer's holding cost for the ML-RFID model), $s^{CP}_m$ (manufacturer's shortage cost for the ML-RFID model), the cost of one RFID tag, $T$ (fixed cost given by the manufacturer to the retailer for implementing the RFID system), and $k$ (manufacturing cost). Positive changes indicate more total profit, while negative changes indicate less total profit. The retail price $p$ is the most sensitive parameter, and the fixed cost for RFID implementation $T$ is the least sensitive parameter. Table 4 presents the sensitivity analysis of the key parameters, and a graphical representation is shown in Figure 5.
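The one-at-a-time procedure described above can be sketched generically. The profit function below is a hypothetical stand-in (a simple margin times quantity), not Equation (10), and the intermediate steps of ±25% are an assumption; the sketch also re-evaluates the profit without re-optimizing the decision variables, which the actual analysis may do.

```python
def sensitivity(profit_fn, base, changes=(-0.50, -0.25, 0.25, 0.50)):
    """Percentage change in profit when one parameter at a time is perturbed.

    Assumes the base-case profit is nonzero.
    """
    base_profit = profit_fn(**base)
    table = {}
    for name, value in base.items():
        row = []
        for change in changes:
            trial = dict(base)                      # keep all other parameters fixed
            trial[name] = value * (1.0 + change)
            row.append(100.0 * (profit_fn(**trial) - base_profit) / base_profit)
        table[name] = row
    return table


def toy_profit(p, k, q):
    """Hypothetical profit: unit margin times quantity (illustration only)."""
    return (p - k) * q


result = sensitivity(toy_profit, {"p": 10.0, "k": 4.0, "q": 100.0})
for name, row in result.items():
    print(name, [round(v, 2) for v in row])
```

Ranking the parameters by the magnitude of these percentage changes is what identifies the most and least sensitive ones.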

Managerial Implications
In this research, the manufacturer is the leader, and the retailer is the follower. As the leader typically has more power than the follower, the manufacturer makes all the decisions. The retailer needs to sell the product and send the money to the manufacturer. This is how the manufacturer generates profit from the business. The retailer, in turn, earns a fixed fee and a commission from the manufacturer for each product sold. The total profit with the ML-RFID model is higher than it is with the traditional model. Implementation of advanced technology, such as RFID, reduces the unreliability and helps industry managers earn more profit. The holding cost and shortage cost are minimized by forecasting the demand using LSTM, which is the most important benefit for the business. Another essential point is that security is increased by installing RFID technology. The fixed fee provided by the manufacturer to the retailer is helpful for the retailer. This shows that the industry can choose the consignment policy with a fixed fee to secure a higher profit. Thus, the combination of these factors in the ML-RFID model is highly productive for the industry.

Conclusions
The proposed ML-RFID model provided a higher profit than the traditional model by reducing the unreliability. As the retailer was unreliable and did not provide proper demand information to the manufacturer, demand uncertainty arose. LSTM was used to forecast the future demand, and the holding cost and the shortage cost were reduced compared to the traditional model. The manufacturer took overall control of the inventory by implementing RFID, an advanced smart technology, at the retailer's place. The consignment policy played an essential role in generating profit for the manufacturer, and the retailer's profit came from a fixed fee and a commission given by the manufacturer for each product sold. As a result, the joint total profit of the ML-RFID model was much higher than that of the traditional model. Although the unreliability was reduced in the proposed model and a higher profit was generated, the degree of unreliability was not considered, which is a current limitation of this research. As the retailer sells the products traditionally, online and offline selling policies were not considered, which can be a future extension of this research. Another immediate extension is to compare this model with the Cournot competition model and the Bertrand competition model, which would open a new direction of research in smart supply chain management. The environmental effect measurement was reduced in the ML-RFID model, which is essential as environmental concerns regarding carbon emissions continue to increase around the world.

Appendix B
The maximum profit can be obtained by taking the derivative of Equation (10) with respect to $Q_r$, $L$, $b$, and $e_{\eta}$.
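In general terms, the classical optimization step can be written as a set of first-order conditions. The symbol $\Pi$ is a hypothetical label introduced here for the expected total profit in Equation (10); the explicit derivatives are not reproduced.

```latex
\frac{\partial \Pi}{\partial Q_r} = 0, \qquad
\frac{\partial \Pi}{\partial L} = 0, \qquad
\frac{\partial \Pi}{\partial b} = 0, \qquad
\frac{\partial \Pi}{\partial e_{\eta}} = 0
```

Solving these four equations simultaneously gives the candidate values of the decision variables. A standard sufficient condition for such a stationary point to be a local maximum is that the Hessian matrix of $\Pi$ with respect to $(Q_r, L, b, e_{\eta})$ is negative definite there.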