Article

Forecasting and Inventory Planning: An Empirical Investigation of Classical and Machine Learning Approaches for Svanehøj’s Future Software Consolidation

by Hadid J. Wahedi 1,†, Mads Heltoft 1,*,†, Glenn J. Christophersen 2, Thomas Severinsen 2, Subrata Saha 1,* and Izabela Ewa Nielsen 1

1 Department of Materials and Production, Aalborg University, Fibigerstræde 16, 9220 Aalborg, Denmark
2 Svanehøj Danmark A/S, Fabriksparken 6, 9230 Svenstrup, Denmark
* Authors to whom correspondence should be addressed.
† These authors contributed equally to this work.
Appl. Sci. 2023, 13(15), 8581; https://doi.org/10.3390/app13158581
Submission received: 14 June 2023 / Revised: 8 July 2023 / Accepted: 18 July 2023 / Published: 25 July 2023

Abstract: Challenges related to effective supply and demand planning and inventory management impose critical planning issues on many small and medium-sized enterprises (SMEs). In recent years, data-driven machine learning (ML) methods have provided beneficial results for many large-scale enterprises (LSEs). However, ML applications have not yet been tested in SMEs, leaving a technological gap. Limited resource capabilities and financial constraints expose SMEs to the risk of implementing an insufficient enterprise resource planning (ERP) setup, which amplifies the need for additional support systems for data-driven decision-making. We found that forecasts and inventory management policies in SMEs are often based on subjective decisions, which might fail to capture the complexity of achieving performance goals. Our research aims to leverage ML models for SMEs within demand and inventory management by considering various key performance indicators (KPIs). The research is based on collaboration with a Danish SME that faced issues related to forecasting and inventory planning. We implemented the following ML models for forecasting: Artificial Neural Network (ANN), Long Short-Term Memory (LSTM), Support Vector Regression (SVR), Random Forest (RF), Wavelet-ANN (W-ANN), and Wavelet-LSTM (W-LSTM), together with reinforcement learning approaches, namely Q-learning and Deep Q-Network (DQN), for inventory management. The results demonstrate that predictive ML models outperform statistical forecasting approaches, but not always when we focus on industrial KPIs. Moreover, when ML models alone are considered, the results indicate that careful consideration is required, given that model evaluation can be perceived from both an academic and a managerial perspective. Secondly, Q-learning is found to yield preferable economic results for inventory planning. The proposed models can serve as an extension to modern ERP systems by offering a data-driven approach to demand and supply planning decision-making.

1. Introduction

Businesses within supply chains often differ in relative size, leaving firms with varying financial, managerial, knowledge, and technological capabilities and resources for securing efficient supply chain planning [1]. These capability gaps substantially affect the business goals that small and medium-sized enterprises (SMEs) can achieve [2]. To investigate the origin of this key issue, Ref. [3] describes the future of enterprise resource planning (ERP) systems and the associated software complications in a comprehensive study on the practical implications of ERP systems. The authors found that digitalization will enable new opportunities for SMEs, since domain knowledge can serve as a modifier to the forecasting framework by incorporating distinctive features.
In the literature on SMEs and ERP implementation, substantial barriers and challenges remain, and ERP implementations should not be categorized as standard IT projects [4]. Rather, an ERP implementation is an organizational change that requires interdepartmental participation across the whole organization [5]. Ref. [6] found reduced inventory to be the most common gain, while reduced planning effort, decreased lead time, and improved communication and decision-making are all established organizational gains after successful ERP implementation. Although ERP systems can impact SMEs positively, potential implementation risks and the issues associated with unsuccessful implementation must be acknowledged. Ref. [7] identified that the risks in ERP adoption for SMEs stem mainly from SMEs' limited resources and their distinct operating characteristics, making the implementation case different from that of larger enterprises. ERP systems can adapt to manufacturing environmental changes but not to uncertainty, which generally constitutes the random interventions of the real world. SMEs tend to use buffering and dampening techniques to manage the impact of uncertainty on the competitive environment, strategic objectives, and manufacturing structures. As a response to standardized commercial ERP software, Ref. [8] recently proposed free, open-source ERP as a framework enabling cost-effective, customized ERP solutions for SMEs that lack the financial capacity to invest in commercial ERP systems, arguing that free, open-source ERP can serve and support such companies.
An accurate demand forecast forms the backbone of supply chain planning, since the forecast is the main facilitator of efficient procurement, production planning and control, and inventory management [9,10]. Forecasts establish the operational and strategic information background for an organization's future decision-making processes. Data limitations or limited sources of historical data often prevent the application of data-driven models. To address this issue, Ref. [11] proposes a set of grouping schemes based on different criteria for product classification to enhance the application of machine learning models with the objective of learning future demand. However, qualitative forecast methodology lacks a generic approach to track overall performance related to forecast accuracy and the potential benefits of combining the qualitative forecast with a data-driven forecast to aid the identification of trends and patterns.
Forecasting inefficiency also has a cascade effect on inventory planning. Inventory management should be assessed as an enabling competence, and to fully utilize a company's forecasting tools, inventory management and forecasting should be combined [12]. In practice, many SMEs reduce their competitive advantage through inadequate inventory management. The main contributing factors within SMEs include the lack of established planning processes, insufficient information flow between sales, the supply chain, and the production department, and the lack of knowledge and skills among employees [13]. Most SMEs use either Material Requirements Planning (MRP) or re-order point methods as their inventory policies, but the parameters these methods use need to be closely monitored and updated for the models to adjust to the dynamic environmental changes in a given supply chain [14,15,16]. Theoretical methods that address the joint multi-item, multi-stage optimum have been widely investigated, but due to their limitations and complexity, many are deemed infeasible in practice. As a consequence, a gap between theoretical inventory models and practical industrial application has appeared, which [17] discusses.
Recently, we worked with Svanehøj Danmark A/S (https://www.svanehoj.com/, accessed on 23 February 2023), where the sponsor company relies on qualitative forecasts based on sales representatives' market intuition combined with external market research reports. The main objective was to gain insight into future requirement quantities for raw materials at the bill-of-materials (BOM) level, optimize supply chain planning, and close internal information gaps between outbound and inbound logistics through demand planning. The aim of the process was to initiate the first steps of sales and operations planning (S&OP) on the different product groups to balance and reduce gaps between demand and supply chain planning. Within this case study, the company manufactures according to the Make-to-Order (MTO) principle. Svanehøj's current supply chain planning is constrained to an MRP environment in its ERP system, which operates deterministically and suggests future procurement quantities based on a set of coverage groups. However, due to human intervention and scaled-up labor costs, the company faces difficulties: products with large inventories and stockouts disturb its timely delivery commitments.
This paper aims to evaluate data-driven algorithms for forecasting and inventory planning and to validate how they can benefit SMEs. We implemented eight forecasting methods and three inventory replenishment algorithms to verify future software consolidation possibilities. Our study thereby contributes an application of Q-learning to an inventory management problem, with reward mechanisms designed to allow a more efficient inventory management process. The article is organized as follows: in the following subsection, a comprehensive review of the state-of-the-art literature on inventory and forecasting issues is conducted. Section 2 briefly introduces the sponsor company, the methodology implemented to evaluate forecasting and inventory planning methods, and the key performance indicators (KPIs) used for testing. Section 3 presents the results. Managerial insights are drawn in Section 4, and finally, conclusive remarks are presented in Section 5.

1.1. Literature Review

In recent years, reinforcement learning has shown notable progress within the field of operations research. Inventory planning can be semi-automated and modeled as an environment that an agent interacts with. We present a brief overview of recent applications of machine learning methods to inventory planning in Table 1.
Table 1 reflects that various algorithms have been implemented for inventory planning. We refer to the recent review by [26] for a more detailed overview of this stream of research. Regarding the application of forecasting methods, we refer to [27,28] for recent developments. Note that researchers mostly treat inventory planning and forecasting as two separate directions: evaluations are performed while either disregarding the importance of forecasting or assuming an idealistic forecast is already available. However, this is not the case, at least for our sponsor company, and for SMEs, improvements in forecast accuracy and inventory performance might not be aligned, creating issues of various magnitudes. Our study therefore supplements the literature and highlights the possible consequences for SMEs.

2. Methods

2.1. Company Background

This study is based on collaboration with the Danish SME Svanehøj Danmark A/S (https://www.svanehoj.com/, accessed on 23 February 2023), which produces high-technology pump solutions for cargo pumps and the safe handling of complicated fluids in the marine industry. In recent years, Svanehøj has experienced a vast demand spike across its whole product portfolio. While being affected by supply chain disruption, mostly due to COVID and technology integration in its products, the company faces several issues. Primarily, its challenges lie in the procurement of raw materials while maintaining appropriate inventory levels to serve future expected demand. We use three of its critical sub-components, which facilitate its production planning, to study whether the current procurement planning and inventory management are fundamental issues causing the degraded company performance.

2.2. Data Used

The quantitative data used in this study originate from the company's ERP system and were extracted through Microsoft Dynamics AX. The data consist of two types, categorical and numerical. The categorical data are both ordinal (e.g., the BOM list) and nominal (e.g., supplier information) and were used for labeling and storing the data. The numerical data are both discrete and continuous and give insight into quantities and measurements. Based on discussions with the sponsor company's representatives, we selected three important sub-components, which are used in producing multiple finished products.

2.3. Methodology

Figure 1 provides a step-wise overview of the computational work.
As shown in Figure 1, we start by processing the data and performing time series decomposition on the processed data to reveal trends, seasonality, and residuals. We then perform a stationarity test on each data set to determine whether the data are stationary. Lastly, we prepare the data sets for the ML methods by dividing the data into train and test sets, as sketched below. We then compute each method before validating the forecasts and scoring their performance on the chosen KPIs. For the inventory models, we feed the models with demand information and observe how they behave and perform according to the chosen KPIs. The models are then compared, and the best-performing methods are identified.
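To make these steps concrete, the following is a minimal sketch of the pre-processing pipeline, assuming the monthly demand is held in a pandas Series named `demand` (a hypothetical variable name) and a 12-month seasonality:

```python
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose
from statsmodels.tsa.stattools import adfuller

def prepare(demand: pd.Series, train_ratio: float = 0.8):
    # Decompose into trend, seasonality and residuals (12-month seasonality assumed).
    decomposition = seasonal_decompose(demand, model="additive", period=12)

    # Augmented Dickey-Fuller test: p-value below 0.05 suggests stationarity.
    p_value = adfuller(demand.dropna())[1]
    print(f"ADF p-value: {p_value:.3f} -> "
          f"{'stationary' if p_value < 0.05 else 'non-stationary'}")

    # Chronological 80:20 train/test split (no shuffling for time series).
    cut = int(len(demand) * train_ratio)
    return demand.iloc[:cut], demand.iloc[cut:], decomposition
```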

2.4. Technical Details

All methods used in this paper were developed and written in the Python language, with implementations based on TensorFlow [29].

2.4.1. Methods for Forecasting

Different statistical and data-driven methodologies are investigated in this study to establish a reliable forecasting framework and evaluate their performance, which affects the business from both a managerial and a mathematical perspective. The forecasting methods we tested are categorized as (i) statistical (e.g., Simple Exponential Smoothing (SES), Autoregressive Integrated Moving Average (ARIMA)) and (ii) artificial intelligence (e.g., RF, SVR, W-ANN, W-LSTM). For the data-driven approaches, we divide the data set in an 80:20 ratio into training and test sets. All forecasting methods are implemented in the Python programming language. A brief description of each method is presented below:
Simple Exponential Smoothing (SES): Simple exponential smoothing considers the previous observation in a data set and forecasts the next value using a predetermined smoothing constant α, typically between 0 and 1, applied to the difference between the past value and the next. This handling of the error gap between past and next observations, together with the model's simplicity, is the main driver for its practical implementation. The smoothing constant used in our study is optimized automatically. SES is easy to learn and apply and only requires a small sample of historical data. However, the model is inaccurate for long-term forecasts and is negatively affected by volatile demand [28].
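As an illustration, a minimal SES sketch using statsmodels is shown below; `train` and `test` are assumed to be the series from the 80:20 split described above, and the smoothing constant is estimated automatically:

```python
from statsmodels.tsa.holtwinters import SimpleExpSmoothing

# f_{t+1} = alpha * y_t + (1 - alpha) * f_t; alpha is estimated automatically.
fit = SimpleExpSmoothing(train).fit(optimized=True)
print("optimized alpha:", fit.params["smoothing_level"])
forecast = fit.forecast(len(test))  # SES projects the last smoothed level forward
```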
Autoregressive Integrated Moving Average (ARIMA): ARIMA is a statistical forecast model that seeks to capture correlations between observations in a time series. ARIMA can be understood through its components: (i) autoregression (AR), (ii) integration (I), and (iii) moving average (MA). The AR component uses observations from past time steps as regression inputs; the parameter p specifies the number of past time steps used. The I component is the differencing of the original observations, which transforms non-stationary data into stationary data; the parameter d specifies how many orders of differencing are applied. The MA component establishes autocorrelation between observations and relates residual errors to lagged observations; the parameter q determines the number of moving-average terms included in the model [30]. To determine the order of the ARIMA(p,d,q) models in this paper, the Akaike Information Criterion (AIC) has been used. The ARIMA model has the advantage of capturing trends and seasonality but fails to capture intrinsic non-linearity [31].
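A sketch of AIC-based order selection over a small (p, d, q) grid, assuming the same `train`/`test` split, might look as follows:

```python
import itertools
from statsmodels.tsa.arima.model import ARIMA

best_aic, best_order = float("inf"), None
for order in itertools.product(range(3), range(2), range(3)):  # small (p, d, q) grid
    try:
        aic = ARIMA(train, order=order).fit().aic
    except Exception:
        continue  # some orders fail to converge; skip them
    if aic < best_aic:
        best_aic, best_order = aic, order

forecast = ARIMA(train, order=best_order).fit().forecast(steps=len(test))
```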
Support Vector Regression (SVR): SVR is a classical approach based on statistical learning theory. The method identifies a function that minimizes structural risk, as opposed to minimizing empirical error as in linear regression [32,33]. Three hyper-parameters govern the performance of a given SVR model; they are determined either through theoretical error bounds or by cross-validated experimentation. In practice, however, parameter tuning is difficult because of the complex nature of the parameter space, and it can be challenging to determine the relationship between the parameters and the input noise [33,34,35]. We identified SVR's key advantage as the model's ability to handle small data sets consisting of non-stationary and volatile data.
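The following sketch tunes the three SVR hyper-parameters by cross-validation with scikit-learn; `X_train`, `y_train`, and `X_test` are assumed to be lagged demand features and targets built from the training series:

```python
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.svm import SVR

param_grid = {"C": [1, 10, 100], "epsilon": [0.01, 0.1, 1.0], "gamma": ["scale", 0.1, 1.0]}
search = GridSearchCV(SVR(kernel="rbf"), param_grid,
                      cv=TimeSeriesSplit(n_splits=4),     # chronological folds
                      scoring="neg_mean_absolute_error")
search.fit(X_train, y_train)
y_pred = search.best_estimator_.predict(X_test)
```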
Random Forest (RF): Random Forest is a machine learning algorithm that can be used for both classification and regression and serves a wide range of predictive problems, especially practical ones [36]. RF handles both relatively small samples and large-scale complex data structures, and the model's simplicity makes it easy to use with few parameters [37,38]. As a predictor, RF grows M randomized regression trees, each fitted on a re-sampled training set, and the average of all trees' predicted values is taken as the final prediction [38]. From a forecasting perspective, RF regression can be applied to a time series for value prediction, where performance can be improved by tuning a small set of parameters and by variable (feature) selection.
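A minimal sketch of RF applied to a time series via lagged features is shown below; the six-month lag window is an illustrative assumption, and `max_features=1` mirrors the Mtry value in Table A2:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def lag_matrix(series, n_lags=6):
    """Turn a 1-D series into (samples, n_lags) features and next-step targets."""
    v = np.asarray(series, dtype=float)
    X = np.column_stack([v[i: len(v) - n_lags + i] for i in range(n_lags)])
    return X, v[n_lags:]

X, y = lag_matrix(train)
rf = RandomForestRegressor(n_estimators=1000, max_features=1, random_state=0)
rf.fit(X, y)
next_value = rf.predict(np.asarray(train, dtype=float)[-6:].reshape(1, -1))
```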
Wavelet-Artificial Neural Network (W-ANN): The artificial neural network is a deep learning model that operates as a universal approximator and can accurately predict values. The model consists of a network structure with input, hidden, and output layers and captures non-linearity between the data points, which enables ANNs to forecast future time series values [39,40]. To denoise and smooth the time series data and further increase predictive accuracy, wavelet transformation can be combined with an ANN model. Wavelet transformation decomposes the series and copes with local trends and seasonality. We refer to [41,42] for a detailed discussion of W-ANN. The main advantages are ANN's ability to learn, comprehend, and approximate solutions to complex learning problems, but the methodology is computationally intensive and sensitive to hyper-parameters [43].
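A sketch combining PyWavelets denoising with a Keras ANN is shown below; the db4 wavelet, two-level decomposition, and soft thresholding are illustrative choices, and the layer sizes follow the Item-1 settings in Table A2:

```python
import numpy as np
import pywt
import tensorflow as tf

data = np.asarray(train, dtype=float)            # `train` from the 80:20 split above

# Denoise: decompose, soft-threshold the detail coefficients, reconstruct.
coeffs = pywt.wavedec(data, "db4", level=2)
coeffs[1:] = [pywt.threshold(c, value=np.std(c), mode="soft") for c in coeffs[1:]]
smoothed = pywt.waverec(coeffs, "db4")[: len(data)]

X, y = lag_matrix(smoothed)                      # lag builder from the RF sketch above
ann = tf.keras.Sequential([
    tf.keras.layers.Dense(80, activation="relu"),  # sizes follow Table A2 (W-ANN, Item-1)
    tf.keras.layers.Dense(20, activation="relu"),
    tf.keras.layers.Dense(1),
])
ann.compile(optimizer="adam", loss="mse")
ann.fit(X, y, epochs=200, verbose=0)
```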
Wavelet-Long Short-Term Memory (W-LSTM): LSTM models are deep learning models that extend the Recurrent Neural Network (RNN). An LSTM handles non-linearity and copes with sequential data while capturing long-term dependencies. The LSTM model consists of a memory cell and an output gate where an activation function controls the output [44]. The LSTM model thus incorporates additional parameters compared with the ANN model [45], and, as mentioned above, the wavelet transformation allows for the decomposition of a time series, which is the main argument for combining it with the LSTM model [41]. In our evaluation, we tested both standard LSTM and W-LSTM. We found that LSTM offers the advantage of handling long data sequences and complex learning patterns, but the method is prone to overfitting, especially when training data are limited. Although LSTMs were designed to mitigate the vanishing gradient problem, in practice they can still suffer from it.
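The LSTM variant is analogous, a sketch assuming the wavelet-smoothed series from the previous snippet; Keras LSTMs expect three-dimensional input:

```python
import tensorflow as tf

X, y = lag_matrix(smoothed)               # wavelet-smoothed series from the W-ANN sketch
X = X.reshape(X.shape[0], X.shape[1], 1)  # (samples, timesteps, features)

lstm = tf.keras.Sequential([
    tf.keras.layers.LSTM(80, return_sequences=True),
    tf.keras.layers.LSTM(40),             # sizes follow Table A2 (W-LSTM, Item-1)
    tf.keras.layers.Dense(1),
])
lstm.compile(optimizer="adam", loss="mse")
lstm.fit(X, y, epochs=100, verbose=0)
```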

2.4.2. Performance Measures for Evaluating Forecasting Methods

The following classical performance measures are used to evaluate the forecasting methods from a theoretical perspective: (i) mean absolute error, $\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n} |y_t - f_t|$; (ii) mean absolute percentage error, $\mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n} \left| \frac{y_t - f_t}{y_t} \right|$; and (iii) root mean squared error, $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} (y_t - f_t)^2}$, where $y_t$ represents the actual value and $f_t$ the forecasted value in period $t$.
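These measures translate directly into code, for actuals `y` and forecasts `f` as NumPy arrays:

```python
import numpy as np

def mae(y, f):  return np.mean(np.abs(y - f))
def mape(y, f): return np.mean(np.abs((y - f) / y))  # multiply by 100 for a percentage
def rmse(y, f): return np.sqrt(np.mean((y - f) ** 2))
```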
However, selection based on those performance measures might not lead to the desirable outcome from a company's perspective [46]. To keep accuracy high, data-driven approaches commonly procure a large volume, at least at the beginning of the training period [47]. Therefore, the following KPIs are also used to judge forecasting performance: (iv) Fill Rate (FR): the percentage of demand that can be fulfilled with the stock on hand at a given point in time; (v) Average Inventory Level (AIL): the average on-hand inventory across the evaluation periods; (vi) Total Holding Cost (THC): the holding cost aggregated over all periods; and (vii) Stockouts: the number of times the actual demand exceeds the inventory level.
The listed KPIs are deliberately selected based on both academic and industrial implications, given that the objective is to measure performance from both a mathematical and an industrial perspective.
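As an illustration of how these KPIs can be computed, the sketch below simulates a replenishment trace in which the inventory level may go negative to represent unmet demand, consistent with the negative-AIL discussion in Section 3; the accounting details are assumptions, not the exact evaluation code used in the study:

```python
def inventory_kpis(demand, orders, holding_cost):
    on_hand, levels, stockouts, served = 0.0, [], 0, 0.0
    for d, q in zip(demand, orders):
        on_hand += q                      # receive this period's order
        served += min(d, max(on_hand, 0.0))
        if d > on_hand:
            stockouts += 1                # (vii) demand above inventory level
        on_hand -= d                      # negative values represent unmet demand
        levels.append(on_hand)
    fr = served / sum(demand)             # (iv) fill rate
    ail = sum(levels) / len(levels)       # (v) average inventory level
    thc = holding_cost * sum(max(l, 0.0) for l in levels)  # (vi) total holding cost
    return fr, ail, thc, stockouts
```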

2.4.3. Methods for Inventory

Over the years, many replenishment policies have been proposed for defining the replenishment strategy under time-varying demand. Stochasticity is certainly an issue, but in this study we focus on replenishment decisions under time-varying demand. One of the simplest ways to decide a replenishment quantity is the Economic Order Quantity (EOQ), based on the aggregated demand level for a fixed planning horizon. Very recently, data-driven approaches have also been tested. We made the following assumptions when implementing the various replenishment policies: (i) demand is time-varying; (ii) the supplier's capacity is unlimited; (iii) for each order, the sponsor company incurs a fixed ordering cost; (iv) the holding cost is constant, and the purchase cost and sales price remain constant during the evaluation period; and (v) the lead time is constant for each product and remains the same throughout the evaluation period.
EOQ: The economic order quantity (EOQ) model is a simple fixed-order-quantity model that minimizes the annual holding and setup costs associated with carrying and procuring inventory [48]. Its disadvantage in the context of our study is that it fails to capture the time-varying nature of demand.
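For reference, the EOQ formula is Q* = sqrt(2DK/h), where D is the demand over the planning horizon, K the fixed ordering cost, and h the per-unit holding cost; the numbers below are placeholders, not the company's actual cost parameters:

```python
from math import sqrt

def eoq(demand, ordering_cost, holding_cost):
    # Q* = sqrt(2 * D * K / h)
    return sqrt(2 * demand * ordering_cost / holding_cost)

print(eoq(demand=1200, ordering_cost=250, holding_cost=2.5))  # ~490 units
```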
Q-Learning Algorithm: Q-learning is a class of reinforcement learning that has gained growing popularity for semi-automating inventory replenishment decisions in recent years. The algorithm is off-policy, meaning the optimal policy is learned independently of the agent's current behavior in a given environment. In Q-learning, the decision-maker needs to set the learning rate α and the discount factor γ to optimize the model's performance [49].
Let $Q(S, A)$ denote the Q-value for taking action $A$ in state $S$. The algorithm updates the Q-value using the learning rate $\alpha$, the reward $R$ obtained by taking an action, the discount factor $\gamma$, and the maximum Q-value over the next state–action pairs. The reward function used in both Q-learning and DQN is as follows:
Total profit (TP) = <sales revenue> − <holding cost> − <purchase cost> − <ordering cost> − <salvage cost>
Note that if the product is non-deteriorating, we exclude the salvage cost.
$Q(S, A) \leftarrow (1 - \alpha)\, Q(S, A) + \alpha \left[ R + \gamma \max_{A'} Q(S', A') \right]$
where $R$ represents the reward, which is problem-specific; for instance, when we implement the algorithm, we use the reward function presented in Appendix B. Note that we use the cost parameters presented in Table 2 for inventory planning.
Based on the literature, the main advantages of Q-learning are the algorithm's ability to learn from past experience and its adjustability to prioritize long-term or short-term gain. The disadvantages are that the algorithm becomes inefficient in large-scale environments and is sensitive to hyper-parameter settings.
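A compact tabular Q-learning sketch for single-item replenishment under the update rule above is shown below; states are discretized inventory levels, actions are order quantities, and all cost parameters are illustrative (the state/action/episode magnitudes loosely follow Table A3), with `demand_train` an assumed iterable of per-period demands:

```python
import numpy as np

n_states, n_actions, episodes = 200, 200, 5000   # magnitudes as in Table A3 (Item-2)
alpha, gamma, eps = 0.01, 0.99, 1.0
p, h, k, k1 = 50.0, 0.5, 100.0, 20.0             # illustrative price/holding/ordering/purchase costs
rng = np.random.default_rng(0)
Q = np.zeros((n_states, n_actions))

for _ in range(episodes):
    s = 0                                        # start each episode with empty inventory
    for d in demand_train:                       # assumed iterable of per-period demands
        a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
        inv = s + a                              # inventory after replenishment
        reward = p * min(d, inv) - h * max(inv - d, 0) - k1 * a - (k if a > 0 else 0.0)
        s_next = int(np.clip(inv - d, 0, n_states - 1))
        Q[s, a] = (1 - alpha) * Q[s, a] + alpha * (reward + gamma * Q[s_next].max())
        s = s_next
    eps = max(0.05, eps * 0.999)                 # decay exploration over episodes
```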
Deep Q-Network (DQN): Although Q-learning is extensively used in various decision-making contexts, we also introduce DQN, which is more robust in terms of computational efficiency and capability [50]. DQN uses a neural network to approximate the optimal action-value function by minimizing a loss value. The network is trained on the temporal difference between the estimated action-value function $q_\theta(s, a)$ and the observed cost after simulating a given number of periods, bootstrapping from the value of the next state via $\min_{a'} q_\theta(s', a')$. For a transition tuple $(s, a, c, s')$, the temporal difference is:
$q_\theta(s, a) - \left( c + \gamma \min_{a'} q_\theta(s', a') \right)$
When optimizing performance, we use the ADAM optimizer, and the loss function is defined as [51]:
$L_i(\theta_i) = \mathbb{E}_{(s_t, a_t, r_t, s_{t+1}) \sim U(B)} \left[ \left( r_t + \gamma \max_{a_{t+1}} Q(s_{t+1}, a_{t+1}; \theta_i^{-}) - Q(s_t, a_t; \theta_i) \right)^2 \right]$
where $\theta_i^{-}$ denotes the target network weights and $B$ the replay buffer from which minibatches are sampled uniformly. A gradient-based search guided by this loss updates $\theta_i$ to minimize the mean squared error. The main advantage of a value-based approach is that it tends to learn better policies in offline settings. For both Q-learning and DQN, we consider a value function consisting of price, procurement quantity, holding cost, and ordering cost [52,53]. Based on the literature, DQN can, unlike Q-learning, interact with large-scale environments, but it is computationally intensive and requires a large sample of data for model training.
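A sketch of the corresponding network and gradient step in TensorFlow/Keras is shown below; the two 64-unit hidden layers are an illustrative architecture, while the action size of 150 and the learning rate of 0.0001 follow Table A3 (Item-1):

```python
import tensorflow as tf

def build_q_net(state_dim=1, n_actions=150):     # action size 150 as in Table A3 (Item-1)
    return tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(state_dim,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(n_actions),        # one Q-value per order quantity
    ])

q_net, target_net = build_q_net(), build_q_net()
target_net.set_weights(q_net.get_weights())      # theta^- = theta
optimizer = tf.keras.optimizers.Adam(1e-4)       # ADAM with alpha = 0.0001

def train_step(states, actions, rewards, next_states, gamma=0.99):
    """One gradient step on the mean squared TD error; inputs are float32/int32 tensors."""
    targets = rewards + gamma * tf.reduce_max(target_net(next_states), axis=1)
    with tf.GradientTape() as tape:
        q_sa = tf.gather(q_net(states), actions, batch_dims=1)   # Q(s_t, a_t; theta)
        loss = tf.reduce_mean(tf.square(targets - q_sa))
    grads = tape.gradient(loss, q_net.trainable_variables)
    optimizer.apply_gradients(zip(grads, q_net.trainable_variables))
    return loss
```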

3. Results

The stock-keeping-unit (SKU) managers of our sponsor company need to manage thousands of products. Product categorization is therefore key, although it is not directly associated with our study; we briefly introduce one of the practices used for product categorization before turning to the main analysis.
The product categorization is performed by determining the coefficient of variation, $\mathrm{CV} = \frac{\mathrm{StdDev}}{\mathrm{Mean}}$, as a measure of forecast-ability. Although we consider only three items, we still follow the sponsor company's current practice of item classification, as shown in Figure 2.
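The measure itself is a one-liner over a pandas Series of monthly demand; multiplying by 100 expresses it as a percentage, matching the cV values reported below:

```python
cv = 100 * demand.std() / demand.mean()  # e.g., 36.94 for the lowest-cV item below
```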
The requirements are therefore not unique. One objective of the study is also to test which forecasting method or replenishment decision is suitable for slow-, medium-, and fast-moving raw materials. Another type of classification is variance analysis, which is commonly used in industry and can support the categorization of SKU items [54]. We thus classify: (i) Item-1: slow-moving, high cost, and low variance; (ii) Item-2: slow-moving, high cost, and high variance; and (iii) Item-3: fast-moving, low cost, and medium variance.
The data were aggregated on a monthly basis and converted to a time series of 46 data points, from April 2019 to January 2023, for the three critical items, with no missing values. These products are mostly sourced from Denmark and Asia and are used in almost every finished product. As mentioned earlier, for all machine learning methods we divide each data set in an 80:20 ratio for training and testing.
As mentioned earlier, whether we focus on forecasting accuracy based on classical measures or on the KPIs, no single method ensures the highest performance across the board. For the item with the lowest cV (36.94), ARIMA(2,1,0) performs best in terms of MAE and RMSE, while SVR has the best MAPE score. From a managerial perspective, LSTM and W-LSTM have the highest fill rate, ARIMA has the lowest AIL and THC, and LSTM and W-LSTM have the fewest stockouts. For the item with the highest cV (64.95), RF and SES perform best on MAE, W-ANN has the best MAPE score, and LSTM has the best RMSE. Looking at FR, W-LSTM performs best, while W-ANN performs best on AIL and THC. Both SVR and W-LSTM have the fewest stockouts.
Noticeably, as the cV increases, for instance for the item with the highest cV, wavelet-incorporated machine learning methods lead to higher performance from a managerial perspective. Advantages of combining wavelet transformation with machine learning include (i) the capture of both smooth and abrupt changes in the data and (ii) the highlighting of both high- and low-frequency components. Most studies considering forecasting measures ignore the effect of industrial KPIs. Our results clearly demonstrate that selecting forecasting techniques based on classical measures (e.g., MAPE, RMSE) might fail to translate into managerial benefits, and that data-driven methods should not be blindly implemented as a forecasting tool in an industrial context. Managers in SMEs face several operational limitations, such as limited technology infrastructure, given that SMEs' industrial data management processes and data quality often lack volume and accuracy [55]. A frequent-stockout scenario (negative AIL) brings no value in terms of delivery performance and might hurt an SME's reputation. Noticeably, a 100% fill rate is an idealistic scenario; some of the forecasting methods in our study (W-LSTM) ensure a 100% fill rate, but the company would have to keep an extensively high inventory, and its capacity may be over-utilized. Still, we found that classical approaches such as ARIMA might be a viable option and, among the data-driven approaches, LSTM is a possible solution. Note that, similar to our study, LSTM was also tested by [56], and the author also found reasonably good performance.
Figure 3 below shows a comparison of the forecasting models' performance for all three items.
SVR does not perform optimally, which is also commonly observed in the literature [57]. Another point the literature supports is that in instances where the coefficient of variation (cV) is low, conventional forecasting models such as SES are often adequate [58]. For data sets with a high cV, the literature posits that machine learning models are highly capable of predicting with high accuracy even in the presence of high volatility [59]. We refer to Appendix C.1 for the parameter settings of both the forecasting and inventory models.
So far, we have compared the performance of data-driven and classical approaches and found that the classical approach can still be implemented, at least in forecasting. For inventory management, however, data-driven solutions such as Q-learning are a possible replenishment solution. We refer to Table 3 and Table 4 for the detailed results.
In terms of total cost, Q-learning outperforms EOQ. EOQ is adequate when considering holding and ordering costs, but in our case, the high ordering cost relative to the low inventory holding cost means the desirable solution is obtained under Q-learning. The results in Table 3 show that the EOQ method orders more frequently than Q-learning and DQN and generally performs much worse. Although DQN is another possible data-driven approach, it also does not perform well in the context of our study. The rationale is that, compared with Q-learning, DQN requires a much larger data set to train the agent properly; this study has only four years of data, which might be a bottleneck for DQN's potential performance [60]. In addition to non-deteriorating items, we also compute the replenishment decision for deteriorating items, with the results presented in Table 4 and Figure 4. We refer to Appendix B for the detailed calculation of the profit function. As expected, the cost increases, and the results demonstrate this fact.

4. Discussion

Over the years, research on forecasting models has gained popularity, and the models have become more advanced. From an academic point of view, statistical methods have shown a strong foundation for improving performance and great promise for smoother operations. However, regarding the practical use of forecasting methods, there is little research on how the models perform in terms of industrial KPIs. This paper explores the selected models' performance from both an academic perspective and the company's perspective.
Similar work on forecasting and inventory management has been performed, but it mainly focuses on either inventory replenishment decisions or improving forecasting accuracy. For instance, Ref. [61] also implements forecasting techniques; however, the author does not consider combining forecasting and inventory management. Similar to our work, Ref. [24] also implements DQN but lacks the practical perspective and only considers inventory management. Our study clearly shows that the two streams should be analyzed together. In the subsequent subsections, we discuss the implications in detail [62].

4.1. Time Series Forecast Implications

The findings presented in Table 5 provide a comprehensive view of the performance of the selected forecasting methods, taking into account both classical academic and managerial KPIs. The SES model performs better than most of the machine learning models in all instances from an academic perspective, making it desirable academically. From a managerial perspective, however, SES does not outperform any of the other forecasting methods: looking at the industrial KPIs in Table 5, it consistently under-forecasts and experiences stockouts in every period for all three items. In contrast, the W-LSTM model performs well from a managerial standpoint despite having higher error scores than SES. W-LSTM achieves a higher FR in all instances and, although it experiences stockouts in some periods, it is still more desirable than SES from a managerial perspective. This balance illustrates the importance of considering KPIs from multiple perspectives when evaluating forecasting models. From a managerial perspective, there has to be a trade-off between high inventory levels and holding costs on the one hand and a 100% FR on the other. In conclusion, there is no single approach to selecting a forecast model; the choice depends heavily on the business context and the trade-offs a company is willing to make between the various KPIs. For instance, a company that values customer satisfaction and wants to avoid stockouts at all costs might prefer the W-LSTM model, while a company that wants to minimize costs might choose the W-ANN model because of its lower AIL and holding cost, at the price of more frequent stockouts and lower customer satisfaction.

4.2. Inventory Management Implications

In terms of ease of implementation, the EOQ model has the upper hand. As one of the most classical inventory management methods, EOQ can be readily implemented using basic mathematical calculations and does not require specialized software. However, its assumptions of constant demand and non-deteriorating items may limit its effectiveness in our scenario of deteriorating items with time-varying demand. Conversely, the Q-learning technique, being a reinforcement learning algorithm, necessitates an advanced understanding of ML and AI; however, it is potentially better suited to our scenario, given its ability to adapt to variable demand and optimize total profit. The resources required for implementing each method vary significantly. The EOQ model, due to its simplicity, has minimal resource requirements; it can be calculated with basic tools and does not demand a high level of technical expertise. In contrast, the Q-learning model requires substantial computational resources and data science expertise. Despite these higher requirements, the superior performance of the Q-learning model, particularly in terms of total profit, may make the investment worthwhile for companies with sufficient resources.

4.3. Time-Varying Demand

Given the actual practice at Svanehøj during the study period, the results in Figure 3 show that Svanehøj's current inventory practice performs worst for item 1 and item 2, while ranking second best for item 3. Among the ML models, Q-learning delivers the best total-profit performance for all three products. One reason Q-learning outperforms the other methods is that it is extremely efficient at keeping inventory levels very low by balancing order quantities against the given demand, ultimately leading to a much higher profit than the other methods. Lastly, the most complex method, DQN, does not show any improvement over Q-learning; one issue is that DQN tends to procure higher quantities than the actual demand. From these comparisons, it is clear that different inventory management techniques have distinct impacts on cost components and total profit for non-deteriorating items with time-varying demand, and that the Q-learning model is superior to DQN, EOQ, and the actual practice.

4.4. Software Consolidation

The proposed framework should be viewed as an ERP extension tool intended to decrease the gap between SMEs and the software consolidations we have highlighted. Additionally, the presented methodology demonstrates how ML models can be an applicable approach to facilitating data-driven decision-making within SMEs, and how ML models, from a forecasting and inventory perspective, can support effective production planning and control. However, the governing input parameters for the ML models must comply with internal domain knowledge, and ML model output must be benchmarked against carefully selected KPIs to facilitate effective model implementation.

5. Conclusions

We focused on the interaction between demand forecasting and inventory planning in the context of SMEs. The empirical results demonstrate that a selection of ML models yields superior performance compared with classical statistical forecasting methods. However, when ML models alone are considered, the results indicate that careful consideration is required, given that model evaluation can be perceived from both an academic and a managerial perspective. The performance indicators for all three items show that the ML models yield different outcomes depending on the nature of the input data and the selected hyper-parameter settings. From both a practical and an academic standpoint, it follows that no single ML model can be relied upon; rather, a selection of ML models must be evaluated in accordance with the framework presented in Figure 1. Secondly, drawing on the empirical findings from the two scenarios, with and without deterioration, it can be deduced that reinforcement learning with the Q-learning algorithm obtains the most preferable economic results in all cases. Furthermore, since RL generates a unique policy based on the provided data and environment, the results show that RL is a promising data-driven approach that can solve inventory problems and generate custom inventory policies for individual items, which can indeed help close the gap between theoretical inventory models and practical industrial use.
As a final remark, this paper helps reduce the industrial research gap identified by [26], which outlines the lack of applications of RL models to real-world problems in production planning and control. Further investigation is required to evaluate performance in a multi-echelon setup. In this study, we aggregated the data on a monthly basis, as the sponsor company operates in the same manner. For a more thorough analysis, sensitivity analysis of data aggregation, a rolling horizon, or more practical demand structures, such as stock-dependent and price-dependent demand, could provide further critical insights.

Author Contributions

Conceptualization, H.J.W., M.H. and S.S.; methodology, H.J.W. and M.H.; software, H.J.W. and M.H.; validation, S.S., G.J.C., I.E.N. and T.S.; formal analysis, H.J.W. and M.H.; investigation, H.J.W. and M.H.; resources, G.J.C. and T.S.; data curation, M.H., G.J.C. and T.S.; writing—original draft preparation, H.J.W., M.H. and S.S.; review and editing, all authors; supervision, S.S.; project administration, S.S.; funding acquisition, I.E.N. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data generated or analyzed during this study are included in this published article. The raw data are available on request from the author ([email protected]).

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. The company had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Appendix A. List of Notations

The following notations are used in this article, as listed in Table A1:
Table A1. Summary of notations used in this study for inventory management.

Symbol | Description
p      | Retail price
h      | Per-unit holding cost
k      | Fixed cost per order
k_1    | Per-unit purchase cost
θ      | Deterioration rate
I(t)   | Instantaneous inventory level at time t
Q      | Order quantity of product in each cycle
D      | Demand
T_1    | Length of each replenishment cycle

Appendix B. Optimization Model for Deteriorating Products

Assume that Q is the amount of inventory replenished at the beginning of each cycle. The inventory level $I(t)$ then decreases due to the quantity demanded and deterioration up to time $T_1$, the length of the replenishment cycle. The inventory at time $t \in (0, T_1)$ is therefore governed by the differential equation
$\frac{dI}{dt} = -D - \theta I(t)$
with the initial condition $I(0) = Q$ and the terminal condition $I(T_1) = 0$. Solving the differential equation, we obtain
$I(t) = Q e^{-\theta t} - \frac{D \left(1 - e^{-\theta t}\right)}{\theta}$
Therefore, the holding cost for the entire cycle is obtained as follows:
$\int_{0}^{T_1} I(t)\, dt = \frac{Q \left(1 - e^{-\theta T_1}\right)}{\theta} - \frac{D}{\theta}\left(T_1 - \frac{1 - e^{-\theta T_1}}{\theta}\right)$
From the boundary condition $I(T_1) = 0$, we obtain the replenishment quantity for each cycle as follows:
$Q = \frac{D \left(e^{\theta T_1} - 1\right)}{\theta}$
The total profit for each cycle is obtained as follows:
TP = <sales revenue> − <holding cost> − <purchase cost> − <ordering cost> − <salvage cost>
Based on Equation (A5), we construct the following two profit functions: (i) the profit function when the product is non-deteriorating ($\theta = 0$) and (ii) the profit function when the product is deteriorating ($0 < \theta \ll 1$).
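As a numeric sanity check of the expressions above, the sketch below evaluates the replenishment quantity Q and the holding integral for illustrative parameters, handling θ = 0 as the non-deteriorating limit:

```python
from math import exp

def cycle_quantities(D, theta, T1):
    """Replenishment quantity Q and the integral of I(t) over one cycle."""
    if theta == 0:                          # non-deteriorating limit: I(t) = D * (T1 - t)
        return D * T1, D * T1 ** 2 / 2
    Q = D * (exp(theta * T1) - 1) / theta
    hold = (Q * (1 - exp(-theta * T1)) / theta
            - (D / theta) * (T1 - (1 - exp(-theta * T1)) / theta))
    return Q, hold

print(cycle_quantities(D=100, theta=0.02, T1=1.0))  # ~ (101.0, 50.3)
```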

Appendix C. Description of Machine Learning Algorithm for Inventory Replenishment Decision

In recent years, data-driven approaches have gained popularity for reducing the computational burden. In this study, we use Q-learning and DQN. Note that, from an implementation perspective, we did not modify the key structure of the algorithms, but extensive sensitivity analysis was conducted before implementing them [22].
Algorithm A1 Q-learning (off-policy): learn the function Q : S × A → ℝ.
Require:
  • States S = {1, …, n_s}, where a state is the inventory level
  • Actions A = {1, …, n_a}, where an action is the replenishment quantity
  • Reward function R : S × A → ℝ, where the reward is the obtained profit
  • Black-box (probabilistic) transition function T : S × A → S
  • Learning rate α ∈ [0, 1], typically α = 0.1
  • Discounting factor γ ∈ [0, 1]
procedure Q-Learning(S, A, R, T, α, γ)
    Initialize Q : S × A → ℝ arbitrarily
    while Q is not converged do
        Start in state s ∈ S
        while s is not terminal do
            Calculate π according to Q and the exploration strategy (e.g., π(s) ← argmax_a Q(s, a))
            a ← π(s)
            r ← R(s, a)                       ▹ Receive the reward
            s′ ← T(s, a)                      ▹ Receive the new state
            Q(s, a) ← (1 − α)·Q(s, a) + α·(r + γ·max_{a′} Q(s′, a′))
            s ← s′
        end while
    end while
    return Q
end procedure

Appendix C.1. Forecasting and Inventory Model Parameters

We conducted extensive sensitivity analysis before selecting hyper-parameters for the machine learning models as well as the classical forecasting approaches; an overview of the parameter settings is presented below in Table A2. As the cV differs across the three products, the parameters need to be adjusted for each model's outcome. We do not integrate any hyper-parameter optimization tools, but in practice an operator might tune the parameters by conducting sensitivity analysis; this is also not a key focus of this study. For all neural-network-based models requiring an activation function, ReLU activation has been used.
Algorithm A2 DQN Algorithm.
1: Initialize replay memory D to capacity N
2: Initialize action-value function Q with random weights θ
3: Initialize target action-value function Q̂ with weights θ⁻ = θ
4: for episode = 1, M do
5:     Initialize sequence s₁ = x₁ and preprocessed sequence φ₁ = φ(s₁)
6:     for t = 1, T do
7:         With probability ε select a random action a_t
8:         otherwise select a_t = argmax_a Q(φ(s_t), a; θ)
9:         Execute action a_t in the emulator and observe reward r_t and image x_{t+1}
10:        Set s_{t+1} = s_t, a_t, x_{t+1} and preprocess φ_{t+1} = φ(s_{t+1})
11:        Store transition (φ_t, a_t, r_t, φ_{t+1}) in D
12:        Sample a random minibatch of transitions (φ_j, a_j, r_j, φ_{j+1}) from D
13:        Set y_j = r_j for terminal φ_{j+1}, and y_j = r_j + γ max_{a′} Q̂(φ_{j+1}, a′; θ⁻) for non-terminal φ_{j+1}
14:        Perform a gradient descent step on (y_j − Q(φ_j, a_j; θ))² with respect to θ
15:        Every C steps reset Q̂ = Q
16:    end for
17: end for
Table A2. Forecasting model parameter settings.

Method | Item-1 | Item-2 | Item-3
SES    | α: 0.2 | α: 0.2 | α: 0.2
ARIMA  | (p,d,q): (2,1,0) | (p,d,q): (1,0,1) | (p,d,q): (1,0,1)
ANN    | Epochs: 200; Neurons: 80, 60; Hidden layers: 2 | Epochs: 100; Neurons: 80, 60; Hidden layers: 2 | Epochs: 200; Neurons: 100, 60; Hidden layers: 2
LSTM   | Epochs: 200; Neurons: 100, 60; Hidden layers: 2 | Epochs: 200; Neurons: 120, 80; Hidden layers: 2 | Epochs: 200; Neurons: 64, 32; Hidden layers: 2
SVR    | Epochs: 200; Neurons: 120, 60; Hidden layers: 2 | Epochs: 200; Neurons: 80, 60; Hidden layers: 2 | Epochs: 100; Neurons: 80, 20; Hidden layers: 2
RF     | Ntree: 961; Mtry: 1; Max_depth: default (None) | Ntree: 1500; Mtry: 1; Max_depth: default (None) | Ntree: 1449; Mtry: 1; Max_depth: default (None)
W-ANN  | Epochs: 200; Neurons: 80, 20; Hidden layers: 2 | Epochs: 200; Neurons: 24, 8; Hidden layers: 2 | Epochs: 100; Neurons: 64, 32; Hidden layers: 2
W-LSTM | Epochs: 100; Neurons: 80, 40; Hidden layers: 2 | Epochs: 200; Neurons: 60, 24; Hidden layers: 2 | Epochs: 200; Neurons: 80, 40; Hidden layers: 2
Similarly, Table A3 presents the parameters used for the Q-learning and DQN policies when determining the optimal procurement schedule. Similar parameter settings are also used by [63].
Table A3. Inventory model parameter settings.

Method     | Item-1 | Item-2 | Item-3
EOQ        | Null | Null | Null
Q-Learning | α: 0.01; γ: 0.99; ε: 1.0; # states: 500; # actions: 500; # episodes: 5000 | α: 0.0001; γ: 0.99; ε: 1.0; # states: 200; # actions: 200; # episodes: 5000 | α: 0.4; γ: 0.2; ε: 1.0; # states: 40,000; # actions: 40,000; # episodes: 500
DQN        | α: 0.0001; γ: 0.99; ε: 1.0; TAU: 0.0001; buffer size: 50,000; batch size: 128; action size: 150; # episodes: 500 | α: 0.00001; γ: 0.99; ε: 1.0; TAU: 0.0001; buffer size: 50,000; batch size: 128; action size: 120; # episodes: 500 | α: 0.1; γ: 0.99; ε: 1.0; TAU: 0.0001; buffer size: 50,000; batch size: 128; action size: 40,000; # episodes: 50

References

  1. Foli, S.; Durst, S.; Davies, L.; Temel, S. Supply chain risk management in young and mature SMEs. J. Risk Financ. Manag. 2022, 15, 328. [Google Scholar] [CrossRef]
  2. Setyaningsih, S.; Kelle, P.; Maretan, A.S. Driver and Barrier Factors of Supply Chain Management for Small and Medium-Sized Enterprises: An Overview. In Economic and Social Development: Book of Proceedings of the 58th International Scientific Conference on Economic and Social Development, Budapest, Hungary, 4–5 September 2020; Varazdin Development and Entrepreneurship Agency: Varazdin, Croatia, 2020; pp. 238–249. [Google Scholar]
  3. Jacobs, R.F.; Weston, T. Enterprise resource planning (ERP)—A brief history. J. Oper. Manag. 2007, 25, 357–363. [Google Scholar] [CrossRef]
  4. Christofi, M.; Nunes, M.; Chao Peng, G.; Lin, A. Towards ERP success in SMEs through business process review prior to implementation. J. Syst. Inf. Technol. 2013, 15, 304–323. [Google Scholar] [CrossRef] [Green Version]
  5. Ahmad, M.M.; Cuenca, R.P. Critical success factors for ERP implementation in SMEs. Robot. Comput. Integr. Manuf. 2013, 29, 104–111. [Google Scholar] [CrossRef]
  6. Kale, P.T.; Banwait, S.S.; Laroiya, S.C. Performance evaluation of ERP implementation in Indian SMEs. J. Manuf. Technol. Manag. 2010, 21, 758–780. [Google Scholar] [CrossRef]
  7. Haddara, M.; Zach, O. ERP systems in SMEs: A literature review. In Proceedings of the 2011 44th Hawaii International Conference on System Sciences, IEEE, Kauai, HI, USA, 4–7 January 2011; pp. 1–10. [Google Scholar]
  8. Olson, D.L.; Johansson, B.; De Carvalho, R.A. Open source ERP business model framework. Robot. Comput. Integr. Manuf. 2018, 50, 30–36. [Google Scholar] [CrossRef]
  9. Syntetos, A.A.; Boylan, J.E.; Disney, S.M. Forecasting for inventory planning: A 50-year review. J. Oper. Res. Soc. 2009, 60, S149–S160. [Google Scholar] [CrossRef]
  10. Pournader, M.; Ghaderi, H.; Hassanzadegan, A.; Fahimnia, B. Artificial intelligence applications in supply chain management. Int. J. Prod. Econ. 2021, 241, 108250. [Google Scholar] [CrossRef]
  11. Zhu, X.; Ninh, A.; Zhao, H.; Liu, Z. Demand forecasting with supply-chain information and machine learning: Evidence in the pharmaceutical industry. Prod. Oper. Manag. 2021, 30, 3231–3252. [Google Scholar] [CrossRef]
  12. Heuts, R.M.J.; Strijbosch, L.W.G.; van der Schoot, E.H.M. A Combined Forecast-Inventory Control Procedure for Spare Parts. Ph.D. Thesis, Faculty of Economics and Business Administration, Tilburg University, Tilburg, The Netherlands, 1999. [Google Scholar]
  13. Chan, S.W.; Tasmin, R.; Aziati, A.N.; Rasi, R.Z.; Ismail, F.B.; Yaw, L.P. Factors influencing the effectiveness of inventory management in manufacturing SMEs. In IOP Conference Series: Materials Science and Engineering; IOP Publishing: Bristol, UK, 2017; Volume 226, p. 012024. [Google Scholar]
  14. Dey, K.; Roy, S.; Saha, S. The impact of strategic inventory and procurement strategies on green product design in a two-period supply chain. Int. J. Prod. Res. 2019, 57, 1915–1948. [Google Scholar] [CrossRef]
  15. Jonsson, P.; Mattsson, S.A. Inventory management practices and their implications on perceived planning performance. Int. J. Prod. Res. 2008, 46, 1787–1812. [Google Scholar] [CrossRef] [Green Version]
  16. Saha, S.; Goyal, S.K. Supply chain coordination contracts with inventory level and retail price dependent demand. Int. J. Prod. Econ. 2015, 161, 140–152. [Google Scholar] [CrossRef]
  17. Perona, M.; Saccani, N.; Zanoni, S. Combining make-to-order and make-to-stock inventory policies: An empirical application to a manufacturing SME. Prod. Plan. Control 2009, 20, 559–575. [Google Scholar] [CrossRef]
  18. Kara, A.; Dogan, I. Reinforcement learning approaches for specifying ordering policies of perishable inventory systems. Expert Syst. Appl. 2018, 91, 150–158. [Google Scholar] [CrossRef]
  19. Kemmer, L.; von Kleist, H.; de Rochebouët, D.; Tziortziotis, N.; Read, J. Reinforcement learning for supply chain optimization. In Proceedings of the European Workshop on Reinforcement Learning, Lille, France, 1–3 October 2018; Volume 14. [Google Scholar]
  20. Peng, Z.; Zhang, Y.; Feng, Y.; Zhang, T.; Wu, Z.; Su, H. Deep reinforcement learning approach for capacitated supply chain optimization under demand uncertainty. In Proceedings of the 2019 Chinese Automation Congress (CAC), IEEE, Hangzhou, China, 22–24 November 2019; pp. 3512–3517. [Google Scholar]
  21. Sultana, N.N.; Meisheri, H.; Baniwal, V.; Nath, S.; Ravindran, B.; Khadilkar, H. Reinforcement learning for multi-product multi-node inventory management in supply chains. arXiv 2020, arXiv:2006.04037. [Google Scholar]
  22. De Moor, B.J.; Gijsbrechts, J.; Boute, R.N. Reward shaping to improve the performance of deep reinforcement learning in perishable inventory management. Eur. J. Oper. Res. 2022, 301, 535–545. [Google Scholar] [CrossRef]
  23. Wang, Q.; Peng, Y.; Yang, Y. Solving Inventory Management Problems through Deep Reinforcement Learning. J. Syst. Sci. Syst. Eng. 2022, 31, 677–689. [Google Scholar] [CrossRef]
  24. Oroojlooyjadid, A.; Nazari, M.; Snyder, L.V.; Takáč, M. A deep q-network for the beer game: Deep reinforcement learning for inventory optimization. Manuf. Serv. Oper. Manag. 2017, 24, 285–304. [Google Scholar] [CrossRef]
  25. Gijsbrechts, J.; Boute, R.N.; Van Mieghem, J.A.; Zhang, D.J. Can deep reinforcement learning improve inventory management? performance on lost sales, dual-sourcing, and multi-echelon problems. Manuf. Serv. Oper. Manag. 2022, 24, 1349–1368. [Google Scholar] [CrossRef]
  26. Esteso, A.; Peidro, D.; Mula, J.; Díaz-Madroñero, M. Reinforcement learning applied to production planning and control. Int. J. Prod. Res. 2022, 61, 2104180. [Google Scholar] [CrossRef]
  27. Cheng, C.; Sa-Ngasoongsong, A.; Beyca, O.; Le, T.; Yang, H.; Kong, Z.; Bukkapatnam, S.T. Time series forecasting for nonlinear and non-stationary processes: A review and comparative study. IEEE Trans. 2015, 47, 1053–1071. [Google Scholar] [CrossRef]
  28. Hanifi, S.; Liu, X.; Lin, Z.; Lotfian, S. A critical review of wind power forecasting methods—Past, present and future. Energies 2020, 13, 3764. [Google Scholar] [CrossRef]
  29. Abadi, M.; Barham, P.; Chen, J.; Chen, Z.; Davis, A.; Dean, J.; Devin, M.; Ghemawat, S.; Irving, G.; Isard, M.; et al. TensorFlow: A system for Large-Scale machine learning. In Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), Savannah, GA, USA, 2–4 November 2016; pp. 265–283. [Google Scholar]
  30. Box, G.; Jenkins, G.; Reinsel, G. Time Series Analysis: Forecasting and Control, 4th ed.; John Wiley & Sons Inc.: Hoboken, NJ, USA, 2008. [Google Scholar]
  31. Khashei, M.; Bijari, M.; Ardali, G.A.R. Improvement of auto-regressive integrated moving average models using fuzzy logic and artificial neural networks (ANNs). Neurocomputing 2009, 72, 956–967. [Google Scholar] [CrossRef]
  32. Pai, P.F.; Lin, K.P.; Lin, C.S.; Chang, P.T. Time series forecasting by a seasonal support vector regression model. Expert Syst. Appl. 2010, 37, 4261–4265. [Google Scholar] [CrossRef]
  33. Yang, H.; Huang, K.; King, I.; Lyu, M.R. Localized support vector regression for time series prediction. Neurocomputing 2009, 72, 2659–2669. [Google Scholar] [CrossRef]
  34. He, W.; Wang, Z.; Jiang, H. Model optimizing and feature selecting for support vector regression in time series forecasting. Neurocomputing 2008, 72, 600–611. [Google Scholar] [CrossRef]
  35. Montesinos López, O.A.; Montesinos López, A.; Crossa, J. Multivariate Statistical Machine Learning Methods for Genomic Prediction; Springer Nature: Berlin, Germany, 2022; p. 691. [Google Scholar]
36. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef]
37. Biau, G.; Scornet, E. A random forest guided tour. Test 2016, 25, 197–227. [Google Scholar] [CrossRef]
38. Tyralis, H.; Papacharalampous, G. Variable selection in time series forecasting using random forests. Algorithms 2017, 10, 114. [Google Scholar] [CrossRef]
  39. Shanmuganathan, S.; Samarasinghe, S. Artificial Neural Network Modelling; Springer International Publishing: Cham, Switzerland, 2016; Volume 628. [Google Scholar]
  40. Zhang, G.; Patuwo, B.E.; Hu, M.Y. Forecasting with artificial neural networks: The state of the art. Int. J. Forecast. 1998, 14, 35–62. [Google Scholar] [CrossRef]
  41. Deka, P.C.; Prahlada, R. Discrete wavelet neural network approach in significant wave height forecasting for multistep lead time. Ocean Eng. 2012, 43, 32–42. [Google Scholar] [CrossRef]
42. Nury, A.H.; Hasan, K.; Alam, M.J.B. Comparative study of wavelet-ARIMA and wavelet-ANN models for temperature time series data in northeastern Bangladesh. J. King Saud Univ. Sci. 2017, 29, 47–61. [Google Scholar] [CrossRef]
43. Sharma, N.; Zakaullah, M.; Tiwari, H.; Kumar, D. Runoff and sediment yield modeling using ANN and support vector machines: A case study from Nepal watershed. Model. Earth Syst. Environ. 2015, 1, 23. [Google Scholar] [CrossRef]
  44. Abbasimehr, H.; Shabani, M.; Yousefi, M. An optimized model using LSTM network for demand forecasting. Comput. Ind. Eng. 2020, 143, 106435. [Google Scholar] [CrossRef]
45. Greff, K.; Srivastava, R.K.; Koutník, J.; Steunebrink, B.R.; Schmidhuber, J. LSTM: A search space odyssey. IEEE Trans. Neural Netw. Learn. Syst. 2016, 28, 2222–2232. [Google Scholar] [CrossRef]
  46. Wei, Y.; Bai, L.; Yang, K.; Wei, G. Are industry-level indicators more helpful to forecast industrial stock volatility? Evidence from Chinese manufacturing purchasing managers index. J. Forecast. 2021, 40, 17–39. [Google Scholar] [CrossRef]
  47. Sun, C.; Shrivastava, A.; Singh, S.; Gupta, A. Revisiting Unreasonable Effectiveness of Data in Deep Learning Era. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017. [Google Scholar]
48. Stevenson, W.J. Operations Management; McGraw-Hill Irwin: Boston, MA, USA, 2018. [Google Scholar]
  49. Cuartas, C.; Aguilar, J. Hybrid algorithm based on reinforcement learning for smart inventory management. J. Intell. Manuf. 2023, 34, 123–149. [Google Scholar] [CrossRef]
50. Van Hasselt, H.; Guez, A.; Silver, D. Deep reinforcement learning with double Q-learning. In Proceedings of the AAAI Conference on Artificial Intelligence, Phoenix, AZ, USA, 12–17 February 2016; Volume 30. [Google Scholar]
  51. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
  52. Selukar, M.; Jain, P.; Kumar, T. Inventory control of multiple perishable goods using deep reinforcement learning for sustainable environment. Sustain. Energy Technol. Assess. 2022, 52, 102038. [Google Scholar] [CrossRef]
  53. Boute, R.N.; Gijsbrechts, J.; Van Jaarsveld, W.; Vanvuchelen, N. Deep reinforcement learning for inventory control: A roadmap. Eur. J. Oper. Res. 2022, 298, 401–412. [Google Scholar] [CrossRef]
54. Mishra, P.; Rao, U.S. Concentration vs. Inequality Measures of Market Structure: An Exploration of Indian Manufacturing. Econ. Political Wkly. 2014, 49, 59–65. [Google Scholar]
  55. Omri, N.; Al Masry, Z.; Mairot, N.; Giampiccolo, S.; Zerhouni, N. Industrial data management strategy towards an SME-oriented PHM. J. Manuf. Syst. 2020, 56, 23–36. [Google Scholar] [CrossRef]
  56. Gamboa, J.C.B. Deep learning for time-series analysis. arXiv 2017, arXiv:1701.01887. [Google Scholar]
  57. Chhajer, P.; Shah, M.; Kshirsagar, A. The applications of artificial neural networks, support vector machines, and long-short term memory for stock market prediction. Decis. Anal. J. 2022, 2, 100015. [Google Scholar] [CrossRef]
  58. Ghobbar, A.A.; Friend, C.H. Evaluation of forecasting methods for intermittent parts demand in the field of aviation: A predictive model. Comput. Oper. Res. 2003, 30, 2097–2114. [Google Scholar] [CrossRef]
  59. Praveen, U.; Farnaz, G.; Hatim, G. Inventory management and cost reduction of supply chain processes using AI based time-series forecasting and ANN modeling. Procedia Manuf. 2019, 38, 256–263. [Google Scholar] [CrossRef]
60. Osband, I.; Blundell, C.; Pritzel, A.; Van Roy, B. Deep exploration via bootstrapped DQN. In Proceedings of the 30th Conference on Neural Information Processing Systems: Advances in Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; Volume 29. [Google Scholar]
61. Kolková, A.; Ključnikov, A. Demand forecasting: AI-based, statistical and hybrid models vs. practice-based models - the case of SMEs and large enterprises. Econ. Sociol. 2022, 15, 39–62. [Google Scholar] [CrossRef]
  62. Panda, S.; Saha, S.; Basu, M. An EOQ model with generalized ramp-type demand and Weibull distribution deterioration. Asia-Pac. J. Oper. Res. 2007, 24, 93–109. [Google Scholar] [CrossRef]
63. Hansen, S. Using deep Q-learning to control optimization hyperparameters. arXiv 2016, arXiv:1602.04062. [Google Scholar]
Figure 1. Evaluation procedure.
Figure 2. Product categorization.
Figure 3. Actual quantity versus forecast quantity.
Figure 4. Results for non-deteriorating items with time-varying demand.
Table 1. Literature review: applications of machine learning in inventory replenishment decisions.

[18] Setting: An ordering/replenishment problem in a retailer setting with the objective of minimizing cost; a fixed supplier lead time is assumed. Approach: Q-learning and SARSA algorithms are used to find near-optimal replenishment policies for perishable products, and outcomes are compared across six scenarios. Findings: The discount factor, learning rate, and exploration rate influence the model's learning performance. Practical implications: An ordering policy that incorporates age information alongside inventory quantity outperforms quantity-dependent policies; the cost analysis across demand variability, lead time, cost ratio, and product lifetime offers comprehensive insight.

[19] Setting: A multi-echelon supply chain with a factory and multiple warehouses under stochastic, seasonal demand. Approach: The SARSA and REINFORCE algorithms, with an agent acting on an (s,Q) policy, are evaluated under a simple and a complex scenario. Findings: Both SARSA and REINFORCE outperform the plain (s,Q) policy. Practical implications: Agents can be programmed to cope with demand trends, production levels, and stock allocation, and can operate and take action under simple market conditions.

[20] Setting: A multi-echelon supply chain with plant, warehouse, retailer, and customers under a simple, a complex, and a special case, each with different settings. Approach: A vanilla policy-gradient algorithm solves the supply chain problem through profit maximization, with production and quantity targets modeled as discrete or continuous variables; the results are then compared with an (r,Q) policy. Findings: In all three cases, the DRL agent outperforms the (r,Q) policy. Practical implications: Both action clipping and the output activation function yield better results than the (r,Q) policy; the model-free DRL agents do not require transition probabilities and can therefore act without knowing the demand distribution.

[21] Setting: A multi-echelon system (supplier, warehouse, transportation, store) in which heuristic algorithms frequently result in high operating costs. Approach: Six methods are compared: Multi-Agent Reinforcement Learning (MARL), Mixed Integer Linear Programming (MILP), Adaptive Control (AC), Model-Predictive Control (MPC), Imitation Learning (IL), and Approximate Dynamic Programming. Findings: A correctly specified RL model supports parallelized decision-making and can further help minimize waste of perishable products. Practical implications: Heuristics such as (s,S) or (R,Q) are not feasible for multi-objective reward systems; the findings, based on actual data, demonstrate the potential strengths of an RL system that handles complex environment settings with multiple objectives.

[22] Setting: A classical single-item perishable inventory problem with stochastic demand in a single-echelon setting with periodic review and a fixed lead time. Approach: Reward shaping, based on two heuristic teachers that modify the reward as the DRL agent transitions to a new state, is used to counter DRL drawbacks and improve the inventory policy. Findings: Reward shaping improves the DRL learning process and reduces cost and variability compared with an unshaped DRL model. Practical implications: Reward shaping reduces computational requirements; a reward-shaped DQN learns faster and yields a better policy than an unshaped DQN, and using known inventory policies as teachers can ease the integration of DRL models in companies.

[23] Setting: A multi-echelon inventory system with lost sales and a complex cost structure. Approach: DQN and DDQN frameworks solve the stated inventory problem, and the results are compared with heuristic algorithms. Findings: The DRL models adapt to inventory problems with complex cost structures that rule-based heuristics struggle to cope with. Practical implications: Incorporating two or more historical inventory records improves the DDQN's performance; sharing parameter settings within the supply chain further enhances the DRL model's performance and decreases volatility during training.

[24] Setting: A multi-echelon supply chain based on the beer game. Approach: The DQN algorithm is reconfigured to operate in a cooperative environment. Findings: The DQN agents obtain near-optimal policies when compared with agents that follow a base-stock policy. Practical implications: Transfer learning makes the DQN agents flexible enough to cope with different cost structures and settings without extensive retraining; DQN shows promising numerical results in a supply chain setting with real-time information sharing between chain entities.

[25] Setting: Three inventory problems: (1) lost sales, (2) dual sourcing, and (3) multi-echelon inventory management. Approach: The problems are formulated as Markov decision processes and solved with the Asynchronous Advantage Actor-Critic (A3C) algorithm. Findings: The paper provides a proof of concept that deep reinforcement learning can solve classic inventory problems; the A3C algorithm adapts well to stochastic environments. Practical implications: Sensitivity analyses of the optimality gaps between state-of-the-art heuristic policies and A3C show that A3C performs well for long lead times and achieves a higher overall performance than the heuristic baselines.
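To make the tabular RL approaches surveyed above concrete, the following is a minimal Q-learning sketch for a single-item, periodic-review replenishment problem with lost sales. Every parameter (Poisson demand, cost values, state and action bounds) is an illustrative assumption, not the setting of any cited study or of the case company.

```python
import numpy as np

# Illustrative parameters (not taken from any cited study)
rng = np.random.default_rng(0)
MAX_INV, MAX_ORDER = 20, 10             # inventory (state) and order (action) bounds
HC, OC, VOC, SP = 1.0, 10.0, 5.0, 9.0   # holding, ordering, unit, selling price
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1      # learning rate, discount, exploration

Q = np.zeros((MAX_INV + 1, MAX_ORDER + 1))

def step(inv, order):
    """One review period: receive the order, observe demand, compute reward."""
    inv = min(inv + order, MAX_INV)
    demand = rng.poisson(4)
    sold = min(inv, demand)             # unmet demand is lost
    inv -= sold
    reward = SP * sold - HC * inv - VOC * order - (OC if order > 0 else 0.0)
    return inv, reward

inv = 0
for t in range(50_000):
    # epsilon-greedy action selection
    if rng.random() < EPS:
        a = int(rng.integers(0, MAX_ORDER + 1))
    else:
        a = int(np.argmax(Q[inv]))
    nxt, r = step(inv, a)
    # one-step Q-learning update
    Q[inv, a] += ALPHA * (r + GAMMA * Q[nxt].max() - Q[inv, a])
    inv = nxt

# Learned greedy order quantity per inventory level
print({s: int(np.argmax(Q[s])) for s in range(MAX_INV + 1)})
```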
Table 2. Cost parameters for all three items.

Item ID | Holding Cost (HC) | Ordering Cost (OC) | Variable Order Cost (VOC) | Selling Price (SP)
Item-1 | 119 | 500 | 1193 | 10,896
Item-2 | 3443 | 500 | 8010 | 34,443
Item-3 | 0.315 | 500 | 3.15 | 31
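These parameters drive the EOQ baseline reported in Tables 3 and 4 via the classical formula Q* = sqrt(2 · D · OC / HC). The sketch below applies that formula to the Table 2 values, treating the demand totals of Table 3 as the demand rate D; this time-bucketing is an illustrative assumption, so the resulting order counts need not match the reported ones exactly.

```python
from math import sqrt

# Cost parameters from Table 2; D taken from the Table 3 demand totals
# (an illustrative assumption about the demand rate's time bucket).
items = {
    "Item-1": {"D": 4077,    "OC": 500, "HC": 119},
    "Item-2": {"D": 853,     "OC": 500, "HC": 3443},
    "Item-3": {"D": 363_233, "OC": 500, "HC": 0.315},
}

for name, p in items.items():
    q_star = sqrt(2 * p["D"] * p["OC"] / p["HC"])   # classical EOQ
    n_orders = p["D"] / q_star                       # implied number of orders
    print(f"{name}: Q* = {q_star:.0f} units, about {n_orders:.1f} orders")
```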
Table 3. Results for non-deteriorating items with time-varying demand.

Item-1
Method | Holding Cost | Ordering Cost | Variable Order Cost | Total Profit | Total Demand | Total Procured | Number of Orders
Actual | 743,752 | 19,000 | 4,823,299 | 38,466,476 | 4077 | 4043 | 35
EOQ | 486,386 | 15,500 | 4,863,861 | 39,057,150 | 4077 | 4077 | 31
Q-Learning | 288,944 | 19,500 | 4,863,861 | 39,250,686 | 4077 | 4077 | 39
Deep Q Network | 395,598 | 13,500 | 4,896,072 | 39,117,821 | 4077 | 4104 | 27

Item-2
Actual | 2,313,696 | 17,500 | 6,840,540 | 20,199,613 | 853 | 854 | 35
EOQ | 2,937,364 | 38,500 | 6,832,530 | 19,571,740 | 853 | 853 | 77
Q-Learning | 860,750 | 17,500 | 6,832,530 | 21,660,569 | 853 | 853 | 35
Deep Q Network | 2,741,836 | 5500 | 6,960,690 | 19,672,109 | 853 | 869 | 11

Item-3
Actual | 238,537 | 16,000 | 1,142,766 | 9,848,969 | 363,233 | 362,783 | 32
EOQ | 114,418 | 8000 | 1,144,183 | 9,997,689 | 363,233 | 363,233 | 16
Q-Learning | 83,026 | 13,000 | 1,144,183 | 10,020,013 | 363,233 | 363,233 | 26
Deep Q Network | 155,494 | 9000 | 1,206,746 | 9,888,982 | 363,233 | 383,094 | 18
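Up to small rounding differences, the Item-1 profits in Table 3 reconcile with revenue on units actually served minus the three cost components, using the Item-1 selling price from Table 2 (the Actual policy, for instance, serves only 4043 of the 4077 demanded units). A short sanity-check sketch:

```python
# Reconciling Table 3 (Item-1) with the Table 2 selling price:
# profit ≈ SP * units_sold − holding − ordering − variable order cost,
# with units_sold = min(total demand, total procured).
SP = 10_896  # Item-1 selling price (Table 2)

rows = {  # holding, ordering, variable, demand, procured (Table 3, Item-1)
    "Actual":         (743_752, 19_000, 4_823_299, 4077, 4043),
    "EOQ":            (486_386, 15_500, 4_863_861, 4077, 4077),
    "Q-Learning":     (288_944, 19_500, 4_863_861, 4077, 4077),
    "Deep Q Network": (395_598, 13_500, 4_896_072, 4077, 4104),
}

for name, (hc, oc, voc, demand, procured) in rows.items():
    sold = min(demand, procured)
    profit = SP * sold - hc - oc - voc
    print(f"{name}: reconstructed profit = {profit:,}")
```

Running the sketch reproduces the Table 3 profit column to within about a hundred currency units, which is consistent with rounding in the reported cost figures.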
Table 4. Results for a constant-deteriorating item with time-varying demand (Item-3).

Technique | Holding Cost | Ordering Cost | Variable Order Cost | Total Profit | Total Demand | Total Procured
EOQ | 120,998 | 8000 | 1,140,551 | 9,877,461 | 363,233 | 363,233
Q-Learning | 83,026 | 17,000 | 1,162,031 | 9,998,166 | 363,233 | 364,175
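Table 4 shows Q-learning procuring 364,175 units against a total demand of 363,233; the 942-unit gap is stock lost to constant-rate deterioration. The sketch below illustrates such an inventory balance, where the deterioration rate THETA and the order/demand series are illustrative assumptions:

```python
# Minimal sketch of a constant-rate deteriorating inventory balance,
# matching the setting of Table 4: each period a fraction THETA of the
# on-hand stock is lost, so total procurement must exceed total demand.
THETA = 0.02  # assumed deterioration rate per period (illustrative)

def simulate(orders, demands):
    on_hand, procured, served = 0.0, 0.0, 0.0
    for q, d in zip(orders, demands):
        on_hand += q
        procured += q
        sold = min(on_hand, d)
        served += sold
        on_hand -= sold
        on_hand *= (1 - THETA)  # deterioration after demand is served
    return procured, served

procured, served = simulate(orders=[120] * 12, demands=[100] * 12)
print(f"procured {procured:.0f} units to serve {served:.0f} units")
```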
Table 5. Implication of forecasting under different KPIs based on the original data.

Item-1 (cV = 36.94)
Method | FR (%) | AIL (qty.) | THC | # of Stockouts | MAE | MAPE (%) | RMSE
SES | 89.3 | −87 | 0 | 12 | 88.5 | 1107 | 104.57
ARIMA | 98.56 | 12.25 | 61,200 | 6 | 73.91 | 1016 | 94.56
ANN | 66.46 | −315 | 0 | 12 | 106.25 | 710.28 | 138.21
LSTM | 100 | 308 | 441,529 | 0 | 173.91 | 3795 | 208.39
SVR | 15.42 | −797 | 0 | 12 | 125.66 | 82.84 | 164.69
RF | 92.69 | −17.5 | 20,519 | 6 | 100.25 | 1232 | 116.39
W-ANN | 96.24 | 14.5 | 81,124 | 5 | 99 | 1212 | 127.09
W-LSTM | 100 | 150 | 216,290 | 0 | 127.08 | 1496 | 152.94

Item-2 (cV = 64.95)
SES | 63 | −107 | 0 | 12 | 17.5 | 204.18 | 29.5
ARIMA | 73.85 | −83 | 0 | 12 | 19.16 | 271.80 | 30.5
ANN | 83.35 | −302 | 10,023 | 10 | 19.75 | 282.87 | 29.91
LSTM | 52.06 | −117 | 0 | 12 | 21.83 | 169.93 | 28.5
SVR | 90.14 | 373 | 15,490,057 | 2 | 90.16 | 836.03 | 130.57
RF | 62.23 | −111 | 0 | 12 | 17.5 | 221.79 | 30.87
W-ANN | 92.57 | 5.75 | 946,825 | 6 | 34.58 | 129.85 | 41.12
W-LSTM | 98.42 | 18 | 960,597 | 2 | 30.91 | 275.46 | 39.62

Item-3 (cV = 44.91)
SES | 71.61 | −27,103 | 0 | 12 | 4543 | 31.4 | 5644
ARIMA | 69.52 | −28,458 | 0 | 12 | 5094 | 35.82 | 6188
ANN | 61.90 | −41,384 | 0 | 12 | 4794 | 33.22 | 6485
LSTM | 96.85 | 2508 | 14,395 | 5 | 3669 | 30.53 | 4055
SVR | 90.56 | 16,679 | 26,270 | 7 | 9905 | 90.62 | 13,307
RF | 57.58 | −39,752 | 0 | 12 | 6283 | 45.03 | 7338
W-ANN | 93 | 1742 | 15,985 | 4 | 5562 | 43.38 | 5768
W-LSTM | 100 | 16,016 | 60,543 | 0 | 11,434 | 83 | 12,611
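The KPI formulas behind Table 5 are defined in the body of the paper; the sketch below uses generic textbook stand-ins (fill rate as the served fraction of demand, average inventory level and stockouts from a cumulative forecast-minus-actual balance), so it illustrates the mechanics rather than reproducing the exact figures of Table 5.

```python
import numpy as np

def forecast_kpis(actual, forecast):
    """Generic stand-in definitions; the paper's exact KPI formulas may differ."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    err = actual - forecast
    # Inventory trajectory if replenishment simply followed the forecast
    inv_level = np.cumsum(forecast - actual)
    return {
        "FR (%)": 100 * np.minimum(actual, forecast).sum() / actual.sum(),
        "AIL (qty.)": inv_level.mean(),
        "# of Stockouts": int((inv_level < 0).sum()),
        "MAE": np.abs(err).mean(),
        "MAPE (%)": 100 * np.abs(err / actual).mean(),
        "RMSE": float(np.sqrt((err ** 2).mean())),
    }

# Toy three-period example with assumed actual and forecast series
print(forecast_kpis(actual=[100, 120, 90], forecast=[110, 100, 95]))
```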