Prediction Framework with Kalman Filter Algorithm

This article describes the autonomous open data prediction framework (AODPF), which is in its infancy and is designed to automate predictions from a variety of mostly external data sources. The framework has been implemented with the Kalman filter approach, and an experiment with data from road maintenance weather stations is being performed. The framework is written in the Python programming language and is published on GitHub together with all currently available results. The experiment uses time-series data from 34 weather stations, and the measurement being predicted is the dew point. The framework is published as a web service so that it can be integrated with ERP systems and reused.


Introduction
Data unevenness and data availability are among the essential concerns for information systems today. Enterprise resource planning (ERP) systems support most business processes, and as the business value they deliver grows, the credibility of their data must be ensured. Data processing logic is regarded as complex decision-making logic if it relies on analytical or managerial models to determine a course of action in business process execution; such logic often requires domain-specific knowledge. Examples of decision-making logic are inventory replenishment, road-network maintenance, and production planning decisions. The demand for accurate decisions has grown, forcing ERP systems to improve the decision-making process at both the strategic and operational levels by providing the necessary information, tools, and capabilities. Many modules contain sophisticated forecasting methods that form part of this decision-making logic. Forecasting is the process of predicting the future based on past data, most often through trend analysis. Businesses need forecasting to gain higher profit and to keep their business processes running successfully. ERP systems have limited forecasting capabilities that are built into the system code, and enterprises spend much money modifying existing methods to satisfy their requirements. Some ERP systems do not have enough forecasting functionality at all. This creates an opportunity to enhance ERP systems with predictive capabilities. Facts and assumptions arising from various data sources may be useful to the decision-maker, and business decisions based on these data can lead to business gain more effectively. However, many of these data are not available in ERP systems, so the decision cannot be made with sufficient objectivity at the time it is needed.
Multiple data sources and forecasting algorithms need to be brought into ERP systems to increase reliability and to support more efficient decisions drawn from numerous references within the same period. ERP systems are well known to be complex, and changing them requires appropriate specialists to perform the necessary programming work; outsourcing therefore comes in handy when such specialists are lacking [1][2][3][4].
One of the most advanced outsourcing approaches is software as a service (SaaS), which streamlines the service for the company and brings several advantages. The problem to be addressed is how to extend ERP systems with capabilities based on external data sources. External data make it possible to increase the accuracy of forecasts and enable the decision-maker to make a more valuable and accurate decision in a given situation. Adding different data sources through an integrated approach can increase the accuracy of the outcome. This goal can be achieved by combining external data into a single mathematical model and then into a single algorithm under one framework. This is one of the initial tasks accomplished in this article, which presents the actual achievements and a detailed solution based on road maintenance case data obtained from meteorological stations located within the territory of the Republic of Latvia. The part of the autonomous open data prediction framework (AODPF) presented here reflects the results gained during the research process [5].
Road maintenance works are diverse and specific, ranging from laying the road surface to daily maintenance activities. Many stakeholders, including managers, dispatchers, and maintenance teams, are involved, and maintenance decisions need to be made on time [6]. Forecasting enables proactive maintenance by providing advance information about the required activities. Anti-slip maintenance is performed only in winter and consists of applying anti-slip materials to the road surface at a specific time and place. Today it relies on live contextual data from many different sources [7], including open data sources [8], and the decision-making results depend significantly on data availability [9]. Delays may affect the essential operations of the entire road network. One example is driving conditions at high traffic speeds: the higher the speed, the greater the probability of accidents [10]. Road conditions are known to change rapidly with surface temperature and precipitation. The largest fluctuations are observed during the winter period, when road conditions are also affected by snow and icing. Road maintenance is performed for specific road sections belonging to a region. The region studied here is the road network of the Republic of Latvia, and the dataset is taken from the State Joint Stock Company "Latvian State Roads". The region contains 52 road monitoring weather stations that are relatively distant from each other. To make it possible to respond to changes in the environment, road monitoring weather stations and cameras operated by different entities are located on road sections near the road surface. The road monitoring weather stations collect raw observations, which are processed to produce the forecasts needed for future decisions [11].
For road maintainers, these predictions are crucial for making daily decisions, and the road monitoring weather stations and cameras operated by different entities support that decision-making. Road maintainers also control smart road signs, which can display warning messages to drivers on a specific section of the road network [12]. Missing information in the weather station time series is unavoidable, because complete observation of all continuous processes is almost impossible [13].
AODPF as a forecasting framework is described in detail, starting from the original data, called raw data. After the data are received, they are processed to make them more readable as input and ready for the next steps. Each step of the framework is reflected in the programming code published on GitHub. GitHub is a company that provides hosting for software development version control using Git, a distributed version control system for tracking changes in source code during software development. The demonstration in this article is based on the Kalman filter algorithm. Kalman filtering, also known as linear quadratic estimation, is an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, to estimate the state of a system. Previous results already address the general mathematical approach of the Kalman filter algorithm [14].
The main goal is to reflect the results achieved, to consolidate the understanding of the work still to be done, and to present the results in a logically structured way as a proof of concept. The subject of the research is the forecasting capabilities of ERP systems.
The structure of the study reflects the aim mentioned above. The following section discusses the materials and methods within the autonomous open data prediction framework. Section 3 narrows down to the results of the experiment. Section 4 closes with the conclusions and suggestions for future research.

Autonomous Open Data Prediction Framework (AODPF)
AODPF is defined as a framework that performs forecasting with open data (OD) from different sources and operates autonomously. In other words, it is a framework that can be driven by managers who do not have to delve into the essence of the algorithms or of the forecasting process. AODPF should provide automated algorithm selection and deliver results to ERP systems. One of its most important advantages is the integration of several data sources into a single framework and the possibility of linking the results with ERP systems. Managers could configure and modify these data sources to improve the decision-making process under specific circumstances. A final feature is general applicability: the framework should be usable in any other business process, with any other data sources, and with different forecasting methods and approaches. This subsection outlines the components of AODPF, including those that are to be implemented in the future.
The starting point is the origin of the raw data. There are three ways to transfer data to AODPF: flat file, database, and application programming interface (API). Currently, only the database route is implemented. The API passes data outwards but not inwards, and the flat file route is not implemented in AODPF but can be realized quickly if needed. Data processing is a long process when it comes to big data. AODPF is therefore built on the principle of handling big data, so that it continues to work at its usual pace when faced with large volumes. The data can be processed with data stream tools, which allow data sources to be combined into a single data set. Data may also go missing, which is why different methods are needed to replace the missing values before a prediction can be made. The first three processes of AODPF are illustrated in Figure 1.
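The text above notes that missing observations have to be replaced before prediction but does not name a method. As one possible illustration, the sketch below fills gaps in a single time series by linear interpolation, copying the nearest known value at the edges. The function name `fill_missing_linear`, the use of `None` to mark a gap, and the sample values are assumptions made for this example; they are not taken from the AODPF source code.

```python
from typing import List, Optional


def fill_missing_linear(values: List[Optional[float]]) -> List[float]:
    """Fill None gaps by linear interpolation between known neighbours.

    Hypothetical helper for illustration only. Gaps at the start or end
    of the series are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("series contains no observations")

    filled = list(values)

    # Leading gap: copy the first known value backwards.
    for i in range(known[0]):
        filled[i] = values[known[0]]

    # Trailing gap: copy the last known value forwards.
    for i in range(known[-1] + 1, len(values)):
        filled[i] = values[known[-1]]

    # Interior gaps: interpolate between each pair of known neighbours.
    for left, right in zip(known, known[1:]):
        span = right - left
        for i in range(left + 1, right):
            weight = (i - left) / span
            filled[i] = values[left] + weight * (values[right] - values[left])

    return filled


# Made-up dew point readings with gaps (None marks a missing observation).
dew_points = [None, -2.0, None, None, -5.0, -4.0, None]
print(fill_missing_linear(dew_points))
```

Linear interpolation is only one option; for longer outages a model-based replacement may be more appropriate.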
Information 2020, 11, x FOR PEER REVIEW 3 of 8

After the data have been received and processed, the best model selection can start. It consists of time-series prediction, the algorithm knowledge base, and the accuracy summary. Time-series prediction is a process that repeats itself until it finds the best model, which is then used for full-scale prediction on the entire data set. It runs every algorithm available in the knowledge base, compares the results with each other, and uses the algorithm that achieved the best accuracy. All results are recorded in the accuracy summary, which allows the process to determine the best algorithm for a particular data source. Currently, this part of AODPF is still in progress and the knowledge base contains only the Kalman filter approach. The reason is that development is in its initial phase, and the work started with this method to establish that forecasting can be carried out at all.
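The selection loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the AODPF implementation: the two baseline forecasters (`naive_last_value`, `moving_average`) are placeholders standing in for entries of the algorithm knowledge base, accuracy is measured by a rolling one-step-ahead RMSE, and all names are hypothetical.

```python
import math
from typing import Callable, Dict, List, Tuple

# A forecaster maps the history seen so far to a one-step-ahead forecast.
Forecaster = Callable[[List[float]], float]


def naive_last_value(history: List[float]) -> float:
    """Placeholder model: repeat the last observed value."""
    return history[-1]


def moving_average(history: List[float], window: int = 3) -> float:
    """Placeholder model: mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def rolling_rmse(series: List[float], forecaster: Forecaster, warmup: int = 3) -> float:
    """One-step-ahead RMSE, forecasting each point from the points before it."""
    squared_errors = []
    for t in range(warmup, len(series)):
        forecast = forecaster(series[:t])
        squared_errors.append((forecast - series[t]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def select_best_model(
    series: List[float], knowledge_base: Dict[str, Forecaster]
) -> Tuple[str, Dict[str, float]]:
    """Score every model and return the best name plus the accuracy summary."""
    accuracy_summary = {
        name: rolling_rmse(series, model) for name, model in knowledge_base.items()
    }
    best = min(accuracy_summary, key=accuracy_summary.get)
    return best, accuracy_summary


knowledge_base = {"naive": naive_last_value, "moving_average": moving_average}
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # made-up data with a linear trend
best, summary = select_best_model(series, knowledge_base)
print(best, summary)
```

Keeping the accuracy summary as a plain dictionary keyed by algorithm name makes it easy to store one summary per data source, which is what the text requires for choosing the best algorithm for each source.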
Because the Kalman filter is the only algorithm available, the autonomous selection of an algorithm from the knowledge base is not currently implemented. However, the results that AODPF is intended to achieve in the future can be seen in Figure 2.
For ERP system managers, who are also the owners of the decision support system, successful decision-making requires integration between AODPF and ERP systems. The framework must also be able to provide forecasts regularly, with high speed, accuracy, and connectivity. This can be provided by business intelligence (BI), which sets rules and specific triggers. At a particular time, a single observation from a data source would fire a specific event trigger, which in turn would give ERP system decision-making an additional criterion.
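A rule-and-trigger mechanism of this kind could be sketched as below. The source does not describe how BI rules are represented, so everything here is an assumption for illustration: the `TriggerRule` class, the rule names, the thresholds, and the shape of the observation dictionary are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class TriggerRule:
    """Hypothetical BI rule: fires when `condition` holds for `measurement`."""
    name: str
    measurement: str
    condition: Callable[[float], bool]


def fired_triggers(observation: Dict[str, float], rules: List[TriggerRule]) -> List[str]:
    """Return the names of all rules whose condition holds for this observation."""
    return [
        rule.name
        for rule in rules
        if rule.measurement in observation
        and rule.condition(observation[rule.measurement])
    ]


# Illustrative rules with made-up thresholds.
rules = [
    TriggerRule("dew_point_below_zero", "dew_point", lambda v: v < 0.0),
    TriggerRule("dew_point_above_ten", "dew_point", lambda v: v > 10.0),
]

observation = {"dew_point": -1.5}  # one forecast value for one station
print(fired_triggers(observation, rules))
```

Each fired rule name could then be passed to the ERP system as an additional decision criterion; how that hand-over happens depends on the data connector and is not shown here.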
As many such triggers as needed can be configured, so that the BI component covers all requirements. A data connector is needed to connect AODPF to ERP systems, and nowadays this is mostly done with the help of APIs. AODPF provides such an API, which can be tested by setting up AODPF and following the instructions on GitHub; the repository is listed under Supplementary Materials. The connection between AODPF and ERP systems can be seen in Figure 3, which also outlines the ERP system itself with its data layer and in-memory database. The experiment with road maintenance using the Kalman filter approach is discussed below.

Case Study on Road Maintenance Using a Kalman Filter Approach
To better understand the theory and the AODPF concept, a practical example that can be carried out with AODPF is needed. This section presents a case study on road maintenance using the Kalman filter approach.
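The Kalman filtering process has two stages: the prediction step and the update step. In the prediction step, the state of the system is projected forward from the previous estimates; in the update step, the current state of the system is corrected using the measurement observed at that time step. The two steps are expressed by the following equations [15]:
• Prediction:
$$\hat{x}_{k|k-1} = F_k \hat{x}_{k-1|k-1} + B_k u_k$$
$$P_{k|k-1} = F_k P_{k-1|k-1} F_k^{\mathsf{T}} + Q_k$$
• Update step:
$$\tilde{y}_k = z_k - H_k \hat{x}_{k|k-1}$$
$$S_k = H_k P_{k|k-1} H_k^{\mathsf{T}} + R_k$$
$$K_k = P_{k|k-1} H_k^{\mathsf{T}} S_k^{-1}$$
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + K_k \tilde{y}_k \tag{5}$$
$$P_{k|k} = (I - K_k H_k) P_{k|k-1}$$
$$\tilde{y}_{k|k} = z_k - H_k \hat{x}_{k|k}$$
where $\hat{x}_{k|k-1}$ and $P_{k|k-1}$ are the predicted state mean and covariance, respectively, at time step $k$ before seeing the measurement; $\hat{x}_{k|k-1}$ is a vector whose length is the size of the state. $F_k$ is the state-transition model and $Q_k$ is the process noise covariance. $B_k$ is the control-input model, which is applied to the control vector $u_k$. $z_k$ is the observation at time step $k$, $H_k$ is the observation model, and $R_k$ is the observation noise covariance. $\tilde{y}_k$ is the innovation (measurement pre-fit residual) at time step $k$, and $S_k$ is the innovation covariance. $K_k$ is the optimal Kalman gain, which determines how much the prediction is corrected at time step $k$. $\hat{x}_{k|k}$ is the updated state estimate, $P_{k|k}$ is the updated estimate covariance, and $\tilde{y}_{k|k}$ is the measurement post-fit residual [15].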
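The prediction and update steps described above can be illustrated with a scalar filter. The sketch below is a simplified special case chosen for readability, not the AODPF implementation: it assumes a one-dimensional state, a state-transition model and observation model both equal to 1, and no control input, so the matrices reduce to plain numbers. The function name `kalman_1d` and the default noise variances are assumptions for this example.

```python
from typing import List, Tuple


def kalman_1d(
    measurements: List[float],
    process_var: float = 1e-2,      # Q: process noise variance (assumed value)
    measurement_var: float = 1.0,   # R: observation noise variance (assumed value)
    initial_state: float = 0.0,
    initial_var: float = 1.0,
) -> List[Tuple[float, float]]:
    """Scalar Kalman filter with F = 1, H = 1 and no control input.

    Returns one (state estimate, estimate variance) pair per measurement.
    """
    x, p = initial_state, initial_var
    estimates = []
    for z in measurements:
        # Prediction: the state is carried over, the uncertainty grows by Q.
        x_pred = x
        p_pred = p + process_var

        # Update: correct the prediction with the new measurement.
        innovation = z - x_pred                  # pre-fit residual
        innovation_var = p_pred + measurement_var
        gain = p_pred / innovation_var           # Kalman gain
        x = x_pred + gain * innovation           # updated state estimate
        p = (1.0 - gain) * p_pred                # updated estimate variance

        estimates.append((x, p))
    return estimates


# Made-up, noise-free readings: the estimate moves from 0.0 towards 5.0.
for state, variance in kalman_1d([5.0] * 5):
    print(round(state, 3), round(variance, 3))
```

A constant-velocity model would replace the scalar state with a two-element vector (value and rate of change) and the scalars above with the corresponding matrices.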
The next section summarizes the results of the case study experiment, and the rest of the results are available on GitHub.

Results
Data for the period from 19 January to 19 February 2020 are used for the experiment. The data consist of time series of the dew point, which is the quantity predicted during the experiment. The dew point is forecast separately for each weather station in the region, which contains 52 road monitoring weather stations. Some weather stations contain no data or were not working during the period; they are automatically excluded, leaving a total of 34 weather stations. This is sufficient for the initial experiment with the Kalman filter. The algorithm described in the previous section is used, and its implementation can be found on GitHub.
The results of the experiment for the weather stations with codes "LV01", "LV09", "LV36", and "LV64" are shown in Figure 4 as panels (a) "LV01", (b) "LV09", (c) "LV36", and (d) "LV64".
As can be seen in Figure 4, the results are relatively weak. The estimates and the measurements differ considerably, which indicates that the accuracy of the Kalman filter is poor. It should be noted, however, that the same velocity is used throughout in the Kalman filter approach; this is a limitation that would need to be corrected by adopting the matrix formulation offered by the Kalman filter algorithm. The RMSE and MSE of the experiment for the period from 19 January to 19 February 2020 are shown in Figure 5. These findings show that the approach needs to be changed so that the output gets closer to the actual situation. A more detailed overview can be found in Table 1, where each weather station is listed with its average RMSE and MSE.
The average RMSE over all 34 weather stations was 2.81, and the average MSE was 16.12. This is far from the desired result; however, it should be kept in mind that the Kalman filter was realized without velocity. In addition, AODPF was made publicly accessible via GitHub, and a web service was tested that could already be used for integration with ERP systems. These are the first steps towards the future implementation and integration of AODPF with ERP systems.
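For reference, the two error measures reported above can be computed as in the following sketch. The station names and values are made up for illustration and are not the experiment's data. The example also shows that the mean of per-station RMSE values is generally not the square root of the mean of per-station MSE values, so the two averages are not expected to match under a square root.

```python
import math
from typing import Dict, List, Tuple


def mse(actual: List[float], predicted: List[float]) -> float:
    """Mean squared error between two equally long, non-empty series."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: List[float], predicted: List[float]) -> float:
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


# Made-up (actual, predicted) pairs for two hypothetical stations.
stations: Dict[str, Tuple[List[float], List[float]]] = {
    "station_a": ([0.0, 0.0], [1.0, 1.0]),
    "station_b": ([0.0, 0.0], [5.0, 5.0]),
}

per_station = {name: (rmse(a, p), mse(a, p)) for name, (a, p) in stations.items()}
mean_rmse = sum(r for r, _ in per_station.values()) / len(per_station)
mean_mse = sum(m for _, m in per_station.values()) / len(per_station)

print(per_station)
print(mean_rmse, mean_mse, math.sqrt(mean_mse))
```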

Conclusions
AODPF has been presented and published, although at this initial stage it is still subject to change. The Kalman filter approach needs to be improved and the experiment repeated to see whether the method then gives a better result. Other methods are also needed so that results can be compared and conclusions drawn. Much work remains to be done, including automation and a full-fledged AODPF process. Nevertheless, the Python programming language has proved advantageous for this solution and also makes it possible to build the web service that is necessary for future integration with ERP systems. AODPF is in its infancy, but the approach works, and the results are presented in a logically structured way as a proof of concept.
AODPF is published on GitHub and is available to the general public, so that it can be used in a standard form with any system and, for the most tangible results and flexibility, with ERP systems. Partial documentation has been created and published alongside AODPF and is also publicly available.
Future work will focus on adding other approaches, on automated best model selection, and on presenting AODPF results through a user interface that allows the time frame and the desired measurement type to be chosen. Alternative methods, such as ARIMA, will also be needed.
Supplementary Materials: The following framework with source code and results are available online at https://github.com/JanisPeksa/Autonomous-Open-Data-Prediction-Framework.
Funding: This publication was supported by the Riga Technical University's Doctoral Grant program.