Article

Evaluation of Photovoltaic Generation Forecasting Using Model Output Statistics and Machine Learning

1 Clean Energy Transition Group, Korea Institute of Industrial Technology (KITECH), 102 Jejudaehak-ro, Jeju-si 63243, Republic of Korea
2 Department of Fire Protection Engineering, SanJi University, 83, Sangjidae-gil, Wonju-si 26339, Republic of Korea
3 Department of Mechanical Engineering, Jeju National University, 102, Jejudaehak-ro, Jeju-si 63243, Republic of Korea
4 Carbon Zero Technology Institute, 10, Cheomdan-ro, Jeju-si 63152, Republic of Korea
5 Department of Convergence Manufacturing System Engineering, University of Science and Technology (UST), 217, Gajeong-ro, Yuseong-gu, Daejeon 34113, Republic of Korea
* Authors to whom correspondence should be addressed.
Energies 2026, 19(2), 486; https://doi.org/10.3390/en19020486
Submission received: 16 December 2025 / Revised: 11 January 2026 / Accepted: 15 January 2026 / Published: 19 January 2026

Abstract

Accurate forecasting of photovoltaic (PV) power generation is essential for mitigating weather-induced variability and maintaining power-system stability. This study aims to improve PV power forecasting accuracy by enhancing the quality of numerical weather prediction (NWP) inputs rather than modifying forecasting model structures. Specifically, systematic errors in temperature, wind speed, and solar radiation data produced by the Unified Model–Local Data Assimilation and Prediction System (UM-LDAPS) are corrected using a Model Output Statistics (MOS) approach. A case study was conducted for a 20 kW rooftop PV system in Buan, South Korea, comparing forecasting performance before and after MOS application using a random forest-based PV forecasting model. The results show that MOS significantly improves meteorological input accuracy, reducing the root mean square error (RMSE) of temperature, wind speed, and solar radiation by 38.1–62.3%. Consequently, PV power forecasting errors were reduced by 70.0–78.7% across lead times of 1–6 h, 7–12 h, and 19–24 h. After MOS correction, the normalized mean absolute percentage error (nMAPE) remained consistently low at approximately 7–8%, indicating improved forecasting robustness across the evaluated lead-time ranges. In addition, an economic evaluation based on the Korean renewable energy forecast-settlement mechanism estimated an annual benefit of approximately 854 USD for the analyzed 20 kW PV system. A complementary valuation using an NREL-based framework yielded an annual benefit of approximately 296 USD. These results demonstrate that improving meteorological data quality through MOS enhances PV forecasting performance and provides measurable economic value.

1. Introduction

Currently, more than 2200 GW of photovoltaic (PV) capacity is installed worldwide, reducing CO2 emissions by an estimated 1.1 Gt between 2019 and 2023 [1]. As energy systems transition from centralized fossil-fuel structures toward decentralized renewable architectures, solar power plays an increasingly important role in reducing transmission costs, improving system resilience, and supporting decarbonization efforts [2]. However, PV generation remains inherently variable due to weather-driven fluctuations, creating operational challenges for maintaining grid stability and balancing supply and demand [3,4,5,6].
Accurate PV forecasting is therefore essential for electricity market operation, reserve scheduling, and integrating large shares of renewable energy [7,8,9]. Forecasting accuracy directly affects system costs, balancing requirements, storage management, and dispatch decisions. Recent advances—including probabilistic forecasting and machine learning methods such as neural networks and decision-tree ensembles—have improved performance by learning nonlinear relationships between meteorological inputs and PV output [7,8,9,10,11,12]. Forecasting research has also expanded to residential and building integrated PV, multi-predictor models, and short-interval prediction frameworks using NWP, satellite imagery, or hybrid structures [13,14,15,16,17,18,19,20,21,22].
A recurring conclusion in prior work is that meteorological input quality is a dominant factor limiting forecasting performance, especially for irradiance, temperature, wind speed, humidity, and cloud-cover estimates. Because many forecasting models rely directly on raw NWP outputs, systematic biases in NWP data are often carried into PV-power forecasts, resulting in degraded accuracy. Thus, improving the reliability of meteorological inputs rather than only refining forecasting algorithms has been identified as a critical research direction [23,24].
Previous studies on photovoltaic (PV) power forecasting have predominantly sought to reduce prediction errors by optimizing machine-learning or deep-learning architectures, feature selection, and hyperparameter tuning. While these studies achieved notable accuracy improvements, they generally treated numerical weather prediction (NWP) inputs as given, implicitly assuming sufficient input data quality.
In contrast, limited research has systematically investigated how improving the quality of NWP predictors themselves influences downstream PV forecasting performance across different lead times. Although several works have applied MOS or machine-learning-based post-processing for bias correction, these efforts often focused on model-specific accuracy gains within short horizons, without isolating the effect of meteorological data enhancement or assessing its contribution to forecast robustness across multiple lead time ranges.
This study advances prior research by explicitly formulating MOS as a dedicated data-quality correction stage preceding the forecasting model. By holding the forecasting model constant and applying horizon-independent MOS correction, the present work isolates the influence of input data quality improvement and provides a quantitative assessment of its impact on multi-horizon accuracy, stability, and potential economic value. Forecasting requirements also differ by application context. Ultra-short-term and short-term forecasts support electricity market participation and grid-frequency stabilization, whereas mid-term forecasts are critical for operational planning, maintenance scheduling, and capacity-integration decisions [6,25].
Previous studies have demonstrated that improvements in solar forecasting accuracy can yield substantial system-level economic benefits—for example, a 25% improvement in forecasting accuracy was associated with a 1.56% reduction in annual net generation costs (USD 46.5 million) in an NREL assessment [3,26]. In this study, mid-term PV power forecasting was conducted over a four-month period using a random forest model driven by UM-LDAPS meteorological data, with MOS applied as a quality-control step to correct systematic NWP errors. A comparative analysis was performed to evaluate forecasting performance before and after MOS application.
This study aims to examine whether incorporating MOS can improve the reliability and stability of PV power forecasts across all considered lead-time ranges, highlighting the importance of systematic meteorological data calibration for enhancing PV forecasting performance.
The main contributions of this study are as follows: (i) the formulation of a MOS–RF forecasting pipeline in which MOS is explicitly designed as a standalone, horizon-independent data-quality correction stage applied prior to PV power forecasting, rather than as a model-specific post-processing step; (ii) a lead-time-resolved evaluation of forecasting performance that isolates the effect of meteorological bias correction by deliberately holding the forecasting model constant; and (iii) an assessment of the economic benefits associated with improved forecasting accuracy using both the Korean PV forecast–settlement scheme and the NREL benchmarking framework.

2. Methodology

The remainder of this paper is organized as follows. Section 2 describes the study site, the collected data, and the overall forecasting process used for PV power generation. Section 3 details the forecasting methods, Section 4 presents the results and discussion, and Section 5 concludes the study.

2.1. Site and Data Description

A 20 kW rooftop PV array installed in Buan, Jeonbuk Province, South Korea (35.71° N, 126.59° E) served as the source of the measured PV power generation data used in this study. The site consists of 20 multi-crystalline 250 W modules (Hyundai Heavy Industries Co., Ltd., Ulsan, Republic of Korea) and 20 single-crystalline 250 W modules (Solarpark-Korea Co., Seoul, Republic of Korea). The specifications of the installed PV modules are listed in Table 1.
The PV energy generation data were collected at the site every 10 min. The actual energy generation data, including direct-current voltage, direct-current amperage, alternating-current voltage, and alternating-current amperage, were measured and collected from 1 January to 31 December during daytime operating hours (10:00–18:00), because only this period corresponds to physically meaningful solar production. Nighttime intervals with zero output were therefore excluded, as the objective of this study was to evaluate forecasting performance under actual power-producing conditions.
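As a minimal sketch of the daytime filtering step described above, the following assumes the measurements are held in a pandas DataFrame with a datetime index; the column name `power_kw` is an illustrative assumption, not taken from the paper.

```python
import pandas as pd

def filter_daytime(df, start="10:00", end="18:00"):
    """Keep only measurements inside the 10:00-18:00 operating window,
    mirroring the daytime filtering applied to the PV generation data."""
    return df.between_time(start, end)

# Illustrative 10-min PV measurements over one full day.
idx = pd.date_range("2024-06-01 00:00", "2024-06-01 23:50", freq="10min")
pv = pd.DataFrame({"power_kw": [0.0] * len(idx)}, index=idx)
daytime = filter_daytime(pv)
```

`between_time` keeps both window endpoints by default, so an 8 h window sampled every 10 min yields 49 rows per day.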
Although the case study focused on a 20 kW rooftop PV array, the proposed MOS–RF forecasting pipeline is modular and data-driven, and therefore has the potential to be applied to PV systems with different capacities or in other regions. In line with IEA-PVPS Task 16 recommendations, such deployment would require moderate site-specific calibration or transfer learning based on local measurements [17]. While the present study demonstrates consistent forecasting improvements across multiple horizons (1–6 h, 7–12 h, and 19–24 h), broader validation across additional climates and sites would be needed to fully assess its general applicability.
In addition to the 20 kW rooftop PV array, a weather observation device (AWS) was installed at the site to collect actual weather information. The AWS comprised an RTD temperature sensor (JS-RTD100, Jinseongeng Eng., Seongnam, Republic of Korea), anemometer (Model 05103, R.M. Young Co., Traverse City, MI, USA), hygrometer (HMP110, Vaisala Co., Vantaa, Finland), pyranometer (MS-602, EKO Instruments Co., Ltd., San Jose, CA, USA), shading chamber (JS-ACF12, Jinseongeng Eng., Seongnam, Republic of Korea), and a data logger (RU-DL200, Jubix Co., Gunpo, Republic of Korea).
The UM-LDAPS grid point closest to the site was used to extract temperature, humidity, wind speed, wind direction, air pressure, solar radiation, and cloud forecast information. The observational data for the application of MOS (Model Output Statistics) were drawn from the AWS and from the Korea Meteorological Administration’s Automated Synoptic Observing System (ASOS); the ASOS station is located near Jeonbuk National University’s New and Renewable Energy Material Development Center. Additionally, a 36-core Linux cluster server was set up (Table 2) and used to collect a diverse set of weather data by time and day; these data were used to perform pattern analysis and machine learning.

2.2. Process Description

Figure 1 shows the process flow for this study. First, as part of the data preparation-1 stage, NWP data were prepared using the UM-LDAPS. The resultant data were preprocessed for data interpolation and extraction of key factors used in PV energy generation forecasting. Before extracting weather prediction factors, the MOS method was applied to verify the accuracy of PV energy generation forecasting with and without the application of MOS; this technique served to eliminate statistical errors in the data.
Thus, two processes took place during the data extraction stage: simple data extraction and data extraction with MOS correction. When MOS was applied, actual measurement data produced in the ASOS and AWS were used.
The temperature, wind velocity, and solar radiation data extracted during the previous stage were used to perform machine learning using the random forest technique. The NWP and PV measurement data were sampled at a 1 h interval, and after machine learning was applied, the PV generation forecasts produced with and without the application of MOS were compared to the actual power generation data for preceding timeframes (1–6 h, 7–12 h, and 19–24 h) to evaluate forecasting performance.
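The 1 h resampling of the 10-min measurements mentioned above can be sketched as follows; the column name `power_kw` is an assumption for this illustration, not from the paper.

```python
import pandas as pd

# Twelve 10-min PV measurements (two hours of data) resampled to the
# 1 h interval used for machine learning.
idx = pd.date_range("2024-06-01 10:00", periods=12, freq="10min")
pv = pd.DataFrame({"power_kw": [1.0] * 12}, index=idx)
hourly = pv.resample("1h").mean()
```

Averaging within each hourly bin is one common convention; summing to hourly energy would be equally valid depending on the downstream model.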

3. Forecasting Method

In Section 3, the forecasting methods used in this study are described. The section covers the UM-LDAPS numerical weather forecasting model, the machine learning approach applied for PV power forecasting, and the Model Output Statistics technique used to improve forecast accuracy.

3.1. UM-LDAPS Model

Numerical forecasting uses computers to analyze current weather conditions and quantitatively predict future weather conditions by numerically integrating the governing equations of weather phenomena dynamics and physical principles.
In this study, the Korea Meteorological Administration’s (KMA) UM-LDAPS (Unified Model–Local Data Assimilation and Prediction System), which uses variable grids, was used as the weather forecast model. The UM-LDAPS has a spatial resolution of 1.5 km, as shown in Figure 2, and consists of 70 vertical levels up to an altitude of approximately 40 km [27,28,29].
Additionally, the Unified Model–Global Data Assimilation and Prediction System (UM-GDAPS) was also used. The UM-GDAPS is the global configuration of the same UM model and supplies the lateral boundary conditions for the UM-LDAPS, which resolves weather phenomena on a 1.5 × 1.5 km grid using the Navier–Stokes equations.
The UM-LDAPS receives the boundary condition from the UM-GDAPS every three hours to produce forecasts eight times a day (00, 06, 12, 18 UTC: 36 h forecasts, and 03, 09, 15, 18 UTC: 3 h forecasts). Additionally, forecast data are produced by operating an individual analysis–forecast cycle system that uses a three-dimensional variational data assimilation technique; the data are provided in the GRIB2 format, as recommended by the World Meteorological Organization (WMO) [30].
In this study, based on the UM-LDAPS update cycle and IEA-PVPS Task 16 guidelines, the 1–6 h, 7–12 h, and 19–24 h lead times map to the operational PV generation window of 10:00–18:00. Thus, all forecast evaluations were performed during periods with actual irradiance, ensuring that nighttime hours did not affect the results or the economic assessment.

3.2. Machine Learning

The application of machine learning techniques in PV power forecasting has rapidly emerged as one of the most important ways to increase the forecast accuracy of irregular PV power generation. Machine learning is an artificial intelligence technology that offers numerous classification and regression methods for solving problems that cannot be expressed by explicit algorithms.
Voyant et al. described the application of machine learning methodologies such as neural networks and support vector regression; regression trees, random forests, gradient boosting, and many other methods have more recently begun to appear in PV power forecasting models. Notably, ranking the performance of machine learning methods is complex because of the diversity of datasets, time phases, forecast ranges, settings, and performance indicators used across studies. Generally, machine learning techniques such as support vector regression, ANN, k-nearest neighbors, regression trees, boosting, bagging, and random forest showed systematically better results than traditional regression methods [31].
Zamo et al. conducted a study comparing various machine learning techniques applied to the PV power generation produced in an NWP model using 31 predictors. They reported that the random forest model is superior for hourly PV generation forecasting in comparison to other machine learning models, including SVM, generalized regression, boosting, bagging, and persistence [32].
In addition, several recent studies have demonstrated that RF (Random Forest) continues to perform competitively across diverse forecasting conditions. Niu et al. reported that RF outperformed ANN models in short-term solar radiation and PV output prediction, and Asiedu et al. showed that the RF regressor achieved the lowest MAE (0.046) and RMSE (0.11) among regression and classification models for solar energy output prediction. Furthermore, Jogunuri et al. highlighted that RF requires lower computational resources than deep-learning models while maintaining comparable forecasting performance. Recent review studies also note that RF remains a widely used baseline in solar forecasting research, often outperforming simpler statistical methods and serving as a comparison point for advanced deep-learning architectures [33,34,35,36].
Based on these considerations, we conducted preliminary benchmarking among several candidate machine-learning models, including decision tree, random forest, support vector machine, neural network, and multinomial logistic regression. The random forest regressor demonstrated the closest agreement with real PV generation and exhibited the most stable performance across different periods; therefore, the RF model was selected as the forecasting method in this study.
The random forest technique was proposed by Breiman and is based on bagging, a machine learning method. The binary trees used in bagging are not statistically independent because they are built from the same dataset and therefore cannot reduce the variance of the ensemble mean indefinitely. To compensate for this shortcoming, the random forest method adds a further randomization step, selecting a random subset of predictors for each split of each bagged tree [36].
To train the RF model, the full dataset collected from January to December was divided into training and test subsets using a chronologically ordered approach. Approximately the first 70% of the data was used for model training, while the remaining 30% was reserved for testing. This forward time split preserves the temporal structure of the series and prevents future observations from being used in forecasting past events. The RF model was implemented using standard default parameters, except that the predictor-subsampling hyperparameter, which controls the fraction of predictors considered at each split, was set to 0.15. No additional hyperparameter optimization was performed. The selected configuration provided stable performance suitable for evaluating the effectiveness of MOS correction, but systematic hyperparameter tuning could further improve forecast accuracy.
Finally, the forecast model used only raw UM-LDAPS meteorological variables, such as temperature, wind speed, and irradiance, as input predictors.
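The training setup described above can be sketched as follows, assuming scikit-learn's RandomForestRegressor; the synthetic predictors (temperature, wind speed, irradiance) and PV target are illustrative stand-ins for the UM-LDAPS inputs and measured generation.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def forward_time_split(X, y, train_frac=0.70):
    """Chronological split: the first 70% of the series trains the model,
    the remainder is held out, so no future data leaks into training."""
    n_train = int(len(X) * train_frac)
    return X[:n_train], X[n_train:], y[:n_train], y[n_train:]

# Synthetic hourly predictors and a PV output dominated by irradiance.
rng = np.random.default_rng(42)
X = rng.uniform(size=(500, 3))  # columns: temperature, wind speed, irradiance
y = 15.0 * X[:, 2] + 0.5 * X[:, 0] + rng.normal(0.0, 0.1, 500)

X_tr, X_te, y_tr, y_te = forward_time_split(X, y)
# max_features=0.15 mirrors the fraction-of-predictors setting in the text;
# all other parameters stay at library defaults, as in the study.
rf = RandomForestRegressor(max_features=0.15, random_state=0).fit(X_tr, y_tr)
pred = rf.predict(X_te)
```

With a float `max_features`, scikit-learn considers `max(1, int(0.15 * n_features))` predictors per split, i.e. one of the three predictors here.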
In this study, the forecasting model was deliberately held constant to isolate the influence of the MOS-based meteorological bias correction. Incorporating multiple forecasting architectures or performing extensive hyperparameter tuning could obscure whether the observed performance improvements stem from the MOS correction itself or from model-specific optimization.

3.3. Model Output Statistics (MOS)

The MOS technique can be applied to eliminate the systematic errors in numerical forecast models. MOS accounts for the systematic and phase errors of the numerical forecast model, and can produce forecast elements that the numerical forecast model cannot produce, as well as forecast elements for specific points other than grid points. Therefore, the MOS technique has been developed and utilized as a post-processing companion to numerical forecast models [37].
Mathiesen and Kleissl reported that data correction via the application of the MOS technique to weather forecast models was useful when performing intraday forecasting of global horizontal irradiance (GHI) and could minimize mean bias errors. The weather forecast models examined included the North American Mesoscale model, the Global Forecast System, and the European Centre for Medium-Range Weather Forecasts model [38,39].
In summary, the principle behind the MOS technique involves reducing the error between numerically forecasted values and observed values. In this study, the MOS regression was formulated using the NWP forecast for each meteorological variable (temperature, wind speed, and solar radiation) as the predictor and the corresponding on-site observation as the response, enabling the MOS model to learn a direct correction mapping from $X_{NWP}$ to $X_{obs}$. Here, $X_{NWP}$ represents the raw NWP-predicted value and $X_{obs}$ denotes the actual measured value at the site.
Because systematic biases differ by variable, an independent MOS model was trained for each meteorological variable, while the MOS step itself is horizon-independent and is applied prior to PV forecasting.
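A minimal sketch of this per-variable correction stage follows. A first-order polynomial fit is used purely as an illustration; the paper does not prescribe the regression form, and the synthetic bias values are assumptions.

```python
import numpy as np

class SimpleMOS:
    """Independent MOS correction per meteorological variable: fit a
    regression mapping raw NWP forecasts (X_NWP) to co-located
    observations (X_obs), then apply it to new forecasts."""

    def __init__(self, variables):
        self.coefs = {v: None for v in variables}

    def fit(self, nwp, obs):
        for v in self.coefs:
            self.coefs[v] = np.polyfit(np.asarray(nwp[v], float),
                                       np.asarray(obs[v], float), deg=1)
        return self

    def correct(self, nwp):
        return {v: np.polyval(c, np.asarray(nwp[v], float))
                for v, c in self.coefs.items()}

# Synthetic temperature forecasts with a +2 degC offset and a 1.1x
# amplitude bias relative to the observations.
rng = np.random.default_rng(0)
obs_t = rng.uniform(5.0, 30.0, 200)
nwp_t = 1.1 * obs_t + 2.0
mos = SimpleMOS(["temperature"]).fit({"temperature": nwp_t},
                                     {"temperature": obs_t})
corrected = mos.correct({"temperature": nwp_t})["temperature"]
```

Because the correction is fitted independently per variable and uses no lead-time information, the same trained mapping applies across all forecast horizons, matching the horizon-independent design described in the text.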
In the present study, the primary purpose of applying MOS to the UM-LDAPS forecasts was to consistently reduce forecast error across all lead times and to mitigate the rapid error growth that typically occurs as the forecast horizon increases in photovoltaic (PV) power applications.

3.4. Metrics for Evaluating Model Performance

In this study, a statistical verification method was applied and evaluated in two steps to compare and verify the accuracy of the PV power forecast model before and after applying the MOS technique.
In the first step, the data before applying MOS and those after applying MOS through the random forest method were compared to the observed values for verification.
Two methods were used: the root mean square error (RMSE), which evaluates accuracy in terms of the mean square of the difference between observed and numerically forecast values, and the index of agreement (IOA), which determines the agreement between observed and forecast values. The RMSE and IOA verification methods, expressed in Equations (1) and (2), are typical statistical indicators used to evaluate input and output data for meteorological and air-quality models. Here, $M$ is a numerical forecast value, and $O$ is an observed value [27,40,41,42].
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(M_i - O_i\right)^2}$$
$$\mathrm{IOA} = 1 - \frac{\sum_{i=1}^{N}\left(M_i - O_i\right)^2}{\sum_{i=1}^{N}\left(\left|M_i - \bar{O}\right| + \left|O_i - \bar{O}\right|\right)^2}$$
In the second step, the PV power forecast model was verified before and after applying MOS using the normalized mean absolute percentage error (nMAPE), as shown in Equation (3). The nMAPE verification method expresses the accuracy of a forecast model as a percentage and is widely used and recommended in forecast studies because it facilitates the comparison of the accuracy of different forecast models [43,44,45,46,47].
$$\mathrm{nMAPE} = \frac{1}{N}\sum_{i=1}^{N}\frac{\left|u_i - \hat{u}_i\right|}{u_e}\times 100$$
Here, $u_e$ is the equipment capacity, $u_i$ and $\hat{u}_i$ are the actual and forecast values, respectively, and $N$ is the number of observations.
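The three metrics in Equations (1)–(3) can be implemented directly; this sketch follows the equations as stated, with capacity normalization for nMAPE.

```python
import numpy as np

def rmse(m, o):
    """Eq. (1): root mean square error between forecasts m and observations o."""
    m, o = np.asarray(m, float), np.asarray(o, float)
    return float(np.sqrt(np.mean((m - o) ** 2)))

def ioa(m, o):
    """Eq. (2): index of agreement between forecasts m and observations o."""
    m, o = np.asarray(m, float), np.asarray(o, float)
    o_bar = o.mean()
    return float(1.0 - np.sum((m - o) ** 2)
                 / np.sum((np.abs(m - o_bar) + np.abs(o - o_bar)) ** 2))

def nmape(forecast, observed, capacity):
    """Eq. (3): MAPE normalized by the equipment capacity u_e, in percent."""
    f, u = np.asarray(forecast, float), np.asarray(observed, float)
    return float(np.mean(np.abs(u - f) / capacity) * 100.0)
```

Normalizing by capacity rather than by the instantaneous observation keeps nMAPE well defined at low-output hours, which is why it is preferred for PV evaluation.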

3.5. Economic Evaluation Method

The settlement amount under the Korean Renewable Energy Generation Forecasting System is determined based on the generator type, hourly generated energy, and the corresponding hourly forecast error. The settlement unit price varies with the hourly prediction error rate: when the error is greater than 8%, no payment is provided; when the error is between 6% and 8%, the settlement rate is 3 KRW/kWh; and when the error is less than or equal to 6%, the settlement rate increases to 4 KRW/kWh. The hourly settlement amount is calculated by multiplying the generated energy by the applicable accuracy-based unit price, as shown in Equation (4). The daily settlement amount is obtained by summing the hourly settlements across all time intervals, as shown in Equation (5), and the total settlement for a generator is computed by aggregating the daily amounts over the evaluation period.
$$I = \sum_{t=1}^{24}\left(P_t \times FP_{i,t}\right)$$
$$MI = \sum_{D=1}^{31} DI_D$$
Here, $I$ is the daily settlement profit, $P_t$ is the energy generated at time $t$, $FP_{i,t}$ is the settlement unit price at time $t$, $MI$ is the monthly settlement profit, and $DI_D$ is the settlement amount for day $D$.
It should be noted that the economic evaluation presented in this study is based on simplified assumptions regarding the forecast-settlement mechanism and unit pricing structure defined by the Korea Power Exchange. The analysis focuses on the direct impact of forecast accuracy on settlement outcomes and does not explicitly account for uncertainty factors such as electricity price volatility, regulatory changes, or operational constraints. Therefore, the estimated economic benefits should be interpreted as indicative values rather than precise economic predictions.

4. Results

This section presents the results and discussion related to the application of the MOS technique to numerical weather forecasting data and its impact on photovoltaic power forecasting. First, the improved accuracy of meteorological data after applying MOS is analyzed, focusing on error rate reductions in temperature, wind speed, and solar radiation. Next, a comparison of solar power forecasting performance before and after MOS application is provided, highlighting significant improvements in forecasting error rates, particularly the reduction in the normalized mean absolute percentage error (nMAPE) across multiple forecast lead times.

4.1. Results of Numerical Weather Prediction Data Applying MOS

The effects of applying MOS to the UM-LDAPS numerical weather prediction data are summarized in Figure 3. Temperature, wind speed, and solar radiation exhibited different responses to MOS correction, reflecting variable-specific error characteristics.
For temperature (Figure 3a), MOS primarily reduced the magnitude of forecast errors, lowering the RMSE by 38.1%, while the IOA remained nearly unchanged. This indicates that the temporal evolution of temperature was already well captured by the NWP model, and MOS mainly corrected amplitude-related biases. Wind speed (Figure 3b) showed the most substantial improvement among the three variables, with RMSE reduced by 62.3% and IOA increased by 48.4% after MOS correction, indicating the presence of both magnitude and structural biases in the original forecasts. For solar radiation (Figure 3c), MOS reduced RMSE by 48.6%, while IOA remained relatively stable, suggesting that irradiance forecasts generally captured temporal variability but suffered from systematic amplitude errors. Although the IOA values indicate that the temporal patterns of the meteorological variables were already well captured, the RMSE reductions achieved through MOS effectively reduced magnitude- and bias-related errors in the input distributions, thereby leading to nonlinear improvements in RF-based PV generation forecasting accuracy and substantial reductions in nMAPE.
Overall, these results demonstrate that MOS provides variable-specific improvements, with particularly strong effects for wind speed, while consistently reducing forecast errors across all examined meteorological parameters. By correcting systematic and magnitude-dependent biases, MOS yields more reliable meteorological inputs for subsequent PV power forecasting.

4.2. Comparison of Solar Forecasting Results

In this study, the forecast and observed PV power generation values before and after the application of MOS were compared to systematically analyze the effect of the MOS technique on the PV power generation forecast model. The lead times considered were 1–6 h, 7–12 h, and 19–24 h, corresponding to the update intervals of the UM-LDAPS model. Forecast performance was evaluated over the four-month evaluation period between 10 a.m. and 6 p.m.
Figure 4 illustrates time-series comparisons between observed and predicted PV power generation for representative periods at different forecast lead-time ranges. Figure 4a shows the results for a lead time of 1–6 h, where the MOS-corrected forecasts follow the observed PV output more closely, particularly during peak irradiance hours, while the forecasts without MOS exhibit overestimation. Figure 4b,c present similar comparisons for longer lead times. Across all cases, the MOS-enhanced forecasts consistently track the temporal evolution of the observed generation more accurately, whereas the non-corrected forecasts show noticeable positive bias, especially around midday peaks. These visual results are consistent with the quantitative performance metrics summarized in Figure 5, indicating that the MOS-based correction maintains stable forecasting performance across different lead-time ranges.
Before applying MOS, forecasting errors increased as lead time lengthened; however, after MOS correction, this dependence on lead time was significantly reduced. This demonstrates that the forecasting model becomes more robust to temporal uncertainty once the meteorological inputs are corrected.
Mathiesen and Kleissl examined MOS-corrected forecasts of solar irradiance over continental U.S. SURFRAD sites for intraday horizons, reporting MAPE decreases from roughly 20–30% down to approximately 5–10% when MOS statistical post-processing was applied [38].
Horat et al. applied post-processing to NWP-based predictions using the case of the Jacumba PV power plant, and the MAPE of the uncorrected model was approximately 9–12%, but when MOS or machine learning-based correction was applied, the MAPE decreased to 4–7%, and the error was reduced by approximately 40–60%, confirming a similar trend to this study [42].
The persistence model assumes that the PV power output at time t + h is equal to the most recent observed value at time t and serves solely as a baseline for contextual comparison. As shown in Figure 5, the resulting persistence nMAPE values were 16.1% for both the 1–6 h and 7–12 h lead-time ranges, and 7.21% for the 19–24 h range.
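The persistence baseline defined above can be written as a short sketch; the NaN padding for the first `horizon` slots is an implementation choice for alignment, not specified in the paper.

```python
import numpy as np

def persistence_forecast(observed, horizon):
    """Persistence baseline: the forecast for time t + h equals the
    observed value at time t. The first `horizon` slots (horizon >= 1)
    have no prior observation and are left as NaN."""
    obs = np.asarray(observed, dtype=float)
    fc = np.full(obs.shape, np.nan)
    fc[horizon:] = obs[:-horizon]
    return fc

fc = persistence_forecast([1.0, 2.0, 3.0, 4.0], horizon=1)
```

Despite its simplicity, persistence is a standard reference point in solar forecasting because it is hard to beat at short horizons, as the 19–24 h comparison in Figure 5 illustrates.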
Across all evaluated lead-time ranges, the MOS-applied model consistently achieved substantially lower nMAPE values than the non-MOS forecasting model. For the 1–6 h lead-time range, the non-MOS model exhibited a high nMAPE of 33.68%, whereas the MOS-applied model reduced the error to 7.17%, corresponding to an improvement of approximately 78.7% and clearly outperforming the persistence benchmark (16.1%). Similarly, for the 7–12 h lead-time range, the MOS-applied model maintained a low nMAPE of 7.80%, achieving an error reduction of approximately 72.6% compared to the non-MOS model and again outperforming the persistence benchmark (16.1%). In the longer-term forecast range of 19–24 h, the nMAPE of the MOS-applied model increased slightly to 8.10%. While this value is marginally higher than the persistence benchmark (7.21%), it remains substantially lower than the corresponding non-MOS forecasting error, indicating that MOS continues to provide meaningful performance improvements relative to the baseline forecasting framework.
These results indicate that MOS-based correction improves forecasting performance by reducing sensitivity to increasing lead time.
Beyond the comparison of forecasting accuracy, the economic implications of improved forecast performance were also quantitatively evaluated based on the economic assessment framework described in Section 3.5. Using the Korea Power Exchange forecast–settlement scheme, as formulated in Equations (4) and (5), the MOS-adjusted forecasts yielded an estimated benefit of approximately 2.34 USD/day, corresponding to about 854 USD annually for the analyzed 20 kW rooftop PV system. In addition, an internationally comparable assessment using the NREL cost-based framework estimated an annual economic benefit of approximately 296 USD. These values reflect the direct economic impact of forecast accuracy improvement within the examined evaluation period.

5. Conclusions

In this study, quality control was applied to UM-LDAPS numerical weather prediction data using the Model Output Statistics (MOS) technique, and the accuracy of both the meteorological inputs and the resulting photovoltaic (PV) power forecasts was assessed before and after MOS application. MOS-based correction of temperature, wind speed, and solar radiation reduced the RMSE of all three variables by 38.1–62.3% and substantially improved the IOA for wind speed, indicating enhanced reliability of NWP-derived inputs under the examined conditions.
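In its simplest form, the MOS correction applied here is a regression mapping from raw NWP output to station observations, fitted over a training period and then applied to new forecasts. A minimal single-predictor sketch follows; the study's actual MOS equations may use multiple predictors and seasonal stratification, so this is an assumed simplification.

```python
import numpy as np

def fit_mos(nwp_train, obs_train):
    """Fit a linear MOS correction obs ~ a * nwp + b by least squares."""
    nwp_train = np.asarray(nwp_train, dtype=float)
    X = np.column_stack([nwp_train, np.ones_like(nwp_train)])
    (a, b), *_ = np.linalg.lstsq(X, np.asarray(obs_train, dtype=float), rcond=None)
    return a, b

def apply_mos(nwp_forecast, a, b):
    """Apply the fitted correction to new NWP forecasts."""
    return a * np.asarray(nwp_forecast, dtype=float) + b
```

The same fit-then-apply structure extends directly to multiple meteorological predictors by adding columns to the design matrix.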
These improvements in meteorological data quality translated into higher forecasting performance when using the random forest model. For forecast lead times of 1–6 h, 7–12 h, and 19–24 h, PV power forecasting accuracy was improved, maintaining nMAPE values within 7.17–8.10% after applying MOS during the evaluation period.
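A minimal sketch of such a random-forest forecaster, taking temperature, wind speed, and solar radiation as inputs, is shown below. The training values and hyperparameters are illustrative placeholders, not the study's configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Illustrative training rows: [temperature (degC), wind speed (m/s), radiation (W/m^2)]
X_train = np.array([
    [20.0, 2.0,   0.0],
    [22.0, 2.5, 300.0],
    [25.0, 3.0, 600.0],
    [27.0, 3.5, 900.0],
])
# Corresponding PV output (kW), illustrative values only
y_train = np.array([0.0, 5.0, 11.0, 17.0])

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Predict PV output from a corrected (post-MOS) meteorological forecast
y_pred = model.predict([[24.0, 2.8, 500.0]])
```

In the study's workflow, the feature vectors would be the MOS-corrected UM-LDAPS forecasts for each lead time rather than raw NWP output.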
The economic evaluation suggests that improved forecast accuracy can lead to tangible economic benefits. While the quantified results depend on local regulatory frameworks and cost assumptions, the complementary assessment using a cost-based framework provides a transferable methodological perspective. Accordingly, although the numerical values are site-specific, the qualitative economic implications of improved forecast accuracy are relevant to broader electricity market contexts. Overall, these findings indicate that MOS-based meteorological correction can effectively reduce NWP errors and improve PV power forecasting performance across multiple lead-time ranges for the analyzed rooftop PV system and study period.
However, the analysis was limited to one 20 kW PV system and a four-month evaluation period, which may not fully capture seasonal or regional variability. Further work should therefore include multi-site and long-term validations, and explore advanced data-quality enhancement techniques such as outlier detection, adaptive filtering, and machine-learning-based data cleaning.

Author Contributions

Conceptualization, S.S.P. and E.J.K.; methodology, E.J.K. and S.J.O.; investigation, E.J.K.; writing—original draft preparation, S.S.P. and E.J.K.; writing—review and editing, Y.H.J., Y.C.P., S.S.P. and S.J.O.; supervision, S.S.P.; project administration, S.J.O.; funding acquisition, S.J.O. and S.S.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Industrial Technology Innovation Project of the Korea Institute of Energy Technology Evaluation and Planning (KETEP) and by the Disaster Safety Ministry Collaboration R&D Program, with financial resources granted by the Ministry of Trade, Industry, and Energy and the Ministry of the Interior and Safety, Republic of Korea (No. 20172420108770, 20226210100050, RS-2025-02311088).

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding authors.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
PV	Photovoltaic
MOS	Model Output Statistics
NWP	Numerical Weather Prediction
UM-LDAPS	Unified Model–Local Data Assimilation and Prediction System
UM-GDAPS	Unified Model–Global Data Assimilation and Prediction System
RMSE	Root Mean Square Error
IOA	Index of Agreement
RF	Random Forest
MAPE	Mean Absolute Percentage Error
nMAPE	Normalized Mean Absolute Percentage Error
KMA	Korea Meteorological Administration
WMO	World Meteorological Organization

Figure 1. Process flow of PV power generation forecasting.
Figure 2. Configuration and operation of the UM-LDAPS model: (a) variable lattice field over the Korean Peninsula; (b) data production cycle for forecast generation.
Figure 3. Comparison of RMSE and IOA for (a) temperature, (b) wind velocity, and (c) solar radiation before and after the application of MOS.
Figure 4. Comparison of solar power generation forecasting before and after MOS application for lead times of (a) 1–6 h, (b) 7–12 h, and (c) 19–24 h.
Figure 5. Comparison of nMAPE of solar power generation forecasting before and after MOS application.
Table 1. Detailed specifications of the 20 kW PV module.

Parameter	Module 1	Parameter	Module 2
Pmax (tolerance ±3%)	250 W	Pmax (tolerance ±3%)	250 W
Vmpp	30.9 V	Vmpp	30.1 V
Voc	37.4 V	Voc	37.7 V
Isc	8.7 A	Isc	8.83 A
Max. system voltage	1000 V	Impp	8.31 A
Max. series fuse rating	15 A	Max. series fuse rating	15 A
Table 2. Specifications of the server for PV power generation forecasting.

Type	Details
OS	CentOS 6.x Final (kernel tuned for shared memory)
Library	NetCDF (v3, v4) compiled with each compiler; ncl-ncarg, ncview, hdf4, hdf5, opengrads, nco, png, jpeg, jasper, szip, udunits, etc.
Compiler	PGI Compiler v12.x or v13.x; Intel Compiler v14.x or v15.x
Linux cluster	Dell PowerEdge R740xd calculation server (Dell Technologies, Round Rock, TX, USA)
Specifications	2 × Intel 18-core Gold 6254, 3.1 GHz, 25 MB cache; 192 GB memory (12 × 16 GB, 2933 MT/s RDIMM); OS mirror: 480 GB SAS SSD, 12 Gbps; RAID controller with 2 Gb NV cache
Storage device	Data HDD: 12 TB, 7.2K RPM, 12 Gbps SAS
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Kim, E.J.; Jeon, Y.H.; Park, Y.C.; Park, S.S.; Oh, S.J. Evaluation of Photovoltaic Generation Forecasting Using Model Output Statistics and Machine Learning. Energies 2026, 19, 486. https://doi.org/10.3390/en19020486