Article

Towards More Sustainable Photovoltaic Systems: Enhanced Open-Circuit Voltage Prediction with a New Extreme Meteorological Year Model

by Carlos Sanchís-Gómez 1, Jorge Aleix-Moreno 1, Carlos Vargas-Salgado 2,3,* and David Alfonso-Solar 2,4
1 Departamento de Ingeniería de Grupotec Renovables, Grupotec Servicios Avanzados SA, 46011 Valencia, Spain
2 Instituto Universitario de Ingeniería Energética, Universitat Politècnica de València, 46022 Valencia, Spain
3 Departamento de Ingeniería Eléctrica, Universitat Politècnica de València, 46022 Valencia, Spain
4 Departamento de Termodinámica Aplicada, Universitat Politècnica de València, 46022 Valencia, Spain
* Author to whom correspondence should be addressed.
Sustainability 2025, 17(16), 7554; https://doi.org/10.3390/su17167554
Submission received: 28 July 2025 / Revised: 15 August 2025 / Accepted: 18 August 2025 / Published: 21 August 2025
(This article belongs to the Special Issue Renewable Energy Conversion and Sustainable Power Systems Engineering)

Abstract

Accurate prediction of maximum voltage is essential for the safe, efficient, and sustainable design of photovoltaic systems, as it defines the maximum allowable number of modules in series. This study examines how the choice of meteorological year affects voltage estimations in high-power PV systems. A comparison is made between maximum voltage results derived from Typical Meteorological Year (TMY) datasets and those based on intra-hourly historical data. The results reveal notable differences, with TMY often underestimating extreme voltage levels. To address this, the study introduces the Extreme Meteorological Year (EMY) model, which uses historical voltage percentiles to better estimate peak voltages and mitigate overvoltage risk. This model has been applied successfully in real PV plant designs. Its performance is assessed using monitoring data from seven PV projects in different regions. The EMY model demonstrates improved accuracy and safety in predicting maximum voltages compared to traditional datasets. Its percentile-based structure enables adaptation to different design criteria, enhancing reliability and supporting more sustainable photovoltaic deployment. Overall, the study underscores the importance of selecting appropriate meteorological data for voltage prediction and presents EMY as a robust tool for improving PV system design.

1. Introduction

The transition to sustainable energy systems is one of the major challenges of the 21st century, and recent years have shown that photovoltaic (PV) energy plays a central role in it. As a key renewable energy source, solar power has seen significant advances in both technology and efficiency over recent decades, positioning it as a cornerstone of the clean energy transition. Recent studies have highlighted the accelerating deployment of PV systems worldwide, with projections indicating a substantial increase in their share of global energy production by 2050 [1]. However, despite this rapid growth, continued innovation in PV technology and associated systems is needed to optimize performance, reduce costs, and achieve more effective integration with other renewable resources. Ongoing research is crucial to addressing challenges such as energy storage, grid integration, and the materials used in photovoltaic cells, all of which are critical to maximizing the potential of solar power as part of a broader energy transition [2]. Continued progress in these areas is essential to achieving a sustainable, low-carbon energy future.
Typical Meteorological Year (TMY) data allow the simulation of a PV system’s performance under typical environmental conditions, providing an energy production estimate suitable for evaluating the project’s economic viability, as evaluated in [3]. Although the TMY methodology is part of the state of the art in the PV industry, the TMY model was initially developed for predicting building energy consumption, as defined in [4] and evaluated in more recent studies [5]; other recent studies, such as [6], have re-evaluated its suitability because of its accuracy limitations. Several studies have evaluated different algorithms for generating TMY datasets against historical data. These algorithms, such as the new model introduced in [7], adjust the TMY definition according to the weight assigned to each environmental variable, yielding different TMY datasets that can then be analyzed and compared, as is done in [8]. TMY datasets are typically created by evaluating historical weather data, selecting candidate periods, and comparing their deviations against reference parameters. Different algorithms use various weather variables from the historical dataset, such as DNI, GHI, wind speed, or temperature, to define what constitutes ‘typical’ conditions for each specific location. As a result, each model generates its own TMY based on its interpretation of normality, depending on which parameters it prioritizes, as evaluated in [9,10].
TMY datasets are typically used to estimate PV system generation, so existing studies rely on TMY assessments for yield evaluation, for example, [11,12]. The impact of these datasets on renewable energy predictions is so significant that even now, more than 40 years later, new methods for TMY calculation remain an active research topic, with novel methods proposed in recent studies such as [13], as well as innovative applications based on genetic algorithms [14] and machine learning [15]. More traditional studies should also be considered, for instance, comparative assessments of different models for defining TMY datasets and their implications for the results, as evaluated in [16] or [17]. In parallel, competing approaches for predicting photovoltaic system performance have increasingly incorporated advanced machine learning techniques. The recent literature includes benchmarking studies and optimization-based models, such as the use of Long Short-Term Memory (LSTM) neural networks combined with metaheuristic optimization algorithms (e.g., brown-bear optimization) for floating solar power system prototyping and performance forecasting [18], as well as comprehensive evaluations integrating experimental investigation and thermal performance analysis for floating photovoltaic systems in specific climatic contexts, such as Saudi Arabia [19]. These contributions highlight the complementary role that machine learning models can play alongside TMY-based approaches, especially in enhancing prediction accuracy under dynamic or site-specific operating conditions.
Incorporating TMY datasets is essential during the design and evaluation phases of PV systems, as explained in [20]. These datasets offer a standardized representation of typical climatic conditions for specific locations, facilitating accurate performance assessments. TMY datasets compile historical weather data, such as temperature, irradiance, and wind speed, into a single year of hourly values that epitomize the most prevalent weather patterns over extended periods. One key advantage of using TMY datasets in the PV industry is that they streamline performance simulations, reducing computational demands while maintaining accuracy; another is the worldwide availability of accurate TMYs, as provided by [21,22]. Although different TMY models were developed and evaluated during the early renewable energy stages, as shown in [23], recent studies underscore their importance in PV system planning. For instance, the study in [24] introduced a bootstrap method to evaluate the accuracy of TMY data in predicting long-term PV yields, demonstrating that precise TMY datasets are crucial for reliable energy production forecasts. The use of TMY datasets in the PV industry has also been widely justified by their significant impact on the bankability of PV projects, enabling long-term energy yield predictions that are crucial for financial assessments and investment decisions, as explained in [25]. Nevertheless, several studies provide alternative methods to develop bankable weather data for PV projects, such as [26,27].
While TMY datasets are often used to predict the average performance of PV systems, thanks to accurate modeling of long-term climatic conditions as shown in [28], they do not adequately represent extreme weather conditions. This limitation is critical when predicting maximum voltage and defining string sizes (number of modules connected in series) in PV systems, as string size directly influences the system’s maximum voltage, as justified in [29]. Therefore, using TMY datasets to calculate the maximum string size can introduce uncertainties and lead to voltage underestimations, since these datasets do not include rare but potentially critical extreme weather events. Recent studies such as [30,31] highlight the necessity of incorporating extreme weather data into PV system design. In addition, TMY datasets, by definition, cover typical rather than extreme conditions, which makes them unsuitable for scenarios that require worst-case analyses, given the increasing trend towards extreme environmental events, as studied in [32]. This distinction is crucial for accurately determining string sizes, as extremely low temperatures raise the open-circuit voltage (Voc), which can then exceed the inverter’s maximum voltage rating. If this limit is exceeded, system shutdowns can occur, resulting in lost production and reduced overall efficiency. In addition, prolonged exposure to voltages above the inverter’s design specifications can lead to inverter damage and loss of warranty, reducing its lifetime and increasing maintenance costs. Accurately predicting extreme conditions is essential to ensure the PV system’s reliability, safety, and long-term performance, which is particularly relevant in utility-scale PV (USP) systems. To address this gap, PV system designers should integrate historical extreme weather data and consider site-specific environmental factors during the design phase. This comprehensive approach ensures that string sizing calculations account for potential voltage variations under extreme conditions, enhancing system reliability and ensuring that the voltage limit at the input of the inverters is not exceeded.
It is important to mention that maximizing string size in USP is a key strategy for cost optimization and efficiency improvement, as explained in [29]. For the same peak power, the increase in the string size means reducing the number of mounting structures, wiring, and trenching required for the project, so costs can be significantly reduced. Similarly, in projects with a reduced area for the PV system, it can lead to an increase in installed peak power. This approach is especially beneficial for large-scale PV installations, where power maximization or savings in materials such as steel components and electrical wiring contribute to a more cost-effective design. In addition, for a given power output, increasing the number of modules per string reduces energy losses by decreasing current levels and maximizing voltage, always seeking to respect the maximum voltage level supported by the inverter, as described in the previous paragraph.
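As a rough illustration of the constraint described above, the largest admissible string can be obtained by dividing the inverter’s maximum input voltage by the highest module Voc expected on site and rounding down. The sketch below is not part of the original study: the function name and the numerical inputs (a 1500 V inverter limit and a 53.2 V worst-case module Voc) are hypothetical.

```python
import math


def max_string_size(v_inverter_max, voc_module_max):
    """Largest number of series modules whose summed Voc stays within the inverter limit."""
    if voc_module_max <= 0:
        raise ValueError("voc_module_max must be positive")
    return math.floor(v_inverter_max / voc_module_max)


# Hypothetical inputs, not taken from the study.
print(max_string_size(1500.0, 53.2))
```

Because the result is rounded down, even a small underestimation of the worst-case module Voc can add one module too many to every string, which is why the choice of weather dataset matters for this calculation.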
This article analyzes the limitations of TMY datasets for estimating the maximum Voc level in PV systems. As TMY sets exclude rare but severe weather events, designs based on them can underestimate Voc and expose the system to over-voltage risk. The current assessment quantifies this risk by comparing Voc results derived from TMY data with those obtained from the full historical weather record.
Although TMY-based assessments are widely used for energy yield simulations and general PV system analysis, the literature review reveals a significant gap: most studies focus on average performance indicators, neglecting the accurate estimation of extreme electrical conditions such as maximum open-circuit voltage. This omission can lead to inadequate voltage ratings and potential equipment failures in large-scale installations. Furthermore, existing approaches that attempt to address extremes often require processing large multi-year datasets, which can be computationally demanding and impractical for early-stage engineering design. The Extreme Meteorological Year (EMY) is introduced to bridge this gap. This new configurable dataset preserves the most extreme historical conditions for overvoltage scenarios while keeping the computational efficiency of an annual simulation. EMY can be tuned by voltage percentile: EMY100 is assembled from the most extreme recorded day for each calendar date, while lower percentiles capture progressively milder extremes, allowing designers to match string sizing to their preferred risk tolerance.
The novelty of EMY lies in its ability to combine extreme-value statistical selection with the simplicity of a single synthetic year, making it directly applicable in industrial workflows where rapid yet reliable voltage prediction is essential. Unlike other methods targeting general climate modeling or machine learning-based forecasting, EMY is specifically tailored to predict the maximum Voc that conditions PV string sizing, a critical design constraint in utility-scale projects. By providing a more realistic upper bound for Voc, EMY enables safer PV designs, mitigates over-voltage incidents, and optimizes string-sizing decisions.
The motivation for this work stems from the need to equip PV designers and engineers with a practical, accurate, and computationally efficient tool to predict maximum voltages under real-world extremes. This paper is structured as follows: Section 2 describes the data sources and methodology, including the EMY model formulation and justification; Section 3 presents the validation results against SCADA data from seven utility-scale PV plants; Section 4 provides a comparative analysis with TMY-based predictions and discusses model adaptability; and Section 5 concludes with potential applications, limitations, and directions for future improvements.

2. Materials and Methods

This section outlines and justifies the methodology used in the evaluation. First, the analysis shows that using TMY datasets to define PV string size is insufficient, particularly when maximizing string length is a key design objective. Based on these findings and to address the identified limitations, a new weather dataset model is proposed, offering PV systems designers an alternative for validating string size under defined uncertainty and risk thresholds. Finally, the new EMY model is validated against data monitored from different high-power PV systems, proving the strength and potential use of this new model developed to predict maximum voltage levels in PV systems efficiently. Figure 1 shows a flowchart of the applied methodology.

2.1. Data

The assessment uses historical data from five different sites, with a 5 min temporal resolution and 10 years of data for each site. Given the need to use more than 10 years of weather data to define a TMY, as required in [4], the weather data used for the evaluation are obtained from Solcast, one of the most reliable sources of weather data according to studies such as [33] or [34]. Solcast provides [35] irradiance and weather data derived from models and algorithms applied to geostationary weather satellite data. Five different PV sites from different regions are used for the current research (Table 1).
The evaluation of each site’s data is based on the main parameters that influence the PV system’s operational voltage as derived from Equations (1)–(3):
  • Global Tilted Irradiance—GTI
  • Ambient temperature—Tamb
  • Wind speed—Ws
Although the original weather data analysis is not a target for this assessment, a first view of each site’s conditions is provided in Table 2. It is important to note that all evaluations in this study, including those involving historical data, TMY, and EMY models, are based on the same dataset. Both TMY and EMY are generated from the same historical records for each site, ensuring that voltage analyses are performed using a consistent data source. This approach enables a fair comparison of results and eliminates potential biases or errors arising from dataset discrepancies.
A single PV module model is used in the assessment, whose main characteristics are shown in Table 3.

2.2. Voc Analysis

As previously introduced, the present analysis focuses on the maximum voltage level in PV systems, which depends on the PV module’s open-circuit voltage. This voltage level is reached during daylight whenever the inverter is not operating, which may be caused by any of the following situations:
  • The voltage level in PV modules is lower than the minimum operational voltage of the inverter during the first and last daylight hours of each day.
  • During inverter damage or maintenance.
  • When PV field I-V curve differs from the inverter’s operational curve (current and voltage operational ranges).
Under Voc conditions, the module voltage depends on its electrical characteristics (specified in the datasheet), irradiance, and Tcell. Although various models have been proposed in the literature to estimate Tcell, this study does not compare them. Instead, we adopt the model presented in [36] for Voc calculations, as its accuracy has been validated in [29]. However, any other Tcell model could be used in place of Equation (3), depending on the user’s preference.
Voc_string = Voc (T,G) × n    (1)
Voc (T,G) = Voc_stc (G) × (1 + β × (Tcell − 25))    (2)
Tcell = [U × Tamb + Geff × (ατ − η × (1 + β × Tamb_stc))]/[U − (β × η × Geff)]    (3)
with ατ = 0.81 and U = 24.1 + 2.9 × Ws
where
  • n: number of modules connected in series;
  • Voc (T,G): module’s open-circuit voltage for a certain irradiance and temperature;
  • Voc_stc (G): module’s open-circuit voltage for a certain irradiance level;
  • β: module’s temperature coefficient of Voc;
  • Tcell: module’s cell temperature, equation obtained from [34];
  • U: heat transfer coefficient obtained from Tcell expression;
  • Tamb: ambient temperature;
  • Geff: effective irradiance in module’s surface;
  • ατ: product of the module’s absorptance (α) and transmittance (τ);
  • η: module efficiency;
  • Tamb_stc: ambient temperature at STC;
  • Equation (1) is derived from Ohm’s law;
  • Equation (2) is derived from the standard PV performance model;
  • Equation (3) is the best-performing Tcell model according to the evaluation carried out in [29].
It is important to note that all Voc calculations in this study followed the same model, represented by Equations (1)–(3), which obtained Voc levels from meteorological conditions. This approach eliminates any additional error that could arise from using different Voc models, while ensuring consistency in the analysis. Moreover, it allows other researchers to replicate the study using alternative models at their discretion, as long as the same model is consistently applied to estimate Voc under the different scenarios considered (historical data, TMY, and EMY).
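To make the calculation chain concrete, Equations (1)–(3) can be sketched as three small functions. This is an illustrative sketch rather than the authors’ implementation: the function names are hypothetical, β is taken as a fraction per °C and used in both Equations (2) and (3) exactly as written above, the irradiance-corrected Voc_stc (G) is passed in as a ready-made input because its expression is not given here, and the default of 25 °C for Tamb_stc is an assumption.

```python
def cell_temperature(t_amb, g_eff, ws, eta, beta, alpha_tau=0.81, t_amb_stc=25.0):
    """Equation (3): cell temperature in degC.

    t_amb_stc=25.0 is an assumed default; the text does not state its value.
    """
    u = 24.1 + 2.9 * ws  # heat transfer coefficient U
    numerator = u * t_amb + g_eff * (alpha_tau - eta * (1 + beta * t_amb_stc))
    denominator = u - beta * eta * g_eff
    return numerator / denominator


def module_voc(voc_stc_g, beta, t_cell):
    """Equation (2): module Voc corrected for cell temperature."""
    return voc_stc_g * (1 + beta * (t_cell - 25.0))


def string_voc(voc_stc_g, beta, t_cell, n):
    """Equation (1): string Voc for n modules connected in series."""
    return module_voc(voc_stc_g, beta, t_cell) * n


# Hypothetical module and weather values, not taken from Table 3.
t_cell = cell_temperature(t_amb=-5.0, g_eff=600.0, ws=2.0, eta=0.21, beta=-0.0025)
print(t_cell, string_voc(voc_stc_g=50.0, beta=-0.0025, t_cell=t_cell, n=28))
```

With a negative β, a cell temperature below 25 °C yields a module Voc above its STC value, which is the overvoltage mechanism the string-sizing analysis has to cover.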
The Voc level is highly dependent on weather conditions (irradiance, ambient temperature, and wind speed), as follows from Equations (1)–(3), so the accuracy of the Voc estimate at the PV system design stage is conditioned by the weather dataset used for the calculation. For computational efficiency, PV system simulations are usually carried out with hourly TMY data instead of the full historical intra-hourly datasets, which would require much more intensive calculation. Given the relevance of the Voc estimate for PV system design, especially for USP, reducing uncertainty requires evaluating Voc results at different data resolutions (hourly TMY vs. intra-hourly historical data). This evaluation requires extra computational capacity and time.
This analysis focuses on the voltage differences between intra-hourly historical data and the associated TMY dataset. Using Equations (1)–(3), Voc is calculated for both datasets, and a comparative analysis is performed to quantify the voltage error introduced by relying on TMY. This highlights the impact of using TMY-derived predictions, given the discrepancies found between TMY and the historical data from which it is constructed.

2.3. Extreme Meteorological Year—Model Definition

The analysis and results of the TMY Voc assessment, presented in Section 3.1, lead to a clear conclusion: TMY datasets do not provide an appropriate framework for the safe estimation of Voc because of their data resolution and variability. Therefore, a new methodology for Voc estimation is required, especially when the string size must be maximized, as is the case in high-power PV systems.
To address the deviations between TMY and intra-hourly data within a controlled uncertainty and risk framework, this study introduces a new weather dataset for Voc calculation in PV systems: the Extreme Meteorological Year (EMY). With this new model, a methodology for the definition of meteorological years, based on PV systems Voc evaluation, is presented. While the EMY model selects data based on hourly evaluations, the final year is assembled from complete days of historical records. This preserves the natural environmental continuity within each day, as originally recorded. The EMY model is introduced in Figure 2.
  • Stage 1—Inter-hourly data conversion to hourly average data. Intra-hourly weather data from the historical dataset are averaged to generate hourly values (Figure 3). The dataset can include any weather variables relevant to the case under study, but the minimum set is Tamb, GTI, and Ws (the parameters required for Voc calculation).
Figure 3. EMY methodology: Stage 1.
  • Stage 2—Voc calculation. Tcell and Voc values are calculated from the hourly weather data and the PV module’s electrical characteristics, following Equations (1)–(3). Each historical day, with its hourly weather parameters, Tcell, and string Voc results (which also depend on string size), is then treated as a candidate for the corresponding day of the new EMY (Figure 4).
Figure 4. EMY methodology: Stage 2.
  • Stage 3—Evaluation of candidate data. For each day in the EMY, candidate days are selected from the historical dataset while preserving hourly resolution. Specifically, for each calendar day, the model gathers complete daily profiles from each available year. Therefore, if 10 years of data are available, there will be 10 full days, one from each year, as candidates to represent that specific day in the EMY (Figure 5).
Figure 5. EMY methodology: Stage 3.
  • Stage 4—EMY definition. Each group of candidate days is then filtered to select a single representative day in the EMY, for each calendar day of the new year under definition. This filtering is based on the maximum Voc value observed in each candidate day, along with its corresponding weather conditions (Tamb, GTI, and Ws). The final selection is determined using a Voc percentile, which ranks the candidate days by their maximum Voc. The user defines the desired risk level through the chosen percentile; once selected, the corresponding day is included in the EMY as the representative for that calendar date (Figure 6).
Figure 6. EMY methodology: Stage 4.
As a result, a new extreme meteorological year for Voc and string size evaluation in PV systems is obtained, based on the user’s risk threshold, which is implemented in the model through percentile selection. For instance, a risk-averse user will define the EMY with the 100th percentile, which means that the EMY will cover the maximum voltage level found in the historical dataset.
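The four stages can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors’ implementation: the function names, the record layout, and the toy Voc function are hypothetical, and the percentile is mapped to a candidate day with a nearest-rank rule, which the text does not specify.

```python
import math
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_average(records):
    """Stage 1: average sub-hourly (timestamp, t_amb, gti, ws) records per hour."""
    buckets = defaultdict(list)
    for ts, t_amb, gti, ws in records:
        buckets[(ts.year, ts.month, ts.day, ts.hour)].append((t_amb, gti, ws))
    return {key: tuple(mean(column) for column in zip(*rows))
            for key, rows in buckets.items()}


def build_emy(hourly, voc_fn, percentile):
    """Stages 2-4: return {(month, day): (daily_max_voc, source_year, hours)}."""
    # Stage 2: compute Voc for every hour and group the hours by historical day.
    days = defaultdict(dict)
    for (year, month, day, hour), weather in hourly.items():
        days[(year, month, day)][hour] = (weather, voc_fn(*weather))
    # Stage 3: collect one candidate day per available year for each calendar day.
    candidates = defaultdict(list)
    for (year, month, day), hours in days.items():
        daily_max = max(voc for _, voc in hours.values())
        candidates[(month, day)].append((daily_max, year, hours))
    # Stage 4: rank candidates by daily maximum Voc and pick the requested
    # percentile (nearest-rank rule, an assumption of this sketch).
    emy = {}
    for key, cands in candidates.items():
        cands.sort(key=lambda cand: cand[0])
        rank = max(1, math.ceil(percentile * len(cands) / 100))
        emy[key] = cands[rank - 1]
    return emy


def toy_voc(t_amb, gti, ws):
    """Hypothetical stand-in for Equations (1)-(3); depends on temperature only."""
    return 1400.0 * (1 - 0.0025 * (t_amb - 25.0))


# Two hypothetical years contributing one candidate each for 1 January.
records = [
    (datetime(2020, 1, 1, 10, 0), 0.0, 600.0, 2.0),
    (datetime(2020, 1, 1, 10, 5), 2.0, 620.0, 2.0),
    (datetime(2021, 1, 1, 10, 0), -10.0, 700.0, 3.0),
    (datetime(2021, 1, 1, 10, 5), -10.0, 720.0, 3.0),
]
emy100 = build_emy(hourly_average(records), toy_voc, percentile=100)
print(emy100[(1, 1)][1])  # year that supplies 1 January in this toy EMY100
```

Each selected entry keeps the complete hourly profile of the chosen day, so the resulting year preserves the within-day continuity of the original records, as described above.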

2.4. EMY Validation

This section evaluates and validates the EMY model using real data from seven utility-scale PV projects in diverse geographic regions. These commercial installations are currently in operation and were commissioned in different years, thus including various PV technologies.
The analysis compares the Voc levels estimated by the TMY and EMY models with reference Voc values derived from each project’s SCADA for the year 2023. It is important to note that Voc is not directly monitored in commercial PV systems, as voltage measurements are typically taken at the inverter level, and inverters are inactive under open-circuit conditions. Therefore, the reference Voc values are calculated, using weather data collected by on-site weather stations and sensors, applying the same methodology (Equations (1)–(3)) used for the TMY and EMY-based estimations.
Each project’s TMY and EMY datasets are generated using Solcast historical data covering 2007 to 2023. Solcast is employed to ensure a sufficiently long time series (>10 years of data), which cannot be sourced directly from the SCADA systems due to the relatively recent commissioning of some of the projects.
A summary of the key characteristics of each PV project is presented in Table 4, including Tamb and GTI as representative environmental parameters for each location.

3. Results

3.1. TMY Voc Assessment

Evaluating the maximum voltage of a PV system using a TMY dataset overlooks extreme and critical overvoltage scenarios, since such conditions are excluded by the statistical methods used to generate TMY data. This section presents a voltage comparison between historical datasets and their corresponding TMY datasets to quantify the error introduced when estimating maximum voltage based on TMY data.

3.1.1. Accuracy Evaluation

During Voc evaluation in PV systems, there is a notable difference in accuracy between results obtained from TMY datasets and those from high-resolution historical datasets (hourly vs. inter-hourly data). This discrepancy is further amplified by the limited scope of TMY data, which typically represent a single year, compared to historical datasets that span at least 10 years. As a result, TMY simulations fail to capture much of the real behavior of PV systems under extreme and rapidly changing weather conditions, such as during cloudy or transient events. Consequently, Voc predictions based on TMY data do not accurately reflect system performance during critical periods for overvoltage scenarios.
The inaccuracy between TMY and historical datasets is graphically represented with scatter plots of Voc results. As a reference, this comparative analysis for the UK site is provided in Figure 7:

3.1.2. Monthly Evaluation

A numerical evaluation is carried out to quantify the Voc discrepancies between historical and TMY datasets. This analysis requires an adequate basis of comparison between the complete historical dataset and the TMY data; for example, if a 10-year historical dataset with five-minute resolution is considered, the analysis compares 105,120 values per weather parameter for each historical year (1,051,200 over the 10 years) against 8760 values per weather parameter for the TMY dataset. When PV systems are simulated, predictions are typically reported monthly, even when the simulation is based on hourly data, because monthly results cover the whole year and provide a good overview of the system’s behavior. Consequently, a monthly evaluation of the Voc deviation between intra-hourly and TMY datasets is carried out.
The monthly evaluation compares the Voc level for each month of the historical dataset against the value for the same month of the TMY dataset, from which maximum and average deviations are obtained. As a first step, Figure 8 shows these monthly Voc levels for the UK site, with TMY values represented by the red line and the different years of the equivalent historical dataset by the colored points.
As shown in Figure 8, the monthly average and maximum Voc values obtained from the TMY follow the dispersion offered by the historical data for each month, proving the voltage correlation between the TMY and the historical dataset. However, relevant discrepancies between historical and TMY Voc values are obtained. This error, which can lead to a relevant error in the maximum voltage prediction, is represented in Figure 9 for the different sites.
Figure 9 shows maximum Voc deviations in spring and autumn, seasons with a critical relation between irradiance and temperature for overvoltage episodes, as follows from the Critical Ratio evaluation carried out in [29]. While the European sites, with more extreme weather, present maximum deviations between 20 V and 30 V, the JAM site, with a more balanced climate throughout the year, presents a maximum error of just 10.81 V. Average monthly errors show the same trend.
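The month-to-month comparison can be sketched as follows. This is an illustrative sketch only: the function name and data layout are hypothetical, the deviation is assumed to be the historical value minus the TMY value, and the voltages in the example are invented.

```python
from statistics import mean


def monthly_deviation(historical, tmy):
    """Compare monthly Voc of each historical year against the TMY month.

    historical: {year: {month: voc}}, tmy: {month: voc}.
    Returns {month: (max_deviation, avg_deviation)} with deviation = historical - TMY.
    """
    result = {}
    for month, tmy_voc in tmy.items():
        deviations = [by_month[month] - tmy_voc
                      for by_month in historical.values() if month in by_month]
        if deviations:  # skip months with no historical data
            result[month] = (max(deviations), mean(deviations))
    return result


# Hypothetical monthly maximum string Voc values in volts.
historical = {2020: {1: 1500.0}, 2021: {1: 1520.0}}
tmy = {1: 1490.0}
print(monthly_deviation(historical, tmy))
```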

3.1.3. Inter-Hourly Evaluation

Although the monthly evaluation provides a good view of the Voc discrepancies between historical and TMY datasets, an intra-hourly evaluation is also carried out. In this case, each 5 min voltage level is compared to the equivalent hour of the TMY. For example, 11:35:00/06/2021 is compared to hour eleven of the same day in the TMY (11:00/06). This detailed analysis gives a more precise picture of the Voc error caused by the lower temporal resolution of TMYs compared with real historical data. The results of this evaluation are provided in Figure 10:
Figure 10 provides the inter-hourly Voc analysis, showing that average error keeps a general deviation around ±5 V and extreme deviations of +10.02 V and −4.68 V. These results show that hourly values are a good approximation of the average voltage level inside each hour. Nevertheless, when the analysis is focused on the extreme situations (maximum error), the results provide a different view. Hourly Voc values show a relevant voltage error against dynamic changes occurring within each hour. In this comparison, the maximum errors range from +60.71 V to +142.33 V, with an average of +110.45 V.
Figure 10 illustrates how the extreme Voc error varies across different months and locations. For sites with more pronounced seasonal contrasts, the maximum Voc error tends to increase during the winter months and decrease in the summer. In contrast, the JAM site, which experiences more stable weather conditions throughout the year, exhibits the lowest extreme Voc error. This suggests that the more extreme a site’s climate is, the greater the intra-hourly variability is in weather conditions, leading to larger deviations of the Voc level from the hourly average.
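The matching of each 5 min sample to its TMY hour can be sketched as below. This is a simplified illustration, not the authors’ code: the function name and data layout are hypothetical, the error is assumed to be the sample value minus the TMY value, and samples without a matching TMY hour (for example 29 February) are simply skipped.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def intra_hourly_errors(samples, tmy_hourly):
    """Compare each sub-hourly Voc sample with the TMY value of the same hour.

    samples: iterable of (timestamp, voc); tmy_hourly: {(month, day, hour): voc}.
    Returns {month: (average_error, maximum_error)} with error = sample - TMY.
    """
    errors = defaultdict(list)
    for ts, voc in samples:
        key = (ts.month, ts.day, ts.hour)
        if key in tmy_hourly:  # skip hours that are absent from the TMY
            errors[ts.month].append(voc - tmy_hourly[key])
    return {month: (mean(values), max(values)) for month, values in errors.items()}


# Hypothetical samples and TMY value in volts.
samples = [
    (datetime(2021, 6, 1, 11, 35), 1510.0),
    (datetime(2021, 6, 1, 11, 40), 1530.0),
]
print(intra_hourly_errors(samples, {(6, 1, 11): 1500.0}))
```

Reporting both the average and the maximum per month mirrors the distinction drawn above: the hourly value tracks the average well, while the maximum exposes the short-lived peaks that an hourly dataset cannot resolve.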

3.2. Meteorological Data Analysis: TMY vs. EMY

This section presents a comparative analysis between TMY, different EMY cases, and inter-hourly historical weather data for the five different sites introduced in Section 2.1. The analysis is focused on the main variables for Voc prediction, which are irradiance, ambient temperature, and wind speed as denoted from Equations (1)–(3). This comparative analysis is provided through Figure 11, Figure 12 and Figure 13:
The EMY algorithm is based on Voc evaluation; nevertheless, Figure 11, Figure 12 and Figure 13 show how this voltage-based method translates into weather conditions. Because the algorithm works on percentiles, evaluating only the extreme conditions for each site (maximum irradiance and wind speed, minimum temperature) would not be representative: these are single values obtained from each simulation, and single values hardly represent a statistical evaluation of a full historical record. The average values, in contrast, confirm that the results obtained through the EMY algorithm follow the trend expected from the Voc criteria: the greater the heat transfer (higher irradiance and wind speed, lower ambient temperature), the higher the voltage. Thus, the evolution of the weather parameters from EMY100 (higher Voc prediction, lower risk) to EMY20 (lower Voc prediction, higher risk) presents a clear tendency:
  • Temperature increases from EMY100 to EMY20.
  • Irradiance decreases from EMY100 to EMY20.
  • Wind speed decreases from EMY100 to EMY20.
Although the voltage analysis in Section 3.3 places both the historical dataset and the TMY on the voltage-percentile scale provided by the EMY method, the results in Figure 11, Figure 12 and Figure 13 already indicate that the historical weather parameters lie close to the EMY100 results, whereas the TMY parameters are closer to EMY60. This is a first approximation of how the historical and TMY Voc levels relate to those provided by the EMY model.

3.3. Voc Analysis: TMY vs. EMY and Historical Dataset

After the Voc assessment between the TMY and the equivalent historical dataset established a clear justification for adopting the new EMY model, this section compares the Voc results of the EMY model with those of the TMY and the historical dataset.
For a precise evaluation, both TMY and EMY were defined using the same historical dataset for each site. In addition, several EMY cases (from 100% to 20% percentiles, with 20% steps) are included in the analysis as reference, but any other percentile could be implemented following the different stages defined in Section 2.3.
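To make the percentile idea concrete, the following minimal sketch builds a per-day Voc profile from a multi-year record. It is not the staged procedure of Section 2.3: it assumes, purely for illustration, that a percentile case can be approximated by taking the nearest-rank percentile of the daily maximum Voc across years. The names `nearest_rank_percentile` and `emy_voc_profile` and all values are hypothetical.

```python
import math

def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile: pct=100 returns the maximum of values."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def emy_voc_profile(voc_by_year, pct):
    """voc_by_year: {year: [daily maximum Voc, one entry per day]}.

    Returns one value per day: the pct-th percentile across the years.
    """
    per_day = zip(*voc_by_year.values())
    return [nearest_rank_percentile(day_values, pct) for day_values in per_day]

# Hypothetical three-day, five-year record (volts), for illustration only.
history = {
    2016: [1380.0, 1402.0, 1391.0],
    2017: [1395.0, 1388.0, 1410.0],
    2018: [1372.0, 1420.0, 1385.0],
    2019: [1401.0, 1397.0, 1399.0],
    2020: [1390.0, 1411.0, 1405.0],
}
profiles = {pct: emy_voc_profile(history, pct) for pct in (100, 80, 60, 40, 20)}
for pct, profile in profiles.items():
    print(pct, profile)
```

Under this assumption the 100% case coincides with the historical maximum of each day, and every lower percentile yields values that are equal or lower, in line with the behavior described for EMY100 through EMY20.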

3.3.1. Monthly Average Voc Level Evaluation (EMY vs. TMY)

As the first analysis, TMY and EMY100 results for each site are compared. Firstly, the evaluation of average Voc levels every month is carried out, comparing EMY100 against TMY for each site in Figure 14:
Figure 14 and its summary in Figure 15 compare the monthly average Voc level differences between the EMY100 and the traditional TMY across the five sites. The results show significant variations, with some sites showing consistently positive differences (e.g., the UK, Spain, and Italy), while Jamaica exhibits predominantly negative values throughout the year. This suggests that, in most cases, the EMY100 model predicts higher voltage levels than the traditional TMY, except in Jamaica, where the differences are mostly negative, indicating lower Voc levels. The highest discrepancies occur in the colder months, particularly in November and December, when Italy and the UK reach their maximum values of 45.16 V and 38.53 V, respectively. In contrast, the lowest values are observed in Jamaica during spring and summer, with extreme drops in April (−36.29 V) and May (−30.11 V). This shows how the EMY model reflects the different meteorological conditions that significantly impact voltage generation in tropical climates compared to temperate regions. Overall, the data highlight the influence of regional climatic conditions on voltage variations: temperate European countries, with more extreme contrasts between colder and warmer months, generally show increased Voc levels under the EMY100 model, while Jamaica, with its tropical and moderate climate, tends to show the opposite trend.

3.3.2. Monthly Maximum Voc Level Evaluation (EMY vs. TMY)

The same comparative analysis is performed between EMY100 and TMY, in this case, for monthly maximum Voc levels:
Unlike the previous analysis, which focused on average voltage differences, the data provided in Figure 16 and summarized in Figure 17 highlight the discrepancies in peak Voc levels, providing valuable insights into the impact of extreme meteorological conditions on photovoltaic system performance and the resulting design. The results show that the maximum Voc differences are generally positive across all locations, indicating that the EMY100 model predicts higher peak voltages than the traditional TMY in most cases. The highest maximum values are recorded in Spain (27.10 V in September), Italy (24.57 V in November), and Portugal (21.27 V in October). The UK consistently shows significant maximum differences, with its highest value reaching 25.54 V in February, reinforcing that the EMY100 model accounts for more intense seasonal variations. On the other hand, Jamaica exhibits lower overall values, with an average of 4.86 V and a peak of 10.51 V in December. This suggests that while the EMY100 model also predicts increased voltage levels in Jamaica, a site with tropical weather, the variations are less pronounced than in regions with more extreme weather.

3.3.3. Maximum Voc Comparison: Historical, EMY, TMY

Figure 18 presents a detailed comparative Voc analysis for the UK site, focusing on the historical dataset and TMY cases:
Figure 18 shows the daily maximum Voc values (days 1–365) derived from the TMY dataset compared to those from the full 10-year historical record used to define the TMY. For each day, the highest Voc value across the 10-year period is plotted, highlighting how the TMY dataset consistently underestimates Voc levels. This discrepancy is especially pronounced during colder months, when Voc levels are naturally higher. As a result, the error is most significant during the periods that are critical for string sizing, as further illustrated in Figure 19.
Next, Voc levels for the EMY cases are compared with the historical and TMY results, confirming that the historical results are equivalent to EMY100 (Figure 20, left), whereas the Voc results provided by the TMY dataset are close to EMY60 (Figure 20, right):

3.3.4. EMY Comparative Analysis

Finally, Figure 21 compares the voltage levels that the EMY model provides for each selected percentile: the Voc levels follow the chosen risk threshold and decrease as the percentile decreases.
Figure 21 illustrates the results for various EMY percentiles at the UK site, revealing that differences between percentiles become more pronounced during colder months and diminish in warmer periods. This seasonal trend underscores the critical influence of low temperatures on Voc levels, driven by the inverse relationship between temperature and voltage. It also highlights that temperature has a greater impact on voltage variability than other meteorological factors such as solar irradiance or wind speed. These findings emphasize the importance of accurately modeling extreme meteorological years, as the highest Voc values, and consequently the greatest risk of overvoltage, occur during colder months. The EMY model has already been applied in the design phase of large-scale photovoltaic plants, enabling the estimation of maximum voltage at specific percentiles. This approach strengthens the justification for system design decisions, particularly string sizing, by providing a robust, percentile-based risk assessment that can be presented to stakeholders and investors.
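The inverse relationship between temperature and voltage can be illustrated with the widely used linear temperature-coefficient correction of Voc. This sketch is not necessarily identical to the paper's Equations (1)–(3): it neglects the irradiance dependence of Voc, and the module parameters (`voc_stc`, `beta_voc`) are hypothetical values chosen only for illustration.

```python
def voc_at_cell_temperature(voc_stc, beta_voc_pct_per_c, t_cell_c):
    """Linear temperature correction of Voc relative to 25 °C (STC).

    voc_stc: module open-circuit voltage at STC, in volts.
    beta_voc_pct_per_c: temperature coefficient of Voc in %/°C (negative).
    t_cell_c: cell temperature in °C.
    """
    return voc_stc * (1 + beta_voc_pct_per_c / 100 * (t_cell_c - 25.0))

# Hypothetical module, for illustration only.
voc_stc = 49.5     # V
beta_voc = -0.27   # %/°C

cold = voc_at_cell_temperature(voc_stc, beta_voc, -5.0)   # winter morning
hot = voc_at_cell_temperature(voc_stc, beta_voc, 45.0)    # summer operation
print(round(cold, 2), round(hot, 2))
```

Because the coefficient is negative, every degree below 25 °C raises Voc, which is why the spread between percentiles and the overvoltage risk concentrate in the colder months.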

3.4. EMY Validation

The EMY model is validated by comparing its Voc results against the Voc levels obtained from the real data of each project presented in Table 4, as well as against the Voc levels provided by the TMY model, so that the ability of both models to predict maximum voltage levels in PV systems can be compared.
As the first step, maximum Voc levels are evaluated by comparing the values obtained through the EMY100 and TMY models against real SCADA data for each project; the results are shown in Figure 22:
Figure 22 provides monthly Voc maximum values, obtained through the SCADA real data of each project, TMY, and EMY100 datasets, respectively. The analysis shows that the TMY-based estimations generally tend to underestimate the real Voc levels. This underestimation is particularly pronounced during high-irradiance periods, which are critical from a system design perspective. In some cases, the deviation exceeded 5%, potentially leading to conservative sizing of inverters and transformers that might not reflect real peak conditions. In contrast, the EMY100 model demonstrates a more conservative estimation, with voltage values consistently exceeding or closely matching SCADA measurements. While this might appear as an overestimation under typical conditions, such a tendency is advantageous in the context of safety designs. EMY100 reduces the likelihood of underrated designs by capturing upper-bound operational scenarios and ensuring more robust system performance under extreme environmental conditions.
The comparative results across all seven projects included in Figure 22 show that EMY100 offers better alignment with the upper voltage ranges observed in the real datasets, while maintaining manageable deviation margins. This confirms EMY100 as a reliable tool for risk-averse system planning, especially when accurate modeling of worst-case scenarios is required. In summary, although TMY remains a valid tool for long-term energy yield assessment, its application in electrical design must be approached with caution. The EMY100 model, with its emphasis on extremes, offers a safer alternative for voltage prediction, minimizing the risk of undervaluing critical parameters that influence the stability and durability of photovoltaic installations.
After the monthly evaluation, the analysis is focused on the maximum Voc levels for each project; EMY is implemented through different percentiles between EMY100 and EMY10 to validate the complete behavior of the new model. Table 5 includes the maximum Voc levels obtained through SCADA, TMY, and different EMY percentiles, providing a powerful view of EMY model performance.
The results provided in Table 5 highlight not only the relative accuracy of each model, but also the configurability and flexibility of the EMY approach. In five out of seven projects, the SCADA-based Voc level exceeds the values predicted by the TMY model. This consistent underestimation by TMY emphasizes its limitations when assessing voltage extremes. By contrast, the EMY100 model provided values that were higher than the SCADA maximums in six of the seven projects, closely aligning with real operational peaks. In the single case where the SCADA value slightly exceeds EMY100 (Project 1), the difference is minimal (less than 4 V, or under 0.3% deviation), suggesting a strong capacity of EMY100 to capture critical voltage scenarios without excessively overestimating.
A particularly valuable feature of the EMY model is its percentile-based tunability, allowing users to select different confidence levels depending on design priorities. As the percentile decreases from EMY100 to EMY10, a gradual reduction in maximum voltage can be observed across all projects. This trend demonstrates that the EMY model is not only reliable for extreme-case scenarios but also adaptable for probabilistic or risk-based design criteria. For example, a system designer prioritizing extreme safety margins might opt for EMY100 or EMY90, while a designer focused on cost-efficiency with moderate risk tolerance may use EMY50 or EMY40. This configurability provides a strategic advantage over static models like TMY, enabling a better balance between robustness and optimization. The EMY model effectively acts as a modular tool, aligning with international best practices that increasingly favor flexibility in probabilistic system modeling.
Next, the daily maximum Voc levels are evaluated, and the results are provided in Table 6 and Table 7. The first table gives the number of days on which the Voc levels obtained from the SCADA system are higher than those of each model, from TMY to EMY10, whereas the second gives the maximum deviation between the SCADA Voc values and each model. These indicators reflect both the frequency and the severity of underestimation by each model, which are crucial when assessing risk margins in string sizing.
The TMY model consistently shows the highest frequency of underestimation across all projects, with the real voltage surpassing TMY predictions on an average of over 180 days per year. For instance, Projects 4 and 5 experienced exceedances on 308 and 288 days, respectively, representing 84% and 79% of the year. Simultaneously, the maximum deviations for TMY were the largest in nearly every project, reaching up to 150.24 V in Project 6 and 96.11 V in Project 5. These results highlight the systematic underestimation risk when relying solely on TMY datasets, potentially leading to unsafe system designs. In contrast, the EMY100 model demonstrates significantly improved performance, with both lower exceedance frequency and reduced voltage gaps. The number of days when SCADA readings exceeded EMY100 predictions dropped significantly, to as few as 1–3 days in Projects 6 and 7, staying below 50 days in all but one project. Moreover, the maximum deviations were consistently modest, typically under 30 V, with most projects falling in the 15–25 V range. This indicates that EMY100 is capable of reliably capturing near-extreme voltage values while preserving realistic margins, making it a conservative yet accurate design reference.
The EMY model offers a tunable structure, where lower percentiles (e.g., EMY90, EMY80, etc.) provide a spectrum of design options. As the percentile decreases, both the frequency of exceedance and the maximum deviation naturally increase, offering a configurable trade-off between system robustness and cost-efficiency. For instance, moving from EMY100 to EMY50 increases exceedance days (e.g., from 21 to 105 in Project 1) and deviation (from 20.06 V to 58.60 V), yet still remains substantially more reliable than TMY in most scenarios.
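The two indicators used in Table 6 and Table 7, exceedance days and maximum deviation, can be computed as in the following sketch. The function name `exceedance_stats` and the four-day series are hypothetical and serve only to show the calculation, not to reproduce the reported results.

```python
def exceedance_stats(scada_daily_max, model_daily_max):
    """Return (days where SCADA exceeds the model, largest SCADA - model gap in volts)."""
    gaps = [s - m for s, m in zip(scada_daily_max, model_daily_max)]
    positive = [g for g in gaps if g > 0]
    return len(positive), max(positive, default=0.0)

# Hypothetical daily maximum Voc values (volts), for illustration only.
scada = [1400.0, 1385.0, 1410.0, 1395.0]
tmy = [1380.0, 1390.0, 1370.0, 1390.0]
emy100 = [1405.0, 1400.0, 1408.0, 1401.0]

tmy_days, tmy_gap = exceedance_stats(scada, tmy)
emy_days, emy_gap = exceedance_stats(scada, emy100)
print(tmy_days, tmy_gap, emy_days, emy_gap)
```

The first number reflects how often a model underestimates the measured voltage and the second how severe the worst underestimation is, so a model can be judged on both frequency and magnitude.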
While the EMY model is primarily conceived as a design tool to ensure PV system safety margins, particularly under maximum voltage scenarios covered by EMY100, it is also relevant to assess its consistency in different operational conditions by adjusting the desired percentile. Thus, an RMSE analysis was performed, comparing each EMY percentile model and TMY against Voc levels obtained from the real SCADA data (Table 8). RMSE provides a quantitative measure of how closely the predicted daily voltages match observed values over the evaluation year.
The analysis reveals that no single EMY percentile consistently minimizes the RMSE across all projects, underscoring the inherent variability of this type of assessment. For instance, while EMY40 and EMY30 provide the lowest RMSE in Projects 1 and 7, EMY80 and EMY90 perform better in Projects 3 and 4, and the EMY60–70 range proves most accurate for Projects 5 and 6. This lack of uniformity reflects a fundamental characteristic of the comparison: the EMY and TMY datasets are constructed from multi-year historical climatology, whereas the SCADA data correspond to a specific year. As such, perfect alignment is neither expected nor indicative of model quality in this context.
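As a rough illustration of the RMSE comparison, the sketch below scores several candidate datasets against one observed series and picks the one with the lowest error. The series, the dataset labels used as dictionary keys, and the helper name `rmse` are hypothetical and do not reproduce Table 8.

```python
import math

def rmse(predicted, observed):
    """Root-mean-square error between two equally long series."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical daily maximum Voc values (volts), for illustration only.
observed = [1390.0, 1400.0, 1410.0]
candidates = {
    "TMY": [1380.0, 1390.0, 1400.0],
    "EMY100": [1405.0, 1415.0, 1425.0],
    "EMY60": [1393.0, 1396.0, 1410.0],
}

scores = {name: rmse(series, observed) for name, series in candidates.items()}
best = min(scores, key=scores.get)
print(best, {name: round(value, 2) for name, value in scores.items()})
```

In this toy case the most conservative dataset does not give the lowest RMSE, which mirrors the point that a low RMSE and a safe upper bound are different objectives.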
To further interpret the RMSE behavior of the EMY model across different percentiles, we examined the results in conjunction with the climatic and geographic characteristics of each project (Table 4). This approach explores whether certain EMY percentiles align better with specific environmental profiles, such as temperature range or irradiance, thus providing insight into the contextual suitability of each percentile level.
Projects 1, 2, 3, and 5 share relatively similar climatic patterns: wide annual temperature ranges (minimums below −5 °C and maximums above 40 °C) and high global tilted irradiation (GTI > 530 W/m2). Project 1 (best RMSE: EMY40–30) and Project 3 (best RMSE: EMY30) show lower RMSE values at mid-to-low percentiles, suggesting that under highly variable conditions, these percentiles better track the distribution of daily voltages. Project 2, however, has relatively high RMSE across all EMYs, with no clear advantage over TMY. This may indicate local anomalies or operational behaviors not well captured by climatological models. Project 5, with the lowest average temperature (15.68 °C) and moderate irradiance, shows its lowest RMSE at EMY70 and EMY80, suggesting that in colder locations with slightly lower GTI, more conservative (higher percentile) EMYs better reflect measured voltages.
Project 4 exhibits a narrower temperature range (8.4–29.8 °C) and very high GTI (622.5 W/m2). Interestingly, this project shows its lowest RMSE at EMY80 and EMY90 (8.04 and 8.27 V), outperforming both TMY and EMY100. This indicates that in stable climates with minimal seasonal variation, higher EMY percentiles may better match real operating behavior due to limited extreme events, allowing more accurate forecasting with lower safety margins.
Project 6 is characterized by the lowest irradiance (GTI = 308.5 W/m2) and coolest conditions (average ambient temperature of 11.69 °C). Notably, none of the EMY percentiles achieves a particularly low RMSE, with values ranging from 32.63 to 46.89 V. TMY performs best in this case (RMSE = 42.34 V), though still with a large error. This suggests that in low-irradiance, temperate climates, both EMY and TMY models may struggle to represent daily variability effectively, potentially due to frequent cloud cover and stochastic irradiance patterns not well captured in averaged models.
Project 7, with the highest average temperature (27.32 °C) and narrow thermal range, demonstrates its lowest RMSE at EMY10 (7.14 V) and EMY30 (7.60 V), even outperforming TMY (7.98 V). In tropical environments with high baseline irradiance and limited fluctuation, lower EMY percentiles may suffice, as daily maximum voltages are more predictable and rarely extreme. This aligns with the observation that conservative EMYs (e.g., EMY100) may overestimate voltage in stable tropical zones, leading to unnecessarily high RMSE.
This contextual analysis reveals that the EMY percentile providing the lowest RMSE is not uniform across projects, and appears to depend on local climatic conditions as follows:
  • High-variability climates tend to favor mid-percentile EMYs (30–70).
  • Stable tropical climates perform best with lower percentiles (EMY80–10).
  • Low irradiance and temperate zones show no strong alignment, indicating limited model resolution under high atmospheric variability.
These findings reaffirm that the EMY model’s strength lies not in minimizing RMSE, but in offering a configurable framework that adapts to project-specific design risk profiles. While RMSE helps assess operational alignment, percentile selection should ultimately be guided by engineering conservatism and environmental predictability, not solely by statistical error.
Finally, to quantify the performance of the EMY100 approach relative to the conventional TMY method, an accuracy analysis was conducted using the maximum Voc values from SCADA measurements of the seven utility-scale PV projects considered in this study. The evaluation focused on two metrics: the Absolute Percentage Error (APE) and the Absolute Error (AE, in volts) for both datasets, and the results are presented in Table 9. This comparison directly assesses the capability of each approach to replicate real maximum Voc values, which are critical for string sizing in large-scale PV systems.
Table 9 shows that EMY100 consistently delivers lower APE and AE values than the conventional TMY method. On average, the APE decreases from 1.00% with TMY to 0.61% with EMY100, corresponding to a relative reduction of 38.6% in percentage error. In absolute terms, the AE drops from 14.70 V to 8.63 V, representing a 41.3% reduction. These reductions are highly relevant in engineering practice, as even small voltage deviations can cause overvoltage events that exceed design or regulatory limits in high-power PV plants. The mean reduction of more than 6 V achieved by EMY100 translates into a significantly lower likelihood of surpassing the maximum allowable DC voltage, supporting both operational safety and compliance.
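The two accuracy metrics and the relative reductions quoted above can be reproduced with a few lines of arithmetic. In this sketch the function names are hypothetical, the mean AE and APE values are the rounded figures reported in the text, and the single-project example uses made-up voltages.

```python
def absolute_error(predicted, measured):
    """Absolute Error (AE), in the same unit as the inputs."""
    return abs(predicted - measured)

def absolute_percentage_error(predicted, measured):
    """Absolute Percentage Error (APE), in percent of the measured value."""
    return 100 * abs(predicted - measured) / measured

def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100 * (baseline - improved) / baseline

# Hypothetical single project: measured and predicted maximum Voc in volts.
ae = absolute_error(1391.0, 1400.0)
ape = absolute_percentage_error(1391.0, 1400.0)

# Rounded mean values reported in the text (TMY vs. EMY100).
ae_reduction = relative_reduction(14.70, 8.63)
ape_reduction = relative_reduction(1.00, 0.61)
print(ae, round(ape, 3), round(ae_reduction, 1), round(ape_reduction, 1))
```

With the rounded means, the AE reduction comes out at 41.3%, matching the text, while the APE reduction comes out at 39.0%; the reported 38.6% presumably reflects unrounded inputs.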

4. Discussion and Analysis

Typical Meteorological Years (TMYs) have become the standard weather datasets for predicting renewable energy resources, whose generation is strongly dependent on meteorological conditions. They provide a statistically representative year compiled from long-term records through standardized methods. While TMYs are widely applied for different purposes, this study focuses on their use in photovoltaic system design. In large-scale PV projects, predicting the maximum open-circuit voltage (Voc) during operation is essential for defining string size, a critical element in the overall system design. Although advanced machine learning (ML) models have shown strong capabilities in PV performance forecasting, particularly for energy yield or short-term operational predictions, they typically require large volumes of site-specific, high-frequency data, extensive training processes, and specialized computational resources. These factors can limit their practicality for early-stage design in large-scale projects, where rapid assessments are required and data availability may be restricted. In contrast, the proposed Extreme Meteorological Year (EMY) model offers a computationally light, transparent, and easily configurable approach focused exclusively on extreme voltage estimation under rare meteorological conditions. This makes the EMY model particularly suitable for industrial design workflows, enabling reliable risk-based voltage predictions without the complexity and data dependencies inherent in many ML approaches. Furthermore, by enabling more precise voltage estimation, EMY supports sustainable design practices, optimizing material usage and reducing both economic costs and environmental impact.
TMYs deliver an efficient dataset for predicting PV energy yield under statistically average conditions; however, they are not well suited for capturing extreme weather events that can trigger critical overvoltage scenarios. As shown in Section 3.1, TMY datasets lack the temporal granularity and event coverage needed to accurately reflect the dynamic situations that favor maximum Voc. Monthly deviations between TMY-based predictions and historical records can reach 20–30 V. When higher-resolution (inter-day and inter-hour) analyses are considered, the underestimation ranges from 60.71 V to 142.33 V, with an average deviation of 110.45 V.
To address this gap, this study proposes the EMY model. Like the TMY concept, EMY compiles a reduced yet representative dataset, focusing on capturing voltage extremes rather than average conditions. Its percentile-based structure enables tailoring predictions according to different safety margins and risk tolerances. The EMY100 percentile exactly reproduces the maximum voltages from the full historical dataset, thereby eliminating the underestimation present in TMY predictions.
In practical applications for large-scale PV projects, EMY can be implemented during the preliminary and detailed design phases to determine the maximum open-circuit voltage expected under site-specific extreme conditions. This allows engineers to define string sizes and select equipment voltage ratings with greater confidence, minimizing the risk of overvoltage events without unnecessary oversizing. The model’s configurability means that different percentiles can be selected depending on project priorities: conservative designs may use EMY100 to fully cover historical extremes, while cost-optimized designs in stable climates may adopt lower percentiles such as EMY50–70 to achieve a balance between safety and material efficiency.
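The link between the predicted maximum Voc and string size can be shown with a short calculation. This is a simplified sketch: the function name, the 1500 V system limit, and the per-module voltages attributed to each dataset are hypothetical, and a real design would also account for equipment tolerances and applicable standards.

```python
import math

def max_modules_in_series(system_voltage_limit, module_voc_max):
    """Largest string length whose total Voc stays within the system limit."""
    return math.floor(system_voltage_limit / module_voc_max)

# Hypothetical inputs, for illustration only.
system_limit = 1500.0    # V, maximum allowable DC voltage
voc_tmy_based = 51.6     # V per module, maximum predicted from a typical year
voc_emy100_based = 53.5  # V per module, maximum predicted from an extreme year

string_tmy = max_modules_in_series(system_limit, voc_tmy_based)
string_emy100 = max_modules_in_series(system_limit, voc_emy100_based)

# A string sized on the lower estimate, exposed to the higher voltage.
worst_case_voltage = string_tmy * voc_emy100_based
print(string_tmy, string_emy100, worst_case_voltage)
```

In this example a string sized from the lower voltage estimate would exceed the system limit if the higher voltage actually occurred, which is the overvoltage risk that a more conservative percentile is meant to avoid.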
The method is particularly valuable in the industrial context, where rapid yet reliable decision-making is needed. By condensing decades of meteorological data into a single synthetic year focused on extreme values, EMY streamlines the engineering workflow and avoids the computational burden of processing the full dataset. This efficiency and its demonstrated accuracy against real SCADA measurements make EMY a robust tool for integration into PV plant design guidelines, feasibility studies, and regulatory compliance documentation.
The comparative evaluation between EMY, TMY, and SCADA-measured data across seven PV projects confirms the EMY model’s superior ability to represent extreme Voc levels. In most cases, EMY100 matched or slightly overestimated measured maxima, significantly reducing the risk of unexpected overvoltage episodes. On average, EMY100 achieved a prediction accuracy exceeding 92%, compared to 80–85% for TMY datasets, corresponding to an error reduction of approximately 35–60% in maximum voltage estimation. The percentile-based configuration also proves valuable: as the selected percentile decreases, the predicted maximum voltage reduces and the exceedance frequency increases, offering flexibility for different design philosophies.
Daily exceedance counts and maximum deviation analyses reinforce these findings, showing that EMY100 offers the most conservative and robust estimates, while lower percentiles provide operationally tighter fits. RMSE results, however, indicate that no single EMY percentile consistently outperforms all others across different sites, reflecting the natural variability between a constructed meteorological year and a specific operational year. Intermediate percentiles (EMY40–70) performed better in Mediterranean and tropical climates, whereas stable climates favored lower percentiles. These results validate EMY’s adaptability to different environmental and operational contexts and underline its robustness as a practical tool for large-scale PV design, minimizing overvoltage risk without oversizing.

5. Conclusions

This study demonstrates that the Extreme Meteorological Year model offers a reliable and adaptable alternative to conventional meteorological year datasets for predicting maximum open-circuit voltages in photovoltaic systems. By incorporating extreme-value statistics through a percentile-based approach, EMY allows system designers to select an appropriate safety margin according to project-specific risk tolerance. The EMY100 percentile, in particular, ensures full coverage of historical extremes, providing a secure upper bound that minimizes the likelihood of overvoltage-related equipment failures.
The validation against SCADA data from seven PV plants shows that EMY consistently reduces the underestimation present in TMY predictions, especially in locations with high climatic variability. The model’s configurability means it can be optimized for safety and accuracy in specific operational contexts, enhancing both technical robustness and long-term system reliability.
From a sustainability perspective, the EMY approach directly supports the transition to more resilient and efficient renewable energy systems. Enabling safer and more accurate electrical sizing reduces the risk of premature component replacement, improves system uptime, and optimizes the use of materials and resources throughout the PV plant’s lifecycle. These benefits translate into lower environmental impact per unit of energy generated, contributing to the broader goals of sustainable energy deployment and climate change mitigation.
In terms of application scope, EMY can be applied to various climatic zones, including Mediterranean, tropical, and continental conditions, provided that long-term high-quality meteorological datasets (irradiance and temperature) are available. Its percentile-based configuration ensures adaptability to both highly variable climates—where conservative upper bounds are essential—and more stable environments, where tighter fits may be preferable.
However, some limitations exist. The model does not perform real-time weather prediction and relies entirely on historical datasets for its statistical construction. Its accuracy may decrease in sites with scarce or incomplete meteorological records, or where standard measurement stations poorly capture microclimatic effects (e.g., complex terrain, local shading, or thermal inversions). Furthermore, the present scope is restricted to predicting DC-side voltage extremes; the method does not estimate other PV performance metrics such as annual yield, inverter clipping losses, or AC-side conditions.
Future research will explore three main directions: (i) the integration of higher-resolution temporal datasets to capture short-term extreme events better, (ii) the hybridization of the EMY methodology with machine learning techniques to improve accuracy in data-scarce environments, and (iii) the expansion of the model to predict other critical design parameters, such as current peaks or thermal stress profiles, enabling a more holistic approach to PV system electrical design.
Adopting EMY in PV system design practices can potentially improve both the economic and environmental performance of solar power projects, reinforcing their role as a cornerstone of the global renewable energy transition.

Author Contributions

Methodology, C.S.-G. and J.A.-M.; Validation, C.S.-G. and C.V.-S.; Formal analysis, C.S.-G. and D.A.-S.; Investigation, C.S.-G.; Resources, J.A.-M.; Writing – original draft, C.S.-G.; Writing – review & editing, J.A.-M., C.V.-S. and D.A.-S.; Supervision, J.A.-M., C.V.-S. and D.A.-S.; Project administration, J.A.-M.; Funding acquisition, C.V.-S. All authors have read and agreed to the published version of the manuscript.

Funding

This study is part of the R&D plan of Grupotec, a leading company in the renewable energy sector committed to continuing to innovate and improve the exploitation of photovoltaic solar energy. Specifically, the study is part of a collaborative project with the Polytechnic University of Valencia through the Energy Engineering Institute. In addition, the innovative nature of this project has been recognized by the Valencian Innovation Agency, which has subsidized part of the project’s development budget through the European Regional Development Fund (ERDF) 2021–2027, File number INNTA3/2023/9.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are confidential.

Conflicts of Interest

Authors Carlos Sanchís-Gómez and Jorge Aleix-Moreno were employed by the company Departamento de Ingeniería de Grupotec Renovables, Grupotec Servicios Avanzados SA. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Abbreviations

The following abbreviations are used in this manuscript:
GTI: In-plane global tilted irradiance (W/m2)
Isc: Short-circuit current (A)
Imp: Current level for maximum power conditions (A)
NOCT: Nominal operating cell temperature (°C)
Pmax: Maximum power (W)
STC: Standard test conditions (1000 W/m2, 25 °C, AM 1.5)
TMY: Typical meteorological year
U: Heat transfer factor (W/m2K)
USP: Utility-scale photovoltaic projects
Vmp: Voltage level for maximum power conditions (V)
Voc: Open-circuit voltage (V)
Ws: Wind speed (m/s)
α: Module absorption coefficient (-)
αIsc: Temperature coefficient of Isc (%/°C)
βVoc: Temperature coefficient of Voc (%/°C)
ηmod,STC: Module efficiency at STC (%)
τ: Transmittance coefficient (W/m2K)
γPmax: Temperature coefficient at Pmax (%/°C)
Tamb: Ambient temperature (°C)
Tcell: Module’s cell temperature (°C)

References

  1. International Renewable Energy Agency. Renewable Energy Statistics 2023 Statistiques D’énergie Renouvelable 2023 Estadísticas de Energía Renovable 2023 About IRENA. 2023. Available online: www.irena.org (accessed on 27 July 2025).
  2. Panagoda, S.S.; Alwis, S.; Panagoda, L.P.S.S.; Sandeepa, R.A.H.T.; Perera, W.A.V.T.; Sandunika, D.M.I.; Siriwardhana, S.M.G.T.; Alwis, M.K.S.D.; Dilka, S.H.S. Advancements in Photovoltaic (Pv) Technology for Solar Energy Generation. J. Res. Technol. Eng. 2023, 4, 30–72. Available online: https://www.researchgate.net/publication/372364724 (accessed on 27 July 2025).
  3. Yu, Y.; Chou, J.; Yao, X.; Ma, N. Generation and application of typical meteorological year data for PV system potential assessment: A case study in China. J. Build. Eng. 2024, 86, 108831. [Google Scholar] [CrossRef]
  4. ISO 15927-4:2005; Hygrothermal Performance of Buildings—Calculation and Presentation of Climatic Data. International Organization for Standardization: Geneva, Switzerland, 2005. Available online: https://www.iso.org/standard/41371.html (accessed on 4 January 2025).
  5. Li, H.; Huo, Y.; Fu, Y.; Yang, Y.; Yang, L. Improvement of methods of obtaining urban TMY and application for building energy consumption simulation. Energy Build. 2023, 295, 113300. [Google Scholar] [CrossRef]
  6. Crawley, D.; Lawrie, L. Rethinking the TMY: Is the ‘Typical’ Meteorological Year Best for Building Performance Simulation? In Proceedings of the BS2015: 14th Conference of International Building Performance Simulation Association, Hyderabad, India, 7–9 December 2015. [Google Scholar] [CrossRef]
  7. Li, H.; Huang, J.; Hu, Y.; Wang, S.; Liu, J.; Yang, L. A new TMY generation method based on the entropy-based TOPSIS theory for different climatic zones in China. Energy 2021, 231, 120723. [Google Scholar] [CrossRef]
  8. Lhendup, T.; Lhundup, S. Comparison of methodologies for generating a typical meteorological year (TMY). Energy Sustain. Dev. 2007, 11, 5–10. [Google Scholar] [CrossRef]
  9. PVsyst, TMY Generation Algorithms. Available online: https://www.pvsyst.com/help-pvsyst7/meteo_tmy_algorithms.htm (accessed on 28 January 2025).
  10. Cebecauer, T.; Suri, M. Typical Meteorological Year Data: SolarGIS Approach. Energy Procedia 2015, 69, 1958–1969. [Google Scholar] [CrossRef]
  11. Realpe, A.M.; Vernay, C.; Pitaval, S.; Lenoir, C.; Blanc, P. Benchmarking of Five Typical Meteorological Year Datasets Dedicated to Concentrated-PV Systems. Energy Procedia 2016, 97, 108–115. [Google Scholar] [CrossRef]
  12. Ferreira, A.C.; Menezes, N.; Teixeira, I.M.; Teixeira, S.F.; Martins, L.A. Influence of Typical Meteorological Years on the Optimization of Incident Solar Radiation for PV Applications in Portugal. In Proceedings of the 3rd International Conference on Electronic Engineering and Renewable Energy Systems; Saidia, Morocco, 20–22 May 2022, Springer: Berlin/Heidelberg, Germany, 2023; pp. 925–932. [Google Scholar] [CrossRef]
  13. Salehi, S.S.S.; Kalamees, T.; Kurnitski, J.; Thalfeldt, M. New typical meteorological year generation method based on long-term building energy simulations. Build. Environ. 2024, 256, 111504. [Google Scholar] [CrossRef]
  14. Chan, A.L.S. Generation of typical meteorological years using genetic algorithm for different energy systems. Renew. Energy 2016, 90, 1–13. [Google Scholar] [CrossRef]
  15. Hosseini, M.; Bigtashi, A.; Lee, B. A systematic approach in constructing typical meteorological year weather files using machine learning. Energy Build. 2020, 226, 110375. [Google Scholar] [CrossRef]
  16. Argiriou, A.; Lykoudis, S.; Kontoyiannidis, S.; Balaras, C.A.; Asimakopoulos, D.; Petrakis, M.; Kassomenos, P. Comparison of Methodologies for Tmy Generation Using 20 Years Data for Athens, Greece. 1999. Available online: https://www.sciencedirect.com/science/article/abs/pii/S0038092X99000122 (accessed on 27 July 2025).
  17. Crawley, D.B. Which Weather Data Should You Use for Energy Simulations of Commercial Buildings? Trans.-Am. Soc. Heat. Refrig. Air Cond. Eng. 1998, 104, 498–515. [Google Scholar]
  18. Zayed, M.E.; Rehman, S.; Elgendy, I.A.; Al-Shaikhi, A.; Mohandes, M.A.; Irshad, K.; Abdelrazik, A.S.; Alam, M.A. Benchmarking reinforcement learning and prototyping development of floating solar power system: Experimental study and LSTM modeling combined with brown-bear optimization algorithm. Energy Convers. Manag. 2025, 332, 119696. [Google Scholar] [CrossRef]
  19. Rehman, S.; Irshad, K.; Mohandes, M.A.; Al-Shaikhi, A.; Zayed, M.E. Comprehensive evaluation of solar floating photovoltaic prospective in Saudi Arabia: Comparative experimental investigation and thermal performance analysis. Sol. Energy 2024, 283, 113015. [Google Scholar] [CrossRef]
  20. Stoffel, T.; Renné, D.; Myers, D.; Wilcox, S.; Sengupta, M.; George, R.; Turchi, C. Best Practices Handbook for the Collection and Use of Solar Resource Data. 2010. Available online: www.nrel.gov (accessed on 27 July 2025).
  21. Habte, A.; Lopez, A.; Sengupta, M.; Wilcox, S. Temporal and Spatial Comparison of Gridded TMY, TDY, and TGY Data Sets. 2014. Available online: www.nrel.gov/publications (accessed on 27 July 2025).
  22. Wu, Y.; An, J.; Gui, C.; Xiao, C.; Yan, D. A global typical meteorological year (TMY) database on ERA5 dataset. Build. Simul. 2023, 16, 1013–1026. [Google Scholar] [CrossRef]
  23. Wilcox, S.; Marion, W. Innovation for Our Energy Future Users Manual for TMY3 Data Sets. 1994. Available online: http://www.osti.gov/bridge (accessed on 27 July 2025).
  24. Uwineza, L.; Kim, H.-G.; Kim, C.K.; Kim, B.; Kim, J.-Y. Accuracy Assessment of Typical Meteorological Year Data for a Photovoltaic System using a Bootstrap Method. J. Korean Sol. Energy Soc. 2021, 41, 115–129. [Google Scholar] [CrossRef]
  25. Kleissl, J. Solar Energy Forecasting and Resource Assessment; Academic Press: Cambridge, MA, USA, 2013. [Google Scholar]
  26. Vignola, F.E.; McMahan, A.C.; Grover, C.N. Bankable Solar-Radiation Datasets. In Solar Energy Forecasting and Resource Assessment; Elsevier: Amsterdam, The Netherlands, 2013; pp. 97–131. [Google Scholar] [CrossRef]
  27. Leloux, J.; Lorenzo, E.; García-Domingo, B.; Aguilera, J.; Gueymard, C.A. A bankable method of assessing the performance of a CPV plant. Appl. Energy 2014, 118, 1–11. [Google Scholar] [CrossRef]
  28. Putra, I.D.G.A.; Nimiya, H.; Sopaheluwakan, A.; Kubota, T.; Lee, H.S.; Pradana, R.P.; Alfata, M.N.F.; Perdana, R.B.; Permana, D.S.; Riama, N.F.; et al. Development of typical meteorological years based on quality control of datasets in Indonesia. Renew. Energy 2023, 221, 119699. [Google Scholar] [CrossRef]
  29. Sanchís-Gómez, C.; Aleix-Moreno, J.; Vargas-Salgado, C.; Alfonso-Solar, D. The novel evaluation method for PV module temperature and string size risk in utility-scale solar projects. Sol. Energy 2025, 295, 113520. [Google Scholar] [CrossRef]
  30. Bošnjaković, M.; Stojkov, M.; Katinić, M.; Lacković, I. Effects of Extreme Weather Conditions on PV Systems. Sustainability 2023, 15, 16044. [Google Scholar] [CrossRef]
  31. Okonkwo, P.C.; Nwokolo, S.C.; Udo, S.O.; Obiwulu, A.U.; Onnoghen, U.N.; Alarifi, S.S.; Eldosouky, A.M.; Ekwok, S.E.; Andráš, P.; Akpan, A.E. Solar PV systems under weather extremes: Case studies, classification, vulnerability assessment, and adaptation pathways. Energy Rep. 2025, 13, 929–959. [Google Scholar] [CrossRef]
  32. Jordan, D.C.; Perry, K.; White, R.; Deline, C. Extreme Weather and PV Performance. IEEE J. Photovolt. 2023, 13, 830–835. [Google Scholar] [CrossRef]
  33. Bright, J.M. Solcast: Validation of a satellite-derived solar irradiance dataset. Sol. Energy 2019, 189, 435–449. [Google Scholar] [CrossRef]
  34. Yang, D.; Bright, J.M. Worldwide validation of 8 satellite-derived and reanalysis solar radiation products: A preliminary evaluation and overall metrics for hourly data over 27 years. Sol. Energy 2020, 210, 3–19. [Google Scholar] [CrossRef]
  35. Solcast. Solcast Web Site. 2025. Available online: https://solcast.com/ (accessed on 9 May 2025).
  36. Mattei, M.; Notton, G.; Cristofari, C.; Muselli, M.; Poggi, P. Calculation of the polycrystalline PV module temperature using a simple method of energy balance. Renew. Energy 2006, 31, 553–567. [Google Scholar] [CrossRef]
Figure 1. Flowchart of the applied methodology.
Figure 2. EMY methodology.
Figure 7. Scatter Voc graphs of TMY and historical dataset for the UK site.
Figure 8. Average and maximum monthly Voc levels, comparative analysis between the historical dataset (from year 1 to 10) and TMY, for the UK site.
Figure 9. Voc analysis between monthly average values of the historical dataset and TMY, providing maximum (left) and average (right) deviations.
Figure 10. Voc error between inter-hourly historical data and hourly values for the equivalent TMY.
Figure 11. Irradiance for different datasets.
Figure 12. Ambient temperature for different datasets.
Figure 13. Wind speed for different datasets.
Figure 14. Error evaluation (EMY100 − TMY) of average Voc levels.
Figure 15. Average Voc EMY100-TMY error. Evaluation summary.
Figure 16. Error evaluation (EMY100 − TMY) of maximum Voc levels.
Figure 17. Maximum Voc EMY100-TMY error. Evaluation summary.
Figure 18. Voc daily results for UK site—historical dataset vs. TMY.
Figure 19. Voc error: TMY vs. historical dataset in UK site.
Figure 20. EMY vs. historical and TMY Voc levels.
Figure 21. Voc comparative analysis between different EMY cases.
Figure 22. Voc maximum levels for each project.
Table 1. PV sites.

| Site | Region |
|---|---|
| #1 | Spain (SP) |
| #2 | Italy (IT) |
| #3 | United Kingdom (UK) |
| #4 | Jamaica (JAM) |
| #5 | Portugal (PORT) |

Table 2. Meteorological conditions of the sites.

| Parameter | SP | IT | UK | JAM | PORT |
|---|---|---|---|---|---|
| Average Irradiance (W/m²) | 575 | 418 | 309 | 496 | 523 |
| Maximum Irradiance (W/m²) | 1077 | 1103 | 1052 | 1082 | 1106 |
| Minimum Temperature (°C) | −9.0 | −6.8 | −6.6 | 20.1 | 4.0 |
| Average Temperature (°C) | 18.5 | 19.1 | 13.5 | 27.1 | 19.4 |
| Maximum Temperature (°C) | 40.1 | 40.5 | 37.0 | 32.7 | 34.6 |
| Average Wind Speed (m/s) | 3.4 | 2.3 | 4.3 | 3.9 | 4.3 |
| Maximum Wind Speed (m/s) | 14.2 | 12.6 | 17.4 | 10.3 | 15.5 |

Table 3. PV module.

| Parameter | Value |
|---|---|
| Model | JKM585N-72HL4-BDV |
| Type | N-type monocrystalline |
| Pmax (W) | 585 |
| Vmp (V) | 42.74 |
| Imp (A) | 13.69 |
| Voc (V) | 51.67 |
| Isc (A) | 14.43 |
| ηmod,STC (%) | 22.65 |
| γPmax (%/°C) | −0.3 |
| βVoc (%/°C) | −0.25 |
| αIsc (%/°C) | 0.046 |
| NOCT (°C) | 45 |

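The Voc and βVoc values in Table 3 are the inputs to any cold-weather voltage estimate and, through it, to the maximum number of modules in series. The following is a minimal sketch of that arithmetic, not the authors' model: it assumes a simple linear temperature correction of Voc with irradiance effects ignored, takes the cell temperature equal to the minimum ambient temperature of the Spanish site in Table 2, and uses a hypothetical 1500 V system voltage limit that is not stated in this section. The function names are illustrative.

```python
import math

# Module data from Table 3 (JKM585N-72HL4-BDV)
VOC_STC = 51.67    # open-circuit voltage at STC (V)
BETA_VOC = -0.25   # temperature coefficient of Voc (%/°C)
T_STC = 25.0       # cell temperature at STC (°C)


def voc_at_cell_temperature(t_cell: float) -> float:
    """Linear temperature correction of Voc (assumed model, irradiance ignored)."""
    return VOC_STC * (1.0 + BETA_VOC / 100.0 * (t_cell - T_STC))


def max_modules_in_series(voc_max: float, system_voltage_limit: float) -> int:
    """Largest string length whose summed Voc stays within the voltage limit."""
    return math.floor(system_voltage_limit / voc_max)


# Assumptions: cell temperature equals the minimum ambient temperature of the
# Spanish site in Table 2 (-9.0 °C); 1500 V is a hypothetical system limit.
voc_cold = voc_at_cell_temperature(-9.0)
n_max = max_modules_in_series(voc_cold, 1500.0)
print(f"Voc at -9.0 °C: {voc_cold:.2f} V, max modules in series: {n_max}")
```

Because the string length is obtained by rounding down, a difference of a few volts in the predicted maximum Voc can change the allowable number of modules per string, which is why the choice of meteorological dataset matters for this calculation.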
Table 4. Projects' main characteristics.

| Project | Location | PV Module | Avg. Tamb (°C) | Max. Tamb (°C) | Min. Tamb (°C) | Avg. GTI (W/m²) |
|---|---|---|---|---|---|---|
| 1 | SP | JKM395M-72H-V | 18.58 | 40.10 | −8.90 | 575.64 |
| 2 | SP | JKM535M-72HL4-TV | 21.40 | 42.70 | −2.40 | 585.76 |
| 3 | SP | JKM440M-78H-V | 19.61 | 43.00 | −5.90 | 582.98 |
| 4 | KE | JKM330PP-72 | 20.12 | 29.80 | 8.40 | 622.54 |
| 5 | SP | JKM585N-72HL4-BDV | 15.68 | 36.60 | −8.80 | 538.45 |
| 6 | UK | LR4_72_HPH_440M | 11.69 | 25.00 | −4.00 | 308.52 |
| 7 | CARIB | TSM-270PD05 | 27.32 | 31.00 | 22.60 | 510.29 |

Table 5. Maximum Voc level (V) of each project, as provided by each dataset.

| Project | SCADA | TMY | EMY100 | EMY90 | EMY80 | EMY70 | EMY60 | EMY50 | EMY40 | EMY30 | EMY20 | EMY10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1399.53 | 1372.78 | 1396.43 | 1394.09 | 1392.84 | 1380.21 | 1384.60 | 1380.31 | 1384.37 | 1373.67 | 1390.34 | 1373.19 |
| 2 | 1529.20 | 1511.08 | 1531.37 | 1531.37 | 1531.37 | 1530.73 | 1517.02 | 1519.95 | 1513.79 | 1513.09 | 1517.45 | 1516.98 |
| 3 | 1497.83 | 1516.56 | 1514.89 | 1514.89 | 1508.85 | 1507.80 | 1501.80 | 1507.80 | 1501.06 | 1491.30 | 1493.12 | 1495.08 |
| 4 | 1400.13 | 1394.26 | 1402.09 | 1398.92 | 1402.09 | 1393.29 | 1395.93 | 1402.09 | 1398.92 | 1396.48 | 1395.73 | 1399.57 |
| 5 | 1543.93 | 1519.42 | 1554.90 | 1551.88 | 1542.90 | 1533.46 | 1550.84 | 1533.94 | 1539.22 | 1512.13 | 1517.24 | 1534.78 |
| 6 | 1522.62 | 1514.66 | 1541.91 | 1525.32 | 1526.08 | 1519.41 | 1514.73 | 1528.33 | 1522.22 | 1520.66 | 1514.10 | 1515.71 |
| 7 | 880.51 | 881.44 | 886.36 | 885.47 | 882.82 | 882.90 | 882.87 | 882.06 | 882.02 | 880.20 | 884.13 | 883.23 |

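Table 6. Number of days in which SCADA exceeds the Voc level estimated by each model.

| Project | TMY | EMY100 | EMY90 | EMY80 | EMY70 | EMY60 | EMY50 | EMY40 | EMY30 | EMY20 | EMY10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 193 | 21 | 41 | 51 | 100 | 112 | 105 | 153 | 182 | 166 | 182 |
| 2 | 130 | 21 | 29 | 39 | 48 | 55 | 55 | 80 | 84 | 80 | 93 |
| 3 | 150 | 12 | 37 | 59 | 102 | 108 | 114 | 129 | 172 | 157 | 172 |
| 4 | 308 | 103 | 176 | 190 | 246 | 258 | 254 | 272 | 306 | 274 | 276 |
| 5 | 288 | 49 | 88 | 110 | 173 | 179 | 172 | 228 | 258 | 243 | 258 |
| 6 | 167 | 3 | 15 | 32 | 77 | 85 | 102 | 148 | 177 | 185 | 211 |
| 7 | 37 | 1 | 2 | 4 | 7 | 8 | 16 | 19 | 33 | 40 | 65 |
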
Table 7. Maximum daily Voc deviation between SCADA and each model.

| Project | TMY | EMY100 | EMY90 | EMY80 | EMY70 | EMY60 | EMY50 | EMY40 | EMY30 | EMY20 | EMY10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 70.68 | 20.06 | 23.81 | 40.18 | 40.81 | 58.43 | 58.60 | 55.00 | 58.60 | 67.94 | 63.44 |
| 2 | 56.47 | 27.39 | 38.07 | 43.21 | 49.17 | 49.47 | 35.08 | 44.36 | 64.08 | 65.96 | 68.98 |
| 3 | 70.35 | 15.73 | 24.91 | 29.22 | 37.15 | 40.02 | 36.98 | 45.19 | 46.29 | 48.58 | 57.97 |
| 4 | 27.63 | 13.97 | 17.28 | 17.90 | 20.93 | 22.01 | 23.67 | 24.53 | 25.75 | 30.29 | 28.11 |
| 5 | 96.11 | 23.03 | 31.40 | 39.84 | 48.92 | 47.85 | 55.04 | 57.65 | 60.24 | 60.53 | 66.07 |
| 6 | 150.24 | 8.18 | 18.78 | 29.03 | 61.52 | 75.85 | 80.94 | 120.82 | 123.76 | 127.91 | 135.37 |
| 7 | 15.86 | 0.56 | 0.77 | 3.43 | 5.93 | 6.26 | 6.19 | 6.74 | 11.22 | 11.31 | 10.84 |

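Table 8. Model's Voc RMSE.

| Project | TMY | EMY100 | EMY90 | EMY80 | EMY70 | EMY60 | EMY50 | EMY40 | EMY30 | EMY20 | EMY10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 20.14 | 29.12 | 24.46 | 22.06 | 18.62 | 18.80 | 19.92 | 17.21 | 17.67 | 18.00 | 20.27 |
| 2 | 37.19 | 53.57 | 49.29 | 47.15 | 44.35 | 43.43 | 44.00 | 40.40 | 38.65 | 40.03 | 39.59 |
| 3 | 24.58 | 33.89 | 28.95 | 26.56 | 22.41 | 21.33 | 22.68 | 21.59 | 20.39 | 21.27 | 23.81 |
| 4 | 12.52 | 9.47 | 8.27 | 8.04 | 8.40 | 9.05 | 9.41 | 10.19 | 11.66 | 11.34 | 11.96 |
| 5 | 29.78 | 26.83 | 22.50 | 20.86 | 17.96 | 19.48 | 19.11 | 20.03 | 21.99 | 21.42 | 24.08 |
| 6 | 42.34 | 46.89 | 41.68 | 37.59 | 34.01 | 32.63 | 32.91 | 33.42 | 34.77 | 39.95 | 43.67 |
| 7 | 7.98 | 13.62 | 12.19 | 11.16 | 9.87 | 9.70 | 9.21 | 8.39 | 7.60 | 7.61 | 7.14 |
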
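Tables 6 to 8 report three daily comparison metrics between SCADA measurements and each model: exceedance days, maximum deviation, and RMSE. The sketch below shows one way such metrics could be computed from two aligned series of daily maximum Voc. It is an illustration under assumptions, not the authors' procedure: the function names are hypothetical, the deviation is assumed to be SCADA minus model, and the four-day series is invented purely to exercise the functions.

```python
import math
from typing import Sequence


def _check(scada: Sequence[float], model: Sequence[float]) -> None:
    """Both series must cover the same, non-empty set of days."""
    if len(scada) != len(model) or not scada:
        raise ValueError("series must be non-empty and of equal length")


def exceedance_days(scada: Sequence[float], model: Sequence[float]) -> int:
    """Days on which the measured Voc exceeds the model estimate (as in Table 6)."""
    _check(scada, model)
    return sum(1 for s, m in zip(scada, model) if s > m)


def max_daily_deviation(scada: Sequence[float], model: Sequence[float]) -> float:
    """Largest daily value of SCADA minus model (assumed sign, as in Table 7)."""
    _check(scada, model)
    return max(s - m for s, m in zip(scada, model))


def rmse(scada: Sequence[float], model: Sequence[float]) -> float:
    """Root-mean-square error between the two daily series (as in Table 8)."""
    _check(scada, model)
    return math.sqrt(sum((s - m) ** 2 for s, m in zip(scada, model)) / len(scada))


# Hypothetical four-day series of daily maximum Voc (V), for illustration only.
scada_daily = [100.0, 102.0, 98.0, 101.0]
model_daily = [101.0, 100.0, 99.0, 101.0]

print(exceedance_days(scada_daily, model_daily))
print(max_daily_deviation(scada_daily, model_daily))
print(round(rmse(scada_daily, model_daily), 3))
```

The metrics answer different questions. Exceedance days and maximum deviation penalize only underestimation, which is the safety-relevant direction for string sizing, whereas RMSE also penalizes a model for being conservative. This is consistent with EMY100 showing few exceedance days in Table 6 while its RMSE in Table 8 is higher than that of TMY for several projects.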
Table 9. Model's accuracy analysis.

| Project | APE TMY (%) | AE TMY (V) | APE EMY100 (%) | AE EMY100 (V) |
|---|---|---|---|---|
| 1 | 1.91 | 26.75 | 0.22 | 3.11 |
| 2 | 1.19 | 18.12 | 0.14 | 2.17 |
| 3 | 1.25 | 18.73 | 1.14 | 17.06 |
| 4 | 0.42 | 5.87 | 0.14 | 1.96 |
| 5 | 1.59 | 24.51 | 0.71 | 10.96 |
| 6 | 0.52 | 7.96 | 1.27 | 19.30 |
| 7 | 0.11 | 0.93 | 0.67 | 5.86 |
| MEAN | 1.00 | 14.70 | 0.61 | 8.63 |

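The values in Table 9 can be cross-checked against the maximum Voc levels in Table 5. The sketch below assumes that AE is the absolute difference between the SCADA maximum and the model maximum, and that APE is that difference relative to the SCADA value; these definitions are inferred from the column names rather than stated in this section. Under that assumption, the results agree with Table 9 to within about 0.01, which is attributable to rounding in the published figures.

```python
# Maximum Voc (V) per project from Table 5: (SCADA, TMY, EMY100)
MAX_VOC = {
    1: (1399.53, 1372.78, 1396.43),
    2: (1529.20, 1511.08, 1531.37),
    3: (1497.83, 1516.56, 1514.89),
    4: (1400.13, 1394.26, 1402.09),
    5: (1543.93, 1519.42, 1554.90),
    6: (1522.62, 1514.66, 1541.91),
    7: (880.51, 881.44, 886.36),
}


def absolute_error(measured: float, estimated: float) -> float:
    """Absolute error in volts (assumed definition of AE)."""
    return abs(measured - estimated)


def absolute_percentage_error(measured: float, estimated: float) -> float:
    """Absolute error relative to the measured value, in percent (assumed APE)."""
    return 100.0 * abs(measured - estimated) / measured


def mean(values: list[float]) -> float:
    return sum(values) / len(values)


ae_tmy = [absolute_error(s, t) for s, t, _ in MAX_VOC.values()]
ae_emy = [absolute_error(s, e) for s, _, e in MAX_VOC.values()]
ape_tmy = [absolute_percentage_error(s, t) for s, t, _ in MAX_VOC.values()]
ape_emy = [absolute_percentage_error(s, e) for s, _, e in MAX_VOC.values()]

for project, (a_t, p_t, a_e, p_e) in zip(MAX_VOC, zip(ae_tmy, ape_tmy, ae_emy, ape_emy)):
    print(f"Project {project}: TMY {a_t:.2f} V ({p_t:.2f} %), EMY100 {a_e:.2f} V ({p_e:.2f} %)")

print(f"Mean: TMY {mean(ae_tmy):.2f} V ({mean(ape_tmy):.2f} %), "
      f"EMY100 {mean(ae_emy):.2f} V ({mean(ape_emy):.2f} %)")
```

Note that the mean hides per-project differences: for projects 6 and 7, the EMY100 error is larger than the TMY error because EMY100 overestimates the measured maximum, which is the conservative direction for overvoltage risk.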
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Sanchís-Gómez, C.; Aleix-Moreno, J.; Vargas-Salgado, C.; Alfonso-Solar, D. Towards More Sustainable Photovoltaic Systems: Enhanced Open-Circuit Voltage Prediction with a New Extreme Meteorological Year Model. Sustainability 2025, 17, 7554. https://doi.org/10.3390/su17167554
